=============================================================
       reclass: recursive external node classification
=============================================================
reclass is © 2007–2013 martin f. krafft <madduck@madduck.net>
and available under the terms of the Artistic Licence 2.0
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

reclass is an "external node classifier" (ENC) as can be used with automation
tools, such as Puppet, Salt, and Ansible.

The purpose of an ENC is to allow a system administrator to maintain an
inventory of nodes to be managed, completely separately from the configuration
of the automation tool. Usually, the external node classifier completely
replaces the tool-specific inventory (such as site.pp for Puppet,
ext_pillar/master_tops for Salt, or /etc/ansible/hosts for Ansible).

reclass allows you to define your nodes through class inheritance, while
always being able to override details of classes further up the tree. Think of
classes as feature sets, as commonalities between nodes, or as tags. Add to
that the ability to nest classes (multiple inheritance is allowed,
well-defined, and encouraged), and you can piece together your infrastructure
from smaller bits, eliminating redundancy and exposing all important
parameters in a single location, logically organised.

In general, the ENC fulfills two jobs:

  - it provides information about groups of nodes and group memberships
  - it gives access to node-specific information, such as variables

In this document, you will find an overview of the concepts of reclass and the
way it works. Have a look at README.Salt and README.Ansible for information
about integration of reclass with these tools.

Installation
~~~~~~~~~~~~
Before you can use reclass, you need to run make to configure the scripts for
your system. Right now, this only involves setting the full path to the
Python interpreter.

  make

If your Python interpreter is not /usr/bin/python and is also not in your
$PATH, then you need to pass its path to make, e.g.

  make PYTHON=/opt/local/bin/python

reclass concepts
~~~~~~~~~~~~~~~~
reclass assumes a node-centric perspective on your inventory. This is
obvious when you query reclass for node-specific information, but it might not
be clear when you ask reclass to provide you with a list of groups. In that
case, reclass loops over all nodes it can find in its database, reads all
information it can find about the nodes, and finally reorders the result to
provide a list of groups with the nodes they contain.
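
The reordering step described above can be pictured as inverting a mapping.
The following is only a sketch and not the reclass implementation: the
function name, the node 'photon.example.org', and the shape of the input
dictionary are assumptions made for illustration.

```python
from collections import defaultdict


def groups_from_nodes(node_classes):
    """Invert a node -> classes mapping into a class -> nodes mapping."""
    groups = defaultdict(list)
    # Loop over every known node (sorted for a stable result) ...
    for node in sorted(node_classes):
        # ... and file the node under each class it belongs to.
        for cls in node_classes[node]:
            groups[cls].append(node)
    return dict(groups)


# Hypothetical inventory, already resolved per node.
inventory = {
    "quantum.example.org": ["unixnodes", "debiannodes"],
    "photon.example.org": ["unixnodes"],
}
groups = groups_from_nodes(inventory)
```

Here, 'unixnodes' ends up containing both nodes, while 'debiannodes' only
contains 'quantum.example.org'.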

Since the term 'groups' is somewhat ambiguous, it helps to start off with
a short glossary of reclass-specific terminology:

  node:         A node, usually a computer in your infrastructure
  class:        A category, tag, feature, or role that applies to a node
                Classes may be nested, i.e. there can be a class hierarchy
  application:  A specific set of behaviour to apply to members of a class
  parameter:    Node-specific variables, with inheritance throughout the class
                hierarchy.

A class consists of zero or more parent classes, zero or more applications,
and any number of parameters.

A node is almost equivalent to a class, except that it usually does not (but
can) specify applications.

When reclass parses a node (or class) definition and encounters a parent
class, it recurses to this parent class first before reading any data of the
node (or class). When reclass returns from the recursive, depth-first walk, it
then merges all information of the current node (or class) into the
information it obtained during the recursion.

Furthermore, a node (or class) may define a list of classes it derives from,
in which case classes defined further down the list will be able to override
classes further up the list.

Information in this context is essentially either a list of applications or
a list of parameters.

The interaction between the depth-first walk and the delayed merging of data
means that the node (and any class) may override any of the data defined by
any of the parent classes (ancestors). This is in line with the assumption
that more specific definitions ("this specific host") should have a higher
precedence than more general definitions ("all webservers", which includes all
webservers in Munich, which includes "this specific host", for example).

Here's a quick example, showing how parameters accumulate and can get
replaced.

  All unixnodes (i.e. nodes that have the 'unixnodes' class in their ancestry)
  have /etc/motd centrally managed (through the 'motd' application), and the
  unixnodes class definition provides a generic message-of-the-day to be put
  into this file.

  All debiannodes, which are descendants of unixnodes, should include the
  Debian codename in this message, so the message-of-the-day is overwritten in
  the debiannodes class.

  The node 'quantum.example.org' will have a scheduled downtime this weekend,
  so until Monday, an appropriate message-of-the-day is added to the node
  definition.

  When the 'motd' application runs, it receives the appropriate
  message-of-the-day (from 'quantum.example.org' when run on that node) and
  writes it into /etc/motd.
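
The message-of-the-day example can be traced with a few lines of Python. This
is a minimal sketch of a depth-first walk with delayed merging, not the
reclass code itself: the function name, the in-memory 'definitions'
dictionary, and the message texts are assumptions, and only simple (scalar)
parameter values are handled.

```python
def resolve_parameters(name, definitions):
    """Return the parameters of `name` after walking its ancestry."""
    entity = definitions[name]
    merged = {}
    # Recurse into the parents first, in list order, so that classes
    # further down the list override classes further up the list.
    for parent in entity.get("classes", []):
        merged.update(resolve_parameters(parent, definitions))
    # Merge the entity's own data last, so that it overrides all ancestors.
    merged.update(entity.get("parameters", {}))
    return merged


# Hypothetical definitions mirroring the example above.
definitions = {
    "unixnodes": {"parameters": {"motd": "Welcome."}},
    "debiannodes": {
        "classes": ["unixnodes"],
        "parameters": {"motd": "Welcome to Debian."},
    },
    "quantum.example.org": {
        "classes": ["debiannodes"],
        "parameters": {"motd": "Scheduled downtime this weekend."},
    },
}
motd_on_quantum = resolve_parameters("quantum.example.org", definitions)["motd"]
motd_on_debian = resolve_parameters("debiannodes", definitions)["motd"]
```

The node's own message wins on 'quantum.example.org', while any other
descendant of 'debiannodes' would still get the Debian message.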

At this point it should be noted that parameters whose values are lists or
key-value pairs don't get overwritten by child classes or node definitions;
instead, the information gets merged (recursively).
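
A recursive merge of this kind might look as follows. This is a sketch under
assumptions rather than the reclass implementation: in particular, it assumes
that lists are merged by appending the child's items to the parent's, and the
'allow_users' key is made up for the example.

```python
def deep_merge(base, override):
    """Recursively merge `override` into a copy of `base`."""
    if isinstance(base, dict) and isinstance(override, dict):
        result = dict(base)
        for key, value in override.items():
            # Keys present on both sides are merged recursively,
            # new keys are simply added.
            result[key] = deep_merge(base[key], value) if key in base else value
        return result
    if isinstance(base, list) and isinstance(override, list):
        # Assumption: lists are extended, not replaced.
        return base + override
    # Anything else (strings, numbers, ...) is overwritten.
    return override


parent = {"ssh.server": {"permit_root_login": "no", "allow_users": ["root"]}}
child = {"ssh.server": {"allow_users": ["backup"]}, "motd": "hi"}
merged = deep_merge(parent, child)
```

The child adds to 'allow_users' without losing 'permit_root_login', which a
plain overwrite of the 'ssh.server' key would have discarded.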

Similarly to parameters, applications also accumulate during the recursive
walk through the class ancestry. It is possible for a node or child class to
_remove_ an application added by a parent class, by prefixing the application
name with '~'.
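
The accumulation and removal of applications can be sketched like this. The
function name is hypothetical, and skipping an application that is already in
the list is an assumption of this sketch, not something stated above.

```python
def merge_applications(inherited, own):
    """Combine inherited applications with those of a child class or node."""
    result = list(inherited)
    for app in own:
        if app.startswith("~"):
            # '~name' removes an application added by an ancestor, if any.
            name = app[1:]
            if name in result:
                result.remove(name)
        elif app not in result:
            # Assumption: an application is only listed once.
            result.append(app)
    return result


apps = merge_applications(["motd", "firewalled"], ["~firewalled", "ssh.server"])
```

Removing an application that was never added is a no-op in this sketch, and
a class processed later can add the application again.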

Finally, reclass happily lets you use multiple inheritance, and ensures that
the resolution of parameters is still well-defined. Here's another example
building upon the one about /etc/motd above:

  'quantum.example.org' (which is back up and therefore its node definition no
  longer contains a message-of-the-day) is at a site in Munich. Therefore, it
  is a child of the class 'hosted@munich'. This class is independent of the
  'unixnodes' hierarchy; 'quantum.example.org' derives from both.

  In this example infrastructure, 'hosted@munich' is more specific than
  'debiannodes' because there are plenty of Debian nodes at other sites (and
  some non-Debian nodes in Munich). Therefore, 'quantum.example.org' derives
  from 'hosted@munich' _after_ 'debiannodes'.

  When an electricity outage is expected over the weekend in Munich, the admin
  can change the message-of-the-day in the 'hosted@munich' class, and it will
  apply to all hosts in Munich.

  However, not all hosts in Munich have /etc/motd, because some of them are
  'windowsnodes'. Since the 'windowsnodes' ancestry does not specify the
  'motd' application, those hosts have access to the message-of-the-day in the
  node variables, but the message won't get used…

  … unless, of course, 'windowsnodes' specified a Windows-specific application
  to bring such notices to the attention of the user.

It's also trivial to ensure a certain order of class evaluation. Here's
another example:

  The 'ssh.server' class sets the 'permit_root_login' parameter to 'no'.

  The 'backuppc.client' class sets the parameter to 'without-password',
  because the BackupPC server might need to log in to the host as root.

  Now, what happens if the admin accidentally lists the two classes in the
  following order?

    - backuppc.client
    - ssh.server

  Theoretically, this would mean 'permit_root_login' gets set to 'no',
  because 'ssh.server' comes later in the list.

  However, since all 'backuppc.client' nodes need 'ssh.server' (at least in
  most setups), the class 'backuppc.client' itself derives from 'ssh.server',
  ensuring that 'ssh.server' gets parsed before 'backuppc.client'.

  When reclass returns to the node and encounters the 'ssh.server' class
  defined there, it simply skips it, as it's already been processed.
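
The effect of skipping classes that have already been processed can be
reproduced with a small sketch. It is an illustration only: the function
name, the node name 'node.example.org', and the flat (unnested) parameters
are assumptions, not the reclass implementation.

```python
def evaluation_order(name, definitions, visited=None, order=None):
    """Return the order in which definitions are merged, ancestors first."""
    if visited is None:
        visited, order = set(), []
    if name in visited:
        return order  # already processed, so it is skipped
    visited.add(name)
    for parent in definitions[name].get("classes", []):
        evaluation_order(parent, definitions, visited, order)
    order.append(name)
    return order


definitions = {
    "ssh.server": {"parameters": {"permit_root_login": "no"}},
    "backuppc.client": {
        "classes": ["ssh.server"],
        "parameters": {"permit_root_login": "without-password"},
    },
    # Hypothetical node listing the classes in the "accidental" order.
    "node.example.org": {"classes": ["backuppc.client", "ssh.server"]},
}
order = evaluation_order("node.example.org", definitions)

# Merge parameters in evaluation order; later entries override earlier ones.
effective = {}
for entity in order:
    effective.update(definitions[entity].get("parameters", {}))
```

Because 'ssh.server' is reached through 'backuppc.client' first, its second
mention in the node's class list has no effect, and 'permit_root_login' ends
up as 'without-password'.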

reclass operations
~~~~~~~~~~~~~~~~~~
While reclass has been built to support different storage backends through
plugins, currently only the 'yaml_fs' storage backend exists. This is a very
simple, yet powerful, YAML-based backend, using flat files on the filesystem
(as suggested by the _fs suffix).

yaml_fs works with two directories, one for node definitions, and another for
class definitions. It is possible to use a single directory for both, but that
could get messy and is therefore not recommended.

Files in those directories are YAML files, specifying key-value pairs. The
following three keys are read by reclass:

  classes:      a list of parent classes
  applications: a list of applications to append to the applications defined by
                ancestors. If an application name starts with '~', it
                removes this application from the list if it had already been
                added, but it does not prevent a future addition.
                E.g. '~firewalled'
  parameters:   key-value pairs to set defaults in class definitions, override
                existing data, or provide node-specific information in node
                specifications.
                By convention, parameters corresponding to an application
                should be provided as subkey-value pairs, keyed by the name of
                the application, e.g.

                applications:
                - ssh.server
                parameters:
                  ssh.server:
                    permit_root_login: no

reclass starts out reading a node definition file, obtains the list of
classes, then reads the files corresponding to these classes, recursively
reading parent classes, and finally merges the applications list (append
unless the name is prefixed with '~', in which case the application is
removed) and the parameters.

Version control
~~~~~~~~~~~~~~~
I recommend you maintain your reclass inventory database in Git, right from
the start.

Usage
~~~~~
For information on how to use reclass directly, invoke reclass.py with --help
and study the output.

The three options --inventory-base-uri, --nodes-uri, and --classes-uri
together specify the location of the inventory. If the base URI is specified,
then it is prepended to the other two URIs, unless they are absolute URIs. If
these two URIs are not specified, they default to 'nodes' and 'classes'.
Therefore, if your inventory is in '/etc/reclass/nodes' and
'/etc/reclass/classes', all you need to specify is the base URI as
'/etc/reclass'.
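
The way the three options combine can be sketched as follows. This is not the
code reclass uses: the function name is hypothetical, and the sketch assumes
that the URIs are plain POSIX filesystem paths.

```python
import posixpath


def resolve_inventory_uris(base_uri=None, nodes_uri=None, classes_uri=None):
    """Combine the three options into the nodes and classes locations."""
    # Unspecified URIs default to 'nodes' and 'classes'.
    nodes_uri = nodes_uri or "nodes"
    classes_uri = classes_uri or "classes"
    if base_uri:
        # posixpath.join() drops the base when the second path is absolute,
        # which matches "prepended, unless they are absolute".
        nodes_uri = posixpath.join(base_uri, nodes_uri)
        classes_uri = posixpath.join(base_uri, classes_uri)
    return nodes_uri, classes_uri


default_layout = resolve_inventory_uris("/etc/reclass")
mixed_layout = resolve_inventory_uris("/etc/reclass", nodes_uri="/srv/inventory/nodes")
```

With only the base URI given, the result is '/etc/reclass/nodes' and
'/etc/reclass/classes'; an absolute nodes URI is left untouched.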

More commonly, however, use of reclass will happen indirectly, and through
so-called adapters, e.g. /…/reclass/adapters/salt. The job of an adapter is to
translate between different invocation paradigms, provide a sane set of
default options, and massage the data from reclass into the format expected by
the automation tool in use.

Configuration file
~~~~~~~~~~~~~~~~~~
reclass can read some of its configuration from a file. The file is
a YAML file and simply defines key-value pairs.

The configuration file can be used to set defaults for all the options that
are otherwise configurable via the command-line interface, so please use the
--help output of reclass for reference. The command-line option '--nodes-uri'
corresponds to the key 'nodes_uri' in the configuration file. For example:

  storage_type: yaml_fs
  pretty_print: True
  output: json
  inventory_base_uri: /etc/reclass
  nodes_uri: ../nodes
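
The correspondence between option names and configuration keys appears to be
mechanical. The helper below is hypothetical and assumes that every long
option maps to its key by dropping the leading dashes and turning the
remaining dashes into underscores, as in the '--nodes-uri' example.

```python
def option_to_config_key(option):
    """Translate a long command-line option into a configuration key."""
    # '--nodes-uri' -> 'nodes-uri' -> 'nodes_uri'
    return option.lstrip("-").replace("-", "_")


key = option_to_config_key("--nodes-uri")
```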

reclass first looks in the current directory for a file called
'reclass-config.yml' and, if no such file is found, it looks "next to" the
reclass script itself. Adapters implement their own lookup logic.
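
The lookup order can be sketched like this. The function name is
hypothetical, and the sketch assumes that "next to" means the directory
containing the script; it is not the lookup code of reclass itself.

```python
import os
import tempfile

CONFIG_FILE_NAME = "reclass-config.yml"


def find_config_file(script_path, cwd=None):
    """Return the first configuration file found, or None."""
    cwd = os.getcwd() if cwd is None else cwd
    script_dir = os.path.dirname(os.path.abspath(script_path))
    # The current directory is checked first, then the script's directory.
    for directory in (cwd, script_dir):
        candidate = os.path.join(directory, CONFIG_FILE_NAME)
        if os.path.isfile(candidate):
            return candidate
    return None


# Demonstration with two temporary directories standing in for the
# current directory and the location of the script.
with tempfile.TemporaryDirectory() as cwd, tempfile.TemporaryDirectory() as script_dir:
    script = os.path.join(script_dir, "reclass.py")
    next_to_script = os.path.join(script_dir, CONFIG_FILE_NAME)
    open(next_to_script, "w").close()
    found_fallback = find_config_file(script, cwd=cwd)

    in_cwd = os.path.join(cwd, CONFIG_FILE_NAME)
    open(in_cwd, "w").close()
    found_first = find_config_file(script, cwd=cwd)
```

As long as only the file next to the script exists, that one is used; once
a file appears in the current directory, it takes precedence.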

Contributing to reclass
~~~~~~~~~~~~~~~~~~~~~~~
Contributions to reclass are very welcome. Since I prefer to keep a somewhat
clean history, I will not simply merge pull requests.

You can submit pull requests, of course, and I'll rebase them onto HEAD before
merging. Or send your patches using git-format-patch and git-send-email to
reclass@pobox.madduck.net.

I have added rudimentary unit tests, and it would be nice if you could submit
your changes with appropriate changes to the tests. To run tests, invoke
./run_tests.py in the top-level checkout directory.

If you have larger ideas, I look forward to discussing them with you.

 -- martin f. krafft <madduck@madduck.net>  Fri, 14 Jun 2013 22:12:05 +0200