=============================================================
      reclass - recursive external node classification
=============================================================
reclass is © 2007-2013 martin f. krafft <madduck@madduck.net>
and available under the terms of the Artistic Licence 2.0
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

reclass is an "external node classifier" (ENC) as can be used with automation
tools, such as Puppet, Salt, and Ansible. It is also a stand-alone tool for
merging data sources recursively.

The purpose of an ENC is to allow a system administrator to maintain an
inventory of nodes to be managed, completely separately from the configuration
of the automation tool. Usually, the external node classifier completely
replaces the tool-specific inventory (such as site.pp for Puppet,
ext_pillar/master_tops for Salt, or /etc/ansible/hosts).

reclass allows you to define your nodes through class inheritance, while
always being able to override details further up the tree (i.e. in more
specific nodes). Think of classes as feature sets, as commonalities between
nodes, or as tags. Add to that the ability to nest classes (multiple
inheritance is allowed, well-defined, and encouraged), and piece together your
infrastructure from smaller bits, eliminating redundancy and exposing all
important parameters to a single location, logically organised.

In general, the ENC fulfills two jobs:

  - it provides information about groups of nodes and group memberships
  - it gives access to node-specific information, such as variables

In this document, you will find an overview of the concepts of reclass and the
way it works. Have a look at README.Salt and README.Ansible for information
about integration of reclass with these tools.

Installation
~~~~~~~~~~~~
Before you can use reclass, you need to install it into a place where Python
can find it. Unless you installed a package from your distribution, the
following step should install the package to /usr/local:

  $ python setup.py install

If you want to install to a different location, use --prefix like so:

  $ python setup.py install --prefix=/opt/local

Just make sure that the destination is in the Python module search path, which
you can check like this:

  $ python -c 'import sys; print(sys.path)'

More options can be found in the output of

  $ python setup.py install --help
  $ python setup.py --help
  $ python setup.py --help-commands
  $ python setup.py --help [cmd]

If you just want to run reclass from source, e.g. because you are going to be
making and testing changes, install it in "development mode":

  $ python setup.py develop

To uninstall a development-mode installation (the rm call is necessary due to
http://bugs.debian.org/714960):

  $ python setup.py develop --uninstall
  $ rm /usr/local/bin/reclass*

Unfortunately, setup.py currently offers no way to uninstall a package that
was installed to /usr/local as per the above method (see
http://bugs.python.org/issue4673). Removing the files by hand should do:

  $ rm -r /usr/local/lib/python*/dist-packages/reclass* /usr/local/bin/reclass*

reclass concepts
~~~~~~~~~~~~~~~~
reclass assumes a node-centric perspective on your inventory. This is
obvious when you query reclass for node-specific information, but it might not
be clear when you ask reclass to provide you with a list of groups. In that
case, reclass loops over all nodes it can find in its database, reads all
information it can find about the nodes, and finally reorders the result to
provide a list of groups with the nodes they contain.
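
The reordering step can be pictured with a minimal Python sketch. It is not
the reclass implementation: the function name and the sample node names are
made up for illustration, and each node is assumed to be already resolved to a
flat list of class names.

```python
from collections import defaultdict


def invert_inventory(nodes):
    """Turn {node: [classes]} into {class: [nodes]} (hypothetical helper)."""
    groups = defaultdict(list)
    # Loop over the nodes, not the groups: the view stays node-centric.
    for node in sorted(nodes):
        for cls in nodes[node]:
            groups[cls].append(node)
    return dict(groups)


# Illustrative data only.
example_nodes = {
    "quantum.example.org": ["debiannodes", "hosted@munich"],
    "photon.example.org": ["debiannodes"],
}
groups = invert_inventory(example_nodes)
```

A group therefore only exists in the result if at least one node refers to it.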

Since the term 'groups' is somewhat ambiguous, it helps to start off with
a short glossary of reclass-specific terminology:

  node:         A node, usually a computer in your infrastructure
  class:        A category, tag, feature, or role that applies to a node
                Classes may be nested, i.e. there can be a class hierarchy
  application:  A specific set of behaviour to apply
  parameter:    Node-specific variables, with inheritance throughout the class
                hierarchy.

A class consists of zero or more parent classes, zero or more applications,
and any number of parameters.

A node is almost equivalent to a class, except that it usually does not (but
can) specify applications.

When reclass parses a node (or class) definition and encounters a parent
class, it recurses to this parent class first before reading any data of the
node (or class). When reclass returns from the recursive, depth-first walk, it
then merges all information of the current node (or class) into the
information it obtained during the recursion.

Furthermore, a node (or class) may define a list of classes it derives from,
in which case classes defined further down the list will be able to override
classes further up the list.

Information in this context is essentially either a list of applications or
a list of parameters.

The interaction between the depth-first walk and the delayed merging of data
means that the node (and any class) may override any of the data defined by
any of the parent classes (ancestors). This is in line with the assumption
that more specific definitions ("this specific host") should have a higher
precedence than more general definitions ("all webservers", which includes all
webservers in Munich, which includes "this specific host", for example).

Here's a quick example, showing how parameters accumulate and can get
replaced.

  All unixnodes (i.e. nodes that have the 'unixnodes' class in their ancestry)
  have /etc/motd centrally-managed (through the 'motd' application), and the
  unixnodes class definition provides a generic message-of-the-day to be put
  into this file.

  All debiannodes, which are descendants of unixnodes, should include the
  Debian codename in this message, so the message-of-the-day is overwritten in
  the debiannodes class.

  The node 'quantum.example.org' will have a scheduled downtime this weekend,
  so until Monday, an appropriate message-of-the-day is added to the node
  definition.

  When the 'motd' application runs, it receives the appropriate
  message-of-the-day (from 'quantum.example.org' when run on that node) and
  writes it into /etc/motd.
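
The depth-first walk and the message-of-the-day example can be sketched in a
few lines of Python. This is only an illustration under simplifying
assumptions: the data structure, the function name, the node
'photon.example.org', and the messages are invented here, and only scalar
parameters are handled.

```python
def resolve(name, entities, seen=None):
    """Depth-first walk: parents first, then the entity's own data on top."""
    if seen is None:
        seen = set()
    merged = {}
    entity = entities[name]
    for parent in entity.get("classes", []):
        if parent in seen:
            continue  # this class was already processed during the walk
        seen.add(parent)
        merged.update(resolve(parent, entities, seen))
    # Merge the current entity last, so it overrides its ancestors.
    merged.update(entity.get("parameters", {}))
    return merged


# Hypothetical inventory mirroring the example above.
entities = {
    "unixnodes": {"parameters": {"motd": "Welcome"}},
    "debiannodes": {
        "classes": ["unixnodes"],
        "parameters": {"motd": "Welcome to Debian wheezy"},
    },
    "quantum.example.org": {
        "classes": ["debiannodes"],
        "parameters": {"motd": "Scheduled downtime this weekend"},
    },
    "photon.example.org": {"classes": ["debiannodes"]},
}
quantum = resolve("quantum.example.org", entities)
photon = resolve("photon.example.org", entities)
```

Here 'quantum.example.org' ends up with its own message, while
'photon.example.org', which defines none, inherits the one from debiannodes.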

At this point it should be noted that parameters whose values are lists or
key-value pairs don't get overwritten by child classes or node definitions,
but the information gets merged (recursively) instead.
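
A recursive merge of this kind could look roughly like the following Python
sketch. The function and the sample parameters are assumptions made for
illustration, not code taken from reclass.

```python
def deep_merge(base, new):
    """Merge 'new' into 'base' without modifying either argument."""
    if isinstance(base, dict) and isinstance(new, dict):
        result = dict(base)
        for key, value in new.items():
            if key in base:
                result[key] = deep_merge(base[key], value)
            else:
                result[key] = value
        return result
    if isinstance(base, list) and isinstance(new, list):
        return base + new  # lists are extended, not replaced
    return new  # scalars (and mismatched types) are overwritten


# Illustrative parameters of a parent class and of a child.
parent = {"packages": ["vim"], "ssh": {"port": 22, "permit_root_login": "no"}}
child = {"packages": ["git"], "ssh": {"permit_root_login": "without-password"}}
merged = deep_merge(parent, child)
```

In this sketch the 'packages' list grows to contain both entries, and the
'ssh' dictionary keeps 'port' while 'permit_root_login' takes the child's
value.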

Similarly to parameters, applications also accumulate during the recursive
walk through the class ancestry. It is possible for a node or child class to
_remove_ an application added by a parent class, by prefixing the application
with '~'.
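
As a rough Python sketch of this accumulation (the function name is made up,
and skipping duplicates is an assumption of the sketch rather than documented
behaviour):

```python
def merge_applications(current, new):
    """Append 'new' applications to 'current', honouring the '~' prefix."""
    result = list(current)
    for app in new:
        if app.startswith("~"):
            name = app[1:]
            if name in result:
                result.remove(name)  # drop an application added earlier
        elif app not in result:
            result.append(app)
    return result


inherited = ["motd", "firewalled"]
apps = merge_applications(inherited, ["~firewalled", "ssh.server"])
```

Removing an application that has not been added yet is simply a no-op in this
sketch.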

Finally, reclass happily lets you use multiple inheritance, and ensures that
the resolution of parameters is still well-defined. Here's another example
building upon the one about /etc/motd above:

  'quantum.example.org' (which is back up and therefore its node definition no
  longer contains a message-of-the-day) is at a site in Munich. Therefore, it
  is a child of the class 'hosted@munich'. This class is independent of the
  'unixnodes' hierarchy; 'quantum.example.org' derives from both.

  In this example infrastructure, 'hosted@munich' is more specific than
  'debiannodes' because there are plenty of Debian nodes at other sites (and
  some non-Debian nodes in Munich). Therefore, 'quantum.example.org' derives
  from 'hosted@munich' _after_ 'debiannodes'.

  When an electricity outage is expected over the weekend in Munich, the admin
  can change the message-of-the-day in the 'hosted@munich' class, and it will
  apply to all hosts in Munich.

  However, not all hosts in Munich have /etc/motd, because some of them are
  'windowsnodes'. Since the 'windowsnodes' ancestry does not specify the
  'motd' application, those hosts have access to the message-of-the-day in the
  node variables, but the message won't get used, unless, of course,
  'windowsnodes' specified a Windows-specific application to bring such
  notices to the attention of the user.

It's also trivial to ensure a certain order of class evaluation. Here's
another example:

  The 'ssh.server' class defines the 'permit_root_login' parameter to 'no'.

  The 'backuppc.client' class defines the parameter to 'without-password',
  because the BackupPC server might need to log in to the host as root.

  Now, what happens if the admin accidentally provides the following two
  classes?

    - backuppc.client
    - ssh.server

  Theoretically, this would mean 'permit_root_login' gets set to 'no'.

  However, since all 'backuppc.client' nodes need 'ssh.server' (at least in
  most setups), the class 'backuppc.client' itself derives from 'ssh.server',
  ensuring that 'ssh.server' gets parsed before 'backuppc.client'.

  When reclass returns to the node and encounters the 'ssh.server' class
  defined there, it simply skips it, as it's already been processed.
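
The resulting evaluation order can be traced with a small Python sketch. The
function and data layout are hypothetical, and the sketch makes no attempt to
detect inheritance cycles.

```python
def evaluation_order(names, classes, seen=None):
    """Return the order in which classes get processed (parents first)."""
    if seen is None:
        seen = []
    for name in names:
        if name in seen:
            continue  # already processed earlier in the walk: skip it
        parents = classes.get(name, {}).get("classes", [])
        evaluation_order(parents, classes, seen)
        seen.append(name)
    return seen


# 'backuppc.client' derives from 'ssh.server', as in the example above.
classes = {
    "ssh.server": {},
    "backuppc.client": {"classes": ["ssh.server"]},
}
order = evaluation_order(["backuppc.client", "ssh.server"], classes)
```

Because 'ssh.server' is processed first, the 'without-password' value from
'backuppc.client' is merged later and wins, regardless of the order in which
the node lists the two classes.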

reclass operations
~~~~~~~~~~~~~~~~~~
While reclass has been built to support different storage backends through
plugins, currently only the 'yaml_fs' storage backend exists. This is a very
simple, yet powerful, YAML-based backend, using flat files on the filesystem
(as suggested by the _fs suffix).

yaml_fs works with two directories, one for node definitions, and another for
class definitions. It is possible to use a single directory for both, but that
could get messy and is therefore not recommended.

Files in those directories are YAML files, specifying key-value pairs. The
following three keys are read by reclass:

  classes:      a list of parent classes
  applications: a list of applications to append to the applications defined by
                ancestors. If an application name starts with '~', it would
                remove this application from the list, if it had already been
                added, but it does not prevent a future addition.
                E.g. '~firewalled'
  parameters:   key-value pairs to set defaults in class definitions, override
                existing data, or provide node-specific information in node
                specifications.
                By convention, parameters corresponding to an application
                should be provided as subkey-value pairs, keyed by the name of
                the application, e.g.

                applications:
                - ssh.server
                parameters:
                  ssh.server:
                    permit_root_login: no

reclass starts out reading a node definition file, obtains the list of
classes, then reads the files corresponding to these classes, recursively
reading parent classes, and finally merges the applications list and the
parameters.

Merging of parameters is done recursively, meaning that lists and dictionaries
are extended, rather than replaced. There is currently no way to remove or
overwrite an existing value.

Finally, parameters may reference each other, including deep references, e.g.:

  parameters:
    location: Munich, Germany
    motd:
      header: This node sits in ${location}
    for_demonstration: ${motd:header}
    dict_reference: ${motd}

After merging and interpolation, which happens automatically inside the
storage modules, the 'for_demonstration' parameter will have a value of "This
node sits in Munich, Germany".

Types are preserved if the value contains nothing but a reference. Hence, the
value of 'dict_reference' will actually be a dictionary.
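
To illustrate the idea, here is a minimal Python sketch of such an
interpolation, applied to the parameters above. It is an assumption-laden
illustration and not the code used by the storage modules: the helper names
are invented, the regular expression is a guess at the reference syntax, and
circular references are not detected.

```python
import re

# Matches ${...}; the part between the braces is a ':'-separated path.
REF = re.compile(r"\$\{([^}]+)\}")


def lookup(params, path):
    """Follow a path such as 'motd:header' into nested dictionaries."""
    value = params
    for key in path.split(":"):
        value = value[key]
    return value


def interpolate(value, params):
    """Resolve references in 'value' against the full parameter tree."""
    if isinstance(value, dict):
        return {k: interpolate(v, params) for k, v in value.items()}
    if isinstance(value, list):
        return [interpolate(v, params) for v in value]
    if not isinstance(value, str):
        return value
    whole = REF.fullmatch(value)
    if whole:
        # Nothing but a reference: return the target itself, keeping its type.
        return interpolate(lookup(params, whole.group(1)), params)
    # Reference embedded in text: substitute the string form of the target.
    return REF.sub(
        lambda m: str(interpolate(lookup(params, m.group(1)), params)), value
    )


params = {
    "location": "Munich, Germany",
    "motd": {"header": "This node sits in ${location}"},
    "for_demonstration": "${motd:header}",
    "dict_reference": "${motd}",
}
resolved = interpolate(params, params)
```

In the result, 'for_demonstration' is a string and 'dict_reference' is a
dictionary whose 'header' has itself been interpolated.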

Version control
~~~~~~~~~~~~~~~
I recommend you maintain your reclass inventory database in Git, right from
the start.

Usage
~~~~~
For information on how to use reclass directly, invoke reclass with --help and
study the output.

The three options --inventory-base-uri, --nodes-uri, and --classes-uri
together specify the location of the inventory. If the base URI is specified,
then it is prepended to the other two URIs, unless they are absolute URIs. If
these two URIs are not specified, they default to 'nodes' and 'classes'.
Therefore, if your inventory is in '/etc/reclass/nodes' and
'/etc/reclass/classes', all you need to specify is the base URI as
'/etc/reclass', which is actually the default (see reclass/defaults.py).
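
The rule can be sketched in Python as follows. The function is hypothetical
and treats the URIs as plain POSIX filesystem paths, which is an assumption
of this sketch.

```python
import posixpath


def resolve_uris(base_uri="/etc/reclass", nodes_uri="nodes",
                 classes_uri="classes"):
    """Return (nodes, classes) locations for the given options."""
    # posixpath.join() discards the base when the second part is absolute,
    # which matches the "unless they are absolute" rule described above.
    return (posixpath.join(base_uri, nodes_uri),
            posixpath.join(base_uri, classes_uri))


defaults = resolve_uris()
custom = resolve_uris(nodes_uri="/srv/inventory/nodes")
```

With only defaults, this yields '/etc/reclass/nodes' and
'/etc/reclass/classes'; an absolute nodes URI is used as-is.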

If you've installed reclass as per the above instructions, try to run it from
the source directory like this:

  $ reclass -b examples/ --inventory
  $ reclass -b examples/ --node localhost

Those data come from examples/nodes and examples/classes, and you can surely
make your own way from here.

More commonly, however, use of reclass will happen indirectly, and through
so-called adapters, e.g. /…/reclass/adapters/salt. The job of an adapter is to
translate between different invocation paradigms, provide a sane set of
default options, and massage the data from reclass into the format expected by
the automation tool in use. Please have a look at the respective README files
for these adapters.

Configuration file
~~~~~~~~~~~~~~~~~~
reclass can read some of its configuration from a file. The file is
a YAML file and simply defines key-value pairs.

The configuration file can be used to set defaults for all the options that
are otherwise configurable via the command-line interface, so please use the
--help output of reclass for reference. The command-line option '--nodes-uri'
corresponds to the key 'nodes_uri' in the configuration file. For example:

  storage_type: yaml_fs
  pretty_print: True
  output: json
  inventory_base_uri: /etc/reclass
  nodes_uri: ../nodes
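
Judging by '--nodes-uri' and 'nodes_uri', the key appears to be the long
option name without the leading dashes and with the remaining dashes turned
into underscores. The following Python sketch spells out that assumed rule;
the function is hypothetical, and the --help output remains the reference.

```python
def option_to_key(option):
    """Map a long command-line option to a configuration key (assumed rule)."""
    return option.lstrip("-").replace("-", "_")


options = ("--nodes-uri", "--classes-uri", "--inventory-base-uri")
keys = [option_to_key(option) for option in options]
```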

reclass first looks in the current directory for the file called
'reclass-config.yml' (see reclass/defaults.py) and if no such file is found,
it looks in $HOME, then in /etc/reclass, and then "next to" the reclass script
itself, i.e. if the script is symlinked to /srv/provisioning/reclass, then the
script will try to access /srv/provisioning/reclass-config.yml.
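
The search order can be summarised with a short Python sketch. The function
names and the example directories are invented for illustration; see
reclass/defaults.py for the actual values.

```python
import os
import posixpath

CONFIG_NAME = "reclass-config.yml"


def candidate_config_paths(script_path, cwd=None, home=None):
    """List the places to look for the configuration file, in order."""
    cwd = cwd if cwd is not None else os.getcwd()
    home = home if home is not None else os.path.expanduser("~")
    # "Next to" the script: the directory of the path it was invoked as,
    # without resolving symlinks.
    dirs = [cwd, home, "/etc/reclass", posixpath.dirname(script_path)]
    return [posixpath.join(d, CONFIG_NAME) for d in dirs]


def find_config(script_path, exists=os.path.isfile, **kwargs):
    """Return the first candidate that exists, or None."""
    for path in candidate_config_paths(script_path, **kwargs):
        if exists(path):
            return path
    return None


paths = candidate_config_paths(
    "/srv/provisioning/reclass", cwd="/tmp/work", home="/home/admin"
)
```

The first file found wins in this sketch; later candidates are not consulted.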

Note that yaml_fs is currently the only supported storage_type, and it's the
default if you don't set it.

Adapters may implement their own lookup logic, of course, so make sure to read
their READMEs.

 -- martin f. krafft <madduck@madduck.net>  Wed, 07 Aug 2013 16:21:04 +0200