=============================================================
      reclass - recursive external node classification
=============================================================
reclass is © 2007–2013 martin f. krafft <madduck@madduck.net>
and available under the terms of the Artistic Licence 2.0
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

reclass is an "external node classifier" (ENC) as can be used with automation
tools, such as Puppet, Salt, and Ansible. It is also a stand-alone tool for
merging data sources recursively.

The purpose of an ENC is to allow a system administrator to maintain an
inventory of nodes to be managed, completely separately from the configuration
of the automation tool. Usually, the external node classifier completely
replaces the tool-specific inventory (such as site.pp for Puppet,
ext_pillar/master_tops for Salt, or /etc/ansible/hosts).

reclass allows you to define your nodes through class inheritance, while
always being able to override details further up the tree (i.e. in more
specific nodes). Think of classes as feature sets, as commonalities between
nodes, or as tags. Add to that the ability to nest classes (multiple
inheritance is allowed, well-defined, and encouraged), and you can piece
together your infrastructure from smaller bits, eliminating redundancy and
exposing all important parameters in a single, logically organised location.

In general, the ENC fulfills two jobs:

  - it provides information about groups of nodes and group memberships
  - it gives access to node-specific information, such as variables

In this document, you will find an overview of the concepts of reclass and the
way it works. Have a look at README.Salt and README.Ansible for information
about integration of reclass with these tools.

Installation
~~~~~~~~~~~~
Before you can use reclass, you need to install it into a place where Python
can find it. Unless you installed a package from your distribution, the
following step should install the package to /usr/local:

  $ python setup.py install

If you want to install to a different location, use --prefix like so:

  $ python setup.py install --prefix=/opt/local

Just make sure that the destination is in the Python module search path, which
you can check like this:

  $ python -c 'import sys; print(sys.path)'

More options can be found in the output of

  $ python setup.py install --help
  $ python setup.py --help
  $ python setup.py --help-commands
  $ python setup.py --help [cmd]

If you just want to run reclass from source, e.g. because you are going to be
making and testing changes, install it in "development mode":

  $ python setup.py develop

To uninstall a development-mode installation (the rm call is necessary due to
http://bugs.debian.org/714960):

  $ python setup.py develop --uninstall
  $ rm /usr/local/bin/reclass*

Unfortunately, uninstallation via setup.py currently isn't possible for
packages installed to /usr/local as per the above method (see
http://bugs.python.org/issue4673). Removing the files by hand should do:

  $ rm -r /usr/local/lib/python*/dist-packages/reclass* /usr/local/bin/reclass*

reclass concepts
~~~~~~~~~~~~~~~~
reclass assumes a node-centric perspective of your inventory. This is
obvious when you query reclass for node-specific information, but it might not
be clear when you ask reclass to provide you with a list of groups. In that
case, reclass loops over all nodes it can find in its database, reads all
information it can find about the nodes, and finally reorders the result to
provide a list of groups with the nodes they contain.
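
The reordering step can be pictured as inverting a mapping. The following is
a minimal sketch, not the reclass implementation: the function name and the
sample inventory are hypothetical, and it assumes each node has already been
resolved to the full list of classes in its ancestry.

```python
from collections import defaultdict


def group_nodes_by_class(resolved_nodes):
    """Invert a node -> classes mapping into a class -> nodes mapping."""
    groups = defaultdict(list)
    # Loop over every node, then file it under each class it belongs to.
    for node_name in sorted(resolved_nodes):
        for class_name in resolved_nodes[node_name]:
            groups[class_name].append(node_name)
    return dict(groups)


# Hypothetical inventory: node name -> all classes found in its ancestry.
resolved = {
    "quantum.example.org": ["unixnodes", "debiannodes"],
    "photon.example.org": ["unixnodes"],
}
groups = group_nodes_by_class(resolved)
```

Here 'unixnodes' ends up containing both nodes, while 'debiannodes' contains
only 'quantum.example.org'.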

Since the term 'groups' is somewhat ambiguous, it helps to start off with
a short glossary of reclass-specific terminology:

  node:         A node, usually a computer in your infrastructure
  class:        A category, tag, feature, or role that applies to a node
                Classes may be nested, i.e. there can be a class hierarchy
  application:  A specific set of behaviour to apply
  parameter:    Node-specific variables, with inheritance throughout the class
                hierarchy.

A class consists of zero or more parent classes, zero or more applications,
and any number of parameters.

A node is almost equivalent to a class, except that it usually does not (but
can) specify applications.

When reclass parses a node (or class) definition and encounters a parent
class, it recurses to this parent class first before reading any data of the
node (or class). When reclass returns from the recursive, depth-first walk, it
then merges all information of the current node (or class) into the
information it obtained during the recursion.

Furthermore, a node (or class) may define a list of classes it derives from,
in which case classes defined further down the list will be able to override
classes further up the list.

Information in this context is essentially either a list of applications or
a list of parameters.

The interaction between the depth-first walk and the delayed merging of data
means that the node (and any class) may override any of the data defined by
any of the parent classes (ancestors). This is in line with the assumption
that more specific definitions ("this specific host") should have a higher
precedence than more general definitions ("all webservers", which includes all
webservers in Munich, which includes "this specific host", for example).

Here's a quick example, showing how parameters accumulate and can get
replaced.

  All unixnodes (i.e. nodes that have the 'unixnodes' class in their ancestry)
  have /etc/motd centrally managed (through the 'motd' application), and the
  unixnodes class definition provides a generic message-of-the-day to be put
  into this file.

  All debiannodes, which are descendants of unixnodes, should include the
  Debian codename in this message, so the message-of-the-day is overwritten in
  the debiannodes class.

  The node 'quantum.example.org' will have a scheduled downtime this weekend,
  so until Monday, an appropriate message-of-the-day is added to the node
  definition.

  When the 'motd' application runs, it receives the appropriate
  message-of-the-day (from 'quantum.example.org' when run on that node) and
  writes it into /etc/motd.
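
The scenario above can be sketched in a few lines of Python. This is an
illustration under stated assumptions, not reclass code: the 'resolve'
function, the in-memory 'definitions' dictionary, the 'shell' parameter and
the message texts are all made up, and only flat (scalar) parameters are
handled.

```python
def resolve(name, definitions):
    """Depth-first walk: resolve all parents first, then apply own data."""
    entity = definitions[name]
    parameters = {}
    # Parents are visited in list order, so later parents override earlier ones.
    for parent in entity.get("classes", []):
        parameters.update(resolve(parent, definitions))
    # The entity's own parameters are merged last and therefore win.
    parameters.update(entity.get("parameters", {}))
    return parameters


# Hypothetical definitions mirroring the /etc/motd example.
definitions = {
    "unixnodes": {
        "parameters": {"motd": "Welcome.", "shell": "/bin/sh"},
    },
    "debiannodes": {
        "classes": ["unixnodes"],
        "parameters": {"motd": "Welcome to Debian."},
    },
    "quantum.example.org": {
        "classes": ["debiannodes"],
        "parameters": {"motd": "Scheduled downtime this weekend."},
    },
}

node_parameters = resolve("quantum.example.org", definitions)
```

The node's own message replaces the one from 'debiannodes', while the
untouched 'shell' parameter is still inherited from 'unixnodes'.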

At this point it should be noted that parameters whose values are lists or
key-value pairs don't get overwritten by child classes or node definitions,
but the information gets merged (recursively) instead.
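
A minimal sketch of such a recursive merge follows. It is an assumption about
the general technique rather than the actual reclass code; the 'merge'
function and the 'allow_users' parameter are hypothetical.

```python
def merge(inherited, override):
    """Merge override into inherited without modifying either argument."""
    if isinstance(inherited, dict) and isinstance(override, dict):
        merged = dict(inherited)
        for key, value in override.items():
            if key in merged:
                # Key exists on both sides: merge the two values recursively.
                merged[key] = merge(merged[key], value)
            else:
                merged[key] = value
        return merged
    if isinstance(inherited, list) and isinstance(override, list):
        # Lists are extended rather than replaced.
        return inherited + override
    # Anything else (e.g. a scalar) is replaced by the more specific value.
    return override


parent = {"ssh.server": {"permit_root_login": "no", "allow_users": ["admin"]}}
child = {"ssh.server": {"allow_users": ["backup"]}}
result = merge(parent, child)
```

The child only mentions 'allow_users', so 'permit_root_login' survives from
the parent and the two user lists are concatenated.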

Similarly to parameters, applications also accumulate during the recursive
walk through the class ancestry. It is possible for a node or child class to
_remove_ an application added by a parent class, by prefixing the application
with '~'.
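
The accumulation and removal of applications could look roughly like this.
The sketch is hypothetical: 'merge_applications' is not a reclass function,
and skipping duplicates is an assumption made here for tidiness.

```python
def merge_applications(inherited, new):
    """Append new applications; a leading '~' removes an inherited one."""
    result = list(inherited)
    for application in new:
        if application.startswith("~"):
            name = application[1:]
            # Drop the application if an ancestor added it; otherwise no-op.
            result = [existing for existing in result if existing != name]
        elif application not in result:  # assumption: no duplicate entries
            result.append(application)
    return result


from_parents = ["motd", "firewalled"]
after_child = merge_applications(from_parents, ["~firewalled", "ssh.server"])
after_node = merge_applications(after_child, ["firewalled"])
```

In this sketch the removal only affects what has been accumulated so far, so
the last call adds 'firewalled' again.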

Finally, reclass happily lets you use multiple inheritance, and ensures that
the resolution of parameters is still well-defined. Here's another example
building upon the one about /etc/motd above:

  'quantum.example.org' (which is back up and therefore its node definition no
  longer contains a message-of-the-day) is at a site in Munich. Therefore, it
  is a child of the class 'hosted@munich'. This class is independent of the
  'unixnodes' hierarchy; 'quantum.example.org' derives from both.

  In this example infrastructure, 'hosted@munich' is more specific than
  'debiannodes' because there are plenty of Debian nodes at other sites (and
  some non-Debian nodes in Munich). Therefore, 'quantum.example.org' derives
  from 'hosted@munich' _after_ 'debiannodes'.

  When an electricity outage is expected over the weekend in Munich, the admin
  can change the message-of-the-day in the 'hosted@munich' class, and it will
  apply to all hosts in Munich.

  However, not all hosts in Munich have /etc/motd, because some of them are
  'windowsnodes'. Since the 'windowsnodes' ancestry does not specify the
  'motd' application, those hosts have access to the message-of-the-day in the
  node variables, but the message won't get used, unless, of course,
  'windowsnodes' specified a Windows-specific application to bring such
  notices to the attention of the user.

It's also trivial to ensure a certain order of class evaluation. Here's
another example:

  The 'ssh.server' class defines the 'permit_root_login' parameter to 'no'.

  The 'backuppc.client' class defines the parameter to 'without-password',
  because the BackupPC server might need to log in to the host as root.

  Now, what happens if the admin accidentally provides the following two
  classes?

    - backuppc.client
    - ssh.server

  Theoretically, this would mean 'permit_root_login' gets set to 'no'.

  However, since all 'backuppc.client' nodes need 'ssh.server' (at least in
  most setups), the class 'backuppc.client' itself derives from 'ssh.server',
  ensuring that 'ssh.server' gets parsed before 'backuppc.client'.

  When reclass returns to the node and encounters the 'ssh.server' class
  defined there, it simply skips it, as it's already been processed.
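
The effect of skipping already-processed classes can be sketched as follows.
This is an illustration only: 'class_order' and the node name 'node' are
hypothetical, and the sketch makes no attempt to detect circular inheritance.

```python
def class_order(name, definitions, seen=None):
    """Return the classes of an entity in the order they would be merged."""
    if seen is None:
        seen = []
    for parent in definitions[name].get("classes", []):
        if parent in seen:
            continue  # already processed via another class: skip it
        class_order(parent, definitions, seen)  # ancestors come first
        seen.append(parent)
    return seen


definitions = {
    "ssh.server": {"parameters": {"permit_root_login": "no"}},
    "backuppc.client": {
        "classes": ["ssh.server"],
        "parameters": {"permit_root_login": "without-password"},
    },
    # The "accidental" order from the example above.
    "node": {"classes": ["backuppc.client", "ssh.server"]},
}

order = class_order("node", definitions)

# Apply the flat parameters in merge order; later classes override earlier.
parameters = {}
for class_name in order:
    parameters.update(definitions[class_name].get("parameters", {}))
```

Because 'ssh.server' is pulled in through 'backuppc.client' first, the value
'without-password' wins despite the order given in the node definition.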

reclass operations
~~~~~~~~~~~~~~~~~~
While reclass has been built to support different storage backends through
plugins, currently only the 'yaml_fs' storage backend exists. This is a very
simple, yet powerful, YAML-based backend, using flat files on the filesystem
(as suggested by the _fs suffix).

yaml_fs works with two directories, one for node definitions, and another for
class definitions. It is possible to use a single directory for both, but that
could get messy and is therefore not recommended.

Files in those directories are YAML files, specifying key-value pairs. The
following three keys are read by reclass:

  classes:      a list of parent classes
  applications: a list of applications to append to the applications defined by
                ancestors. If an application name starts with '~', it
                removes this application from the list, if it had already been
                added, but it does not prevent a future addition.
                E.g. '~firewalled'
  parameters:   key-value pairs to set defaults in class definitions, override
                existing data, or provide node-specific information in node
                specifications.
                By convention, parameters corresponding to an application
                should be provided as subkey-value pairs, keyed by the name of
                the application, e.g.

                applications:
                - ssh.server
                parameters:
                  ssh.server:
                    permit_root_login: no

reclass starts out reading a node definition file, obtains the list of
classes, then reads the files corresponding to these classes, recursively
reading parent classes, and finally merges the applications list and the
parameters.

Merging of parameters is done recursively, meaning that lists and dictionaries
are extended, rather than replaced. There is currently no way to remove or
overwrite an existing value.

Version control
~~~~~~~~~~~~~~~
I recommend you maintain your reclass inventory database in Git, right from
the start.

Usage
~~~~~
For information on how to use reclass directly, invoke reclass with --help and
study the output.

The three options --inventory-base-uri, --nodes-uri, and --classes-uri
together specify the location of the inventory. If the base URI is specified,
then it is prepended to the other two URIs, unless they are absolute URIs. If
these two URIs are not specified, they default to 'nodes' and 'classes'.
Therefore, if your inventory is in '/etc/reclass/nodes' and
'/etc/reclass/classes', all you need to specify is the base URI as
'/etc/reclass', which is actually the default (see reclass/defaults.py).
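
The rules for combining the three URIs can be summarised in a short sketch.
It is not taken from reclass: the function name is hypothetical, and the URIs
are assumed to be plain POSIX filesystem paths, as used by yaml_fs.

```python
import posixpath


def resolve_inventory_uris(base_uri=None, nodes_uri="nodes",
                           classes_uri="classes"):
    """Return (nodes_uri, classes_uri) with the base URI applied."""

    def resolve(uri):
        # Absolute URIs are used as-is; relative ones get the base prepended.
        if base_uri is None or posixpath.isabs(uri):
            return uri
        return posixpath.join(base_uri, uri)

    return resolve(nodes_uri), resolve(classes_uri)


defaults = resolve_inventory_uris("/etc/reclass")
mixed = resolve_inventory_uris("/etc/reclass", nodes_uri="/srv/nodes")
```

With only the base URI given, the result is '/etc/reclass/nodes' and
'/etc/reclass/classes'. An absolute nodes URI is left untouched.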

If you've installed reclass as per the above instructions, try to run it from
the source directory like this:

  $ reclass -b examples/ --inventory
  $ reclass -b examples/ --node localhost

Those data come from examples/nodes and examples/classes, and you can surely
make your own way from here.

More commonly, however, use of reclass will happen indirectly, and through
so-called adapters, e.g. /…/reclass/adapters/salt. The job of an adapter is to
translate between different invocation paradigms, provide a sane set of
default options, and massage the data from reclass into the format expected by
the automation tool in use. Please have a look at the respective README files
for these adapters.

Configuration file
~~~~~~~~~~~~~~~~~~
reclass can read some of its configuration from a file. The file is
a YAML file and simply defines key-value pairs.

The configuration file can be used to set defaults for all the options that
are otherwise configurable via the command-line interface, so please use the
--help output of reclass for reference. The command-line option '--nodes-uri'
corresponds to the key 'nodes_uri' in the configuration file. For example:

  storage_type: yaml_fs
  pretty_print: True
  output: json
  inventory_base_uri: /etc/reclass
  nodes_uri: ../nodes
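
The naming correspondence between options and keys is mechanical, as the
following sketch shows. Both helper functions are hypothetical, and the
precedence (command line overrides file defaults) is an assumption based on
the description above.

```python
def option_to_key(option):
    """Map a long command-line option to its configuration-file key."""
    return option.lstrip("-").replace("-", "_")


def effective_options(file_config, cli_options):
    """Start from the file's defaults, then apply command-line values."""
    options = dict(file_config)
    for option, value in cli_options.items():
        options[option_to_key(option)] = value
    return options


file_config = {"inventory_base_uri": "/etc/reclass", "output": "json"}
cli_options = {"--nodes-uri": "/srv/nodes", "--output": "yaml"}
options = effective_options(file_config, cli_options)
```

Here '--nodes-uri' becomes 'nodes_uri', and the command-line value for
'output' replaces the default from the file.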

reclass first looks in the current directory for the file called
'reclass-config.yml' (see reclass/defaults.py) and if no such file is found,
it looks in $HOME, then in /etc/reclass, and then "next to" the reclass script
itself, i.e. if the script is symlinked to /srv/provisioning/reclass, then
the script will try to access /srv/provisioning/reclass-config.yml.
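
This lookup order can be sketched as follows. The function names are
hypothetical and the code is not taken from reclass. Note that the script
path is deliberately not resolved through symlinks, matching the description
above.

```python
import os

CONFIG_FILE_NAME = "reclass-config.yml"


def candidate_config_paths(script_path, cwd=None, home=None):
    """List the configuration file locations in the order they are tried."""
    cwd = os.getcwd() if cwd is None else cwd
    home = os.path.expanduser("~") if home is None else home
    # "Next to" the script means the directory of the (possibly symlinked)
    # path it was invoked through, so no realpath() call here.
    directories = [cwd, home, "/etc/reclass", os.path.dirname(script_path)]
    return [os.path.join(d, CONFIG_FILE_NAME) for d in directories]


def find_config_file(script_path, exists=os.path.isfile, **kwargs):
    """Return the first existing candidate, or None if there is none."""
    for path in candidate_config_paths(script_path, **kwargs):
        if exists(path):
            return path
    return None


paths = candidate_config_paths("/srv/provisioning/reclass",
                               cwd="/tmp/work", home="/home/admin")
```

The 'exists' parameter only exists to make the sketch easy to try out
without touching the filesystem.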

Note that yaml_fs is currently the only supported storage_type, and it's the
default if you don't set it.

Adapters may implement their own lookup logic, of course, so make sure to read
their READMEs.
301
302 -- martin f. krafft <madduck@madduck.net> Thu, 04 Jul 2013 22:20:20 +0200