=============================================================
      reclass — recursive external node classification
=============================================================
reclass is © 2007–2013 martin f. krafft <madduck@madduck.net>
and available under the terms of the Artistic Licence 2.0
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

reclass is an "external node classifier" (ENC) as can be used with automation
tools, such as Puppet, Salt, and Ansible.

The purpose of an ENC is to allow a system administrator to maintain an
inventory of nodes to be managed, completely separately from the configuration
of the automation tool. Usually, the external node classifier completely
replaces the tool-specific inventory (such as site.pp for Puppet, or
/etc/ansible/hosts).

reclass allows you to define your nodes through class inheritance, while
always being able to override details of classes further up the tree. Think of
classes as feature sets, as commonalities between nodes, or as tags. Add to
that the ability to nest classes (multiple inheritance is allowed,
well-defined, and encouraged), and piece together your infrastructure from
smaller bits, eliminating redundancy and exposing all important parameters in
a single location, logically organised.

In general, the ENC fulfills two jobs:

  - it provides information about groups of nodes and group memberships
  - it gives access to node-specific information, such as variables

While reclass was born into a Puppet environment and has also been used with
Salt, the version you have in front of you is a rewrite from scratch, which
was targeted at Ansible. However, care was taken to make the code flexible
enough to allow it to be used from Salt, Puppet, and maybe even other tools as
well.

In this document, you will find an overview of the concepts of reclass, the
way it works, and how it can be tied in with Ansible.

Installation
~~~~~~~~~~~~
Before you can use reclass, you need to run make to configure the scripts to
your system. Right now, this only involves setting the full path to the
Python interpreter.

  make

If your Python interpreter is not /usr/bin/python and is also not in your
$PATH, then you need to pass that to make, e.g.

  make PYTHON=/opt/local/bin/python

Quick start — Ansible
~~~~~~~~~~~~~~~~~~~~~
The following steps should get you up and running quickly. Generally, we will
be working in /etc/ansible. However, if you are using a source-code checkout
of Ansible, you might also want to work inside the ./hacking directory
instead.

Alternatively, you can just look into ./examples/ansible of your reclass
checkout, where the following steps have already been prepared.

/…/reclass refers to the location of your reclass checkout.

  1. Symlink /…/reclass/adapters/ansible to /etc/ansible/hosts (or
     ./hacking/hosts)

  2. Copy the two directories 'nodes' and 'classes' from the example
     subdirectory in the reclass checkout to /etc/ansible

     If you prefer to put those directories elsewhere, you can create
     /etc/ansible/reclass-config.yml with contents such as

       storage_type: yaml_fs
       nodes_uri: /srv/reclass/nodes
       classes_uri: /srv/reclass/classes

     Note that yaml_fs is currently the only supported storage_type, and it's
     the default if you don't set it.

  3. Check out your inventory by invoking

       ./hosts --list

     which should return 5 groups in JSON format, each of which has exactly
     one member, 'localhost'.

  4. See the node information for 'localhost':

       ./hosts --host localhost

     This should print a set of keys and values, including a greeting,
     a colour, and a sub-class called 'RECLASS'.

  5. Execute some ansible commands, e.g.

       ansible -i hosts \* --list-hosts
       ansible -i hosts \* -m ping
       ansible -i hosts \* -m debug -a 'msg="${greeting}"'
       ansible -i hosts \* -m setup
       ansible-playbook -i hosts test.yml

  6. You can also invoke reclass directly, which gives a slightly different
     view onto the same data, i.e. before it has been adapted for Ansible:

       /…/reclass.py --pretty-print --inventory
       /…/reclass.py --pretty-print --nodeinfo localhost

reclass concepts
~~~~~~~~~~~~~~~~
reclass assumes a node-centric perspective into your inventory. This is
obvious when you query reclass for node-specific information, but it might not
be clear when you ask reclass to provide you with a list of groups. In that
case, reclass loops over all nodes it can find in its database, reads all
information it can find about the nodes, and finally reorders the result to
provide a list of groups with the nodes they contain.
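The reordering step can be pictured with a few lines of Python. This is only a sketch of the idea, not code from reclass: the helper groups_from_nodes and the node names are made up, and each node is assumed to have been resolved already into the list of classes in its ancestry.

```python
from collections import defaultdict

def groups_from_nodes(node_classes):
    """Invert a node -> classes mapping into a class -> nodes mapping."""
    groups = defaultdict(list)
    for node in sorted(node_classes):
        for cls in node_classes[node]:
            groups[cls].append(node)
    return dict(groups)

# Hypothetical inventory: each node with all classes in its ancestry.
nodes = {
    "quantum.example.org": ["unixnodes", "debiannodes", "hosted@munich"],
    "photon.example.org": ["unixnodes", "debiannodes"],
}

inventory = groups_from_nodes(nodes)
print(inventory["debiannodes"])
```

The point is that groups are never stored anywhere; they fall out of the per-node data.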

Since the term 'groups' is somewhat ambiguous, it helps to start off with
a short glossary of reclass-specific terminology:

  node:         A node, usually a computer in your infrastructure
  class:        A category, tag, feature, or role that applies to a node
                Classes may be nested, i.e. there can be a class hierarchy
  application:  A specific set of behaviour to apply to members of a class
  parameter:    Node-specific variables, with inheritance throughout the class
                hierarchy.

A class consists of zero or more parent classes, zero or more applications,
and any number of parameters.

A node is almost equivalent to a class, except that it usually does not (but
can) specify applications.

When reclass parses a node (or class) definition and encounters a parent
class, it recurses to this parent class first before reading any data of the
node (or class). When reclass returns from the recursive, depth-first walk, it
then merges all information of the current node (or class) into the
information it obtained during the recursion.

Furthermore, a node (or class) may define a list of classes it derives from,
in which case classes defined further down the list will be able to override
classes further up the list.

Information in this context is essentially either a list of applications or
a list of parameters.

The interaction between the depth-first walk and the delayed merging of data
means that the node (and any class) may override any of the data defined by
any of the parent classes (ancestors). This is in line with the assumption
that more specific definitions ("this specific host") should have a higher
precedence than more general definitions ("all webservers", which includes all
webservers in Munich, which includes "this specific host", for example).

Here's a quick example, showing how parameters accumulate and can get
replaced.

  All unixnodes (i.e. nodes that have the 'unixnodes' class in their ancestry)
  have /etc/motd centrally-managed (through the 'motd' application), and the
  unixnodes class definition provides a generic message-of-the-day to be put
  into this file.

  All debiannodes, which are descendants of unixnodes, should include the
  Debian codename in this message, so the message-of-the-day is overwritten in
  the debiannodes class.

  The node 'quantum.example.org' will have a scheduled downtime this weekend,
  so until Monday, an appropriate message-of-the-day is added to the node
  definition.

  When the 'motd' application runs, it receives the appropriate
  message-of-the-day (from 'quantum.example.org' when run on that node) and
  writes it into /etc/motd.
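To trace the motd example, here is a minimal Python sketch of the depth-first walk with delayed merging. It is an illustration under assumptions, not reclass's implementation: the resolve function, the dictionary layout, and the messages are invented, and all values are plain strings, so a simple overwrite is enough.

```python
def resolve(name, definitions, params=None):
    """Depth-first walk: ancestors are merged before the entity itself."""
    if params is None:
        params = {}
    entity = definitions[name]
    for parent in entity.get("classes", []):
        resolve(parent, definitions, params)
    # Delayed merge: the more specific definition overrides its ancestors.
    params.update(entity.get("parameters", {}))
    return params

# Hypothetical definitions, standing in for the contents of the YAML files.
definitions = {
    "unixnodes": {
        "parameters": {"motd": "Welcome!", "shell": "/bin/sh"},
    },
    "debiannodes": {
        "classes": ["unixnodes"],
        "parameters": {"motd": "Welcome to Debian wheezy!"},
    },
    "quantum.example.org": {
        "classes": ["debiannodes"],
        "parameters": {"motd": "Scheduled downtime this weekend"},
    },
}

result = resolve("quantum.example.org", definitions)
print(result["motd"])
```

Once the downtime message is removed from the node definition, the same call yields the message from 'debiannodes', while 'shell' continues to come from 'unixnodes'.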

At this point it should be noted that parameters whose values are lists or
key-value pairs don't get overwritten by child classes or node definitions,
but the information gets merged (recursively) instead.
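A recursive merge of this kind might look as follows in Python. Treat it as a sketch: the merge function is hypothetical, and the assumption that lists are concatenated (rather than, say, de-duplicated) is mine, not something this document spells out.

```python
def merge(base, new):
    """Recursively merge `new` into a copy of `base`; `new` wins for scalars."""
    if isinstance(base, dict) and isinstance(new, dict):
        merged = dict(base)
        for key, value in new.items():
            merged[key] = merge(base[key], value) if key in base else value
        return merged
    if isinstance(base, list) and isinstance(new, list):
        return base + new  # assumption: lists are simply concatenated
    return new  # scalars (and mismatched types) are overwritten

# Hypothetical parameters of a parent class and of a child.
parent = {"motd": {"greeting": "Hello", "lines": ["a"]}, "realm": "example.org"}
child = {"motd": {"greeting": "Hi", "lines": ["b"]}}

merged = merge(parent, child)
print(merged)
```

Only 'greeting' is replaced; 'lines' grows and 'realm' is inherited untouched.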

Similarly to parameters, applications also accumulate during the recursive
walk through the class ancestry. It is possible for a node or child class to
_remove_ an application added by a parent class, by prefixing the application
with '~'.
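In Python, the accumulation and removal of applications could be sketched like this. The function merge_applications is made up for illustration, and skipping duplicates is my assumption.

```python
def merge_applications(inherited, own):
    """Append `own` applications to `inherited`, honouring the '~' prefix."""
    result = list(inherited)
    for app in own:
        if app.startswith("~"):
            name = app[1:]
            if name in result:
                result.remove(name)  # only removes what was already added
        elif app not in result:      # assumption: duplicates are skipped
            result.append(app)
    return result

from_parents = ["motd", "firewalled"]
node_apps = merge_applications(from_parents, ["~firewalled", "ssh.server"])
print(node_apps)
```

Note that the removal only affects what has been collected so far; a definition merged later can add 'firewalled' again.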

Finally, reclass happily lets you use multiple inheritance, and ensures that
the resolution of parameters is still well-defined. Here's another example
building upon the one about /etc/motd above:

  'quantum.example.org' (which is back up and therefore its node definition no
  longer contains a message-of-the-day) is at a site in Munich. Therefore, it
  is a child of the class 'hosted@munich'. This class is independent of the
  'unixnodes' hierarchy; 'quantum.example.org' derives from both.

  In this example infrastructure, 'hosted@munich' is more specific than
  'debiannodes' because there are plenty of Debian nodes at other sites (and
  some non-Debian nodes in Munich). Therefore, 'quantum.example.org' derives
  from 'hosted@munich' _after_ 'debiannodes'.

  When an electricity outage is expected over the weekend in Munich, the admin
  can change the message-of-the-day in the 'hosted@munich' class, and it will
  apply to all hosts in Munich.

  However, not all hosts in Munich have /etc/motd, because some of them are
  'windowsnodes'. Since the 'windowsnodes' ancestry does not specify the
  'motd' application, those hosts have access to the message-of-the-day in the
  node variables, but the message won't get used…

  … unless, of course, 'windowsnodes' specified a Windows-specific application
  to bring such notices to the attention of the user.

It's also trivial to ensure a certain order of class evaluation. Here's
another example:

  The 'ssh.server' class sets the 'permit_root_login' parameter to 'no'.

  The 'backuppc.client' class sets the parameter to 'without-password',
  because the BackupPC server might need to log in to the host as root.

  Now, what happens if the admin accidentally provides the following two
  classes?

    - backuppc.client
    - ssh.server

  Theoretically, this would mean 'permit_root_login' gets set to 'no', because
  'ssh.server' comes last in the list and overrides 'backuppc.client'.

  However, since all 'backuppc.client' nodes need 'ssh.server' (at least in
  most setups), the class 'backuppc.client' itself derives from 'ssh.server',
  ensuring that 'ssh.server' gets parsed before 'backuppc.client'.

  When reclass returns to the node and encounters the 'ssh.server' class
  defined there, it simply skips it, as it's already been processed.
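The effect of skipping classes that have already been processed can be reproduced with a small Python sketch. The class_order function and the data layout are hypothetical and only mimic the behaviour described above; cycles in the class hierarchy are not handled.

```python
def class_order(name, definitions, seen=None):
    """Return the ancestors of `name` in merge order (parents first)."""
    if seen is None:
        seen = []
    for parent in definitions.get(name, {}).get("classes", []):
        if parent in seen:
            continue  # already processed elsewhere in the walk: skip it
        class_order(parent, definitions, seen)
        seen.append(parent)
    return seen

# Hypothetical definitions for the example above.
definitions = {
    "ssh.server": {"parameters": {"permit_root_login": "no"}},
    "backuppc.client": {
        "classes": ["ssh.server"],
        "parameters": {"permit_root_login": "without-password"},
    },
    "node": {"classes": ["backuppc.client", "ssh.server"]},
}

order = class_order("node", definitions)
params = {}
for cls in order:
    params.update(definitions[cls].get("parameters", {}))
print(order, params)
```

Because 'ssh.server' is merged as a parent of 'backuppc.client', its second mention in the node's list has no effect and 'without-password' wins.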

reclass operations
~~~~~~~~~~~~~~~~~~
While reclass has been built to support different storage backends through
plugins, currently only the 'yaml_fs' storage backend exists. This is a very
simple, yet powerful, YAML-based backend, using flat files on the filesystem
(as suggested by the _fs postfix).

yaml_fs works with two directories, one for node definitions, and another for
class definitions. It is possible to use a single directory for both, but that
could get messy and is therefore not recommended.

Files in those directories are YAML-files, specifying key-value pairs. The
following three keys are read by reclass:

  classes:      a list of parent classes
  applications: a list of applications to append to the applications defined by
                ancestors. If an application name starts with '~', it would
                remove this application from the list, if it had already been
                added, but it does not prevent a future addition.
                E.g. '~firewalled'
  parameters:   key-value pairs to set defaults in class definitions, override
                existing data, or provide node-specific information in node
                specifications.
                By convention, parameters corresponding to an application
                should be provided as subkey-value pairs, keyed by the name of
                the application, e.g.

                applications:
                - ssh.server
                parameters:
                  ssh.server:
                    permit_root_login: no

reclass starts out reading a node definition file, obtains the list of
classes, then reads the files corresponding to these classes, recursively
reading parent classes, and finally merges the applications list (appending
each entry, unless it is prefixed with '~', in which case it is removed) and
the parameters (merging recursively).

Version control
~~~~~~~~~~~~~~~
I recommend you maintain your reclass inventory database in Git, right from
the start.

Usage
~~~~~
For information on how to use reclass directly, invoke reclass.py with --help
and study the output.

More commonly, however, use of reclass will happen indirectly, and through
so-called adapters, e.g. /…/reclass/adapters/ansible. The job of an adapter is
to translate between different invocation paradigms, provide a sane set of
default options, and massage the data from reclass into the format expected by
the automation tool in use.

Configuration file
~~~~~~~~~~~~~~~~~~
reclass can read some of its configuration from a file. The file is
a YAML-file and simply defines key-value pairs.

The configuration file can be used to set defaults for all the options that
are otherwise configurable via the command-line interface, so please use the
--help output of reclass for reference. Each key is the name of the
command-line option without the leading dashes and with the remaining dashes
replaced by underscores, so '--nodes-uri' corresponds to the key 'nodes_uri'
in the configuration file. For example:

  storage_type: yaml_fs
  pretty_print: True
  output: json
  nodes_uri: ../nodes

reclass first looks in the current directory for the file called
'reclass-config.yml' and if no such file is found, it looks "next to" the
reclass script itself. Adapters implement their own lookup logic.
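The lookup order just described can be sketched in Python as follows. The function find_config and its arguments are invented for this illustration and say nothing about how reclass or its adapters actually implement the search; the example paths are placeholders.

```python
import os

CONFIG_NAME = "reclass-config.yml"

def find_config(script_path, cwd=None, exists=os.path.isfile):
    """Return the first existing config file, or None if there is none."""
    cwd = os.getcwd() if cwd is None else cwd
    script_dir = os.path.dirname(os.path.abspath(script_path))
    candidates = [
        os.path.join(cwd, CONFIG_NAME),         # 1. the current directory
        os.path.join(script_dir, CONFIG_NAME),  # 2. "next to" the script
    ]
    for path in candidates:
        if exists(path):
            return path
    return None

# Example with made-up paths; `exists` is injectable so that the lookup can
# be tried out without touching the filesystem.
print(find_config("/opt/reclass/reclass.py", cwd="/etc/ansible"))
```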

Integration with Ansible
~~~~~~~~~~~~~~~~~~~~~~~~
The integration between reclass and Ansible is performed through an adapter,
and need not concern us too much.

However, Ansible has no concept of "nodes", "applications", "parameters", and
"classes". Therefore it is necessary to explain how those correspond to
Ansible. Crudely, the following mapping exists:

  nodes         hosts
  classes       groups
  applications  playbooks
  parameters    host_vars

reclass does not provide any group_vars because of its node-centric
perspective. While class definitions include parameters, those are inherited
by the node definitions and hence become host_vars.

reclass also does not provide playbooks, nor does it deal with any of the
related Ansible concepts, i.e. vars_files, vars, tasks, handlers, roles, etc.

  Let it be said at this point that you'll probably want to stop using
  host_vars, group_vars and vars_files altogether, not only because you
  should no longer need them, but also because the variable precedence rules
  of Ansible are full of surprises, at least to me.

reclass' Ansible adapter massages the reclass output into Ansible-usable data,
namely:

  - Every class in the ancestry of a node becomes a group to Ansible. This is
    mainly useful to be able to target nodes during interactive use of
    Ansible, e.g.

      ansible debiannode@wheezy -m command -a 'apt-get upgrade'
        → upgrade all Debian nodes running wheezy

      ansible ssh.server -m command -a 'invoke-rc.d ssh restart'
        → restart all SSH server processes

      ansible mailserver -m command -a 'tail -n1000 /var/log/mail.err'
        → obtain the last 1,000 lines of all mailserver error log files

    The attentive reader might stumble over the use of singular words, whereas
    it might make more sense to address all 'mailserver*s*' with this tool.
    This is convention and up to you. I prefer to think of my node as
    a (singular) mailserver when I add 'mailserver' to its parent classes.

  - Every entry in the list of a host's applications might well correspond to
    an Ansible playbook. Therefore, reclass creates an (Ansible-)group for
    every application, and adds '_hosts' to the name. This postfix can be
    configured with a CLI option (--applications-postfix) or in the
    configuration file (applications_postfix).

    For instance, the ssh.server class adds the ssh.server application to
    a node's application list. Now the admin might create an Ansible playbook
    like so:

      - name: SSH server management
        hosts: ssh.server_hosts              # ← SEE HERE
        tasks:
          - name: install SSH package
            action: …

    There's a bit of redundancy in this, but unfortunately Ansible playbooks
    hardcode the nodes to which a playbook applies.

    It's now trivial to apply this playbook across your infrastructure:

      ansible-playbook ssh.server.yml

    My suggested way to use Ansible site-wide is then to create a 'site'
    playbook that includes all the other playbooks (which shall hopefully be
    based on Ansible roles), and then to invoke Ansible like this:

      ansible-playbook site.yml

    or, if you prefer only to reconfigure a subset of nodes, e.g. all
    webservers:

      ansible-playbook site.yml --limit webserver

    Again, if the singular word 'webserver' puts you off, change the
    convention as you wish.

    And if anyone comes up with a way to directly connect groups in the
    inventory with roles, thereby making it unnecessary to write playbook
    files (containing redundant information), please tell me!

  - Parameters corresponding to a node become host_vars for that host.
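As a rough picture of what the adapter produces, the following Python sketch applies the three rules above to already-resolved node data. The function names and the input layout are assumptions made for this example and are not taken from the adapter's source; the only grounded details are the group-per-class rule, the '_hosts' postfix, and parameters becoming host variables.

```python
import json

def ansible_groups(nodes, applications_postfix="_hosts"):
    """Build the group -> hosts mapping that `--list` would print."""
    groups = {}
    for name in sorted(nodes):
        info = nodes[name]
        for cls in info["classes"]:
            groups.setdefault(cls, []).append(name)
        for app in info["applications"]:
            groups.setdefault(app + applications_postfix, []).append(name)
    return groups

def ansible_host_vars(nodes, name):
    """Return the variables that `--host <name>` would print."""
    return nodes[name]["parameters"]

# Hypothetical, already-resolved node data.
nodes = {
    "quantum.example.org": {
        "classes": ["debiannodes", "ssh.server"],
        "applications": ["ssh.server"],
        "parameters": {"ssh.server": {"permit_root_login": "no"}},
    },
}

groups = ansible_groups(nodes)
host_vars = ansible_host_vars(nodes, "quantum.example.org")
print(json.dumps(groups, indent=2))
```

The class 'ssh.server' and the application group 'ssh.server_hosts' end up as two separate groups, which is why the playbook above targets the latter.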

It is possible to include Jinja2-style variables in parameter values, like
you would in Ansible. This is especially powerful in combination with the
recursive merging, e.g.

  parameters:
    motd:
      greeting: Welcome to {{ ansible_fqdn }}!
      closing: This system is part of {{ realm }}

Now you just need to specify realm somewhere. The reference can reside in
a parent class, while the variable is defined e.g. in the node.

Contributing to reclass
~~~~~~~~~~~~~~~~~~~~~~~
Contributions to reclass are very welcome. Since I prefer to keep a somewhat
clean history, I will not merge pull requests. Please send your patches using
git-format-patch and git-send-email to reclass@pobox.madduck.net.

I have added rudimentary unit tests, and it would be nice if you could submit
your changes with appropriate changes to the tests. To run tests, invoke
./run_tests.py in the top-level checkout directory.

If you have larger ideas, I look forward to discussing them with you.

 -- martin f. krafft <madduck@madduck.net>  Fri, 14 Jun 2013 22:12:05 +0200