=============================================================
    reclass - recursive external node classification
=============================================================
reclass is © 2007–2013 martin f. krafft <madduck@madduck.net>
and available under the terms of the Artistic Licence 2.0
=============================================================

reclass is an "external node classifier" (ENC) as can be used with automation
tools, such as Puppet, Salt, and Ansible.

The purpose of an ENC is to allow a system administrator to maintain an
inventory of nodes to be managed, completely separately from the configuration
of the automation tool. Usually, the external node classifier completely
replaces the tool-specific inventory (such as site.pp for Puppet, or
/etc/ansible/hosts).

In general, the ENC fulfills two jobs:

  - it provides information about groups of nodes and group memberships
  - it gives access to node-specific information, such as variables

While reclass was born into a Puppet environment and has also been used with
Salt, the version you have in front of you is a rewrite from scratch, which
was targeted at Ansible. However, care was taken to make the code flexible
enough to allow it to be used from Salt, Puppet, and maybe even other tools as
well.

In this document, you will find an overview of the concepts of reclass, the
way it works, and how it can be tied in with Ansible.

Quick start - Ansible
~~~~~~~~~~~~~~~~~~~~~
The following steps should get you up and running quickly. Generally, we will
be working in /etc/ansible. However, if you are using a source-code checkout
of Ansible, you might also want to work inside the ./hacking directory
instead.

Alternatively, just look into ./examples/ansible of your reclass checkout,
where the following steps have already been prepared.

/…/reclass refers to the location of your reclass checkout.

  1. Symlink /…/reclass/adapters/ansible to /etc/ansible/hosts (or
     ./hacking/hosts)

  2. Copy the two directories 'nodes' and 'classes' from the example
     subdirectory in the reclass checkout to /etc/ansible

     If you prefer to put those directories elsewhere, you can create
     /etc/ansible/reclass-config.yml with contents such as

       storage_type: yaml_fs
       nodes_uri: /srv/reclass/nodes
       classes_uri: /srv/reclass/classes

     Note that yaml_fs is currently the only supported storage_type, and it's
     the default if you don't set it.

  3. Check out your inventory by invoking

       ./hosts --list

     which should return 5 groups in JSON format, each of which has exactly
     one member, 'localhost'.

  4. See the node information for 'localhost':

       ./hosts --host localhost

     This should print a set of keys and values, including a greeting,
     a colour, and a sub-class called 'RECLASS'.

  5. Execute some ansible commands, e.g.

       ansible -i hosts \* --list-hosts
       ansible -i hosts \* -m ping
       ansible -i hosts \* -m debug -a 'msg="${greeting}"'
       ansible -i hosts \* -m setup
       ansible-playbook -i hosts test.yml

  6. You can also invoke reclass directly, which gives a slightly different
     view onto the same data, i.e. before it has been adapted for Ansible:

       /…/reclass.py --pretty-print --inventory
       /…/reclass.py --pretty-print --nodeinfo localhost

reclass concepts
~~~~~~~~~~~~~~~~
reclass assumes a node-centric perspective on your inventory. This is
obvious when you query reclass for node-specific information, but it might not
be clear when you ask reclass to provide you with a list of groups. In that
case, reclass loops over all nodes it can find in its database, reads all
information it can find about the nodes, and finally reorders the result to
provide a list of groups with the nodes they contain.
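
The reordering step can be pictured with a small Python sketch. It is not
taken from the reclass code base; the function name and the sample data are
made up for illustration, and it assumes the classes of every node have
already been collected.

```python
from collections import defaultdict


def invert_inventory(node_classes):
    """Turn a node -> classes mapping into a class -> sorted nodes mapping."""
    groups = defaultdict(list)
    for node, classes in node_classes.items():
        for cls in classes:
            groups[cls].append(node)
    return {cls: sorted(members) for cls, members in groups.items()}


# Hypothetical data: each node with the classes found in its ancestry.
nodes = {
    "quantum.example.org": ["unixnodes", "debiannodes", "hosted@munich"],
    "photon.example.org": ["unixnodes", "debiannodes"],
}
groups = invert_inventory(nodes)
print(groups["debiannodes"])
```

The point of the sketch is only that groups are derived from nodes, never the
other way around.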

Since the term 'groups' is somewhat ambiguous, it helps to start off with
a short glossary of reclass-specific terminology:

  node:         A node, usually a computer in your infrastructure
  class:        A category, tag, feature, or role that applies to a node
                Classes may be nested, i.e. there can be a class hierarchy
  application:  A specific set of behaviour to apply to members of a class
  parameter:    Node-specific variables, with inheritance throughout the class
                hierarchy.

A class consists of zero or more parent classes, zero or more applications,
and any number of parameters.

A node is almost equivalent to a class, except that it usually does not (but
can) specify applications.

When reclass parses a node (or class) definition and encounters a parent
class, it recurses to this parent class first before reading any data of the
node (or class). When reclass returns from the recursive, depth-first walk, it
then merges all information of the current node (or class) into the
information it obtained during the recursion.

Information in this context is essentially either a list of applications or
a list of parameters.

The interaction between the depth-first walk and the delayed merging of data
means that the node (and any class) may override any of the data defined by
any of the parent classes (ancestors). This is in line with the assumption
that more specific definitions ("this specific host") should have a higher
precedence than more general definitions ("all webservers", which includes all
webservers in Munich, which includes "this specific host", for example).
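
A minimal Python sketch of this walk follows. It is an illustration under
stated assumptions, not the reclass implementation: the definitions live in
a hypothetical in-memory dictionary instead of a storage backend, the
function name is invented, and only scalar parameters are handled.

```python
# Hypothetical in-memory definitions; reclass itself reads these from storage.
DEFINITIONS = {
    "unixnodes": {"classes": [], "parameters": {"motd": "Welcome"}},
    "debiannodes": {"classes": ["unixnodes"],
                    "parameters": {"motd": "Welcome to Debian wheezy"}},
    "hosted@munich": {"classes": [], "parameters": {"site": "Munich"}},
    "quantum.example.org": {"classes": ["debiannodes", "hosted@munich"],
                            "parameters": {"motd": "Downtime this weekend"}},
}


def resolve_parameters(name, definitions):
    """Depth-first walk: merge all parents first, then the entity's own data."""
    entity = definitions[name]
    merged = {}
    for parent in entity.get("classes", []):
        # Recurse into the parent before reading the entity's own parameters.
        merged.update(resolve_parameters(parent, definitions))
    # Own data is merged last, so it overrides anything from the ancestors.
    merged.update(entity.get("parameters", {}))
    return merged


params = resolve_parameters("quantum.example.org", DEFINITIONS)
print(params)
```

Because parents are processed in the order listed, a parent named later
overrides one named earlier, and the node's own data overrides both.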

Here's a quick example, showing how parameters accumulate and can get
replaced.

  All unixnodes (i.e. nodes that have the 'unixnodes' class in their ancestry)
  have /etc/motd centrally-managed (through the 'motd' application), and the
  unixnodes class definition provides a generic message-of-the-day to be put
  into this file.

  All debiannodes, which are descendants of unixnodes, should include the
  Debian codename in this message, so the message-of-the-day is overwritten in
  the debiannodes class.

  The node 'quantum.example.org' will have a scheduled downtime this weekend,
  so until Monday, an appropriate message-of-the-day is added to the node
  definition.

  When the 'motd' application runs, it retrieves the appropriate
  message-of-the-day and writes it into /etc/motd.

At this point it should be noted that parameters whose values are lists or
key-value pairs don't get overwritten by child classes or node definitions,
but the information gets merged (recursively) instead.
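
The following Python sketch shows one way such a recursive merge could look.
It is an assumption-laden illustration, not the reclass code: the function
name is invented, and treating "merging" of two lists as concatenation is
a guess at the intended behaviour.

```python
import copy


def deep_merge(base, override):
    """Merge `override` into a copy of `base`.

    Key-value pairs merge recursively, lists are concatenated (an assumption),
    and any other value is replaced by the more specific one.
    """
    if isinstance(base, dict) and isinstance(override, dict):
        result = copy.deepcopy(base)
        for key, value in override.items():
            if key in result:
                result[key] = deep_merge(result[key], value)
            else:
                result[key] = copy.deepcopy(value)
        return result
    if isinstance(base, list) and isinstance(override, list):
        return copy.deepcopy(base) + copy.deepcopy(override)
    return copy.deepcopy(override)


# Hypothetical parameters of a parent class and of a child class or node.
parent = {"ssh.server": {"permit_root_login": False, "ports": [22]}}
child = {"ssh.server": {"ports": [2222]}, "motd": "hello"}
merged = deep_merge(parent, child)
print(merged)
```

Neither input is modified, which keeps a class definition reusable for the
next node that inherits from it.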

Similarly to parameters, applications also accumulate during the recursive
walk through the class ancestry. It is possible for a node or child class to
_remove_ an application added by a parent class, by prefixing the application
with '~'.
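
As a rough Python sketch of this accumulation (the function name is
hypothetical, and skipping duplicate entries is an assumption rather than
documented behaviour):

```python
def merge_applications(inherited, own):
    """Append `own` applications to `inherited`, honouring the '~' prefix."""
    result = list(inherited)
    for app in own:
        if app.startswith("~"):
            # '~name' removes 'name' if an ancestor has already added it.
            name = app[1:]
            result = [entry for entry in result if entry != name]
        elif app not in result:
            # Assumption: an application is listed at most once.
            result.append(app)
    return result


apps = merge_applications(["motd", "firewalled"], ["~firewalled", "ssh.server"])
print(apps)
```

A removal only affects what has been accumulated so far; an application
listed after its own removal is simply added again.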

Finally, reclass happily lets you use multiple inheritance, and ensures that
the resolution of parameters is still well-defined. Here's another example
building upon the one about /etc/motd above:

  'quantum.example.org' (which is back up and therefore its node definition no
  longer contains a message-of-the-day) is at a site in Munich. Therefore, it
  is a child of the class 'hosted@munich'. This class is independent of the
  'unixnodes' hierarchy; 'quantum.example.org' derives from both.

  In this example infrastructure, 'hosted@munich' is more specific than
  'debiannodes' because there are plenty of Debian nodes at other sites (and
  some non-Debian nodes in Munich). Therefore, 'quantum.example.org' derives
  from 'hosted@munich' _after_ 'debiannodes'.

  When an electricity outage is expected over the weekend in Munich, the admin
  can change the message-of-the-day in the 'hosted@munich' class, and it will
  apply to all hosts in Munich.

  However, not all hosts in Munich have /etc/motd, because some of them are
  'windowsnodes'. Since the 'windowsnodes' ancestry does not specify the
  'motd' application, those hosts have access to the message-of-the-day in the
  node variables, but the message won't get used unless, of course,
  'windowsnodes' specifies a Windows-specific application to bring such
  notices to the attention of the user.

reclass operations
~~~~~~~~~~~~~~~~~~
While reclass has been built to support different storage backends through
plugins, currently only the 'yaml_fs' storage backend exists. This is a very
simple, yet powerful, YAML-based backend, using flat files on the filesystem
(as suggested by the '_fs' suffix).

yaml_fs works with two directories, one for node definitions, and another for
class definitions. It is possible to use a single directory for both, but that
could get messy and is therefore not recommended.

Files in those directories are YAML files, specifying key-value pairs. The
following three keys are read by reclass:

  classes:      a list of parent classes
  applications: a list of applications to append to the applications defined
                by ancestors. If an application name starts with '~', that
                application is removed from the list if it has already been
                added; this does not prevent a later addition.
                E.g. '~firewalled'
  parameters:   key-value pairs to set defaults in class definitions, override
                existing data, or provide node-specific information in node
                specifications.
                By convention, parameters corresponding to an application
                should be provided as subkey-value pairs, keyed by the name of
                the application, e.g.

                  applications:
                  - ssh.server
                  parameters:
                    ssh.server:
                      permit_root_login: no

reclass starts out reading a node definition file, obtains the list of
classes, then reads the files corresponding to these classes, recursively
reading parent classes, and finally merges the applications list (appending
entries unless they are prefixed with '~', in which case they are removed)
and the parameters (merged recursively, as described above).

Usage
~~~~~
For information on how to use reclass directly, invoke reclass.py with --help
and study the output.

More commonly, however, use of reclass will happen indirectly, and through
so-called adapters, e.g. /…/reclass/adapters/ansible. The job of an adapter is
to translate between different invocation paradigms, provide a sane set of
default options, and massage the data from reclass into the format expected by
the automation tool in use.

Configuration file
~~~~~~~~~~~~~~~~~~
reclass can read some of its configuration from a file. The file is
a YAML file and simply defines key-value pairs.

The configuration file can be used to set defaults for all the options that
are otherwise configurable via the command-line interface, so please use the
--help output of reclass for reference. The command-line option '--nodes-uri'
corresponds to the key 'nodes_uri' in the configuration file. For example:

  storage_type: yaml_fs
  pretty_print: True
  output: json
  nodes_uri: ../nodes

reclass first looks in the current directory for the file called
'reclass-config.yml' and if no such file is found, it looks "next to" the
reclass script itself. Adapters implement their own lookup logic.
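
The lookup order and the option-to-key correspondence can be sketched in
Python as follows. Both function names are invented for this example, and
details such as how the script location is determined are assumptions, not
a description of the actual reclass code.

```python
import os
import sys

CONFIG_FILE_NAME = "reclass-config.yml"


def find_config_file(cwd=None, script_path=None):
    """Return the first existing config file path, or None if there is none."""
    cwd = cwd if cwd is not None else os.getcwd()
    script_path = script_path if script_path is not None else sys.argv[0]
    script_dir = os.path.dirname(os.path.abspath(script_path))
    # Current directory first, then "next to" the script.
    for directory in (cwd, script_dir):
        candidate = os.path.join(directory, CONFIG_FILE_NAME)
        if os.path.isfile(candidate):
            return candidate
    return None


def option_to_key(option):
    """Map a long option such as '--nodes-uri' to the config key 'nodes_uri'."""
    return option.lstrip("-").replace("-", "_")


print(option_to_key("--nodes-uri"))
print(find_config_file())
```

An adapter with its own lookup logic would replace the list of directories
searched, not the mapping between option names and keys.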

Integration with Ansible
~~~~~~~~~~~~~~~~~~~~~~~~
The integration between reclass and Ansible is performed through an adapter,
and need not concern us too much.

However, Ansible has no concept of "nodes", "applications", "parameters", and
"classes". Therefore it is necessary to explain how those correspond to
Ansible. Crudely, the following mapping exists:

  nodes         hosts
  classes       groups
  applications  playbooks
  parameters    host_vars

reclass does not provide any group_vars because of its node-centric
perspective. While class definitions include parameters, those are inherited
by the node definitions and hence become host_vars.

reclass also does not provide playbooks, nor does it deal with any of the
related Ansible concepts, i.e. vars_files, vars, tasks, handlers, roles, etc.

  Let it be said at this point that you'll probably want to stop using
  host_vars, group_vars and vars_files altogether, if only because you
  should no longer need them, but also because the variable precedence rules
  of Ansible are full of surprises, at least to me.

reclass' Ansible adapter massages the reclass output into Ansible-usable data,
namely:

  - Every class in the ancestry of a node becomes a group to Ansible. This is
    mainly useful to be able to target nodes during interactive use of
    Ansible, e.g.

      ansible debiannode@wheezy -m command -a 'apt-get upgrade'
        -> upgrade all Debian nodes running wheezy

      ansible ssh.server -m command -a 'invoke-rc.d ssh restart'
        -> restart all SSH server processes

      ansible mailserver -m command -a 'tail -n1000 /var/log/mail.err'
        -> obtain the last 1,000 lines of all mailserver error log files

    The attentive reader might stumble over the use of singular words, whereas
    it might make more sense to address all 'mailserver*s*' with this tool.
    This is convention and up to you. I prefer to think of my node as
    a (singular) mailserver when I add 'mailserver' to its parent classes.

  - Every entry in the list of a host's applications might well correspond to
    an Ansible playbook. Therefore, reclass creates an (Ansible) group for
    every application, and adds '_hosts' to the name.

    For instance, the ssh.server class adds the ssh.server application to
    a node's application list. Now the admin might create an Ansible playbook
    like so:

      - name: SSH server management
        hosts: ssh.server_hosts              # <-- SEE HERE
        tasks:
          - name: install SSH package
            action:

    There's a bit of redundancy in this, but unfortunately Ansible playbooks
    hardcode the nodes to which a playbook applies.

    The suggested way to use Ansible site-wide is then to create a 'site'
    playbook that includes all the other playbooks (which shall hopefully be
    based on Ansible roles), and then to invoke Ansible like this:

      ansible-playbook site.yml

    or, if you prefer only to reconfigure a subset of nodes, e.g. all
    webservers:

      ansible-playbook site.yml --limit webserver

    Again, if the singular word 'webserver' puts you off, change the
    convention as you wish.

  - Parameters corresponding to a node become host_vars for that host.
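
The three rules above can be summarised in a short Python sketch. This is
not the adapter's code: the function names and the sample node are
hypothetical, and the plain "group name to list of hosts" layout is just one
shape an Ansible inventory script may return, so the adapter's real output
may well differ in detail.

```python
import json


def to_ansible_inventory(nodes):
    """Build a group -> hosts mapping from per-node classes and applications."""
    groups = {}
    for host, info in sorted(nodes.items()):
        for cls in info.get("classes", []):
            # Every class in the node's ancestry becomes a group.
            groups.setdefault(cls, []).append(host)
        for app in info.get("applications", []):
            # Every application becomes a group with '_hosts' appended.
            groups.setdefault(app + "_hosts", []).append(host)
    return groups


def host_vars(nodes, host):
    """Return the parameters of one node, i.e. the variables for that host."""
    return dict(nodes[host].get("parameters", {}))


# Hypothetical, already-resolved node information.
nodes = {
    "quantum.example.org": {
        "classes": ["debiannode@wheezy", "ssh.server"],
        "applications": ["ssh.server"],
        "parameters": {"ssh.server": {"permit_root_login": False}},
    },
}
inventory = to_ansible_inventory(nodes)
print(json.dumps(inventory, indent=2, sort_keys=True))
print(json.dumps(host_vars(nodes, "quantum.example.org")))
```

In this sketch the first function plays the role of './hosts --list' and the
second that of './hosts --host <name>'.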

Contributing to reclass
~~~~~~~~~~~~~~~~~~~~~~~
Contributions to reclass are very welcome. Since I prefer to keep a somewhat
clean history, I will not merge pull requests. Please send your patches using
git-format-patch and git-send-email to reclass@pobox.madduck.net.

I have added rudimentary unit tests, and it would be nice if you could submit
your changes with appropriate changes to the tests. To run tests, invoke
./run_tests.py in the top-level checkout directory.

If you have larger ideas, I look forward to discussing them with you.

 -- martin f. krafft <madduck@madduck.net>  Fri, 14 Jun 2013 19:30:19 +0200
343 -- martin f. krafft <madduck@madduck.net> Fri, 14 Jun 2013 19:30:19 +0200