=============================================================
      reclass - recursive external node classification
=============================================================
reclass is © 2007-2013 martin f. krafft <madduck@madduck.net>
and available under the terms of the Artistic Licence 2.0
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

reclass is an "external node classifier" (ENC) as can be used with automation
tools, such as Puppet, Salt, and Ansible.

The purpose of an ENC is to allow a system administrator to maintain an
inventory of nodes to be managed, completely separately from the configuration
of the automation tool. Usually, the external node classifier completely
replaces the tool-specific inventory (such as site.pp for Puppet,
ext_pillar/master_tops for Salt, or /etc/ansible/hosts).

reclass allows you to define your nodes through class inheritance, while
always being able to override details of classes further up the tree. Think of
classes as feature sets, as commonalities between nodes, or as tags. Add to
that the ability to nest classes (multiple inheritance is allowed,
well-defined, and encouraged), and you can piece together your infrastructure
from smaller bits, eliminating redundancy and exposing all important
parameters in a single, logically organised location.

In general, the ENC fulfills two jobs:

  - it provides information about groups of nodes and group memberships
  - it gives access to node-specific information, such as variables

In this document, you will find an overview of the concepts of reclass and the
way it works. Have a look at README.Salt and README.Ansible for information
about integration of reclass with these tools.

Installation
~~~~~~~~~~~~
Before you can use reclass, you need to install it into a place where Python
can find it. Unless you installed a package from your distribution, the
following step should do:

  python setup.py install

This will install the package to /usr/local, which is likely in your Python
path. You can check this using

  python -c 'import sys; print(sys.path)'

If you want to install to a different location, use --prefix like so:

  python setup.py install --prefix=/opt/local

More options can be found in the output of

  python setup.py install --help
  python setup.py --help
  python setup.py --help-commands
  python setup.py --help [cmd]

If you just want to run reclass from source, e.g. because you are going to be
making and testing changes, install it in "development mode":

  python setup.py develop

To uninstall a development-mode installation:

  python setup.py develop --uninstall

Unfortunately, uninstallation currently isn't possible for packages installed
to /usr/local as per the above method: http://bugs.python.org/issue4673.
The following should do instead:

  rm -r /usr/local/lib/python*/dist-packages/reclass* /usr/local/bin/reclass

reclass concepts
~~~~~~~~~~~~~~~~
reclass assumes a node-centric perspective into your inventory. This is
obvious when you query reclass for node-specific information, but it might not
be clear when you ask reclass to provide you with a list of groups. In that
case, reclass loops over all nodes it can find in its database, reads all
information it can find about the nodes, and finally reorders the result to
provide a list of groups with the nodes they contain.

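To make the "reordering" step above concrete, here is a minimal sketch of how
a per-node view can be turned into a per-group view. This is not the code
reclass uses; the function name groups_from_nodes and the sample inventory
are made up for illustration.

```python
from collections import defaultdict


def groups_from_nodes(node_classes):
    """Invert a node -> classes mapping into a class -> nodes mapping."""
    groups = defaultdict(list)
    # Iterate in sorted order so that the node lists come out deterministic.
    for node, classes in sorted(node_classes.items()):
        for cls in classes:
            groups[cls].append(node)
    return dict(groups)


# Hypothetical inventory: each node with the classes found in its ancestry.
inventory = {
    "quantum.example.org": ["unixnodes", "debiannodes"],
    "photon.example.org": ["unixnodes"],
}
groups = groups_from_nodes(inventory)
```

The point of the sketch is only that groups are derived from nodes, and never
the other way around.
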
Since the term 'groups' is somewhat ambiguous, it helps to start off with
a short glossary of reclass-specific terminology:

  node:         A node, usually a computer in your infrastructure
  class:        A category, tag, feature, or role that applies to a node
                Classes may be nested, i.e. there can be a class hierarchy
  application:  A specific set of behaviour to apply to members of a class
  parameter:    Node-specific variables, with inheritance throughout the class
                hierarchy.

A class consists of zero or more parent classes, zero or more applications,
and any number of parameters.

A node is almost equivalent to a class, except that it usually does not (but
can) specify applications.

When reclass parses a node (or class) definition and encounters a parent
class, it recurses to this parent class first before reading any data of the
node (or class). When reclass returns from the recursive, depth-first walk, it
then merges all information of the current node (or class) into the
information it obtained during the recursion.

Furthermore, a node (or class) may define a list of classes it derives from,
in which case classes defined further down the list will be able to override
classes further up the list.

Information in this context is essentially either a list of applications or
a list of parameters.

The interaction between the depth-first walk and the delayed merging of data
means that the node (and any class) may override any of the data defined by
any of the parent classes (ancestors). This is in line with the assumption
that more specific definitions ("this specific host") should have a higher
precedence than more general definitions ("all webservers", which includes all
webservers in Munich, which includes "this specific host", for example).

Here's a quick example, showing how parameters accumulate and can get
replaced.

  All unixnodes (i.e. nodes who have the 'unixnodes' class in their ancestry)
  have /etc/motd centrally-managed (through the 'motd' application), and the
  unixnodes class definition provides a generic message-of-the-day to be put
  into this file.

  All debiannodes, which are descendants of unixnodes, should include the
  Debian codename in this message, so the message-of-the-day is overwritten in
  the debiannodes class.

  The node 'quantum.example.org' will have a scheduled downtime this weekend,
  so until Monday, an appropriate message-of-the-day is added to the node
  definition.

  When the 'motd' application runs, it receives the appropriate
  message-of-the-day (from 'quantum.example.org' when run on that node) and
  writes it into /etc/motd.

At this point it should be noted that parameters whose values are lists or
key-value pairs don't get overwritten by children classes or node definitions,
but the information gets merged (recursively) instead.

Similarly to parameters, applications also accumulate during the recursive
walk through the class ancestry. It is possible for a node or child class to
_remove_ an application added by a parent class, by prefixing the application
with '~'.

Finally, reclass happily lets you use multiple inheritance, and ensures that
the resolution of parameters is still well-defined. Here's another example
building upon the one about /etc/motd above:

  'quantum.example.org' (which is back up and therefore its node definition no
  longer contains a message-of-the-day) is at a site in Munich. Therefore, it
  is a child of the class 'hosted@munich'. This class is independent of the
  'unixnodes' hierarchy; 'quantum.example.org' derives from both.

  In this example infrastructure, 'hosted@munich' is more specific than
  'debiannodes' because there are plenty of Debian nodes at other sites (and
  some non-Debian nodes in Munich). Therefore, 'quantum.example.org' derives
  from 'hosted@munich' _after_ 'debiannodes'.

  When an electricity outage is expected over the weekend in Munich, the admin
  can change the message-of-the-day in the 'hosted@munich' class, and it will
  apply to all hosts in Munich.

  However, not all hosts in Munich have /etc/motd, because some of them are
  'windowsnodes'. Since the 'windowsnodes' ancestry does not specify the
  'motd' application, those hosts have access to the message-of-the-day in the
  node variables, but the message won't get used ...

  ... unless, of course, 'windowsnodes' specified a Windows-specific
  application to bring such notices to the attention of the user.

It's also trivial to ensure a certain order of class evaluation. Here's
another example:

  The 'ssh.server' class defines the 'permit_root_login' parameter to 'no'.

  The 'backuppc.client' class defines the parameter to 'without-password',
  because the BackupPC server might need to log in to the host as root.

  Now, what happens if the admin accidentally lists the following two classes
  for a node, in this order?

    - backuppc.client
    - ssh.server

  Taken at face value, 'ssh.server' would be evaluated last, which would mean
  'permit_root_login' gets set to 'no'.

  However, since all 'backuppc.client' nodes need 'ssh.server' (at least in
  most setups), the class 'backuppc.client' itself derives from 'ssh.server',
  ensuring that it gets parsed before 'backuppc.client'.

  When reclass returns to the node and encounters the 'ssh.server' class
  defined there, it simply skips it, as it's already been processed.

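The skipping behaviour from the last example can be sketched as follows. As
before, this is an illustration and not the reclass code: the class database
is a plain dictionary, parameters are overridden without recursive merging,
and the 'seen' and 'order' bookkeeping are helpers invented for this sketch.

```python
# The two classes from the example; 'backuppc.client' derives from
# 'ssh.server' to guarantee the evaluation order.
CLASSES = {
    "ssh.server": {"classes": [], "parameters": {"permit_root_login": "no"}},
    "backuppc.client": {
        "classes": ["ssh.server"],
        "parameters": {"permit_root_login": "without-password"},
    },
}


def resolve(entity, classes, seen, order):
    """Depth-first walk that processes every class at most once."""
    merged = {}
    for parent in entity["classes"]:
        if parent in seen:
            continue  # already processed earlier in the walk, so skip it
        seen.add(parent)
        merged.update(resolve(classes[parent], classes, seen, order))
        order.append(parent)  # record when this class finished evaluating
    merged.update(entity["parameters"])
    return merged


# The admin listed the classes in the "wrong" order on the node.
node = {"classes": ["backuppc.client", "ssh.server"], "parameters": {}}
order = []
params = resolve(node, CLASSES, set(), order)
```

Here 'ssh.server' is evaluated as a parent of 'backuppc.client' and is not
evaluated a second time when the walk reaches it in the node's own list, so
'without-password' survives.
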
reclass operations
~~~~~~~~~~~~~~~~~~
While reclass has been built to support different storage backends through
plugins, currently only the 'yaml_fs' storage backend exists. This is a very
simple, yet powerful, YAML-based backend, using flat files on the filesystem
(as suggested by the _fs suffix).

yaml_fs works with two directories, one for node definitions, and another for
class definitions. It is possible to use a single directory for both, but that
could get messy and is therefore not recommended.

Files in those directories are YAML-files, specifying key-value pairs. The
following three keys are read by reclass:

  classes:      a list of parent classes
  applications: a list of applications to append to the applications defined by
                ancestors. If an application name starts with '~', it would
                remove this application from the list, if it had already been
                added, but it does not prevent a future addition.
                E.g. '~firewalled'
  parameters:   key-value pairs to set defaults in class definitions, override
                existing data, or provide node-specific information in node
                specifications.
                By convention, parameters corresponding to an application
                should be provided as subkey-value pairs, keyed by the name of
                the application, e.g.

                  applications:
                  - ssh.server
                  parameters:
                    ssh.server:
                      permit_root_login: no

reclass starts out reading a node definition file, obtains the list of
classes, then reads the files corresponding to these classes, recursively
reading parent classes, and finally merges the applications list (appending
unless the name is prefixed with '~', in which case the application is
removed) and the parameters (overriding plain values, and recursively merging
lists and key-value pairs).

Version control
~~~~~~~~~~~~~~~
I recommend you maintain your reclass inventory database in Git, right from
the start.

Usage
~~~~~
For information on how to use reclass directly, invoke reclass.py with --help
and study the output.

The three options --inventory-base-uri, --nodes-uri, and --classes-uri
together specify the location of the inventory. If the base URI is specified,
then it is prepended to the other two URIs, unless they are absolute URIs. If
these two URIs are not specified, they default to 'nodes' and 'classes'.
Therefore, if your inventory is in '/etc/reclass/nodes' and
'/etc/reclass/classes', all you need to specify is the base URI as
'/etc/reclass'.

More commonly, however, use of reclass will happen indirectly, and through
so-called adapters, e.g. /…/reclass/adapters/salt. The job of an adapter is to
translate between different invocation paradigms, provide a sane set of
default options, and massage the data from reclass into the format expected by
the automation tool in use.

Configuration file
~~~~~~~~~~~~~~~~~~
reclass can read some of its configuration from a file. The file is
a YAML-file and simply defines key-value pairs.

The configuration file can be used to set defaults for all the options that
are otherwise configurable via the command-line interface, so please use the
--help output of reclass for reference. The command-line option '--nodes-uri'
corresponds to the key 'nodes_uri' in the configuration file. For example:

  storage_type: yaml_fs
  pretty_print: True
  output: json
  inventory_base_uri: /etc/reclass
  nodes_uri: ../nodes

reclass first looks in the current directory for the file called
'reclass-config.yml' and if no such file is found, it looks "next to" the
reclass script itself. Adapters implement their own lookup logic.

Contributing to reclass
~~~~~~~~~~~~~~~~~~~~~~~
Contributions to reclass are very welcome. Since I prefer to keep a somewhat
clean history, I will not just merge pull requests.

You can submit pull requests, of course, and I'll rebase them onto HEAD before
merging. Or send your patches using git-format-patch and git-send-email to
reclass@pobox.madduck.net.

I have added rudimentary unit tests, and it would be nice if you could submit
your changes with appropriate changes to the tests. To run tests, invoke
./run_tests.py in the top-level checkout directory.

If you have larger ideas, I look forward to discussing them with you.

 -- martin f. krafft <madduck@madduck.net>  Fri, 14 Jun 2013 22:12:05 +0200