=============================================================
      reclass - recursive external node classification
=============================================================
reclass is © 2007-2013 martin f. krafft <madduck@madduck.net>
and available under the terms of the Artistic Licence 2.0
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

reclass is an "external node classifier" (ENC) as can be used with automation
tools, such as Puppet, Salt, and Ansible. It is also a stand-alone tool for
merging data sources recursively.

The purpose of an ENC is to allow a system administrator to maintain an
inventory of nodes to be managed, completely separately from the configuration
of the automation tool. Usually, the external node classifier completely
replaces the tool-specific inventory (such as site.pp for Puppet,
ext_pillar/master_tops for Salt, or /etc/ansible/hosts).

reclass allows you to define your nodes through class inheritance, while
always being able to override details of classes further up the tree. Think of
classes as feature sets, as commonalities between nodes, or as tags. Add to
that the ability to nest classes (multiple inheritance is allowed,
well-defined, and encouraged), and piece together your infrastructure from
smaller bits, eliminating redundancy and exposing all important parameters in
a single location, logically organised.

In general, the ENC fulfills two jobs:

  - it provides information about groups of nodes and group memberships
  - it gives access to node-specific information, such as variables

In this document, you will find an overview of the concepts of reclass and the
way it works. Have a look at README.Salt and README.Ansible for information
about integration of reclass with these tools.

Installation
~~~~~~~~~~~~
Before you can use reclass, you need to install it into a place where Python
can find it. Unless you installed a package from your distribution, run the
following step:

  python setup.py install

This will install the package to /usr/local, which is likely in your Python
path. You can check this using

  python -c 'import sys; print(sys.path)'

If you want to install to a different location, use --prefix like so:

  python setup.py install --prefix=/opt/local

More options can be found in the output of

  python setup.py install --help
  python setup.py --help
  python setup.py --help-commands
  python setup.py --help [cmd]

If you just want to run reclass from source, e.g. because you are going to be
making and testing changes, install it in "development mode":

  python setup.py develop

To uninstall the development-mode installation:

  python setup.py develop --uninstall

Uninstallation currently isn't possible for packages installed to /usr/local
as per the above method, unfortunately: http://bugs.python.org/issue4673.
The following should do:

  rm -r /usr/local/lib/python*/dist-packages/reclass* /usr/local/bin/reclass

reclass concepts
~~~~~~~~~~~~~~~~
reclass assumes a node-centric perspective into your inventory. This is
obvious when you query reclass for node-specific information, but it might not
be clear when you ask reclass to provide you with a list of groups. In that
case, reclass loops over all nodes it can find in its database, reads all
information it can find about the nodes, and finally reorders the result to
provide a list of groups with the nodes they contain.
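
The reordering step can be pictured with a few lines of Python. This is only
a sketch of the idea, not code from reclass: the function name and the
inventory contents are made up for illustration, and the per-node class lists
are assumed to have been resolved already.

```python
from collections import defaultdict


def invert_to_groups(node_classes):
    """Turn a {node: [classes]} mapping into a {class: [nodes]} mapping."""
    groups = defaultdict(list)
    # Walk the nodes in sorted order so the result is deterministic.
    for node in sorted(node_classes):
        for cls in node_classes[node]:
            groups[cls].append(node)
    return dict(groups)


# Hypothetical inventory: each node with the classes in its ancestry.
nodes = {
    "quantum.example.org": ["unixnodes", "debiannodes"],
    "proton.example.org": ["unixnodes"],
}
groups = invert_to_groups(nodes)
```

The data is collected per node first and only then regrouped, which is why
the perspective stays node-centric even for group queries.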

Since the term 'groups' is somewhat ambiguous, it helps to start off with
a short glossary of reclass-specific terminology:

  node:         A node, usually a computer in your infrastructure
  class:        A category, tag, feature, or role that applies to a node
                Classes may be nested, i.e. there can be a class hierarchy
  application:  A specific set of behaviour to apply to members of a class
  parameter:    Node-specific variables, with inheritance throughout the class
                hierarchy.

A class consists of zero or more parent classes, zero or more applications,
and any number of parameters.

A node is almost equivalent to a class, except that it usually does not (but
can) specify applications.

When reclass parses a node (or class) definition and encounters a parent
class, it recurses to this parent class first before reading any data of the
node (or class). When reclass returns from the recursive, depth-first walk, it
then merges all information of the current node (or class) into the
information it obtained during the recursion.

Furthermore, a node (or class) may define a list of classes it derives from,
in which case classes defined further down the list will be able to override
classes further up the list.

Information in this context is essentially one of a list of applications or
a list of parameters.

The interaction between the depth-first walk and the delayed merging of data
means that the node (and any class) may override any of the data defined by
any of the parent classes (ancestors). This is in line with the assumption
that more specific definitions ("this specific host") should have a higher
precedence than more general definitions ("all webservers", which includes all
webservers in Munich, which includes "this specific host", for example).
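
To make the walk more tangible, here is a minimal sketch in Python. It is not
the reclass implementation: the in-memory dictionaries stand in for the
inventory, the class names and messages are invented, and only scalar
parameter values are handled.

```python
# Hypothetical in-memory inventory, used only for this illustration.
CLASSES = {
    "unixnodes": {"classes": [], "parameters": {"motd": "Welcome."}},
    "debiannodes": {
        "classes": ["unixnodes"],
        "parameters": {"motd": "Welcome to Debian."},
    },
}
NODES = {
    "quantum.example.org": {
        "classes": ["debiannodes"],
        "parameters": {"motd": "Downtime this weekend."},
    },
    "proton.example.org": {"classes": ["debiannodes"], "parameters": {}},
}


def resolve(entity):
    """Recurse into the parents first, then lay the entity's own data on top."""
    merged = {}
    for parent in entity["classes"]:
        # Parents further down the list override parents further up.
        merged.update(resolve(CLASSES[parent]))
    # The entity's own parameters are merged last, so they win.
    merged.update(entity["parameters"])
    return merged


quantum = resolve(NODES["quantum.example.org"])
proton = resolve(NODES["proton.example.org"])
```

Because the entity's own data is merged only after the recursion returns, the
most specific definition always has the last word.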

Here's a quick example, showing how parameters accumulate and can get
replaced.

  All unixnodes (i.e. nodes that have the 'unixnodes' class in their ancestry)
  have /etc/motd centrally managed (through the 'motd' application), and the
  unixnodes class definition provides a generic message-of-the-day to be put
  into this file.

  All debiannodes, which are descendants of unixnodes, should include the
  Debian codename in this message, so the message-of-the-day is overwritten in
  the debiannodes class.

  The node 'quantum.example.org' will have a scheduled downtime this weekend,
  so until Monday, an appropriate message-of-the-day is added to the node
  definition.

  When the 'motd' application runs, it receives the appropriate
  message-of-the-day (from 'quantum.example.org' when run on that node) and
  writes it into /etc/motd.

At this point it should be noted that parameters whose values are lists or
key-value pairs don't get overwritten by children classes or node definitions,
but the information gets merged (recursively) instead.
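
A recursive merge of this kind might look like the following Python sketch.
It rests on assumptions of mine rather than on the reclass source: lists are
simply concatenated (parent entries first), key-value pairs are merged key by
key, and anything else is overwritten. The 'ntp' parameters are invented.

```python
def merge(base, new):
    """Merge 'new' into 'base' without modifying either argument."""
    if isinstance(base, dict) and isinstance(new, dict):
        result = dict(base)
        for key, value in new.items():
            # Recurse for keys both sides define; otherwise take the new value.
            result[key] = merge(base[key], value) if key in base else value
        return result
    if isinstance(base, list) and isinstance(new, list):
        # Assumption: lists accumulate, parent entries first.
        return base + new
    # Scalars (and mismatched types) are overwritten by the newer value.
    return new


# Hypothetical parameters of a parent class and of a child class.
parent = {"ntp": {"servers": ["ntp1.example.org"], "mode": "client"}}
child = {"ntp": {"servers": ["ntp2.example.org"], "mode": "server"}}
merged = merge(parent, child)
```

Here the child replaces the scalar 'mode' but adds to the 'servers' list
instead of replacing it.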

Similarly to parameters, applications also accumulate during the recursive
walk through the class ancestry. It is possible for a node or child class to
_remove_ an application added by a parent class, by prefixing the application
with '~'.
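
As a rough sketch of how such an accumulation could work (the function is
hypothetical, and skipping duplicates is my assumption, not documented
behaviour):

```python
def merge_applications(inherited, new):
    """Append new applications; a '~' prefix removes an inherited one."""
    result = list(inherited)
    for app in new:
        if app.startswith("~"):
            # Drop the named application if an ancestor added it.
            name = app[1:]
            result = [a for a in result if a != name]
        elif app not in result:
            # Assumption: an application is listed only once.
            result.append(app)
    return result


# Hypothetical application lists of a parent class and of a node.
apps = merge_applications(["motd", "firewalled"], ["~firewalled", "ssh.server"])
```

Entries are processed in order, so a removal only affects what has been
accumulated up to that point.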

Finally, reclass happily lets you use multiple inheritance, and ensures that
the resolution of parameters is still well-defined. Here's another example
building upon the one about /etc/motd above:

  'quantum.example.org' (which is back up and therefore its node definition no
  longer contains a message-of-the-day) is at a site in Munich. Therefore, it
  is a child of the class 'hosted@munich'. This class is independent of the
  'unixnodes' hierarchy; 'quantum.example.org' derives from both.

  In this example infrastructure, 'hosted@munich' is more specific than
  'debiannodes' because there are plenty of Debian nodes at other sites (and
  some non-Debian nodes in Munich). Therefore, 'quantum.example.org' derives
  from 'hosted@munich' _after_ 'debiannodes'.

  When an electricity outage is expected over the weekend in Munich, the admin
  can change the message-of-the-day in the 'hosted@munich' class, and it will
  apply to all hosts in Munich.

  However, not all hosts in Munich have /etc/motd, because some of them are
  'windowsnodes'. Since the 'windowsnodes' ancestry does not specify the
  'motd' application, those hosts have access to the message-of-the-day in the
  node variables, but the message won't get used...

  ... unless, of course, 'windowsnodes' specified a Windows-specific
  application to bring such notices to the attention of the user.

It's also trivial to ensure a certain order of class evaluation. Here's
another example:

  The 'ssh.server' class defines the 'permit_root_login' parameter to 'no'.

  The 'backuppc.client' class defines the parameter to 'without-password',
  because the BackupPC server might need to log in to the host as root.

  Now, what happens if the admin accidentally provides the following two
  classes?

    - backuppc.client
    - ssh.server

  Theoretically, this would mean 'permit_root_login' gets set to 'no'.

  However, since all 'backuppc.client' nodes need 'ssh.server' (at least in
  most setups), the class 'backuppc.client' itself derives from 'ssh.server',
  ensuring that 'ssh.server' gets parsed before 'backuppc.client'.

  When reclass returns to the node and encounters the 'ssh.server' class
  defined there, it simply skips it, as it's already been processed.
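
The effect of skipping already-processed classes can be sketched as follows.
Again, this is an illustration under my own assumptions (an in-memory class
table, scalar parameters only), not the code reclass runs:

```python
# Hypothetical class table mirroring the example above.
CLASSES = {
    "ssh.server": {
        "classes": [],
        "parameters": {"permit_root_login": "no"},
    },
    "backuppc.client": {
        "classes": ["ssh.server"],
        "parameters": {"permit_root_login": "without-password"},
    },
}


def class_order(names, seen=None):
    """Return the classes in the order their data gets merged."""
    seen = [] if seen is None else seen
    for name in names:
        if name in seen:
            continue  # already processed via another path, skip it
        # Parents are processed before the class itself.
        class_order(CLASSES[name]["classes"], seen)
        seen.append(name)
    return seen


# The "accidental" order from the example: backuppc.client listed first.
order = class_order(["backuppc.client", "ssh.server"])

effective = {}
for name in order:
    effective.update(CLASSES[name]["parameters"])
```

Whichever way round the node lists the two classes, 'ssh.server' ends up
first in the order, so the value from 'backuppc.client' is the one that
sticks.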

reclass operations
~~~~~~~~~~~~~~~~~~
While reclass has been built to support different storage backends through
plugins, currently only the 'yaml_fs' storage backend exists. This is a very
simple, yet powerful, YAML-based backend, using flat files on the filesystem
(as suggested by the _fs suffix).

yaml_fs works with two directories, one for node definitions, and another for
class definitions. It is possible to use a single directory for both, but that
could get messy and is therefore not recommended.

Files in those directories are YAML-files, specifying key-value pairs. The
following three keys are read by reclass:

  classes:      a list of parent classes
  applications: a list of applications to append to the applications defined by
                ancestors. If an application name starts with '~', it removes
                this application from the list, if it had already been
                added, but it does not prevent a future addition.
                E.g. '~firewalled'
  parameters:   key-value pairs to set defaults in class definitions, override
                existing data, or provide node-specific information in node
                specifications.
                By convention, parameters corresponding to an application
                should be provided as subkey-value pairs, keyed by the name of
                the application, e.g.

                applications:
                - ssh.server
                parameters:
                  ssh.server:
                    permit_root_login: no

reclass starts out reading a node definition file, obtains the list of
classes, then reads the files corresponding to these classes, recursively
reading parent classes, and finally merges the applications list (appending
each entry unless it is prefixed with '~', in which case the application is
removed) and the parameters.

Version control
~~~~~~~~~~~~~~~
I recommend you maintain your reclass inventory database in Git, right from
the start.

Usage
~~~~~
For information on how to use reclass directly, invoke reclass.py with --help
and study the output.

The three options --inventory-base-uri, --nodes-uri, and --classes-uri
together specify the location of the inventory. If the base URI is specified,
then it is prepended to the other two URIs, unless they are absolute URIs. If
these two URIs are not specified, they default to 'nodes' and 'classes'.
Therefore, if your inventory is in '/etc/reclass/nodes' and
'/etc/reclass/classes', all you need to specify is the base URI as
'/etc/reclass'.
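
The rule can be written down in a few lines of Python. This sketch assumes
that the URIs are plain filesystem paths and that "absolute" means starting
with a slash; the function name is mine, not part of reclass.

```python
import posixpath


def resolve_uris(base_uri=None, nodes_uri="nodes", classes_uri="classes"):
    """Prepend the base URI to the other two URIs unless they are absolute."""

    def resolve(uri):
        if base_uri is None or posixpath.isabs(uri):
            return uri
        return posixpath.join(base_uri, uri)

    return resolve(nodes_uri), resolve(classes_uri)


# Only the base URI is given; the other two fall back to their defaults.
nodes_dir, classes_dir = resolve_uris("/etc/reclass")
```

With the base URI alone, the two locations come out as '/etc/reclass/nodes'
and '/etc/reclass/classes'; an absolute --nodes-uri would be left untouched.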

If you've installed reclass as per the above instructions, try to run it from
the source directory like this:

  reclass -b examples/ --inventory
  reclass -b examples/ --node localhost

Those data come from examples/nodes and examples/classes, and you can surely
make your own way from here.

More commonly, however, use of reclass will happen indirectly, and through
so-called adapters, e.g. /…/reclass/adapters/salt. The job of an adapter is to
translate between different invocation paradigms, provide a sane set of
default options, and massage the data from reclass into the format expected by
the automation tool in use.

Configuration file
~~~~~~~~~~~~~~~~~~
reclass can read some of its configuration from a file. The file is
a YAML-file and simply defines key-value pairs.

The configuration file can be used to set defaults for all the options that
are otherwise configurable via the command-line interface, so please use the
--help output of reclass for reference. The command-line option '--nodes-uri'
corresponds to the key 'nodes_uri' in the configuration file. For example:

  storage_type: yaml_fs
  pretty_print: True
  output: json
  inventory_base_uri: /etc/reclass
  nodes_uri: ../nodes
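
The correspondence between option names and configuration keys appears to be
mechanical: drop the leading dashes and turn the remaining dashes into
underscores. The helper below is hypothetical and only illustrates that
naming pattern; check the --help output for the options that actually exist.

```python
def option_to_key(option):
    """Map a long command-line option to its configuration-file key."""
    return option.lstrip("-").replace("-", "_")


key = option_to_key("--nodes-uri")
```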

reclass first looks in the current directory for the file called
'reclass-config.yml' and if no such file is found, it looks "next to" the
reclass script itself. Adapters implement their own lookup logic.
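
In Python, a lookup along those lines could be sketched like this. The
function and its arguments are made up for illustration, and I am assuming
that "next to" means the directory containing the script:

```python
import os

CONFIG_NAME = "reclass-config.yml"


def find_config(script_path, cwd=None):
    """Return the first configuration file found, or None."""
    cwd = os.getcwd() if cwd is None else cwd
    # Current directory first, then the directory holding the script.
    candidates = [cwd, os.path.dirname(os.path.abspath(script_path))]
    for directory in candidates:
        path = os.path.join(directory, CONFIG_NAME)
        if os.path.isfile(path):
            return path
    return None
```

A file in the current directory therefore shadows one that sits beside the
script.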

 -- martin f. krafft <madduck@madduck.net>  Fri, 03 Jul 2013 19:57:05 +0200