==================================
Watchdog Formula
==================================

The Linux kernel can reset the system if serious problems are detected. This can
be implemented via special watchdog hardware, or via a slightly less reliable
software-only watchdog inside the kernel. Either way, there needs to be a daemon
that tells the kernel the system is working fine. If the daemon stops doing that,
the system is reset.

watchdog is such a daemon. It opens ``/dev/watchdog`` and keeps writing to it
often enough to keep the kernel from resetting, at least once per minute. Each
write delays the reboot by another minute. After a minute of inactivity, the
watchdog hardware triggers the reset. With the software watchdog, the ability
to reboot depends on the state of the machine and its interrupts.

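The write-to-stay-alive behaviour described above can be sketched in a few
lines of shell. This is only an illustration under stated assumptions: the
``keepalive`` function and its arguments are hypothetical and are not part of
the formula or of the watchdog daemon. Try it only on a test machine with the
watchdog service stopped, since the device can normally be opened by one
process at a time. The final ``V`` is the "magic close" character described in
the kernel watchdog API document; whether it actually disarms the timer depends
on the driver and on its ``nowayout`` setting.

.. code-block:: bash

    # Hypothetical helper for illustration; not part of the formula or the daemon.
    # Usage: keepalive DEVICE INTERVAL_SECONDS COUNT
    keepalive() {
        wd_dev="$1"; wd_interval="$2"; wd_count="$3"
        [ -w "$wd_dev" ] || { echo "cannot write to $wd_dev" >&2; return 1; }

        # Open the device once and keep it open on file descriptor 3.
        exec 3>"$wd_dev"

        wd_i=0
        while [ "$wd_i" -lt "$wd_count" ]; do
            printf '.' >&3          # any write restarts the kernel's countdown
            sleep "$wd_interval"
            wd_i=$((wd_i + 1))
        done

        # Magic close: drivers that support it disarm the timer when 'V' is the
        # last byte written before close (unless nowayout is set).
        printf 'V' >&3
        exec 3>&-
    }

For example, ``keepalive /dev/watchdog 10 6`` would write to the device every
ten seconds for about a minute and then request a clean shutdown of the timer.
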
This formula installs and configures the watchdog daemon.

Sample Pillars
==============

Single watchdog service

.. code-block:: yaml

    watchdog:
      server:
        admin: root
        enabled: true
        interval: 1
        log_dir: /var/log/watchdog
        realtime: yes
        timeout: 60
        device: /dev/watchdog

Sample Pillars with kernel module
=================================

Salt automatically detects which kernel module needs to be loaded (e.g. ``hpwdt``, ``iTCO_wdt``).
If the hardware model is not predefined in ``map.jinja``, the default watchdog driver, ``softdog``, is used.
You may specify kernel module parameters if needed:

.. code-block:: yaml

    watchdog:
      server:
        admin: root
        enabled: true
        interval: 1
        log_dir: /var/log/watchdog
        realtime: yes
        timeout: 60
        device: /dev/watchdog
        module: softdog
    ......
    ......
    linux:
      system:
        kernel:
          module:
            softdog:
              option:
                soft_panic: 1

Note: this requires the extra formula `salt-formula-linux <https://gerrit.mcp.mirantis.com/salt-formulas/linux>`_.

In that case, the apply command must also include the linux kernel state. For example:

.. code-block:: bash

    salt "kvm0*" -l debug state.apply watchdog.server,linux.system.kernel

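After the states have been applied, one way to confirm which watchdog driver
ended up loaded on a minion is to look it up in ``/proc/modules``. The helper
below is a hypothetical sketch, not something the formula ships. Its module
list is an assumption taken from the drivers named in this document, and a
driver compiled directly into the kernel will not show up there.

.. code-block:: bash

    # Hypothetical helper; the module list is an assumption based on the
    # drivers mentioned in this README (softdog, hpwdt, iTCO_wdt).
    # Usage: loaded_watchdog_modules [MODULES_FILE]
    loaded_watchdog_modules() {
        # The first field of each /proc/modules line is the module name.
        awk '$1 == "softdog" || $1 == "hpwdt" || $1 == "iTCO_wdt" { print $1 }' \
            "${1:-/proc/modules}"
    }

Called without arguments on the minion, it prints the name of each matching
module that is currently loaded, or nothing if none of them is.
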
More Information
================

https://github.com/torvalds/linux/blob/master/Documentation/watchdog/watchdog-api.txt

This formula also supports a JSON schema definition covering all options.
Please refer to ``watchdog/schemas/*.yaml`` for more information.

To-Do
=====

Remove the part in ``watchdog/server.sls`` about the Ubuntu Xenial bug once it is fixed upstream:
https://bugs.launchpad.net/ubuntu/+source/watchdog/+bug/1448924