4fdb8e9d0ba398dad4e663482d1ce31e94b1f585 - salt-formulas/heka

commit	4fdb8e9d0ba398dad4e663482d1ce31e94b1f585
author	Aleš Komárek <github@newt.cz>	Sat Nov 26 12:11:16 2016 +0100
committer	GitHub <noreply@github.com>	Sat Nov 26 12:11:16 2016 +0100
tree	56ab696cdbba2d2ec4a90f70e682284235747fb6
parent	41a41d43c5cc5960ae0930c9332b62bc0446b77d

Stacklight (#71)

* Stacklight integration

* Round 2

* Variable service_name is missing for systemd file

* preserve_data and ticker_interval are not strings

preserve_data is a boolean, and ticker_interval is a number, so their values
shouldn't have quotes.
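
As a rough illustration of the quoting rule above, the following Python sketch
renders option values as TOML literals according to their type. The helper
names (toml_value, render_section) and the section name are hypothetical and
are not part of this formula, which uses Jinja2 templates instead.

```python
def toml_value(value):
    """Render a Python value as a TOML literal (hypothetical helper)."""
    # bool must be tested before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    # Anything else is emitted as a quoted, escaped string.
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return '"{}"'.format(escaped)


def render_section(name, options):
    """Render one TOML table, with keys sorted for stable output."""
    lines = ["[{}]".format(name)]
    for key in sorted(options):
        lines.append("{} = {}".format(key, toml_value(options[key])))
    return "\n".join(lines)


# Illustrative section name and values only.
print(render_section("example_filter", {
    "type": "SandboxFilter",
    "preserve_data": False,
    "ticker_interval": 10,
}))
```

With this approach only the type option ends up quoted, while preserve_data
and ticker_interval are written as a bare boolean and a bare number.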

* Use "ignore missing" with the j2 include statement

* Added cache dir

* Use module_dir instead of module_directory

This fixes a bug where module_directory is used as the variable name instead of
module_dir.

* Use the proper module directory

The stacklight module dir is /usr/share/lma_collector/common, not
/usr/share/lma_collector_modules. This fixes it.

* Add the extra_fields.lua module

This commit adds the extra_fields Lua module. The extra fields table defined in
this module is empty right now. Eventually, this file will be a Jinja2 template
and the content of the extra fields table will be generated based on the user
configuration.

* Regex encoder fix

* Fix the decoder configuration

This commit uses proper decoder names in heka/meta/heka.yml. It also removes
the aggregator input for now, because it does not have an associated decoder.

* Make Heka send metrics to InfluxDB

* Add HTTP metrics filter to log_collector

* Add logs counter filter to log_collector

* Templatize extra_fields.lua file

* Make InfluxDB time precision configurable

* Configure Elasticsearch output through Pillar

* Use influxdb_time_precision for InfluxDB output

This uses influxdb_time_precision set on metric_collector for configuring the
time precision in the InfluxDB output. This is to use just one parameter for
both the InfluxDB accumulator filter and InfluxDB output.

* Increase maximum open files limit to 102400

* Add alarming support

* Revert "[WIP] Add alarming support"

* Remove the aggregator output for now

This removes the aggregator output for now, as the aggregator does not work
yet. This avoids output errors in Heka.

* Do not place Heka logs in /var/log/upstart

With this commit all the Heka logs are sent to /var/log/<heka_service>.log.
Previously, stdout was sent to /var/log/<heka_service>.log and stderr was sent
to /var/log/upstart/<heka_service>.log, which was confusing to the operator.

* Remove http check input plugin

Because it is not used anymore.

* Add alarming support

* Make the aggregator load heka/meta/heka.yml

Currently _service.sls does not load aggregator metadata from
heka/meta/heka.yml. This commit fixes that.

* Use filter_by to merge node grains data

* Make the output/tcp.toml template extendable

* Add an aggregator.toml output template

This template extends the tcp.toml output template.

* Add generic timezone support to decoders

This change adds a new parameter, 'adjust_timezone', for the sandbox
decoder. Set this parameter to true when the data to be decoded doesn't
contain proper timezone information.
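
The idea behind such an adjustment can be sketched in Python as follows. This
is only an illustration under assumptions: the real decoder is written in Lua
and its implementation is not shown here, and the function name and the
utc_offset_seconds parameter are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def adjust_timezone(naive_timestamp, utc_offset_seconds):
    """Treat a timezone-less timestamp as local time and convert it to UTC.

    utc_offset_seconds is the assumed local offset from UTC (hypothetical
    parameter; a real decoder would obtain it from the host configuration).
    """
    local_tz = timezone(timedelta(seconds=utc_offset_seconds))
    return naive_timestamp.replace(tzinfo=local_tz).astimezone(timezone.utc)


# A log line stamped 12:11:16 on a host running at UTC+01:00.
parsed = datetime(2016, 11, 26, 12, 11, 16)
print(adjust_timezone(parsed, 3600).isoformat())
```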

* Add a run_lua_tests.sh script

This script will be used to run the Lua tests (yet to be added).

To run the script:

    cd tests
    ./run_lua_tests.sh

* Copy Lua tests from fuel-plugin-lma-collector

* Fix the afd tests

* Fix the gse tests

* Add aggregator config to support metadata

* Fix the definition of the remote_collector service

This change removes unneeded plugins and adds the ones that are
otherwise required.

* Fix state dependency

* Add monitoring of the Heka processes

* Set influxdb_time_precision in aggregator class

* Disable the heka service completely

Without this patch `service heka status` reports that the heka service is
running. For example:

    root@ctl01:/etc/init.d# /etc/init.d/heka status
     * hekad is running

* Define the highest_severity policy

* Generate the gse_policies Lua module

* Generate gse topology module for each alarm cluster

* Generate gse filter toml for each cluster alarm

* Adapt GSE Lua code

* Remove gse cluster_field parameter

This parameter is not needed anymore. Heka's message_matchers are now used to
match input messages.

* Support dimensions in gse metrics

* Do not rely on pacemaker_local_resource_active

* Define the majority_of_members policy

* Define the availability_of_members policy

* Configure outputs in support metadata

* Fix bug in map.jinja

Fix a bug in map.jinja where the filter_by for the metric_collector modified
the influxdb_defaults dict re-used for the remote_collector. The filter_by
function does deep merges, so some caution is required.
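
The class of bug described above can be reproduced in plain Python. The sketch
below is an analogy, not Salt's filter_by implementation: deep_merge is a
hypothetical helper, and the keys and values are made up for illustration.

```python
import copy


def deep_merge(dest, update):
    """Recursively merge `update` into `dest` in place and return `dest`."""
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(dest.get(key), dict):
            deep_merge(dest[key], value)
        else:
            dest[key] = value
    return dest


# Hypothetical defaults meant to be reused by two collectors.
influxdb_defaults = {"time_precision": "ms", "output": {"port": 8086}}

# Buggy pattern: merging into the shared dict changes it for every later user.
buggy_defaults = copy.deepcopy(influxdb_defaults)
metric_collector = deep_merge(buggy_defaults, {"output": {"port": 9999}})
remote_collector = deep_merge(buggy_defaults, {})
print(remote_collector["output"]["port"])  # the override leaked in

# Safer pattern: each consumer merges into its own private copy.
metric_collector_ok = deep_merge(
    copy.deepcopy(influxdb_defaults), {"output": {"port": 9999}})
remote_collector_ok = deep_merge(copy.deepcopy(influxdb_defaults), {})
print(remote_collector_ok["output"]["port"])  # defaults are untouched
```

Copying the defaults before each merge keeps one consumer's overrides from
leaking into the next, at the cost of an extra copy per consumer.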

* Cleaning useless default map keys

* Make remote collector send only afd metrics to influx

* Add aggregator output to remote collector

* Extend collectd decoder to support vrrp metrics

* Update map.jinja

* Update collectd decoder to parse ntpd metrics

* Redefine alerting property

The alerting property can be one of 'disabled', 'enabled' or
'enabled_with_notification'.
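
A minimal validation sketch for this property, in Python. The function name
and the error wording are hypothetical; only the three allowed values come
from the commit message.

```python
# The three values listed in the commit message.
ALERTING_VALUES = ("disabled", "enabled", "enabled_with_notification")


def validate_alerting(value):
    """Return `value` if it is an allowed alerting setting, else raise."""
    if value not in ALERTING_VALUES:
        raise ValueError(
            "alerting must be one of {}, got {!r}".format(
                ", ".join(ALERTING_VALUES), value))
    return value


print(validate_alerting("enabled_with_notification"))
```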

* Fix the gse_policies structure

The structure of the generated gse_policies.lua file is not correct. This
commit fixes that.

* Add Nagios output for metric_collector

The patch embeds the Lua sandbox encoder for Nagios.

* Add Nagios output for the aggregator

* Send only alarm-related data to mine

* Fix the grains_for_mine function

* Fix flake8 in heka_alarming.py

* Configure Hekad poolsize by pillar data

The poolsize must be increased depending on the number of filters.
Typically, the metric_collector on controller nodes and the aggregator on
monitoring node(s) should probably use poolsize=200.

* Make Heka service watch Lua dir

In this way the service will restart when the content of
/usr/share/lma_collector changes.

* Enable collection of notifications

* Add missing hostname variable in GSE code

* Add a log decoder for Galera

* Simplify message matchers

This removes the "Field[aggregator] == NIL" part in the Heka message matchers.

We used to use a scribbler decoder to tag input messages coming in through the
aggregator input. We now have a dedicated Heka "aggregator" instance, so this
mechanism is not necessary anymore.

* Update collectd decoder for nginx metrics

* Return an err message when set_member_status fails

With this commit an explicit error message is displayed in the Heka logs when
set_member_status fails because the cluster has "group_by" set to "hostname"
and an input message with no "hostname" field is received.

This addresses a comment from @SwannCroiset in #51.
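
The behaviour described above can be sketched in Python as follows. The actual
code is Lua and is not reproduced here; the data layout (cluster and message
dicts, the "statuses" and "all" keys) and the error wording are assumptions
made for illustration.

```python
def set_member_status(cluster, message):
    """Record a member status; return (ok, error_message).

    Hypothetical Python rendering: `cluster` and `message` are plain dicts.
    """
    group_by = cluster.get("group_by")
    if group_by == "hostname" and not message.get("hostname"):
        # Fail with an explicit message instead of silently dropping the input.
        return False, (
            "cluster '{}' is grouped by hostname but the input message "
            "has no hostname field".format(cluster["name"]))
    bucket = message["hostname"] if group_by == "hostname" else "all"
    statuses = cluster.setdefault("statuses", {})
    statuses.setdefault(bucket, {})[message["member"]] = message["status"]
    return True, None


cluster = {"name": "example", "group_by": "hostname"}
ok, err = set_member_status(cluster, {"member": "cpu", "status": 1})
if not ok:
    print(err)  # a caller could write this to the Heka logs
```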

* Add contrail log parsers

* Fix the heka grains for the aggregator/remote_collector

Previously, the heka Salt grains of the node running the
aggregator/remote_collector included all the metric_collector alarms from all
nodes (/etc/salt/grains.d/heka). The resulting mine data was therefore wrong
for the monitoring node. Fortunately, this had no impact on the
metric_collector alarm configuration, but the Nagios service, which leverages
mine data, got a wrong list of alarms for the monitoring node.

This patch fixes the issue with minimal changes, but the logic behind the
_service.sls state is not optimal and has become hard to understand. This
state is executed several times with different contexts, once for every heka
'server' type, and is not idempotent: the content of
/etc/salt/grains.d/heka differs between 'local' servers
((metric|log)_collector) and 'remote' servers (remote_collector|aggregator).

* Fix issue in lma_alarm.lua template

* Add a log decoder for GlusterFS

* Fix collectd Lua decoder for system metrics

The regression was introduced by 74ad71d41.

* Update collectd decoder for disk metrics

The disk plugin shipped with version 5.5 of collectd (installed on
Xenial) provides new metrics: disk_io_time and disk_weighted_io_time.

* Use a dimension key for the Nagios host displaying alarm clusters

* Add redis log parser

* Add zookeeper log parser

* Add cassandra log parser

* Set actual swap_size in collectd decoder

Salt does not create Swap-related grains, but the "ps" module has
a "swap_memory" function that can be used to get Swap data. This commit
uses that function to set swap_size in the collectd decoder.

* Send annotations to InfluxDB

* Add ifmap log parser

* Support remote_collector and aggregator in cluster

When deployed in a cluster, the remote_collector and aggregator
services are only started when the node holds the virtual IP address.

* Add an os_telemetry_collector service

os_telemetry_collector reads Ceilometer samples from RabbitMQ
and pushes them to InfluxDB (samples) and Elasticsearch
(resources).

* heka server role, backward compat

.gitignore[diff]
README.rst[diff]
_modules/heka_alarming.py[Added - diff]
debian/changelog[diff]
heka/_common.sls[Added - diff]
heka/_service.sls[Added - diff]
heka/aggregator.sls[Added - diff]
heka/ceilometer_collector.sls[Added - diff]
heka/files/00-hekad.toml[Deleted - diff]
heka/files/decoder/multidecoder.toml[Deleted - diff]
heka/files/decoder/payloadregex.toml[Deleted - diff]
heka/files/decoder/protobuf.toml[Deleted - diff]
heka/files/decoder/rsyslog.toml[Deleted - diff]
heka/files/decoder/sandbox.toml[Deleted - diff]
heka/files/encoder/es-json.toml[Deleted - diff]
heka/files/encoder/es-payload.toml[Deleted - diff]
heka/files/extra_fields.lua[Added - diff]
heka/files/gse_policies.lua[Added - diff]
heka/files/gse_topology.lua[Added - diff]
heka/files/heka.grain[Added - diff]
heka/files/heka.service[diff]
heka/files/input/amqp.toml[Deleted - diff]
heka/files/input/logstreamer.toml[Deleted - diff]
heka/files/input/process.toml[Deleted - diff]
heka/files/lma_alarm.lua[Added - diff]
heka/files/lua/common/accumulator.lua[Added - diff]
heka/files/lua/common/afd.lua[Added - diff]
heka/files/lua/common/afd_alarm.lua[Added - diff]
heka/files/lua/common/afd_alarms.lua[Added - diff]
heka/files/lua/common/afd_rule.lua[Added - diff]
heka/files/lua/common/ceilometer.lua[Added - diff]
heka/files/lua/common/contrail_patterns.lua[Added - diff]
heka/files/lua/common/elasticsearch_resources.lua[Added - diff]
heka/files/lua/common/gse.lua[Added - diff]
heka/files/lua/common/gse_cluster.lua[Added - diff]
heka/files/lua/common/gse_constants.lua[Added - diff]
heka/files/lua/common/gse_policy.lua[Added - diff]
heka/files/lua/common/gse_utils.lua[Added - diff]
heka/files/lua/common/influxdb.lua[Added - diff]
heka/files/lua/common/java_patterns.lua[Added - diff]
heka/files/lua/common/lma_utils.lua[Added - diff]
heka/files/lua/common/patterns.lua[Added - diff]
heka/files/lua/common/redis_patterns.lua[Added - diff]
heka/files/lua/common/resources.lua[Added - diff]
heka/files/lua/common/samples.lua[Added - diff]
heka/files/lua/common/table_utils.lua[Added - diff]
heka/files/lua/common/value_matching.lua[Added - diff]
heka/files/lua/decoders/cassandra.lua[Added - diff]
heka/files/lua/decoders/collectd.lua[Added - diff]
heka/files/lua/decoders/contrail_api_stdout_log.lua[Added - diff]
heka/files/lua/decoders/contrail_collector_log.lua[Added - diff]
heka/files/lua/decoders/contrail_log.lua[Added - diff]
heka/files/lua/decoders/contrail_supervisor_log.lua[Added - diff]
heka/files/lua/decoders/galera.lua[Added - diff]
heka/files/lua/decoders/generic_syslog.lua[Added - diff]
heka/files/lua/decoders/glusterfs.lua[Added - diff]
heka/files/lua/decoders/ifmap.lua[Added - diff]
heka/files/lua/decoders/keystone_wsgi_log.lua[Added - diff]
heka/files/lua/decoders/libvirt_log.lua[Added - diff]
heka/files/lua/decoders/metering.lua[Added - diff]
heka/files/lua/decoders/metric.lua[Added - diff]
heka/files/lua/decoders/mysql_log.lua[Added - diff]
heka/files/lua/decoders/noop.lua[Added - diff]
heka/files/lua/decoders/notification.lua[Added - diff]
heka/files/lua/decoders/openstack_log.lua[Added - diff]
heka/files/lua/decoders/ovs_log.lua[Added - diff]
heka/files/lua/decoders/pacemaker_log.lua[Added - diff]
heka/files/lua/decoders/pacemaker_resources.lua[Added - diff]
heka/files/lua/decoders/rabbitmq.lua[Added - diff]
heka/files/lua/decoders/redis.lua[Added - diff]
heka/files/lua/decoders/zookeeper.lua[Added - diff]
heka/files/lua/encoders/es_ceilometer_resources.lua[Added - diff]
heka/files/lua/encoders/status_nagios.lua[Added - diff]
heka/files/lua/encoders/status_smtp.lua[Added - diff]
heka/files/lua/filters/afd.lua[Added - diff]
heka/files/lua/filters/afd_api_backends.lua[Added - diff]
heka/files/lua/filters/afd_workers.lua[Added - diff]
heka/files/lua/filters/gse_cluster_filter.lua[Added - diff]
heka/files/lua/filters/hdd_errors_counter.lua[Added - diff]
heka/files/lua/filters/heka_monitoring.lua[Added - diff]
heka/files/lua/filters/http_metrics_aggregator.lua[Added - diff]
heka/files/lua/filters/influxdb_accumulator.lua[Added - diff]
heka/files/lua/filters/influxdb_annotation.lua[Added - diff]
heka/files/lua/filters/instance_state.lua[Added - diff]
heka/files/lua/filters/logs_counter.lua[Added - diff]
heka/files/lua/filters/resource_creation_time.lua[Added - diff]
heka/files/lua/filters/watchdog.lua[Added - diff]
heka/files/lua/outputs/lastfile.lua[Added - diff]
heka/files/output/dashboard.toml[Deleted - diff]
heka/files/output/elasticsearch.toml[Deleted - diff]
heka/files/output/logoutput.toml[Deleted - diff]
heka/files/service_wrapper[Added - diff]
heka/files/toml/decoder/multidecoder.toml[Added - diff]
heka/files/toml/decoder/payloadregex.toml[Added - diff]
heka/files/toml/decoder/protobuf.toml[Added - diff]
heka/files/toml/decoder/sandbox.toml[Added - diff]
heka/files/toml/encoder/elasticsearch.toml[Added - diff]
heka/files/toml/encoder/payload.toml[Added - diff]
heka/files/toml/encoder/protobuf.toml[Renamed from heka/files/encoder/protobuf.toml - diff]
heka/files/toml/encoder/rst.toml[Renamed from heka/files/encoder/RstEncoder.toml - diff]
heka/files/toml/encoder/sandbox.toml[Added - diff]
heka/files/toml/filter/afd_alarm.toml[Added - diff]
heka/files/toml/filter/gse_alarm_cluster.toml[Added - diff]
heka/files/toml/filter/sandbox.toml[Added - diff]
heka/files/toml/global.toml[Added - diff]
heka/files/toml/input/amqp.toml[Added - diff]
heka/files/toml/input/http.toml[Added - diff]
heka/files/toml/input/logstreamer.toml[Added - diff]
heka/files/toml/input/process.toml[Added - diff]
heka/files/toml/input/tcp.toml[Added - diff]
heka/files/toml/output/amqp.toml[Renamed from heka/files/output/amqp.toml - diff]
heka/files/toml/output/dashboard.toml[Added - diff]
heka/files/toml/output/elasticsearch.toml[Added - diff]
heka/files/toml/output/http.toml[Added - diff]
heka/files/toml/output/log.toml[Added - diff]
heka/files/toml/output/tcp.toml[Added - diff]
heka/files/toml/splitter/regex.toml[Added - diff]
heka/files/toml/splitter/token.toml[Added - diff]
heka/init.sls[diff]
heka/log_collector.sls[Added - diff]
heka/map.jinja[diff]
heka/meta/collectd.yml[Added - diff]
heka/meta/heka.yml[Added - diff]
heka/metric_collector.sls[Added - diff]
heka/remote_collector.sls[Added - diff]
heka/server.sls[diff]
metadata/service/aggregator/cluster.yml[Added - diff]
metadata/service/aggregator/single.yml[Added - diff]
metadata/service/ceilometer_collector/single.yml[Added - diff]
metadata/service/log_collector/single.yml[Added - diff]
metadata/service/metric_collector/single.yml[Added - diff]
metadata/service/remote_collector/cluster.yml[Added - diff]
metadata/service/remote_collector/single.yml[Added - diff]
metadata/service/support.yml[diff]
tests/lua/mocks/extra_fields.lua[Added - diff]
tests/lua/test_accumulator.lua[Added - diff]
tests/lua/test_afd.lua[Added - diff]
tests/lua/test_afd_alarm.lua[Added - diff]
tests/lua/test_gse.lua[Added - diff]
tests/lua/test_gse_cluster_policy.lua[Added - diff]
tests/lua/test_gse_utils.lua[Added - diff]
tests/lua/test_influxdb.lua[Added - diff]
tests/lua/test_lma_utils.lua[Added - diff]
tests/lua/test_patterns.lua[Added - diff]
tests/lua/test_table_utils.lua[Added - diff]
tests/lua/test_value_matching.lua[Added - diff]
tests/run_lua_tests.sh[Added - diff]

147 files changed
