telegraf

  • By cmars
  • Latest version (#0)
  • trusty
  • Stable

Description

Telegraf is an agent written in Go for collecting metrics from the system it's running on, or from other services, and writing them into InfluxDB or other outputs.

Design goals are to have a minimal memory footprint with a plugin system so that developers in the community can easily add support for collecting metrics from well-known services (like Hadoop, Postgres, or Redis) and third-party APIs (like Mailchimp, AWS CloudWatch, or Google Analytics).

New input and output plugins are designed to be easy to contribute; we'll eagerly accept pull requests and will manage the set of plugins that Telegraf supports. See the contributing guide for instructions on writing new plugins.


Overview

This is a subordinate charm that deploys the telegraf metrics agent to collect metrics from all services deployed in the environment.

For details about telegraf see: https://github.com/influxdata/telegraf

Usage

Deploy telegraf alongside your service, together with a time series storage (in this case, influxdb):

juju deploy telegraf 
juju deploy influxdb 
juju deploy some-service

Add the relations:

juju add-relation telegraf:juju-info some-service:juju-info 
juju add-relation telegraf:influxdb-api influxdb:api

Configuration

By default no output plugin is configured, but a basic set of input plugins is set up; these can be overridden with the inputs_config charm config option.

To configure any of the plugins (whether enabled by default or via a relation), use the extra_options charm config option. It is a string in YAML format, for example:

inputs:
    cpu:
        percpu: false
        fielddrop: ["time_*"]
    disk:
        mount_points: ["/"]
        ignore_fs: ["tmpfs", "devtmpfs"]
    elasticsearch:
        local: false
        cluster_health: true
    postgresql:
        databases: ["foo", "bar"]
        tagpass: 
            db: ["template", "postgres"]
outputs:
    influxdb:
        precision: ms

These extra options are only applied to plugins defined in templates/base_inputs.conf and to any other plugins configured via relations.
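The rule above (options only take effect for plugins that are already configured) can be illustrated with a minimal Python sketch. This is not the charm's actual code: the function name apply_extra_options, the dictionary layout, and the sample plugin values are all assumptions made for illustration.

```python
import copy


def apply_extra_options(configured, extra_options):
    """Overlay extra_options onto plugins that are already configured.

    Hypothetical helper: plugins named in extra_options but absent from
    `configured` are skipped, mirroring the behaviour described above.
    """
    merged = copy.deepcopy(configured)  # leave the caller's data untouched
    for section, plugins in extra_options.items():  # "inputs" / "outputs"
        for name, options in plugins.items():
            if name in merged.get(section, {}):
                merged[section][name].update(options)
    return merged


# Illustrative data only: cpu and disk stand in for plugins that come from
# the base template; no output is configured because no relation exists yet.
configured = {
    "inputs": {"cpu": {"percpu": True}, "disk": {}},
    "outputs": {},
}
extra_options = {
    "inputs": {"cpu": {"percpu": False}, "elasticsearch": {"local": False}},
    "outputs": {"influxdb": {"precision": "ms"}},
}

result = apply_extra_options(configured, extra_options)
print(result)
```

In this sketch the cpu override is applied, while the elasticsearch and influxdb entries are dropped because neither plugin is configured yet.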

Apache input

For the apache input plugin, the charm provides the apache relation, which uses the apache-website interface. The current apache charm disables mod_status, so for the telegraf apache input to work, 'status' must be removed from the disable_modules list in the apache charm config.

Postgresql input

Due to a bug/regression in the new postgresql-charm, two relations must be established between telegraf and the postgresql service to get actual postgresql metrics: first a plain juju-info relation to get telegraf set up, and then a regular postgresql/db one. For example:

juju add-relation telegraf:juju-info postgresql:juju-info
juju add-relation telegraf:postgresql postgresql:db

Output

The only output plugin supported via relation is influxdb; any other output plugin needs to be configured manually (via juju set).

To use a different metrics storage, e.g. graphite, the plugin configuration needs to be set as a base64-encoded string in the outputs_config option.

For example, save the following config to a file (graphite-output.conf in the command below):

[[outputs.graphite]]
  servers = ["10.0.3.231:2003"]
  prefix = "juju_local.devel.telegraf"
  timeout = 10

Then run:

juju set telegraf outputs_config="$(cat graphite-output.conf | base64)"

This makes the telegraf agents send their metrics to the graphite instance.
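To check the encoded value before passing it to juju set, the encoding step can be reproduced in Python. This is a minimal sketch using only the standard library; the helper names encode_outputs_config and decode_outputs_config are hypothetical and not part of the charm.

```python
import base64

# Same graphite output section as in the example above.
GRAPHITE_CONF = """[[outputs.graphite]]
  servers = ["10.0.3.231:2003"]
  prefix = "juju_local.devel.telegraf"
  timeout = 10
"""


def encode_outputs_config(text):
    """Return the config as a single-line base64 string (hypothetical helper)."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_outputs_config(encoded):
    """Decode a base64 string back to the original config text."""
    return base64.b64decode(encoded).decode("utf-8")


encoded = encode_outputs_config(GRAPHITE_CONF)
print(encoded)
print(decode_outputs_config(encoded))
```

Decoding the value back is a quick way to confirm that the string set in outputs_config still contains the intended [[outputs.graphite]] section.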

Configuration

outputs_config
(string) [outputs.xxx] sections as a string
interval
(string) Default data collection interval for all plugins
10s
collection_jitter
(string) Collection jitter is used to jitter the collection by a random amount. Each plugin will sleep for a random time within jitter before collecting. This can be used to avoid many plugins querying things like sysfs at the same time, which can have a measurable effect on the system.
0s
package_name
(string) Filename of the telegraf deb package. If this matches the name of a file in the charm's files directory, the package will be installed from there; otherwise the charm will try to install it from the repository provided by apt_repository.
telegraf
tags
(string) Comma-separated list of global tags, e.g. 'dc=us-east-1,rack=1a' will tag all metrics with dc=us-east-1 and rack=1a
extra_options
(string) YAML with extra options for inputs/outputs managed by relations or in the default config. Example:
    inputs:
        cpu:
            percpu: false
            fielddrop: ["time_*"]
        disk:
            mount_points: ["/"]
            ignore_fs: ["tmpfs", "devtmpfs"]
        elasticsearch:
            local: false
            cluster_health: true
        postgresql:
            databases: ["foo", "bar"]
            tagpass:
                db: ["template", "postgres"]
    outputs:
        influxdb:
            precision: ms
apt_repository
(string) An apt sources.list line for a repository containing the telegraf package
deb http://ppa.launchpad.net/telegraf-devs/ppa/ubuntu trusty main
hostname
(string) Override the default hostname; if empty, os.Hostname() is used. Supports UNIT_NAME as the value, in which case the charm uses a sanitized unit name, e.g. service_name-0
UNIT_NAME
quiet
(boolean) Run telegraf in quiet mode
extra_plugins
(string) Extra plugins, manually configured. This is expected to be a string and will be saved "as is" in /etc/telegraf/telegraf.d/extra_plugins.conf
prometheus_output_port
(string) If set, the prometheus output plugin will be configured to listen on the provided port. If set to the string "default", the charm will use the default port (9103)
flush_interval
(string) Default data flushing interval for all outputs. You should not set this below interval. The maximum time between flushes will be flush_interval + flush_jitter
10s
apt_repository_key
(string) GPG key for apt_repository
C94406F5
inputs_config
(string) [inputs.xxx] sections as a string; this overrides the default input plugins.
round_interval
(boolean) Rounds the collection interval to 'interval', e.g. if interval="10s" then always collect on :00, :10, :20, etc.
True
debug
(boolean) Run telegraf in debug mode
metric_buffer_limit
(int) Telegraf will cache metric_buffer_limit metrics for each output, and will flush this buffer on a successful write.
10000
flush_jitter
(string) Jitter the flush interval by a random amount. This is primarily to avoid large write spikes for users running a large number of telegraf instances, e.g. a jitter of 5s and an interval of 10s means flushes will happen every 10-15s
0s
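
The relationship between flush_interval and flush_jitter described above can be worked through with a short Python sketch. It only illustrates the arithmetic: the helper names are hypothetical, durations are assumed to be whole seconds with an "s" suffix, and the uniform random choice is an assumption about how the jitter is drawn, not a statement about telegraf's implementation.

```python
import random


def parse_seconds(value):
    """Parse a simple duration such as '10s' (seconds only, for illustration)."""
    if not value.endswith("s"):
        raise ValueError("expected a duration like '10s', got %r" % value)
    return float(value[:-1])


def flush_delay_bounds(flush_interval, flush_jitter):
    """Return the (minimum, maximum) time in seconds between two flushes."""
    base = parse_seconds(flush_interval)
    return base, base + parse_seconds(flush_jitter)


def next_flush_delay(flush_interval, flush_jitter, rng=random):
    """Pick one delay within the bounds (uniform choice is an assumption)."""
    low, high = flush_delay_bounds(flush_interval, flush_jitter)
    return rng.uniform(low, high)


bounds = flush_delay_bounds("10s", "5s")
print(bounds)
print(next_flush_delay("10s", "5s"))
```

With the defaults listed above (flush_interval 10s, flush_jitter 0s) both bounds are 10 seconds, so every flush happens at the same fixed interval.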