Configuration File

This article describes the basic concepts of Fluentd's configuration file.

To learn about the V1 configuration format, jump to the V1 Format section.

Introduction: The Life of a Fluentd Event

Here is a brief overview of the life of a Fluentd event to help you understand the rest of this page:

The configuration file allows the user to control the input and output behavior of Fluentd by (1) selecting input and output plugins and (2) specifying the plugin parameters. The file is required for Fluentd to operate properly.

Config File Location

RPM, Deb or DMG

If you installed Fluentd using the td-agent packages, the config file is located at /etc/td-agent/td-agent.conf. Running sudo /etc/init.d/td-agent reload reloads the config file.

$ sudo vi /etc/td-agent/td-agent.conf

Gem

If you installed Fluentd using the Ruby Gem, you can create the configuration file using the following commands. Sending a SIGHUP signal to the Fluentd process reloads the config file.

$ sudo fluentd --setup /etc/fluent
$ sudo vi /etc/fluent/fluent.conf

List of Directives

The configuration file consists of the following directives:

  1. source directives determine the input sources.
  2. match directives determine the output destinations.
  3. filter directives determine the event processing pipelines.
  4. system directives set system-wide configuration.
  5. @include directives include other files.

Let's actually create a configuration file step by step.

(1) "source": where all the data come from

Fluentd's input sources are enabled by selecting and configuring the desired input plugins using source directives. Fluentd's standard input plugins include http and forward. http turns Fluentd into an HTTP endpoint that accepts incoming HTTP messages, whereas forward turns Fluentd into a TCP endpoint that accepts TCP packets. Fluentd can be both at the same time: you can add as many sources as you wish.

# Receive events from 24224/tcp
# This is used by log forwarding and the fluent-cat command
<source>
  type forward
  port 24224
</source>

# http://this.host:9880/myapp.access?json={"event":"data"}
<source>
  type http
  port 9880
</source>

Each source directive must include a type parameter. The type parameter specifies which input plugin to use.

Interlude: Routing

The source submits events to Fluentd's routing engine. An event consists of three entities: tag, time and record. The tag is a string separated by '.'s (e.g. myapp.access), and is used as the directions for Fluentd's internal routing engine. The time field is specified by input plugins, and it must be in the Unix time format. The record is a JSON object.

NOTE: Fluentd accepts all non-period characters as a part of a tag. However, since the tag is sometimes used in a different context by output destinations (e.g., table name, database name, key name, etc.), it is strongly recommended that you stick to lowercase letters, digits, and underscores, e.g., ^[a-z0-9_]+$.

In the example above, the HTTP input plugin submits the following event:

# generated by http://this.host:9880/myapp.access?json={"event":"data"}
tag: myapp.access
time: (current time)
record: {"event":"data"}

Didn't find your input source? You can write your own plugin!

You can add new input sources by writing your own plugins. For further information regarding Fluentd's input sources, please refer to the Input Plugin Overview article.

(2) "match": Tell fluentd what to do!

The "match" directive looks for events with matching tags and processes them. The most common use of the match directive is to output events to other systems (for this reason, the plugins that correspond to the match directive are called "output plugins"). Fluentd's standard output plugins include file and forward. Let's add those to our configuration file.

# Receive events from 24224/tcp
# This is used by log forwarding and the fluent-cat command
<source>
  type forward
  port 24224
</source>

# http://this.host:9880/myapp.access?json={"event":"data"}
<source>
  type http
  port 9880
</source>

# Match events tagged with "myapp.access" and
# store them to /var/log/fluent/access.%Y-%m-%d
# Of course, you can control how you partition your data
# with the time_slice_format option.
<match myapp.access>
  type file
  path /var/log/fluent/access
</match>

Each match directive must include a match pattern and a type parameter. Only events with a tag matching the pattern are sent to the output destination (in the above example, only the events with the tag "myapp.access" are matched). The type parameter specifies the output plugin to use.

Just like input sources, you can add new output destinations by writing your own plugins. For further information regarding Fluentd's output destinations, please refer to the Output Plugin Overview article.

Match Pattern: how you control the event flow inside fluentd

The following match patterns can be used for the <match> tag.

  • * matches a single tag part.

    • For example, the pattern a.* matches a.b, but does not match a or a.b.c
  • ** matches zero or more tag parts.

    • For example, the pattern a.** matches a, a.b and a.b.c
  • {X,Y,Z} matches X, Y, or Z, where X, Y, and Z are match patterns.

    • For example, the pattern {a,b} matches a and b, but does not match c

    • This can be used in combination with the * or ** patterns. Examples include a.{b,c}.* and a.{b,c.**}

  • When multiple patterns are listed inside one <match> tag (delimited by one or more whitespace characters), it matches any of the listed patterns. For example:

    • The patterns <match a b> match a and b.

    • The patterns <match a.** b.*> match a, a.b, a.b.c (from the first pattern) and b.d (from the second pattern).
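
As a hedged illustration of the patterns above, the following sketch combines brace alternation, ** and * in two match blocks. The tags (myapp.*, system.*, audit.*) and the output path are hypothetical and chosen only for this example.

# Matches myapp.access and myapp.error, but not myapp.debug
<match myapp.{access,error}>
  type file
  path /var/log/fluent/myapp
</match>

# First pattern: system, system.cpu and system.disk.sda (zero or more parts)
# Second pattern: audit.login, but not audit or audit.login.failed
<match system.** audit.*>
  type stdout
</match>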

Match Order

Fluentd tries to match tags in the order that they appear in the config file. So if you have the following configuration:

# ** matches all tags. Bad :(
<match **>
  type blackhole_plugin
</match>

<match myapp.access>
  type file
  path /var/log/fluent/access
</match>

then myapp.access is never matched, because <match **> captures every event first. Define wider match patterns after tighter ones.

<match myapp.access>
  type file
  path /var/log/fluent/access
</match>

# Capture all unmatched tags. Good :)
<match **>
  type blackhole_plugin
</match>

If you want to send events to multiple outputs, consider the out_copy plugin.
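
A minimal sketch of that approach, assuming each <store> section of the copy output takes an ordinary output configuration: the block below would write myapp.access events to a file and also print them to standard output. The choice of stdout as the second destination is only for illustration.

<match myapp.access>
  type copy
  <store>
    type file
    path /var/log/fluent/access
  </store>
  <store>
    type stdout
  </store>
</match>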

(3) "filter": Event processing pipeline

The "filter" directive has the same syntax as "match", but filters can be chained to form a processing pipeline. With filters, the event flow looks like this:

Input -> filter 1 -> ... -> filter N -> Output

Let's add the standard record_transformer filter to the "match" example.

# http://this.host:9880/myapp.access?json={"event":"data"}
<source>
  type http
  port 9880
</source>

<filter myapp.access>
  type record_transformer
  <record>
    host_param "#{Socket.gethostname}"
  </record>
</filter>

<match myapp.access>
  type file
  path /var/log/fluent/access
</match>

The received event, {"event":"data"}, goes to the record_transformer filter first. record_transformer adds the "host_param" field to the event, and the filtered event, {"event":"data","host_param":"webserver1"}, then goes to the file output.

You can also add new filters by writing your own plugins. For further information regarding Fluentd's filters, please refer to the Filter Plugin Overview article.
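
To illustrate chaining, the sketch below places two record_transformer filters ahead of the same file output. Filters are applied in the order in which they appear in the file. The second field name, service_param, and its value are hypothetical.

# filter 1: add the hostname
<filter myapp.access>
  type record_transformer
  <record>
    host_param "#{Socket.gethostname}"
  </record>
</filter>

# filter 2: add a fixed field to the already-filtered event
<filter myapp.access>
  type record_transformer
  <record>
    service_param myapp
  </record>
</filter>

<match myapp.access>
  type file
  path /var/log/fluent/access
</match>

On a host named webserver1, the event {"event":"data"} would reach the file output as {"event":"data","host_param":"webserver1","service_param":"myapp"}.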

(4) Set system wide configuration: the "system" directive

The following settings can be configured with the system directive. The same settings can also be specified with fluentd command-line options:

  • log_level
  • suppress_repeated_stacktrace
  • emit_error_log_interval
  • suppress_config_dump
  • without_source

Here is an example:

<system>
  # equal to -qq option
  log_level error
  # equal to --without-source option
  without_source
  # ...
</system>

(5) Re-use your config: the "@include" directive

Directives in separate configuration files can be imported using the @include directive:

# Include config files in the ./config.d directory
@include config.d/*.conf

The @include directive supports regular file paths, glob patterns, and HTTP URLs:

# absolute path
@include /path/to/config.conf

# if using a relative path, the directive will use
# the dirname of this config file to expand the path
@include extra.conf

# glob match pattern
@include config.d/*.conf

# http
@include http://example.com/fluent.conf

Supported Data Types for Values

Each Fluentd plugin has a set of parameters. For example, in_tail has parameters such as rotate_wait and pos_file. Each parameter has a specific type associated with it. They are defined as follows:

NOTE: Each parameter's type should be documented. If not, please let the plugin author know.

  • string type: the field is parsed as a string. This is the most "generic" type, where each plugin decides how to process the string.
    • A string can be written in three literal forms: an unquoted one-line string, a ' quoted string, and a " quoted string.
    • See the "Format tips" section for literal examples.
  • integer type: the field is parsed as an integer.
  • float type: the field is parsed as a float.
  • size type: the field is parsed as the number of bytes. There are several notational variations:
    • If the value matches <INTEGER>k or <INTEGER>K, then the value is the INTEGER number of kilobytes.
    • If the value matches <INTEGER>m or <INTEGER>M, then the value is the INTEGER number of megabytes.
    • If the value matches <INTEGER>g or <INTEGER>G, then the value is the INTEGER number of gigabytes.
    • If the value matches <INTEGER>t or <INTEGER>T, then the value is the INTEGER number of terabytes.
    • Otherwise, the field is parsed as integer, and that integer is the number of bytes.
  • time type: the field is parsed as a time duration.
    • If the value matches <INTEGER>s, then the value is the INTEGER seconds.
    • If the value matches <INTEGER>m, then the value is the INTEGER minutes.
    • If the value matches <INTEGER>h, then the value is the INTEGER hours.
    • If the value matches <INTEGER>d, then the value is the INTEGER days.
    • Otherwise, the field is parsed as float, and that float is the number of seconds. This option is useful for specifying sub-second time durations such as "0.1" (=0.1 second = 100ms).
  • array type: the field is parsed as a JSON array
  • hash type: the field is parsed as a JSON object

array and hash values are written as JSON because almost all programming languages and infrastructure tools can generate JSON more easily than a custom format.
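
To make the types concrete, here is a hedged sketch that annotates a few in_tail and out_file parameters with the type each value is parsed as. The paths and tag are hypothetical, and the stated parameter types are assumptions based on these plugins' documentation, so check the plugin reference for the version you run.

<source>
  type tail
  # string
  path /var/log/myapp/access.log
  # string
  pos_file /var/log/fluent/myapp.access.pos
  tag myapp.access
  format none
  # time: 10 seconds
  rotate_wait 10s
  # integer
  read_lines_limit 500
</source>

<match myapp.access>
  type file
  path /var/log/fluent/access
  # size: 8 megabytes
  buffer_chunk_limit 8m
  # time: 5 minutes
  time_slice_wait 5m
</match>

Note that the m suffix means megabytes for a size parameter but minutes for a time parameter.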

Common plugin parameters

  • type: Specify plugin type
  • id: Specify plugin id. in_monitor_agent uses this value for plugin_id field
  • log_level: Specify per plugin log level. See the Per Plugin Log section
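
For illustration, here is a sketch of an output block that sets all three common parameters. The id value access_file_output is a hypothetical name.

<match myapp.access>
  type file
  id access_file_output
  log_level debug
  path /var/log/fluent/access
</match>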

Format tips

This section describes useful features of the configuration format.

Multi-line support for array and hash values

Array and hash values can span multiple lines.

array_param [
  "a", "b"
]
hash_param {
  "k":"v",
  "k1":10
}

Fluentd assumes that [ or { starts an array or hash. If a parameter value begins with [ or { but is not JSON, quote it with ' or ".

Example 1: mail plugin:

<match **>
  type mail
  subject "[CRITICAL] foo's alert system"
</match>

Example 2: map plugin:

<match tag>
  type map
  map '[["code." + tag, time, { "code" => record["code"].to_i}], ["time." + tag, time, { "time" => record["time"].to_i}]]'
  multi true
</match>

NOTE: This restriction will be removed with a future improvement to the configuration parser.

"foo" is interpreted as foo, not "foo"

" is the quote character for string values. This causes a difference in behaviour between v0.12 and the old format in v0.10.

str_param "foo"
  • In v0.12, str_param is foo
  • In v0.10 without --use-v1-config, str_param is "foo"
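
As a small sketch of the three string literal forms in v0.12 (str_param is a placeholder parameter name), each of the following lines sets the value to foo:

str_param foo
str_param 'foo'
str_param "foo"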

Embedded Ruby code

You can evaluate Ruby code with #{} inside a " quoted string. This is useful for setting machine information such as the hostname.

host_param "#{Socket.gethostname}" # host_param is actual hostname like `webserver1`.

NOTE: config-xxx mixins use "${}", not "#{}". These embedded configurations are two different things.

In a double-quoted string literal, \ is the escape character

\ is interpreted as an escape character. You need \ to set ", \r, \n, \t, \, and several other characters in a double-quoted string literal.

str_param "foo\nbar" # \n is interpreted as actual LF character