
Collecting GlusterFS Logs with Fluentd

This article shows how to use Fluentd to collect GlusterFS logs for analysis (search, analytics, troubleshooting, etc.).


GlusterFS is an open source, distributed file system commercially supported by Red Hat, Inc. Each node in GlusterFS generates its own logs, and it is often convenient to collect these logs in a central location for analysis (e.g., when one GlusterFS node went down, what was happening on the other nodes?).

Fluentd is an open source data collector for high-volume data streams. It's a great fit for monitoring GlusterFS clusters because:

  1. Fluentd supports GlusterFS logs as a data source.
  2. Fluentd supports various output systems (e.g., Elasticsearch, MongoDB, Treasure Data) that can help GlusterFS users analyze the logs.

The rest of this article explains how to set up Fluentd with GlusterFS. For this example, we chose Elasticsearch as the backend system.

Setting up Fluentd on GlusterFS Nodes

Installing Fluentd

First, we'll install Fluentd using the following command:

$ curl -L | sh

Next, we'll install the Fluentd plugin for GlusterFS:

$ sudo /usr/sbin/td-agent-gem install fluent-plugin-glusterfs
Fetching: fluent-plugin-glusterfs-1.0.0.gem (100%)
Successfully installed fluent-plugin-glusterfs-1.0.0
1 gem installed
Installing ri documentation for fluent-plugin-glusterfs-1.0.0...
Installing RDoc documentation for fluent-plugin-glusterfs-1.0.0...

Making GlusterFS Log Files Readable by Fluentd

By default, only root can read the GlusterFS log files. We'll grant read permission to other users so that Fluentd can read the log file.

$ ls -alF /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
-rw------- 1 root root 1385  Feb  3 07:21 2014 /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
$ sudo chmod +r /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
$ ls -alF /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
-rw-r--r-- 1 root root 1385  Feb  3 07:21 2014 /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
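If a node writes several log files, it can help to list the ones that are still unreadable before pointing Fluentd at them. The following is a minimal sketch, not part of the original setup: the function names is_world_readable and list_unreadable_logs are hypothetical, and the check simply inspects the "other" read bit in the mode string printed by ls.

```shell
# Hypothetical helpers (names are illustrative, not part of Fluentd or GlusterFS).

# Succeeds if the "other" read bit is set on the given file.
# Character 8 of the ls -l mode string (e.g. -rw-r--r--) is that bit.
is_world_readable() {
  [ "$(ls -ld "$1" | cut -c8)" = "r" ]
}

# Prints every *.log file in the given directory that lacks the "other" read bit.
list_unreadable_logs() {
  for f in "$1"/*.log; do
    [ -f "$f" ] || continue
    is_world_readable "$f" || echo "$f"
  done
}
```

Running list_unreadable_logs /var/log/glusterfs (with sudo if the directory itself is restricted) would print the files that still need chmod +r.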

Now, modify Fluentd's configuration file. It is located at /etc/td-agent/td-agent.conf.

NOTE: td-agent is Fluentd's rpm/deb package maintained by Treasure Data.

This is what the configuration file should look like:

$ sudo cat /etc/td-agent/td-agent.conf
<source>
  type glusterfs_log
  path /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
  pos_file /var/log/td-agent/etc-glusterfs-glusterd.vol.log.pos
  tag glusterfs_log.glusterd
  format /^(?<message>.*)$/
</source>

<match glusterfs_log.**>
  type forward
  send_timeout 60s
  recover_wait 10s
  heartbeat_interval 1s
  phi_threshold 8
  hard_timeout 60s

  <server>
    name logserver
    # add a host line here with the address of your aggregator instance
    port 24224
    weight 60
  </server>

  <secondary>
    type file
    path /var/log/td-agent/forward-failed
  </secondary>
</match>

NOTE: the <secondary>...</secondary> section is for failover (when the aggregator instance is unreachable).
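Fluentd's configuration nests directives such as <match>, <server>, and <secondary>, so a missing closing tag is an easy mistake to make when editing the file by hand. As a rough sanity check, assuming each tag sits on its own line, the sketch below counts opening and closing tags per directive. The function name check_directives is hypothetical, and this is no substitute for Fluentd's own parser.

```shell
# Hypothetical helper: report directives whose opening and closing
# tag counts differ in a Fluentd configuration file.
# Returns non-zero if any mismatch is found.
check_directives() {
  conf="$1"
  status=0
  for tag in source match server secondary store; do
    # grep -c exits non-zero on a zero count, so guard it with "|| true".
    opens=$(grep -c "^[[:space:]]*<$tag[ >]" "$conf" || true)
    closes=$(grep -c "^[[:space:]]*</$tag>" "$conf" || true)
    if [ "$opens" -ne "$closes" ]; then
      echo "unbalanced <$tag>: $opens opened, $closes closed"
      status=1
    fi
  done
  return "$status"
}
```

For example, check_directives /etc/td-agent/td-agent.conf prints nothing and succeeds when every directive is closed. Recent Fluentd versions also provide a --dry-run option that validates the whole configuration, which is the more thorough check if your version has it.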

Finally, start td-agent. Fluentd will start with the updated configuration.

$ sudo service td-agent start
Starting td-agent:                                         [  OK  ]
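Once td-agent is running, one way to confirm that it is actually reading the log is to look at the position file named by pos_file. The sketch below assumes the position file uses the tail input's usual layout of one line per file, with tab-separated path, hexadecimal byte offset, and hexadecimal inode; that layout is an assumption about the plugin's internals, and the function name pos_offset is hypothetical.

```shell
# Hypothetical helper: print the byte offset recorded for a log file
# in a position file.
# Assumes lines of the form: <path> TAB <hex offset> TAB <hex inode>.
pos_offset() {
  pos_file="$1"
  log_path="$2"
  # Take the last entry recorded for this path.
  hex=$(awk -F '\t' -v p="$log_path" '$1 == p { print $2 }' "$pos_file" | tail -n 1)
  [ -n "$hex" ] || return 1
  # Convert the hexadecimal offset to decimal.
  printf '%d\n' "0x$hex"
}
```

Comparing the printed offset with the size reported by wc -c for the log file gives a rough idea of how far behind Fluentd is; the two values should converge as new lines are picked up.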

Setting Up the Aggregator Fluentd Server

We'll now set up a separate Fluentd instance to aggregate the logs. Again, the first step is to install Fluentd.

$ curl -L | sh

We'll set up this node to send the logs to Elasticsearch, where they will be indexed, and to write a copy to local disk as a backup.

First, install the Elasticsearch output plugin as follows:

$ sudo /usr/lib64/fluent/ruby/bin/fluent-gem install fluent-plugin-elasticsearch

Then, configure Fluentd as follows:

$ sudo cat /etc/td-agent/td-agent.conf
<source>
  type forward
  port 24224
</source>

<match glusterfs_log.glusterd>
  type copy

  #local backup
  <store>
    type file
    path /var/log/td-agent/glusterd
  </store>

  <store>
    type elasticsearch
    port 9200
    index_name glusterfs
    type_name fluentd
    logstash_format true
  </store>
</match>

That's it! You should now be able to search and visualize your GlusterFS logs with Kibana.
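Before opening Kibana, a quick query against Elasticsearch can confirm that events are arriving. The sketch below rests on two assumptions that should be checked against your setup: that Elasticsearch is reachable on localhost:9200, and that with logstash_format true the plugin writes to daily indices named logstash-YYYY.MM.DD (in UTC) rather than to the index given by index_name. The function names are hypothetical.

```shell
# Hypothetical helpers for a quick smoke test of the Elasticsearch output.

# Index name that logstash_format is assumed to use for today's (UTC) events.
logstash_index_today() {
  date -u +logstash-%Y.%m.%d
}

# Build a search URL for today's index.
# Host and port default to localhost:9200 (an assumption; pass your own).
search_url() {
  es_host="${1:-localhost}"
  es_port="${2:-9200}"
  printf 'http://%s:%s/%s/_search?q=message:*&size=5&pretty\n' \
    "$es_host" "$es_port" "$(logstash_index_today)"
}
```

Running curl -s "$(search_url)" on the aggregator would then return up to five recent events if the pipeline is working; an empty result or a missing-index error suggests that nothing has been forwarded yet.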


This article is inspired by Daisuke Sasaki's article on Classmethod's website. Thanks Daisuke!
