Configuring Ganglia with unicast communication

By default, Ganglia uses multicast communication to share information between gmond processes. This can cause problems in some network configurations, so Ganglia can also be configured to use unicast. This is explained well in this post on Server Fault.

Configuring gmond

Each worker node (and optionally other nodes in the cluster such as the head node) should run gmond. Ensure that this is installed by running:

sudo yum install -y ganglia-gmond

The cluster section describes where the cluster is and can be filled in as required, e.g.

cluster {
  name = "sl1 cluster"
  owner = "Peter"
  latlong = "unspecified"
  url = "unspecified"
}

The important configuration, however, is in the udp_send_channel and udp_recv_channel sections. Each gmond sends metrics on its udp_send_channel and optionally receives metrics on its udp_recv_channel. It also serves an XML report on the tcp_accept_channel port. In this setup every gmond in the cluster sends its metrics to a single gmond, so choose one of the cluster nodes to be the metric collector, for example train1.bi.up.ac.za. Specify that node as the host in the udp_send_channel. Then disable multicast by commenting out the mcast_join lines in both sections and the bind line in the udp_recv_channel. For example:

udp_send_channel {
  bind_hostname = yes # Highly recommended, soon to be default.
                       # This option tells gmond to use a source address
                       # that resolves to the machine's hostname.  Without
                       # this, the metrics may appear to come from any
                       # interface and the DNS names associated with
                       # those IPs will be used to create the RRDs.
  #mcast_join = 239.2.11.71
  host = train1.bi.up.ac.za
  port = 8649
  ttl = 1
}

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
  #mcast_join = 239.2.11.71
  port = 8649
  #bind = 239.2.11.71
}
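Before restarting, it can help to confirm that no multicast settings are still active. The following is a minimal sketch rather than anything shipped with Ganglia: the function name check_unicast is made up for illustration, and it assumes the configuration lives in /etc/ganglia/gmond.conf, so adjust the path if your package puts it elsewhere.

```shell
# Hypothetical helper: report any mcast_join or bind lines that are still
# active (not commented out with #) in a gmond configuration file.
# Returns 0 if none are found, 1 otherwise.
check_unicast() {
  conf="${1:-/etc/ganglia/gmond.conf}"   # assumed default location
  # Match lines whose first non-blank token is exactly mcast_join or bind,
  # so bind_hostname and commented-out lines are ignored.
  if grep -nE '^[[:space:]]*(mcast_join|bind)[[:space:]]*=' "$conf"; then
    echo "multicast settings still active in $conf" >&2
    return 1
  fi
  echo "no active multicast settings in $conf"
}

# Example:
#   check_unicast /etc/ganglia/gmond.conf
```

Any matching lines are printed with their line numbers. Note that the check also flags a bind line that has deliberately been set to a unicast address, so treat its output as a hint rather than a verdict.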

Restart the gmond after this configuration change with:

sudo service gmond restart

and ensure that the firewall is configured to allow UDP and TCP traffic on port 8649. It takes a while for gmond to provide information, but you can check its view of the cluster by using this command:

telnet train1.bi.up.ac.za 8649 2>&1 |more

(of course change the hostname to the machine you want to get metrics from). This requires telnet, so if that is not available install it with sudo yum install -y telnet.
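The XML report is long, so it can be easier to list only the hosts the collector knows about. The sketch below assumes that each host appears in the report as a HOST element whose first attribute is NAME; the function name list_gmond_hosts is hypothetical, and the usage line assumes nc is installed as an alternative to telnet.

```shell
# Hypothetical helper: print the unique host names found in a gmond XML
# report read from standard input, one per line.
list_gmond_hosts() {
  grep -o '<HOST NAME="[^"]*"' | cut -d'"' -f2 | sort -u
}

# Example (assumes nc is installed; change the hostname as needed):
#   nc train1.bi.up.ac.za 8649 | list_gmond_hosts
```

If a worker node is missing from the list, its gmond is probably not reaching the collector, which points at the udp_send_channel host setting or the firewall.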

Configuring gmetad

The head node (or a dedicated monitoring node) should run gmetad and the web interface. Ensure that this is installed with:

sudo yum install -y ganglia-gmetad ganglia-web httpd

In the /etc/ganglia/gmetad.conf file change the data source line so that it refers to the machine you’re using to collect all the gmond info. So for example:

data_source "row1 cluster" train1.bi.up.ac.za

The port need not be specified if you’re using port 8649 for ganglia (the default port).
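For reference, the data_source line takes a quoted name, an optional polling interval in seconds, and one or more hosts, each optionally followed by :port. As a quick sanity check after editing, the sketch below prints the active data_source lines; the function name show_data_sources is made up for illustration.

```shell
# Hypothetical helper: print the active (uncommented) data_source lines
# from a gmetad configuration file.
show_data_sources() {
  grep -E '^[[:space:]]*data_source[[:space:]]' "${1:-/etc/ganglia/gmetad.conf}"
}

# Example:
#   show_data_sources
# A line with an explicit polling interval and port would look like:
#   data_source "row1 cluster" 15 train1.bi.up.ac.za:8649
```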

By default the ganglia web interface is configured to only allow queries from localhost. To change this, edit /etc/httpd/conf.d/ganglia.conf and add an Allow line to increase the permissions. E.g.

  #
  # Ganglia monitoring system php web frontend
  #

  Alias /ganglia /usr/share/ganglia

  <Location /ganglia>
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
    Allow from ::1
    Allow from .bi.up.ac.za
    # Allow from .example.com
  </Location>

The syntax of the Allow line is discussed in the Apache documentation and access control docs. After making these configuration changes, restart the httpd with:

sudo service httpd restart

Making settings permanent

To make sure the services start again after a reboot, on the gmond hosts run:

sudo chkconfig gmond on

and on the gmetad/web hosts do:

sudo chkconfig gmetad on
sudo chkconfig httpd on

If you don’t see the graphs you expect

Test the output of gmond using the telnet command listed above. Also consider restarting the gmond and gmetad processes.
