SCIENTIFIC-LINUX-USERS Archives

October 2008

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From: Billy Crook <[log in to unmask]>
Reply To: Billy Crook <[log in to unmask]>
Date: Wed, 1 Oct 2008 15:21:05 -0500
Content-Type: text/plain
Parts/Attachments: text/plain (43 lines)

With any software, it's usually better to use the distribution-provided
packages from the distribution's repositories unless there is an
explicit, unfixable problem with them, in which case it's good karma
to let upstream know.  The main ganglia gotchas I've seen are:

1) If you change the cluster name in gmond.conf, change it in
gmetad.conf to match
2) If you stick with the default multicast-based config, and the
machines running gmond have more than one NIC, make sure there's a
route to 239.2.11.71 via whichever interface is in the same Ethernet
broadcast domain as the compute nodes.
3) Standard firewall question: is a firewall blocking the ganglia
traffic?
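
The first two checks can be scripted.  What follows is only a minimal
sketch: it assumes Ganglia 3.x-style config syntax (a cluster { name =
"..." } block in gmond.conf and data_source "..." lines in
gmetad.conf) and the iproute2 "ip" tool.  The function names are made
up for illustration and are not part of ganglia.

```shell
#!/bin/sh
# Sketch only. Assumed file formats:
#   gmond.conf:   cluster { name = "..." ... }
#   gmetad.conf:  data_source "..." host1 host2

# Print the cluster name found in a gmond.conf-style file.
gmond_cluster_name() {
    awk -F'"' '
        /^[ \t]*cluster[ \t]*[{]/           { c = 1 }
        c && /(^|[ \t{])name[ \t]*=/        { print $2; exit }
        c && /[}]/                          { c = 0 }
    ' "$1"
}

# Print every data_source name found in a gmetad.conf-style file.
gmetad_source_names() {
    awk -F'"' '/^[ \t]*data_source[ \t]/ { print $2 }' "$1"
}

# Succeed only if the gmond cluster name appears as a gmetad data_source.
cluster_names_match() {
    name=$(gmond_cluster_name "$1")
    [ -n "$name" ] && gmetad_source_names "$2" | grep -Fxq -- "$name"
}

# Read "ip route get" output on stdin and print the outgoing interface.
route_dev() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}
```

For example, "cluster_names_match /etc/gmond.conf /etc/gmetad.conf"
(adjust the paths to wherever your packages put the files) returns
non-zero on a mismatch, and "ip route get 239.2.11.71 | route_dev"
should print the interface that faces the compute nodes.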

Here's an excerpt from the /etc/init.d/gmetad init script I use on
the clusters I set up for work.  Our clusters' headnodes use eth0 for
the cluster-private network and eth1 for the outside world.  When the
outside-world network has the default route (and it most certainly
will if anything does), multicast traffic follows that default route
out eth1 instead of reaching the compute nodes.  The two route
commands I added create a host route for the multicast address via
eth0 when gmetad starts and remove it when gmetad stops.

  start)
     echo -n "Starting GANGLIA gmetad: "
     [ -f "$GMETAD" ] || exit 1

     daemon "$GMETAD"
     RETVAL=$?
     echo
     [ $RETVAL -eq 0 ] && touch /var/lock/subsys/gmetad
     # 2008-10-01 BC: Added the next line to establish the multicast
     # route via the cluster-private interface (eth0)
     route add -host 239.2.11.71 dev eth0
     ;;

  stop)
     echo -n "Shutting down GANGLIA gmetad: "
     killproc gmetad
     RETVAL=$?
     echo
     [ $RETVAL -eq 0 ] && rm -f /var/lock/subsys/gmetad
     # 2008-10-01 BC: Added the next line to remove the multicast route
     route del -host 239.2.11.71 dev eth0
     ;;
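
If you would rather keep the route handling in one place, something
like the following could stand in for the two route lines.  This is a
sketch and not what the script above does: it swaps in iproute2's "ip
route replace" so a repeated start does not fail on an existing route,
and MCAST_ADDR, MCAST_DEV, DRY_RUN, run and mcast_route are all
hypothetical names, with defaults taken from the excerpt.

```shell
#!/bin/sh
# Sketch only: defaults mirror the excerpt (239.2.11.71 via eth0).
MCAST_ADDR="${MCAST_ADDR:-239.2.11.71}"
MCAST_DEV="${MCAST_DEV:-eth0}"

# Run a command, or just print it when DRY_RUN=1 (hypothetical switch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Add or remove the host route for the ganglia multicast address.
mcast_route() {
    case "$1" in
        add)
            # "replace" creates the route, or overwrites it if present.
            run ip route replace "$MCAST_ADDR/32" dev "$MCAST_DEV"
            ;;
        del)
            # Ignore the error if the route is already gone.
            run ip route del "$MCAST_ADDR/32" dev "$MCAST_DEV" 2>/dev/null || true
            ;;
        *)
            echo "usage: mcast_route add|del" >&2
            return 2
            ;;
    esac
}
```

Calling "mcast_route add" in start) and "mcast_route del" in stop)
would then take the place of the route commands, and setting DRY_RUN=1
shows what would be run without touching the routing table.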
