LISTSERV - SCIENTIFIC-LINUX-USERS Archives

July 2012

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:	Re: qpidd and leap seconds (kernel futex bug?)
From:	Jon Peatfield
Date:	Mon, 2 Jul 2012 20:42:28 +0100

On Sun, 1 Jul 2012, Peter van Hooft wrote:

> Hi,
>
> On 6.2 (2.6.32-220.23.1.el6.x86_64) I noticed that qpidd started to use an
> enormous amount of cpu time. It is probably related to the last leap second.
> Doing an strace I saw it doing futex() calls at a furious rate.
> After doing the
> export LANG=C
> date; date `date +"%m%d%H%M%C%y.%S"`; date
> trick and restarting qpidd, the problem was gone.

This is caused by a kernel bug in versions > 2.6.26 (apparently) and
< 3.4.something.  See:

   http://serverfault.com/questions/403732/anyone-else-experiencing-high-rates-of-linux-server-crashes-during-a-leap-second

which in turn points at:

   https://access.redhat.com/knowledge/solutions/154713
   https://access.redhat.com/knowledge/articles/15145

for the RHEL6 info.  The "solutions" page seems to require a subscription
to view in full.

Note that there seem to have been two bugs: one may have caused a hard
crash at any time on the day leading up to the leap second, and the other
causes excessive CPU usage after the leap second has been applied.
Machines with higher loads were more likely to suffer the hard crash.
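
For the excessive-CPU case, the workaround quoted at the top of this
message sets the clock to its own current value. A minimal shell sketch
of that trick follows. The helper names now_stamp and reset_clock and the
--apply switch are made up for illustration and do not belong to any
tool; actually setting the clock needs root, so by default the sketch
only prints the command it would run.

```shell
#!/bin/sh
# Sketch of the quoted "set the clock to itself" workaround.
# now_stamp and reset_clock are hypothetical helper names.

# Print the current time as MMDDhhmmCCYY.ss, the form date(1)
# accepts as a set-time argument (same format string as the quoted trick).
now_stamp() {
    LANG=C date +"%m%d%H%M%C%y.%S"
}

# With --apply, set the system clock to the stamp (requires root).
# Without it, only print the command that would be run.
reset_clock() {
    stamp=$(now_stamp)
    if [ "${1:-}" = "--apply" ]; then
        date "$stamp"
    else
        echo "date $stamp"
    fi
}
```

Running reset_clock with no argument shows the command; as in the quoted
report, affected daemons such as qpidd may still need a restart after the
clock has been set.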

The bug affects all distros with kernels in the bad range, so e.g. el5/sl5
seem not to have been affected, though el6/sl6 were.
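
A rough way to check whether a given kernel falls in that range is
sketched below. The bounds come straight from this message (newer than
2.6.26, older than 3.4.something); because the upper bound is imprecise,
the sketch assumes everything below 3.4.0 is suspect. kernel_suspect is a
hypothetical helper name, and the check only looks at the upstream
version number, so a distro kernel with a backported fix would still be
flagged.

```shell
#!/bin/sh
# kernel_suspect VERSION
# Succeeds if VERSION is > 2.6.26 and < 3.4.0 (assumed bounds, see above).
# Hypothetical helper; compares only the upstream x.y.z part.
kernel_suspect() {
    ver=${1%%-*}                          # drop e.g. "-220.23.1.el6.x86_64"
    major=$(echo "$ver" | cut -d. -f1)
    minor=$(echo "$ver" | cut -d. -f2)
    patch=$(echo "$ver" | cut -d. -f3)
    patch=${patch%%[!0-9]*}               # strip any non-numeric suffix
    patch=${patch:-0}                     # "3.4" is treated as 3.4.0
    n=$((major * 1000000 + minor * 1000 + patch))
    [ "$n" -gt 2006026 ] && [ "$n" -lt 3004000 ]
}
```

For the running kernel this could be used as:
kernel_suspect "$(uname -r)" && echo "kernel is in the suspect range"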

  -- Jon
