SCIENTIFIC-LINUX-USERS Archives

July 2012

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

From: Peter van Hooft <[log in to unmask]>
Date: Mon, 2 Jul 2012 22:50:57 +0200

On Mon, Jul 02, 2012 at 08:42:28PM +0100, Jon Peatfield wrote:
> On Sun, 1 Jul 2012, Peter van Hooft wrote:
> 
> >Hi,
> >
> >On 6.2 (2.6.32-220.23.1.el6.x86_64) I noticed that qpidd started to use an
> >enormous amount of cpu time. It is probably related to the last leap second.
> >Doing an strace I saw it doing futex() calls at a furious rate.
> >After doing the
> >export LANG=C
> >date; date `date +"%m%d%H%M%C%y.%S"`; date
> >trick and restarting qpidd, the problem was gone.
> 
> Caused by a kernel bug in kernels > 2.6.26 (apparently) and < 3.4.something.  See:
> 
>   http://serverfault.com/questions/403732/anyone-else-experiencing-high-rates-of-linux-server-crashes-during-a-leap-second
> 
> which in turn points at:
> 
>   https://access.redhat.com/knowledge/solutions/154713
>   https://access.redhat.com/knowledge/articles/15145
> 
> for the RHEL6 info...  The solutions stuff seems to require a
> subscription to view in full.
> 
> Note that there seem to have been two bugs: one may have caused a
> hard crash at any time on the day leading up to the leap second, and
> the other causes excessive CPU usage after the leap second has been
> applied. Machines with higher loads were more likely to suffer the
> hard crash.
> 
> The bug affected all distros with the bad ranges of kernels, so e.g.
> el5/sl5 seem not to have been affected though el6/sl6 were.
> 
>  -- Jon
> 

Hi Jon,

Thanks for your mail.
I just wanted to note this for the benefit of others who ran into problems with qpidd.
For your information, we've had reports of MATLAB running slowly (on systems showing no oopses and no
'hrtimer: interrupt took xxxxx ns' or 'INFO: task xxxxx blocked for more than yyyy seconds'
messages), and re-setting the system time seems to solve that as well.
In other words, any application that uses Java and/or multithreading may benefit from re-setting
the system time, even if no evidence of kernel problems is present.
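
For anyone who wants to script the reset, here is a minimal sketch of the
same date trick wrapped in two shell functions. It assumes GNU date and
root privileges; the function names now_stamp and reset_clock are made up
for illustration and are not part of any package.

```shell
#!/bin/sh
# Sketch only: wraps the "set the clock to the current time" trick quoted above.
# now_stamp and reset_clock are hypothetical helper names.

# Print the current time in the MMDDhhmmCCYY.SS form that date(1)
# accepts as a "set the clock" argument.
now_stamp() {
    LANG=C date +"%m%d%H%M%C%y.%S"
}

# Show the time, set the clock to (roughly) the time it already has,
# then show the time again. Setting the clock requires root.
reset_clock() {
    date
    date "$(now_stamp)" || return 1
    date
}

# Usage (as root):
#   reset_clock
# and afterwards restart the affected daemon (qpidd in our case).
```

The script only defines the functions, so nothing is changed until
reset_clock is called explicitly.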

Regards,

Peter
