SCIENTIFIC-LINUX-USERS Archives

July 2012

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From:
Petter Olsson <[log in to unmask]>
Reply To:
Petter Olsson <[log in to unmask]>
Date:
Tue, 3 Jul 2012 10:18:32 +0200
Hi guys,

This is the best summary of the leap second problem I have seen thus
far and figured it could not hurt to share:

http://serverfault.com/questions/403732/anyone-else-experiencing-high-rates-of-linux-server-crashes-during-a-leap-second

Thanks
Petter

On Mon, Jul 2, 2012 at 10:50 PM, Peter van Hooft
<[log in to unmask]> wrote:
> On Mon, Jul 02, 2012 at 08:42:28PM +0100, Jon Peatfield wrote:
>> On Sun, 1 Jul 2012, Peter van Hooft wrote:
>>
>> >Hi,
>> >
>> >On 6.2 (2.6.32-220.23.1.el6.x86_64) I noticed that qpidd started to use an
>> >enormous amount of cpu time. It is probably related to the last leap second.
>> >Running strace, I saw it making futex() calls at a furious rate.
>> >After doing the
>> >export LANG=C
>> >date; date `date +"%m%d%H%M%C%y.%S"`; date
>> >trick and restarting qpidd, the problem was gone.
>>
>> Caused by a kernel bug in versions > 2.6.26 (apparently) and < 3.4.something.  See:
>>
>>   http://serverfault.com/questions/403732/anyone-else-experiencing-high-rates-of-linux-server-crashes-during-a-leap-second
>>
>> which in turn points at:
>>
>>   https://access.redhat.com/knowledge/solutions/154713
>>   https://access.redhat.com/knowledge/articles/15145
>>
>> for the RHEL6 info...  The solutions stuff seems to require a
>> subscription to view in full.
>>
>> Note that there seem to have been two bugs: one may have caused a
>> hard crash at any time on the day leading up to the leap-second, and
>> the other causes excessive CPU usage after the leap-second has been
>> applied. Machines with higher loads were more likely to suffer the
>> hard crash.
>>
>> The bug affected all distros with kernels in the bad range, so e.g. el5/sl5
>> seem not to have been affected though el6/sl6 were.
>>
>>  -- Jon
>>
>
> Hi Jon,
>
> Thanks for your mail.
> I just wanted to note this for the benefit of others who ran into problems with qpidd.
> For your information, we've got reports of matlab running slowly (on systems without oopses,
> 'hrtimer: interrupt took xxxxx ns' or 'INFO: task xxxxx blocked for more than yyyy seconds'
> messages) and the re-setting of the system time seems to solve that as well.
> In other words, any application that uses java and/or multithreading may benefit from re-setting
> the system time, even if no evidence of kernel problems is present.
>
> Regards,
>
> Peter
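
For anyone who wants to script the date trick quoted above, here is a minimal
sketch. It builds the same MMDDhhmmCCYY.ss argument and, by default, only
prints the command it would run. The function name reset_clock_arg and the
APPLY variable are made up for this example and do not come from the thread;
actually setting the clock needs root, and a daemon that is already spinning
(qpidd in the report above) still needs a restart afterwards.

```shell
#!/bin/sh
# Sketch of the "set the clock to its own current value" workaround.
# reset_clock_arg and APPLY are hypothetical names used for illustration.

# Build the MMDDhhmmCCYY.ss argument that date(1) accepts when setting time.
# LANG=C mirrors the "export LANG=C" step in the original report.
reset_clock_arg() {
    LANG=C date +"%m%d%H%M%C%y.%S"
}

arg=$(reset_clock_arg)
echo "would run: date $arg"

# Only touch the clock when explicitly asked to (requires root).
if [ "${APPLY:-0}" = "1" ]; then
    date "$arg"
fi
```

Run it as is for a dry run, or with APPLY=1 as root to set the time.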



-- 
Petter Olsson
System Administrator
NumberFour AG
Schönhauser Allee 8
10119 Berlin
Germany
Mobile: +49 170 2393359
Phone: +49 30 40505411
Fax: +49 30 40505410
[log in to unmask]

www.facebook.com/NumberFour
www.twitter.com/numberfourag

"People who say it cannot be done should not interrupt those who are doing it."
