SCIENTIFIC-LINUX-DEVEL Archives

January 2010

SCIENTIFIC-LINUX-DEVEL@LISTSERV.FNAL.GOV

Subject:
From:
Steven Timm
Date:
Thu, 21 Jan 2010 11:58:31 -0600
Troy, when will this kernel be available for SL testing?
The symptoms described in the errata are not exactly what we are seeing,
but it could be worth a try.
That makes three errata kernels released by the upstream vendor in
less than a month.
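
A jump like the one described below can be caught on a guest by logging
wall-clock time next to a monotonic counter and flagging any interval where
the wall clock runs ahead. What follows is only a minimal sketch: the names
find_jumps and sample_clocks and the 60-second threshold are illustrative
assumptions, it needs Python 3.3 or later for time.monotonic(), and it assumes
the monotonic clock in the domU is not hit by the same jump.

```python
import time


def find_jumps(samples, threshold=60.0):
    """Return (index, drift) for each interval where wall-clock time advanced
    more than `threshold` seconds beyond the monotonic clock.

    `samples` is a list of (wall_seconds, monotonic_seconds) pairs.
    The 60-second default is an arbitrary illustrative value.
    """
    jumps = []
    for i in range(1, len(samples)):
        wall_delta = samples[i][0] - samples[i - 1][0]
        mono_delta = samples[i][1] - samples[i - 1][1]
        drift = wall_delta - mono_delta
        if drift > threshold:
            jumps.append((i, drift))
    return jumps


def sample_clocks(count, interval=1.0):
    """Collect `count` (wall, monotonic) readings, `interval` seconds apart."""
    samples = []
    for _ in range(count):
        samples.append((time.time(), time.monotonic()))
        time.sleep(interval)
    return samples
```

Calling find_jumps(sample_clocks(60)) would sample once a second for a minute
and list any forward jumps. The period where the clock "just sits there" shows
up as negative drift, which this sketch deliberately does not flag.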

Steve


On Thu, 21 Jan 2010, Troy Dawson wrote:

> Steven Timm wrote:
>> Since the errata kernel release 2.6.18-164.6.1 we have been
>> seeing Xen domU's that will occasionally jump forward in time by
>> 40-80 minutes.  The clock jumps forward and then just sits there
>> until the clock of the underlying dom0 catches up to it again.
>> 
>> At first we were running ntpd on our domU's but then disabled it
>> in response to suggestions in several howtos.  So now we know
>> that the problem has nothing to do with rogue ntp packets but
>> could very well be something in Xen or kernel-xen that is causing it.
>> There's a report of something very similar in the CentOS forum
>> to which I've appended more details of this bug.
>> 
>> https://www.centos.org/modules/newbb/viewtopic.php?topic_id=23402
>> 
>> I can find nothing about this in the upstream vendor's Bugzilla,
>> and nothing obvious on the Xen mailing lists.
>> 
>> Any help is appreciated.
>> 
>> Steve Timm
>> 
>> 
>
> Hi Steve,
> The new kernel (2.6.18-164.11.1.el5) that was just released includes a lot
> of time-related bug fixes.
>
> http://www.redhat.com/docs/en-US/errata/RHSA-2010-0046/Kernel_Security_Update/index.html
>
> Here are the time-related ones:
>
> * Scientific Linux 5.4 SMP guests running on a Scientific Linux Hypervisor 
> may have experienced inconsistent time, for example, the time going 
> backwards. This could have caused some applications to hang.
>
> * In rare cases, a system management interrupt (SMI) could occur during
> CPU frequency calibration (during boot), resulting in the frequency
> being calculated to a value larger than the CPU's specification. This
> could have resulted in timer values being miscalculated and firing at
> incorrect times. Note: This fix is optional. To enable the fix, the
> system must be booted with the avoid_smi kernel parameter.
>
> * A KVM pvclock fix in the kernel-2.6.18-164.2.1.el5 update introduced a
> bug: Some SMP guest operating systems experienced time drift. This could
> cause problems for time-sensitive applications.
>
> * Scientific Linux 5.4 guests using KVM pvclock, calling the
> clock_gettime(CLOCK_REALTIME) and gettimeofday() functions in sequence
> could have, in rare cases, caused clock_gettime() to return a smaller
> value than gettimeofday(). If the sequence was reversed, gettimeofday()
> could return a smaller value than clock_gettime(CLOCK_REALTIME). This
> could cause applications to hang and use large amounts of CPU (up to
> 100%), or cause problems for applications that depend on timestamps to
> order events. Note: This update only resolves this issue for Intel 64
> and AMD64 systems. The issue can still present on i386 systems.
>
> I am not positive that it will fix your problem, but it sure looks like they
> did a lot of work on time and virtualization in this kernel.
>
> Troy
>
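
The last item in the list above is about two back-to-back clock reads
disagreeing on order. A rough way to look for that symptom is to alternate
reads and count how often a later reading is smaller than the one before it.
This is a sketch under assumptions: count_regressions and alternate_reads are
made-up names, and Python's time.time() is only a stand-in for gettimeofday(),
since on many builds it is itself implemented on top of clock_gettime().

```python
import time


def count_regressions(readings):
    """Count positions where a reading is smaller than the one taken just before it."""
    return sum(
        1 for earlier, later in zip(readings, readings[1:]) if later < earlier
    )


def alternate_reads(pairs):
    """Read the realtime clock through two Python interfaces, back to back.

    time.time() stands in for gettimeofday() here (an assumption; it may
    share its implementation with clock_gettime on a given build).
    """
    readings = []
    for _ in range(pairs):
        readings.append(time.clock_gettime(time.CLOCK_REALTIME))
        readings.append(time.time())
    return readings
```

A non-zero result from count_regressions(alternate_reads(100000)) would point
at the clock going backwards between reads. A zero result proves little, since
the errata calls the problem rare.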

-- 
------------------------------------------------------------------
Steven C. Timm, Ph.D  (630) 840-8525
http://home.fnal.gov/~timm/
Fermilab Computing Division, Scientific Computing Facilities,
Grid Facilities Department, FermiGrid Services Group, Assistant Group Leader.
