Troy, when will this kernel be available for SL testing?
What they are saying is not exactly the same as what we are seeing,
but it could be worth a try. This is now 3 errata kernels released
by the upstream vendor in less than a month.

Steve

On Thu, 21 Jan 2010, Troy Dawson wrote:

> Steven Timm wrote:
>> Since the errata kernel release 2.6.18-164.6.1 we have been
>> seeing Xen domU's that will occasionally jump forward in time by
>> 40-80 minutes. The behavior is such that the clock will jump
>> forward and then just sit there until the clock of the underlying
>> dom0 catches up to it again.
>>
>> At first we were running ntpd on our domU's but then disabled it
>> in response to suggestions in several howtos. So now we know
>> that the problem has nothing to do with rogue ntp packets but
>> could very well be something in xen or kernel-xen that is causing it.
>> There's a report of something very similar in the CentOS forum
>> to which I've appended more details of this bug.
>>
>> https://www.centos.org/modules/newbb/viewtopic.php?topic_id=23402
>>
>> Nothing in the upstream vendor bugzilla about this that I can find,
>> and nothing obvious in the Xen mailing lists.
>>
>> Any help is appreciated.
>>
>> Steve Timm
>>
>
> Hi Steve,
> With the new kernel (2.6.18-164.11.1.el5) that was just released, there were
> lots of time bug fixes.
>
> http://www.redhat.com/docs/en-US/errata/RHSA-2010-0046/Kernel_Security_Update/index.html
>
> Here are the time related ones
>
> * Scientific Linux 5.4 SMP guests running on a Scientific Linux Hypervisor
> may have experienced inconsistent time, for example, the time going
> backwards. This could have caused some applications to hang.
>
> * In rare cases, a system management interrupt (SMI) could occur during
> CPU frequency calibration (during boot), resulting in the frequency
> being calculated to a value larger than the CPU's specification. This
> could have resulted in timer values being miscalculated and firing at
> incorrect times. Note: This fix is optional. To enable the fix, the
> system must be booted with the avoid_smi kernel parameter.
>
> * A KVM pvclock fix in the kernel-2.6.18-164.2.1.el5 update introduced a
> bug: Some SMP guest operating systems experienced time drift. This could
> cause problems for time-sensitive applications.
>
> * Scientific Linux 5.4 guests using KVM pvclock, calling the
> clock_gettime(CLOCK_REALTIME) and gettimeofday() functions in sequence
> could have, in rare cases, caused clock_gettime() to return a smaller
> value than gettimeofday(). If the sequence was reversed, gettimeofday()
> could return a smaller value than clock_gettime(CLOCK_REALTIME). This
> could cause applications to hang and use large amounts of CPU (up to
> 100%), or cause problems for applications that depend on timestamps to
> order events. Note: This update only resolves this issue for Intel 64
> and AMD64 systems. The issue can still present on i386 systems.
>
> I am not positive that it will fix your problem, but it sure looks like
> they did a lot of work on time and virtualization in this kernel.
>
> Troy
>

-- 
------------------------------------------------------------------
Steven C. Timm, Ph.D  (630) 840-8525
http://home.fnal.gov/~timm/
Fermilab Computing Division, Scientific Computing Facilities,
Grid Facilities Department, FermiGrid Services Group, Assistant Group Leader.
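
For anyone wanting to catch the jumps described in this thread (a domU clock leaping forward by 40-80 minutes and then standing still), here is a rough sketch of a detector. It is not something from the thread or from the vendor: the function names `find_jumps` and `collect` and the 5-second threshold are made up for illustration. It assumes Python 3 and assumes that the guest's monotonic clock is not hit by the same fault, which may not hold under Xen, so checking against an outside reference host would be more trustworthy.

```python
import time


def find_jumps(samples, threshold=5.0):
    """Find steps where the wall clock and the monotonic clock disagree.

    `samples` is a list of (monotonic_seconds, wall_seconds) pairs taken
    in order. Returns a list of (index, drift_seconds) for every step
    where the wall clock advanced more than `threshold` seconds further
    (positive drift) or less far (negative drift, e.g. a stalled clock)
    than the monotonic clock did over the same step.
    """
    jumps = []
    for i in range(1, len(samples)):
        mono_delta = samples[i][0] - samples[i - 1][0]
        wall_delta = samples[i][1] - samples[i - 1][1]
        drift = wall_delta - mono_delta
        if abs(drift) > threshold:
            jumps.append((i, drift))
    return jumps


def collect(count, interval=1.0):
    """Take `count` paired clock readings, `interval` seconds apart."""
    samples = []
    for _ in range(count):
        samples.append((time.monotonic(), time.time()))
        time.sleep(interval)
    return samples
```

Running `find_jumps(collect(3600))` on an affected domU would sample once a second for an hour and list any step where the two clocks parted ways; a forward jump shows up as a large positive drift, and the "sit there until dom0 catches up" phase as negative drift.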