I have compiled the fixed kernel and put it into SL5 x86_64 testing.

I haven't compiled any kernel modules for it, just the kernel.

Troy

Jon Peatfield wrote:
> On Fri, 27 Nov 2009, Michael Bontenackels wrote:
> 
>> Hi Jon,
>>
>> we encountered the same problem on four of our 64-bit machines in Aachen. They
>> are set up as homedir servers and quite loaded. On one machine we had to do the
>> xfs_repair after access to the filesystem resulted in Input/Output errors.
>>
>> The XFS is on top of a software RAID-5 consisting of 4 HDDs. The filesystem is
>> exported via NFS3 to our desktop cluster. Before the kernel update no problems
>> occurred. We decided to step back to the old kernel version with the XFS
>> modules not included in the kernel rpms. Until now everything seems to be
>> quiet again.
>>
>> We hope to find some time next week to test a similar machine with NFS4 and
>> software RAID-5 with XFS on the newest 64-bit SL5 kernel.
> 
> You may want to try the test kernel mentioned near the end of 
> https://bugzilla.redhat.com/show_bug.cgi?id=512552 since that apparently 
> 'fixes' the raid-5 code to report the inability to do a stripe read-ahead 
> in a way which the rh xfs module is happy with.  At least that will also 
> have the current security fixes as well...
> 
> I don't know if this problem is visible because rh are using an older base 
> of the xfs code or if there was a workaround in the version that SL were 
> building.  Ideally the checking for the bio pages being valid should 
> probably be done in both places...
> 
> Anyway it seems like a plausible fix and has been in the mainline kernels 
> since some time in 2006...
> 
> So far I've updated one of our machines which had the problems, and seen 
> no problems yet, but the load may have gone down enough not to trigger it 
> (I only did the update at 4pm local time).
> 
> I'm about to update the other one if the guy running code on it doesn't 
> object too strongly...
> 
>   -- Jon


-- 
__________________________________________________
Troy Dawson  (630)840-6468
Fermilab  ComputingDivision/LSCS/CSI/USS Group
__________________________________________________