Date: Mon, 30 Nov 2009 15:22:25 -0600
I have compiled the fixed kernel and put it into SL5 x86_64 testing.
I haven't compiled any kernel modules for it, just the kernel.

Troy

Jon Peatfield wrote:
> On Fri, 27 Nov 2009, Michael Bontenackels wrote:
>
>> Hi Jon,
>>
>> we encountered the same problem on four of our 64-bit machines in Aachen. They
>> are set up as homedir servers and quite loaded. On one machine we had to do the
>> xfs_repair after access to the filesystem resulted in Input/Output errors.
>>
>> The XFS is on top of a software RAID-5 consisting of 4 HDDs. The filesystem is
>> exported via NFS3 to our desktop cluster. Before the kernel update no problems
>> occurred. We decided to roll back to the old kernel version with the XFS
>> modules not included in the kernel rpms. So far everything seems to be
>> quiet again.
>>
>> We hope to find some time next week to test a similar machine with NFS4 and
>> software RAID-5 with XFS on the newest 64-bit SL5 kernel.
>
> You may want to try the test kernel mentioned near the end of
> https://bugzilla.redhat.com/show_bug.cgi?id=512552 since that apparently
> 'fixes' the raid-5 code to report the inability to do a stripe read-ahead
> in a way which the rh xfs module is happy with. At least that will also
> have the current security fixes as well...
>
> I don't know if this problem is visible because rh are using an older base
> of the xfs code or if there was a workaround in the version that SL were
> building. Ideally the checking for the bio pages being valid should
> probably be done in both places...
>
> Anyway it seems like a plausible fix and has been in the mainline kernels
> since some time in 2006...
>
> So far I've updated one of our machines which had the problems, and seen
> no problems yet, but the load may have gone down enough not to trigger it
> (I only did the update at 4pm local time).
>
> I'm about to update the other one if the guy running code on it doesn't
> object too strongly...
>
> -- Jon
--
__________________________________________________
Troy Dawson (630)840-6468
Fermilab ComputingDivision/LSCS/CSI/USS Group
__________________________________________________