SCIENTIFIC-LINUX-USERS Archives

September 2008

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From:
Troy Dawson <[log in to unmask]>
Reply To:
Troy Dawson <[log in to unmask]>
Date:
Tue, 2 Sep 2008 08:39:37 -0500
Hi Faye,

I hate to burst your bubble, but the developers here at Scientific Linux have
no kudos with any RedHat people.  We really don't.  We aren't enemies, but we
don't get any special treatment at all.

If it continues to look like RedHat won't put this patch into their kernels
before Update 3, then we might have to release a patched kernel, similar to
Jean-Paul Chaput's, but with only the one extra patch to fix the lockd bug.

Troy

[log in to unmask] wrote:
> Hi,
> 
>   I would suggest that the best way to proceed with this is for Troy,
> or the other captains of SL, to approach RH and use some of the
> kudos they've established with RH over the years to help them
> understand that many people use NFS on RH products and that they should
> test and push out this kernel ASAP.
> 
>   Faye
> 
> 
> Quoting Michael Hannon <[log in to unmask]>:
> 
>> On Fri, Aug 29, 2008 at 09:50:28AM +0100, Faye Gibbins wrote:
>>> We're running SL5 with this kernel:
>>>
>>> 2.6.18-92.1.10.el5 x86_64
>>>
>>> We're experiencing regular lockd failures on one of our nfs servers.
>>>
>>> Another server, which doesn't have the problem, is still running
>>> 2.6.18-53.1.14.el5
>>>
>>> The problem is diagnosed by running:
>>>
>>> time flock ~/junk echo ok; rm ~/junk
>>>
>>> on an affected client of the server.
>>>
>>> It may be related to this report of a similar bug in the kernel discussed
>>> here:
>>>
>>> https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.22/+bug/181996
>>>
>>> Does anyone know if the latest SL5 kernels are affected by this bug?
>>> Can something be done?  The only fix we can find is to restart the box
>>> the NFS server lives on.
>> Greetings.  FWIW, I was about to send a similar note to the SL list when
>> I read your note.  I think the discussion to this point has nailed the
>> issue, but, for the record, here's what we see:
>>
>>     Aug 29 15:18:24 client-sys kernel: lockd: server xxxxxx not
>>     responding, still trying
>>
>> While on the server we see:
>>
>>     root   3569  0.0  0.0    0     0 ?    D   Aug27   0:00 [lockd]
>>
>> The process is not a user-land process and evidently cannot be stopped
>> except by killing the kernel, i.e., rebooting.
>>
>> The user-level ramification is that some, but maybe not all (?),
>> communication with the NFS-mounted /home file system is blocked.  The
>> first complaints that we typically receive are that logged-in users
>> cannot run firefox, and people that aren't logged in can't get logged
>> in.  Needless to say, those are high-profile issues for our user
>> community.
>>
>> We found the following bug report (as mentioned by others):
>>
>>     https://bugzilla.redhat.com/show_bug.cgi?id=453094
>>
>> And, yes, it WOULD be sweet if somebody could patch this.
>>
>>                                       - Mike
>> --
>> Michael Hannon            mailto:[log in to unmask]
>> Dept. of Physics          530.752.4966
>> University of California  530.752.4717 FAX
>> Davis, CA 95616-8677
>>
>>
> 
> 
> 
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
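
For anyone who wants to repeat the timing check quoted above
(time flock ~/junk echo ok; rm ~/junk) from a script, here is a minimal
sketch.  It is an illustration only and not something posted in the thread:
the function name time_flock is made up, and it assumes that Python's
fcntl.flock goes through the same locking path on the NFS mount as the
flock command does.  On a client of an affected server the call would be
expected to hang rather than return, so run it somewhere you can interrupt.

```python
import fcntl
import os
import time


def time_flock(path):
    """Take and release an exclusive lock on path; return elapsed seconds.

    Hypothetical helper, roughly mirroring `time flock <path> echo ok`.
    """
    start = time.monotonic()
    # Create the file if it is missing, then open it read/write so that
    # an exclusive lock can be requested on it.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Blocks until the lock is granted; if the lock request is never
        # answered, this call does not return.
        fcntl.flock(fd, fcntl.LOCK_EX)
        fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
    return time.monotonic() - start
```

Calling time_flock(os.path.expanduser("~/junk")) on a client with an
NFS-mounted home directory, then removing the file, corresponds to the
one-liner above.  A healthy server should answer almost immediately.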

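The server-side symptom Mike quotes, a [lockd] kernel thread sitting in
state D, can also be looked for mechanically.  The sketch below is a rough
illustration: it assumes the quoted line is in "ps aux" column order (the
original message does not say which ps options were used), and the function
name find_blocked_lockd is made up for this example.

```python
def find_blocked_lockd(ps_lines):
    """Return PIDs of [lockd] entries whose state starts with D.

    Assumes `ps aux` style columns:
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    """
    pids = []
    for line in ps_lines:
        fields = line.split()
        if len(fields) < 11:
            continue  # too short to be a ps aux row
        stat, command = fields[7], fields[10]
        # D means uninterruptible sleep, as in the line quoted above.
        if command == "[lockd]" and stat.startswith("D"):
            pids.append(int(fields[1]))
    return pids


# The line from Mike's message, used here as sample input.
sample = "root   3569  0.0  0.0    0     0 ?    D   Aug27   0:00 [lockd]"
print(find_blocked_lockd([sample]))
```

A single hit does not prove the bug, since a thread can pass through state D
briefly; the entry quoted above had been in that state since Aug27.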

-- 
__________________________________________________
Troy Dawson  [log in to unmask]  (630)840-6468
Fermilab  ComputingDivision/LCSI/CSI DSS Group
__________________________________________________
