SCIENTIFIC-LINUX-USERS Archives

August 2008

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From:
Michael Hannon <[log in to unmask]>
Reply To:
Michael Hannon <[log in to unmask]>
Date:
Fri, 29 Aug 2008 15:50:39 -0700
On Fri, Aug 29, 2008 at 09:50:28AM +0100, Faye Gibbins wrote:
> We're running SL5 with this kernel:
> 
> 2.6.18-92.1.10.el5 x86_64
> 
> We're experiencing regular lockd failures on one of our nfs servers.
> 
> Another server, which doesn't have the problem, is still running 2.6.18-53.1.14.el5
> 
> The problem is diagnosed by running:
> 
> time flock ~/junk echo ok; rm ~/junk
> 
> on an affected client of the server.
> 
> It may be related to this report of a similar bug in the kernel, discussed
> here:
> 
> https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.22/+bug/181996
> 
> Does anyone know if the latest SL5 kernels are affected by this bug?
> Can something be done? The only fix we can find is to restart the box
> the NFS server lives on.

Greetings.  FWIW, I was about to send a similar note to the SL list when
I read your note.  I think the discussion to this point has nailed the
issue, but, for the record, here's what we see:

    Aug 29 15:18:24 client-sys kernel: lockd: server xxxxxx not
    responding, still trying

While on the server we see:

    root   3569  0.0  0.0    0     0 ?    D   Aug27   0:00 [lockd]

[lockd] is a kernel thread, not a user-land process, and once it is
stuck in uninterruptible sleep (state D) it evidently cannot be stopped
except by killing the kernel, i.e., rebooting.
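
For anyone who wants to watch for this state on the server without
eyeballing ps output, here is a minimal Python sketch that scans /proc
for a process named lockd in state D.  The function names (parse_stat,
find_blocked) are my own and purely illustrative.  Note the assumption
baked in: a single D-state sample does not prove a hang, since kernel
threads pass through D briefly during normal I/O, so this is only
meaningful if the state persists across repeated checks.

```python
import os


def parse_stat(line):
    """Split a /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the last ')' instead of splitting on whitespace.
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren])
    comm = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, comm, state


def find_blocked(name="lockd", proc_root="/proc"):
    """Return PIDs of processes called `name` that are currently in state D."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat(handle.read())
        except (OSError, ValueError):
            continue  # process exited, or the entry was not readable
        if comm == name and state == "D":
            blocked.append(pid)
    return blocked


if __name__ == "__main__":
    # Assumption: run on the NFS server itself, from cron or by hand.
    print(find_blocked())
```

Running it a few times a minute apart and getting the same PID back each
time would correspond to the ps line shown above.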

The user-level ramification is that some, but maybe not all (?),
communication with the NFS-mounted /home file system is blocked.  The
first complaints that we typically receive are that logged-in users
cannot run firefox and that people who aren't logged in cannot log
in.  Needless to say, those are high-profile issues for our user
community.
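
To catch the problem before the complaints arrive, the flock test
quoted above can be wrapped in a timeout.  The following is a rough
Python sketch under several assumptions: probe_lock and the default
timeout are hypothetical names and values of my own, and the sketch uses
fcntl.lockf (a POSIX byte-range lock) rather than the flock(1) command,
on the assumption that both end up going through lockd on an NFS mount.
The lock is taken in a child process so that the caller gets an answer
even if the lock request never returns.

```python
import subprocess
import sys

# Code run in the child: open the file, take and release an exclusive lock.
CHILD = (
    "import fcntl, sys\n"
    "with open(sys.argv[1], 'a') as handle:\n"
    "    fcntl.lockf(handle, fcntl.LOCK_EX)\n"
    "    fcntl.lockf(handle, fcntl.LOCK_UN)\n"
)


def probe_lock(path, timeout=10.0):
    """Return True if a lock on `path` was obtained within `timeout` seconds."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD, path])
    try:
        return proc.wait(timeout=timeout) == 0
    except subprocess.TimeoutExpired:
        # Best effort only: a child stuck waiting on the server may not die,
        # so we deliberately do not wait for it here.
        proc.kill()
        return False


if __name__ == "__main__":
    # Assumption: the path given on the command line lives on the NFS mount.
    target = sys.argv[1]
    print("ok" if probe_lock(target) else "lock request timed out or failed")
```

A False result only says that the lock was not granted in time; a file
that is legitimately locked by another process would look the same, so a
scratch file that nothing else touches is the safer target.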

We found the following bug report (as mentioned by others):

    https://bugzilla.redhat.com/show_bug.cgi?id=453094

And, yes, it WOULD be sweet if somebody could patch this.

					- Mike
-- 
Michael Hannon            mailto:[log in to unmask]
Dept. of Physics          530.752.4966
University of California  530.752.4717 FAX
Davis, CA 95616-8677
