On Feb 10, 2005, at 10:44 AM, Jos van Wezel wrote:
> ...
> Interestingly, we went to nfs over tcp to work around the problem that
> Devin reports, namely the slowdown and final lockup of the nfs servers
> in relation to local IO.
We don't see much, if any, improvement doing nfs over tcp instead of
udp.
> Once we removed the backup client from the servers it was much better
> but it still happens once in a while.
>
> Devin, can you confirm that in the lockup all nfsd end up in D wait?
Yes, when the NFS I/O blocking is occurring, the nfsd processes go into
the DW state. On the client side, the blocked process goes into the
'D' state.
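
For anyone who wants to check the same thing on their own server, here is a
minimal sketch (not what we actually run) that lists nfsd processes in
uninterruptible sleep by reading /proc/<pid>/stat on Linux. The helper names
parse_stat_line, blocked and read_proc_stats are made up for illustration, and
it assumes the kernel threads are named "nfsd".

```python
import glob


def parse_stat_line(line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = line.index("(")
    right = line.rindex(")")
    pid = int(line[:left].strip())
    comm = line[left + 1:right]
    state = line[right + 1:].split()[0]
    return pid, comm, state


def blocked(stat_lines, name="nfsd"):
    """Return PIDs of processes called `name` in uninterruptible sleep (D)."""
    pids = []
    for line in stat_lines:
        pid, comm, state = parse_stat_line(line)
        if comm == name and state == "D":
            pids.append(pid)
    return pids


def read_proc_stats():
    """Yield the stat line of every process currently visible under /proc."""
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as f:
                yield f.read()
        except OSError:
            # The process exited between listing and reading.
            continue


if __name__ == "__main__":
    print(blocked(read_proc_stats()))
```

Run on the server while the blocking is occurring, this prints the PIDs of
the nfsd threads that are stuck in 'D'. Passing a different name to blocked()
would do the same check for a blocked client process.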
Thanks,
Devin
>
> J
>
> Devin Bougie wrote:
>> Thanks Steve and Martin,
>> On Feb 10, 2005, at 3:41 AM, Bly, MJ (Martin) wrote:
>>> Having now read through the description of the problem (which wasn't
>>> available to me when I created the problem description quoted below!),
>>> I don't think it's the same problem as reported by Devin.
>> Yes, I also agree these are separate issues. The problem we're
>> seeing affects all nfs clients (Solaris, Tru64, linux, ...) accessing
>> nfs-exported filesystems hosted by a RH (RH9, RHEL3, FC3, ...) nfs
>> server. We do also see this with or without LVM.
>> Devin
>>>
>>> The problem we see is absolutely fatal and there is no way out -
>>> processes do not complete and there is no I/O blocking as
>>> described...
>>>
>>> That said, I've seen LVM implicated in some NFS-related fatal lockups
>>> of a different variety.
>>>
>>> Martin.
>>>
>>>
--------------------
Devin Bougie
Laboratory for Elementary-Particle Physics
Computer Group
(607) 254-8353