SCIENTIFIC-LINUX-USERS Archives

January 2012

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From:
Chris Schanzle <[log in to unmask]>
Reply To:
Chris Schanzle <[log in to unmask]>
Date:
Sun, 22 Jan 2012 11:35:37 -0500
On 01/21/2012 11:01 AM, Nico Kadel-Garcia wrote:
> On Fri, Jan 20, 2012 at 11:14 PM, Chris Schanzle<[log in to unmask]>  wrote:
>> On 01/20/2012 09:51 PM, Konstantin Olchanski wrote:
>>>
>>> I feel obligated to vent about the ongoing mess-up of the nfs-utils
>>> package.
>>>
>>> In the nutshell, all of my SL6.1 machines are affected (not "both
>>> machines",
>>> both dozens of machines, 24 is the last count).
>>>
>>> The "/" directory is filling up with 1 Mbyte core files from umount.nfs
>>> at the rate of about 3 core dumps per minute.
>>
>>
>>
>> Just wanted to put a "me too" out there.  I admit to not keeping up with the
>> various nfs-utils versions and just recently joined this list.
>>
>> Seemed that umount.nfs dumping core caused /etc/mtab to not get cleaned up,
>> so you had many duplicates in the output of say, 'df'.
>>
>> We don't use kerberos, just NIS and the automounter, so it seemed like a lot
>> of the discussion didn't apply to us.  It didn't affect all our systems
>> either.
>>
>> I feel the same frustration.  I have stopped rolling out EL6 and I'm
>> apologizing to my existing early adopter users.  With this issue and my
>> previously mentioned email about the inability to reboot successfully (due
>> to umount issues) not generating any discussion, I'm preparing to hop back
>> to the other prominent NA enterprise Linux derivative.  It's great to have
>> choices.
>>
>> PS - I just noticed the mailing list doesn't add a Reply-To: field to direct
>> replies to the list.
>
> Chris, I'm not sure you can blame SL for this one at all. Our favorite
> upstream vendor occasionally publishes software with a bug, although
> they're very good about testing and fixing any reported issues, which
> is why some of us pay them for support licenses and others take
> advantage of the goodness of free software. Is there a sign or pointer
> that this was, in fact, an SL compilation generated bug?

Yes, I understand and agree that upstream occasionally ships a bug.  I'm pointing the finger (possibly!) at SL because of this thread, which describes nfs-utils being reissued due to a build environment error:
http://listserv.fnal.gov/scripts/wa.exe?A2=ind1201&L=scientific-linux-devel&T=0&P=77

There are other interesting threads including "Recent updates break autofs/ldap/krb5", but those may be upstream bugs.

> Are you mounting NFS directories at / ? That's usually a *REALLY* bad
> idea, because if the NFS mount has any issues, it interferes with any
> function that glances in / for permissions or other information. If
> not, do you have any idea why it'd dumping those files in / ?

No - I'm sorry if I wrote something that implied that.  We use standard indirect maps (e.g., /home/<user>).  I am guessing umount.nfs is dumping cores in "/" since that is its CWD.
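
One way to check that guess, as a minimal sketch: on Linux, a kernel.core_pattern value that neither starts with "/" nor with "|" is resolved relative to the working directory of the crashing process. The helper name core_dest below is hypothetical and only classifies the pattern string; it does not prove where umount.nfs itself was running.

```shell
#!/bin/sh
# core_dest (hypothetical helper): classify a kernel.core_pattern string.
#   leading "|"  -> core is piped to a helper program
#   leading "/"  -> core is written to an absolute path
#   anything else -> core is written relative to the crashing process's CWD
core_dest() {
    case "$1" in
        "|"*) echo "piped to a helper program" ;;
        /*)   echo "absolute path" ;;
        *)    echo "relative to the CWD of the crashing process" ;;
    esac
}

# Report on the running system, if the sysctl file is readable.
if [ -r /proc/sys/kernel/core_pattern ]; then
    core_dest "$(cat /proc/sys/kernel/core_pattern)"
fi
```

If this reports a relative pattern, core files landing in "/" would be consistent with a process whose CWD is "/".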

I've got my CentOS 6.2 installation process ready and hope to switch my most troublesome user on Monday.  Unfortunately, it will not be an apples-to-apples comparison with the current SL 6.1.

As an experiment, I installed 6rolling/testing/x86_64/nfs-utils/nfs-utils-1.2.3-15.el6.0.sl6.x86_64.rpm (I believe the latest upstream version, i.e., what is in 6.2) on that user's system.  The core dumps have not returned, but /etc/mtab is still accumulating duplicates.  The following lists each duplicated line with its count:

   sort /etc/mtab | uniq -dc
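
The one-liner above only catches lines that are identical byte for byte. As a hedged variant, the sketch below keys on the first three mtab fields (device, mount point, filesystem type), so repeated entries are still counted when their mount options differ. The function name mtab_dups is hypothetical, and whether the options actually differ on the affected systems is an assumption, not something reported in this thread.

```shell
#!/bin/sh
# mtab_dups (hypothetical helper): print "count device mountpoint fstype"
# for every entry that occurs more than once in an mtab-format file.
# $1: file to inspect (defaults to /etc/mtab)
mtab_dups() {
    awk '{ n[$1 " " $2 " " $3]++ }
         END { for (k in n) if (n[k] > 1) print n[k], k }' "${1:-/etc/mtab}" |
        sort -k2
}
```

Calling mtab_dups with no argument inspects /etc/mtab; passing /proc/mounts instead shows what the kernel itself considers mounted, which can help separate stale mtab entries from real duplicate mounts.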
