SCIENTIFIC-LINUX-USERS Archives

January 2012

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From:
Pat Riehecky <[log in to unmask]>
Reply To:
Pat Riehecky <[log in to unmask]>
Date:
Mon, 23 Jan 2012 09:44:54 -0600
Ordinarily I try to avoid entering these sorts of threads on the list.  
I'm frustrated, you're frustrated, and meaningful progress, while often 
born out of frustration, is often impolite in the process.  So forgive 
me if I sound a bit annoyed, but this nfs-utils problem has been a thorn 
in my side for a while now.  I have kept my replies as short as 
possible to avoid a more confrontational approach.  Please consider the 
terse nature of my comments an attempt at etiquette: by limiting my 
words I hopefully limit my opportunity to be aggressively 
confrontational.  I had a longer response, but it was far too aggressive 
and forceful, lacking altogether the spirit of co-operation.


On 01/22/2012 10:35 AM, Chris Schanzle wrote:
> On 01/21/2012 11:01 AM, Nico Kadel-Garcia wrote:
>> On Fri, Jan 20, 2012 at 11:14 PM, Chris Schanzle<[log in to unmask]> wrote:
>>> On 01/20/2012 09:51 PM, Konstantin Olchanski wrote:
>>>>
>>>> I feel obligated to vent about the ongoing mess-up of the nfs-utils
>>>> package.
I receive a few emails about this package per week these days.  Some 
politely ask what is going on, some simply demand a fix, as though they 
think the error was deliberate.  I have attempted to politely reply to 
each one.
>>>>
>>>> In a nutshell, all of my SL6.1 machines are affected (not "both
>>>> machines", but dozens of machines; 24 at last count).
>>>>
>>>> The "/" directory is filling up with 1 Mbyte core files from umount.nfs
>>>> at the rate of about 3 core dumps per minute.
>>>
>>>
>>>
>>> Just wanted to put a "me too" out there.  I admit to not keeping up with
>>> the various nfs-utils versions and just recently joined this list.
>>>
>>> Seemed that umount.nfs dumping core caused /etc/mtab to not get cleaned
>>> up, so you had many duplicates in the output of say, 'df'.
>>>
>>> We don't use kerberos, just NIS and the automounter, so it seemed like a
>>> lot of the discussion didn't apply to us.  It didn't affect all our
>>> systems either.
>>>
>>> I feel the same frustration.  I have stopped rolling out EL6 and I'm
>>> apologizing to my existing early adopter users.  With this issue and my
>>> previously mentioned email about the inability to reboot successfully
>>> (due to umount issues) not generating any discussion, I'm preparing to
>>> hop back to the other prominent NA enterprise Linux derivative.  It's
>>> great to have choices.
I'm sorry to hear that your EL6 rollout has been paused.  Excluding this 
particular issue, SL6 has generally been considered a stable and 
up-to-date release.
>>>
>>> PS - I just noticed the mailing list doesn't add a Reply-To: field to
>>> direct replies to the list.
>>
>> Chris, I'm not sure you can blame SL for this one at all. Our favorite
>> upstream vendor occasionally publishes software with a bug, although
>> they're very good about testing and fixing any reported issues, which
>> is why some of us pay them for support licenses and others take
>> advantage of the goodness of free software. Is there a sign or pointer
>> that this was, in fact, an SL compilation generated bug?
>
> Yes, I understand and agree we get upstream's occasional rare bugs.  
> I'm pointing the finger (possibly!) at SL due to this thread which 
> references the re-issuing of nfs-utils due to a build environment error:
> http://listserv.fnal.gov/scripts/wa.exe?A2=ind1201&L=scientific-linux-devel&T=0&P=77 
>
>
> There are other interesting threads including "Recent updates break 
> autofs/ldap/krb5", but those may be upstream bugs.

The segfaults are a non-upstream bug, but not specific to SL.  Another 
major rebuild had a similar problem with their nfs-utils package.  The 
buildroot updates were forwarded on to them by one of our list members.  
We privately contacted a different major rebuild with the buildroot 
updates before their package was publicly released.

The package with the incremented version was built under identical 
conditions as the one which does not exhibit the segfault.  My tests all 
passed without incident.  It sat in testing until a number of users 
reported it was working fine.  If necessary I can prove both; however, I 
have no intention of naming the users who graciously tested this package 
and, like me, could not make it segfault.

>
>> Are you mounting NFS directories at / ? That's usually a *REALLY* bad
>> idea, because if the NFS mount has any issues, it interferes with any
>> function that glances in / for permissions or other information. If
>> not, do you have any idea why it'd be dumping those files in / ?
>
> No - I'm sorry if I wrote something that implied that.  We use 
> standard indirect maps (e.g., /home/<user>).  I am guessing umount.nfs 
> is dumping cores in "/" since that is its CWD.
>
> I've got my CentOS 6.2 installation process ready and will switch my 
> most troublesome user hopefully Monday.  Unfortunately, it is not an 
> apples-to-apples comparison with the current SL 6.1.
>
> As an experiment, I installed 
> 6rolling/testing/x86_64/nfs-utils/nfs-utils-1.2.3-15.el6.0.sl6.x86_64.rpm 
> (I believe the latest upstream version, i.e., what is in 6.2) on that 
> user's system and while the core dumps have not returned, /etc/mtab is 
> still accumulating duplicates, viewable with duplicate counts via:
>
>   sort /etc/mtab | uniq -dc
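
For anyone who wants the same duplicate count without a shell pipeline, 
here is a minimal Python sketch of what that command does: it counts 
identical lines and reports only those seen more than once.  The function 
name duplicate_mounts and the sample mtab lines (server:/export/home/alice 
and so on) are made up for illustration; they are not taken from any 
affected system.

```python
from collections import Counter


def duplicate_mounts(mtab_text):
    """Return {line: count} for every non-blank line seen more than once.

    Roughly mirrors `sort /etc/mtab | uniq -dc`: only repeated lines are
    reported, each with the number of times it occurs.
    """
    counts = Counter(line for line in mtab_text.splitlines() if line.strip())
    return {line: n for line, n in counts.items() if n > 1}


# Hypothetical mtab contents: one automounted home directory listed three
# times, as might happen if umount.nfs never removed its old entries.
SAMPLE_MTAB = """\
/dev/sda1 / ext4 rw 0 0
server:/export/home/alice /home/alice nfs rw,addr=192.0.2.10 0 0
server:/export/home/alice /home/alice nfs rw,addr=192.0.2.10 0 0
server:/export/home/bob /home/bob nfs rw,addr=192.0.2.10 0 0
server:/export/home/alice /home/alice nfs rw,addr=192.0.2.10 0 0
"""

if __name__ == "__main__":
    # Print count first, then the line, in the same spirit as `uniq -c`.
    for line, n in sorted(duplicate_mounts(SAMPLE_MTAB).items()):
        print(f"{n:7d} {line}")
```

To check a real machine, read the text from /etc/mtab instead of 
SAMPLE_MTAB; an empty result means no entry is repeated.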



-- 
Pat Riehecky
Scientific Linux Developer
