SCIENTIFIC-LINUX-USERS Archives

March 2015

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

From: "P. Larry Nelson" <[log in to unmask]>
Date: Tue, 3 Mar 2015 12:52:07 -0600

Hi Stephen,

Replies in-line below.

Thanks,
- Larry

On 3/3/15 11:49 AM, Stephen John Smoogen wrote:
>
> On Mar 3, 2015 8:49 AM, "P. Larry Nelson" <[log in to unmask]
> <mailto:[log in to unmask]>> wrote:
>  >
>  > I am seeing a bizarre bug where an SL6.x system hangs on either
>  > shutdown or reboot at the point where it wants to shut down the
>  > loopback interface.
>  >
>  > Let me start off by saying I'm running a mixed shop of SL5.x servers
>  > (DNS, NIS, NTP, DHCP, NFS, etc.) along with a bunch of new cluster-esque
>  > nodes running SL6.x.  All new SL6 nodes are Dell R410, R510, R710, for
>  > whatever that's worth, but I don't believe they have anything to do
>  > with the bug, per se.
>  >
>  > Since building these new SL6 nodes many weeks back, they have all
>  > exhibited this extremely annoying habit of hanging on shutdown or
>  > reboot at the shutdown of the loopback interface.
>  > Eventually (for the most part) they stop spinning whatever wheels
>  > they're spinning and do manage to complete either the shutdown or
>  > reboot, but it takes upwards of 15, 20, or 30 minutes!  Usually
>  > I can't wait that long and just do a power off/on of the node.
>  >
>  > No amount of trying to find out what they are doing has worked,
>  > from trying to open another console window (Alt-F1, etc.) at
>  > shutdown/reboot to having top running in one terminal window while
>  > doing a 'service network restart' in another.  Everything just freezes!
>  >
>  > I tried any number of things over the past several weeks, including
>  > ripping out NetworkManager knowing that it has had a history of mucking
>  > things up.  No luck.  They still hang.
>  >
>  > On another front, I was having some UID/GID problems with the mix of
>  > NFS v3 from my SL5.x file servers and NFS v4 on the SL6 nodes, so
>  > I forced all mounts to use NFS v3.  I thought maybe that could be
>  > the problem, but again, no luck - still hanging.
>  >
>  > Revisiting it again in earnest this weekend via Google, I came up
>  > empty as all hits seemed to have something to do with scenarios that
>  > just did not apply, including many hits about a problem with running
>  > the iscsi daemon (and there was a patch for that).  But I'm not running
>  > the iscsi daemon.  It's not even installed.
>  >
>  > One comment by someone who also had the same problem was that he, not
>  > ever figuring out the cause, just commented out the line in
>  > /etc/init.d/network that shuts down the loopback interface, saying it's
>  > not a real device anyway, so what the hell.
>  >
>  > So yesterday I thought I'd try the tactic of commenting out the
>  > loopback shutdown on a test system.  Sure enough, the reboot was
>  > normal with no hangs.
>  >
>  > Ok, at least now I have a workaround, though that seems pretty kludgy.
>  >
>  > I decided to try and nail the culprit down with a fresh rebuild of
>  > a test system and see just where in the build process the bug appears.
>  >
>  > After the basic install of SL6, the system reboots just fine.
>  > Then do a 'yum update' with all its hundreds of patches.
>  > It reboots just fine, as I expected.
>  >
>  > So the first "local" change was to configure NIS.
>  > Try the reboot.  Reboots fine.
>  >
>  > [ok, here is where it becomes bizarre]
>  > Modify /etc/nsswitch.conf to switch the order of "files nis" to
>  > "nis files" for passwd, shadow, and group, as I've always done.
>  > Reboot.  Boom!  It hangs at loopback interface shutdown!
>  >
>
> I want to thank you for giving all the details of your testing. I would
> like to use it as a future example of how to be constructive and helpful
> to other people needing help.

Thanks.  Yep, feel free to use this as an example.  I suppose it comes
from being in the biz for over 46 years and shaking my head at *SO* many
ill-conceived requests for help on listservs.

> So have you looked at nscd any? Does having nscd turned on or off alter
> this problem?

Nay, I have not, and frankly, it didn't occur to me till you asked.
I will explore that when I get a chance and see if it alters the problem.

> Also, what is in hosts, and is the NIS server listed? Thanks

I assume you're talking about /etc/hosts on the clients.
The SL6.x clients just have the following in hosts:

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
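
For anyone who wants to compare nodes quickly, here is a minimal Python sketch (not part of the original exchange) that reports the source order for passwd, shadow, and group from nsswitch.conf text and checks whether a given name appears in hosts-format text. The helper names lookup_order and host_listed, and the server name nis1.example.org, are hypothetical and only there for illustration.

```python
import re

# Sketch only: helper names and sample data are hypothetical.

def lookup_order(nsswitch_text, databases=("passwd", "shadow", "group")):
    """Return {database: [sources, ...]} for the requested nsswitch.conf databases."""
    order = {}
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()      # drop comments and padding
        if ":" not in line:
            continue
        name, sources = line.split(":", 1)
        name = name.strip()
        if name in databases:
            # remove action items such as [NOTFOUND=return], keep source names
            sources = re.sub(r"\[[^\]]*\]", " ", sources)
            order[name] = sources.split()
    return order


def host_listed(hosts_text, hostname):
    """Return True if hostname appears as a name or alias in hosts-format text."""
    for raw in hosts_text.splitlines():
        fields = raw.split("#", 1)[0].split()
        # fields[0] is the address; the remaining fields are names and aliases
        if len(fields) >= 2 and hostname in fields[1:]:
            return True
    return False


# Sample input mirroring the configuration described in this thread
sample_nsswitch = "passwd: nis files\nshadow: nis files\ngroup: nis files\n"
sample_hosts = (
    "127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4\n"
    "::1         localhost localhost.localdomain localhost6 localhost6.localdomain6\n"
)

print(lookup_order(sample_nsswitch))
print(host_listed(sample_hosts, "nis1.example.org"))  # placeholder NIS server name
```

To run it against a real node, read /etc/nsswitch.conf and /etc/hosts into strings and pass them in place of the sample text. A node that reports "nis" ahead of "files" while the NIS server is absent from hosts matches the configuration described above; the sketch only reports that state and does not explain the hang.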

>  > I repeated this many times to be sure, and it happens the same on
>  > every SL6.x node.
>  >
>  > Bug or feature?  I can't imagine it to be a feature nor can I
>  > fathom what the order of "files" and "nis" in /etc/nsswitch.conf
>  > has to do with the hanging of the loopback interface shutdown.
>  > It's possible that an SL6.x NIS server might correct the situation,
>  > but I have no time right now to spend a week on that not knowing
>  > it would even work.
>  >
>  > Comments and suggestions are welcome.
>  >
>  > - Larry
>  >
>  > --
>  > P. Larry Nelson (217-244-9855) | IT Administrator
>  > 461 Loomis Lab                 | High Energy Physics Group
>  > 1110 W. Green St., Urbana, IL  | Physics Dept., Univ. of Ill.
>  > MailTo:[log in to unmask] <mailto:[log in to unmask]>    | http://www.brf-llc.com/lnelson/
>  > -------------------------------------------------------------------
>  >  "Information without accountability is just noise."  - P.L. Nelson
>


-- 
P. Larry Nelson (217-244-9855) | IT Administrator
461 Loomis Lab                 | High Energy Physics Group
1110 W. Green St., Urbana, IL  | Physics Dept., Univ. of Ill.
MailTo:[log in to unmask]    | http://www.brf-llc.com/lnelson/
-------------------------------------------------------------------
  "Information without accountability is just noise."  - P.L. Nelson
