SCIENTIFIC-LINUX-USERS Archives

March 2015

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

From: "P. Larry Nelson" <[log in to unmask]>
Date: Tue, 3 Mar 2015 09:48:48 -0600
Content-Type: text/plain

I am seeing a bizarre bug where an SL6.x system hangs on either
shutdown or reboot at the point where it tries to shut down the
loopback interface.

Let me start off by saying I'm running a mixed shop of SL5.x servers
(DNS, NIS, NTP, DHCP, NFS, etc.) along with a bunch of new cluster-esque
nodes running SL6.x.  All new SL6 nodes are Dell R410, R510, R710, for
whatever that's worth, but I don't believe they have anything to do
with the bug, per se.

Since building these new SL6 nodes many weeks back, they have all
exhibited this extremely annoying habit of hanging on shutdown or
reboot at the shutdown of the loopback interface.
Eventually (for the most part) they stop spinning whatever wheels
they're spinning and do manage to complete either the shutdown or
reboot, but it takes upwards of 15, 20, or 30 minutes!  Usually
I can't wait that long and just do a power off/on of the node.

No amount of trying to find out what they are doing has worked,
from trying to open another console window (Alt-F1, etc.) at
shutdown/reboot to having top running in one terminal window while
doing a 'service network restart' in another.  Everything just freezes!

I tried any number of things over the past several weeks, including
ripping out NetworkManager knowing that it has had a history of mucking
things up.  No luck.  They still hang.

On another front, I was having some UID/GID problems with the mix of
NFS v3 from my SL5.x file servers and NFS v4 on the SL6 nodes, so
I forced all mounts to use NFS v3.  I thought maybe that could be
the problem, but again, no luck - still hanging.

Revisiting it again in earnest this weekend via Google, I came up
empty as all hits seemed to have something to do with scenarios that
just did not apply, including many hits about a problem with running
the iscsi daemon (and there was a patch for that).  But I'm not running
the iscsi daemon.  It's not even installed.

One person who had the same problem commented that, never having
figured out the cause, he just commented out the line in
/etc/init.d/network that shuts down the loopback interface, saying it's
not a real device anyway, so what the hell.

So yesterday I thought I'd try the tactic of commenting out the
loopback shutdown on a test system.  Sure enough, the reboot was normal
with no hangs.
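
For illustration only, the workaround could be sketched in Python as
below.  This is a rough sketch, not the actual edit made on the test
system: the function name, the marker string "ifdown ifcfg-lo", and the
sample excerpt are all assumptions, and the real /etc/init.d/network
should be inspected (and backed up) before changing anything.

```python
def comment_out_loopback_shutdown(script_text, marker="ifdown ifcfg-lo"):
    """Comment out every active line that contains `marker`.

    `marker` is an assumed substring; verify it against the real script.
    Lines that are already commented out are left untouched.
    """
    out = []
    for line in script_text.splitlines(keepends=True):
        stripped = line.lstrip()
        if marker in line and not stripped.startswith("#"):
            indent = line[: len(line) - len(stripped)]
            out.append(indent + "# " + stripped)
        else:
            out.append(line)
    return "".join(out)


# Hypothetical stand-in for the relevant part of the init script,
# not copied from the real file.
sample = (
    "stop)\n"
    "    action \"Shutting down loopback interface: \" ./ifdown ifcfg-lo\n"
    "    ;;\n"
)

patched = comment_out_loopback_shutdown(sample)
print(patched)
```

The function works on text rather than on the file itself, so the
result can be reviewed before it is written anywhere.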

Ok, at least now I have a workaround, though that seems pretty kludgy.

I decided to try and nail the culprit down with a fresh rebuild of
a test system and see just where in the build process the bug appears.

After the basic install of SL6, the system rebooted just fine.
Then I did a 'yum update' with all its hundreds of patches.
It rebooted just fine, as I expected.

So the first "local" change was to configure NIS.
Try the reboot.  Reboots fine.

[ok, here is where it becomes bizarre]
Modify /etc/nsswitch.conf to switch the order of "files nis" to
"nis files" for passwd, shadow, and group, as I've always done.
Reboot.  Boom!  It hangs at loopback interface shutdown!

I repeated this many times to be sure, and it happens the same on
every SL6.x node.
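
To check which nodes carry the triggering order, a small Python sketch
like the following could be used.  It is only a sketch: the function
name and the sample text are made up for illustration, and it looks at
nothing but the passwd, shadow, and group entries mentioned above.

```python
def nis_before_files(text, databases=("passwd", "shadow", "group")):
    """Return the databases whose source list names nis ahead of files."""
    flagged = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        name, _, sources = line.partition(":")
        name = name.strip()
        if name not in databases:
            continue
        # Skip bracketed action items such as [NOTFOUND=return].
        services = [s for s in sources.split() if not s.startswith("[")]
        if "nis" in services and "files" in services:
            if services.index("nis") < services.index("files"):
                flagged.append(name)
    return flagged


# Made-up sample in nsswitch.conf syntax.
sample = """\
passwd:     nis files
shadow:     nis files
group:      files nis
hosts:      files dns
"""

print(nis_before_files(sample))
```

On a real node the text would come from reading /etc/nsswitch.conf,
for example with open("/etc/nsswitch.conf").read().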

Bug or feature?  I can't imagine it to be a feature nor can I
fathom what the order of "files" and "nis" in /etc/nsswitch.conf
has to do with the hanging of the loopback interface shutdown.
It's possible that an SL6.x NIS server might correct the situation,
but I have no time right now to spend a week on that without knowing
whether it would even work.

Comments and suggestions are welcome.

- Larry

-- 
P. Larry Nelson (217-244-9855) | IT Administrator
461 Loomis Lab                 | High Energy Physics Group
1110 W. Green St., Urbana, IL  | Physics Dept., Univ. of Ill.
MailTo:[log in to unmask]    | http://www.brf-llc.com/lnelson/
-------------------------------------------------------------------
  "Information without accountability is just noise."  - P.L. Nelson
