SCIENTIFIC-LINUX-USERS Archives

January 2012

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From:
Winnie Lacesso <[log in to unmask]>
Reply To:
Winnie Lacesso <[log in to unmask]>
Date:
Fri, 20 Jan 2012 09:19:58 +0000
> SL6 systems, we can't reboot reliably, particularly if someone is logged 
> on the console or if the user initiates the shutdown from the desktop.  
> Have not seen one hang if no automounts are mounted.  The last messages 
> on the console are:
> 
> Unmounting file systems:  [ OK ]
> /home:                  rcercrcrcrcrcrcrcrcrcrcrcrcrc[...]rcrce
> init: rc main process (19211) killed by KILL signal
> 
> The "rc" is repeated for about 3.5 console lines (note "e" near 
> beginning and at end is not a typo).

Something similar happened here on 27 Dec 2010 on a batch of SL5.x
64-bit cluster WN, in run level 3 (no Desktop), no automount or NIS (local
pool accounts only), but they did have an NFS mount for /software, and
they were multiclustered with a gpfs storage cluster to mount the /gpfs
storage.
The WN were running
kernel-2.6.18-194.11.4.el5.x86_64

After yum update to kernel-2.6.18-194.26.1.el5.x86_64
(they skipped kernel-2.6.18-194.17.1 update)
then shutdown -r now to make them boot into new kernel, they all hung coming
down with this on console:

sbin: rcercrcrcrcrcrcrcrcrcr
INIT: no more processes left in this runlevel
/bin: rcercrcrcrcrcrcrcrcrcrcr
Could not kill process 2050: no such process

Had never seen that before.
Poke reset button -> they all rebooted fine.
On a 2nd batch of WN (headless) they also did exactly the same,
updating from kernel-2.6.18-194.11.4 to kernel-2.6.18-194.32.1 in Feb'11.
(In both cases in-between kernel updates were skipped; unsure if
relevant). Never happened since.
It was very curious, but there was no time to look into it.

Sympathies if you're seeing this repeatedly.
