We had a massive power failure, beyond what the UPS could handle.
Despite attempts to find a way for the system to shut down gracefully,
it simply powered down without unmounting the disk partitions.
Nominally, the local backup UPS I am using (APC Back-UPS 650) has a
DB-9 RS-232 interface port, but I have not found any Linux application
that communicates reliably with this model of UPS (that is, one that
emulates the behavior of the application available from APC for MS
Windows, which senses the RS-232 signals from the UPS, waits the
appropriate time, and then shuts the system down). Has anyone found one?
Upon boot, the automatic fsck failed, and the system prompted for the
root password. However, no more than one character of the password would
be accepted, which sent me back to the same prompt in an endless loop
and gave me no way to take control of the system (to run fsck manually).
I then booted into the rescue image on the SL 6 bootable installation
DVD and manually ran fsck on all the partitions except the one that
was mounted on /mnt/sysimage. Before issuing a reboot, I did
umount /mnt/sysimage and verified with mount that the partition was
no longer mounted. I issued reboot, but the system evidently had not
done a clean umount: fsck failed again at boot (on just the one
partition that had been mounted on /mnt/sysimage), and I had to repeat
the above procedure, this time NOT mounting anything. Instead, I ran
fsck -y /dev/sda5 (/dev/sda5 happens to be the partition that had been
mounted on and unmounted from /mnt/sysimage), then sync, sync, sync,
reboot, and everything worked.
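
For anyone who needs to repeat this, the sequence that finally worked
can be sketched as a small script. This is only a sketch under stated
assumptions: it is meant to be run from the rescue shell with nothing
mounted, /dev/sda5 is simply the partition on my machine, and the names
DRY_RUN, run, and rescue_fsck are made up for illustration (they are not
part of SL or of any rescue tool). By default it only prints the
commands; it executes them only if DRY_RUN is set to 0.

```shell
#!/bin/sh
# Sketch of the recovery sequence from the rescue shell.
# DRY_RUN, run and rescue_fsck are illustrative names, not SL tools.
DRY_RUN="${DRY_RUN:-1}"   # 1 (default): print commands only; 0: execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Check one unmounted partition, flush buffers, then reboot.
rescue_fsck() {
    dev="$1"
    # Refuse to check a partition that is still listed as mounted.
    if grep -q "^$dev " /proc/mounts 2>/dev/null; then
        echo "refusing: $dev is still mounted" >&2
        return 1
    fi
    run fsck -y "$dev"
    run sync
    run sync
    run sync
    run reboot
}
```

After sourcing the script, rescue_fsck /dev/sda5 prints the five
commands in order; running it with DRY_RUN=0 would actually execute
them, so the device name should be checked first.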
Both of these TUV bugs need to be fixed (password for manual control,
umount not cleanly unmounting for the rescue system) -- I do not care if
the fix is from Fermilab/CERN stock SL following TUV or from ElRepo or
another non-TUV compliant chain that does fix the problem. I am running
SL 6x X86-64 current.
Yasha Karant