The new SL 7 installer is not your friend. But what your sysadmin friend was missing, in order to edit the root drive's /etc/fstab, was the command "mount -o remount,rw /", or "mount -o remount,rw /mnt/sysimage", depending on the state of the system.
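
As a rough sketch of that recovery (not tested on the machine in question): once root is writable, the failed data drive's line in fstab can be commented out so the boot-time fsck skips it. The helper name disable_fstab_entry and the mount point /data are made up for illustration; substitute whatever the data drive's real mount point is.

```shell
#!/bin/sh
# Hypothetical helper: comment out the fstab entry for one mount point.
# Usage: disable_fstab_entry <fstab-path> <mount-point>
disable_fstab_entry() {
    fstab=$1
    mnt=$2
    # Keep a backup copy next to the original before touching it.
    cp -p "$fstab" "$fstab.bak" || return 1
    awk -v mnt="$mnt" '
        # Skip lines that are already comments; prefix the matching entry with "#".
        $0 !~ /^[ \t]*#/ && $2 == mnt { print "#" $0; next }
        { print }
    ' "$fstab.bak" > "$fstab"
}

# In the emergency shell, with root mounted read-only (run as root):
#   mount -o remount,rw /
#   disable_fstab_entry /etc/fstab /data          # /data is an assumed mount point
#
# From rescue media, with the installed system under /mnt/sysimage:
#   mount -o remount,rw /mnt/sysimage
#   disable_fstab_entry /mnt/sysimage/etc/fstab /data
```

Editing the line by hand with vi after the remount works just as well. Setting the sixth fstab field (the fsck pass number) to 0 for the data drive is another way to keep it out of the boot-time fsck.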
Nico Kadel-Garcia
Sent from iPhone
> On Feb 12, 2015, at 16:50, Yasha Karant wrote:
>
> I always run an "enterprise" environment on any server, including our GPU compute engine for research applications. This is not a
> testbed machine per se, although we must load new drivers and new concurrent/GPU implementation methodologies as these evolve. The base of the
> GPU engine is CUDA. New compute applications, often from other problem domain areas, typically are run (sometimes ported) to this compute engine.
>
> We recently started the transition from SL 6 to SL 7; a colleague here was doing the work. He has numerous comments, posted below,
> and is now insisting that SL (e.g., RHEL) 7 is not suitable for production use in our environment, but that OpenSuSE, Debian, or Mint are more suitable environments.
> I personally disagree, but I would greatly appreciate commentary, particularly from anyone who runs other Linux distributions in a production server environment.
> We must support CUDA, some variety of MPI, and operational Infiniband drivers and services.
>
> Comments (only lightly "cleaned up"):
>
> so i verified that the drive indeed has a bad superblock - open suse did not hesitate to mount
> because the drive was not in fstab, sl6 had mounted it previously because drives only
> get fsckd every (usually) 20 reboots
>
> so this drive reached the 20 reboot threshold and fsck failed with bad superblock -
>
> so far so good. the problem is - sl6 refused to mount the root drive rw in the emergency shell,
> but also refused to do anything other than reboot once a drive that is known not to be the
> system drive failed fsck (and it knew this was not the sys drive because it had already mounted
> root to get at fstab)
>
> the sane, competent, safe solution to a drive problem is to not mount that drive, not refuse
> to boot. drive failure with bad superblock - however it is a data drive, in no way needed to
> boot the system.
>
> over many trials, it became clear that:
>
> 1> the drive is in fstab, so system tries fsck which fails into a shell - there appears to be no
> way to tell the system to continue to load, since manual fsck also fails - reboot leads to the same
> problem
>
> 2> removed the drive - does not help, still tries to fsck and fails, and refuses to continue to load
>
> 3> tried to edit fstab from shell to rem out drive - could not edit, drive was mounted readonly,
> could not change
>
> 4> tried to boot from the sl7 live/install usb key - did not let me get to a shell, did not want to
> go ahead and install on top of current system
>
> 5> created open suse usb key - this allowed me to boot, mount the raid1 drives, edit fstab -
> whereupon the system was able to boot
>
> What kind of system is unable to deal gracefully with a failed data drive?
>
> My conclusion - Scientific Linux is too fragile a system for serious use