SCIENTIFIC-LINUX-USERS Archives

September 2007

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

From:     Jon Peatfield <[log in to unmask]>
Reply-To: Jon Peatfield <[log in to unmask]>
Date:     Fri, 7 Sep 2007 01:04:25 +0100

On Thu, 6 Sep 2007, William Shu wrote:

<snip>
> I used system-config-lvm in the SL 5.0 Live CD to change the logical 
> volume from the live CD.

ok so lv resizing works as expected from the Live CD.

> I have successfully reduced the size of the 
> physical volume. However, the partition size in question (/dev/sda6) 
> does not seem to have been reduced. fdisk and parted do not find free 
> space to create a new partition, and gparted (free-standing) still shows 
> the old partition size.
>
> Question: why was the partition /dev/sda6 not automatically reduced in 
> size, when the physical volume was reduced? Do I manually reduce the 
> partition to create the free space? Some of the interactions follow:
<snip use of pvresize>

Because pvresize just alters the amount of space that LVM is allowed to 
use.  The pvresize manpage says:

        Shrink the PV on /dev/sda1 prior to shrinking the partition with
        fdisk (ensure that the PV size is appropriate for your intended new
        partition size):
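
In other words pvresize only changes how much of the partition LVM will 
use; the partition table itself is untouched, and you have to shrink the 
partition yourself afterwards.  Very roughly (the sizes here are made up, 
check your own pvdisplay output first, and note the old start cylinder 
before touching anything):

   pvresize --setphysicalvolumesize 30G /dev/sda6
   fdisk /dev/sda      # delete partition 6, recreate it at the *same*
                       # start but ending earlier (still >= 30G)
   partprobe /dev/sda  # or reboot so the kernel re-reads the table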

Altering the pv-size is an unusual operation.  What are you actually 
trying to do?  If you just wanted to move some of the space from one lv to 
another (in the same vg) then tools like system-config-lvm will let you do 
that (though shrinking can't be done while a filesystem is live)...
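
From the command line the same shuffle looks something like this - just a 
sketch, assuming ext3 and the stock VolGroup00/LogVol names from your 
messages, with invented sizes:

   umount /dev/VolGroup00/LogVol01
   e2fsck -f /dev/VolGroup00/LogVol01
   resize2fs /dev/VolGroup00/LogVol01 4G      # shrink the fs first...
   lvreduce -L 4G /dev/VolGroup00/LogVol01    # ...then the lv
   lvextend -L +2G /dev/VolGroup00/LogVol00   # grow the other lv...
   resize2fs /dev/VolGroup00/LogVol00         # ...then let the fs fill it

Order matters: fs before lv when shrinking, lv before fs when growing.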

<snip snip>
> [root@slinux sluser]# mount -text3 /dev/mapper/VolGroup00/LogVol00 /mnt/anchor
> mount: mounting /dev/mapper/VolGroup00/LogVol00 failed: No such file or directory
> [root@slinux sluser]# mount -text3 /dev/VolGroup00/LogVol00 /mnt/anchor
> mount: mounting /dev/VolGroup00/LogVol00 failed: No such file or directory
>
> [root@slinux sluser]# mount -text3 VolGroup00/LogVol00 /mnt/anchor
> mount: mounting VolGroup00/LogVol00 failed: No such file or directory
>
> [root@slinux sluser]# mount -text3 /dev/sda6 /mnt/anchor        # physical partition is sda6.
> mount: mounting /dev/sda6 failed: No such file or directory

Is lvm active at this point?  What do you get from running:

   vgscan
   vgchange -tv -ay
   pvdisplay -c
   vgdisplay -c
   lvdisplay -c

BTW the error I'd expect if the device doesn't exist is:

   mount: special device /dev/VolGroup00/LogVol00 does not exist

or if the mount-point doesn't exist I'd expect something like:

   mount: mount point /mnt/anchor does not exist

At least that is what I get from my SL50 machines.

>>> Q3) How do I perform a file system check with LVM partitions?
>>> I suspect I have a disk crash/bad sectors on my desktop but do not want
>>> to lose information. fsck does not work, presumably because of wrong
>>> file type, since I have to unmount the partition!

Copy the raw disk/partitions off to somewhere safe before working on it if 
there is anything 'valuable' on the disk.  If the disk does have bad 
sectors then a utility like ddrescue may be more helpful than plain dd.
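
With GNU ddrescue that is roughly (the destination paths are just 
placeholders for somewhere with enough free space):

   ddrescue /dev/sda6 /somewhere/safe/sda6.img /somewhere/safe/sda6.log

It keeps going past read errors and records in the log file which sectors 
it couldn't get, so later runs can retry just those.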

Once you have a copy in your firesafe (or on another continent), the 
badblocks command will let you check the disk for bad blocks.  Of course 
drives may hide bad blocks by re-mapping them automatically, but you 
should be able to tell, and something like smartd (if it is monitoring 
your disks) will probably alert you to disks which are starting to fail 
(if you are lucky, with enough time to get your data off!)
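
For a first look (device names here are examples, not necessarily yours):

   badblocks -sv /dev/sda6     # read-only scan; -s shows progress
   smartctl -a /dev/sda        # SMART attributes (from smartmontools)

Non-zero (and growing) reallocated or pending sector counts in the 
smartctl output are a bad sign.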

My personal view is that any disk reporting bad blocks is dead or dying 
and should be replaced asap - more on that later...

> I had messages similar to the one below for various devices tried.  (The 
> copy below is hand-copied, as I do not know how to copy the screen nor 
> mount a flash stick when in rescue mode.)

hmm reference to rescue mode.... hmm...

> # e3fsck -fcvn /dev/VolGroup00/LogVol00
> e3fsck: Command not found.

The command is e2fsck even for ext3 file systems; anyway, fsck will pick 
the right one.
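
Note that fsck with no device argument just walks /etc/fstab; to check 
one lv, name it explicitly (and the lv must be active - see the vgchange 
above), e.g.:

   fsck -fcvn /dev/VolGroup00/LogVol00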

> # fsck -fcvn
> e2fsck 1.39 (29-May-2006)
> e2fsck: No such file or directory while trying to open /dev/VolGroup00/LogVol00. The superblock could not be read or does not describe the correct ext2 filesystem. If the device is valid and it really contains an ext2 filesystem (and not swap or ufs or something else), then the superblock is corrupt, and you might try running e2fsck with an alternate superblock:
>     e2fsck -b 8193 <device>.

That is the error I'd expect if the lv specified doesn't exist.

<snip>
>>> Q4) How can you control where you mount devices automatically  (e.g.,
>>> flash sticks)?
>>> The mountpoints are not indicated in /etc/fstab, and the config files
>>> (*.conf) of automount and autofs do not seem to tell me where! In short
>>> I do not understand how these or the hal (hardware abstraction layer) work!
>>>
>>
>> They get mounted in /media
>>
>> I'll let others explain how to figure that out.
>
> Is there some documentation that presents things in a coherent fashion? 
> So far, I have drifted into finding out about udev, but things are just 
> getting more elaborate!

udev isn't really relevant except that it is involved in setting up the 
devices when 'hotplug' stuff happens.

The magic keyword to look for is 'hal' (or 'hald').  The hald keeps track 
of hardware and presents APIs to access it (over d-bus I think).

The shortish answer is that hal picks a mount-point based on info from the 
device or file-system.  Most commonly it will pick the volume-label if the 
file-system has one (and it doesn't clash with an existing mount) and then 
use that under /media/.
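
You can see (or set) the label hal will use with the usual tools, e.g. 
for an ext2/ext3 partition on a stick (/dev/sdb1 is just an example name, 
check dmesg after plugging it in):

   e2label /dev/sdb1            # print the current label
   e2label /dev/sdb1 JSPDATA    # set one
   blkid /dev/sdb1              # prints LABEL= and TYPE= for most fs types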

Applications like gnome-volume-manager speak to hald (over d-bus) to get 
info about available devices and make requests to have things done (e.g. 
mount or umount volumes).  In fact g-v-m calls gnome-mount which speaks to 
hal for it.  From the command-line you can call gnome-mount directly if 
you want, e.g. I may use:

   gnome-mount -p JSPDATA

and it mounts my usb-stick (which has the volume-label JSPDATA) under 
/media/JSPDATA/, and later I can say:

   gnome-mount --unmount -p JSPDATA

to make it go away.  If there isn't a suitable volume-label it probably 
uses something else; I suspect you can find out what by reading hal's 
.fdi files - ok, I'm not really sure at all, the behaviour might possibly 
be hard-wired...  You can ask hal to mount in a different place (still 
under /media) and add mount options by adding extra options to the 
command-line:

$ gnome-mount -p JSPDATA --mount-point ook
$ df -hl| grep media
/dev/sdb1             962M  4.3M  958M   1% /media/ook

>> Conclusion: If you are concerned that you have a bad disk, get another 
>> disk, install S.L. 5.0 on it, then try your hardest to get the data off 
>> the other disk.  If you have a bad disk, that is not the time to be 
>> trying updates and upgrades on the disk.
>
> The hard disk is new and a test under windows (by a third party) did not 
> show any defects before I started using it.  Things were working 
> smoothly, then suddenly files for the X windowing system went missing, 
> but I could boot to a text screen sometimes.

Is the rest of the hardware known to be ok?  I'd run memtest86+ on it for 
a couple of days just to be sure that memory is ok.

Did you get any smartd (or other interesting) messages before it started 
to fail?

Can you attach this disk to another machine and use the utilities on there 
to check it?

> The good news is I recovered reasonably from backup, though I lost all 
> the setting up of the machine in ways I cannot repeat!  fsck, say, may 
> help me if it is just a question of bad blocks.

I hope that the loss is what can't be repeated :-)

An adage to live by is:

   If data is worth spending any time trying to recover, it should be
   backed up.

Not that this helps you right now but it might be something to avoid 
problems in the future.  I've almost been attacked by people when I tell 
them this after their disk failed, but I've never had the same person come 
back a second time... :-)

> In a non-lvm file system you know where things are, and dd, fdisk and 
> other commands let you take your chances.  In lvm, I am lost, and given 
> the legacy software in various partitions of the machine (a Pentium II) 
> I cannot add more disks or use RAID.

If you can boot a full os on the box then you have other options, but I 
remember you mentioned 'rescue mode'.  Even with the very limited set of 
commands you have in the installer you can still fairly easily enumerate 
all the available disk-partitions or lvs...

Here is what I do in the %pre of a kickstart install to walk through the 
set of possible old places where we want to grab config data from (iff the 
machine had been previously installed of course):

...
# where we mount old partitions etc
mkdir /mnt/oldsys

# device-mapper is needed for linux-raid and lvm stuff
modprobe dm_mod

# activate all dmraid partitions -- lvm might be using them!
# turn on linux raid too
if [ -x /usr/sbin/dmraid ]; then
     # if we have dmraid do this -- it should be *best*
     /usr/sbin/dmraid -i -a y
elif [ -x /usr/bin/raidstart ]; then
     # if we have raidstart do this -- though it is a bit tedious...
     for i in /dev/md?
     do
       /usr/bin/raidstart $i
     done
else
     echo "Can't do raid startup"
fi

echo "Scanning for LVM volumes...."
# probe for all lvm volumes...
lvm vgscan 2>/dev/null

# turn on lvm
lvm vgchange -ay 2>/dev/null

# check lvm devices
for lv in $(lvm lvdisplay -c 2>/dev/null | awk -F: '/\/(root|var|etc|usr)/ {print $1}'); do
     echo "Checking log-vol $lv"
     checkdev $lv
done

# check all plausible raw partitions and md devices
for hd in $(awk '/ ([sh]d[a-z]|md)[0-9]/ {print $4}' /proc/partitions); do
     checkdev /dev/$hd
     # quit loop if we found all we were looking for
<snip>
         echo "Found everything -- skipping rest"
         break
     fi
done

# turn lvm off again
lvm vgchange -an 2>/dev/null

# turn off linux raid too
if [ -x /usr/sbin/dmraid ]; then
     # if we have dmraid do this...
     /usr/sbin/dmraid -i -a n
elif [ -x /usr/bin/raidstart ]; then
     for i in /dev/md?
     do
       /usr/bin/raidstop $i
     done
else
     echo "Can't do raid stopping"
fi
...

Obviously in the real code checkdev is defined to do something sane; the 
definition starts:

# Given a device check if we can fsck/mount it, and if so whether it
# contains the old files we are trying to preserve over the re-install
checkdev() {
   ddev=$1
   echo "Check $ddev"
   if fsck -p $ddev ; then
     echo "Try to mount $ddev on /mnt/oldsys"
     if mount -o ro $ddev /mnt/oldsys; then
         echo "Mounted!"
...

This is intended to be run automatically and cope with a variety of 
possible disk arrangements, some of which you may not have; e.g. if you 
know you don't use the linux-raid (md) stuff you don't need to bother 
with the dmraid (or raidstart) steps.

Once you have done the 'lvm vgscan' and 'lvm vgchange -ay', getting the 
list of available lvs is just a matter of running 'lvm lvdisplay -c'
which is actually easier than trying to parse /proc/partitions (at least 
with the tools one has in the %pre).
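
The first colon-separated field of that output is the device path, so 
something as simple as this sketch is enough to walk them:

   for lv in $(lvm lvdisplay -c 2>/dev/null | cut -d: -f1); do
       echo "found $lv"
   done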

BTW one annoyance of the upstream-vendor's defaults is that people (unless 
forced not to) seem to pick VolGroup00 for their VG name (and LogVol00 
(etc) for the LVs).  This means that if one attaches the disk to another 
machine and both of them happen to have VolGroup00 as the VG name, the 
names clash and lvm can't easily use either.
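
If you do get bitten by the clash you can rename one of the VGs by UUID, 
since the name alone is ambiguous (the new name below is invented; read 
the real UUID off vgs or vgdisplay):

   vgs -o vg_name,vg_uuid            # find the UUID of the imported VG
   vgrename <that-uuid> RescuedSys00
   vgchange -ay RescuedSys00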

I want the VG names to be unique (I then don't care about the LV names, 
but I pick more descriptive ones), so we use ones including the hostname; 
e.g. I'm typing this on a box called unicef and the VG containing root 
etc. is UnicefSys00, and with the default LV names we use that means we 
have:

$ lvdisplay -c
   /dev/UnicefSys00/root:UnicefSys00:3:1:-1:1:4194304:64:-1:0:0:253:0
   /dev/UnicefSys00/scratch:UnicefSys00:3:1:-1:1:251461632:3837:-1:0:0:253:1
   /dev/UnicefSys00/tmp:UnicefSys00:3:1:-1:1:20971520:320:-1:0:0:253:2
   /dev/UnicefSys00/usr:UnicefSys00:3:1:-1:1:20971520:320:-1:0:0:253:3
   /dev/UnicefSys00/var:UnicefSys00:3:1:-1:1:4194304:64:-1:0:0:253:4
   /dev/UnicefSys00/swap:UnicefSys00:3:1:-1:1:10485760:160:-1:0:0:253:5
$ df -hl
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/UnicefSys00-root
                       2.0G  280M  1.6G  15% /
/dev/sda1              99M   11M   83M  12% /boot
tmpfs                1013M     0 1013M   0% /dev/shm
/dev/mapper/UnicefSys00-scratch
                       117G  191M  110G   1% /local
/dev/mapper/UnicefSys00-tmp
                       9.7G  159M  9.1G   2% /tmp
/dev/mapper/UnicefSys00-usr
                       9.7G  3.8G  5.5G  41% /usr
/dev/mapper/UnicefSys00-var
                       2.0G  204M  1.7G  11% /var

Of course sometimes the names do end up being quite long...

Bored now.
-- 
Jon Peatfield,  Computer Officer,  DAMTP,  University of Cambridge
Mail:  [log in to unmask]     Web:  http://www.damtp.cam.ac.uk/
