On Wed, 26 Mar 2008, Jan Schulze wrote:
> Hi all,
>
> I have a disk array with about 4.5 TB and would like to use it as one
> large logical volume with an ext3 file system. When mounting the logical
> volume, I get an "Input/Output error: can't read superblock".
Do you get any interesting kernel messages in the output of dmesg (or
/var/log/messages etc)? Which exact kernel is this (uname -r) and what
arch (i386/x86_64 etc; uname -m)?
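Something like the following would collect the basics in one go (the grep pattern is just a guess at what might be relevant):

```shell
# Recent kernel messages -- look for I/O or SCSI errors around mount time.
dmesg | tail -n 50
# Anything logged about the device (pattern is an assumption; adjust to taste).
grep -i 'sda\|i/o error' /var/log/messages 2>/dev/null | tail
# Exact kernel version and architecture.
uname -r
uname -m
```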
> I'm using SL 4.2 with kernel 2.6 and this is what I did so far:
>
> - used parted to create a gpt disk label (mklabel gpt) and one large
> partition (mkpart primary ext3 0s -1s)
>
> - used parted to enable LVM flag on device (set 1 LVM on)
I know it would be slow but can you test that you can read/write to all of
/dev/sda1?
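A full read pass might look like the sketch below. On the real machine DEV would be /dev/sda1 (and would take many hours on 4.5 TB); here it defaults to a small scratch file (/tmp/fakedisk, an assumption) so the sketch itself is safe to run. A write test (e.g. badblocks -w) is destructive, so only do that before mkfs.

```shell
# DEV would be /dev/sda1 on the real box; default to a scratch file here.
DEV=${DEV:-/tmp/fakedisk}
[ -e "$DEV" ] || dd if=/dev/zero of="$DEV" bs=1M count=4 2>/dev/null
# Sequential read of every block; any bad region shows up as an I/O error.
dd if="$DEV" of=/dev/null bs=1M 2>/dev/null && echo "read scan ok"
# On the real device, also spot-check past the 2 TiB boundary, e.g.:
#   dd if=/dev/sda1 of=/dev/null bs=1M skip=$((2 * 1024 * 1024)) count=16
```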
>
> - created one physical volume, one volume group and one logical volume
> (pvcreate /dev/sda1, vgcreate raid6 /dev/sda1, lvcreate -l 1189706 -n
> vol1 raid6)
>
> - created an ext3 filesystem and explicitly specified a 4K blocksize, as
> this should allow a filesystem size of up to 16 TB (mkfs.ext3 -m 0 -b
> 4096 /dev/raid6/vol1)
For some reason my EL4 notes tell me that we also specify -N (number of
inodes), as well as -E (set RAID stride), -J size= (set journal size) and
-O sparse_super,dir_index,filetype, though most of that is probably the
default these days...
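For what it's worth, the stride is just the RAID chunk size divided by the filesystem block size. A sketch, assuming a 256 KiB chunk size (that number, and the mkfs options shown, are assumptions -- check your array's actual chunk size):

```shell
# stride = RAID chunk size / fs block size (both values assumed here).
chunk_kib=256
block_kib=4
stride=$((chunk_kib / block_kib))
echo "stride=$stride"
# The mkfs invocation would then look something like (not run here):
#   mkfs.ext3 -m 0 -b 4096 -E stride=$stride -J size=128 \
#             -O sparse_super,dir_index,filetype /dev/raid6/vol1
```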
> However, mounting (mount /dev/raid6/vol1 /raid) gives the superblock
> error, mentioned above.
>
> Everything is working as expected when using an ext2 filesystem (with
> LVM) or an ext3 filesystem (without LVM). Using a smaller volume
> (< 2 TB) with ext3+LVM is working as well. Only the combination of
> >2TB+ext3+LVM gives me trouble.
>
> Any ideas or suggestions?
We found that in at least some combinations of kernel and hardware
(drivers really, I expect), support for >2TB block devices was still
rather flaky (at least in the early releases of EL4).
We ended up getting our RAID boxes to present as multiple LUNs, each under
2TB, which we could then set up as PVs and join back together into a
single VG, still giving an LV bigger than 2TB. I'm rather conservative
about such things, so we still avoid big block devices at the moment.
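As a sketch, assuming the array presents three ~1.5 TB LUNs seen as /dev/sdb, /dev/sdc and /dev/sdd (the device names and the extent count are assumptions -- take the real number from vgdisplay):

```shell
# Each LUN is under 2TB, so no big-block-device code paths are involved.
pvcreate /dev/sdb /dev/sdc /dev/sdd
vgcreate raid6 /dev/sdb /dev/sdc /dev/sdd
vgdisplay raid6                       # note the "Free PE" count
lvcreate -l 1189706 -n vol1 raid6     # extent count assumed; use Free PE
mkfs.ext3 -m 0 -b 4096 /dev/raid6/vol1
```

The resulting /dev/raid6/vol1 is still one >2TB LV; only the underlying block devices stay small.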
[ obviously with single disk sizes growing at the rate they are it means
that the block-devices >2TB code is going to get a LOT more testing! ]
However, some of the tools (e.g. ext2/3 fsck) still seemed to fail at
about 3.5TB, so we ended up needing to build the 'very latest' tools to be
able to run fsck properly (the versions included in EL4 - and EL5 I think -
get into an infinite loop at some point while scanning the inode tables).
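Building a current e2fsprogs and running its fsck directly, without touching the distro packages, might look like this (the version number is an assumption -- use whatever is current):

```shell
# Build a recent e2fsprogs in place; nothing is installed system-wide.
tar xzf e2fsprogs-1.40.8.tar.gz
cd e2fsprogs-1.40.8
./configure && make
# Run the freshly built e2fsck against the big LV (unmounted!):
./e2fsck/e2fsck -f /dev/raid6/vol1
```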
Currently we try to avoid 'big' ext3 LVs; the one where we discovered the
fsck problems was originally ~6.8TB, but we ended up splitting it into
several smaller LVs, since even with working tools it still took ~2 days
to fsck... (and longer to dump/copy/restore it all!)
Some of my co-workers swear by XFS for 'big' volumes, but then we do have
SGI boxes where XFS (well, CXFS) is the expected default fs. I've not done
much testing with XFS on SL, mainly because TUV don't like XFS much...
--
Jon Peatfield, Computer Officer, DAMTP, University of Cambridge
Mail: [log in to unmask] Web: http://www.damtp.cam.ac.uk/