SCIENTIFIC-LINUX-USERS Archives

November 2008

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From:
Konstantin Olchanski <[log in to unmask]>
Reply To:
Konstantin Olchanski <[log in to unmask]>
Date:
Thu, 20 Nov 2008 12:03:18 -0800
On Wed, Nov 19, 2008 at 04:07:02PM -0600, Miles O'Neal wrote:
> Our local vendor built us a Supermicro/Adaptec
> system with 16x1TB SATA drives. ...
> 
> Anyone have experience with filesystems this large
> on a Linux system?  Will XFS work well for this?
> 


We have 2 comparable systems in heavy use with SL4 and
software raid5.

One is an 11 TB XFS filesystem built from 16 750 GB disks.
The second is a 5 TB ext3 filesystem built from 15+1 400 GB disks
(15 in use + 1 spare).
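
For reference, the sizes quoted above are consistent with plain raid5
arithmetic (one member's worth of space goes to parity). The sketch below
is only an illustration: the function name is made up, and it assumes all
16 of the 750GB disks are active members while the 400GB array runs 15
active members plus 1 hot spare.

```python
def raid5_usable_bytes(total_disks, disk_bytes, spares=0):
    """Usable capacity of a RAID5 set: one member's worth of space holds parity."""
    members = total_disks - spares  # hot spares hold no data until a rebuild
    if members < 3:
        raise ValueError("RAID5 needs at least 3 active members")
    return (members - 1) * disk_bytes


GB = 10**9    # decimal gigabyte, as printed on disk labels
TB = 10**12   # decimal terabyte
TIB = 2**40   # binary tebibyte, as many tools report sizes

# Assumption: all 16 of the 750GB disks are active members (no spare).
xfs_array = raid5_usable_bytes(16, 750 * GB)
# Assumption: "15+1" means 15 active members plus 1 hot spare.
ext3_array = raid5_usable_bytes(16, 400 * GB, spares=1)

print(f"{xfs_array / TB:.2f} TB = {xfs_array / TIB:.2f} TiB")    # 11.25 TB = 10.23 TiB
print(f"{ext3_array / TB:.2f} TB = {ext3_array / TIB:.2f} TiB")  # 5.60 TB = 5.09 TiB
```

Filesystem overhead comes off these numbers, which is roughly where
"11TB" and "5TB" come from.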

A few remarks:

1) ext2 is a no-go because of the unreasonable reboot time
   after a crash (the fsck takes a very, very long time).

2) xfs appears to be required for anything bigger than 8 TB. ext3
   is advertised to support up to 16 TB, but at least in SL4,
   that does not seem to work.

3) beware of lemon disks, SATA controllers and cables. Lemon
   and non-lemon hardware look the same, but you know you
   have a lemon if your 16-disk raid array is unstable. This is
   very hard to debug. If your array does not survive "mdadm --create",
   the raid resync and mkfs, take it straight to the dumpster: it will
   never work. (Until you replace all the lemons, that is.)

4) ext3 + software raid5 can be usable with lemon disks (our 400GB
   disks tend to randomly drop out from the array), and having
   one spare disk (15 in use + 1 spare) helps a lot - raid5
   recovers from 1 lost disk automatically by grabbing the spare disk.

5) monitor SMART status all the time. Disks with "pending" and
   "uncorrectable" errors will be dropped from the raid array, and
   if more than one of them drops out, your raid5 array is toast. (Unless
   you are a brain surgeon and know how to recover raid arrays from
   multiple disk failures.)

6) where are you going to back up your data, in case the filesystem
   eats itself one day?
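
To catch the dropped disks from remarks 3 and 4 early, something along
these lines could watch /proc/mdstat for arrays running with missing
members. This is a rough sketch, not what we run: the function name is
hypothetical, and the sample text is invented and abbreviated (a real
15-member array lists every member on the "md0 :" line).

```python
import re

# Status field at the end of an mdstat line, e.g. "[15/14] [UUUUUUU_UUUUUUU]":
# U = member up, _ = member missing.
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")
ARRAY_RE = re.compile(r"^(md\d+)\s*:")


def degraded_arrays(mdstat_text):
    """Return {array_name: missing_member_count} for arrays with missing members."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current:
            missing = status.group(3).count("_")
            if missing:
                result[current] = missing
            current = None
    return result


# Invented, abbreviated sample for illustration only.
MDSTAT_SAMPLE = """\
Personalities : [raid5] [raid1]
md0 : active raid5 sdo1[14] sdn1[13] sdb1[1] sda1[0]
      5468750000 blocks level 5, 64k chunk, algorithm 2 [15/14] [UUUUUUU_UUUUUUU]
md1 : active raid1 sdr1[1] sdq1[0]
      104320 blocks [2/2] [UU]
unused devices: <none>
"""

print(degraded_arrays(MDSTAT_SAMPLE))  # {'md0': 1}
```

On a live system the input would be open("/proc/mdstat").read(), run
from cron or similar.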

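For remark 5, a minimal sketch of checking the "pending" and
"uncorrectable" counters, assuming the attribute table comes from
"smartctl -A /dev/sdX" (smartmontools). The attribute names
Current_Pending_Sector and Offline_Uncorrectable are the usual ones but
can differ between disk vendors; the function name and the sample table
are made up for illustration.

```python
# Attribute names as commonly printed by smartctl; some vendors differ.
WATCHED = ("Current_Pending_Sector", "Offline_Uncorrectable")


def failing_attributes(smartctl_output):
    """Return {attribute_name: raw_value} for watched attributes with raw value > 0."""
    found = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns: ID, name, flag, value, worst,
        # thresh, type, updated, when_failed, raw_value.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            raw = fields[9]
            if raw.isdigit() and int(raw) > 0:
                found[fields[1]] = int(raw)
    return found


# Invented sample for illustration only.
SMART_SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""

print(failing_attributes(SMART_SAMPLE))  # {'Current_Pending_Sector': 8}
```

Any disk that returns a non-empty result here is a candidate for
replacement before a second one joins it.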

-- 
Konstantin Olchanski
Data Acquisition Systems: The Bytes Must Flow!
Email: olchansk-at-triumf-dot-ca
Snail mail: 4004 Wesbrook Mall, TRIUMF, Vancouver, B.C., V6T 2A3, Canada
