SCIENTIFIC-LINUX-USERS Archives

April 2017

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject:
From: Steven Haigh <[log in to unmask]>
Reply To: Steven Haigh <[log in to unmask]>
Date: Wed, 5 Apr 2017 08:58:26 +1000
Content-Type: multipart/signed
Parts/Attachments: text/plain (4 kB), signature.asc (4 kB)
On 05/04/17 05:44, Konstantin Olchanski wrote:
> Moving to ZFS because of issues like this. RAID6 rebuild with 4-6-8-10TB disks
> has become too scary. If there is any transient error during the rebuild,
> the md driver starts kicking disks out, getting into funny states with many
> "missing" disks, recovery is only via "mdadm --assemble --force" and without
> per-file checksums in ext4/xfs there is no confidence whatsoever that data
> was not subtly corrupted.
> 
> ZFS is also scary, but seems to behave well, even in the presence
> of "bad disks". ZFS scrub seems to overcome/rewrite bad sectors,
> bad data on disk (if I poop on a disk using "dd"), all without corrupting
> data files (I compute and check my own sha-512 checksums for each file).

Heh - another soon-to-be victim of ZFS on Linux :)

You'll quickly realise that most of the major features you'd expect to
just work - don't. You can't grow a ZFS raidz 'raid' by adding disks:
you're stuck with the number of disks you started with. You'll find out
more as you go down this rabbit hole.
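
For example, the only realistic ways to get more space out of a raidz
pool (pool and device names below are made up, and this is roughly where
ZFS on Linux stood at the time) are to bolt a whole new vdev onto the
pool, or to replace every member with a bigger disk:

  # Not possible: growing an existing raidz2 vdev by a single disk.
  # Option 1: add another complete vdev to the pool:
  zpool add tank raidz2 sde sdf sdg sdh
  # Option 2: replace each member with a larger disk and let the pool expand:
  zpool set autoexpand=on tank
  zpool replace tank sda sdi    # repeat for every member, resilvering each time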

> BTRFS is even better (on paper), but not usable in el7.3 because it has no
> concept of "failed disk". If you pull a disk on btrfs, it will fill /var/log
> with disk error messages, will not take any mitigation/recovery action
> (drop disk from array, rerun the data balancer, etc).

DO NOT USE RAID5/6 WITHIN BTRFS.

I have tried this before and lost many GB of data when it went wrong. In
fact, I discovered several new bugs that I lodged with the BTRFS guys -
which led to the warning DO NOT USE PARITY-BASED RAID LEVELS IN BTRFS
becoming the official line.

However, BTRFS is very stable if you use it as a simple filesystem. You
will get a more flexible result by using mdadm for the RAID layer with
btrfs on top of it.
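
Something along these lines (device names, label and mount point are
just an example):

  # mdadm handles the RAID6; btrfs only ever sees a single block device:
  mdadm --create /dev/md0 --level=6 --raid-devices=4 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
  mkfs.btrfs -L data /dev/md0
  mount /dev/md0 /data
  # You still get btrfs checksumming and scrubbing on top:
  btrfs scrub start /data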

mdadm can be a pain to tweak - but almost all problems are well known
and documented - and unless you really lose all your parity, you'll be
able to recover with much less data loss than with most other concoctions.
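
For reference, the usual recovery dance when disks get kicked mid-rebuild
looks roughly like this (device names are only an example - read the
--examine output before forcing anything):

  # See what each member thinks the array state is:
  mdadm --examine /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
  # Stop the half-assembled array and force it back together:
  mdadm --stop /dev/md0
  mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
  # Re-add anything that got dropped and watch the resync:
  mdadm --manage /dev/md0 --re-add /dev/sdb1
  cat /proc/mdstat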

> 
> K.O.
> 
> 
> On Tue, Apr 04, 2017 at 04:17:22PM +0200, David Sommerseth wrote:
>> Hi,
>>
>> I just need some help to understand what might be the issue on a SL7.3
>> server which today decided to disconnect two drives from a RAID 6 setup.
>>
>> First some gory details
>>
>> - smartctl + mdadm output
>> <https://paste.fedoraproject.org/paste/wLyz44nipkJ7FgKxWk-1mV5M1UNdIGYhyRLivL9gydE=>
>>
>> - kernel log messages
>> https://paste.fedoraproject.org/paste/mkyjZINKnkD4SQcXTSxyt15M1UNdIGYhyRLivL9gydE=
>>
>>
>> The server is set up with 2x WD RE4 hard drives and 2x Seagate
>> Constellation ES.3 drives.  All 4TB, all bought brand new.  They're
>> installed in a mixed pattern (sda: RE4, sdb: ES3, sdc: RE4, sdd: ES3)
>> ... and the curious devil in the detail ... there is no /dev/sde
>> installed on this system - never has been, at least not on that
>> controller.  (Later today, I attached a USB drive to make some backups -
>> which got designated /dev/sde)
>>
>> This morning *both* ES.3 drives (sdb, sdd) got disconnected and removed
>> from the mdraid setup.  With just minutes in between.  On drives which
>> have been in production for less than 240 days or so.
>>
>> lspci details:
>> 00:1f.2 SATA controller: Intel Corporation 6 Series/C200 Series Chipset
>> Family SATA AHCI Controller (rev 05)
>>
>> Server: HP ProLiant MicroServer Gen8 (F9A40A)
>>
>> <https://www.hpe.com/us/en/product-catalog/servers/proliant-servers/pip.specifications.hpe-proliant-microserver-gen8.5379860.html>
>>
>>
>> Has anyone else experienced such issues?  Several places on the net,
>> the ata kernel error messages have been resolved by checking SATA cables
>> and their seating.  It just sounds a bit too incredible that two
>> hard drives of the same brand and type in different HDD slots would
>> have the same issues, though not at the exact same time (but close).
>> And I struggle to believe that two identical drives would just fail
>> so close in time.
>>
>> What am I missing? :)  Going to shut down the server soon (after last
>> backup round) and will double check all the HDD seating and cabling.
>> But I'm not convinced that's all just yet.
>>
>>
>> -- 
>> kind regards,
>>
>> David Sommerseth
> 

-- 
Steven Haigh

Email: [log in to unmask]
Web: https://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897


