SCIENTIFIC-LINUX-USERS Archives

September 2012

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Date: Wed, 5 Sep 2012 15:34:01 -0700
On 2012/09/05 11:38, Todd And Margo Chester wrote:
> On 09/04/2012 12:21 PM, Konstantin Olchanski wrote:
>>> Cherryville drives have a 1.2 million hour MTBF (mean time
>>> between failure) and a 5 year warranty.
>>
>> Note that MTBF of 1.2 Mhrs (137 years?!?) is the *vendor's estimate*.
>
> Baloney check.  1.2 Mhrs does not mean that the device is expected
> to last 137 years.  It means that if you have 1.2 million devices
> in front of you on a test bench, you would expect one device to
> fail in one hour.
>
> -T

Baloney check back at you. If you have 1.2 million devices in front
of you, all operating under the same conditions specified for the 1.2
million hour MTBF, half of them would have failed by the end of
the 1.2 million hours. Commentary here indicates those conditions are
a severe derating of the drive's transaction capacity.
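For what it's worth, MTBF figures are usually quoted against a
constant-failure-rate (exponential) model; under that assumption the
fraction still alive at t = MTBF is 1/e, about 37%, so a bit more than
half have failed by then (half fail by the median, MTBF times ln 2,
roughly 0.83 million hours). A minimal sketch of the arithmetic, using
the 1.2 Mhr figure quoted above:

```python
import math

def surviving_fraction(t_hours, mtbf_hours):
    """Fraction of devices still working at time t under an
    exponential (constant failure rate) model -- the model
    typically assumed behind a quoted MTBF figure."""
    return math.exp(-t_hours / mtbf_hours)

mtbf = 1.2e6  # hours, the Cherryville figure quoted above

# At t = MTBF, about 37% survive, i.e. ~63% have already failed:
print(surviving_fraction(mtbf, mtbf))   # ~0.368

# One year of 24/7 operation (8766 hours): well over 99% survive,
# which is why the vendor number says little about a single drive.
print(surviving_fraction(8766, mtbf))
```

None of this says the exponential model actually fits SSDs; it is just
what the vendor number means if you take it at face value.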

It does not say much of anything about the drive's life under other
conditions, because no failure mechanism is cited. For example, if the
drive is well cooled (and that means the components inside are well
cooled, rather than left in the usual mountings), the life might be far
greater simply from the drop in component temperature; 10 C can make
a large difference in lifetime. But if the real limit is related to
read/write cycles on the memory locations, you may find that temperature
has little real effect on the system lifetime.
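The "10 C" remark is the classic Arrhenius rule of thumb for
electronics: temperature-driven failure mechanisms roughly double in
rate for every 10 C rise, equivalently halve for every 10 C drop. A
rough sketch, assuming an activation energy of 0.7 eV (a common
textbook value, not anything specified for these particular drives):

```python
import math

K_B = 8.617e-5  # Boltzmann constant in eV/K

def arrhenius_af(t_use_c, t_stress_c, ea_ev=0.7):
    """Arrhenius acceleration factor between two component
    temperatures (Celsius). ea_ev is an assumed textbook
    activation energy; real parts vary widely."""
    t_use = t_use_c + 273.15      # convert to kelvin
    t_stress = t_stress_c + 273.15
    return math.exp((ea_ev / K_B) * (1.0 / t_use - 1.0 / t_stress))

# Dropping component temperature from 55 C to 45 C stretches a
# temperature-driven lifetime by roughly a factor of two:
print(arrhenius_af(45, 55))
```

As said above, this only helps if the dominant failure mechanism is
thermal; it does nothing for write-cycle wear-out.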

If I could design a system that worked off a fast conventional RAID and
could buffer into the SSD RAID, with a safe automatic failover when the
SSD RAID failed, regardless of failure mode, and I needed the speed, you
can bet I'd be in there collecting statistics for the company for whom I
did the work. There is a financial incentive here, and a potential
competitive advantage, in KNOWING how these drives fail. With 100 drives,
after the first 5 or 10 had died, some knowledge might be gained. And, of
course, if the drives did NOT die, even more important knowledge would be
gained.
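That kind of fleet data turns into a field MTBF estimate quite simply:
under a constant-failure-rate assumption, the standard point estimate is
total accumulated device-hours divided by the number of failures, with
the still-running drives counted as censored (they contribute hours but
no failure). A sketch with invented numbers; the 100-drive fleet and 5
failures are just the figures floated above:

```python
def field_mtbf(failure_hours, survivor_hours):
    """Point estimate of MTBF from field data under a constant
    failure rate: total device-hours divided by observed
    failures. Survivors are censored observations."""
    total_hours = sum(failure_hours) + sum(survivor_hours)
    return total_hours / len(failure_hours)

# Hypothetical fleet (all figures invented for illustration):
# 5 drives failed at various ages, 95 still running at 20,000 hours.
failed = [3000, 8000, 11000, 15000, 19000]
running = [20000] * 95
print(field_mtbf(failed, running))  # -> 391200.0
```

Even a handful of failures like this gives a number you can compare
against the vendor's 1.2 Mhr claim for your actual workload.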

Simple MTBF under gamer-type use is pretty useless for a massive database
application. And if manufacturers are not collecting the data, there is
really good potential for competitive advantage if you collect your own
data and hold it proprietary. I betcha somebody out there is doing this
right now for Wall Street automated trading uses, if nothing else.

{^_^}
