SCIENTIFIC-LINUX-USERS Archives

February 2012

SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV

Subject: Re: Degraded array issues with SL 6.1 and SL 6.2
From:    Tom H <[log in to unmask]>
Date:    Thu, 23 Feb 2012 23:56:20 -0500
On Wed, Feb 22, 2012 at 4:22 PM, Bill Maidment <[log in to unmask]> wrote:
> -----Original message-----
> From:   Tom H <[log in to unmask]>
> Sent:   Thu 23-02-2012 01:12
> Subject:        Re: Degraded array issues with SL 6.1 and SL 6.2
> To:     SL Users <[log in to unmask]>;
>> On Wed, Feb 22, 2012 at 7:58 AM, Bill Maidment <[log in to unmask]> wrote:
>> > -----Original message-----
>> > From:   Bill Maidment <[log in to unmask]>
>> > Sent:   Mon 20-02-2012 17:43
>> > Subject:        Degraded array issues with SL 6.1 and SL 6.2
>> > To:     [log in to unmask] <[log in to unmask]>;
>> >> I have had some issues with the last two kernel releases. When a degraded
>> >> array event occurs, I am unable to add a new disk back into the array.
>> >> This has been reported on CentOS 6.1/6.2 and also RHEL 6.2 (see Bug 772926
>> >> - dracut unable to boot from a degraded raid1 array). I have found that I
>> >> need to revert to kernel 2.6.32-131.21.1.el6.x86_64 in order to be able to
>> >> add the new drive.
>> >
>> > The response from RH is as follows:
>> > 1) If you try to re-add a disk to a running raid1 after having failed it,
>> > mdadm correctly rejects it as it has no way of knowing which of the disks
>> > are authoritative. It clearly tells you that in the error message you
>> > pasted into the bug.
>> >
>> > 2) You reported a Scientific Linux bug against Red Hat Enterprise Linux.
>> > Red Hat does not support Scientific Linux, please report bugs against
>> > Scientific Linux to the people behind Scientific Linux.
>> >
>> > My response is:
>> > 1) a) It used to work it out. b) No, it does not clearly spell it out.
>> > c) Why was it not a problem in earlier kernels?
>> > 2) Is this an SL bug? I think not!
>>
>> Bug 772926 doesn't have anything about SL. Are you referring to another bug?
>>
>> In (1) above, are they saying that you can't "--fail", "--remove", and
>> then "--add" the same disk, or that you can't "--fail" and "--remove" a
>> disk, replace it, and then "--add" the replacement because it's got the
>> same "X"/"XY" in "sdX"/"sdXY" as the previous, failed disk?
>>
>
> Bug 772926 was reported by someone from CentOS, but it would affect SL too, and it seemed to be related:
> http://bugs.centos.org/view.php?id=5400
>
> I think they are saying that you NOW can't re-add the same disk without first zeroing out the disk superblock.
> I just find the wording of the error message a bit confusing:
>
> [root@ferguson ~]# mdadm /dev/md3 -a /dev/sdc1
> mdadm: /dev/sdc1 reports being an active member for /dev/md3, but a --re-add fails.
> mdadm: not performing --add as that would convert /dev/sdc1 in to a spare.
> mdadm: To make this a spare, use "mdadm --zero-superblock /dev/sdc1" first.
> [root@ferguson ~]#
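
For what it's worth, if you want to see why mdadm still considers sdc1 an
"active member", compare the metadata on the removed disk with the running
array (device names taken from your output above; the interesting bits are
the array UUID, the event count, and the device state):

mdadm --examine /dev/sdc1    # md superblock still present on the removed disk
mdadm --detail /dev/md3      # current state and member list of the running array
cat /proc/mdstat             # quick overview of the arrays and any resync in progress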

772926 doesn't have "You reported a Scientific Linux bug against Red
Hat Enterprise Linux".

The wording about a spare in the third line seems wrong. Anyway, I'd
never re-add a failed and removed disk without zeroing the superblock;
if you could do it previously, it was an oversight/bug that's now been
fixed.
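
To be explicit about the sequence I'd use, here's a sketch with the device
names from your output (check first that /dev/sdc1 really is the stale disk
and that /dev/md3 is running degraded on the good member, since zeroing the
wrong superblock is unrecoverable):

mdadm --zero-superblock /dev/sdc1    # wipe the stale metadata so the disk stops claiming membership
mdadm /dev/md3 --add /dev/sdc1       # add it back as a fresh member; it rebuilds from the surviving mirror
cat /proc/mdstat                     # watch the recovery progress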
