    RAID10 - Two Drive Failure

    IT Discussion
    62 Posts 11 Posters 3.4k Views
    • wirestyle22 @coliver

      @JaredBusch said in RAID10 - Two Drive Failure:

      Predictive failure is not failure. Replace one at a time, to give the RAID card the most power to work on the individual resilver.

      @gjacobse said in RAID10 - Two Drive Failure:

      In my experience, you never replace more than one drive at a time. Ask me how I know.

      @wirestyle22 said in RAID10 - Two Drive Failure:

      That's very interesting. I have not really had to deal with drive failures, actually.

      @gjacobse said in RAID10 - Two Drive Failure:

      I haven't kept count of the drive failures I have had over nearly 30 years. In all of my personal systems, I think I have had one. Work related, maybe three in all.

      @Dashrender said in RAID10 - Two Drive Failure:

      Wow, that's pretty small. Personally, I've probably lost 3-4 drives. In businesses, well over 10. And Scott has probably seen hundreds fail. Of course, it all boils down to how many systems you see or support.

      @coliver said in RAID10 - Two Drive Failure:

      We have one or two fail every two to three months. Nothing crazy.

      @Dashrender said in RAID10 - Two Drive Failure:

      How many drives do you have?

      @coliver said in RAID10 - Two Drive Failure:

      A few hundred for now. Should be under 100 at the end of summer.

      What are you guys changing to reduce that number by that much?

      • coliver @wirestyle22

        @wirestyle22 said in RAID10 - Two Drive Failure:

        What are you guys changing to reduce that number by that much?

        Higher density drives.

        • Dashrender @coliver

          @coliver said in RAID10 - Two Drive Failure:

          We have one or two fail every two-three months. Nothing crazy.

          How many drives do you have?

          A few hundred for now. Should be under 100 at the end of summer.

          So you're losing around 1.5% of your drives per year. That seems a bit high, but my memory of the norm as published by Google could be off. Plus, your environment might not be as good as theirs.
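The rough arithmetic behind an estimate like this can be sketched in a few lines. This is only an illustration: the thread gives "one or two" failures every "two-three months" across "a few hundred" drives, so the 300-drive figure and the function name below are assumptions, not numbers from the posters.

```python
def annualized_failure_rate(failures, months, drive_count):
    """Fraction of the drive population expected to fail per year."""
    failures_per_year = failures * 12 / months
    return failures_per_year / drive_count


# Assumed population: "a few hundred" is taken as 300 drives here.
DRIVES = 300

# Best case: one failure every three months.
low = annualized_failure_rate(failures=1, months=3, drive_count=DRIVES)
# Worst case: two failures every two months.
high = annualized_failure_rate(failures=2, months=2, drive_count=DRIVES)

print(f"Estimated annual failure rate: {low:.1%} to {high:.1%}")
```

With a larger assumed population the range shifts downward, which is why the estimate depends heavily on what "a few hundred" really means.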

          • JaredBusch @wirestyle22

            @wirestyle22 said in RAID10 - Two Drive Failure:

            @JaredBusch said in RAID10 - Two Drive Failure:

            individual resilver.

            Does this mean that the mileage is only applied to the new drive, or is it just minimal in relation to the rest of the RAID? The reason I ask is that I always thought this put a lot of strain on the entire RAID.

            WTF? This is nothing more than a single mirror pair. The "strain" here is only a copy operation: the least possible work.

            The point of doing them individually is that something like this is processed 100% by the CPU on the RAID card, so don't make it do more than one thing at a time.

            A parity array is different.
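To illustrate why a mirror resilver is "only a copy" while a parity array is different, here is a toy model. It assumes a single-parity (RAID 5 style) array with fixed parity placement and tiny integer "blocks"; the function names are made up for the sketch and do not correspond to any controller's firmware.

```python
from functools import reduce


def resilver_mirror(surviving_drive):
    # RAID 1: the replacement is a block-for-block copy of its partner.
    # Only one other drive is read.
    return list(surviving_drive)


def rebuild_parity(surviving_members):
    # Single parity: each missing block is the XOR of the matching block
    # on every surviving member, so every remaining drive must be read.
    return [reduce(lambda a, b: a ^ b, blocks)
            for blocks in zip(*surviving_members)]


# Two data drives plus one parity drive (toy values).
d0 = [0x11, 0x22, 0x33]
d1 = [0x44, 0x55, 0x66]
parity = [a ^ b for a, b in zip(d0, d1)]

mirror_copy = resilver_mirror(d0)          # straight copy of d0
rebuilt_d1 = rebuild_parity([d0, parity])  # d1 recomputed from the rest
```

In the mirror case the work touches one surviving drive; in the parity case it touches all of them and needs a calculation per block.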

            • JaredBusch @DustinB3403

              @DustinB3403 said in RAID10 - Two Drive Failure:

              @aaronstuder What raid controller do you have?

              Exactly this. A real SMB system should be hot-plug. But we have no idea what you bought.

              • travisdh1 @gjacobse

                @gjacobse said in RAID10 - Two Drive Failure:

                @JaredBusch said in RAID10 - Two Drive Failure:

                Predictive failure is not failure. Replace one at a time, to give the RAID card the most power to work on the individual resilver.

                In my experience, you never replace more than one drive at a time.

                Ask me how I know.

                I solemnly swear that I've pulled the wrong drive to replace before. It made a RAID 6 rebuild take a lot longer, and made a RAID 10 freak out until a reboot happened. Restoring from backup was always an option, at least.

                • scottalanmiller @Alex Sage

                  @aaronstuder said in RAID10 - Two Drive Failure:

                  Drives 1 and 3 are in "predictive failure". I am assuming the pairs are 0+1 and 2+3.

                  Why?

                  • scottalanmiller @gjacobse

                    @gjacobse said in RAID10 - Two Drive Failure:

                    @JaredBusch said in RAID10 - Two Drive Failure:

                    Predictive failure is not failure. Replace one at a time, to give the RAID card the most power to work on the individual resilver.

                    In my experience, you never replace more than one drive at a time.

                    Ask me how I know.

                    In RAID 10, you always do if they are in different RAID 1 sets, always.

                    • JaredBusch @scottalanmiller

                      @scottalanmiller said in RAID10 - Two Drive Failure:

                      @aaronstuder said in RAID10 - Two Drive Failure:

                      Drives 1 and 3 are in "predictive failure". I am assuming the pairs are 0+1 and 2+3.

                      Why?

                      Why what? Assuming? Because he did not document it, and most hardware RAID controllers are not accessible except during the boot process.

                      • JaredBusch @scottalanmiller

                        @scottalanmiller said in RAID10 - Two Drive Failure:

                        @gjacobse said in RAID10 - Two Drive Failure:

                        @JaredBusch said in RAID10 - Two Drive Failure:

                        Predictive failure is not failure. Replace one at a time, to give the RAID card the most power to work on the individual resilver.

                        In my experience, you never replace more than one drive at a time.

                        Ask me how I know.

                        In RAID 10, you always do if they are in different RAID 1 sets, always.

                        I completely disagree, for the reason stated above. This is a predictive failure, not a failure. You will get a faster resilver of each mirror by doing them individually.

                        Of course, I am assuming that the unit is in use and busy with normal system reads and writes.

                        • wirestyle22 @JaredBusch

                          @JaredBusch said in RAID10 - Two Drive Failure:

                          I completely disagree, for the reason stated above. This is a predictive failure, not a failure. You will get a faster resilver of each mirror by doing them individually.

                          Doesn't this put twice the amount of mileage on the array, though? Or no?

                          • scottalanmiller @JaredBusch

                            @JaredBusch said in RAID10 - Two Drive Failure:

                            Predictive failure is not failure. Replace one at a time, to give the RAID card the most power to work on the individual resilver.

                            With mirrored RAID, even the slowest RAID card won't feel the load of a straight copy. As long as they are not in the same RAID 1 set, it'll be fastest and safest to do both at once.
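Whether two drives sit in the same RAID 1 set depends entirely on how the controller paired them, which is why the assumed 0+1 / 2+3 layout matters. A minimal sketch of that check, assuming the layout is known and expressed as a list of pairs (the function name and both layouts are illustrative, not read from any real controller):

```python
def in_different_mirror_sets(drives, layout):
    """True if no two of the given drives share a RAID 1 pair."""
    # Map each drive slot to the index of the mirror set that owns it.
    owner = {slot: index for index, pair in enumerate(layout) for slot in pair}
    sets = [owner[d] for d in drives]
    return len(set(sets)) == len(sets)


assumed_layout = [(0, 1), (2, 3)]      # the pairing assumed in the thread
alternative_layout = [(0, 2), (1, 3)]  # a hypothetical different pairing

print(in_different_mirror_sets([1, 3], assumed_layout))      # True
print(in_different_mirror_sets([1, 3], alternative_layout))  # False
```

Under the assumed layout, drives 1 and 3 are in different mirrors; under the hypothetical alternative they are partners, and pulling both would take the array down. Confirming the real layout on the controller comes first.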

                            • scottalanmiller @wirestyle22

                              @wirestyle22 said in RAID10 - Two Drive Failure:

                              @JaredBusch said in RAID10 - Two Drive Failure:

                              individual resilver.

                              Does this mean that the mileage is only applied to the new drive, or is it just minimal in relation to the rest of the RAID? The reason I ask is that I always thought this put a lot of strain on the entire RAID.

                              RAID 10 does not strain anything during a resilver. The resilver operation only touches a subset of the array; the overall array doesn't even know that it is happening.

                              • JaredBusch @wirestyle22

                                @wirestyle22 said in RAID10 - Two Drive Failure:

                                @JaredBusch said in RAID10 - Two Drive Failure:

                                I completely disagree, for the reason stated above. This is a predictive failure, not a failure. You will get a faster resilver of each mirror by doing them individually.

                                Doesn't this put twice the amount of mileage on the array, though? Or no?

                                How could this even be conceived? FFS, think. No matter which way you do it, it is two RAID 1 resilver operations. Same time or separate, that changes nothing.
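A back-of-the-envelope model of the two schedules, for illustration only. It assumes two identical mirrors, a fixed per-drive copy rate, and a controller throughput cap, and it ignores production I/O entirely; every number and name below is an assumption rather than a measurement.

```python
def resilver_plan(size_gb, drive_mb_s, controller_mb_s, simultaneous):
    """Return (total GB copied, elapsed hours) for resilvering two mirrors."""
    jobs = 2
    total_gb = jobs * size_gb  # the same amount of data either way
    if simultaneous:
        # Both copies share the controller's throughput.
        rate = min(drive_mb_s, controller_mb_s / jobs)
        hours = size_gb * 1024 / rate / 3600
    else:
        # One copy at a time, back to back.
        rate = min(drive_mb_s, controller_mb_s)
        hours = jobs * size_gb * 1024 / rate / 3600
    return total_gb, hours


# Assumed hardware: 1000 GB mirrors, 150 MB/s drives.
fast_seq = resilver_plan(1000, 150, 1000, simultaneous=False)
fast_sim = resilver_plan(1000, 150, 1000, simultaneous=True)
slow_seq = resilver_plan(1000, 150, 150, simultaneous=False)
slow_sim = resilver_plan(1000, 150, 150, simultaneous=True)
```

In this simplified model the data copied is identical in both schedules, and the elapsed time only differs when the controller is not the bottleneck. It does not capture a busy production workload competing for the same controller, which is the case the one-at-a-time argument rests on.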

                                • scottalanmiller @Alex Sage

                                  @aaronstuder said in RAID10 - Two Drive Failure:

                                  Some information I am reading says I need to take the drive "offline" first, is this true?

                                  PowerEdge R720.

                                  If you are pulling one that is in predictive failure, yes, you generally offline it first. But this depends on the controller in question; the server model doesn't matter.

                                  • scottalanmiller @wirestyle22

                                    @wirestyle22 said in RAID10 - Two Drive Failure:

                                    I'm assuming @JaredBusch's qualifier of "Predictive failure is not failure" means that this could possibly change if this were a failed drive?

                                    If the drive has failed, then it is already offline.

                                    • scottalanmiller @Dashrender

                                      @Dashrender said in RAID10 - Two Drive Failure:

                                      Wow, that's pretty small.

                                      Personally, I've probably lost 3-4 drives. In businesses - well over 10.

                                      And Scott has probably seen hundreds fail. Of course it all boils down to how many systems you see/support.

                                      That's a pretty big factor. And how many drives are in them. 8,000 servers, over ten years, with a minimum of four drives in each... I saw a lot.

                                      • wirestyle22 @JaredBusch

                                        @JaredBusch said in RAID10 - Two Drive Failure:

                                        @wirestyle22 said in RAID10 - Two Drive Failure:

                                        @JaredBusch said in RAID10 - Two Drive Failure:

                                        I completely disagree, for the reason stated above. This is a predictive failure, not a failure. You will get a faster resilver of each mirror by doing them individually.

                                        Doesn't this put twice the amount of mileage on the array, though? Or no?

                                        How could this even be conceived? FFS, think. No matter which way you do it, it is two RAID 1 resilver operations. Same time or separate, that changes nothing.

                                        Shawn Wallace aside, thanks for the info

                                        • scottalanmiller @Dashrender

                                          @Dashrender said in RAID10 - Two Drive Failure:

                                          So you're losing around 1.5% of your drives per year. That seems a bit high, but my memory of the norm as published by Google could be off. Plus, your environment might not be as good as theirs.

                                          That's low, actually.

                                          • JaredBusch @scottalanmiller

                                            @scottalanmiller said in RAID10 - Two Drive Failure:

                                            @aaronstuder said in RAID10 - Two Drive Failure:

                                            Some information I am reading says I need to take the drive "offline" first, is this true?

                                            PowerEdge R720.

                                            If you are pulling one that is in predictive failure, yes, you generally offline it first. But this depends on the controller in question; the server model doesn't matter.

                                            Obviously it depends on the controller, but the point of blind swap being the standard choice for SMB is that you simply swap the drives.
