
    Is this server strategy reckless and/or insane?

    • creaytC
      creayt
      last edited by creayt

      I have 2 servers. Other than one having 4 more processor cores total, the servers are identical. Specs are:

      R620
      a: 2x octacore xeon, b: 2x decacore xeon
      128GB ram each
      1GB Perc H710P RAID controller
      2x 256GB Samsung 850 Pros ( Os and installs live here, Raid 1 )
      5x 1TB Samsung 850 Pros ( Data and file uploads live here, Raid 0 ) ( can add 1 more on the decacore and up to 3 more on the octacore later if desired, but this RAID controller gets pretty saturated at 4-5 from what I've read )

      I like the benefit of not having to leave the box to go from app code to database, since the goal of this project is for it to feel as instant and fast as possible. So my plan is to configure the servers like so:
      Full serverware stack on each ( IIS, app server, MySQL )
      One of the two will be the MySQL master and replicate to the other ( all writes will go to this server )
      The other server will be the image upload and processing box, and final images will all be copied to the other server in the background
      The two machines will be clustered for all web traffic save those two uses ( db writes and file uploads )
      Both will be ready, w/ just a few minutes downtime, to take over the full workload should the other fail, which includes a disk in either Raid 0 failing
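The split described above can be sketched as a small routing rule. This is only an illustration of the plan, not the actual setup: the host names `server-a` and `server-b` and the `pick_server` helper are hypothetical, and sending DB writes to the surviving box assumes it has already been promoted to master.

```python
from __future__ import annotations

from enum import Enum


class Kind(Enum):
    READ = "read"          # ordinary page or API request
    DB_WRITE = "db_write"  # anything that modifies MySQL data
    UPLOAD = "upload"      # image upload and processing


# Hypothetical host names; the post never names the two R620s.
DB_MASTER = "server-a"
IMAGE_BOX = "server-b"


def pick_server(kind: Kind, healthy: set[str], request_id: int = 0) -> str:
    """Return the server that should handle a request of the given kind."""
    if not healthy:
        raise RuntimeError("no healthy server available")

    # The two special workloads each have a preferred owner.
    preferred = {Kind.DB_WRITE: DB_MASTER, Kind.UPLOAD: IMAGE_BOX}.get(kind)
    if preferred in healthy:
        return preferred
    if preferred is not None:
        # Owner is down: the surviving box takes over that workload.
        # For DB writes this assumes the slave was promoted to master first.
        return sorted(healthy)[0]

    # All other traffic is spread across whichever servers are up.
    pool = sorted(healthy)
    return pool[request_id % len(pool)]
```

With both boxes healthy, writes and uploads stay pinned to their owners and everything else alternates between the two; with one box down, every request lands on the survivor.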

      The plan would be:
      If any piece of the DB master server fails, it drops out of the picture until I can get the failure resolved, and the 2nd server takes over all duties. The only piece I can't automate is the switch from slave to master, which I would handle manually; up to an hour of downtime is acceptable.

      If any piece of the image processing server fails, all traffic would automatically go to the other server until I can resolve the failure, and no perceptible downtime would occur

      The only thing running on these servers will be a hobby project I made, and I'm ok w/ a little downtime in the event of a presumably unlikely hardware failure.

      What do you think? Is this a completely unorthodox approach? I like the idea of most web site requests being able to go through either server so I can make use of the horsepower of both of them. My goal is to make the fastest web site I've ever used, so keeping the db and app code that touch each other on the same machine is ideal for me, as is using high-performance-within-my-budget techniques like a Raid 0 of SSDs.

      Let me know what you think. I'm a programmer, not a server pro, so there may be a ton of negatives I haven't thought of in this setup.

      Thanks!

      • DashrenderD
        Dashrender
        last edited by

        I assume you don't care about the data on the RAID 0?

        • DashrenderD
          Dashrender
          last edited by

          You're already saturating your RAID controller, so toss one more drive in each and make them RAID 5 (assuming these are SSDs). That's one less thing to rebuild and restore/resync if you have a drive failure.

          • scottalanmillerS
            scottalanmiller
            last edited by

            That's pretty common. Master to slave automated, manual fail back. Works fine as long as you are around most of the time.

            • creaytC
              creayt
              last edited by

              I was wrong, it looks like you can fully automatically fail over to the slave and set it as the new master w/ the latest MySQL set up, so that makes the decision a bit easier.

              As far as Raid 5 instead of 0, I'd thought that the performance of Raid 5 was absolutely terrible and that almost no one used it anymore, is that a wrong memory?

              • DustinB3403D
                DustinB3403 @creayt
                last edited by

                @creayt said in Is this server strategy reckless and/or insane?:

                I was wrong, it looks like you can fully automatically fail over to the slave and set it as the new master w/ the latest MySQL set up, so that makes the decision a bit easier.

                As far as Raid 5 instead of 0, I'd thought that the performance of Raid 5 was absolutely terrible and that almost no one used it anymore, is that a wrong memory?

                No one uses RAID5 with spinning rust.

                RAID5 is perfectly acceptable with SSDs

                • creaytC
                  creayt @Dashrender
                  last edited by

                  @dashrender I care about it, but because it's automatically replicated after each write there's a fully up-to-date, ready-to-go backup of it the next U down at all times. Could/would also push nightly backups offsite somewhere I suppose.

                  Looks like Raid 5 for SSDs can also, possibly, shorten their lifespan because of the parity writes: https://serverfault.com/questions/513909/what-are-the-main-points-to-avoid-raid5-with-ssd

                  • scottalanmillerS
                    scottalanmiller @creayt
                    last edited by

                    @creayt said in Is this server strategy reckless and/or insane?:

                    As far as Raid 5 instead of 0, I'd thought that the performance of Raid 5 was absolutely terrible and that almost no one used it anymore, is that a wrong memory?

                    RAID 5 is the standard for SSDs. But you will take performance hits. But whether or not you can tell is the question. On an all flash array with caching, the hit is pretty small.

                    • scottalanmillerS
                      scottalanmiller @creayt
                      last edited by

                      @creayt said in Is this server strategy reckless and/or insane?:

                      @dashrender I care about it, but because it's automatically replicated after each write there's a fully up-to-date, ready-to-go backup of it the next U down at all times. Could/would also push nightly backups offsite somewhere I suppose.

                      Looks like Raid 5 for SSDs can also, possibly, shorten their lifespan because of the parity writes: https://serverfault.com/questions/513909/what-are-the-main-points-to-avoid-raid5-with-ssd

                      Yes, but with enterprise drives and cache buffering, that's trivial. You are typically looking at decades before failure.
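To put rough numbers on the parity-write concern, here is a minimal sketch of the arithmetic. Every input is an assumption rather than a figure from the thread: the rated endurance, the daily write volume, and the factor of 2 (a small RAID 5 write updates one data block and one parity block). It also ignores the SSD's internal write amplification and any controller caching.

```python
def years_until_rated_endurance(
    rated_tbw: float,
    array_writes_gb_per_day: float,
    drives: int,
    parity_write_factor: float = 2.0,
) -> float:
    """Years until each drive reaches its rated terabytes-written figure.

    rated_tbw: per-drive endurance rating in TB (check the drive datasheet).
    array_writes_gb_per_day: logical writes hitting the whole array per day.
    parity_write_factor: physical writes per logical write
        (about 2 for small RAID 5 writes, 1 for RAID 0).
    """
    # Physical writes are assumed to spread evenly across all member drives.
    per_drive_gb_per_day = array_writes_gb_per_day * parity_write_factor / drives
    per_drive_tb_per_year = per_drive_gb_per_day * 365 / 1000
    return rated_tbw / per_drive_tb_per_year


# Placeholder inputs: 300 TBW rating, 50 GB of writes per day.
raid5_years = years_until_rated_endurance(300, 50, drives=6)
raid0_years = years_until_rated_endurance(300, 50, drives=5, parity_write_factor=1.0)
print(f"RAID 5 (6 drives): {raid5_years:.1f} years")
print(f"RAID 0 (5 drives): {raid0_years:.1f} years")
```

With these placeholder inputs both layouts come out at several decades, so the parity overhead only starts to matter at much higher daily write volumes or much lower endurance ratings.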

                      • ObsolesceO
                        Obsolesce @scottalanmiller
                        last edited by

                        @scottalanmiller said in Is this server strategy reckless and/or insane?:

                        @creayt said in Is this server strategy reckless and/or insane?:

                        @dashrender I care about it, but because it's automatically replicated after each write there's a fully up-to-date, ready-to-go backup of it the next U down at all times. Could/would also push nightly backups offsite somewhere I suppose.

                        Looks like Raid 5 for SSDs can also, possibly, shorten their lifespan because of the parity writes: https://serverfault.com/questions/513909/what-are-the-main-points-to-avoid-raid5-with-ssd

                        Yes, but with enterprise drives and cache buffering, that's trivial. You are typically looking at decades before failure.

                        850 pros are not enterprise drives.

                        • scottalanmillerS
                          scottalanmiller @Obsolesce
                          last edited by

                          @tim_g said in Is this server strategy reckless and/or insane?:

                          @scottalanmiller said in Is this server strategy reckless and/or insane?:

                          @creayt said in Is this server strategy reckless and/or insane?:

                          @dashrender I care about it, but because it's automatically replicated after each write there's a fully up-to-date, ready-to-go backup of it the next U down at all times. Could/would also push nightly backups offsite somewhere I suppose.

                          Looks like Raid 5 for SSDs can also, possibly, shorten their lifespan because of the parity writes: https://serverfault.com/questions/513909/what-are-the-main-points-to-avoid-raid5-with-ssd

                          Yes, but with enterprise drives and cache buffering, that's trivial. You are typically looking at decades before failure.

                          850 pros are not enterprise drives.

                          Whoops, missed that.

                          • DashrenderD
                            Dashrender @Obsolesce
                            last edited by

                            @tim_g said in Is this server strategy reckless and/or insane?:

                            @scottalanmiller said in Is this server strategy reckless and/or insane?:

                            @creayt said in Is this server strategy reckless and/or insane?:

                            @dashrender I care about it, but because it's automatically replicated after each write there's a fully up-to-date, ready-to-go backup of it the next U down at all times. Could/would also push nightly backups offsite somewhere I suppose.

                            Looks like Raid 5 for SSDs can also, possibly, shorten their lifespan because of the parity writes: https://serverfault.com/questions/513909/what-are-the-main-points-to-avoid-raid5-with-ssd

                            Yes, but with enterprise drives and cache buffering, that's trivial. You are typically looking at decades before failure.

                            850 pros are not enterprise drives.

                            I was too slow to respond... I didn't miss that. 😉

                            • creaytC
                              creayt
                              last edited by

                              Let me ask this.

                              The only thing that'll be stored on each Raid 0/5 is

                              The MySQL data files ( not the MySQL installation )
                              and
                              The image uploads

                              So if a drive in the Raid 0 fails, I simply replace the drive, recreate the virtual disk, and then copy the database and images, which I think takes just a few minutes w/ two systems of this caliber 1U away from each other especially w/ so many cores to spare ( won't be competing w/ the load of the live site ).
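Whether the copy really takes "just a few minutes" depends mostly on how much data there is and how fast the link between the two boxes is, neither of which is stated here. A rough sketch of the estimate, where the 500 GB data size, the link speeds, and the 80% efficiency figure are all placeholder assumptions:

```python
def copy_minutes(data_gb: float, link_gbit_per_s: float, efficiency: float = 0.8) -> float:
    """Estimate the minutes needed to copy data_gb over a network link.

    efficiency is an assumed fraction of line rate actually achieved
    after protocol overhead; disk speed is assumed not to be the bottleneck.
    """
    usable_gbyte_per_s = link_gbit_per_s / 8 * efficiency
    return data_gb / usable_gbyte_per_s / 60


for gbit in (1, 10):
    print(f"500 GB over {gbit} GbE: about {copy_minutes(500, gbit):.0f} minutes")
```

Under these assumptions the same copy is over an hour on a 1 Gbit link and under ten minutes on 10 Gbit, so the link speed is worth checking before relying on the "few minutes" figure.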

                              So, since I have to drive an SSD over to the datacenter 10 minutes away, open the box, and get it in, a few more minutes for the copy feels like it'll be negligibly more time than if it failed w/ a Raid 5, where it would stay online ( though I don't know if my set up lets you do the Raid 5 replacement while the OS is running; maybe it does, or maybe I just hot swap the drive, I'm not sure ).

                              So, because the full penalty for a Raid 0 failing vs. a Raid 5 in my set up is basically a few more minutes to copy the stuff manually, seems like the performance improvements would be worth the gamble. Is that logic sound or do y'all think just keeping the array online is better so 5 is the way to go anyway?
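One way to size the gamble is to compare how likely each layout is to lose the array in a given year. The sketch below is a deliberately simple model: the 2% annual per-drive failure rate is a placeholder, failures are treated as independent, and the RAID 5 figure assumes a failed drive is never replaced during the year, which makes it pessimistic.

```python
def p_raid0_loss(p_drive: float, drives: int) -> float:
    """RAID 0 is lost as soon as any single member drive fails."""
    return 1 - (1 - p_drive) ** drives


def p_raid5_loss(p_drive: float, drives: int) -> float:
    """RAID 5 survives one failure, so it is lost only if two or more drives fail."""
    p_none = (1 - p_drive) ** drives
    p_exactly_one = drives * p_drive * (1 - p_drive) ** (drives - 1)
    return 1 - p_none - p_exactly_one


P_DRIVE = 0.02  # assumed annual failure rate per drive, not a measured value

print(f"RAID 0, 5 drives: {p_raid0_loss(P_DRIVE, 5):.2%} chance of losing the array")
print(f"RAID 5, 6 drives: {p_raid5_loss(P_DRIVE, 6):.2%} chance of losing the array")
```

Under this model the RAID 0 array is far more likely to need the full rebuild-and-copy procedure, so the question becomes whether the performance gain is worth doing that procedure every so often.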

                              • ObsolesceO
                                Obsolesce
                                last edited by

                                Just an FYI:

                                 
                                Posted by DELL-Josh Cr on 16 Mar 2015 15:41
                                
                                Hi,
                                ...if it is not a Dell drive we won’t have put our firmware on it that is designed for our controllers and we will not have validated it....
                                
                                Thanks,
                                Josh Craig
                                Dell EMC | Enterprise Support Services
                                
                                • creaytC
                                  creayt @creayt
                                  last edited by

                                  @creayt Also forgot to bring up that Raid 0 gives me way more capacity, right? So it'd give me terabyte(s) more before I had to scale to extra hardware. Can't remember how much Raid 5 subtracts.
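For reference, RAID 5 gives up one drive's worth of capacity to parity, while RAID 0 uses every drive for data. A minimal sketch of that arithmetic for the 1TB drives described in the first post (the `usable_tb` helper is only for illustration):

```python
def usable_tb(level: int, drives: int, drive_tb: float) -> float:
    """Usable capacity of an array built from equal-sized drives."""
    if level == 0:
        # Striping only: every drive holds data.
        return drives * drive_tb
    if level == 5:
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        # One drive's worth of space is consumed by parity.
        return (drives - 1) * drive_tb
    raise ValueError(f"unsupported RAID level: {level}")


print(usable_tb(0, 5, 1.0))  # RAID 0 with 5x 1TB
print(usable_tb(5, 5, 1.0))  # RAID 5 with the same 5 drives
print(usable_tb(5, 6, 1.0))  # RAID 5 after adding a sixth drive
```

So with the same five drives RAID 5 costs 1TB, and adding a sixth drive brings a RAID 5 array back to the 5TB that the five-drive RAID 0 offers.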

                                  • scottalanmillerS
                                    scottalanmiller
                                    last edited by

                                    That's not a horrible recovery strategy. But if the question is performance, how much downtime or effort caused by that offsets the performance difference? That's a real question. Will anyone notice the performance difference day to day? Will they notice five minutes or an hour of downtime? Will you notice having to do all of that work that could have been avoided?

                                    Those are the real questions.

                                    • DustinB3403D
                                      DustinB3403 @creayt
                                      last edited by

                                      @creayt said in Is this server strategy reckless and/or insane?:

                                      Let me ask this.

                                      The only thing that'll be stored on each Raid 0/5 is

                                      The MySQL data files ( not the MySQL installation )
                                      and
                                      The image uploads

                                      So if a drive in the Raid 0 fails, I simply replace the drive, recreate the virtual disk, and then copy the database and images, which I think takes just a few minutes w/ two systems of this caliber 1U away from each other especially w/ so many cores to spare ( won't be competing w/ the load of the live site ).

                                      So, since I have to drive an SSD over to the datacenter 10 minutes away, open the box, and get it in, a few more minutes for the copy feels like it'll be negligibly more time than if it failed w/ a Raid 5, where it would stay online ( though I don't know if my set up lets you do the Raid 5 replacement while the OS is running, maybe it does, or maybe I just hot swap the drive I'm not sure ).

                                      So, because the full penalty for a Raid 0 failing vs. a Raid 5 in my set up is basically a few more minutes to copy the stuff manually, seems like the performance improvements would be worth the gamble. Is that logic sound or do y'all think just keeping the array online is better so 5 is the way to go anyway?

                                      Keeping the OBR5 online and recovering from that would be faster than having to completely rebuild an OBR0.

                                      • creaytC
                                        creayt @Obsolesce
                                        last edited by

                                        @tim_g What are the implications of this, do you know? For what it's worth none of these drives do the amber light thing in either server, all green and they report as SSDs etc. in the lifecycle tooling.

                                        • DustinB3403D
                                          DustinB3403 @creayt
                                          last edited by

                                          @creayt said in Is this server strategy reckless and/or insane?:

                                          @creayt Also forgot to bring up that Raid 0 also gives me way more capacity right so it'd give me terabyte(s) more before I had to scale to extra hardware? Can't remember how much Raid 5 subtracts.

                                          How much storage does this system need?

                                          • creaytC
                                            creayt @DustinB3403
                                            last edited by creayt

                                            @dustinb3403 It's a community style site that's some kind of hybrid between Reddit and something like Mango Lassi, so the more users I get, the more content they'll generate ( mostly in the form of MySQL data ) and the more footprint I'll need, eventually having to go cloud probably if it takes off. But will be a huge volume of small database writes happening pretty much 24/7.
