    Deduplication on CSV storage

    • Jimmy9008

      Hi folks,

      We have a three node Windows Failover Cluster, with several CSVs, each provided by Starwind using local storage.

      One interesting idea has been brought up internally, and I'd like to know if it sounds sensible or possible.

      Is it possible to enable deduplication of CSV storage within the cluster? My initial thought is that, because VM files are open in vmms.exe, the deduplication process will not be able to work on those open files. Is that wrong, though? I would expect ISO files to dedupe, as they are offline/not in use, but I have a feeling that live, running VM VHD(X) files on the host will not be able to go through the process...

      I could try this, but don't want to risk causing issues to the storage. So, thought best to put the idea out first.

      Best,
      Jim

      • scottalanmiller

        I've never researched this, but I see no reason why a live file couldn't be deduplicated: at the block level, where dedupe happens, the concept of open or closed doesn't exist.

        • scottalanmiller

          Dedupe is by volume, not file system. So I'm pretty sure CSV with mounted VHD will be just the same as anything else.

          Also, Hyper-V is an optimization option for Dedupe, so I'd really expect it to work.

          • dbeato

            No, do not enable deduplication on the CSV storage if the server OS is Server 2012.
            https://docs.microsoft.com/en-us/windows-server/failover-clustering/failover-cluster-csvs

            4f9d83c8-cb68-460a-b928-a12fdf71a9d6-image.png

            https://support.microsoft.com/en-us/help/2906888/known-issues-after-you-enable-data-deduplication-on-csv

            • scottalanmiller @dbeato

              @dbeato said in Deduplication on CSV storage:

              if the server OS is Server 2012

              This is really just part of the more general advice of "never use Server 2012".

              • Obsolesce @Jimmy9008

                No, as others pointed out. If you want dedupe, do it inside of the VMs on the data volumes there for that kind of stuff, not on the CSV.

                • scottalanmiller @Obsolesce

                  @Obsolesce said in Deduplication on CSV storage:

                  No, as others pointed out. If you want dedupe, do it inside of the VMs on the data volumes there for that kind of stuff, not on the CSV.

                  Why not on the CSV?

                  • Obsolesce @scottalanmiller

                    @scottalanmiller said in Deduplication on CSV storage:

                    Why not on the CSV?

                    That does it across the board then for everything in it. If that's what he wants, go for it. Just know what you need to do to do it properly. It also depends on a few variables.

                    Just look over the docs well and know it first.

                    • Jimmy9008

                      @Obsolesce said in Deduplication on CSV storage:

                      No, as others pointed out. If you want dedupe, do it inside of the VMs on the data volumes there for that kind of stuff, not on the CSV.

                      Such as? I mean, I thought it shouldn't be done. A few people have said no in this thread, some have said yes...

                      • Obsolesce @Jimmy9008

                        @Jimmy9008 said in Deduplication on CSV storage:

                        Such as?

                        I don't understand what you're asking?

                        • Jimmy9008 @Obsolesce

                          @Obsolesce said in Deduplication on CSV storage:

                          That does it across the board then for everything in it. If that's what he wants, go for it. Just know what you need to do to do it properly. It also depends on a few variables.

                          Just look over the docs well and know it first.

                          Sorry, I was supposed to quote that. I mean, such as what variables? I've looked at various resources and can't find definitive information about doing this on a CSV with VM files and drives...

                          Not sure if it's sensible. Won't it have an overhead too?

                          • scottalanmiller @Jimmy9008

                            @Jimmy9008 said in Deduplication on CSV storage:

                            Not sure if it's sensible. Won't it have an overhead too?

                            Yes, but normal tuning makes it only use that overhead when the system is idle. So unless you have some really specific workload that will cause problems that we don't know about, the system docs are pretty clear that the overhead won't impact you.

                            • scottalanmiller @Jimmy9008

                              @Jimmy9008 said in Deduplication on CSV storage:

                              I've looked at various resources and can't see definitive information about doing this on a csv with VM files and drives...

                              That's because you are looking for "negative" documentation. It's like "can I store my video games on X hard drive." It's a hard drive, of course you can store a video game on it. You can't look for "Civilization VI on WD 4TB Red Drive" because it's not a proper question. The drive is SATA and supports any file system, Civ 6 will go on any file system, ergo, it works. You can't document every use case imaginable like that.

                              That's what is going on here. The dedupe is block level. Ergo, talking about the file types on it isn't relevant. Hence you feel like you aren't getting answers, but knowing that it is block level answers all of that automatically.
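
To illustrate the block-level argument above, here is a minimal Python sketch of generic block-level deduplication. It is a toy model under stated assumptions, not how Windows Server Data Deduplication actually chunks or schedules its work: the 4096-byte block size, the `BlockDedupStore` class, and the two "VM disk" payloads are all hypothetical and exist only for illustration. The point it shows is that the store only ever sees blocks of bytes, so it has no notion of file type or of a file being open or closed.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


class BlockDedupStore:
    """Toy block store: identical blocks are kept once, keyed by their hash."""

    def __init__(self):
        self.unique_blocks = {}  # SHA-256 digest -> block bytes (stored once)
        self.block_map = []      # logical block order -> digest

    def write(self, data: bytes) -> None:
        # Split incoming bytes into fixed-size blocks. Nothing here knows
        # which file the bytes belong to, or whether that file is open.
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha256(block).digest()
            self.unique_blocks.setdefault(digest, block)
            self.block_map.append(digest)

    def read(self) -> bytes:
        # Rebuild the logical contents from the block map.
        return b"".join(self.unique_blocks[d] for d in self.block_map)

    def logical_blocks(self) -> int:
        return len(self.block_map)

    def physical_blocks(self) -> int:
        return len(self.unique_blocks)


# Two hypothetical VM disks that share the same three "OS" blocks
# and differ only in one "data" block each.
os_image = b"".join(bytes([i]) * BLOCK_SIZE for i in range(3))
vm1_disk = os_image + b"x" * BLOCK_SIZE
vm2_disk = os_image + b"y" * BLOCK_SIZE

store = BlockDedupStore()
store.write(vm1_disk)
store.write(vm2_disk)

print("logical blocks:", store.logical_blocks())
print("physical blocks:", store.physical_blocks())
```

In this sketch the shared blocks are stored once, and reading back returns the original bytes unchanged, which is the sense in which the workload on top cannot tell the difference apart from performance.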

                              • dbeato

                                For example, SQL Servers or Exchange cannot have deduplication on them, so maybe you don’t have that. I have attempted to enable deduplication on Hyper-V servers and it really breaks things, but I would open a ticket with StarWind support and they should be able to advise you on that. Their support is great.

                                • scottalanmiller @dbeato

                                  @dbeato said in Deduplication on CSV storage:

                                  For example SQL Servers or Exchange cannot have deduplication on them so maybe you don’t have that.

                                  I'm pretty sure that they can. Maybe not inside the VM, but certainly from outside of it. How would they know or be affected? They can't be. If dedupe works, it works for every workload. If it doesn't, it never does. It's an all-or-nothing system; the workload on top of it can't determine whether you can use it or not. It would only be affected by performance.

                                  • scottalanmiller @dbeato

                                    @dbeato said in Deduplication on CSV storage:

                                    For example SQL Servers or Exchange cannot have deduplication on them

                                    Microsoft lists SQL Server as their prime example of "might be good for it, but you need to evaluate your use case" ...

                                    https://docs.microsoft.com/en-us/windows-server/storage/data-deduplication/install-enable

                                    Screenshot from 2020-06-25 09-15-51.png

                                    • dbeato @scottalanmiller

                                      @scottalanmiller said in Deduplication on CSV storage:

                                      Microsoft lists SQL Server as their prime example of "might be good for it, but you need to evaluate your use case" ...

                                      https://docs.microsoft.com/en-us/windows-server/storage/data-deduplication/install-enable

                                      Screenshot from 2020-06-25 09-15-51.png

                                      Well, I would say this is the part with Hyper-V that is so ambiguous
                                      16760424-62d6-4179-b5c1-c09a070c276d-image.png

                                      • scottalanmiller @dbeato

                                        @dbeato said in Deduplication on CSV storage:

                                        Well, I would say this is the part with Hyper-V that is so ambiguous
                                        16760424-62d6-4179-b5c1-c09a070c276d-image.png

                                        It's not, it can't be. VDI has massive overlap and can use more aggressive deduplication, but that's all. Dedupe by definition is either always safe, or never safe. There cannot be an in-between. Not when it runs below the filesystem.

                                        Hyper-V support and safety is never in question. Only Hyper-V for VDI has a specialized tuning option.
