Missing Dedupe Debate Detail!

The de-dupe vendors like to debate the details of their solutions, ranging from compression or de-dupe ratios, to hashing and caching algorithms, to processor vs. disk vs. memory, to in-band vs. out-of-band and pre- vs. post-processing, among other items. At times the de-dupe debates can get livelier than a political debate or even the legendary storage virtualization debates of yesteryear.

However, one item that an IT professional recently mentioned is not being addressed or talked about during the de-dupe debates: how IT customers will get around vendor lock-in. Never mind the usual lock-in debates about whose back-end storage or disk drives are used, whose server the de-dupe appliance software runs on, and so forth.

The real concern is how data will be recoverable from a de-dupe solution in the future, similar to how data can be recovered from tape today. Granted, this is an apples-to-oranges comparison at best. The only real similarity is that a backup or archive solution sends a data stream (a tarball, a backup or archive save set, or perhaps a file format) to the tape or de-dupe appliance. Then the VTL or de-dupe appliance software puts the data into yet another format.

Granted, not all tape media can be interchanged between different tape drives, given differences in formats and generations, and you still need the proper backup or archive application to unpack the data for use. Probably a more applicable apples-to-oranges comparison would be how IT personnel will get data back from a non-de-duping, disk-based VTL compared with getting data back from a de-duping VTL or de-dupe appliance.

Today and for the foreseeable future, the answer is simple: if your pain point is severe and you need the benefits of de-dupe, then the de-dupe software and appliance is your point of vendor lock-in. If vendor lock-in is a main concern, take your time, do your homework, and perform due diligence to find solutions that reduce lock-in or at least give a reasonable strategy for data access in the future.

Welcome to the world of virtualized data and virtualized data protection. Here's the golden rule for de-dupe: as with virtualization, whoever controls the software and management metadata controls the vendor lock-in. Good, bad, or indifferent, that's the harsh reality.
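
To make that golden rule a bit more concrete, here is a minimal Python sketch of a toy de-dupe store. It is not how any particular vendor's product works. The fixed 8-byte chunk size, the SHA-256 fingerprints, and the names `dedupe_store`, `restore`, `chunk_store` and `recipe` are all assumptions made up for illustration.

```python
import hashlib

CHUNK_SIZE = 8  # hypothetical and tiny on purpose; real products use far larger, often variable, chunks


def dedupe_store(data: bytes, chunk_store: dict) -> list:
    """Split data into fixed-size chunks, keep one copy of each unique
    chunk, and return the recipe: the ordered list of chunk fingerprints."""
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        chunk_store.setdefault(fingerprint, chunk)  # store only if not already present
        recipe.append(fingerprint)
    return recipe


def restore(recipe: list, chunk_store: dict) -> bytes:
    """Rebuild the original stream by following the recipe in order."""
    return b"".join(chunk_store[fingerprint] for fingerprint in recipe)


chunk_store = {}
save_set = b"ABCDEFGH" * 4 + b"12345678"  # stand-in for a backup save set
recipe = dedupe_store(save_set, chunk_store)

print(len(save_set), "bytes in,", len(chunk_store), "unique chunks stored")
assert restore(recipe, chunk_store) == save_set
```

The chunk store on its own is just an unordered pile of fragments. Without the recipe (the management metadata) and software that understands its layout, the original save set cannot be rebuilt, which is exactly where the lock-in lives.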

For the record, I like de-dupe technology in general as part of an overall data footprint reduction strategy, combined with archiving and real-time compression for on-line and off-line data. I see a very bright future for it. I also see many of the heavy-thinking and heavy-lifting issues involved in supporting large-scale deployments and processing getting addressed over time, allowing de-dupe to move from the mid-market to large-scale mainstream adoption.

Now, back to your regularly scheduled de-dupe debate drama!

Cheers
gs