Today, while reviewing new blog posts, I came across Heidi Biggar's entry To de-dupe or not to de-dupe about data de-duplication / capacity optimization / commonality factoring / single-instancing. In my last entry, The Coolest Product from a Storage Startup, I mentioned coming across an exciting product, and it happens to fall into the same category Heidi discusses.
I agree with her observation that capacity optimization will be one of this year's most talked-about new technologies. Unlike today's storage virtualization hype, capacity optimization is delivering measurable and significant value to customers. But it may take a few years before the full potential of this technology is realized.
I believe the impact of capacity optimization goes beyond data protection to primary storage. Just consider the possibility of being able to store 50TB of data on a 3TB storage array. The technology is especially powerful when positioned as data virtualization across an enterprise - replacing the repeating segments everywhere in the enterprise with pointers to a centrally stored single instance of each segment.
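The core idea - keep each unique segment once and represent everything else as pointers to it - can be sketched in a few lines. This is a minimal illustration, not how any particular product works: I'm assuming fixed-size segments and SHA-256 fingerprints, whereas real systems typically use variable-size, content-defined chunking; the `DedupStore` class and its method names are my own invention for this example.

```python
import hashlib

class DedupStore:
    """Toy single-instance store: each unique segment is kept once,
    and each file is just a list of pointers (segment hashes)."""

    def __init__(self, segment_size=4096):
        self.segment_size = segment_size
        self.segments = {}   # hash -> segment bytes, stored exactly once
        self.files = {}      # file name -> ordered list of segment hashes

    def write(self, name, data):
        pointers = []
        for i in range(0, len(data), self.segment_size):
            seg = data[i:i + self.segment_size]
            h = hashlib.sha256(seg).hexdigest()
            self.segments.setdefault(h, seg)  # store only if not seen before
            pointers.append(h)
        self.files[name] = pointers

    def read(self, name):
        # Reassemble the file by following its pointers
        return b"".join(self.segments[h] for h in self.files[name])

    def stored_bytes(self):
        # Physical capacity actually consumed by unique segments
        return sum(len(s) for s in self.segments.values())
```

Writing two files that share content only consumes capacity for the segments they don't have in common - which is exactly where the 50TB-on-3TB kind of ratio comes from when the same data repeats across an enterprise.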
A good "geek" read on the inner workings of this technology is US patent 6,928,526, Efficient Data Storage System, granted to Dr. Kai Li, co-founder of DataDomain, and others. In my opinion, they shortchanged themselves by restricting their patent to data protection only - or maybe that was the only way to get it granted, as I noticed the patent examiner added several other patents as references. I haven't had a chance to read those yet.
I fully agree with Heidi's statement about capacity optimization:
I'm pretty certain you'll be amazed at the results!