No, this is not an article about Miley Cyrus’ latest song. This is an article about data deduplication, often referred to as “dedupe”. The intent of this article is to briefly discuss what data deduplication is and how it might be employed in your current BDR plan.
Data deduplication is a specialized data compression technique. In its simplest form, the deduplication process compares the byte patterns in chunks of data intended for storage against an internal index of data already stored. Whenever a match occurs, the redundant chunk of data is replaced with a small reference that points to the previously stored data.
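To make the idea concrete, here is a minimal sketch of that compare-and-reference step in Python. It is an illustration only, not how any particular product works: the `DedupeStore` class, the fixed chunk size, and the use of SHA-256 digests as the index key are all assumptions chosen for the example.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, chosen only for illustration


class DedupeStore:
    """Hypothetical in-memory store that keeps each unique chunk once."""

    def __init__(self, chunk_size=CHUNK_SIZE):
        self.chunk_size = chunk_size
        self.chunks = {}  # index: SHA-256 digest -> chunk bytes

    def write(self, data):
        """Store data and return the list of references that describe it."""
        refs = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # Only chunks not already in the index consume new space.
            if digest not in self.chunks:
                self.chunks[digest] = chunk
            # A redundant chunk is represented by its reference alone.
            refs.append(digest)
        return refs

    def read(self, refs):
        """Rebuild the original data by following the references."""
        return b"".join(self.chunks[ref] for ref in refs)

    def stored_bytes(self):
        """Total size of the unique chunks actually kept."""
        return sum(len(chunk) for chunk in self.chunks.values())


store = DedupeStore(chunk_size=4)
data = b"ABCDABCDABCDWXYZ"
refs = store.write(data)
print(len(data), store.stored_bytes())  # 16 bytes written, 8 bytes stored
```

In this toy run the chunk “ABCD” appears three times but is stored once, so 16 bytes of input occupy 8 bytes of chunk storage plus four small references. A real system would also have to persist the index and guard against the (very unlikely) case of two different chunks sharing a digest.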
Data deduplication can also be classified by where it occurs. A deduplication process that occurs close to where the data is created is referred to as “source deduplication”, whereas a similar process that occurs close to where the data is stored is referred to as “target deduplication”.
Data deduplication carries many of the same drawbacks and benefits as other compression processes. For example, whenever data is transformed there is a risk of lost or corrupted data. In addition, the compression process requires computational resources, which adds overhead. Ideally, the benefit of a smaller storage footprint outweighs these risks, and where large amounts of data are concerned, that is very possible.
However, given the low cost of drive space today, a small business might do well to consider buying additional storage capacity rather than purchasing and implementing a deduplication process. One study using IBM disk manufacturing data implies that the cost per gigabyte is dropping by roughly 37.5 percent each year.
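To see what a decline of that size means over a few years, the short Python sketch below compounds the 37.5 percent figure. The starting price of $1.00 per gigabyte is a made-up placeholder rather than a quoted price, and the function name is hypothetical; the sketch also assumes the rate stays constant, which the study does not guarantee.

```python
ANNUAL_DECLINE = 0.375  # the roughly 37.5 percent yearly drop cited above


def projected_cost_per_gb(current_cost, years, annual_decline=ANNUAL_DECLINE):
    """Compound a constant yearly price decline over a number of years."""
    return current_cost * (1 - annual_decline) ** years


start = 1.00  # hypothetical price per gigabyte today, for illustration only
for year in range(4):
    print(year, round(projected_cost_per_gb(start, year), 4))
```

Under those assumptions each year's price is 62.5 percent of the previous year's, so the cost per gigabyte falls to roughly 39 percent of today's price after two years and roughly 24 percent after three.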
So, before you “throwdown” that pile of cash, you might consider low-cost data storage as a safer and easier alternative to implementing a data deduplication process. Meanwhile, check back here at StorageCraft often for more backup and data recovery solutions.