As you might guess, data redundancy is a primary contributor to explosive data growth. Studies estimate that multiple copies of data require organizations to buy, use, and administer two to fifty times more storage than they’d need with data deduplication.

Initially, data deduplication was used to eliminate redundancy in specific cases such as full backups, email attachments, and VMware images. Duplicated data, however, is far more pervasive than those cases suggest. Test and development data multiplies across an organization over time. Replication, backup, and archiving create multiple data copies scattered across the enterprise, and users often copy data to multiple locations for their own convenience.
Organizations now recognize that, far from being a niche technology, deduplication should be an integrated and mandatory element in their overall IT strategies.
There are essentially two ways to reduce the cost of your data storage. First, you can move to a lower-cost storage platform, which brings its own set of problems. Your other option is to use data deduplication to reduce your data growth and total required storage.
Data deduplication can lower the cost of your data storage by reducing the amount of disk needed to store your data, whether it’s backups or online primary production volumes. In this three-part blog series, we’ll discuss five best practices to help you select and implement the optimum deduplication solution for your environment.
As with disk-to-disk backup or server virtualization, you don’t want to evaluate deduplication as an isolated product or feature. You must consider the broader implications of deduplication within the context of your entire data management and storage strategy.
For example, deduplication can be performed at the file, block, and byte levels. You’ll have to consider the tradeoffs for each method, which include computational time, accuracy, level of duplication detected, index size, and in some cases, the scalability of the solution.
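To make the file-versus-block tradeoff concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any particular product works: it uses SHA-256 fingerprints, fixed-size 4 KB blocks, and an in-memory index, and the function names (`whole_file`, `fixed_blocks`, `dedup`) are hypothetical. It stores two identical copies of a file plus a third copy in which only the last block has changed.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for this illustration


def whole_file(data):
    """File-level chunking: the entire file is one chunk."""
    yield data


def fixed_blocks(data, size=BLOCK_SIZE):
    """Block-level chunking: split the file into fixed-size blocks."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


def dedup(files, chunker):
    """Return (bytes_stored, index_entries) after deduplicating `files`.

    Each chunk is fingerprinted with SHA-256; a chunk is stored only the
    first time its fingerprint is seen.
    """
    index = {}  # fingerprint -> chunk length
    for data in files:
        for chunk in chunker(data):
            index.setdefault(hashlib.sha256(chunk).digest(), len(chunk))
    return sum(index.values()), len(index)


# Hypothetical data set: a 16 KB file, an exact copy, and an edited copy
# that differs only in its last 4 KB block.
base = b"".join(bytes([i]) * BLOCK_SIZE for i in range(4))
edited = base[:-BLOCK_SIZE] + bytes([9]) * BLOCK_SIZE
files = [base, base, edited]

raw_bytes = sum(len(f) for f in files)
file_bytes, file_entries = dedup(files, whole_file)
block_bytes, block_entries = dedup(files, fixed_blocks)

print("raw:", raw_bytes)
print("file-level:", file_bytes, "bytes,", file_entries, "index entries")
print("block-level:", block_bytes, "bytes,", block_entries, "index entries")
```

In this toy data set, file-level deduplication removes only the exact copy, while block-level deduplication also recovers the unchanged blocks of the edited file. The cost is more hashing work and a larger index, which is the tradeoff you need to weigh for your own data.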
Also, consider how you can use deduplication to eliminate tape where it makes sense in your environment. Good candidates include remote offices or any location where your company doesn’t have trained IT personnel.
We’ll discuss two more data deduplication best practices in our next blog post, so come back soon. And please share your own best practices with us as well.