ANTenna Blog -- How-To

Data Dedupe: Doing More -- A Lot More! -- With Less

Posted by Matthew McKenzie Thursday, Nov 5, 2009, 04:09 PM ET

You're already familiar with file compression technology. Now meet its big brother -- data deduplication -- and learn how it can save your company a ton of money.

Businesses with growing data-storage requirements must also deal with the challenges of backing up this data. In the past, this meant either investing in additional backup capacity, cutting backup data-retention times, or simply choosing not to back up certain data sources.

The first choice is expensive, even as disk-based storage costs continue to fall. The second choice can quickly poke holes in your company's backup strategy.

And the third choice? Think of it as playing Russian Roulette with one of your company's most important assets -- its data.

Deduplication technology has been around for a number of years. Recently, however, it has exploded into the IT mainstream; whether a vendor provides enterprise-class backup solutions or caters to the smallest businesses, "dedupe" probably figures prominently in its backup products.


Yet there is clearly a disconnect here. According to storage expert George Crump, most IT professionals still don't pay much attention to dedupe technology. At the same time, some of the world's biggest storage vendors are paying megabucks to acquire the latest and greatest dedupe innovations.

The basic idea behind deduplication is simple. Think of it as a backup solution that is intelligent enough to know when it encounters the same data twice. An obvious example would be an email archive backup that includes lots of attachments. If a backup system recognizes that a number of messages contain the same attachment, it can keep a single copy and replace the others with a virtual pointer.
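To make the email-attachment example concrete, here is a minimal Python sketch of the idea. It is only an illustration, not how any particular backup product works: the function names, the sample file names, and the choice of SHA-256 as the fingerprint are my own assumptions. Each distinct attachment is stored once, and every message just keeps a pointer (the fingerprint) to the stored copy.

```python
import hashlib

def dedupe_store(attachments):
    """Store each distinct payload once; every item keeps only a pointer to it."""
    store = {}      # fingerprint -> the single stored copy of the bytes
    pointers = []   # one (name, fingerprint) entry per original item
    for name, payload in attachments:
        digest = hashlib.sha256(payload).hexdigest()  # assumed fingerprint
        store.setdefault(digest, payload)             # keep only the first copy
        pointers.append((name, digest))
    return store, pointers

def restore(store, pointers):
    """Rebuild the original items by following each pointer back to the store."""
    return [(name, store[digest]) for name, digest in pointers]

# Hypothetical archive: three messages carry the same report, one carries a logo.
report = b"Q3 sales report " * 1000
archive = [
    ("msg1/report.pdf", report),
    ("msg2/report.pdf", report),
    ("msg3/report.pdf", report),
    ("msg4/logo.png", b"logo bytes"),
]

store, pointers = dedupe_store(archive)
raw_size = sum(len(payload) for _, payload in archive)
stored_size = sum(len(payload) for payload in store.values())
print(f"raw: {raw_size} bytes, stored after dedupe: {stored_size} bytes")
```

Restoring is just a matter of following the pointers, so nothing is lost even though the repeated attachment is kept only once.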

It's a far more powerful approach than traditional data compression. Consider another example: a backup archive full of JPEG images. Those JPEGs are already compressed; a data compression tool will have little or no impact on them.

A dedupe solution, however, can pick through the same content -- document archives, Web site content repositories, or other sources -- and drastically reduce the space required to back it up.
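A rough way to see the difference for yourself is sketched below in Python. The assumptions are mine: random bytes stand in for an already-compressed JPEG, the "backup" is simply ten copies of that one file, and zlib stands in for a traditional compression tool. zlib's DEFLATE format only looks back 32 KB for repeats, so it cannot notice that a 100 KB file appears again later in the stream; a whole-file fingerprint comparison can.

```python
import hashlib
import random
import zlib

# Assumption: seeded random bytes stand in for an already-compressed JPEG.
rng = random.Random(0)
fake_jpeg = rng.getrandbits(8 * 100_000).to_bytes(100_000, "big")

# Hypothetical backup set: the same image saved in ten different places.
backup = [fake_jpeg] * 10
raw_size = sum(len(f) for f in backup)

# Traditional compression over the whole stream.
compressed_size = len(zlib.compress(b"".join(backup), 9))

# Whole-file dedupe: keep one copy per distinct fingerprint.
unique = {hashlib.sha256(f).digest(): f for f in backup}
deduped_size = sum(len(f) for f in unique.values())

print(f"raw:        {raw_size} bytes")
print(f"compressed: {compressed_size} bytes")
print(f"deduped:    {deduped_size} bytes")
```

Real products usually go further and compare chunks within files rather than whole files, but the contrast is the same: compression looks for short-range patterns, while dedupe looks for data it has already stored.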

The gory details here can get very complicated, very quickly. Different dedupe solutions operate at different points in a company's IT infrastructure and apply different techniques to get the job done. All of them, however, can cut the space required for backups, in some cases by as much as 90 percent.

Dedupe technology truly allows you to do more with less. Many companies can actually cut their investments in backup storage hardware while still increasing their data-retention periods or cutting the amount of time between backups. And while it is always a smart idea to prioritize business data as part of a backup strategy, dedupe means never having to skimp on backups that are necessary to protect essential data.

Finally, dedupe offers another huge benefit: It makes cloud-based solutions practical for even very large backup jobs. In the past, even companies with relatively fast Internet connections could take days to upload a multi-gigabyte backup to a cloud provider. And for smaller firms that rely on DSL connections, the same jobs could take weeks -- an obvious deal-killer for comprehensive cloud-based backups.

That's why many cloud-based backup services are touting dedupe technology. In many cases, it can turn a multi-day backup into a much faster process, especially once a company has its first full online backup in place and can start uploading smaller, incremental backups.
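Here is a small Python sketch of why the second and later uploads shrink. It is a simplification under assumptions of my own: fixed 4 KB chunks, SHA-256 fingerprints, and a plain set standing in for whatever index a real service keeps of the chunks it already holds. The helper names are hypothetical.

```python
import hashlib

CHUNK_SIZE = 4096  # assumption: fixed-size chunks; products differ here

def split_chunks(data, size=CHUNK_SIZE):
    """Cut the backup stream into fixed-size pieces."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def plan_upload(data, known_digests):
    """Return only the chunks the remote side has not seen yet."""
    to_send = []
    for chunk in split_chunks(data):
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in known_digests:
            known_digests.add(digest)   # remember it for later backups
            to_send.append(chunk)
    return to_send

def upload_hours(num_bytes, megabits_per_second):
    """Idealized transfer time, ignoring protocol overhead and retries."""
    return num_bytes * 8 / (megabits_per_second * 1_000_000) / 3600

# Hypothetical data: 100 distinct chunks on Monday, one chunk edited by Tuesday.
monday = b"".join(i.to_bytes(4, "big") * 1024 for i in range(100))
edited = bytearray(monday)
edited[7 * CHUNK_SIZE:8 * CHUNK_SIZE] = b"x" * CHUNK_SIZE
tuesday = bytes(edited)

known = set()
first = plan_upload(monday, known)    # full backup: every chunk is new
second = plan_upload(tuesday, known)  # incremental: only the changed chunk

print(f"first upload:  {len(first)} chunks")
print(f"second upload: {len(second)} chunks")
print(f"10 GB at 1 Mbit/s: {upload_hours(10_000_000_000, 1):.1f} hours")
```

The first backup still has to send everything, which is why the initial upload remains the slow part. After that, only chunks the service has never seen need to cross the wire.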

And, of course, smaller uploads also mean smaller per-GB storage charges.
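The arithmetic is simple enough to sketch in a few lines of Python. The price and data volume below are made-up numbers for illustration only, and the 90 percent reduction is the best case, not a typical result.

```python
def monthly_storage_cost(raw_gb, reduction, price_per_gb):
    """Monthly charge for what is actually stored after dedupe.

    reduction is the fraction of space saved (0.9 means 90 percent).
    """
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in the range [0, 1)")
    stored_gb = raw_gb * (1 - reduction)
    return stored_gb * price_per_gb

# Hypothetical figures: 1,000 GB of backups at $0.25 per GB per month.
before = monthly_storage_cost(1000, 0.0, 0.25)
after = monthly_storage_cost(1000, 0.9, 0.25)
print(f"without dedupe: ${before:.2f}/month, with dedupe: ${after:.2f}/month")
```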

Where should you go to get a quick education on dedupe technology? Here are a few suggestions. (Note that I include vendor-specific links here because they offer good background information, not because I endorse their products or services.)

- Start with this InformationWeek article that explains dedupe clearly and lays out the technology's pros and cons.
- Storage solution provider and IBM partner EMC also has a great FAQ that covers both the basics and some more advanced aspects of the topic for IT professionals. An EMC-authored white paper on the subject is also good enough for me to recommend it as essential reading.
- Another vendor, Quantum, offers an overview of the technology that focuses on its economic and operational benefits.
- Finally, a TechTarget.com article offers both a solid introduction to dedupe technology and a wealth of pointers to other articles dealing with the topic and with how storage solution providers are implementing it.









