Exposing the Six Myths of Deduplication
March 21st, 2013 By: Industry Perspectives
Darrell Riddle, senior director of product marketing for FalconStor Software, is a professional with more than 23 years of experience in the data protection industry. Darrell has an extensive understanding of both the technical and business aspects of marketing, product management and go-to-market strategies. Prior to joining FalconStor, Darrell worked at Symantec.
Most companies have lots of duplicate data. That’s a fact. Many companies are aware of it, but it falls in the category of cleaning out the garage or a spare room. You see the problem, but until you run completely out of space, it usually doesn’t get straightened up.
Many IT managers believe the software and/or hardware they purchased already deals with this kind of problem. The truth is, this may or may not be correct. In fact, many enterprises are not taking full advantage of how current technology can eliminate redundant data. In some cases, companies have not turned on the features that remove duplicate data (a process hereafter called “deduplication” or “dedupe”), nor are they actively using deduplication as a key aspect of their data protection plans. The reluctance of IT administrators to embrace dedupe usually stems from a lack of knowledge of its potential benefits or from past experience with a less-than-robust solution.
However, deduplication is a critical aspect of every backup environment, one that brings cost savings and efficiency to the enterprise. Depending on which report you read, companies face data growth of anywhere from 50 percent a year to a near doubling annually. That impacts the entire data protection strategy. It also makes data slow and dopey like a koala bear. Backup windows aren’t being met, and there is no way that disaster recovery testing can take place. Think of this entire problem like picking up a squirt gun to put out a fire: it just won’t work.
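To put those growth rates in perspective, here is a minimal arithmetic sketch. The 100 TB starting point and the three-year horizon are assumptions chosen purely for illustration, and `project_capacity` is a hypothetical helper, not part of any product.

```python
def project_capacity(start_tb, annual_growth, years):
    """Return the projected data size for year 0..years, compounding annually."""
    return [start_tb * (1 + annual_growth) ** year for year in range(years + 1)]

# Assumed starting point: 100 TB of protected data today.
low = project_capacity(100, 0.5, 3)    # 50 percent growth per year
high = project_capacity(100, 1.0, 3)   # data doubles every year

for year, (lo_tb, hi_tb) in enumerate(zip(low, high)):
    print(f"year {year}: {lo_tb:.1f} TB to {hi_tb:.1f} TB")
```

Under these assumptions, 100 TB becomes 337.5 TB after three years at 50 percent growth and 800 TB if data doubles annually, which is why a backup window sized for today rarely holds for long.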
Deduplication solutions are also valuable to disaster recovery (DR) efforts. Once the data is deduplicated, it is then transferred (or replicated) to the remote data center or offsite DR facility, ensuring that the most critical data is available at all times. Deduplication is crucial as it reduces storage and bandwidth costs, provides flexibility and data availability, and integrates with tape archival systems. Deduplication is a vital part of the future of data protection and needs to be integrated into the overall data protection strategy.
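The bandwidth saving comes from sending only the data the remote site does not already hold. The following is a simplified sketch of that idea, assuming chunks are identified by a SHA-256 fingerprint and that each site's store can be modeled as a dictionary. The `replicate` function and the sample chunks are hypothetical; real products use their own protocols.

```python
import hashlib

def fingerprint(chunk):
    """Identify a chunk by the SHA-256 hash of its contents."""
    return hashlib.sha256(chunk).hexdigest()

def replicate(local_chunks, remote_chunks):
    """Copy chunks that are missing at the remote site; return bytes sent."""
    bytes_sent = 0
    for fp, chunk in local_chunks.items():
        if fp not in remote_chunks:      # remote already has identical data?
            remote_chunks[fp] = chunk    # stand-in for a network transfer
            bytes_sent += len(chunk)
    return bytes_sent

# Illustrative data: the DR site already holds one of the three chunks.
local = {fingerprint(c): c for c in (b"AAAA", b"BBBB", b"CCCC")}
remote = {fingerprint(b"AAAA"): b"AAAA"}

first_pass = replicate(local, remote)    # only the two new chunks travel
second_pass = replicate(local, remote)   # nothing new, nothing sent
```

In this toy case the first replication sends 8 of the 12 bytes and a repeat run sends none, which is the effect that keeps WAN costs down at scale.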
In this article, I will dispel six myths attached to deduplication, bring clarity to the technology and outline the cost savings and efficiencies enterprises can reap.
Myth 1: Deduplication methodology is a life sentence with no chance of parole. Most enterprise IT admins feel that if they purchased a specific deduplication solution, they are stuck with that method for life.
Reality: Flexibility is at the core of modern deduplication solutions, which allow firms to choose the deduplication methods that are the best fit for specific data sets. Many companies offer portable solutions, similar to being able to move electronic music from one device to the next. By doing this, IT can align its backup policies with business goals.
Myth 2: Each server is its own island and there are no boats. The myth is that each server is its own island with separate deduplication processes and none of the islands talk to each other.
Reality: As the Internet has expanded our ability to communicate globally, deduplication solutions have also gone global to eliminate multiple copies of data. With global deduplication, the data from each node within the backup system is deduplicated against all the data in the repository, not just against that node’s own backups. Global deduplication spans multiple application sources, heterogeneous environments and storage protocols.
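One way to picture global deduplication is a single fingerprint index shared by every node. The sketch below is an assumption-laden illustration, not any vendor's implementation: it uses fixed-size chunks and SHA-256 fingerprints, and the `GlobalRepository` class, node names and sample data are all hypothetical.

```python
import hashlib

class GlobalRepository:
    """Toy repository with one chunk index shared by all backup nodes."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}        # fingerprint -> chunk bytes, stored once
        self.recipes = {}       # (node, name) -> ordered list of fingerprints
        self.logical_bytes = 0  # total bytes sent by all nodes

    def ingest(self, node, name, data):
        recipe = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            fp = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(fp, chunk)  # keep only the first copy
            recipe.append(fp)
        self.recipes[(node, name)] = recipe
        self.logical_bytes += len(data)

    def restore(self, node, name):
        return b"".join(self.chunks[fp] for fp in self.recipes[(node, name)])

    def stored_bytes(self):
        return sum(len(chunk) for chunk in self.chunks.values())

# Two nodes back up overlapping data into the same repository.
repo = GlobalRepository(chunk_size=4)
repo.ingest("node-a", "mail.bak", b"AAAABBBBCCCC")
repo.ingest("node-b", "files.bak", b"BBBBCCCCDDDD")
```

Here the two nodes send 24 bytes in total but the repository stores only 16, because the chunks that node-b shares with node-a are recognized and kept once. With per-server "islands," both copies would have been stored.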
Myth 3: I don’t have the money to swap out or upgrade my hardware, and even if I did, I would spend it on something else. The perception is that deduplication servers need to be replaced when space on the server runs out. The system doesn’t allow for upgrades. To increase capacity, companies need to exchange the equipment and implement more servers and memory.
Reality: Scalability is key to all IT environments, as the volume of data is growing exponentially. IT administrators must be able to add capacity to the backup target disk pool and build disk-to-disk-to-tape backup architectures around the deduplication system. Rather than requiring a swap-out replacement, deduplication repositories can scale as needed with cluster and storage expansions.
Myth 4: Deduplication slows down performance worse than my antivirus product. IT admins feel that the performance of their systems will slow down because there is too much work for the deduplication server to handle. This performance will hamper the entire backup environment and cause issues when data needs to be recovered quickly.
Reality: Deduplication can scale up to high speeds, and it can be deferred to post-processing, which takes the pressure off the backup window and speeds up the backup itself. In choosing a deduplication solution, IT administrators must consider how it will support the latest high-speed storage area networks (SANs). This is critical for achieving fast deduplication times. Those solutions with unique read-ahead technology provide fast data restore, even from deduplicated tapes.
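The post-processing approach can be sketched as two separate phases: land the backup quickly, then deduplicate later. The code below is a simplified model under stated assumptions (fixed-size chunks, SHA-256 fingerprints, everything held in memory). The `PostProcessTarget` class and its method names are hypothetical and do not describe any specific product.

```python
import hashlib

class PostProcessTarget:
    """Toy backup target that defers deduplication until after the backup."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.landing = {}   # name -> raw bytes awaiting deduplication
        self.chunks = {}    # fingerprint -> chunk bytes
        self.recipes = {}   # name -> ordered list of fingerprints

    def write_backup(self, name, data):
        # During the backup window: store the data as-is, with no hashing.
        self.landing[name] = data

    def deduplicate(self):
        # After the backup window: chunk, fingerprint and free the landing area.
        for name in list(self.landing):
            data = self.landing.pop(name)
            recipe = []
            for start in range(0, len(data), self.chunk_size):
                chunk = data[start:start + self.chunk_size]
                fp = hashlib.sha256(chunk).hexdigest()
                self.chunks.setdefault(fp, chunk)
                recipe.append(fp)
            self.recipes[name] = recipe

    def read_backup(self, name):
        # Restores work before and after deduplication has run.
        if name in self.landing:
            return self.landing[name]
        return b"".join(self.chunks[fp] for fp in self.recipes[name])

target = PostProcessTarget(chunk_size=4)
target.write_backup("monday", b"AAAABBBBAAAA")
before = target.read_backup("monday")
target.deduplicate()
after = target.read_backup("monday")
unique_bytes = sum(len(chunk) for chunk in target.chunks.values())
```

The design choice is a trade: the landing area temporarily needs room for the full backup, but the backup job itself never waits on fingerprinting. In this example the 12-byte backup shrinks to 8 unique bytes once the deferred pass has run, and it restores identically either way.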