In his most recent post for Dedupe Matters, Tony Asaro makes some predictions about the growing impact deduplication will have in the datacenter in 2009. I was particularly struck by the following statement:
"The perceived risk of implementing D2D backup with dedupe is all but gone and the value is glaringly clear. Additionally, the bad economy makes the value proposition that much more compelling."
This notion of perception is worth fleshing out, because it points to another question: When making a purchase decision, are you led by the technology or the vendor?
Let's explore an example of a technology-driven solution being deployed in production datacenters. In my last post I mentioned Data Domain OpenStorage (OST) integration with NetBackup, noting that our joint customers are achieving high throughput in production environments. High throughput has historically been achievable only by using lots of disk spindles behind a tape emulation interface (VTL). Data Domain's SISL™ architecture does it with very few disks (which is critical to delivering the cost benefits of deduplication, as discussed here), and the OST interface allows us to further optimize performance at the protocol level. In addition, using the OST interface with Data Domain systems gives NetBackup the ability to manage the WAN-efficient replication process between Data Domain systems. Called "Optimized Duplication", this is a key feature that lets administrators leverage every copy of their data in the distributed environment.
Looking deeper, these customers are also adopting 10 Gbps Ethernet to connect the NBU media servers to the Data Domain systems (4 Gbps FC doesn't seem so fast anymore, does it?), deploying their media servers as a scalable tier of data movers (with dynamic load balancing), and otherwise seeking out and addressing the bottlenecks within their infrastructure. This approach to systems architecture is typical of organizations interested in solving their storage and data protection challenges, and it results in a best-of-breed solution.
The process tends to propagate, as these same customers ask themselves how they can apply the Data Domain system to solve other problems. Why limit yourself to backup, when you can use the same data to automatically refresh reporting and development instances of databases, or ensure that a DR site thousands of miles beyond the reach of block-array-based replication technologies is updated daily? Why limit yourself to data protection, when you can use the same system for cost-effective tier 2 storage as well?
Our customers doing this today know that the limits dissolve when they allow the technology to point them toward the future. Ask yourself: whose lead will you follow?