A past conversation with a Fortune 1000 Data Domain customer revealed that, at their primary data center, they were maintaining no fewer than 15 different types of backup devices and associated software, specifically so that they could restore data for their remote sites. They had nearly 700 of these remote sites and a variety of regional data centers. For a number of reasons, including acquisitions, the company's backup infrastructure had evolved over time to become incredibly diverse and difficult to manage. To mitigate this, they had recently installed Data Domain systems at their primary and regional data centers, deployed in a multi-site replication topology. Their longer-term vision was to tackle the challenges they faced with their remote offices.
The story underscores the point that the replication needs of large organizations can be virtually boundless and can require tremendous configuration flexibility. Data Domain recently expanded the capabilities of its replicator software in two significant dimensions.
First, replication fan-in for many-to-one topologies now supports up to 180 remote sites, all replicating into a single system at a central hub site. One might wonder who would ever need that much, but the aforementioned customer is one of many I have spoken with who have such an environment.
Second, Data Domain systems now support cascaded replication. In other words, you can now replicate data from one location to a secondary location, and then from the secondary location to a third site. While this might be more than some customers require, it is requested more often than you might think.
Recently I was discussing Data Domain systems and technology at an energy company. Since the bulk of the conversation was about their two data center locations, I assumed that cascaded replication would be of little interest. It turns out I was way off target. The company had recently experienced a security-related 'incident' that exposed a weakness in its dual data center model. It realized that it needed stronger segregation of duties and is now moving towards adding a third data center with strong isolation from both the first and the second site. In the new model, sites A and B will selectively cross-replicate with each other, and then both replicated data sets will also replicate into a secured and hardened site C. Administrators at sites A and B will have neither physical nor administrative access to equipment at site C.
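To make the three-site model concrete, here is a small Python sketch that checks the two properties described above: every other site replicates into the hardened site, and that site shares no administrators with the others. The site labels follow the text, but the administrator names and the `check_vault` helper are hypothetical and only for illustration. This is not Data Domain configuration or an API of any product.

```python
# Hypothetical model of the three-site design described above.
# Site names follow the text; admin names and the checks are illustrative assumptions.
REPLICATION_LINKS = {("A", "B"), ("B", "A"), ("A", "C"), ("B", "C")}
ADMINS = {"A": {"alice"}, "B": {"bob"}, "C": {"carol"}}


def check_vault(vault, links, admins):
    """Check that every other site feeds the vault and shares no admins with it."""
    other_sites = [site for site in admins if site != vault]
    # Every non-vault site must have a direct replication link into the vault.
    all_feed_vault = all((site, vault) in links for site in other_sites)
    # Collect everyone who administers any non-vault site.
    other_admins = set().union(*(admins[site] for site in other_sites))
    admins_separated = admins[vault].isdisjoint(other_admins)
    return {"all_feed_vault": all_feed_vault, "admins_separated": admins_separated}


print(check_vault("C", REPLICATION_LINKS, ADMINS))
```

A check like this could run against whatever inventory an organization already keeps of its replication pairs and access lists. The point is only that the segregation-of-duties rule is simple enough to verify mechanically.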
However, I believe the main adopters of cascaded replication will be organizations that want a replication topology that maps precisely to their distributed site model. I see cascading fitting well into organizations with a combination of small remote sites, medium-sized regional hubs, and large global data centers. In this model, any number of small remote sites replicate into regional hubs. These regional hubs then replicate the remote site data, plus their own local data, to the larger global data centers for longer-term retention and whatever minimal tape creation may be required.
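The remote-to-hub-to-global model can be sketched as a simple directed graph. The Python below is a planning sketch under stated assumptions, not Data Domain tooling: the site names and helper functions are hypothetical, each site is assumed to replicate to at most one destination, and the 180-source fan-in limit is the figure quoted earlier in this post. It checks fan-in per destination and traces the cascade path from a remote site.

```python
from collections import defaultdict

MAX_FAN_IN = 180  # fan-in figure quoted in this post; an assumption for any other release


def build_topology(links):
    """Map each destination to the list of sources that replicate into it."""
    sources_by_dest = defaultdict(list)
    for source, dest in links:
        sources_by_dest[dest].append(source)
    return dict(sources_by_dest)


def fan_in_violations(sources_by_dest, limit=MAX_FAN_IN):
    """Return destinations that receive replication from more sources than the limit."""
    return {dest: len(srcs) for dest, srcs in sources_by_dest.items() if len(srcs) > limit}


def cascade_path(links, start):
    """Follow replication hops from a site until a site with no onward link is reached."""
    next_hop = dict(links)  # assumes each site replicates to at most one destination
    path = [start]
    seen = {start}
    while path[-1] in next_hop:
        nxt = next_hop[path[-1]]
        if nxt in seen:
            raise ValueError(f"replication loop detected at {nxt}")
        path.append(nxt)
        seen.add(nxt)
    return path


# Hypothetical layout: 200 remote sites split across two regional hubs, one global data center.
links = [(f"remote-{i:03d}", "hub-east" if i < 100 else "hub-west") for i in range(200)]
links += [("hub-east", "global-dc"), ("hub-west", "global-dc")]

topology = build_topology(links)
print(fan_in_violations(topology))        # empty dict: no hub exceeds the limit
print(cascade_path(links, "remote-042"))  # remote site, then its hub, then the global site
```

Splitting remote sites across regional hubs keeps each hub under the fan-in limit, while the cascade still delivers every remote data set to the global data center.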
Let me backtrack a little and point out that Data Domain already had an industry-leading replication capability before the recent enhancements. In fact, some competitors have been shipping a very limited and inflexible replication capability for only a few months, while others are still promising 'Replication 1.0' in the near future.
With Data Domain, IT architects can now design and implement whatever replication topology their businesses require even more easily. With a flexible replication technology, customers can mold their replication strategy around their environment with respect to WAN topology, regional affinities, and the desire to massively centralize or eliminate tape automation.