
Architecture
Fastest Backup/Shortest Backup Window
ExaGrid understands that deduplication is required, but how you implement it changes everything in backup. ExaGrid has a unique landing zone where backups can land straight to disk without any inline processing. Backups are fast and the backup window is short. Deduplication and offsite replication occur in parallel with the backups. Deduplication and replication never impede the backup process as they are always second-order priority. ExaGrid calls this “adaptive deduplication.”
Fastest Restores, Recoveries, VM boots and Tape Copies
Since backups write directly into the landing zone, the most recent backups are in their full undeduplicated form ready for any request. Local restores, instant VM recoveries, audit copies, tape copies, and all other requests do not require rehydration and are as fast as disk. As an example, instant VM recoveries occur in seconds to minutes versus hours for the inline deduplication approach.
Scalability: Fixed length Backup Window and Data Growth
ExaGrid provides full appliances (processor, memory, bandwidth, and disk) in a scale-out GRID. As data grows, all resources are added including additional landing zone, additional bandwidth, processor, and memory as well as disk capacity. The backup window stays fixed in length regardless of data growth, which eliminates expensive forklift upgrades. Unlike the inline, scale-up approach where you need to guess at which sized front-end controller is required, the ExaGrid approach allows you to simply pay as you grow by adding the appropriate sized appliances as your data grows. ExaGrid has 10 appliance models and any size appliance or any age appliance can be mixed and matched in a single GRID, which allows for IT departments to buy compute and capacity as they need it. This approach also eliminates product obsolescence.
Why ExaGrid Tiered Backup Storage Versus Backup Software Deduplication
Data deduplication enables the cost-effective use of disk as it greatly reduces the amount of disk required by only storing unique bytes or blocks from backup to backup. Over an average backup retention period, deduplication will use about 1/10th to 1/50th of the disk, depending on the mix of data types. On average, the deduplication ratio is 20:1.
All vendors need to offer data deduplication in order to reduce the amount of disk required and bring the cost close to that of tape. However, how deduplication is implemented changes everything about backup. Data deduplication reduces the amount of storage and also the amount of data replicated, saving storage and bandwidth costs; however, if not implemented correctly, it will create three new compute problems that greatly impact backup performance (backup window), restores and VM boots, and whether the backup window will stay fixed in length or grow as data grows.
Deduplication in backup software is typically performed on the client or agent, on the media server, or both.
The deduplication ratio for most backup software averages 2:1 to 8:1, much lower than that of hardware appliances (20:1), because the hardware is not dedicated to deduplication and software vendors therefore typically employ less aggressive deduplication algorithms. Depending on the vendor, deduplication in backup software delivers ratios of 2:1, 3:1, 4:1, 6:1 and possibly as high as 8:1. This means that anywhere from 2.5X to 10X the storage is required to store the same retention periods as a dedicated 20:1 appliance. The lower deduplication ratio implementations will also use a lot more WAN bandwidth. At 3 to 4 weeks of retention, the amount of storage and bandwidth will probably work; however, if you are keeping many weeks, months, and years of retention, the cost of storage and bandwidth using deduplication in the backup software is far too expensive. In some cases, such as Veeam and Commvault, deduplication can remain on and ExaGrid can take the deduplicated data and improve the deduplication ratio dramatically, such as 7:1 for Veeam and 3:1 for Commvault.
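The storage comparison above comes down to simple division. The following is a minimal sketch, assuming stored capacity equals the logical (pre-deduplication) data size divided by the deduplication ratio; the function names and the 1,000 TB figure are hypothetical and chosen only for illustration.

```python
def stored_tb(logical_tb: float, dedup_ratio: float) -> float:
    """Disk consumed after deduplication, for a ratio such as 20 (meaning 20:1)."""
    if dedup_ratio <= 0:
        raise ValueError("dedup_ratio must be positive")
    return logical_tb / dedup_ratio


def storage_multiple(software_ratio: float, appliance_ratio: float = 20.0) -> float:
    """How many times more disk the lower ratio needs than the higher one."""
    if software_ratio <= 0:
        raise ValueError("software_ratio must be positive")
    return appliance_ratio / software_ratio


if __name__ == "__main__":
    logical = 1000.0  # hypothetical: TB of retained backup data before deduplication
    for ratio in (2, 3, 4, 6, 8, 20):
        print(
            f"{ratio}:1 -> {stored_tb(logical, ratio):.0f} TB stored, "
            f"{storage_multiple(ratio):.1f}x the 20:1 footprint"
        )
```

The same multiple applies to replicated data, since only the deduplicated data crosses the WAN.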
Deduplication in the backup software deduplicates the backups inline during the backup process. Deduplication is a compute-intensive process and slows backups down, which results in a longer backup window. Furthermore, if deduplication occurs inline, then all the data on the disk is deduplicated and needs to be put back together, or “rehydrated,” for every request. Local restores, instant VM recoveries, audit copies, tape copies, and all other requests take hours to days. Furthermore, these solutions only add disk as data grows. Since additional compute resources are not added, as data grows, the backup window expands until the backup window becomes too long and then the media server has to be upgraded to a bigger, faster, and more expensive media server.

ExaGrid understands that deduplication is required, but how you implement it changes everything in backup. ExaGrid is Tiered Backup Storage. ExaGrid has a disk-cache Landing Zone without deduplication so that writing backups and performing restores is the same as using any disk. Backups are fast and the backup window is short. ExaGrid is typically 3X faster for backup ingest. Deduplication and offsite replication occur in parallel with the backups, using available unused resources, for a strong RPO (recovery point objective). ExaGrid stores long-term retention data in a tiered deduplication repository for long-term cost efficiency. Deduplication and replication never impede the backup process as they are always second-order priority. ExaGrid calls this “adaptive deduplication.”
Since backups write directly to the disk-cache Landing Zone, the most recent backups are in their full undeduplicated form ready for any request. Local restores, instant VM recoveries, audit copies, tape copies, and all other requests do not require rehydration and are as fast as disk. As an example, instant VM recoveries occur in seconds to minutes versus hours for the inline deduplication approach.
ExaGrid provides full appliances (processor, memory, bandwidth, and disk) in a scale-out system. As data grows, all resources are added, including additional Landing Zone, bandwidth, processor, and memory as well as disk capacity. The backup window stays fixed in length regardless of data growth, which eliminates expensive server upgrades. Unlike the inline, scale-up approach where you need to guess at how much server hardware and storage is required, the ExaGrid approach allows you to simply pay as you grow by adding the appropriately sized appliances as your data grows.
ExaGrid has eight appliance models and any size or age appliance can be mixed and matched in a single system, which allows IT departments to buy compute and capacity as they need it. This evergreen approach also eliminates product obsolescence.

ExaGrid thought through data deduplication implementation and created an architecture that provides the backup and restore speed of disk with a tiered long-term deduplicated repository. The combination is the best of both worlds: it delivers the fastest backups, restores, recoveries and tape copies; keeps the backup window fixed as data grows; and eliminates forklift upgrades and obsolescence, while allowing IT staff to buy what they need as they need it. There is no downside and only upside. ExaGrid Tiered Backup Storage provides 3X the backup performance, up to 20X the restore and VM boot performance, and a backup window that stays fixed in length as data grows.
Why ExaGrid Tiered Backup Storage Versus Traditional Inline Disk-based Backup Storage Appliances
Data deduplication enables the cost-effective use of disk because it reduces the amount of disk required by only storing unique bytes or blocks from backup to backup. Over an average backup retention period, deduplication will use about 1/10th to 1/50th of the disk capacity, depending on the mix of data types. On average, the deduplication ratio is 20:1.
All vendors need to offer data deduplication in order to reduce the amount of disk required and bring the cost close to that of tape. However, how deduplication is implemented changes everything about backup. Data deduplication reduces the amount of storage and also the amount of data replicated, saving costs in storage and bandwidth. However, if not implemented correctly, deduplication will create three new compute problems that greatly impact backup performance (backup window), restores and VM boots, and whether the backup window stays fixed or grows as data grows.

Alternate approaches deduplicate backups “inline,” or during the backup process. Deduplication is compute intensive and inherently slows backups, resulting in a longer backup window. Some vendors put software on the backup servers in order to use additional compute to help keep up, but this steals compute from the backup environment. If you divide the specified full backup size by the published ingest performance, the resulting backup time shows that products with inline deduplication cannot keep up with their own rated capacity. All of the deduplication in the backup applications is inline, and all the large-brand deduplication appliances also use the inline approach. All of these products slow down backups, resulting in a longer backup window.
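The ingest calculation mentioned above can be sketched as follows. This is a rough illustration only, assuming a sustained ingest rate for the whole job; the function names and every figure in the example are hypothetical and not taken from any vendor's specification sheet.

```python
def backup_window_hours(full_backup_tb: float, ingest_tb_per_hour: float) -> float:
    """Hours needed to ingest one full backup at a sustained ingest rate."""
    if ingest_tb_per_hour <= 0:
        raise ValueError("ingest_tb_per_hour must be positive")
    return full_backup_tb / ingest_tb_per_hour


def fits_window(full_backup_tb: float, ingest_tb_per_hour: float,
                window_hours: float) -> bool:
    """True if the full backup completes within the allowed backup window."""
    return backup_window_hours(full_backup_tb, ingest_tb_per_hour) <= window_hours


if __name__ == "__main__":
    # Hypothetical product: rated for a 400 TB full backup at 25 TB/hour.
    hours = backup_window_hours(400.0, 25.0)
    print(f"Full backup takes {hours:.1f} hours")
    print("Fits a 12-hour window:", fits_window(400.0, 25.0, 12.0))
```

Running the same division against a product's published ingest rate and maximum full backup size gives a quick check of whether it can finish inside the available window.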
In addition, if deduplication occurs inline, then all of the data on the disk is deduplicated and needs to be put back together, or “rehydrated,” for every request. This means that local restores, instant VM recoveries, audit copies, tape copies and all other requests will take hours to days. Most environments need VM boot times of single-digit minutes; however, with a pool of deduplicated data, a VM boot can take hours due to the time it takes to rehydrate the data. All of the deduplication in the backup applications as well as the large-brand deduplication appliances store only deduplicated data. All of these products are very slow for restores, offsite tape copies, and VM boots.


Furthermore, many of these solutions employ a scale-up architecture with a front-end controller and disk shelves. As data grows, only disk shelves are added, which expands the backup window until the backup window becomes too long and the front-end controller needs to be replaced with a bigger, faster, and more expensive front-end controller, called a “forklift upgrade.” All of the backup applications and large-brand deduplication appliances use the scale-up approach whether in software or in a hardware appliance. With all of these solutions, as the data grows, the backup window does as well.
ExaGrid’s Tiered Backup Storage has implemented a best of both worlds approach with a disk-cache Landing Zone for fast backups and restores tiered to a long-term deduplicated data repository. Each ExaGrid appliance has a unique Landing Zone where backups land straight to disk without any inline processing, so backups are fast and the backup window is short. ExaGrid is typically 3X faster for backup ingest. Deduplication and offsite replication occur in parallel with backups for a strong RPO (recovery point) and never impede the backup process as they are always second order priority. ExaGrid calls this “adaptive deduplication.”
Since backups write directly to the Landing Zone, the most recent backups are in their full undeduplicated form ready for any restore request, just as they would be on any low-cost primary storage disk. Local restores, instant VM recoveries, audit copies, tape copies, and all other requests do not require rehydration and are as fast as disk. As an example, instant VM recoveries occur in seconds to minutes versus hours when using the inline deduplication approach.
ExaGrid provides full appliances (processor, memory, bandwidth, and disk) in a scale-out system. As data grows, all resources are added including additional Landing Zone, bandwidth, processor, and memory as well as disk capacity. This keeps the backup window fixed in length regardless of data growth, which eliminates expensive forklift upgrades. Unlike the inline, scale-up approach where you need to guess at which sized front-end controller is required, the ExaGrid approach allows you to simply pay as you grow by adding the appropriate sized appliances as your data grows. ExaGrid offers eight appliance models, and any size or age appliance can be mixed and matched in a single system, which allows IT departments to buy compute and capacity as they need it. This evergreen approach also eliminates product obsolescence.
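The difference between the two growth models can be illustrated with a simplified calculation. This sketch assumes that a scale-up system has one controller with a fixed ingest rate, and that in a scale-out system aggregate ingest grows linearly with the number of appliances. The function names, capacities, and rates are hypothetical and do not describe any actual appliance model.

```python
import math


def scale_up_window(data_tb: float, controller_ingest_tb_per_hour: float) -> float:
    """Backup window when a single fixed controller handles all ingest."""
    return data_tb / controller_ingest_tb_per_hour


def scale_out_window(data_tb: float, appliance_capacity_tb: float,
                     appliance_ingest_tb_per_hour: float) -> float:
    """Backup window when each added appliance brings its own ingest bandwidth."""
    # Assumption: enough appliances are added to hold the data, and ingest
    # is spread evenly across all of them.
    appliances = max(1, math.ceil(data_tb / appliance_capacity_tb))
    return data_tb / (appliances * appliance_ingest_tb_per_hour)


if __name__ == "__main__":
    for data_tb in (100.0, 200.0, 400.0):
        up = scale_up_window(data_tb, 10.0)
        out = scale_out_window(data_tb, 100.0, 10.0)
        print(f"{data_tb:.0f} TB: scale-up {up:.0f} h, scale-out {out:.0f} h")
```

Under these assumptions the scale-up window grows in proportion to the data, while the scale-out window never exceeds one appliance's capacity divided by its ingest rate.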
When architecting its appliances, ExaGrid thought through how to combine the performance of low-cost primary storage disk with a long-term retention deduplicated data repository for the lowest cost. This approach is designed and optimized to provide the fastest backups, restores, recoveries, and tape copies; to keep the backup window fixed in length, even as data volumes grow; and to eliminate forklift upgrades and product obsolescence, all while allowing IT staff the flexibility to buy what they need as they need it. ExaGrid’s appliances deliver 3X the backup performance, up to 20X the restore and VM boot performance, and a backup window that stays fixed in length as data grows, all at the lowest cost.