What’s the Real Difference Between Data Archives and Backup Data?

March 9

Until recently, I saw data archives as a subset of general Backup and Disaster Recovery (BDR). I based this assumption on personal experience. I’ve tried to save everything that has made up my digital life, from every Microsoft Word file I’ve created since 1993 (many of which I can no longer open) to the thousands of photos I’ve taken on my iPhone over the last six years. My personal BDR strategy is solid, although I do struggle to find older files, most of which lack identifying tags or dates.

But a recent conversation with StorageCraft technical marketing manager Steve Snyder helped me understand that most businesses need to know the difference between data used for BDR purposes and archived data. Ultimately, the difference comes down to your objectives for that data.

Both BDR and data archiving focus on protecting and saving data; however, their objectives differ. BDR, not surprisingly, is designed to quickly recover currently used data that has been lost or corrupted. TechTarget’s Curtis Preston explains:

If you accidentally deleted a file or a bunch of files, or you had a double disk failure in a RAID 5 array, and you need to restore things to the way they previously looked, that’s what backups are for. Or say something bad happened to your files yesterday, you might want to restore them to two days ago or three days ago. Or if you want to get a version of a file from a few weeks or months ago, you can use your backup application.

In other words, you want your BDR solution to recover this data fast, so you can get back to work. Its primary purpose is to maintain your business’s availability, whether you’re an online retailer or a dentist. BDR software like StorageCraft’s ShadowProtect helps you create backup images in multiple places, including onsite, at a nearby site, and in a highly available cloud environment. You want both your RTO (Recovery Time Objective), the time it takes to get running again, and your RPO (Recovery Point Objective), the amount of recent data you can afford to lose, to be as small as possible.
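To make the two objectives concrete, here is a minimal Python sketch. The function names and the idea of checking a single incident against fixed targets are my own illustration, not part of ShadowProtect or any other product. It treats RPO as the maximum tolerable gap between the last good backup and the failure, and RTO as the maximum tolerable time spent restoring.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, failure: datetime) -> timedelta:
    """Work created after the last backup is lost; RPO is a cap on this gap."""
    return failure - last_backup


def meets_objectives(
    last_backup: datetime,
    failure: datetime,
    restore_duration: timedelta,
    rpo: timedelta,
    rto: timedelta,
) -> bool:
    """Return True when both the data-loss gap and the restore time are within target."""
    within_rpo = data_loss_window(last_backup, failure) <= rpo
    within_rto = restore_duration <= rto
    return within_rpo and within_rto
```

For example, with a four-hour RPO and a two-hour RTO, a failure three hours after the last backup followed by a one-hour restore passes the check, while a failure five hours after the last backup does not.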

In contrast, fast recovery speed does not typically top the feature list of a good data archiving solution. Instead, search functionality (the ability to quickly find certain types of data using specific parameters) tops the list. This capability is essential for handling e-discovery demands and proving that you comply with your industry’s regulations.

Preston says that someone suing or investigating your business doesn’t care about your backup image. Instead, they will ask for documents with specific keywords or email conversations between certain parties over a specified period of time. A good data archive provides a complete history of files, including:

  • Where they were located
  • Time period they were active
  • List of people who made changes
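As a rough illustration of why that history matters, the sketch below models an archive entry and a keyword search in Python. Everything here is hypothetical: the `ArchiveRecord` fields simply mirror the three items above plus the document text, and a real archiving product would use a proper index rather than a linear scan.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class ArchiveRecord:
    location: str        # where the file was located
    active_from: date    # start of the period it was active
    active_to: date      # end of the period it was active
    editors: List[str]   # people who made changes
    text: str = ""       # extracted content, used for keyword matching


def search(
    records: List[ArchiveRecord],
    keyword: str,
    start: date,
    end: date,
    editor: Optional[str] = None,
) -> List[ArchiveRecord]:
    """Return records containing the keyword that were active at any point
    between start and end, optionally limited to one editor."""
    needle = keyword.lower()
    return [
        r
        for r in records
        if needle in r.text.lower()
        and r.active_from <= end      # record's period overlaps the window
        and r.active_to >= start
        and (editor is None or editor in r.editors)
    ]
```

A request such as "every document mentioning the merger that was active in the spring of 2010" becomes a single call to `search`, with no restore step involved.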

Why would such a task be so much harder to accomplish using BDR software? Preston offers a good example:

Let’s say you have a full backup of Exchange every week for the last seven years. Then someone comes to you and says, “I want all of these emails with this word in them.” What you’re going to need if you want to extract this information with a backup application is [to] restore the entire Exchange server and then extract out of that Exchange server the files that you need from seven years ago. Then you’re going to need to restore Exchange again to seven years ago minus a week, and do that all over again and over again, and in this case, roughly 150 times. Then you’re going to have to extract from it what you need. So satisfying archive requests with data backup and recovery software is something you’ll only do once. You’ll try it and then say to yourself, “We should have used archive software to satisfy this requirement.”
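The arithmetic behind that example is simple: every retained full backup means one more restore-and-extract cycle. The Python sketch below is only an illustration with hypothetical function names. The count depends on how many backups are actually retained, so the figure it produces, which assumes 52 weekly fulls per year all kept for seven years, will not necessarily match the number in the quote.

```python
def restore_cycles(years: int, backups_per_year: int = 52) -> int:
    """Number of full restores needed to inspect every retained backup."""
    return years * backups_per_year


def total_hours(cycles: int, hours_per_cycle: float) -> float:
    """Total effort if each restore-and-extract cycle takes a fixed time."""
    return cycles * hours_per_cycle


# Assumed retention: every weekly full backup kept for seven years.
cycles = restore_cycles(7)
# Assumed effort: two hours per restore and extraction (a made-up figure).
hours = total_hours(cycles, 2.0)
print(cycles, hours)
```

Under those assumptions the request takes 364 restore cycles, while a searchable archive answers the same question with one query.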

In my next post, I will discuss some best practices for managing archival data. Meanwhile, if you have something to add, please let us know in the comments or on Twitter!

Photo credit: DRs Kulturarvsprojekt via Flickr