3 Key Features of Good Data Archiving Software

March 23

Now that we’ve outlined the key differences between archival data and BDR data, let’s check out what bare-bones features you need in a useful data archiving solution. We already know “searchability” is the top feature of any good archiving software, but what specific attributes does your software need to give you a meaningful level of search functionality?

1. Granularity

First off, you need granular search. IT consultant Brien Posey writes that this capability is essential, especially during the e-discovery process. Because this process typically involves examining huge amounts of data to find relevant information, your software must be able to search by:

  • Data type, such as email, PDFs, Microsoft Office documents, and other files
  • Data source, so that a single search can pull results from, say, Microsoft SharePoint, a specific file server, and your financials app
  • Document author
  • Key pieces of data within the files, including credit card numbers, bank account numbers, and Social Security numbers
  • Data that matches a specific data structure instead of a specific item, such as data containing any Social Security number rather than a specific Social Security number
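To make the last two bullets concrete, here is a minimal Python sketch of the difference between searching for a specific item and searching for a data structure. Everything in it is an assumption for illustration: the `ArchivedItem` record, the `search` function, and the simplified Social Security number pattern are hypothetical and are not taken from Posey or from any particular archiving product.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ArchivedItem:
    """Hypothetical archive record; real products store far richer metadata."""
    source: str      # e.g. "sharepoint" or "fileserver"
    data_type: str   # e.g. "email" or "pdf"
    author: str
    text: str


# Simplified, assumed pattern: three digits, two digits, four digits, hyphenated.
# It matches the *shape* of a Social Security number, not one specific number.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def search(items, data_type: Optional[str] = None, source: Optional[str] = None,
           author: Optional[str] = None, contains: Optional[str] = None,
           pattern: Optional[re.Pattern] = None):
    """Return the items that satisfy every filter that was supplied."""
    results = []
    for item in items:
        if data_type is not None and item.data_type != data_type:
            continue
        if source is not None and item.source != source:
            continue
        if author is not None and item.author != author:
            continue
        # Specific item: the exact string must appear in the text.
        if contains is not None and contains not in item.text:
            continue
        # Data structure: anything with the right shape counts as a hit.
        if pattern is not None and not pattern.search(item.text):
            continue
        results.append(item)
    return results
```

Calling `search(items, contains="123-45-6789")` looks for one specific number, while `search(items, pattern=SSN_PATTERN)` returns every item that contains anything shaped like a Social Security number. The other filters narrow either search by type, source, or author.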

2. Storage Optimization

After search, the next most important functionality you need is what research firm Gartner calls “overall storage optimization.” As Gartner points out, this feature allows you to:

…reduce the volume of data in production and maintain seamless data access. The benefits of using this technology include reduced capital and operating expenditures, improved information governance, lower risk, and access to secondary data for reporting and analysis.

In other words, as your data archives grow, you don’t want to find multiple copies of the same file in a search result. Therefore, a powerful deduplication engine (which, according to Posey, is present in almost all modern archiving software) goes a long way toward preventing such a situation.
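As a rough illustration of the idea behind deduplication, the Python sketch below stores each unique piece of content once, keyed by its SHA-256 digest, and keeps an index from file path to digest. This is an assumed, whole-file approach chosen for brevity. The function names are hypothetical, and commercial deduplication engines may work quite differently (for example, at the block level).

```python
import hashlib


def content_digest(data: bytes) -> str:
    """Identify content by its SHA-256 digest."""
    return hashlib.sha256(data).hexdigest()


def deduplicate(files: dict) -> tuple:
    """Store each unique content once.

    `files` maps a path to its bytes. Returns (store, index), where
    store maps digest -> bytes and index maps path -> digest.
    """
    store = {}
    index = {}
    for path, data in files.items():
        digest = content_digest(data)
        # Keep only the first copy of any given content.
        store.setdefault(digest, data)
        index[path] = digest
    return store, index
```

With this layout, two paths that hold identical bytes point to the same stored copy, so a search over the store surfaces that content once rather than once per copy.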

Nor do you want to find out that you’re still storing terabytes of unchanged, ready-to-archive data in your main storage or backup storage, because top-tier, highly available storage is significantly pricier than archival alternatives. StorageCraft’s Casey Morgan discusses this topic in a recent post:

Archiving can be extremely inexpensive, particularly with services like Amazon Glacier. This is great for data that needs to be saved for long periods and isn’t likely to be accessed. Note that it’s cheap because it can take several hours to access data in Glacier—that’s not ideal for a recovery scenario. Still, it’s a decent option if you have hypersensitive stuff that you need “just in case,” but not for keeping business-critical information.
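One simple way to spot "unchanged, ready-to-archive" data is by age: anything that has not been modified for a set period becomes a candidate for cheaper archival storage. The Python sketch below shows that idea only. The `archive_candidates` function and the one-year default are assumptions for illustration, and a real policy would likely weigh access patterns and retention rules as well.

```python
from datetime import datetime, timedelta


def archive_candidates(last_modified: dict, now: datetime,
                       max_age_days: int = 365) -> list:
    """Return paths whose last modification is older than the cutoff.

    `last_modified` maps a path to a datetime. The 365-day default is
    an arbitrary, assumed threshold.
    """
    cutoff = now - timedelta(days=max_age_days)
    return sorted(path for path, modified in last_modified.items()
                  if modified < cutoff)
```

Passing `now` in explicitly, instead of reading the clock inside the function, keeps the policy easy to test and to rerun against historical snapshots.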

3. Flexibility

Finally (for the purposes of this post), a good data archiving solution needs to be flexible. Posey defines flexibility as being able to support as many data platforms as possible:

While there’s no such thing as a universal archive product, there are archival products on the market that are designed to work with a number of popular applications and platforms. Some of these even include the ability to archive social networking data, such as the contents of an organization’s Facebook page.

Furthermore, it should handle a wide variety of data sources and archive targets, and it should be able to write extracted data to a range of media. As Posey notes, an organization cannot be limited to, say, tape archives, especially when so many archival options (including cloud options like Glacier) are available:

A good archival product should allow you to write archives to disk, tape, the cloud or any other medium… Similarly, diverse media should be supported for archive retrieval. When data is extracted from the archives, you might want to write that data to tape, DVD/Blu-ray or some other medium.
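In software terms, this kind of media flexibility usually comes down to writing archives through a common interface instead of hard-coding one destination. The Python sketch below is a hypothetical illustration of that design, not any vendor's API: `ArchiveTarget`, `DirectoryTarget`, `InMemoryTarget`, and `archive` are invented names, and tape or cloud targets would be additional classes with the same `write` method.

```python
from pathlib import Path
from typing import Protocol


class ArchiveTarget(Protocol):
    """Anything that can accept a named blob of archived data."""

    def write(self, name: str, data: bytes) -> None: ...


class DirectoryTarget:
    """Writes archives to a local directory (standing in for disk)."""

    def __init__(self, root) -> None:
        self.root = Path(root)

    def write(self, name: str, data: bytes) -> None:
        self.root.mkdir(parents=True, exist_ok=True)
        (self.root / name).write_bytes(data)


class InMemoryTarget:
    """Keeps archives in a dict; a placeholder for another medium."""

    def __init__(self) -> None:
        self.items = {}

    def write(self, name: str, data: bytes) -> None:
        self.items[name] = data


def archive(name: str, data: bytes, targets) -> None:
    """Send the same archive to every configured target."""
    for target in targets:
        target.write(name, data)
```

Because `archive` depends only on the `write` method, adding a new medium means adding a new target class and leaves the rest of the code untouched.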

Actually, great data archiving software needs even more than these key features, most notably automation and help with the particulars of varying compliance regulations. I hope to discuss those in a future post.

Meanwhile, if you have something to add, please let us know in the comments or on Twitter!

