Backup, Archive, or Delete? A Primer on Data Types and Retention

June 15

Note: This article also appears on The VAR Guy.

Every business has data that multiplies like rabbits. Storage is relatively inexpensive, but when so much new data is being created, exponential growth still gets costly. How do we reduce storage costs (and clutter, for that matter) when data grows like crazy?

The trouble with all the data we create is that a lot of it just isn’t that useful. The deluge of cat emails in my Outlook inbox isn’t really doing anybody any good (not in a business sense, at least), but the email thread from my CEO about a hyper-critical project can’t be disposed of. Organizing and deleting is really a matter of prioritization and giving careful thought to data types and their associated retention policies.

Mission-critical data

This category includes data that can’t be recreated: data you can’t do business without. Things like contacts, legal agreements, credit card info, IP, tax documents, and whatever makes your company tick. This data should be backed up locally and stored in the cloud (ideally mirrored to a second data center) just in case. Don’t take any chances with this type of data; store it in a few places, and use quality hardware to store it on. Oh, and don’t forget encryption.

Necessary data

Necessary data consists of things that shouldn’t be deleted, but could be recreated if need be. This is the email thread with a vendor or the ten-page report you’ve been working on. If you’re talking about data stored on a workstation, this data should be saved somewhere that’s being backed up. It’s important for employees to know what to expect from backups. For instance, some companies don’t back up all workstations. Instead, employees are expected to store essential files on a drive that is being backed up. In the event that an employee has a hard drive failure, their data will be toast if they didn’t store it in the right place. Make sure employees know what to expect and what will happen if they don’t follow standards (see “Storage policies”).

Rarely-accessed data

Certain things that need to be retained for a few years (tax or other documents, particularly for compliance-heavy industries like legal and finance) aren’t likely to be accessed, but are still important. These can safely be stored in something like a cloud archive, but keeping local copies is important too; having data in a single place is not a smart idea. With cloud archive, the key words are “rarely accessed.” The real benefit of services like these is that storage is cheap; the trade-off is that retrieving the data isn’t a quick process. It’s really just somewhere to keep data you may or may not need. Keep that in mind.

Might-be-useful data

When prioritizing storage space or trying to save on storage costs, this is the first data to go. This might include old statistics, legacy documents that no longer apply to current offerings, things of that nature. I like to ask myself a few questions when organizing storage areas. Have I used this in the last six months? Will I use it in the next six to twelve months? If the answer to both is no, it might not be worth hanging on to.
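The first of those questions can be roughly automated. Here is a minimal sketch, assuming that a file's last-modified time is a fair stand-in for "used" (access times are often not recorded reliably) and that about 182 days counts as six months. The function name `stale_files` and the cutoff are illustrative choices, not a standard tool. It only lists candidates for review; nothing is deleted.

```python
import time
from pathlib import Path

# Assumed cutoff: roughly six months, expressed in seconds.
SIX_MONTHS_SECONDS = 182 * 24 * 60 * 60


def stale_files(root, max_age_seconds=SIX_MONTHS_SECONDS, now=None):
    """Return files under `root` not modified within the cutoff, sorted by path."""
    now = time.time() if now is None else now
    stale = []
    for path in Path(root).rglob("*"):
        # Skip directories; compare each file's age against the cutoff.
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
            stale.append(path)
    return sorted(stale)
```

Calling `stale_files("/path/to/share")` gives you a list to eyeball. The second question (will I need this in the next six to twelve months?) still takes a human.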

Useless data

What counts as useless really depends on who you ask, which can make it difficult to decide. Some may say that the nearly 6K emails I have saved in my deleted folder are useless, but what happens when I need something I deleted? One person’s useless data is someone else’s might-be-useful data. In general, though, useless data is just that: data that is completely useless. Delete the temporary files, setup files for programs you installed a year ago, cat videos (*sniff*), hilarious gifs (double *sniff*), old notes, and any other garbage that just isn’t useful. It’s sad (poor kitties), but it must be done.

Storage policies

When you have a lot of people storing a lot of things in one place, you’re bound to get some junk. For a server, it’s tough to manage and organize everything, especially if it’s a network share or a place where multiple employees collaborate and store information. One way to reduce storage footprint is to establish policies that govern what types of information should be saved. For instance, perhaps rather than saving all versions of a particular project, any initial drafts are deleted once the final is complete. Taking some time to establish these policies can reduce footprint, particularly for employees who handle multiple drafts of video, images, and other large files. If the data is no longer needed, why keep it around?
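A policy like "drafts go once the final is done" is easier to enforce when file names follow a convention. The sketch below assumes a purely hypothetical one, `<project>_draft<N>.<ext>` for drafts and `<project>_final.<ext>` for the finished piece. It reports which drafts have a matching final. The function name `drafts_safe_to_delete` is made up for illustration.

```python
import re

# Hypothetical naming convention, assumed for this example only.
DRAFT_RE = re.compile(r"^(?P<project>.+)_draft\d+\.[^.]+$")
FINAL_RE = re.compile(r"^(?P<project>.+)_final\.[^.]+$")


def drafts_safe_to_delete(filenames):
    """Return draft file names whose project already has a final version."""
    # First pass: collect every project that has a final file.
    finished = set()
    for name in filenames:
        match = FINAL_RE.match(name)
        if match:
            finished.add(match.group("project"))

    # Second pass: keep only drafts that belong to a finished project.
    deletable = []
    for name in filenames:
        match = DRAFT_RE.match(name)
        if match and match.group("project") in finished:
            deletable.append(name)
    return deletable
```

Given `["ad_draft1.mp4", "ad_draft2.mp4", "ad_final.mp4", "logo_draft1.png"]`, only the two `ad` drafts come back, since the logo has no final yet.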

Retention policies

Thinking about retention is a matter of asking yourself: how long do you reasonably need to keep data, and how far back should the data go? Do you need things from ten years ago? Five? One? With things like tax documents or other crucial items, you probably can’t get rid of them for a few years. But that doesn’t mean you need to keep everything for that long. Various types of data will require different approaches to retention. Think carefully about the types of data that need to stay around a while and the types that ought to be tossed after a certain amount of time. Don’t forget that retention doesn’t just apply to production data, but to backups as well. A lengthy retention policy is a good way to go back in time (so to speak) and recover a long-deleted document, but it’s also a good way to see storage costs swell.
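To make this concrete, a retention policy can be written down as a simple lookup from data type to how long it is kept. The sketch below uses the categories from this article, but the retention periods are placeholder numbers invented for the example. Real values depend on your legal, compliance, and business requirements.

```python
from datetime import date, timedelta

# Placeholder retention periods in days (assumptions, not recommendations).
# None means "keep indefinitely".
RETENTION_DAYS = {
    "mission_critical": None,
    "necessary": 3 * 365,
    "rarely_accessed": 7 * 365,
    "might_be_useful": 365,
    "useless": 0,
}


def is_expired(category, created, today=None):
    """Return True if an item of this category has outlived its retention period."""
    today = today or date.today()
    days = RETENTION_DAYS[category]
    if days is None:
        return False  # never expires
    return today - created >= timedelta(days=days)
```

The same check can be run against backup sets as well as production data, which is where long retention periods tend to quietly inflate storage costs.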

Any MSP, admin, or end user can benefit from taking the time to do a little spring cleaning on their hardware storage. Keeping storage lean and mean is a good way to reduce storage costs and keep systems performing well.