7 data archiving best practices for backup admins

Between retention policies, regulatory compliance and limited storage budgets, knowing what data to keep is critical. Follow these seven best practices for a solid archiving strategy.

Every storage manager faces the issue of accommodating and storing ever-expanding data sets. When backup and retention policies are involved, it becomes even more complicated.

Primary storage tends to be expensive and has a finite capacity, so many organizations move older data to an archive. This practice helps to free up space on an organization's primary storage and to make room for new data.

On the surface, the concept of archiving data is simple. In practice, it often proves to be quite challenging. Careful planning is required before moving the first bits of data, so be sure to follow these seven data archiving best practices.

1. Identify and source data

The first critical archiving step is to identify all the organization's data, as well as where that data resides. It is impossible to adequately protect data if you don't even know that the data exists.

Backup admins must keep track of more than just where the data resides to archive it efficiently. They must also identify how the data is being used by the organization. Different data sets are used in different ways, so an archiving policy that is perfectly aligned to one data set might not necessarily work for another.

If, for example, one department actively uses documents for 30 days, then it might be appropriate to archive documents after a month. If, however, another department actively uses documents for six months, then archiving after one month would create significant problems for that department.
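The department example can be expressed as a simple age check. The following is a minimal sketch, not a reference to any particular archiving product: the department names, the `ACTIVE_USE_DAYS` table and the `is_archive_candidate` function are all hypothetical, and six months is approximated as 180 days.

```python
from datetime import date, timedelta

# Hypothetical active-use windows per department, in days.
# 30 days and ~six months (180 days) mirror the example in the text.
ACTIVE_USE_DAYS = {"sales": 30, "engineering": 180}

def is_archive_candidate(department: str, last_used: date, today: date) -> bool:
    """Return True once a document has gone unused longer than its department's window."""
    window = timedelta(days=ACTIVE_USE_DAYS[department])
    return today - last_used > window

today = date(2024, 7, 1)
last_used = date(2024, 5, 1)  # 61 days before `today`
print(is_archive_candidate("sales", last_used, today))        # True
print(is_archive_candidate("engineering", last_used, today))  # False
```

The same document is ready for the archive under one department's policy and still active under the other's, which is why a single organization-wide threshold tends to cause problems.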

2. Keep thorough documentation of data

Another key data archiving best practice is to keep thorough documentation of data. This documentation can help when crafting an audit policy and performing other important tasks. For example, an organization may need to periodically review the documentation as new data sets are created or when old data sets are phased out. Compliance auditors might also require this documentation.

3. Know what is good for archiving vs. backup

Although there are similarities between them, archiving and backup are two different tasks that serve two entirely different purposes. Backups are a mechanism for restoring data that has been accidentally deleted or modified, or that has been lost as a result of a hardware failure or cyberattack. Backups are primarily designed to protect an organization's active data.

Archives ensure that older data is retained for a specific period of time, even if that data is no longer being actively used by the organization. This retention period may be tied to business requirements or regulatory policies.

The important thing to keep in mind is that all of the data that exists today will presumably be archived at some point. Still, the archival process is primarily designed to protect aging data, whereas backups are mostly meant to protect the data that an organization uses every day.

4. Use the right tools

Another data archiving best practice is to make sure that you use the right tools for the job. Organizations can choose between numerous data archival tools. Some of these tools may be integrated into software that the organization is already using, while others exist as standalone applications.

Not all archive tools are created equal. It is important to do the research and choose an archive tool with a feature set that matches an organization's archival needs. For example, an organization that is subject to stringent regulatory requirements might need an archival tool that works with immutable storage.

5. Craft a strong archive policy

The next best practice is to craft a strong archive policy. The policy must be unambiguous about which data is to be archived, why that data needs to be archived, where archive data is to be stored, who has access to the archived data and more. As you create the organization's archival policy, get the various stakeholders to sign off on it, especially as it relates to data lifecycle management.
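One way to keep a policy unambiguous is to record it in a structured form and check that no element is left blank. The sketch below is only an illustration under assumed names: the `ArchivePolicy` fields, the `missing_fields` helper and the sample values are hypothetical, not part of any standard or product.

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class ArchivePolicy:
    data_set: str            # what is to be archived
    reason: str              # why it needs to be archived
    storage_location: str    # where archive data is stored
    authorized_roles: tuple  # who has access to the archived data
    approved_by: tuple       # stakeholders who have signed off

def missing_fields(policy: ArchivePolicy) -> list:
    """Return the names of any fields left empty, so gaps are caught early."""
    return [f.name for f in fields(policy) if not getattr(policy, f.name)]

policy = ArchivePolicy(
    data_set="closed customer invoices",
    reason="finance retention requirement",
    storage_location="archive-tier-1",
    authorized_roles=("finance-audit",),
    approved_by=(),  # no stakeholder sign-off recorded yet
)
print(missing_fields(policy))  # ['approved_by']
```

A check like this does not replace the written policy, but it makes a missing sign-off or an unspecified storage location visible before the first data is moved.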

6. Synchronize archiving with data lifecycle management

A data archiving policy must directly align with the organization's data lifecycle management policy. Ideally, the archiving mechanism should even be able to automatically purge old data that has exceeded its required retention period.

There is a direct cost associated with data storage, so retaining data that is no longer needed can incur unnecessary expenses. Additionally, old data can be subpoenaed if the organization is ever subject to litigation. As such, organizations often try to limit their legal exposure by purging data that is no longer needed.
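The purge decision itself comes down to comparing each item's age in the archive with its required retention period. Here is a minimal sketch under stated assumptions: `ArchivedItem`, `purge_candidates` and the sample retention values are hypothetical, and a real archiving tool would apply its own rules on top of this comparison.

```python
from datetime import date, timedelta
from typing import NamedTuple

class ArchivedItem(NamedTuple):
    name: str
    archived_on: date
    retention_days: int  # required retention period (illustrative values)

def purge_candidates(items, today):
    """Return the names of items whose retention period has fully elapsed."""
    return [
        item.name
        for item in items
        if today > item.archived_on + timedelta(days=item.retention_days)
    ]

items = [
    ArchivedItem("q1-reports", date(2015, 1, 1), 7 * 365),
    ArchivedItem("hr-files", date(2022, 1, 1), 7 * 365),
]
today = date(2024, 7, 1)
print(purge_candidates(items, today))  # ['q1-reports']
```

The function only lists candidates rather than deleting anything, which leaves room for a review step before data is actually purged.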

7. Meet compliance regulations

Finally, make sure that whatever data archiving system the organization uses complies with the applicable regulations. Regulations such as HIPAA, GDPR and the California Electronic Communications Privacy Act can govern what data organizations retain, for what purposes and for how long.

Failing to adhere to these requirements can result in civil or criminal penalties.
