Definition

hierarchical storage management (HSM)

What is hierarchical storage management?

Hierarchical storage management (HSM) is policy-based management of data files that uses storage media economically and retrieves files transparently, so users are unaware of when a file is recalled from slower storage. HSM automatically moves data among different storage tiers based on a defined policy. It can be set up as a standalone system, but it is often used in distributed enterprise networks.

HSM consists of two or more types of storage media that make up the storage tiers. At one end is a high-performance tier that is typically more expensive and accessed more frequently than the other tiers. This tier can include storage class memory (SCM), enterprise-grade flash solid-state drives (SSDs) and high-performing hard disk drives (HDDs).

At the other end of the storage tier spectrum are slower, less expensive devices, such as optical disks and tape systems. This tier is used for data archives and cold data. Other tiers fall in between these extremes, based on data requirements, supported workloads and how often users access the data.

How does hierarchical storage management work?

Each tier in the HSM hierarchy represents a different cost and performance pairing. As a file ages and is accessed less often, the system moves it to a slower and less expensive form of storage.

A file that has moved to a slower tier can be retrieved and moved back to a higher performing tier if it is needed for more critical workflows. Administrators set data governance policies that manage how data moves among the tiers. Once the policies are set, the HSM software manages the data itself.
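The policy-driven movement described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular HSM product works: the tier names, the idle-day thresholds and the `FileRecord`, `target_tier` and `apply_policy` names are all hypothetical, and a real system would move data on actual storage devices rather than update an in-memory record.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical policy: a file idle for at least `min_idle_days` belongs on
# that tier. Entries are checked from the slowest tier to the fastest.
POLICY = [("tape", 365), ("hdd", 30), ("flash", 0)]


@dataclass
class FileRecord:
    path: str
    tier: str
    last_access: datetime


def target_tier(record: FileRecord, now: datetime) -> str:
    """Return the tier the policy assigns to a file, based on idle time."""
    idle_days = (now - record.last_access).days
    for tier, min_idle_days in POLICY:
        if idle_days >= min_idle_days:
            return tier
    return "flash"  # fallback for timestamps in the future


def apply_policy(records: list[FileRecord], now: datetime) -> list[tuple[str, str, str]]:
    """Update each record's tier and return (path, old_tier, new_tier) moves."""
    moves = []
    for record in records:
        destination = target_tier(record, now)
        if destination != record.tier:
            moves.append((record.path, record.tier, destination))
            record.tier = destination
    return moves
```

Because the same function handles both directions, a file on the tape tier that was accessed recently is moved back to the fast tier on the next policy run, which mirrors the recall behavior described above.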

HSM helps organizations make more efficient use of storage devices and lowers storage costs. It is especially useful in large-scale environments that support massive data sets. Older files can be moved to less expensive storage yet still appear to be immediately accessible. When a user tries to access the files, the HSM software restores them automatically and transparently from the lower data tier to a higher performing tier.

HSM software often uses stub files -- abbreviated representations of the original files -- to point to the real location of the files in secondary storage. Some HSM software enables administrators to set high and low thresholds for disk capacity. With this approach, the software determines when to move older or less frequently accessed files to another storage medium. In addition, administrators can usually exclude certain file types, such as executable files, from being automatically moved.
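The threshold and stub-file behavior can be illustrated with a small in-memory simulation. Everything named here is an assumption made for the example: the 90% and 70% thresholds, the excluded suffixes, the stub size, the `tape://archive/` location format and the `PrimaryFile` and `migrate` names do not come from any specific HSM product.

```python
from dataclasses import dataclass
from typing import Optional

HIGH_THRESHOLD = 0.90  # hypothetical: start migrating at 90% disk usage
LOW_THRESHOLD = 0.70   # hypothetical: stop once usage falls to 70%
EXCLUDED_SUFFIXES = (".exe", ".dll")  # hypothetical exclusion list
STUB_SIZE = 1          # hypothetical space a stub occupies on primary disk


@dataclass
class PrimaryFile:
    name: str
    size: int                          # space used on the primary disk
    last_access: int                   # larger value means more recent
    stub_target: Optional[str] = None  # set once the file becomes a stub


def migrate(files: list[PrimaryFile], capacity: int) -> list[str]:
    """Replace the least recently used files with stubs until usage is low enough."""
    used = sum(f.size for f in files)
    if used / capacity < HIGH_THRESHOLD:
        return []  # below the high threshold, nothing to do

    # Oldest files first; skip existing stubs and excluded file types.
    candidates = sorted(
        (f for f in files
         if f.stub_target is None and not f.name.endswith(EXCLUDED_SUFFIXES)),
        key=lambda f: f.last_access,
    )
    migrated = []
    for f in candidates:
        if used / capacity <= LOW_THRESHOLD:
            break
        used -= f.size - STUB_SIZE
        f.stub_target = f"tape://archive/{f.name}"  # hypothetical location
        f.size = STUB_SIZE
        migrated.append(f.name)
    return migrated
```

Using two thresholds instead of one avoids running a migration every time usage crosses a single limit: work starts only at the high mark and then frees enough space to reach the low mark.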

[Table: data classifications. Organizations classify data into various categories, including mission-critical, hot, warm and cold data.]

How does storage tiering work?

Tiered storage divides data based on its business value and how often users and applications access it. Data is assigned to a specific storage tier based on these factors. More valuable, frequently used and mission-critical data is assigned to faster, more expensive storage, such as flash SSDs and SCM devices. Less important and less frequently used data goes to optical disks and HDDs. Older, seldom accessed data is archived on tape drives or in the cloud.

The exact configuration of a tiered system depends on the storage needs of an organization and how it uses and classifies its data. These systems generally have between two and five tiers. Tiers are commonly built around the following data categories:

  • Mission-critical data is used with high-performing workloads that can't tolerate delays.
  • Hot data is used on a continuous basis to support ongoing operations.
  • Warm data is accessed less frequently but is needed regularly.
  • Cold data is seldom accessed.
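The four categories above can be expressed as a simple classification function. This is a sketch under stated assumptions: the reads-per-day thresholds, the `latency_sensitive` flag and the category-to-media mapping are hypothetical values chosen for illustration, and each organization would set its own.

```python
from enum import Enum


class DataClass(Enum):
    MISSION_CRITICAL = "mission-critical"
    HOT = "hot"
    WARM = "warm"
    COLD = "cold"


# Hypothetical thresholds, in average reads per day.
HOT_MIN_READS_PER_DAY = 10.0
WARM_MIN_READS_PER_DAY = 0.1

# Hypothetical mapping of categories to media, loosely following the
# tiering described in this section.
MEDIA = {
    DataClass.MISSION_CRITICAL: "SCM or flash SSD",
    DataClass.HOT: "flash SSD",
    DataClass.WARM: "HDD",
    DataClass.COLD: "tape or cloud archive",
}


def classify(reads_per_day: float, latency_sensitive: bool = False) -> DataClass:
    """Assign a data category from access frequency and delay tolerance."""
    if latency_sensitive:
        # Workloads that can't tolerate delays take priority over frequency.
        return DataClass.MISSION_CRITICAL
    if reads_per_day >= HOT_MIN_READS_PER_DAY:
        return DataClass.HOT
    if reads_per_day >= WARM_MIN_READS_PER_DAY:
        return DataClass.WARM
    return DataClass.COLD
```

For example, `MEDIA[classify(1.0)]` looks up the media for a file read about once a day, which falls into the warm category under these assumed thresholds.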

[Table: comparison of various storage media in terms of performance, capacity, endurance and cost.]

Benefits of HSM

HSM systems provide a range of advantages, such as the following:

  • Performance optimization. HSM frees up more expensive, high-performance storage for the workloads that need it the most.
  • Cost savings. HSM systems keep infrequently accessed data on low-cost storage devices, which lowers storage costs without slowing the workloads that rely on the faster tiers.
  • Efficient resource use. Automated data movement and recall reduce the manual effort of placing and retrieving files, and tiered storage makes more efficient use of overall storage space.
  • Backup capabilities. HSM provides archiving capabilities on lower-level devices that can serve as data backups.

HSM products

HSM software is available as standalone products that can be used with specific hardware systems. It also comes as part of software platforms that address other storage, data lifecycle management and data security needs.

The following products are examples of how HSM is used.

  • IBM Spectrum Protect is a data protection offering that accommodates various operating environments and device types. It includes HSM capabilities to automatically move data between high- and low-cost storage media.
  • Hewlett Packard Enterprise Data Management Framework optimizes data accessibility and storage resource use in high-performance computing Linux storage environments. Its HSM architecture supports tiered storage.
  • Quantum StorNext is a scale-out file storage system with the FlexTier subsystem. It offers file protection features along with HSM functionality.
  • Zimbra offers a secure, private collaboration infrastructure with HSM capabilities and features such as real-time backup and restore.

This was last updated in February 2022