
What is object storage technology (and why should you care)?

Object technology offers scalability, economical operation and better data management, but it's very different from file and block storage systems.

It's unlikely that anyone old enough to remember writing code for anything other than a database or file system is reading this article. After all, it's been more than 40 years since programs accessed other types of data structures for general commercial applications. Thus, object storage represents the first major new data structure category in almost two generations that is significantly impacting data management practices. Object implementations began in the late 1990s, but mostly for niche applications. Large-scale deployments were spearheaded by cloud services, such as Amazon S3, Facebook and Spotify. Now, object storage solutions from both established and emerging vendors are reaching critical mass, and IT organizations need to consider them for their own implementations.

Object storage differs from SAN or NAS systems in a number of important ways. Most noticeable to storage administrators is that LUNs, volumes and RAID are absent. Data objects are stored in variable-sized "containers" rather than fixed blocks. Whereas metadata and data are accessed together with traditional data access methods, object storage allows the data itself to be accessed directly. In addition, security can be applied on a per-object or per-command basis.

Nevertheless, it's unlikely an IT department would proceed with the idea that it needed object storage. Rather, storage managers look for better ways to meet the data access needs of the organization as simply and inexpensively as possible, and object storage may meet those requirements.

Adding object storage to the data center is an "and" not an "or." Object storage holds much promise, but it won't entirely replace SAN or NAS anytime soon. The use cases common across vendor implementations are:

  • Archive
  • Cloud storage
  • Backup and recovery
  • Compliance
  • Content serving

These are use cases where data access is imperative, but where performance isn't necessarily an issue.

Object storage implementation models

Object storage implementations that are currently available range from ready-to-use cloud services, to appliances delivered as a hardware/software bundle or software only, up to complete converged arrays. Some implementations try to embrace and include traditional architectures, while others are more pure-play in nature.

Hewlett-Packard (HP) offers a full range of object products, starting with the HP Helion Public Cloud. For in-house or private cloud environments, HP's StoreAll 8200 Gateway Storage appliance can front-end StoreServ 7000 and StoreServ 10000 arrays, also known as 3PAR arrays. The StoreAll 8800 Storage system is a fully converged array based on 3PAR, supporting SAN, NAS and object storage in the same device. Thus, HP targets customers of all shapes and sizes, from small and medium-sized businesses (SMBs) to the largest enterprises.

The first product from Exablox, an emerging object storage vendor, is its OneBlox appliance. OneBlox's architecture is a "ring" of peer nodes across which objects are stored via a global file system that supports SMB/CIFS. OneBlox appliances are notable for the fact that they can support any SAS or SATA drive -- including the new Western Digital HGST Ultrastar He6 helium-filled 6 TB drives -- that is literally off the shelf. IT organizations can insert drives from their favorite electronics store, thereby avoiding the typical array markup. Exablox targets SMB customers with up to 2,000 users as well as cloud service providers.

Quantum's Lattus Object Storage product is also an appliance, ranging from six-node to 20-node configurations. These nodes can be geographically dispersed for wide-area access and collaboration. There are three Lattus models: D, X and M. The D model includes a native S3 (HTTP REST) interface; the X model supports NFS, CIFS and HTTP REST; and the M has Quantum's StorNext Storage Manager interface. Lattus is non-disruptively self-healing and self-migrating. It's aimed at medium to large enterprises, especially those in media and entertainment, or those with compute/process/edit applications.

EMC offers a variety of object product flavors. Atmos is available as a cloud service as Atmos GeoDrive, but it's also available as a complete array. EMC's SourceOne archiving applications provide archive, compliance and e-discovery by taking CIFS and NFS and translating those file interfaces into Atmos storage. In addition, the company's ViPR software-defined storage has an object service that front-ends EMC Isilon, EMC VNX and NetApp arrays to allow object access to data.

Simplicity is the key

The differences that define object storage lead to one key benefit: simplicity. Most IT organizations don't bemoan a lack of technology, but rather ever-increasing complexity. The hallmark of an object storage system is simple implementation and management. For example, Exablox claims its OneBlox appliance is so simple to install and configure that it has a tongue-in-cheek "cappuccino challenge." The company demonstrates how the device can be unboxed, powered up, and the hard disks installed and storing data in about the time it takes to make a cup of cappuccino.

Because object systems don't rely on LUNs and volumes, they are non-disruptively extensible. Generally, new capacity can be simply added to the configuration and included on the fly. Exablox and Quantum both claim users will never again experience a forklift upgrade or the need to configure or reconfigure a system. This extensibility is enabled by the underlying file system of the device. HP's StoreAll systems use the StoreAll Distributed File System, while Exablox has its ring architecture with a global file system. These, and other object systems, behave much like scale-out systems in that the file system enables a global namespace across nodes. Caveat emptor, however, as scalability isn't unlimited; vendors do have limits to the number of nodes supported in a given configuration.
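To make the idea of adding capacity on the fly more concrete, here is a minimal sketch of a consistent-hash ring, one common way that scale-out systems place objects across peer nodes. It is an assumption for illustration only: the article does not say which placement algorithm Exablox, HP or Quantum actually use, and the class name `HashRing`, the node names and the `vnodes` parameter are all hypothetical.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a point on the ring (SHA-256 is an assumed choice)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring; not any vendor's implementation."""

    def __init__(self, vnodes: int = 100):
        self._vnodes = vnodes   # virtual points per node, to smooth the spread
        self._points = []       # sorted list of (ring position, node name)

    def add_node(self, node: str) -> None:
        """Add a node without touching the points owned by existing nodes."""
        for i in range(self._vnodes):
            bisect.insort(self._points, (_hash(f"{node}#{i}"), node))

    def owner(self, key: str) -> str:
        """Return the node at the first ring point at or after the key's hash."""
        idx = bisect.bisect_right(self._points, (_hash(key), ""))
        return self._points[idx % len(self._points)][1]


ring = HashRing()
for name in ("node-a", "node-b", "node-c"):
    ring.add_node(name)

keys = [f"object-{i}" for i in range(1000)]
before = {k: ring.owner(k) for k in keys}

ring.add_node("node-d")  # new capacity joins the ring
after = {k: ring.owner(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} objects moved, all to node-d")
```

In this sketch, only the objects that now belong to the new node change location; everything else stays where it was, which is the property that makes a non-disruptive expansion plausible.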

Object-friendly applications

A key reason for the discrete use cases mentioned here is that object storage is accessed through a REST application programming interface (API). The data access commands in this API are limited to basics such as POST, GET, PUT and DELETE. However, many cloud providers use REST as the preferred interface. To offer richer capabilities, the HP StoreAll product is a converged file and object system. StoreAll supports CIFS, NFS, and the OpenStack Swift and Keystone (Identity Service) interfaces in the StoreAll operating system. EMC's cloud gateway appliances, SourceOne and Cloud Tiering, translate CIFS and NFS into its Atmos object storage and therefore support a variety of third-party applications beyond archiving. Quantum bundles a RESTful interface with its Lattus object storage product. This interface enables its partnerships with companies such as CommVault (Simpana) and Arkivio. Exablox's OneBlox supports REST, but it's presented to the application as a CIFS share; an NFS capability is in the works.
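A small in-memory model shows how little the four basic verbs offer an application. This is a sketch only: the class `ObjectStore` and its method names are hypothetical and do not correspond to any vendor's API, and the return values are ordinary HTTP status codes used here for illustration.

```python
import uuid


class ObjectStore:
    """Toy in-memory stand-in for a REST object interface (hypothetical)."""

    def __init__(self):
        self._objects = {}  # key -> (data bytes, metadata dict)

    def put(self, key, data, metadata=None):
        """PUT: create or wholly replace the object stored under a known key."""
        created = key not in self._objects
        self._objects[key] = (bytes(data), dict(metadata or {}))
        return 201 if created else 200

    def post(self, data, metadata=None):
        """POST: store a new object and let the store assign the key."""
        key = uuid.uuid4().hex
        self._objects[key] = (bytes(data), dict(metadata or {}))
        return 201, key

    def get(self, key):
        """GET: return the whole object and its metadata, or 404 if absent."""
        if key not in self._objects:
            return 404, None, None
        data, metadata = self._objects[key]
        return 200, data, metadata

    def delete(self, key):
        """DELETE: remove the object; 404 if it was never there."""
        if self._objects.pop(key, None) is None:
            return 404
        return 204


store = ObjectStore()
store.put("archive/2014/report.pdf", b"%PDF...", {"retention": "7y"})
status, data, metadata = store.get("archive/2014/report.pdf")
print(status, metadata)
```

Note what is missing in this model: there is no seek, no append and no partial in-place update. Each operation handles a whole object, which fits archive and content-serving workloads better than transactional ones.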

Object storage and data durability

Because object storage doesn't rely on RAID for failure protection, vendors use other strategies to provide it. In most cases, this involves replication across nodes. Quantum's Lattus deploys durability policies, where IT managers can specify replication across nodes and locations to survive a certain number of failures. For example, a 20/4 policy spreads data across 20 nodes and can survive four device failures; an 18/7 policy across three sites could survive a site failure. Exablox also spreads data across OneBlox nodes, where data spread across three nodes can survive two OneBlox failures; a hashing algorithm assures optimum distribution across all nodes.
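The arithmetic behind the site-failure claim can be checked with a few lines of code. The sketch below reads a policy as "spread/tolerated," as described above, and assumes that pieces of data are placed as evenly as possible across sites; that placement rule and the function names are assumptions, not Quantum's documented behavior.

```python
import math


def worst_site_share(spread, sites):
    """Pieces held by the fullest site when placement is as even as possible."""
    return math.ceil(spread / sites)


def survives_site_failure(spread, tolerated, sites):
    """True if losing the fullest site stays within the tolerated failures."""
    return worst_site_share(spread, sites) <= tolerated


def spare_failures_after_site_loss(spread, tolerated, sites):
    """Further device failures that can be absorbed once a whole site is gone."""
    return tolerated - worst_site_share(spread, sites)


print(survives_site_failure(18, 7, 3))           # 18/7 policy over three sites
print(spare_failures_after_site_loss(18, 7, 3))  # headroom left afterward
print(survives_site_failure(20, 4, 3))           # 20/4 policy over three sites
```

Under these assumptions, an 18/7 policy over three sites places six pieces per site, so losing a site consumes six of the seven tolerated failures and leaves room for one more. A 20/4 policy stretched over the same three sites would not survive a site loss, because a single site would hold more pieces than the policy tolerates losing.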

EMC Atmos has two protection modes to let IT managers determine how to optimize accessibility and efficiency. Replication is one option, either synchronous or asynchronous between nodes. In addition, the product uses distributed erasure encoding. Erasure encoding is more efficient for storage utilization, but does require access to two or more data stores per data request. Quantum's Lattus also offers fountain erasure encoding, which enables data to be spread across nodes and doesn't require replication.
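The efficiency difference between replication and erasure encoding comes down to simple ratios. The sketch below assumes an idealized erasure code in which any (spread minus tolerated) pieces are enough to rebuild the data; real codes, including fountain codes, may need slightly more. The function names are hypothetical.

```python
def replication_overhead(failures_tolerated):
    """Raw capacity per unit of data when keeping full copies.

    Surviving N failures with whole copies requires N + 1 copies.
    """
    return failures_tolerated + 1


def erasure_overhead(spread, tolerated):
    """Raw capacity per unit of data for an idealized spread/tolerated code."""
    if not 0 <= tolerated < spread:
        raise ValueError("tolerated must be at least 0 and less than spread")
    return spread / (spread - tolerated)


# Surviving four failures: full copies versus a 20/4 policy.
print(replication_overhead(4))   # copies of every object
print(erasure_overhead(20, 4))   # raw-to-usable capacity ratio
print(round(erasure_overhead(18, 7), 2))
```

With these assumptions, tolerating four failures by replication means storing five full copies, while a 20/4 policy stores 1.25 times the data. The trade-off is the one noted above: a read has to gather pieces from several nodes rather than a single copy.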

More advanced storage services

Because of the limitations inherent in the REST API, vendors have used their own means to offer storage services beyond the base capabilities of the API. This is one reason HP has implemented StoreAll as a converged device. Environments that require the full range of storage services can utilize the NAS side of the system. For example, when the StoreAll 8200 and StoreAll 8800 systems use 3PAR as a back end, HP's Adaptive Optimization, encryption, WORM and tiering are built in.

Exablox, without a legacy installed base to consider, uses an entirely different approach. Its atomic unit for data management is a 32 KB hashed block. If a block with the same hash already exists, a pointer to it is created rather than a fresh block. Thus, the company claims deduplication comes along "for free." Encryption has been implemented using the AES-256 standard.
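The pointer-instead-of-block idea is easy to sketch. The example below is a minimal content-addressed store that uses the 32 KB block size mentioned above; the choice of SHA-256 as the hash and the class name `DedupStore` are assumptions, since the article does not say which hash Exablox uses or how its metadata is organized.

```python
import hashlib

BLOCK_SIZE = 32 * 1024  # 32 KB, the block size cited in the article


class DedupStore:
    """Hypothetical content-addressed block store; not Exablox's code."""

    def __init__(self):
        self.blocks = {}   # digest -> block bytes, each unique block kept once
        self.objects = {}  # object name -> ordered list of digests (pointers)

    def write(self, name, data):
        """Split data into blocks; store only blocks not already present."""
        pointers = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)  # existing block: pointer only
            pointers.append(digest)
        self.objects[name] = pointers

    def read(self, name):
        """Reassemble an object by following its pointers."""
        return b"".join(self.blocks[d] for d in self.objects[name])


store = DedupStore()
store.write("copy-1", b"A" * BLOCK_SIZE * 3)
store.write("copy-2", b"A" * BLOCK_SIZE + b"B" * 10)
print(len(store.blocks), "unique blocks stored for 5 logical blocks")
```

Because a block's hash is computed anyway in order to place and find it, checking whether that hash is already present costs almost nothing extra, which is the sense in which deduplication comes "for free."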

Object storage performance improving

While it's true IT users won't use object storage for online transaction processing (OLTP) applications, vendors are working to enhance the performance of their object storage systems. For example, EMC uses a "box carting" methodology to handle large volumes of small transactions. Box carting is a means of combining these small transactions into a single write operation. Using a different technique, Exablox's hashing algorithm assures even distribution of data across all nodes to avoid I/O bottlenecks.
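Box carting, as described above, amounts to buffering small writes and handing them to the back end as one operation. The sketch below illustrates that general batching idea only; it is not EMC's implementation, and the class name `BoxCar`, the `max_bytes` threshold and the `backend_write` callback are hypothetical.

```python
class BoxCar:
    """Hypothetical write batcher: many small writes, one back-end operation."""

    def __init__(self, backend_write, max_bytes=64 * 1024):
        self._backend_write = backend_write  # called once per batch
        self._max_bytes = max_bytes          # flush threshold (assumed value)
        self._pending = []                   # buffered (key, data) pairs
        self._pending_bytes = 0

    def write(self, key, data):
        """Buffer a small write; flush when the batch is large enough."""
        self._pending.append((key, data))
        self._pending_bytes += len(data)
        if self._pending_bytes >= self._max_bytes:
            self.flush()

    def flush(self):
        """Send everything buffered so far as a single back-end write."""
        if not self._pending:
            return
        batch, self._pending = self._pending, []
        self._pending_bytes = 0
        self._backend_write(batch)


backend_calls = []  # stands in for the storage back end
car = BoxCar(backend_calls.append, max_bytes=10)
for i in range(4):
    car.write(f"txn-{i}", b"abc")  # four 3-byte writes reach the threshold
car.write("txn-4", b"z")
car.flush()                        # push out the remainder
print(len(backend_calls), "back-end writes for 5 small transactions")
```

A real system would also have to flush on a timer and decide when a buffered write may be acknowledged to the application; those details are left out here.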

While object storage isn't a silver bullet for the complexities of SAN and NAS, it can simplify at least a portion of the storage estate, especially around archive and unstructured data repositories. Object storage systems may offer a lower cost-per-gigabyte price point, but the biggest benefit may be in reduced storage administration. If storage management costs contribute 85% of the total cost of storage ownership as is generally accepted, then object storage's elimination of configuration, reconfiguration and provisioning tasks should have a big impact on total cost of ownership. IT managers will find a way to include it in their organizations.

About the author:
Phil Goodwin is a storage consultant and freelance writer.