Which in-memory DBMS best fits your company's needs?
Database expert Craig S. Mullins examines some top in-memory DBMS software offerings to help you determine which is the best fit for your organization.
In-memory database management systems (DBMSes) are growing in popularity and there are many different options from which to choose. The basic defining characteristic of an in-memory DBMS is that it's capable of storing entire database structures in memory. With the data in memory, high-overhead disk I/O operations can be avoided, resulting in improved performance and response times.
In-memory DBMSes run the gamut from relational to NoSQL, from older to brand-new technology, from existing relational DBMSes (RDBMSes) with in-memory capabilities added to entirely new in-memory offerings, and from completely in-memory to a hybrid combination of in-memory and disk.
In-memory DBMS options
One of the fastest-growing relational DBMS products on the market is SAP HANA, an in-memory data management and application platform that combines transactional and analytical processing in one system on one copy of the data. The name HANA was originally an acronym for high-performance analytic appliance. SAP HANA runs on SUSE Linux and Red Hat Enterprise Linux. SAP markets HANA as a database appliance, with hardware from IBM or Hewlett-Packard. It's also marketed as a cloud offering from providers including Amazon and Microsoft Azure.
SAP HANA is available in four editions:
- Base edition, which provides core database services;
- Platform edition, which adds predictive and graph engines;
- Enterprise edition, which includes additional components for data provisioning; and
- Run-time edition, which is for use with SAP applications.
Of course, SAP is targeting its application installed base to switch to HANA instead of the existing relational DBMS (usually Oracle, DB2 or SQL Server) used to support SAP applications. HANA enables SAP applications to run with a single in-memory database used for both transactions and analytics. Nonetheless, SAP touts HANA as a complete replacement for traditional relational databases, not just for use with SAP applications.
SAP HANA enables real-time analytics on a large scale and on a variety of data. SAP HANA combines database, data processing and application server capabilities all running in-memory, on one data set, and on a single platform.
SAP HANA enables you to achieve a smaller data footprint and simpler data processing and operations. SAP HANA supports multi-tenancy, gives the option to store warm data on disk, and offers a choice of deployment models and partners. You can deploy SAP HANA on-premises, in the cloud or as a hybrid of both.
If you're looking to build and support operational enterprise applications requiring rapid access to data, SAP HANA is a worthy, modern relational DBMS. SAP HANA is worth considering for ERP implementations as well as new, home-grown applications.
Another relational in-memory DBMS is TimesTen, which was acquired by Oracle in June 2005. Oracle TimesTen In-Memory Database is a full-featured relational DBMS that runs in the application tier, storing all data in main memory. Oracle TimesTen supports full SQL transaction semantics and includes OCI, Pro*C and PL/SQL for compatibility with Oracle Database.
Oracle TimesTen In-Memory Database is embedded in Oracle Exalytics In-Memory Machine, enabling Oracle Business Intelligence Standard Edition users to rapidly perform complex analytic queries. In addition to Oracle's standalone TimesTen In-Memory Database, Oracle Database 12c also offers in-memory capabilities (which are covered later in this article).
VoltDB is another in-memory relational DBMS, developed by a team led by Michael Stonebraker, one of the original relational pioneers and founders of Ingres. VoltDB is open source, with a community and commercial licensing option. The commercial license adds high availability, disaster recovery (HA/DR) capabilities and technical support.
The data is in memory for computations and analysis, but all transactions are stored durably to disk. VoltDB is compliant with the ACID (atomicity, consistency, isolation and durability) concept and delivers fault-tolerant replication by replicating partitions onto multiple servers; if a server fails, the data is still there in another partition.
VoltDB applications are developed in SQL using multiple languages, including Java, Python, PHP, C++ and C#; Java-stored procedures are also supported. Data movement to VoltDB is simple, as pre-packaged importers and connectors are supplied to make it easy to ingest and export data.
Not all in-memory database systems are relational. Aerospike is an open source, in-memory, NoSQL database management system. It's a key-value data store that enables developers to build personalized user experiences and data-driven applications that scale out to process millions of transactions per second with sub-millisecond response times.
Aerospike runs on Linux with support offered for many different Linux distributions including pre-built binaries for Red Hat, Ubuntu, CentOS and Debian. Although Aerospike is available as open source, it can also be commercially licensed. The commercial edition of Aerospike adds features such as cross datacenter replication (for synchronization of multiple clusters), fast restart and improved security.
One of the more compelling capabilities of Aerospike is that it simplifies development for programmers to build and operate new applications at scale with minimal up-front administration work. Additional use cases for Aerospike include data caching (such as to store session information or user profiles) and personalizing the user experience on Web portals and mobile applications.
Relational support for in-memory
You don't need to implement a brand-new DBMS technology to benefit from in-memory database processing. The three leading relational vendors -- Oracle, IBM and Microsoft -- all offer in-memory capabilities in the latest versions of their DBMSes.
Oracle Database 12c includes new memory-optimized database technology capable of delivering high-speed performance for analytical processing. Oracle Database In-Memory is an in-memory columnar data format designed to process SQL rapidly without limiting functionality.
Deploying Oracle Database In-Memory with any existing Oracle Database-compatible application is easy and requires no application changes. Oracle Database In-Memory is fully integrated with Oracle Database's scale-up, scale-out, storage tiering, availability and security technologies, combining relational strengths with in-memory speed.
IBM DB2 11.5 with BLU Acceleration also delivers a combination of innovations from IBM Research & Development labs, including in-memory capabilities. By enabling high-speed queries against operational and historical data, BLU Acceleration can improve the analytics gathering process.
DB2 with BLU Acceleration adds more than in-memory capabilities, including a relational column store capability for speeding analytical processing and actionable compression, which can deliver up to 10 times storage space savings. It also offers Single Instruction Multiple Data chip exploitation, and data-skipping technology to improve query performance.
Microsoft SQL Server 2016 delivers in-memory capabilities, too, with a memory-optimized database engine integrated into the core SQL Server engine. To use In-Memory OLTP, tables must be defined as memory-optimized. Memory-optimized tables are fully ACID-compliant and are accessed using Transact-SQL in the same way as disk-based tables.
Queries and transactions can reference and update data in both memory-optimized tables and disk-based tables. The SQL Server In-Memory OLTP engine is designed for extremely high-session concurrency.
All of the relational DBMS vendors claim substantial performance gains when using the in-memory capabilities of their RDBMS, with performance improvements ranging from several percent to more than 20 times. Of course, performance results will vary based on implementation, usage and other criteria.
NewSQL DBMS
In addition to the in-memory DBMSes, there are NewSQL database systems offering support for modern architectures and including some combination of in-memory capabilities, cloud database support and/or highly scalable distribution.
Although there's no hard-and-fast rule for defining a NewSQL DBMS, examples include TransLattice and NuoDB.
High availability is the hallmark of the TransLattice Elastic Database (TED), a geographically distributed relational SQL database server. TED is built for OLTP and supports varying application loads with high-availability and performance for remote users with ACID transactions. It's scalable, running on all-active commodity servers, cloud instances and/or virtual machines. All nodes are independent; there's no master, there's no slave.
NuoDB is a Web-scale distributed DBMS offering a rich SQL implementation and true ACID transactions. NuoDB is designed for the modern data center, but it can also be used as a scale-out cloud database.
The NewSQL DBMSes are designed without all of the overhead and functionality built into the older RDBMSes that were originally designed in the 1970s. By designing and building a new RDBMS in the 21st Century, NewSQL products can benefit from not having to support legacy code and architecture.
Making the right choice
In-memory databases are becoming an increasingly popular option for improving the efficiency of analytical processing. In many cases, the same in-memory DBMS that's used to speed up analytical queries can also improve the speed of operational transactions. But as you can see, many variables and options must be considered when choosing an in-memory DBMS. For those applications requiring high performance, in-memory DBMSes are the most capable choice for delivering rapid response times. However, be sure to calculate the additional cost, not just in terms of hardware, but perhaps of supporting a new DBMS and all of the associated costs as well.