Hazelcast grid tunes for data scalability tradeoffs
An in-memory data grid (IMDG) from Hazelcast lets designers tune subsystems to favor consistency over availability, or the reverse, depending on their design goals.
In-memory data grid vendor Hazelcast today said it updated its platform to improve data consistency in large-scale operations. Hazelcast 3.12 adds subsystems supporting settings for favoring consistency over availability -- or, conversely, availability over consistency -- when making data scalability tradeoffs.
While IMDGs are not literally databases, large enterprise and e-commerce companies have found them useful for accelerating database performance for real-time applications such as payment processing and fraud detection or recommendation engines, said Matt Aslett, analyst at 451 Research.
Designers of these applications have had to carefully match attributes of data consistency and system availability as their systems have gone online and global. IMDGs are often part of the mix, playing a key role as part of real-time streaming data systems.
"Lines have blurred some in terms of in-memory data grid providers delivering consistency and support for stream processing," Aslett said.
Interest in IMDGs has risen along with the growing need for hybrid operational and analytic processing, in which enterprises run analytics on operational data, he said.
A variety of products and services come under the in-memory data processing umbrella, including IMDGs, in-memory databases and stream processing, Aslett noted.
The blurring is apparent, Aslett said, in Hazelcast's new consistency and availability subsystems, as well as in its Hazelcast Jet stream processing engine.
Dialing in data scalability
For distributed data systems, consistency, availability and network partition settings have become the knobs and dials with which data architects tinker as they tune cloud-scale commerce systems.
In fact, users have sought to balance data consistency, data availability and partition tolerance in distributed systems since the advent of the CAP theorem, developed in 2000 by Eric Brewer, an academic who is also vice president of infrastructure and a fellow at Google.
Designers' interpretations of the CAP theorem underlie data designs ranging from Google's Cloud Spanner to Microsoft's Azure Cosmos DB -- not to mention numerous databases fielded by a host of NoSQL and NewSQL startups -- as systems encounter increased requirements to support full data consistency at large scale.
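The tradeoff those designers weigh can be pictured with a toy model. The Java sketch below is not Hazelcast code and uses no Hazelcast APIs; the `Mode` names, the `Replica` class and the `write` method are hypothetical, invented only for illustration. It assumes a three-replica cluster in which a network partition leaves a client able to reach only one replica. A consistency-first policy rejects the write because no majority is reachable, while an availability-first policy accepts it and lets the replicas diverge.

```java
import java.util.ArrayList;
import java.util.List;

public class CapTradeoffSketch {

    /** Hypothetical policy names for illustration; these are not Hazelcast settings. */
    enum Mode { FAVOR_CONSISTENCY, FAVOR_AVAILABILITY }

    /** One copy of a single value; `reachable` models which side of a partition it is on. */
    static final class Replica {
        int value;
        boolean reachable = true;
    }

    /** Builds a cluster of `size` replicas, all reachable and all holding 0. */
    static List<Replica> cluster(int size) {
        List<Replica> replicas = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            replicas.add(new Replica());
        }
        return replicas;
    }

    /**
     * Attempts a write and returns true if it was accepted.
     * FAVOR_CONSISTENCY accepts only when a majority of replicas is reachable.
     * FAVOR_AVAILABILITY accepts whenever at least one replica is reachable.
     */
    static boolean write(List<Replica> replicas, int newValue, Mode mode) {
        long reachable = replicas.stream().filter(r -> r.reachable).count();
        boolean hasMajority = reachable > replicas.size() / 2;

        if (mode == Mode.FAVOR_CONSISTENCY && !hasMajority) {
            return false; // refuse the write rather than risk divergent copies
        }
        if (reachable == 0) {
            return false; // nothing to write to under either policy
        }
        for (Replica r : replicas) {
            if (r.reachable) {
                r.value = newValue; // unreachable replicas keep their old value
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            List<Replica> replicas = cluster(3);
            // Simulate a partition: only the first replica stays reachable.
            replicas.get(1).reachable = false;
            replicas.get(2).reachable = false;

            boolean accepted = write(replicas, 42, mode);
            System.out.println(mode + " accepted write: " + accepted
                    + ", replica values: " + replicas.get(0).value
                    + "/" + replicas.get(1).value
                    + "/" + replicas.get(2).value);
        }
    }
}
```

Real systems add much that this model omits, such as consensus protocols, replication logs and conflict resolution once a partition heals, but the basic choice between refusing a request and serving possibly divergent data is the same.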
Kelly Herrell, Hazelcast CEO, said Hazelcast's new 3.12 subsystems let system designers tune the IMDG based on their overarching design objectives. Other additions to the platform include support for JSON data structures -- to complement existing key-value data support -- and automatic disaster recovery failover.
The software is in beta now, with general availability expected in March, according to the company.
Grid yourself for data scalability
Users have employed IMDGs like Hazelcast's in distributed data processing since the days of grid computing, a predecessor to cloud computing. The grids enabled designers to deal with capacity issues related to spikes in use.
Clustered relational databases with shared caches and distributed key-value NoSQL databases have come along to fill some of the gaps the IMDGs addressed. Meanwhile, some IMDGs have evolved to take on some database characteristics.
IMDG platforms from other vendors include Oracle Coherence, Pivotal GemFire, Software AG Terracotta BigMemory, Red Hat JBoss Data Grid and GridGain.
An open source version of GridGain's core IMDG, known as Apache Ignite, has gained attention as part of operational intelligence systems in some big data settings. In the complicated mix of big data scalability components for messaging, streaming and analytics processing, IMDGs continue to find a niche -- one that may be expanding.
Data grids support databases
Aslett said some IMDG providers are now positioning their wares as database replacements.
But Hazelcast's Herrell said: "Hazelcast is not a database. It is a data grid that is a complement to a database."
As more web applications require high-performance processing at scale, enterprises could increasingly use CAP-savvy grids.