
Aerospike database garners Spark, Kafka connectors

New Apache Kafka and Apache Spark connectors make it easier to use the Aerospike NoSQL data store in high-speed applications, including the analytics workloads that NoSQL platforms increasingly support.

Distributed NoSQL maker Aerospike this week released a set of add-on modules that ease integration of the Aerospike database's enterprise edition with other systems and add support for analytics workloads.

The new Aerospike Connect modules target large-scale systems incorporating event-oriented Apache Spark for data streaming and Apache Kafka components for data ingestion and extraction.

Even Aerospike users who like the technology have publicly cited the platform's lack of analytics workload support as a downside. Support for Spark better positions Aerospike for use in analytics applications -- in effect extending the database beyond the high-speed operational transactions that have been the company's focus, according to James Curtis, an analyst at 451 Research.

"In general, NoSQL databases have not been used as analytical databases," Curtis said. But, he added, that is changing as NoSQL systems start appearing as part of larger analytical efforts, in which they link operational data with new analytical workhorses, such as Spark.

"The Spark connector points Aerospike more at the analytics space," he said.

Spark connectors have become a common part of the big data mix. Aerospike's connector joins similar ones from NoSQL competitors such as Couchbase, DataStax, Redis Labs and others as part of a general trend that sees multiple components applied to achieve operational analytics.

Kafka data pipelines -- again, supported by a variety of players -- also prove useful in IT shops that perform analytics and machine learning on quickly arriving pools of big data. Curtis said a first look at the Aerospike Kafka connector showed it to be "fairly robust" for data movement.
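
The general pattern behind such connectors can be illustrated with a toy sketch: writes to an operational key-value store are published as change events, and a separate analytics step consumes them. This is a hedged illustration only, not Aerospike's implementation or API. The `ToyKeyValueStore` class, the `change_log` queue (standing in for a Kafka topic) and the `spend_per_campaign` function (standing in for a Spark job) are all hypothetical names, and the ad-bid records are made-up sample data.

```python
from collections import defaultdict, deque


class ToyKeyValueStore:
    """Hypothetical stand-in for an operational key-value store."""

    def __init__(self, change_log):
        self._records = {}
        # Assumption: a simple in-process queue plays the role of an
        # outbound Kafka topic.
        self._change_log = change_log

    def put(self, key, record):
        self._records[key] = record
        # Publish every write as a change event for downstream consumers.
        self._change_log.append({"key": key, "record": dict(record)})

    def get(self, key):
        return self._records.get(key)


def spend_per_campaign(change_log):
    """Stand-in for an analytics job: total spend per campaign."""
    totals = defaultdict(float)
    for event in change_log:
        record = event["record"]
        totals[record["campaign"]] += record["spend"]
    return dict(totals)


change_log = deque()
store = ToyKeyValueStore(change_log)

# Operational side: fast single-record writes (made-up sample data).
store.put("bid:1", {"campaign": "spring", "spend": 1.5})
store.put("bid:2", {"campaign": "spring", "spend": 2.0})
store.put("bid:3", {"campaign": "summer", "spend": 4.0})

# Analytical side: aggregate over the stream of change events.
totals = spend_per_campaign(change_log)
print(totals)
```

In a real deployment, the queue and the aggregation function would be replaced by a Kafka topic and a Spark job, with the vendor's connectors handling the data movement between them.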

IPhone wake-up call

Srini Srinivasan, founder and chief development officer of Aerospike, based in Mountain View, Calif., saw the birth of large-scale distributed databases in an earlier development role at Yahoo. Later, the shortcomings of relational databases -- even when they were coupled with large caches -- led him to work with others to build the high-speed key-value data store that eventually took the form of the Aerospike database.

Some of the incentives to build new styles of database were apparent at Yahoo, he said. There, he worked to help create new clients for Yahoo email systems as their use in the then-new iPhone began to expand.

Today, people take for granted that personal data can show up almost immediately on their mobile devices. But, at the time of the iPhone launch in 2007, it was all new. The data demands of the iPhone were, Srinivasan said, "a wakeup call."

"Today, Aerospike is about helping you make real-time decisions," he said. In the system, emphasis has been placed on achieving low latency -- by his estimate, "in the millisecond range."

The Spark and Kafka connectors will further the real-time utility of the Aerospike NoSQL database, Srinivasan said. They are available immediately, with a new REST API for Aerospike development due in April.

Novel memories

Aerospike found early use in ad technology systems that drive real-time online ad brokering. Subsequent broader use cases have included e-commerce and online gaming.

The rigorous latency, consistency and scalability demands of these systems are showing up in more and more new applications, Srinivasan said.

Along the way to supporting a variety of data objects beyond the simple key-value type, the Aerospike database has come to support several novel in-memory storage formats, according to Srinivasan.

Besides flash and solid-state drive devices, Aerospike supports use of non-volatile memory express (NVMe) modules. In December, he said, the platform was optimized for use with Intel Optane DC persistent memory.

For now, support for this latter memory type is something of a placeholder for Aerospike and others. The Intel Optane memory format was made available in beta for OEMs, with general availability anticipated for the first half of 2019. Optane and other advances suggest the speed and volume of big data will continue to vault forward.
