
Hazelcast Jet 4.0 boosts event streaming data platform

The vendor's real-time event streaming platform gets a boost as the newest version adds support for Python machine learning models to help make sense of the data deluge.

Hazelcast is looking to make it easier for organizations to benefit from machine learning alongside event streaming data, with the recent release of the Hazelcast Jet 4.0 update.

Hazelcast, based in San Mateo, Calif., has a number of products in its portfolio, including an in-memory data grid (IMDG), as well as Hazelcast Jet, a real-time data streaming platform. Jet fits into a category of technology that helps organizations rapidly ingest data for business analytics and other use cases, including machine learning. Hazelcast Jet 4.0 adds capabilities that make it easier to run Python- and Java-based machine learning models.

The need for data streaming is a core part of digital transformation, according to Mike Gualtieri, vice president and principal analyst at Forrester Research.

"Enterprise digital transformations are on a fast path to real-time applications," Gualtieri said. "That means the ability to sense, think and act on data as it originates from myriad external and internal sources."

According to Gualtieri, Hazelcast Jet brings streaming data capability, which by definition is real time. He added that a key challenge with analyzing streaming data is enriching it with reference data, which is where Hazelcast IMDG can also play a role.

[Screenshot: Hazelcast Jet 4.0 dashboard view. The latest release of Hazelcast Jet helps improve the performance of the event streaming data platform.]

From in-memory data grid to streaming data

Scott McMahon, senior solutions architect at Hazelcast, said the company got its start as an in-memory caching layer back in 2008.


"We call it a data grid, but you can think of it as a cluster," McMahon said. "The idea is that it was all about keeping data in memory, scaling the data layer, and basically giving an entire storage layer that was all in the RAM memory of computers, so it was much faster."

He added that during the past four years, there has been growth in sensor data coming from different endpoints, as well as connected IoT devices. That growth brought a different way of looking at data: rather than putting data into a storage layer and then running analysis on it, the need evolved to process event stream data as it arrives.

"You basically have these infinite streams of data that are small, discrete, sort of messages, and they're just going to flow forever, you know, theoretically until those things stop," he said. "So, it requires a different way of processing that data; you have to do it in real time and you have to deal with distributed streams of events you have to process."

Hazelcast Jet 4.0 isn't an Apache Kafka competitor

Apache Kafka is among the most widely used event streaming technologies deployed today. According to McMahon, Kafka is best described as a messaging bus that helps move messages from one place to another.

"We don't view ourselves as a competitor to a message bus; we are a computation engine," McMahon said. "Kafka is probably the most common thing that we integrate with."

He added that Hazelcast Jet works alongside Kafka to help process messages, merge multiple event streams and enrich streams in real time with data stored in Hazelcast IMDG.
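To illustrate the pattern McMahon described, here is a minimal sketch of a Jet pipeline, written with Jet's Java pipeline API, that reads events from a Kafka topic and enriches each one with reference data held in a Hazelcast IMDG map. It assumes the hazelcast-jet-kafka connector is on the classpath; the broker address, the topic name "events" and the map name "reference" are placeholders chosen for the example, not values from Hazelcast.

```java
import java.util.Properties;

import com.hazelcast.jet.Jet;
import com.hazelcast.jet.JetInstance;
import com.hazelcast.jet.kafka.KafkaSources;
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;

public class EnrichmentPipeline {
    public static void main(String[] args) {
        // Standard Kafka consumer settings; the broker address is a placeholder.
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("auto.offset.reset", "earliest");

        Pipeline p = Pipeline.create();
        p.readFrom(KafkaSources.<String, String>kafka(props, "events"))
         .withoutTimestamps()
         // Enrich each Kafka record by looking up reference data in the
         // IMDG map named "reference" (a hypothetical map name).
         .mapUsingIMap("reference",
                 entry -> entry.getKey(),
                 (entry, refValue) -> entry.getValue() + " | " + refValue)
         .writeTo(Sinks.logger());

        // Start an embedded Jet node for the demo and run the streaming job.
        JetInstance jet = Jet.newJetInstance();
        jet.newJob(p).join();
    }
}
```

The mapUsingIMap step is what ties the streaming computation back to the data grid: each incoming record triggers a key lookup in the IMDG map, and the looked-up value is merged into the output in real time.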

Hazelcast Jet 4.0 improves machine learning

Hazelcast Jet enables users to operationalize machine learning models with event streaming data. McMahon explained that the 4.0 update includes a new Python inference runner, which enables instances of Python-based machine learning models to run in a distributed, parallel fashion.
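As a rough illustration of how that looks in practice, the sketch below wires a Python handler into a Jet pipeline through the mapUsingPython transform in the hazelcast-jet-python module. The base directory, the module name "model" and the use of a test item source are assumptions made for this example; the handler module is expected to expose a transform_list function that takes a list of strings and returns one output string per input.

```java
import com.hazelcast.jet.Jet;
import com.hazelcast.jet.JetInstance;
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.pipeline.test.TestSources;
import com.hazelcast.jet.python.PythonServiceConfig;
import com.hazelcast.jet.python.PythonTransforms;

public class PythonInferencePipeline {
    public static void main(String[] args) {
        // Points at a local directory containing model.py; the directory path
        // and module name are placeholders. The module's transform_list(items)
        // function receives a batch of input strings and returns the results.
        PythonServiceConfig pythonConfig = new PythonServiceConfig()
                .setBaseDir("/path/to/model_dir")
                .setHandlerModule("model");

        Pipeline p = Pipeline.create();
        p.readFrom(TestSources.itemStream(10))   // demo source emitting test events
         .withoutTimestamps()
         .map(Object::toString)                  // the Python stage exchanges strings
         .apply(PythonTransforms.mapUsingPython(pythonConfig))
         .setLocalParallelism(2)                 // parallel Python workers per member
         .writeTo(Sinks.logger());

        // Start an embedded Jet node for the demo and run the streaming job.
        JetInstance jet = Jet.newJetInstance();
        jet.newJob(p).join();
    }
}
```

The setLocalParallelism call controls how many parallel instances of the Python stage run on each cluster member, which is how the inference work gets distributed.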

Previous versions of Hazelcast Jet supported only Java-based machine learning models. Looking ahead to the 4.1 update, McMahon said it will add a C++-based inference runner to further extend the number of supported machine learning frameworks.
