
Confluent adds support for Apache Flink, unveils update

The streaming specialist has added a managed service for the popular open source compute layer so customers can use the tools of their choice to develop a data ecosystem.

Confluent on Tuesday launched a new managed service for Apache Flink that enables serverless data stream processing from the open source Apache Kafka platform.

In addition, the vendor unveiled the latest Confluent Cloud update, highlighted by Tableflow, a new feature that lets customers convert Kafka topics (categories of data), along with their schemas and metadata, to Apache Iceberg tables with one click.

Confluent Cloud for Apache Flink is now generally available across AWS, Google Cloud and Microsoft Azure. Tableflow, meanwhile, is currently part of an early access program with general availability coming soon, according to Confluent.

Confluent revealed the new managed service for Apache Flink and the Confluent Cloud update during Kafka Summit London 2024, a conference for Kafka users.

Based in Mountain View, Calif., Confluent is a streaming data vendor whose platform is built on Apache Kafka -- technology developed by the open source community to enable users to stream data in real time.

Confluent Cloud is a managed service for the vendor's cloud customers. In addition, the vendor offers Confluent Platform for on-premises users.

In May 2023, Confluent's platform update aimed to improve the quality of streaming data so that customers could better trust the data on which they base real-time decisions. Two months later, the vendor unveiled a new technology partner program to broaden the connectivity of the Confluent data ecosystem.

In addition to Confluent, streaming data specialists include Cloudera and Aiven. Tech giants such as AWS, Google and Microsoft are also among those that offer streaming data platforms.

Support for Flink

Apache Flink is a data processing framework for streaming data, a compute layer that lets enterprises filter, combine and enrich data in real time to support real-time analysis and decision-making.

Similar platforms include Amazon Kinesis Data Streams, Azure Event Hubs and Confluent's own proprietary platform.

However, by providing Confluent Cloud for Apache Flink, a dedicated managed service for Apache Flink, Confluent is enabling customers to use Confluent's overall Kafka-based streaming data platform in conjunction with the open source compute layer preferred by many enterprises.

Among those that use Flink to fuel streaming data are Airbnb, Uber, Netflix and TikTok, according to Confluent.

Meanwhile, common applications for streaming data include customer service, e-commerce, fraud detection, supply chain management and predictive maintenance.

As a result of Flink's popularity and various applications, Confluent's launch of a dedicated managed service for Apache Flink is notable, according to Matt Aslett, an analyst at ISG's Ventana Research.

"The general availability of Confluent Cloud for Apache Flink is a significant addition that expands Confluent's ability to address stream processing [beyond] its existing expertise in event and data streaming," he said.

In particular, support for Flink provides Confluent customers with a distributed processing engine to develop applications for SQL-based streaming and batch analytics of event data, Aslett continued.

Specific capabilities that Confluent Cloud for Apache Flink now enables include the following:

  • An easy way to filter, join and enrich data.
  • Stream processing at scale without having to manage an infrastructure.
  • Kafka and Flink in a unified platform that includes monitoring, security and governance.
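
To make "filter, join and enrich" concrete, here is a minimal sketch in plain Python. It does not use Flink or any Confluent API; the `Order` type, the `process_orders` function, the customer lookup table and the 100.0 tier threshold are all hypothetical names and values chosen for illustration. It only shows the shape of the three operations applied to records as they arrive.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator


@dataclass(frozen=True)
class Order:
    """Hypothetical event type standing in for a record on a stream."""
    order_id: str
    customer_id: str
    amount: float


def process_orders(
    orders: Iterable[Order],
    customers: Dict[str, str],
    min_amount: float,
) -> Iterator[dict]:
    """Filter, join and enrich orders one record at a time."""
    for order in orders:
        # Filter: drop orders below the threshold.
        if order.amount < min_amount:
            continue
        # Join: look up the customer; skip orders with no match (inner join).
        name = customers.get(order.customer_id)
        if name is None:
            continue
        # Enrich: add the joined name and a derived tier field.
        yield {
            "order_id": order.order_id,
            "customer_name": name,
            "amount": order.amount,
            "tier": "large" if order.amount >= 100.0 else "standard",
        }


# Illustrative inputs only; a real stream would be unbounded.
sample_orders = [
    Order("o1", "c1", 250.0),
    Order("o2", "c2", 5.0),    # removed by the filter
    Order("o3", "c9", 80.0),   # removed by the join (unknown customer)
    Order("o4", "c2", 40.0),
]
sample_customers = {"c1": "Ada", "c2": "Lin"}

results = list(process_orders(sample_orders, sample_customers, min_amount=10.0))
```

Because `process_orders` is a generator, each record is handled as it arrives rather than after the whole input is collected, which is the basic difference between stream and batch processing that the article describes.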

The impetus for developing Confluent Cloud for Apache Flink came from the vendor's customers, according to Jean-Sébastien Brunner, Confluent's director of product management.

Flink is a popular option for developers given that it can process large amounts of data with low latency and is designed for both batch file processing and streaming data, he noted. In addition, many Confluent users were already using Flink in concert with Confluent on their own.

As a result, support for Flink was a natural development for Confluent.

"We developed Confluent Cloud for Apache Flink because customers needed an easy and powerful way to process their data streams to derive the most value from them," Brunner said.

Figure: How data streaming works, an explanation of the flow of a data streaming pipeline.

Additional new capabilities

In addition to a managed service for Apache Flink, Confluent unveiled a new version of Confluent Cloud, which features Tableflow.

Tableflow is a tool that makes it easier for customers to create Apache Iceberg tables from Kafka topics, schemas and metadata. Users can quickly merge information to create data tables in data warehouses and data lakes that can then be used to inform analysis and decisions.

Tableflow lies within Confluent Cloud's Kora Engine, Confluent's cloud-native engine for Kafka, and aims to improve upon Kora's previous ability to feed real-time analytics.

Beyond enabling users to create Iceberg tables with one click, Tableflow includes Confluent's Stream Governance capabilities and was developed to ensure that such tables are kept current with the latest streaming data from source systems.
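
The general idea of turning a topic's records into table rows that stay current can be sketched in a few lines of plain Python. This is a toy illustration, not how Tableflow or Apache Iceberg is implemented: the `TopicTable` class, its `append` method and the dictionary-based schema are assumptions made up for the example, and the "table" is just an in-memory list.

```python
from typing import Callable, Dict, Iterable, List


class TopicTable:
    """Toy in-memory table fed by records from a single topic."""

    def __init__(self, schema: Dict[str, Callable]):
        # Maps column name to a function that casts raw values to the column type.
        self.schema = schema
        self.rows: List[dict] = []

    def append(self, records: Iterable[dict]) -> int:
        """Project each record onto the schema and append it as a row."""
        added = 0
        for record in records:
            row = {}
            for column, cast in self.schema.items():
                value = record.get(column)
                # Missing fields become None; fields outside the schema are dropped.
                row[column] = cast(value) if value is not None else None
            self.rows.append(row)
            added += 1
        return added


# Illustrative schema and records only.
table = TopicTable({"order_id": str, "amount": float})
table.append([{"order_id": "o1", "amount": "19.99", "debug": True}])
# Appending later records is what keeps the table current with the stream.
table.append([{"order_id": "o2"}])
```

The schema does the work here: it decides which fields become columns and what type each column holds, so downstream queries see a consistent table even when individual records carry extra or missing fields.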

"Tableflow is an interesting addition to Confluent Cloud that facilitates integration with data warehouses and data lakehouses," Aslett said.

In addition, the feature is useful because it eliminates the gap between batch files and streaming data by enabling users to turn Kafka topics, schemas and metadata into Iceberg tables, he continued.

That transformation of information into Iceberg tables, meanwhile, will help enterprises gain greater insight into their real-time data.

"In addition to facilitating long-term storage of event data, Tableflow will enable enterprises to take a more holistic view of their streaming and batch data," Aslett said. "[ISG's Ventana] asserts that by 2027, more than one-half of enterprises will adapt their data management and governance processes, taking a holistic approach to managing and governing data in motion alongside data at rest."

In addition to Tableflow, the latest Confluent Cloud update includes the following:

  • Stream Governance -- including Data Portal to discover and explore Kafka topics on Confluent Cloud -- automatically enabled in all customers' environments so users can better track data lineage and address data quality.
  • New partnerships and connectors that broaden the Confluent ecosystem, including partnerships with Kinetica, Redis and SingleStore, among many others.
  • New enterprise clusters, available on AWS and Azure, designed to help customers lower costs by improving throughput.

Of particular benefit is Data Portal in Stream Governance, according to Aslett. The tool will enable users to discover and use data products based on their metadata.

"Data Portal builds on the schema and lineage capabilities of Stream Governance … and utilizes them as a foundation for data discovery and exploration," he said.

Future plans

Now that Confluent Cloud for Apache Flink and the latest Confluent Cloud update have been unveiled, Aslett said Confluent would be wise to do more to enable SQL-based processing of streaming data.

Confluent Cloud for Apache Flink provides some such support, but it is only a start.

In addition, while many vendors have integrated with generative AI platforms to simplify the use of their tools, Confluent has not yet introduced similar generative AI capabilities, Aslett noted.

Independent vendors such as Informatica and Dremio have unveiled generative AI capabilities aimed at making their platforms accessible to a broader audience and helping existing users be more efficient by reducing time-consuming tasks. Likewise, tech giants AWS, Google and Microsoft have done the same.

"We haven't seen a lot from Confluent yet in terms of making use of generative AI to lower the skills barriers to working with streaming data," Aslett said. "As such, we would anticipate the company working on GenAI-based digital assistant capabilities to help customers accelerate the development and adoption of applications based on streaming data."

Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.
