
Confluent launches Tableflow to ease use of streaming data
The vendor's new feature enables users to convert event data to tables that developers and engineers can search and discover to inform AI and BI applications.
Confluent on Wednesday launched Tableflow, a feature within Confluent Cloud that enables users to easily convert event data to open table formats so the event data can be integrated, accessed and analyzed.
Apache Iceberg and Delta Lake are the two most popular open table formats. Both, along with other open table formats such as Apache Hudi, provide a unified interface resembling a database for both streaming and batch data processing, which simplifies integration with data storage and analytics platforms.
Support for Apache Iceberg is now generally available, enabling users to represent Apache Kafka topics -- named categories of records, together with their schemas and metadata -- as Iceberg tables that can be loaded into any data warehouse, data lakehouse or analytics engine. Support for Delta Lake, meanwhile, is now in early access.
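The core idea behind that conversion is materializing a stream of event records into a columnar table snapshot that batch engines can query. The following is a minimal, hypothetical sketch of that principle in plain Python -- the field names and `events_to_table` function are invented for illustration and are not Confluent's Tableflow implementation:

```python
# Hypothetical sketch: materializing a stream of Kafka-style event records
# into a column-oriented "table" snapshot. Illustrates the stream-to-table
# idea only; Tableflow itself writes real Iceberg/Delta table files.

def events_to_table(events, schema):
    """Collect a list of event dicts into columns named by the schema."""
    table = {field: [] for field in schema}
    for event in events:
        for field in schema:
            table[field].append(event.get(field))  # None if a field is absent
    return table

events = [
    {"order_id": 1, "amount": 9.99},
    {"order_id": 2, "amount": 14.50},
]
table = events_to_table(events, schema=("order_id", "amount"))
print(table["order_id"])  # -> [1, 2]
```

A real table format adds schema evolution, partitioning and transactional metadata on top of this basic column layout, which is what makes the result queryable by warehouses and lakehouses.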
Because Tableflow simplifies access to data that can be used to inform analytics and AI applications, its launch is valuable for Confluent customers, according to Stephen Catanzano, an analyst at Enterprise Strategy Group, now part of Omdia, a division of Informa TechTarget.
"The general availability of Tableflow is significant because it enhances data accessibility, governance, and integration with Apache Iceberg and Delta Lake, making it easier to manage real-time and batch processing workloads," he said.
Based in Mountain View, Calif., Confluent is a streaming data specialist whose platform is built on Apache Kafka, an open source technology that enables users to stream data in real time.
Tableflow was first unveiled in preview in March 2024.
New capabilities
OpenAI's November 2022 launch of ChatGPT, which represented a significant improvement in generative AI technology, led to surging interest in AI development.
Generative AI tools such as assistants and agents can make employees better informed and more efficient, fueling an enterprise's growth. Generative AI development, however, is not simple, requiring engineers to combine an organization's proprietary data with generative AI models so an application can understand that organization's unique characteristics and be of use to its employees.
Access to data, therefore, is critical. So is ensuring data quality.
Tableflow simplifies integrations between real-time operational data and the systems developers and engineers use to store data and feed retrieval-augmented generation (RAG) and other data pipelines that inform analytics and AI applications. In addition, because Tableflow is part of Confluent Cloud, data governance is applied as data is generated, ensuring it meets organizational and regulatory standards.
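Applying governance "as data is generated" typically means checking each event against a declared schema before it enters the stream. A minimal sketch of that principle, with an invented schema -- this is not Confluent's Schema Registry API:

```python
# Hypothetical sketch of governance at produce time: reject events that do
# not match a declared schema before they enter the stream. The field names
# are invented; real systems use a schema registry for this.

REQUIRED_FIELDS = {"order_id": int, "amount": float}

def validate(event):
    """Return True only if every required field is present with the right type."""
    return all(
        field in event and isinstance(event[field], expected)
        for field, expected in REQUIRED_FIELDS.items()
    )

good = {"order_id": 7, "amount": 19.99}
bad = {"order_id": "seven"}  # wrong type, and missing "amount"
print(validate(good), validate(bad))  # -> True False
```

Enforcing checks like this upstream, rather than in each downstream consumer, is what keeps tables derived from the stream trustworthy.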
Kevin Petrie, an analyst at BARC U.S., noted that agentic AI development is on the rise, with agents often deployed for real-time applications. Tables, meanwhile, are the leading source of AI and machine learning model inputs, he continued.
Converting streams of operational data to open format tables, therefore, meets the needs of many developers. In conjunction with Confluent's support for numerous vector databases -- a key means of combining structured and unstructured data as well as enabling similarity searches to discover relevant data -- Tableflow has the potential to simplify AI development.
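The similarity search a vector database performs can be sketched in a few lines: rank stored embeddings by cosine similarity to a query vector. Real systems such as MongoDB and Pinecone use approximate indexes at scale; the brute-force version below, with invented two-dimensional vectors, shows the underlying idea:

```python
import math

# Hypothetical sketch of vector similarity search: score every stored
# embedding against the query by cosine similarity and return the best match.
# Production vector databases index approximately rather than scanning.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

store = {
    "doc_a": (1.0, 0.0),
    "doc_b": (0.6, 0.8),
}
query = (0.7, 0.7)
best = max(store, key=lambda k: cosine(store[k], query))
print(best)  # -> doc_b
```

In a RAG workflow, the retrieved documents are then inserted into the model's prompt as grounding context.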
"Tableflow addresses a compelling market opportunity," Petrie said. "Together, [Tableflow and vector search] help AI adopters prompt their GenAI language models with proprietary data using either vector or relational RAG workflows."
In addition to converting event data to open table formats, Tableflow now lets users choose the storage bucket -- for example, Amazon S3 -- that best meets their needs for Iceberg and Delta tables. It also includes direct integrations with platforms from Confluent partners, including AWS, Dremio, Snowflake and Starburst.
By allowing Tableflow users to choose their preferred storage bucket, Confluent is enabling customers to store data according to their own needs rather than forcing them to adjust to Confluent's preferred storage method, according to Catanzano.
"The standout feature in Tableflow's GA release is the bring-your-own-storage capability," he said. "It provides customers with full control over their data storage while ensuring compliance with unique data ownership requirements."
Regarding the impetus for developing Tableflow, Adi Polak, Confluent's director of advocacy and developer experience engineering, cited the explosion of interest in AI development as one factor.
Given the complexity of AI development, many enterprises have struggled to develop and deploy AI tools effectively. Providing capabilities that simplify development is, therefore, an opportunity for vendors.
"Many users are questioning whether they have the right streaming and analytics strategy in place, which has brought the challenges and complexities of streaming operational data into the analytical world to light," Polak said.
Beyond the general availability of Tableflow, Confluent introduced new Confluent Cloud for Apache Flink features.
Apache Flink is an open source stream processing framework that enables users to filter, combine and enrich real-time data as it's ingested to foster real-time decision-making.
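The filter/combine/enrich pattern can be illustrated without Flink itself. Below, plain Python generators stand in for a streaming pipeline -- the event fields and lookup table are invented, and Flink's actual DataStream and SQL APIs handle this distributed and fault-tolerantly:

```python
# Hypothetical sketch of the filter-and-enrich stream processing style the
# article describes, using generators as a stand-in for a Flink pipeline.

def source(events):
    yield from events

def filter_large(stream, threshold):
    # Drop events below the threshold as they flow through.
    return (e for e in stream if e["amount"] >= threshold)

def enrich(stream, customers):
    # Join each event with reference data (a customer lookup table).
    return ({**e, "name": customers[e["customer_id"]]} for e in stream)

events = [
    {"customer_id": 1, "amount": 250.0},
    {"customer_id": 2, "amount": 5.0},
]
customers = {1: "Acme", 2: "Globex"}
result = list(enrich(filter_large(source(events), threshold=100.0), customers))
print(result)  # -> [{'customer_id': 1, 'amount': 250.0, 'name': 'Acme'}]
```

Because each stage consumes events lazily as they arrive, the same shape scales from this toy example to continuous, low-latency processing.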
Confluent's new features include the following:
- Flink Native Inference, which simplifies development workflows by letting users run open source or fine-tuned models directly in Confluent Cloud.
- Flink search, a feature that lets users access data from multiple vector databases, including MongoDB and Pinecone, through a single interface.
- Machine learning functions that simplify data science tasks such as forecasting and anomaly detection.
Flink Native Inference is perhaps the highlight given that it increases flexibility, security and cost control by allowing users to run models in Confluent Cloud rather than a third-party environment, according to Catanzano.
"Flink Native Inference is particularly significant," he said.
In the future
As evidenced by the launch of Tableflow, Confluent is focused on making it easier to use streaming data to train and update AI models and applications, according to Polak.
Toward that end, Confluent would be wise to broaden its partnership network and add integrations with AI development frameworks, according to Petrie.
"Confluent's ecosystem is a strategic piece of the puzzle," he said. "I might recommend exploring partnerships within the developer space with companies such as LangChain."
Other AI development frameworks include Hugging Face, TensorFlow and PyTorch.
Catanzano also suggested that Confluent could grow by adding more integrations with AI and machine learning platforms.
"Looking ahead, Confluent may continue its evolution by deepening AI and ML integrations," he said. "It's a direction most data vendors are going."
Eric Avidon is a senior news writer for Informa TechTarget and a journalist with more than 25 years of experience. He covers analytics and data management.