
Kafka streaming data gets governance from Confluent

Apache Kafka event streaming from Confluent helps online grocery provider Instacart deliver milk to customers when they need it, by helping to power real-time inventory updates.

Apache Kafka data streaming vendor Confluent enhanced its platform with governance capabilities that give users more control and security, along with the ability to meet compliance requirements.

Streaming data lets organizations act on events in real time, though it comes with a few challenges, governance among them.

Confluent, at the virtual Kafka Summit America 2021 on Sept. 14, said its new Stream Governance feature is now generally available to users of the Confluent Cloud platform.

Confluent's cloud platform is based on the open source Apache Kafka event streaming technology, providing organizations with a managed service.

Stream Governance also integrates a stream lineage capability that enables administrators to track where data comes from and where it is going.

How Instacart uses Kafka event streams to improve grocery delivery

Among the users speaking at the conference was Dusty Pearce, vice president of infrastructure at Instacart. In a keynote chat with Confluent co-founder Jun Rao, Pearce detailed how the online grocery retailer has used Kafka and Confluent's services to help meet growing demand during the pandemic.

"One of the things that happens with really rapid growth is you don't have the staff and you can't wait around to build the engineering team to build these very large systems and take care of them at scale," Pearce said. "So we knew very early on that we wanted an aggressive managed services strategy."

Dusty Pearce, vice president of infrastructure at Instacart.

Pearce noted that data in motion, enabled by Kafka, is the backbone of many systems at Instacart that help to improve the customer experience of ordering groceries online. Streaming data also helps power Instacart's inventory dashboard, which Pearce emphasized is fundamental to his company's business.

"If you order milk and it's not there, you get upset, so we don't want you to order something that's not there so we're constantly trying to get updates," Pearce said.

What Confluent Stream Governance brings to Kafka event data streaming

During a keynote, Confluent CEO and co-founder Jay Kreps explained what the new Stream Governance capabilities add to the vendor's cloud platform.

Kreps noted that Stream Governance has several components. One component involves stream quality and helping to ensure that data fits a certain schema so that it is usable. With the stream quality capability, users will be able to check and make sure that a given stream is compatible with the organization's data requirements.
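
As a rough illustration of what checking a record against a schema can look like, the sketch below validates an event against a set of required fields and types. It is not Confluent's implementation or API; the schema, field names and the schema_violations helper are all hypothetical and chosen only to echo the inventory example.

```python
# Hypothetical schema for an inventory-update event.
# Field names and types are illustrative assumptions, not from Confluent or Instacart.
INVENTORY_SCHEMA = {
    "store_id": str,
    "sku": str,
    "in_stock": bool,
    "quantity": int,
}


def schema_violations(record, schema):
    """Return a list of problems; an empty list means the record fits the schema."""
    problems = []
    for field, expected_type in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        # Exact type match, so that True/False is not accepted where an int is expected.
        elif type(record[field]) is not expected_type:
            actual = type(record[field]).__name__
            problems.append(f"{field}: expected {expected_type.__name__}, got {actual}")
    for field in record:
        if field not in schema:
            problems.append(f"unexpected field: {field}")
    return problems


good = {"store_id": "s-17", "sku": "milk-1l", "in_stock": True, "quantity": 12}
bad = {"store_id": "s-17", "sku": "milk-1l", "in_stock": "yes"}

print(schema_violations(good, INVENTORY_SCHEMA))
print(schema_violations(bad, INVENTORY_SCHEMA))
```

A producer or consumer could run a check like this before accepting a record, rejecting or quarantining anything that returns a non-empty list.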

Another key component is the stream catalog, which Kreps said will enable users to discover data streams.

With the stream catalog, a user will also be able to categorize and tag streams as well as determine who owns a stream. The catalog is directly integrated with a stream lineage capability that not only helps to identify where data is coming from, but also provides a visualization of how data is flowing.

"This is like Google Maps for all your data flow created in real time off of the actual flow of data," Kreps said of stream lineage. "There is no extra work you have to do to instrument anything; it's just created as data flows and as those flows change. It's always got an up-to-date version of what got where and when it got there."
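
Conceptually, lineage of this kind can be modeled as a directed graph of data flows. The sketch below is only an illustration of that idea, not how Confluent builds its lineage view; the stream names and the build_graph and reachable helpers are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical observed flows, each as (where data came from, where it went).
FLOWS = [
    ("store-feeds", "inventory-updates"),
    ("orders", "inventory-updates"),
    ("inventory-updates", "inventory-dashboard"),
    ("inventory-updates", "search-index"),
]


def build_graph(flows):
    """Index the flows in both directions so lineage can be traced either way."""
    upstream = defaultdict(set)
    downstream = defaultdict(set)
    for src, dst in flows:
        downstream[src].add(dst)
        upstream[dst].add(src)
    return upstream, downstream


def reachable(start, edges):
    """Breadth-first walk returning every node reachable from start."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


upstream, downstream = build_graph(FLOWS)

# Where does the dashboard's data come from?
print(sorted(reachable("inventory-dashboard", upstream)))
# What is affected if the store feeds change?
print(sorted(reachable("store-feeds", downstream)))
```

Walking the graph upstream answers where a stream's data came from, and walking it downstream answers where that data ends up.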
