E-Handbook: Metamorphosis of Kafka, Confluent event streaming technologies (Article 1 of 4)

When data stream storage and analytics are special events

As data sources and their content multiply, data management, analytics and app development teams are shifting their focus from traditional data-at-rest architectures to data-in-motion infrastructures. Real-time analytics on those continuous streams promises better business decision-making, service delivery, network optimization, and customer interaction and retention. The driving forces behind this anticipated explosion in event streaming technologies over the next several years are the familiar four horsemen of today's technology: big data, the cloud, IoT and AI.

Three-fourths of companies "will shift from piloting to operationalizing AI," resulting in a five-fold increase in streaming data and analytics infrastructures by 2025, according to a June 2020 Gartner report on the top 10 trends in data and analytics. Over the next five years, MarketsandMarkets expects the global streaming and analytics market to grow 25% annually and surpass $38 billion, as companies "facing challenges related to the inflexibility of data infrastructure" start to "take a more in-depth look at capturing data about streaming events."

Vendors whose technologies manage and analyze both continuous streams of data in real time and historical data occupy the sweet spot of this transition to event streaming across a wide spectrum of businesses and industries. Chief among them is a well-known tag team that shares bloodlines: the popular Apache Kafka open source event streaming platform and Confluent, founded by Kafka's creators to commercialize it. Confluent's platform simplifies connecting data sources and systems to Kafka, building applications on Kafka, and securing, monitoring and managing Kafka infrastructure.
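To make that concrete, here is a minimal sketch of what connecting an application to Kafka looks like at the lowest level: a producer publishing a single event to a topic with the plain Apache Kafka client API. The broker address, topic name and payload are illustrative placeholders; Confluent's platform layers connectors, schema management, security and monitoring on top of this same primitive.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class EventProducer {
    public static void main(String[] args) {
        // Connection and serialization settings; the broker address is a placeholder.
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        // Publish one event to a hypothetical "orders" topic. Any number of
        // downstream consumers and stream processors can read the same record
        // independently, which is what makes the log a shared source of events.
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("orders", "order-1001", "{\"status\":\"created\"}"));
            producer.flush();
        }
    }
}
```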

Jay Kreps, CEO of Confluent and co-creator of Kafka, called the "paradigm shift toward event streams" the "future of data" during his Kafka Summit NYC 2019 keynote. He pointed to four major industries -- transportation, banking, retail and entertainment -- where event streaming technologies are the "central nervous system" for real-time applications such as sensor diagnostics, fraud detection, personalization, data transformation, machine learning models and recommendations.

This handbook examines the symbiotic relationship between Kafka and Confluent and the increasing popularity of live event streaming. Learn about Kafka's deployment and management features; best practices for creating and managing data pipelines and scalable, real-time data streaming; and Confluent's latest developments, including Project Metamorphosis to improve Kafka scalability in the cloud, elasticity efforts to automate scaling and infinite retention to store event streaming data for long periods of time.
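For context on the retention piece: in Kafka, how long a topic keeps its event log is governed by topic-level settings such as retention.ms, and setting that value to -1 removes the time limit. The sketch below, which assumes a local broker and a hypothetical topic named "orders", applies that setting through Kafka's AdminClient API; it illustrates only the underlying knob, not Confluent's managed infinite-retention feature, which handles the storage behind long-lived topics.

```java
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class RemoveRetentionLimit {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address

        try (AdminClient admin = AdminClient.create(props)) {
            // retention.ms = -1 disables time-based deletion for the topic,
            // so records stay in the log indefinitely (subject to any size limit).
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "orders");
            AlterConfigOp op = new AlterConfigOp(
                    new ConfigEntry("retention.ms", "-1"), AlterConfigOp.OpType.SET);
            Map<ConfigResource, Collection<AlterConfigOp>> update = Map.of(topic, List.of(op));
            admin.incrementalAlterConfigs(update).all().get();
        }
    }
}
```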