Post-lawsuit, Splunk and Cribl meet again in data pipelines

Weeks after a jury awarded Splunk $1 in its lawsuit against Cribl, the two vendors remain on a collision course, this time in the realm of data pipelines and federated analytics.

As observability data growth continues, spurred on by generative AI hype, log management rivals Splunk and Cribl are meeting again on a fresh battleground -- and this time, it's more on Cribl's turf than Splunk's.

The relationship between the former partners soured in 2022, when Splunk filed a lawsuit against Cribl alleging copyright and patent infringement. Cribl, founded in 2017 by an ex-Splunker, initially built its business on log management software that reduced the amount of data sent to Splunk, where users had to pay for data ingest -- cutting costs for users and cutting into Splunk's revenues. In April, that suit, which had morphed into a matter of license and partnership agreement violations, went to trial, where a jury found that Cribl had violated Splunk's General Terms for Splunk Enterprise and awarded Splunk $1 in damages.

Much has shifted in the two companies' business models since the lawsuit was filed, even before Cisco announced its intent to acquire Splunk in September 2023. Cribl expanded beyond log management into data pipelines, federated search and its own data lake. Splunk also adjusted its pricing starting in 2021, with options for paying by workload rather than data ingest, along with its own federated search that encompassed data stored in cheaper Amazon S3 buckets.

As of this month, both companies were touting new product features meant to make data management more efficient and inclusive of disparate data sources. Splunk expanded its unified data platform during its annual .conf user conference to include Pipeline Builders and an Ingest Processor for Splunk Cloud, along with federated analytics features.

At its own conference the same week, Cribl released its first features since the lawsuit verdict, which included performance improvements and expanded third-party support for its federated search. Both companies also released new AI assistants.

The similarities weren't lost on industry observers.

"The Cribl-like capabilities in Splunk are good to see," said Gregg Siegfried, an analyst at Gartner. "Maybe too little too late, but I welcome them to the telemetry pipeline business. It is getting to be a popular place."

Faya Peng, vice president of product management at Splunk, presents new data management features during a .conf24 keynote presentation.

Splunk Data Management vs. Cribl Search

There are plenty of similarities between Splunk and Cribl data pipelines and federated search, and both rolled out new features last week meant to boost the performance of federated data analysis.

"Last year, we released Federated Search so that you can remotely search S3 data from Splunk and correlate it with your Splunk data … for historical and audit use cases," said Faya Peng, vice president of product management at Splunk, during a .conf keynote. "Now with Federated Analytics, starting with Amazon Security Lake, you can selectively fetch data … and build it into a short-term index. And this enables higher-performance use cases like monitoring and ad hoc investigations."

Cribl added Real Time Fast Query, a feature that will expand Cribl Search support beyond archival data in object stores such as S3 to access fresh data with faster response times. The product now also supports querying across multiple indexes, which one prospective customer had been waiting for before replacing Amazon OpenSearch with Cribl Search. 

"Prometheus data is [also] something we're interested in testing out," said Bob Chen, director of infrastructure engineering at iHerb, an online retailer for health and wellness products in Irvine, Calif. "[It] looks like we're ready to test it again and just need to find the bandwidth to do so." 

However, Splunk and Cribl take different approaches to accessing data sources within search and analytics workflows. Cribl's Search documentation describes a process that doesn't require the selective fetching and indexing described by Splunk's Peng.

"Cribl Search … [can] seamlessly analyze all data right at its source," the website states. "Cribl Search allows users to search and analyze data wherever it is located -- from debug logs at the edge to archived data in cold storage -- using a single … query language."

A Cribl Search data sheet linked on the website adds that "[it] searches across multiple data stores and multiple data types … [using] a 'search then forward' model instead of the legacy 'forward then search' approach."
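
The difference between the two models can be illustrated with a minimal Python sketch. It is not based on either vendor's code or APIs; the function names, the in-memory list standing in for a remote data store, and the record fields are all hypothetical. It only shows that both models return the same results while moving different amounts of data.

```python
from typing import Callable, Iterable

Record = dict[str, str]  # hypothetical log record, e.g. {"level": "ERROR", "msg": "..."}


def forward_then_search(
    source: Iterable[Record], matches: Callable[[Record], bool]
) -> tuple[list[Record], int]:
    """Copy every record into a central index first, then query the index."""
    central_index = list(source)  # all records are moved before any filtering
    hits = [r for r in central_index if matches(r)]
    return hits, len(central_index)  # (results, records moved)


def search_then_forward(
    source: Iterable[Record], matches: Callable[[Record], bool]
) -> tuple[list[Record], int]:
    """Evaluate the query where the data lives and move only the matches."""
    hits = [r for r in source if matches(r)]  # filtering happens at the source
    return hits, len(hits)  # (results, records moved)


if __name__ == "__main__":
    # Hypothetical sample data standing in for an object store or edge node.
    logs = [
        {"level": "INFO", "msg": "started"},
        {"level": "ERROR", "msg": "disk full"},
        {"level": "INFO", "msg": "heartbeat"},
        {"level": "INFO", "msg": "stopped"},
    ]

    def is_error(record: Record) -> bool:
        return record["level"] == "ERROR"

    for model in (forward_then_search, search_then_forward):
        hits, moved = model(logs, is_error)
        print(f"{model.__name__}: {len(hits)} hit(s), {moved} record(s) moved")
```

In this toy case, the first model moves all four records to find one match, while the second moves only the match. A real system would also have to account for the compute spent querying at the source.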

As of this month, Cribl Search supports more data sources than Splunk's Federated Search or Federated Analytics, including time-series databases, cloud data warehouses and search platforms such as Prometheus, Snowflake, Elasticsearch, Azure Data Explorer and Amazon OpenSearch.

Cribl and Splunk aren't alone in trying to cash in on the data pipeline concept. Other observability vendors including Dynatrace and Mezmo have introduced similar offerings in the last 18 months. Splunk is still getting started and is positioning Federated Analytics primarily for its Splunk Enterprise Security platform rather than for observability so far, according to Andi Mann, global CTO and founder of Sageable, a tech advisory and consulting firm in Boulder, Colo., who served as chief technology advocate at Splunk from 2015 to 2021.

"It looks for all the world like [Splunk is] trying to re-create a Cribl-like feature, finally," Mann said. "Splunk should have had Cribl as a feature five years ago but missed that opportunity in product planning, and couldn't fix it in court. Now Cribl is already well ahead of Splunk in their niche and has been for some time."

One customer of both vendors said their federated search and analytics features could potentially be complementary, given their different data access methods. As he sees it, Splunk is likely to make up for data ingest revenue by charging for ingest processing and analytics with Federated Analytics based on its Splunk Virtual Compute (SVC) workload pricing.

Splunk's existing Federated Search* isn't priced this way, and Splunk has not disclosed how it plans to price the new Federated Analytics product. But Steve Koelpin, lead Splunk engineer for a Fortune 1000 company in the Midwest, speculated that Splunk might price Federated Analytics differently than Federated Search when it reaches general availability.

"Splunk enabling Federated [Analytics] to non-Splunk bucketed data is a really good idea but goes against their ingest-based license model," Koelpin said. "Now if customers are on SVC, that's a different story. Federated searching can be very slow and will chew through SVC [resources]."

That resource cost might be worthwhile, depending on where data resides, Koelpin added.

"It's better to clean, transform, structure and index data into Splunk, and if that full fidelity copy is needed, just store a copy of that into S3," he said. "But if you [have] a lot of full fidelity copies of data and want to run a retroactive security investigation but don't want to index 100TB of data, then [Cribl] Search is the way to go."

In the age of AI, Cisco/Splunk argue size matters

Cribl has a lead on Splunk in data pipelines, but its AI assistant, Cribl Copilot, remains limited in its first release, analysts said.

"Cribl has done a great job of sticking to [its] knitting, maintaining focus on the observability data pipeline problem," said Jon Brown, an analyst at TechTarget's Enterprise Strategy Group. "[Cribl] Copilot is a 'me-too,' and kind of light at that. [Cribl officials] openly admitted that they couldn't really get it to work right, so they've limited functionality for now. I appreciate their transparency." 

A Cribl spokesperson explained further in an email to TechTarget Editorial:

"As a first version the best part of Copilot is its continual improvement and learning as it gets used more. Today, if you ask it something that it doesn't know, it will simply admit it doesn't know instead of providing misleading information. We actually think this is very important. We know we're dealing with live environments in mission critical applications, so rather than allowing the AI to hallucinate we'd rather it halts."

Cribl Copilot is limited to Cribl.Cloud, although support for on-premises versions is on the near-term roadmap, the spokesperson said. The company's website describes it as primarily focused on easier setup of Cribl's products, pipelines and search queries, rather than analyzing observability data itself for application and infrastructure performance insights.

Splunk rolled out a Configuration Assistant for its IT Service Intelligence product that performs a similar set of functions, along with drift detection and alert threshold analysis, as well as an AI-driven natural language interface for its Splunk Search Processing Language. Those tools are generally available. Splunk also unveiled a new AI Assistant in Observability Cloud, now in private preview, and an AI Assistant in Security that will reach private preview in August.

Cribl can route data to other AIOps and security automation tools, but Splunk offers its own, plus fresh integrations with Cisco security and networking products. In contrast to the current version of Cribl Copilot, Splunk's AI Assistant in Observability Cloud automates the analysis of observability data and system documentation for troubleshooting, and provides step-by-step recommendations to remediate issues, according to .conf keynote demos.

As an emerging company, Cribl is naturally at a different scale and stage of maturity from Splunk and Cisco, and as such, is an increasingly ripe target for acquisition, according to both Brown and Mann.

"Given where Cisco is going with Splunk, Cribl appears to have a sustainable competitive advantage, and continues to out-innovate in the pipeline, even if ultimately it will still likely end up as a feature in some other mega-vendor's offering," Mann said.

Time will tell if that mega-vendor could actually be Cisco, but Brown said a venture capital firm or holding company is also a possibility.

"I worry about the future of this company because they seem like such an obvious buy for … holding companies or VCs for software companies that might destroy some of the uniqueness of this committed group of folks," he said. "Uniqueness like free training and licensing on a per-use basis instead of more commonly used metrics like data stored or data transformed."

*Update on July 8, 2024: Language changed to clarify differences between Splunk Federated Search and Federated Analytics pricing. 

Beth Pariseau, senior news writer for TechTarget Editorial, is an award-winning veteran of IT journalism covering DevOps. Have a tip? Email her or reach out @PariseauTT.

Hao Yang, head of AI at Splunk, presents during a .conf24 keynote presentation.
