
Data quality specialist Soda secures $14M to fuel expansion

The Europe-based vendor intends to use its latest funding to shift its growth focus to the U.S. while also continuing to invest in its AI-fueled observability platform.

Soda Data, a specialist in the emerging data quality sector, on Thursday revealed it has raised $14 million in venture capital funding that it plans to use to grow its presence in the U.S.

The extension of the vendor's Series A financing, with existing investors Singular and Point Nine adding to their previous investment in 2021, brings Soda's total funding to $27.5 million.

Based in Brussels with a U.S. headquarters in New York City, Soda is a 2018 startup whose platform is designed to help ensure data quality with data observability, a technique that enables customers to monitor data throughout its lifecycle to address quality problems as they emerge.

With both data volume and data complexity increasing, even teams of humans are unable to monitor every data point or data set for quality as it moves from ingestion through pipelines to fuel analysis and insight.

In response, data observability specialists have emerged with tools that automate the monitoring process -- often incorporating AI -- to discover anomalies and other potential problems. The tools then alert users so poor quality data doesn't make it into dashboards, models, applications and other data products, leading to misinformed decisions.
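
To make the idea of automated monitoring concrete, here is a minimal sketch in Python. It is not Soda's method or any vendor's API; the function name `is_anomalous`, the sample row counts and the three-standard-deviation threshold are all assumptions chosen for illustration. It flags a pipeline run whose row count strays far from recent history, which is the point at which a tool would raise an alert.

```python
from statistics import mean, stdev

def is_anomalous(history, latest, threshold=3.0):
    """Flag `latest` if it sits more than `threshold` standard
    deviations away from the mean of `history` (a simple z-score test).

    Hypothetical helper for illustration; real observability tools
    use more sophisticated models.
    """
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation, needs 2+ values
    if sigma == 0:
        # No variation in history: any change at all is unusual.
        return latest != mu
    return abs(latest - mu) / sigma > threshold

# Assumed example data: rows loaded by a daily pipeline over one week.
daily_row_counts = [1000, 1020, 980, 1010, 990, 1005, 995]

for todays_count in (1002, 120):
    if is_anomalous(daily_row_counts, todays_count):
        print(f"ALERT: row count {todays_count} looks abnormal")
    else:
        print(f"OK: row count {todays_count} is within the normal range")
```

In this sketch a count of 1,002 passes while a count of 120 triggers the alert, so the suspect load can be held back before it reaches a dashboard or model.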

The increasing enterprise use of AI, including generative AI that enables more non-technical employees within organizations to work with data without relying on a data expert, only adds to the importance of data quality. However, many enterprises lack the right infrastructure to ensure data quality, leading them to use data observability tools, according to Kevin Petrie, an analyst at BARC U.S.

In addition to Soda, Monte Carlo and Acceldata are among the vendors specializing in data observability as a means of improving data quality.

"Our research shows that fewer than half of AI adopters believe they have the data quality and governance controls they need," Petrie said. "This drives up demand for data observability tools."

Before securing its latest funding, Soda in June 2023 unveiled SodaGPT, a generative AI-powered tool that enables customers to use natural language rather than code with the vendor's data quality platform. That was followed in May with the introduction of Soda AI, which combines the vendor's AI capabilities, including SodaGPT, in a single environment.

Capital infusion

Soda essentially plans to shift the focus of its growth from the European market to the U.S., according to Maarten Masschelein, the vendor's co-founder and CEO. The vendor will use the new funding to execute that shift with go-to-market efforts.

Soda currently does about 40% of its business in the U.S. and has a team of fewer than a dozen employees based there. With the $14 million, the vendor hopes to increase its U.S. business to more than 50% of its overall business and double its U.S.-based staff. In addition, Masschelein said he plans to move to New York to help guide Soda's growth initiatives.

"Originally, we were a European company," Masschelein said. "But for us it's clear that the market to win is the U.S. market. This funding definitely helps. We're following the model of a lot of other companies that were originally European."

In addition to fueling go-to-market efforts, the new capital -- in concert with a Series B funding round that Soda hopes to conduct a little over a year from now with U.S. investors -- will help add product development staff, Masschelein continued.

Soda hopes to employ specialized teams to work on each of its different products. For example, the vendor wants its generative AI suite to have a dedicated team of engineers with a similar team dedicated to its data pipeline testing suite.

More product engineering staff will enable Soda to add new products to its overall platform, the CEO said.

In particular, Masschelein identified data observability as an area for adding more engineering staff. Soda already offers data observability capabilities as part of its data quality platform but has plans to build out that core area further with a second product engineering team.

While product development is important for any data management vendor, expansion of its U.S. presence is vital for Soda, according to Petrie.

[Figure: Six elements of data quality]

The vendor has the potential to attract new users because its platform is easy to use, which enables more than just data experts to address an organization's data quality. The recent focus on generative AI strengthens that ease of use.

SodaGPT marked the vendor's initial combination of generative AI and data quality. Soda AI represents its shift to providing a generative AI-based platform.

In addition to SodaGPT, Soda AI includes time series anomaly detection dashboards and generative AI assistants for regex patterns, SQL coding, scheduling and running routine data quality checks, and helping customers learn how to use Soda.

All are generally available except the Ask AI Assistant, which guides customers as they use Soda's platform and is in private preview.

"Soda has the right objective of expanding in North America," Petrie said. "This is the largest market, with many early adopters of observability solutions."

Popping out

While the $14 million capital infusion will enable global expansion, Soda is unusual among data management and analytics vendors simply in being able to attract venture capital funding at all.

As recently as early 2022, funding flowed relatively freely with data management and analytics vendors attracting large investments.

For example, in 2021 alone, Aiven, Reltio, SnapLogic, ThoughtSpot and TigerGraph all raised between $100 million and $160 million. Confluent, meanwhile, raised $828 million and Databricks raised $1 billion.

In the opening months of 2022, Sigma Computing raised $300 million and Aiven another $210 million.

In addition, vendors including Pyramid Analytics, Qlik, SAS and ThoughtSpot all expressed interest in initial public stock offerings with Qlik filing with the U.S. Securities and Exchange Commission to begin the process.

But then the stock market plunged amid worldwide events such as the Russia-Ukraine War, repeated supply chain disruptions and fears that the U.S. economy was headed for a downturn. Tech stocks were caught up in the sell-off and venture capital funding followed suit, almost disappearing for tech vendors.

Although the stock market has recovered, supply chains have mostly stabilized and the U.S. economy remains healthy, venture capital investing remains tight.

Since the first half of 2022, only select analytics and data management vendors have attracted significant funding. Databricks continues to raise vast sums. Denodo likewise was able to attract upwards of $100 million. But they are the exceptions.

The start of 2024 has shown some recovery in venture capital funding, both broadly and among technology vendors, with Aerospike, Coalesce and Sigma among those that have been able to raise financing. Now Soda has joined that group.

If there's a common theme among the analytics and data management vendors that are now raising capital, it's that they have embraced generative AI and are building generative AI into their platforms. But beyond simply being among the many vendors that are shifting their product development focus to generative AI, growth to date -- and, perhaps more importantly, the potential for further growth -- are important factors for Soda, according to Petrie.

"This [funding] is another sign that we might be coming out of the trough of venture capital funding," he said. "I've seen a number of data startups raise money in the last six months, and founders seem cautiously optimistic on that front. Soda has done well to continue growing over the last two years, while some others have hit the wall."

Masschelein likewise said that Soda's new funding, which he noted came from existing investors, is more than just an investment in generative AI. Instead, it represents a belief in the stability of data management and growing need for data quality.

"Within our investors, there's a very strong belief in the market that we're in," Masschelein said. "They always say to us that this is a very long market. We're part of the core data infrastructure."

Next steps

Despite already launching generative AI capabilities, part of Soda's future product development will focus on adding more generative AI capabilities, according to Masschelein.

"We're just getting started," he said.

One goal is to add integrations to bring generative AI to users in tools such as GitHub where they do their work, Masschelein continued.

Beyond generative AI, Masschelein said Soda plans to integrate more machine learning into data quality detection. The vendor has significant amounts of diagnostic data that can be combined with AI to automate diagnoses.

In addition, Soda plans to continue its emphasis on data contracts -- agreements between Soda and members of its partner and customer ecosystem on the structure of data -- to move the market toward treating data as a product so it becomes more cohesive and scalable.
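
As a rough illustration of what agreeing on the structure of data can look like in practice, the following Python sketch checks records against a declared set of fields and types. It is an assumption-laden example, not Soda's contract format: the `CONTRACT` dictionary, its field names and the `contract_violations` helper are all hypothetical.

```python
# Hypothetical contract: the fields a producer promises to deliver
# and the Python type each one must have.
CONTRACT = {
    "order_id": int,
    "customer_email": str,
    "amount": float,
}

def contract_violations(record, contract):
    """Return a list of human-readable problems; empty means the
    record honors the contract."""
    problems = []
    for field, expected_type in contract.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            actual = type(record[field]).__name__
            problems.append(
                f"{field}: expected {expected_type.__name__}, got {actual}"
            )
    return problems

good = {"order_id": 1, "customer_email": "a@example.com", "amount": 9.5}
bad = {"order_id": "1", "amount": 9.5}  # wrong type, missing email

print(contract_violations(good, CONTRACT))
print(contract_violations(bad, CONTRACT))
```

A check of this kind lets the producer and consumer of a data set catch a breaking change, such as a renamed column or a changed type, before downstream products depend on it.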

While generative AI is perhaps the flashiest of the areas on which Soda plans to focus its product development, machine learning observability is also a wise focal point, according to Petrie.

Data quality is vital to machine learning models. As a result, data observability and machine learning observability will likely converge in the near future with data quality tools such as Soda able to detect changing values in data that result in model drift.
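
One common way to quantify changing values of this kind is the population stability index (PSI), which compares how a feature is distributed in a baseline sample and in recent data. The Python sketch below is an illustration only, not something the article attributes to Soda; the function name `psi`, the bucket edges, the sample values and the small `eps` floor for empty buckets are assumptions, and the 0.25 cutoff is a widely cited rule of thumb rather than a fixed standard.

```python
import math

def psi(baseline, recent, edges, eps=1e-6):
    """Population stability index between two non-empty samples.

    `edges` are ascending bucket boundaries; values below the first
    edge fall in bucket 0, values at or above the last edge fall in
    the final bucket. `eps` keeps empty buckets from causing log(0).
    """
    def proportions(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            bucket = sum(v >= e for e in edges)  # edges passed = bucket index
            counts[bucket] += 1
        return [max(c / len(values), eps) for c in counts]

    base_p = proportions(baseline)
    recent_p = proportions(recent)
    return sum((r - b) * math.log(r / b) for b, r in zip(base_p, recent_p))

# Assumed example: a numeric feature at training time vs. in production.
training_values = [10, 12, 14, 16, 18, 20, 22, 24]
production_values = [20, 22, 24, 26, 28, 30, 32, 34]
edges = [15, 21]

score = psi(training_values, production_values, edges)
if score > 0.25:  # rule-of-thumb threshold, an assumption
    print(f"PSI {score:.2f}: distribution has shifted, model may drift")
else:
    print(f"PSI {score:.2f}: distribution looks stable")
```

Identical samples score exactly zero, while the shifted production values above score well past the threshold, which is the sort of signal that could prompt a model to be retrained or reviewed.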

"Because data quality drives the performance of AI/ML models … I'll be interested to see how Soda addresses converging requirements such as these," Petrie said.

Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.
