Monte Carlo boosts data pipeline observability insights
The startup continues to build out its namesake platform with new capabilities to provide organizations with visibility into how and where data pipelines are being used.
Data observability startup Monte Carlo on Nov. 3 made generally available its new Insights capability to help organizations better understand how data infrastructure is being used.
Monte Carlo, based in San Francisco, has had a busy year raising money and growing its data observability platform.
In July, the vendor introduced its Incident IQ capability designed to help organizations better understand problems with data pipeline downtime and performance degradation.
Now, with the new Insights capability, Monte Carlo is providing its users with visibility into overall trends of an organization's deployed data pipeline architecture. The goal is to help organizations better understand how and where data is being used and what value that data brings.
Among the early users of the new tool on the Monte Carlo platform is online retail service ShopRunner, based in Chicago.
The data architecture at ShopRunner involves multiple data pipelines going to and from various data sources, including cloud data in the Databricks lakehouse and Snowflake cloud data warehouse.
Valerie Rogoff, director of analytics data architecture at ShopRunner, explained that the company handles both operational data and data warehouse assets where the data is analyzed for business intelligence and analytics. Data moves from the operational side to the data warehouse, which is where the Monte Carlo data observability platform plays a role.
"Data observability helps us to understand all the nuances of our data from the ingestion source to the data lake," Rogoff said.
Running data observability insights at ShopRunner
ShopRunner was founded in 2009, and over the last decade the company has dramatically expanded its data architecture.
Rogoff noted that ShopRunner has added many data elements and sources over the years. Before engaging with Monte Carlo, it wasn't easy to tell whether the data pipelines were all working properly.
"We had some older sources which were not necessarily reliable that had some bad data attributes," Rogoff said. "Monte Carlo helped us to highlight those issues in our environment and allowed us to fix the issues."
Going a step further, Rogoff said that she didn't always know which data pipelines she could turn off, because it wasn't clear how or where the data was used.
"That's the beauty of Monte Carlo because it allows us to see who is using data and where it is being consumed," Rogoff said. "This has allowed us to actually free up some of our processing time from unused data elements which no one was using anymore and were no longer relevant."
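The kind of check Rogoff describes can be illustrated with a minimal sketch. It is not Monte Carlo's implementation or API; it assumes a hypothetical access log of (table, user, read time) records, and the function name, the 90-day threshold and the table names in the usage note are all invented for illustration.

```python
from datetime import datetime, timedelta

def find_unused_tables(all_tables, access_log, now, max_idle_days=90):
    """Return, sorted by name, tables with no recorded read in the last max_idle_days.

    access_log is an iterable of (table, user, read_at) tuples (a hypothetical
    format). Tables that never appear in the log are treated as unused.
    """
    cutoff = now - timedelta(days=max_idle_days)
    last_read = {}
    for table, _user, read_at in access_log:
        # Keep only the most recent read per table.
        if table not in last_read or read_at > last_read[table]:
            last_read[table] = read_at
    return sorted(
        t for t in all_tables if last_read.get(t, datetime.min) < cutoff
    )

def consumers_by_table(access_log):
    """Map each table to the sorted list of users who have read it."""
    users = {}
    for table, user, _read_at in access_log:
        users.setdefault(table, set()).add(user)
    return {table: sorted(names) for table, names in users.items()}
```

Given a log in which a table called orders was read two days ago and one called legacy_promos was last read more than a year ago, find_unused_tables would flag legacy_promos as a candidate to switch off, while consumers_by_table shows who would be affected.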
How Monte Carlo enables data observability insights
Monte Carlo CTO and co-founder Lior Gavish said the observability platform started off with a focus on real-time alerts. That is, if there is an immediate problem with a data pipeline, the organization is alerted and provided with information on how to remediate the issue.
The vendor's goal with Insights is to go beyond real-time visibility and provide a broader, long-term view. Insights can provide information on the service and performance of a data pipeline over a period of time, indicating how it is used and by whom.
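The difference between a point-in-time alert and a long-term view can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the way Insights works: the run-record fields (pipeline, ok, minutes) and the function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def summarize_runs(runs):
    """Aggregate per-run records into a per-pipeline summary.

    Each run is a dict with hypothetical keys:
      "pipeline" (str), "ok" (bool), "minutes" (float run duration).
    """
    grouped = defaultdict(list)
    for run in runs:
        grouped[run["pipeline"]].append(run)

    summary = {}
    for name, items in grouped.items():
        summary[name] = {
            "runs": len(items),
            # Share of runs that completed successfully.
            "success_rate": sum(1 for r in items if r["ok"]) / len(items),
            # Average duration across all runs, successful or not.
            "mean_minutes": mean(r["minutes"] for r in items),
        }
    return summary
```

A real-time alert would fire on a single failed run; a summary like this one shows instead whether a pipeline has been failing or slowing down over weeks or months.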
Insights also can help an organization identify some of the costs and benefits associated with data pipelines. Gavish said that Monte Carlo has built out its own machine learning algorithms to surface suggestions for users about data usage to help improve utilization of data resources.
"We give our users very granular information around how they are spending resources," Gavish said.