Varada accelerates data virtualization with Presto
The new Varada Data Platform combines data virtualization with the open source Presto SQL query engine to help enable rapid searches on cloud data lakes.
Varada announced the general availability of its data platform Tuesday, bringing data virtualization and query acceleration capabilities to AWS cloud users.
The promise of data virtualization is that users don't have to copy data to a single repository in order to analyze it. Rather, with data virtualization an index can point to where the data resides, so it can be analyzed without the need to move the data. With the Varada Data Platform, data virtualization is combined with other technologies, including the open source Presto SQL query engine, to enable users to make use of data stored in disparate sources.
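To make the idea concrete, here is a minimal Python sketch of querying two sources in place through a thin virtual layer. It is an illustration only and does not reflect how Varada implements virtualization; the sample data and the names `VIRTUAL_TABLES`, `read_orders`, `read_customers` and `total_by_customer` are all hypothetical.

```python
import csv
import io
import json

# Two hypothetical sources in different formats. In a real deployment these
# would live in separate systems; inline strings keep the sketch runnable.
ORDERS_CSV = "order_id,customer_id,total\n1,10,25.0\n2,11,40.0\n3,10,15.5\n"
CUSTOMERS_JSON = '[{"customer_id": 10, "name": "Ada"}, {"customer_id": 11, "name": "Lin"}]'


def read_orders():
    """Read order rows directly from the CSV source."""
    for row in csv.DictReader(io.StringIO(ORDERS_CSV)):
        yield {
            "order_id": int(row["order_id"]),
            "customer_id": int(row["customer_id"]),
            "total": float(row["total"]),
        }


def read_customers():
    """Read customer rows directly from the JSON source."""
    yield from json.loads(CUSTOMERS_JSON)


# The virtual layer: a logical table name points to a reader for the source.
# Nothing is copied into a central repository ahead of time.
VIRTUAL_TABLES = {"orders": read_orders, "customers": read_customers}


def total_by_customer():
    """Join the two sources at query time and sum order totals per customer."""
    names = {c["customer_id"]: c["name"] for c in VIRTUAL_TABLES["customers"]()}
    totals = {}
    for order in VIRTUAL_TABLES["orders"]():
        name = names[order["customer_id"]]
        totals[name] = totals.get(name, 0.0) + order["total"]
    return totals


print(total_by_customer())
```

The point of the sketch is that the analysis runs against readers that point at each source, rather than against a consolidated copy of the data.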
Matt Aslett, research director at S&P Global Market Intelligence, said he likes the fact that Varada has assembled some of the key components of what his firm has described as an "abstracted data architecture" into a single offering. Among the components needed for an abstracted data architecture are a distributed query engine, data virtualization and query acceleration, all of which are part of the Varada Data Platform. Additionally, S&P Global Market Intelligence expects there will be good interest from existing Presto users, he said.
"We anticipate growing competition, so expect Varada to increase the focus on its differentiating capabilities, including its adaptive indexing approach and machine learning-driven optimization engine," Aslett said.
A foundational element of Varada's data virtualization approach is the company's proprietary data indexing technology. Aslett said the index is designed to be optimized based on the data type, structure and distribution of data across what the company refers to as nanoblocks, enabling it to adapt and evolve as data changes.
How Varada tackles data virtualization
According to David Krakov, co-founder and CTO of Varada, the new platform does not handle data discovery. Users need to know where their data is located so that it can be connected to the platform.
Varada Data Platform can connect with an existing metadata store, such as AWS Glue Data Catalog, which can identify where various sources of data are stored. Varada then takes the metadata store information and builds its own index, which is then used to connect the data and enable data virtualization.
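The flow from metadata store to index can be sketched roughly as follows. This is a generic illustration, not Varada's proprietary index, and the entries are not in the AWS Glue Data Catalog response format; the catalog contents, the bucket name and the functions `build_index` and `locate` are assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical catalog entries, loosely shaped like what a metadata store
# might report: a table name, a storage location and a partition value.
CATALOG = [
    {"table": "events", "location": "s3://example-bucket/events/day=2020-11-01/", "partition": "2020-11-01"},
    {"table": "events", "location": "s3://example-bucket/events/day=2020-11-02/", "partition": "2020-11-02"},
    {"table": "users", "location": "s3://example-bucket/users/", "partition": None},
]


def build_index(catalog):
    """Build a lookup of table -> partition value -> list of locations."""
    index = defaultdict(dict)
    for entry in catalog:
        index[entry["table"]].setdefault(entry["partition"], []).append(entry["location"])
    return index


def locate(index, table, partition=None):
    """Return the locations a query must read, narrowed by partition if given."""
    partitions = index.get(table, {})
    if partition is None:
        # No filter: every location registered for the table is relevant.
        return sorted(loc for locs in partitions.values() for loc in locs)
    return list(partitions.get(partition, []))


index = build_index(CATALOG)
print(locate(index, "events", "2020-11-02"))
```

Once the index exists, a query layer can ask it where a table's data lives and read only those locations, leaving the data where it is.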
Presto accelerates data virtualization queries
The Varada Data Platform also integrates a query layer that enables users to search the connected data.
Krakov said Varada makes use of the open source Presto SQL technology as the query engine. Presto was originally created at Facebook and is an increasingly popular SQL query engine that is often seen as a rival to Spark. Varada is one of the founding members of the Presto Software Foundation; another backer, Starburst, is using the technology for its own data query platform.
Eran Vanounou, CEO of Varada, said his company didn't want to "reinvent the wheel" and build its own query engine, which is one reason why it uses Presto.
At launch, the Varada Data Platform is available on AWS. Vanounou said the company plans to support Google Cloud Platform and Microsoft Azure in 2021.