
Dremio launches updated SQL query acceleration capabilities

The data lakehouse specialist's new version of its SQL query acceleration tool, Reflections, includes automated recommendations and automatic data refresh capabilities.

Dremio launched an updated version of its SQL query acceleration tool that enables customers to run queries significantly faster than they can without acceleration.

Reflections has been part of Dremio's platform since the vendor, founded in 2015 and based in Santa Clara, Calif., emerged from stealth in 2017. The latest version of Reflections became generally available on Sept. 13.

Using Reflections, customers can run SQL queries 10 to 100 times faster than they can without Reflections, according to Dremio.

Reflections are built on the data lakehouse vendor's query engine, which uses a matching algorithm to speed queries by matching a new query with an existing Reflection. In addition, Reflections create optimized caches of source data. Together, the matching algorithm and the optimized caches accelerate query performance and reduce compute consumption.
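One way to picture how a matching step and a precomputed cache can work together is the toy Python sketch below. It is not Dremio's implementation: the `Materialization` class, the `run_query` function and the sample rows are all hypothetical names and data invented for illustration. The sketch assumes a query is a simple sum grouped by some columns, and that a cached aggregate can answer it whenever the cache was built on a superset of those columns.

```python
from collections import defaultdict

# Illustrative source rows: (region, product, amount). Not real data.
ROWS = [
    ("east", "widget", 10),
    ("east", "gadget", 5),
    ("west", "widget", 7),
    ("west", "widget", 3),
]
COLUMNS = ("region", "product")


def aggregate(rows, columns, group_by):
    """Sum the last field of each row per combination of group_by columns."""
    positions = [columns.index(name) for name in group_by]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[p] for p in positions)
        totals[key] += row[-1]
    return dict(totals)


class Materialization:
    """Hypothetical stand-in for a precomputed, cached aggregate."""

    def __init__(self, rows, columns, dimensions):
        self.dimensions = tuple(dimensions)
        # Keep the result as rows so it can be re-aggregated on fewer columns.
        pre = aggregate(rows, columns, self.dimensions)
        self.rows = [key + (total,) for key, total in pre.items()]

    def covers(self, group_by):
        """A cache can serve a query that groups by a subset of its columns."""
        return set(group_by) <= set(self.dimensions)


def run_query(group_by, materializations):
    """Answer from a matching cache if one exists, else scan the source."""
    for mat in materializations:
        if mat.covers(group_by):
            return aggregate(mat.rows, mat.dimensions, group_by), "materialization"
    return aggregate(ROWS, COLUMNS, group_by), "source"


cache = Materialization(ROWS, COLUMNS, ("region", "product"))
result, served_from = run_query(("region",), [cache])
```

In this sketch the cached rows are fewer than the source rows, so a matched query does less work. A production engine has to match far richer query shapes (joins, filters, expressions) than this subset check.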

Dremio's core technology is similar to that of data lakehouse pioneer Databricks.

Data lakehouses are data storage repositories that combine the structured data management capabilities of data warehouses with the unstructured and semistructured data management capabilities of data lakes, enabling customers to combine their data to realize a complete view of their operations.

Because they can store all that data and enable it to be used together, data lakehouses are gaining popularity as a potentially optimal fit for generative AI, which is more accurate when trained on extensive, high-quality data.

In June, Dremio unveiled its first generative AI tools, including text-to-code translation capabilities that are generally available, and semantic layering and vector search and storage capabilities still in development.

In addition, the vendor named former Splunk executive Sendur Sellakumar its new CEO in July.

Need for speed

Query speed is important for data lakehouse users, according to Stephen Catanzano, an analyst at TechTarget's Enterprise Strategy Group.

While fast queries have always been enablers of efficiency, freeing data experts to do more by making their work faster, query speed is becoming more valuable as generative AI continues to gain momentum.


Generative AI models are often trained on massive amounts of data, whether they're public large language models such as ChatGPT and Google Bard that use public data to become more intelligent or private language models trained on an organization's own data.

Data itself, meanwhile, is becoming both more complex and more voluminous as more organizations collect data from an ever-growing number of sources.

Data repositories, therefore, not only need to handle large amounts of data to enable generative AI training and analysis, but also must be able to perform efficiently no matter the scale of a given query or model.

"SQL query speed has always been critical," Catanzano said. "But it's growing even more in importance because AI is query-intensive as the volume of users demand more data and [want outputs] as close to real time as possible. Organizations are scrambling to cost-effectively ramp up [AI efforts]."

"The big announcement is SQL acceleration, which can decrease query speed time," he added. "Clearly, it's a goal that all customers have."

Beyond adding more speed to Reflections, the SQL query accelerator now includes Reflection Recommender. The feature automatically examines an organization's SQL queries and then delivers a recommended Reflection to help accelerate queries.

The result is reduced manual labor and lower cost than previously required to develop Reflections and generate efficient queries, according to the vendor.

In addition, with Reflections Refresh, Reflections now automatically update as organizations ingest new data, which further speeds SQL query performance and helps reduce costs, according to Tomer Shiran, the vendor's founder and chief product officer. Dremio uses the open source Apache Iceberg storage format to capture data changes and update Reflections.
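The general idea of updating a cached result from only the newly arrived data, rather than recomputing it from scratch, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Dremio or Apache Iceberg implement refresh: `IncrementalSum` is a hypothetical class, the source is modeled as an append-only list, and a simple row count stands in for whatever change-tracking metadata a real table format provides.

```python
class IncrementalSum:
    """Hypothetical cached per-key total that is refreshed incrementally."""

    def __init__(self):
        self.totals = {}
        self.rows_applied = 0  # how many source rows are already folded in

    def refresh(self, source_rows):
        """Fold in only the rows appended since the last refresh.

        Assumes source_rows is append-only; updates and deletes would need
        a different strategy. Returns the number of rows processed.
        """
        new_rows = source_rows[self.rows_applied:]
        for key, amount in new_rows:
            self.totals[key] = self.totals.get(key, 0) + amount
        self.rows_applied = len(source_rows)
        return len(new_rows)


source = [("east", 10), ("west", 7)]
cached = IncrementalSum()
first_pass = cached.refresh(source)   # processes both existing rows

source.append(("east", 5))            # new data is ingested
second_pass = cached.refresh(source)  # processes only the new row
```

The second refresh touches one row instead of three, which is where the compute savings in an incremental approach come from.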

Combined, the new Reflections capabilities are designed to further Dremio's goal of making data queries completely autonomous, Shiran said.

"Dremio's vision is to evolve to completely intelligent, autonomous data optimization," he said. "Our customers want -- and Dremio wants -- this to be simple and smart for our users. That's what drove this development."

Next steps

With the latest version of Reflections now available, Dremio's roadmap includes working toward making Reflections still faster and more efficient, according to Shiran.

In addition, with Autonomous Semantic Layer and Vector Lakehouse still in preview, generative AI is also a prominent part of the vendor's plans.

Catanzano, meanwhile, said vector search and storage is a smart area of focus for Dremio.

Vectors are what enable unstructured data such as text, audio and images to be transformed for analysis. From there, the previously unstructured data can be used in semantic searches and combined with structured data to inform models and decisions.
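As a rough illustration of semantic search over vectors, the Python sketch below ranks a few items by cosine similarity to a query vector. The three-dimensional vectors and document names are made up for the example; in practice an embedding model produces vectors with hundreds or thousands of dimensions, and a vector database uses an index rather than the brute-force scan shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-written toy "embeddings"; real ones come from an embedding model.
DOCS = {
    "invoice overdue notice": [0.9, 0.1, 0.0],
    "payment reminder": [0.8, 0.2, 0.1],
    "holiday photo": [0.0, 0.1, 0.9],
}


def nearest(query_vector, docs, k=2):
    """Return the k document names most similar to the query vector."""
    ranked = sorted(
        docs,
        key=lambda name: cosine_similarity(query_vector, docs[name]),
        reverse=True,
    )
    return ranked[:k]


matches = nearest([1.0, 0.0, 0.0], DOCS, k=2)
```

Here the two billing-related items rank ahead of the unrelated one because their vectors point in nearly the same direction as the query, even though no keywords are compared.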

While Dremio's vector search technology is still in development, some of the vendor's rivals have already released their own.

Neo4j recently added vector search and storage to its core database capabilities. MongoDB similarly added vector search capabilities in a recent update. Tech giant Google is still another to have recently added vector search capabilities with the launch of AlloyDB AI.

"Vector search is a new push by vendors in this space," Catanzano said.

Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.
