michelangelus - Fotolia

Trifacta unveils new integrations to enable data wrangling

Trifacta on Wednesday unveiled an updated tool to enable customers to work with their directly in Google BigQuery along with two new integrations designed to improve data preparation.

Trifacta on Wednesday introduced an updated integration that will enable data wrangling directly in Google BigQuery.

Trifacta, founded in 2012 and based in San Francisco, offers a data preparation platform that sorts through data to find only high-quality, relevant information and then transforms that information into a digestible format.

Through a new version of Google Cloud Dataprep by Trifacta, joint Trifacta and Google Cloud customers will now be able to execute their data wrangling and data preparation directly in BigQuery, Google's cloud data warehouse. Using SQL, rather than having to extract data from BigQuery for transformation and analysis and subsequently return it to BigQuery, customers can transform their data in-database and avoid the ETL process.

According to Trifacta, the system has the potential to make data engineering tasks up to 20 times faster.

In addition to the updated data wrangling capability for BigQuery, the vendor unveiled an integration with dbt Core, an open source analytics engineering tool maintained by Fishtown Analytics, and a new partnership with Databricks, a cloud data platform.

Two of the three are now generally available - dbt Core is in preview - and were revealed on Wednesday during Wrangle Summit, a virtual conference hosted jointly by Trifacta and Google Cloud. And all three were motivated by the wants of the vendor's users, according to Trifacta CEO Adam Wilson.

"This was very directly customer-driven," he said. "We realized that our users wanted a platform that allows them to move seamlessly between a visual experience for their data engineering and a code-centric approach. They also want the ability to leverage the power, flexibility and optimized performance of running their workloads directly inside of modern cloud data warehouses."

This was very directly customer-driven.
Adam WilsonCEO, Trifacta

While the updated version of Google Cloud Dataprep by Trifacta will enable customers to do data wrangling in BigQuery, the integration with dbt will enable Trifacta customers to connect the vendor's data engineering platform to dbt repositories. There, users can use both low-code and code-based tools to build data pipelines and collaborate with data engineers, data analysts, data scientists and business analysts.

The partnership with Databricks, meanwhile, centers around a joint system that integrates Trifacta's data preparation capabilities into the Databricks Lakehouse Platform, a hybrid of a data lake and data warehouse. The partnership is intended to enable faster data preparation by removing bottlenecks that slow the preparation of data for analytics and machine learning models, and sustainable data governance by tracking data lineage.

The end result of each of the new integrations will be streamlined workflows and time savings, according to Doug Henschen, principal analyst at Constellation Research.

An organization's customer data is displayed on a sample dashboard from Trifacta.
A sample Trifacta dashboard displays an organization's customer data.

"For Trifacta customers, it extends their familiar low-code/visual data engineering and prep work into the collaborative data pipelines supported by dbt, into efficient, scalable and performant in-database transformation within BigQuery, and into the popular data science and Lakehouse analytical environments of Databricks," he said.

In addition, the integrations could make Trifacta an appealing data preparation tool for customers of dbt, Google Cloud and Databricks that aren't yet Trifacta customers, Henschen continued.

Trifacta, he said, is perhaps the last independent self-service data preparation and data engineering vendor, and while many BI and analytics vendors now have data preparation capabilities, Trifacta is differentiated by providing deeper data wrangling capabilities that are agnostic to analytic environments and public clouds.

"For users of these three services who aren't using Trifacta currently, it gives them an appealing, low-code, visual data preparation and data engineering option that, in the case of dbt, is well integrated with their data pipeline environment, and in the case of BigQuery and Databricks [is well integrated] in their data platform of choice," Henschen said.

Similarly, Dave Menninger, research director of data and analytics research at Ventana Research, said that the integrations could expose Trifacta to new potential users.

"We are seeing more and more of dbt in the market, so the integration will increase the value of Trifacta to a broader set of customers," he said.

Regarding the integration with BigQuery, Menninger added that enhancement to Google Cloud Dataprep by Trifacta is a sign that the Trifacta tool has been well received since it was first introduced.

"The fact that they are making these enhancements is an indication there is enough customer traction and demand to justify the investment," he said.

With the new integrations now available, Trifacta's product development will focus on the vendor's core themes of openness, intelligence and self-service, according to Wilson.

"This includes more work on integrating with the modern analytics stack through open APIs, more work leveraging machine learning to synthesize data transformation logic automatically, and more work empowering end users through our … no-code/low code experience," he said.

In addition, Wilson said more optimizations for Snowflake and streaming analytics support are in the works.

Dig Deeper on Data management strategies