SnapLogic launches generative AI tool for data integration
The vendor released SnapGPT, a generative AI tool that lets data engineers more efficiently develop data pipelines by using natural language. An early user is Barnard College.
SnapLogic this week launched the general availability of SnapGPT, a generative AI tool that enables customers to integrate data using natural language rather than code.
SnapGPT was developed using both open source large language models (LLMs) and SnapLogic's privately developed AI capabilities.
Based in San Mateo, Calif., SnapLogic is a data integration vendor whose competitors among data integration specialists include Boomi and Fivetran. It also competes with vendors such as Informatica and Tibco, whose wide-ranging data management platforms include integration capabilities.
SnapLogic's suite of tools comprises an integration platform as a service (iPaaS) called the Intelligent Integration Platform.
AI has long been one of the ways the vendor has attempted to simplify data integration, dating back to its 2017 launch of IRIS, an AI assistant. For example, its August 2019 update added AI capabilities that automatically generate recommendations for data pipeline management.
In 2021, SnapLogic raised $165 million in venture capital funding to advance its AI capabilities. The investment brought the vendor's total funding to $371.3 million.
New capabilities
Generative AI has been the dominant trend in data management and analytics so far in 2023.
Generative AI vendor OpenAI launched ChatGPT in November 2022, marking a dramatic advance in generative AI and LLMs. Since then, the vast majority of vendors have focused product development around generative AI.
The reason is that generative AI has the potential to simplify data management and analytics, making data engineers and analysts more productive.
And that is one of the goals of SnapGPT, which was first unveiled in preview in March 2023 and made generally available Aug. 2.
By enabling users to integrate data sources and develop data pipelines using freeform natural language rather than code or limited natural language processing capabilities, the tool is designed to eliminate time-consuming processes that result in bottlenecks and slow pipeline development.
In addition, SnapGPT automatically generates documentation for each data pipeline, creates sample data so engineers can test and validate pipelines, and delivers suggestions as engineers build integration pipelines.
Doug Henschen, an analyst at Constellation Research, noted that the Intelligent Integration Platform already had some of the capabilities enabled by SnapGPT. The new generative AI tool, however, improves upon those existing capabilities while also adding new ones.
"SnapLogic already had AI/ML capabilities in place … which could recommend next-best integration steps and components," he said. "SnapGPT adds more detailed descriptions of those recommendations, as well as the ability to automatically describe and document integration processes, generate SQL queries, map data sources and generate sample data for use in testing integrations."
With data integration at the crux of readying data for eventual analysis, those improved descriptions for recommendations and added capabilities are significant, Henschen continued.
They'll enable some non-technical users to work with data pipelines, he noted. But more importantly, they will make trained engineers more efficient.
"Even if a generative feature can recommend next-best steps or, at some point in the future, complete pipelines, there's usually tuning, tweaking and last-mile configuration required that would be beyond the grasp of non-technical users," Henschen said. "The biggest benefit of generative capabilities in in the short term will be improving the productivity of experienced, technical users."
SnapLogic is not the first data management vendor to deliver generative AI capabilities beyond the preview stage. It is, however, among the first.
Vendors including Alteryx and Informatica are among the many that have unveiled product development plans and have tools in testing and preview. But to date, Dremio, Monte Carlo and now SnapLogic are among the few to make them publicly available.
Meanwhile, the impetus for building SnapGPT came from SnapLogic's foundational goal of making data integration as simple as possible so more than just data engineers can develop integrations, according to Manish Rai, the vendor's vice president of product marketing.
Nearly half of all IT bottlenecks are related to data integration, so reducing the burdens placed on data engineers is critical, he noted. Toward that end, the vendor has made incremental progress with the 2017 development of IRIS, low-code/no-code capabilities and a cloud-native architecture.
SnapGPT represents further progress.
"We saw the opportunity to spread integration further," Rai said. "The solution [to easing bottlenecks] is to empower citizen integrators to self-serve so they only go to IT for more advanced integrations. That's where we see the power."
The college try
One of SnapGPT's early users is Barnard College, a women's college within Columbia University in New York City.
Barnard started using SnapLogic in 2020, just before the onset of the COVID-19 pandemic, according to Nancy Mustachio, the college's director of enterprise applications. Until then, Barnard manually integrated its data by uploading and downloading files and using Excel spreadsheets to bring data together.
But before deploying SnapLogic, Barnard came up with a list of requirements and did an extensive search for the data integration platform that best met those needs.
The school looked at every vendor listed in Gartner's annual Magic Quadrant report on iPaaS platforms, narrowed the list to a semifinal group of six that were invited to Barnard to give demonstrations, and then to a final group of three that included Informatica, Tibco and SnapLogic.
Ultimately, after running the same data integration job with each of the three finalists for a direct comparison, Barnard chose SnapLogic.
"We not only judged by how well we were able to accomplish the business case, but what their user support was," Mustachio said. "What really pushed SnapLogic over the edge was that the learning curve [to use the platform] was minimal."
Even after Barnard moved from manual data integration to SnapLogic, its data integration demands could still be overwhelming, Mustachio continued. The college has a small IT team of eight -- including just three data experts -- and its data volume continues to increase in line with the worldwide growth in data.
So when SnapLogic introduced SnapGPT with the expectation of greater efficiency, Mustachio jumped at the chance for Barnard to be one of its early adopters.
The results have been as promised.
"When I discussed data integrations or manipulations with product teams, I used to have a gut feeling of, 'Oh my gosh,'" Mustachio said. "Now, when I have conversations, I don't have that angst. I feel like I have a strong team because I have a strong tool. I don't have that anxiety about trying to deliver when I say we can deliver."
Barnard tested SnapGPT by creating a data pipeline to integrate photos and was able to do so in less than an hour. Previously, that same pipeline would have taken an engineer two days to build.
One of the keys to that increased efficiency is being able to query SnapLogic using natural language and then receive recommendations, according to Marrah Arenas, an integration developer at Barnard.
When tasked with building an integration workflow, Arenas has to figure out which connectors and other tools to use during development, which is not always obvious.
"One of the challenges I face when building a prototype is using the right [tools]," Arenas said. "With [the recommendations], I'm able to work faster and smarter in terms of developing my integrations."
As a result, Mustachio and her team are able to spend less time developing data pipelines and more time working on error reports, audits, data security and scenario planning.
Future plans
With SnapGPT now generally available, Rai said the vendor's roadmap will continue to focus on simplifying data pipeline development.
That includes enabling more complex pipeline development with SnapGPT, broadening the use of synthetic data to test pipelines, improving documentation so engineers can show how they built pipelines, and using LLMs to build new connectors so SnapLogic can more quickly ingest data from new applications.
Mustachio and Arenas said they want more from SnapGPT. They noted that its initial iteration is already enabling Barnard to be more efficient, but there are ways it could be even more helpful.
"The ability to create more complex pipelines [with SnapGPT] would be the organic, natural next step," Mustachio said. "That really comes with teaching the AI so that the AI then teaches us. There will be certain things that will pop up that I want the tool to be able to say, 'Wait a minute, did you think about this?'"
During the testing phase, Arenas said she requested a simple way to rate SnapGPT's responses -- a thumbs-up or thumbs-down -- so the tool could learn which responses were most beneficial. That was added.
Now, she said she'd like a way to see the history of her command prompts.
Henschen, likewise, said that SnapLogic has room to improve SnapGPT.
He noted that the tool will help increase the productivity of experienced SnapLogic customers, but there are steps the vendor could take to make it easier for new users of the Intelligent Integration Platform.
"With a bit more contextual help and user-interface navigational assistance, SnapGPT could become more supportive of less technical users who are new to the platform," he said.
Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.