Google leader dives deep into AI agents
AI agents, a technology fueled by generative AI, are on the rise. Vendors such as Google are offering products to help enterprises create their own agents. However, there are concerns.
In the world of large language models and generative AI, a new technology is gaining traction: AI agents.
AI agents are AI tools that can complete difficult tasks that would otherwise require human assistance. While the technology is sometimes confused with robotic process automation bots, AI agents are much more intelligent and tend to be grounded in generative AI technology. Vendors such as Google see AI agents as AI assistants that can drive productivity across different industries and businesses.
In this Q&A, Jason Gelman, Google Cloud's director of product management for Vertex AI, discusses Google's definition of AI agents and some of the challenges facing the emerging technology.
Editor's note: The following was edited for length and clarity.
How does Google define AI agents?
Jason Gelman: An agent is someone or something that acts on your behalf.
There are two pieces. It's empowered to act on your behalf -- you give it permission and authority -- but [it's] also capable of completing a task on your behalf.
The empowered piece is that you have to give it instructions. You have to give it authentication. If it has to access a system, it needs to know the right password. It needs to know how to get there.
[Being] capable of completing something -- I think the key here is that LLMs are capable of planning. You have this concept of, 'Oh, it can break a task down and plan how it's going to accomplish the steps in a task.' That's the piece that's key.
Humans used to have to do that planning. Now you can let the model do that planning and then execute all the steps along the way, some of which involve gathering information.
The authentication pieces are important. But some of it may just be a reasoning step it needs to take in the middle before taking the next step. We do a lot of this implicitly, and we, as humans, have had a lot of practice doing it.
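To make the plan-then-execute idea concrete, here is a minimal sketch of that loop in Python. The plan() and execute_step() helpers are hypothetical stand-ins for real model and tool calls, not Vertex AI's actual API.

```python
# A minimal sketch of the plan-then-execute loop Gelman describes.
# plan() and execute_step() are hypothetical placeholders; in a real
# agent, an LLM would produce the plan and tools would run each step.

def plan(task: str) -> list[str]:
    # The model breaks the task into ordered steps.
    return [f"gather information for: {task}", f"act on: {task}"]

def execute_step(step: str, context: dict) -> dict:
    # Some steps gather information; later steps use what was gathered.
    context[step] = f"result of {step!r}"
    return context

def run_agent(task: str) -> dict:
    context: dict = {}
    for step in plan(task):                     # the model plans ...
        context = execute_step(step, context)   # ... then executes each step
    return context

print(run_agent("file an expense report"))
```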
What are current use cases where AI agents can thrive?
Gelman: It's a wide variety of businesses.
Call centers are easy because we've been working on them for a long time, so the customers are already there and asking us for more. There's another one with the Mayo Clinic, which is trying to navigate an impossible amount of information.
Different industries are looking at these in different places. We're seeing this everywhere.
What are some of the misconceptions of AI agents?
Gelman: There's a misconception that the technology is further along than it is.
We're in the early days. We're building a lot of infrastructure -- things like authentication. If I'm going to give a password to some other system, we have to store it correctly and only use it at the right times. Function calling is another building block: It's how an agent creates the API call that it can then [use to] talk to another computer system. You wish you could just give it the whole spec of the whole API and say, 'You go figure it out.' But right now, you have to give it a very specific call and say, 'If you're trying to accomplish this task, this is how you do it.'
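To illustrate what 'a very specific call' looks like, here is a minimal sketch of a function declaration in the JSON-schema style common to LLM function calling. The get_order_status name and its fields are illustrative assumptions, not Vertex AI's exact format.

```python
# A hypothetical function declaration an agent could be given.
# Names and fields are illustrative, not any vendor's exact schema.

get_order_status = {
    "name": "get_order_status",
    "description": "Look up the shipping status of a customer order.",
    "parameters": {
        "type": "object",
        "properties": {
            "order_id": {"type": "string", "description": "Order number"},
        },
        "required": ["order_id"],
    },
}

# Rather than handing the model the whole API spec, you register one
# specific call and tell it: if you're checking an order, use this.
tools = [get_order_status]
```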
It's sort of in the intern phase of life as opposed to the full-blown phase. It's not an executive. It's still early.
We're excited because we see the potential for it. But there's a misconception: LLMs seem powerful, therefore … next year, we'll have agents that do everything for us. It's going to take a little longer for the capabilities to be there, but also for human trust to be there, too.
I've mentioned this a few times when talking about agents: Waymo is probably safer than a human driver. Driverless cars are probably safer than human drivers. But we're still going to roll them out slowly because we're worried about what happens when someone dies. Whose fault is it?
Agents don't have that physical component, right? Agents are just software. The bar may be a little lower because no one's going to get hit and killed by an AI agent the way they would by a car. But you want to have monitoring. You want to have visibility into debugging them. You don't want them to just be a black box.
That's the piece that we're working [on] to make sure that you have visibility into the agent so you understand what's going on.
There are a lot of tools that need to get built out for agents to be productive. The technology is early. The models show a lot of promise in planning capabilities. We're exploring how to make planning and instruction-following work much more reliably.
It's already good, but we want human-level accuracy.
How can enterprises balance between trusting AI agents and knowing the technology is still young?
Gelman: Start simple. Start with lots of guardrails. Start with an agent that does one thing. Get it so that you have that agent doing something reliable.
You may be like, 'Okay, everyone knows how to do that. Why is this special?' But it's a matter of getting a proof of concept that the technology itself can do something. Every business, at this point, needs to prove to itself that the technology is capable of doing this. Then, you layer on the next one and the next one and the next one. Suddenly, you have 100 different agents that can each do a single task well. Then, you start to combine them. That's where it gets interesting. It goes from being not interesting to interesting quickly.
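As a rough illustration of combining single-task agents, here is a minimal sketch in which each agent is a callable that does one thing well and a hypothetical route() helper picks among them. In practice, a model would often do the routing; the names here are assumptions for illustration.

```python
# A minimal sketch of composing single-task agents.
# Each agent does one thing reliably; route() is a hypothetical
# dispatcher standing in for model-driven routing.

from typing import Callable

agents: dict[str, Callable[[str], str]] = {
    "refund": lambda req: f"refund issued for {req}",
    "status": lambda req: f"status looked up for {req}",
}

def route(request: str) -> str:
    # Naive keyword routing; a production system would use a model here.
    return "refund" if "refund" in request else "status"

def handle(request: str) -> str:
    return agents[route(request)](request)

print(handle("refund order 123"))
```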
But that first stage -- that proof stage -- is where we are with most agents right now. There are companies out there using our technology to do more complicated things. But I think there's always a question of, 'Would you, as a financial services company, bet your actual dollars,' right? [Would you] make trades in the stock market based on things that agents are doing? Would you, as a doctor, make final decisions based on what agents are doing? We're still at the point where humans are in the loop on every one of those decisions. I don't know if we get away from that, to where agents just collect and process information and bring it back to you after sifting through everything that needed to be gathered.
That's kind of where we are.
At some point, you can say, 'Well, we can turn over this task to an agent.' But we're not there yet. It's going to be a little while.
What is the difference between Google's AI agents and Microsoft Copilot?
Gelman: Copilot is a product that a business user could use.
We think the larger implementation is going to be this API-driven, developer-driven definition and execution of agents, so that you can integrate them into other applications -- versus copilots as a destination for your own personal tasks.
One of [the places] where Vertex is impactful is that when our enterprise customers build an application, Vertex sits underneath, and their customers don't even know that it's Vertex. There's no reason to expose that. With Copilot, you know you're going to Copilot. That's the biggest differentiation there.
A lot of those solutions are sort of a continuation of the low-code/no-code [robotic process automation] world, and it's interesting. There's utility there, but it puts a lot of artificial restrictions on the impact of agents. If you're just at the API level, you can use them to do whatever task you're looking to accomplish.
Esther Ajao is a TechTarget Editorial news writer and podcast host covering artificial intelligence software and systems.