Managing AI Agents: Early Reports From the Field

Hardly your robot overlords, AI agents exist to be managed, and we're beginning to see what that looks like

By Andrew Wig

To the knowledge worker seeking freedom from drudgery, the possibilities of agentic AI workflows are tantalizing. Who wouldn’t want to unload the most tedious aspects of their job to a tireless colleague who was born—no, lives—for that kind of work?

 

But those agents aren’t completely autonomous. Otherwise, they’d be working for themselves. Agents exist to be managed. Sometimes, they’re managed by other agents. But always, a human is the one ultimately pulling the strings, sitting atop the swirl of models, tools and data.

 

So what’s that like? 

New Tech, Familiar Structure

From an organizational standpoint, it’s a lot like managing humans, says Suzanne Livingston, VP of watsonx Orchestrate, IBM’s platform for building and managing AI agents. 

 

“Managing agents requires a high degree of professionalism and preparation, just as if you were to hire a contractor from the outside,” Livingston explains.

Human organizations have managers, team huddles and one-on-ones. They assign tasks and provide the necessary tools. They assess employees’ effectiveness. 

 

“You have the ceremonies and the processes that allow you to run a bigger and bigger organization. And I think real agentic workflows will resemble a lot of those patterns,” says Santiago Suarez Ordoñez, founder and CEO of Momentum.io, an AI-driven sales platform.

 

In Livingston’s case, each agentic task starts with a question or statement in Orchestrate—“I mean, any question that you have as an employee,” she says. 

 

And underneath that surface-level agent, there are 40-some agents her request could be routed to, and hundreds more agents under those agents, busying themselves calling tools and taking on more tasks.

 

Before, the task of transferring an employee, for instance, would start with a call to HR, where an employee would explain what steps need to be taken. “That's the old manual way of doing it,” Livingston says.  

 

In the agentic approach, that imaginary HR employee is not out of a job. Instead, Livingston explains, they are focused on higher-value activities, the work they got into this field to do in the first place. In other words, “the fun part of the job.”

 

“Yeah, it's like giving you that back,” she says. 

More Time for Golf?

For AI to work, people have to trust the models’ output and the agents’ conduct. Trust can be thought of as the lubrication that keeps agentic workflows humming. If people can’t trust that the agents are doing their job, the workflow is paralyzed. And people have trust issues when it comes to AI.

 

“Humans derive trust from humans. It’s hard to trust an LLM alone,” Suarez Ordoñez says.

If someone is making a large purchase, Suarez Ordoñez explains, they want a human on the other end of the transaction. The way he sees it, humans provide the reputation while the agents do the grunt work. In this world, face time is paramount.

“I do think there's a possibility that we go back to a world where golf meetings and lunches become more critical than ever before, because you want to make sure you're dealing with a human and you want to break bread and get to know them a little bit,” Suarez Ordoñez says. “They ultimately represent the business that you're going to engage with.”

In this vision, humans do what humans do best: socializing and communicating with each other in rich ways that digital workers cannot duplicate. That’s one reason Suarez Ordoñez believes agents don’t represent the wholesale replacement of human workers.

 

“It's just a lot easier to communicate and stay aligned with humans than it is with models,” he says.

A Morning Dialog

Paulo Pereira, worldwide director for mainframe modernization at AWS, checks on his AI agents every morning. He communicates with his agentic team using Telegram as the frontend chat, but he manages the agents with the popular open-source platform OpenClaw.

 

As a personal project, Pereira is using those agents to parse and feature SMF records. “Instead of me fixing the code myself or validating the code myself, I just tell it where the data is,” he explains. “I tell it to run, I tell it to show me what the results are, I look at the results and I tell it what's wrong or not.”

Since the SMF records are documented, it’s an optimal use case for agentic AI, because the agent can learn from that documentation how to parse the data. “And then it's just this big deterministic code that it's creating,” Pereira says.

 

A human conducting the same task would have to read through thousands of pages of documentation before writing every line of code. An effort that would have taken six months in 2019 now takes seven or eight days chatting in Telegram, Pereira notes. And through this back-and-forth, he doesn’t have to write a line of code himself.

 

The important caveat is that this is a pet project. “If you're thinking about the production scenario, then that becomes more tricky to manage,” he admits. 

The Dual-Role Human: Managing Complexity

If humans continue to be required in the management of ever-more-complex agentic workflows, they can’t afford to be mindless functionaries. They have to know what they’re doing—or more importantly, what the agent is doing. 

 

Successfully working with agents requires attention to prompt quality and the behaviors assigned to the agents. And that requires deep knowledge of the process being replaced, Livingston says. “They have to know the thing that we're trying to create so deeply that they can guide the agent and give it the proper evaluations, give it the proper guardrails,” she says. 

 

So while conversations about the role of humans in an AI-infused world typically focus on their emotional intelligence or “soft” skills, building and maintaining agents will still require technical know-how. 

 

Often, the agents’ manager is also their creator. “I'm building these agents. I'm checking on their performance. I'm seeing how reliable they are. I'm evaluating them,” Livingston illustrates. 

 

In her employee transfer example, the human HR staffer still owns that process. They’re just overseeing the agents executing the tasks. And just like they would with human employees, they’re evaluating those agents and finding ways to help them improve.

 

This scenario is not yet a reality, at least not at scale. But as someone already familiar with the agentic experience, Livingston has no trouble imagining it: “I could definitely see it going in that direction.”