By Andrew Wig
As AI gets better at serving us, it’s also becoming more demanding.
It has moved from the predictive work of analyzing and classifying data, to the generative work of creating text and images, to agentic work, in which it uses tools to complete tasks on its own. AI is no longer just about finding connections in data or spitting out reports; agents are about action and complex coordination, driving workflows involving dozens or hundreds of steps across multiple models, tools and systems.
And all that infrastructure requires an “immense amount of innovation” in networking, storage and hardware, observes Rohit Badlaney, general manager of IBM Cloud, Product, Design and Industry Platforms. The unique requirements of agentic AI will force 80% of enterprises to modernize their “legacy cloud environments” to new platforms for AI workloads by 2027, market research firm IDC predicts.


The overarching component enabling agentic AI is the orchestration layer, where agents can be built, managed and deployed. As agents work under the orchestration layer, they rely on context and memory to know what to do.
Supporting agents in this capacity is the Model Context Protocol (MCP) server, which standardizes the way agents interact with external tools, databases, APIs and other agents. The MCP server also helps support the vectorized knowledge bases that provide persistent memory for decision-making and multi-step reasoning.
“You create the agents. You bring the agents together. You make those agents intelligent by creating the vectorized knowledge base and putting them onto the MCP server so they can talk to each other,” Shrivastava explains.
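To make that description concrete, here is a minimal Python sketch of the two ideas: a shared server that exposes tools to agents through one uniform call, and a vectorized knowledge base that an agent can query for context. It is an illustration only and does not use the actual MCP SDK or wire format. The names `ToolServer`, `KnowledgeBase`, `embed` and the sample facts are all hypothetical, and the bag-of-words "embedding" is a stand-in for a real embedding model.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Tuple


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (assumption, not a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class KnowledgeBase:
    """Hypothetical vectorized memory: stores (vector, text) pairs."""

    def __init__(self) -> None:
        self.entries: List[Tuple[Counter, str]] = []

    def add(self, text: str) -> None:
        self.entries.append((embed(text), text))

    def search(self, query: str) -> str:
        """Return the stored text most similar to the query."""
        q = embed(query)
        return max(self.entries, key=lambda e: cosine(q, e[0]))[1]


class ToolServer:
    """Hypothetical stand-in for a protocol server: one uniform way to call tools."""

    def __init__(self) -> None:
        self.tools: Dict[str, Callable[..., str]] = {}

    def register(self, name: str, fn: Callable[..., str]) -> None:
        self.tools[name] = fn

    def call(self, name: str, **kwargs: str) -> str:
        if name not in self.tools:
            raise KeyError(f"unknown tool: {name}")
        return self.tools[name](**kwargs)


# Build the memory, then expose it to agents as a named tool.
kb = KnowledgeBase()
kb.add("Invoices are approved by the finance team")
kb.add("Servers are patched every Tuesday")

server = ToolServer()
server.register("lookup", lambda query: kb.search(query))

# An agent asks the shared server for context instead of calling the store directly.
answer = server.call("lookup", query="who approves invoices")
```

In this sketch, any agent that knows the tool name can retrieve the same memory through the same interface, which is the standardization role the article attributes to the MCP server.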
The compute resources powering agentic AI vary depending on the type of model.
On one hand, producing the tokens that drive generative AI processing requires a lot of video memory and powerful GPU cards, notes Paulo Pereira, worldwide director for mainframe modernization at AWS. But the needs change when AI becomes agentic.
Though Pereira notes the delta in overall resource consumption between purely generative AI and agentic AI is “not that significant,” agentic AI requires a dedicated runtime environment: “actual computers for these agents to run those tools and perform the actions that will be driven by the token generated by the generative AI.” That's in contrast to purely generative AI, which uses a thin client supported by massive compute on the backend, he explains.
Looking beyond hardware, both IBM and AWS are acknowledging the importance of open-source technologies for agentic AI. “There's a tremendous amount of innovation happening in open source,” Badlaney says.
And key to IBM’s AI strategy, he notes, is the integration of Red Hat OpenShift. The open-source, Kubernetes-based platform comes from Red Hat, which IBM owns, and can also be deployed through AWS.
The cloud is not a monolith. It’s home to all kinds of hardware to accommodate various workloads. In the IBM Cloud, for instance, that includes the full breadth of Nvidia silicon, but also AMD and Intel processors.
“We find certain AI accelerators better suited for inferencing and certain (accelerators) better suited for training. And so we've done extensive work on full-stack optimization depending on the kind of GenAI workload you're running,” Badlaney says. “… They all have unique characteristics when it comes to the throughput of inferencing and the total cost of ownership to the end clients.”
IBM’s approach to the cloud and AI is hybrid, where workloads requiring high transactional throughput and real-time inference are left on-premises, while GenAI workloads go to the cloud, he adds.