From Consuming Tokens to Producing Them: The New Business Challenge in Enterprise AI
The emergence of advanced reasoning models and AI agents is fundamentally transforming how companies plan and budget their technology strategies. Token usage, the unit in which large language model consumption is billed, is skyrocketing: new reasoning models consume 10 to 20 times more tokens than standard models to process a single problem, and that figure compounds as AI agents chain tasks and use tools autonomously.
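A back-of-envelope sketch makes the compounding concrete. The 10-20x reasoning multiplier comes from the figures above; the chain depth and per-query token count are illustrative assumptions, not measured values:

```python
# Rough estimate of token consumption when reasoning models and agent
# chains are combined. The 10-20x reasoning multiplier follows the
# figures cited above; chain depth and base tokens are assumptions.

def estimated_tokens(base_tokens: int, reasoning_multiplier: int, agent_steps: int) -> int:
    """Tokens consumed when every step of an agent chain calls the model."""
    return base_tokens * reasoning_multiplier * agent_steps

# A single 1,000-token query on a standard model:
standard = estimated_tokens(1_000, 1, 1)    # 1,000 tokens

# The same query on a reasoning model (15x, mid-range of 10-20x):
reasoning = estimated_tokens(1_000, 15, 1)  # 15,000 tokens

# A five-step agent chain where each step uses the reasoning model:
agentic = estimated_tokens(1_000, 15, 5)    # 75,000 tokens

print(standard, reasoning, agentic)
```

Even this simplified model, which ignores conversation history accumulating across steps, shows a 75x gap between a standard query and a modest agent run.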
Faced with this scenario, organizations are experiencing a paradigm shift: it is no longer enough to use tokens efficiently. Success now depends on controlling inference infrastructure, routing each query to the most cost-effective endpoint, and, in many cases, running proprietary models hosted internally and optimized for specific business needs.
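What cost-aware routing means in practice can be sketched in a few lines: send each request to the cheapest endpoint that still meets its latency budget. The endpoint names, prices, and latencies below are hypothetical placeholders, not vendor quotes:

```python
# Minimal sketch of cost-aware query routing: pick the cheapest endpoint
# that satisfies the request's latency budget. All endpoint names and
# figures are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class Endpoint:
    name: str
    cost_per_1k_tokens: float  # USD, hypothetical
    p95_latency_ms: int        # hypothetical

ENDPOINTS = [
    Endpoint("hosted-frontier-model", 0.0300, 900),
    Endpoint("internal-tuned-model", 0.0020, 400),
    Endpoint("internal-small-model", 0.0004, 150),
]

def route(latency_budget_ms: int) -> Endpoint:
    """Return the cheapest endpoint within the latency budget."""
    candidates = [e for e in ENDPOINTS if e.p95_latency_ms <= latency_budget_ms]
    if not candidates:
        raise ValueError("no endpoint meets the latency budget")
    return min(candidates, key=lambda e: e.cost_per_1k_tokens)

print(route(500).name)  # picks internal-small-model
```

A production router would also weigh model capability against the task, but the economics are the same: queries that an internally hosted model can handle never need to pay frontier-model prices.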
The ‘Metal to Agents’ Pathway
Red Hat describes this journey as a ‘Metal to Agents’ path: an open and integrated end-to-end stack where each layer—from physical AI accelerators to the agents themselves—is connected and designed with security as a priority. This infrastructure must be compatible with a diverse ecosystem of hardware, including processors from NVIDIA, AMD, Intel, and custom silicon from leading cloud providers.
At the core of this system is inference, the key factor in scaling any AI strategy. Red Hat asserts that its work on projects such as vLLM and distributed inference with llm-d has enabled a tenfold reduction in time to first token and a threefold improvement in response quality in real-world applications.
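Time to first token (TTFT) is the delay a user perceives before a streaming response begins, which is why it is a headline metric for inference stacks. A minimal sketch of how it is measured over a token stream, using a stand-in generator rather than a real vLLM or llm-d client:

```python
# Sketch of measuring time-to-first-token (TTFT) on a streaming response.
# `fake_stream` is a stand-in for a real streaming inference client
# (e.g. a vLLM endpoint); only the measurement pattern is the point.

import time
from typing import Iterator

def fake_stream(tokens: list[str], delay_s: float = 0.01) -> Iterator[str]:
    """Stand-in for a model's token stream, with a fixed per-token delay."""
    for tok in tokens:
        time.sleep(delay_s)
        yield tok

def time_to_first_token(stream: Iterator[str]) -> tuple[float, str]:
    """Seconds elapsed until the first token arrives, plus the full text."""
    start = time.monotonic()
    first = next(stream)          # block until the first token lands
    ttft = time.monotonic() - start
    return ttft, first + "".join(stream)

ttft, text = time_to_first_token(fake_stream(["Hello", ",", " world"]))
print(f"TTFT: {ttft * 1000:.1f} ms, text: {text!r}")
```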
Agents: The Heart of Business Strategy
Beyond infrastructure, the focus is now shifting toward services for AI agents. These agents have moved beyond experimental projects to become central to modern business strategies, but they introduce new governance challenges: different teams use different tools, so each agent needs a verified identity, a lifecycle managed under version control, and emerging standards such as MCP (Model Context Protocol) to connect seamlessly to tools and data without opening security gaps.
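The governance requirements named above, a verified identity per agent plus versioned lifecycle management, can be sketched as a small registry. Every class and field name here is illustrative, not a product API:

```python
# Sketch of agent governance: each agent carries an identity and a
# version, and a registry controls its lifecycle transitions.
# Names and fields are illustrative assumptions, not a real product API.

from dataclasses import dataclass, field

@dataclass
class Agent:
    agent_id: str   # verified identity (e.g. tied to a workload certificate)
    version: str
    tools: list[str] = field(default_factory=list)
    status: str = "registered"   # lifecycle: registered -> active -> retired

class AgentRegistry:
    def __init__(self) -> None:
        self._agents: dict[str, Agent] = {}

    def register(self, agent: Agent) -> None:
        key = f"{agent.agent_id}@{agent.version}"
        if key in self._agents:
            raise ValueError(f"{key} already registered")
        self._agents[key] = agent

    def activate(self, agent_id: str, version: str) -> None:
        self._agents[f"{agent_id}@{version}"].status = "active"

    def retire(self, agent_id: str, version: str) -> None:
        self._agents[f"{agent_id}@{version}"].status = "retired"

registry = AgentRegistry()
registry.register(Agent("billing-agent", "1.2.0", tools=["crm-lookup"]))
registry.activate("billing-agent", "1.2.0")
```

Keying entries by identity plus version is what lets two versions of the same agent coexist during a rollout while each remains individually auditable.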
Real-World Cases: BNP Paribas and NASA
Some organizations are already on this path with tangible results. BNP Paribas has generated nearly $600 million in value by scaling 1,000 AI use cases on a unified platform, turning GPU provisioning from a weeks-long process into a service delivered in minutes. Similarly, NASA's Marshall Space Flight Center has adopted comparable platforms to migrate thousands of legacy workloads to containerized environments, cutting deployment times from days to minutes for mission-critical operations.
These examples highlight a broader trend: AI strategies are shifting from solely focusing on efficiency and cost savings to becoming drivers of growth and revenue. The ultimate goal, according to this approach, is for companies to own the platform underpinning their most critical operations, combining access to cutting-edge models with the control and governance expected from any responsible IT team.

