The Hidden AI Fragmentation Tax: Why AI Innovation Speed Will Depend on Your AI Program Margins
Everyone's telling you to innovate faster with AI. Move quicker. Ship more features. Deploy more agents. But before we sprint headlong into the AI revolution, we need to have a proper dollars-and-cents conversation that most companies are avoiding.
Disclaimer: Statistics cited in this post were reported in the 2025 State of AI Cost Governance Report by Mavvrik.
AI is eroding margins at an alarming rate
The data is stark: 84% of companies report more than a 6% hit to gross margin from AI costs.
Within that alarming statistic, 58% see a 6–15% reduction, and 26% report erosion of 16% or more. This isn't a rounding error: this is a fundamental business problem threatening the viability of AI-powered products and services.
According to the report, AI costs are already eroding the margins that businesses have worked years to build. Yet most organizations are racing ahead without understanding the true cost of their AI infrastructure.
Why — with all of the promise that AI brings — is this happening at such an alarming rate?
3 Root Causes of AI Cost Chaos
1. Lack of visibility means lack of action
Here's a sobering reality: only about 35% of companies include on-premises components in their AI cost reporting. Even more concerning, roughly half don't include LLM API costs in their tracking — even when AI is a core product component.
When teams are asked what would most improve their cost management, the answer is clear: unified visibility across environments is the top priority, followed closely by clear cost attribution. You can't optimize what you can't see.
2. You can't plan without forecasting
Only 15% of companies can forecast AI costs within ±10% accuracy. Let that sink in. The majority (56%) miss their forecasts by 11–25%, and nearly one in four companies miss by more than 50%.
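To make those misses concrete, here's a minimal sketch of how forecast error is measured. The dollar figures are hypothetical, chosen only to illustrate the bands the report describes:

```python
# Hypothetical example: measuring how far an AI cost forecast missed.
def forecast_error_pct(forecast_usd: float, actual_usd: float) -> float:
    """Absolute forecast error as a percentage of actual spend."""
    return abs(actual_usd - forecast_usd) / actual_usd * 100

# A team budgets $100K of monthly AI spend; actuals land at $130K.
miss = forecast_error_pct(100_000, 130_000)
print(f"{miss:.1f}% miss")  # 23.1% -- squarely in the 11-25% band most companies fall into
```

A 23% miss on a six-figure monthly budget is tens of thousands of dollars of unplanned COGS, every month.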
For CFOs and budget owners trying to protect gross profit targets, this level of unpredictability is a nightmare. As AI grows as a share of cost of goods sold (COGS), the inability to accurately forecast becomes an existential threat to financial planning.
3. Multiple vendors amplify the chaos
A full 61% of companies run AI workloads across a combination of public and private environments and different tools. This hybrid pattern spans all company sizes, including small businesses, and creates exponentially greater difficulty in achieving unified cost reporting and governance.
All of this — mostly driven by lack of a unified approach to managing AI resources and AI connectivity — results in the hidden AI fragmentation tax. Let’s break it down.

Source: 2025 State of AI Cost Governance Report by Mavvrik
Breaking down the hidden fragmentation tax: The LLM token consumption problem
Let’s take an MCP-enabled agentic workflow as our example.
Each MCP client connects to MCP servers. Those servers interact with LLMs. Your end consumers trigger these interactions. And every single one of those touchpoints generates LLM API token consumption that quietly eats into your margins.
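To see how quickly this adds up, here's a back-of-the-envelope sketch. The per-token prices and the fan-out pattern are assumptions for illustration, not any specific provider's pricing:

```python
# Back-of-the-envelope token economics for one agentic request.
# Prices and fan-out are assumptions for illustration only.
PRICE_PER_1K = {"input": 0.003, "output": 0.015}  # USD per 1K tokens (hypothetical)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of a single LLM call at the assumed per-token prices."""
    return (input_tokens / 1_000) * PRICE_PER_1K["input"] + \
           (output_tokens / 1_000) * PRICE_PER_1K["output"]

# One user action fans out: the agent plans, two MCP tool results get
# summarized, and a final answer is generated -- four LLM calls, not one.
hops = [
    ("agent planning",      2_000, 500),
    ("tool result summary", 1_500, 300),
    ("tool result summary", 1_500, 300),
    ("final answer",        4_000, 800),
]
total = sum(request_cost(inp, out) for _, inp, out in hops)
```

With these assumed prices, a single user action costs about $0.056. At one million actions per month, that's roughly $55,000 in token spend, most of it invisible if you only track the final LLM call.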
The 2025 State of AI Cost Governance Report reveals a critical insight: "Even companies that do not charge for AI-enabled products are heavy users of third-party LLMs (73%), meaning token-based costs are quietly reducing gross margins without being offset by direct revenue."
Think about that. Nearly three-quarters of companies are hemorrhaging margin to LLM providers without generating corresponding revenue.
But wait . . . there’s more!
LLMs aren’t the only resource that AI consumes, and they certainly aren’t the only driver of excess networking/egress charges, which the report ranks as the second most common unexpected AI cost.

Source: 2025 State of AI Cost Governance Report by Mavvrik
Agents drive massive consumption of both APIs and real-time data, whether the AI application consumes them directly or goes through an MCP server that consumes them in turn, and organizations must account for all of it.
So, when you map out the full architecture of modern AI applications, you see a sprawling web of connectivity:
- Developers and agents connecting directly to APIs
- Developers and agents connecting directly to LLMs
- Developers and agents connecting directly to MCP servers
- Developers and agents connecting directly to event streams
- MCP servers connecting to LLMs
- MCP servers connecting to APIs
- MCP servers connecting to event streams
- API gateways routing to thousands of APIs
- LLM gateways routing to hundreds of different LLM providers
- MCP gateways routing to MCP servers
- Event gateways connecting to event brokers and event streams
- Service meshes managing the vast array of microservices powering AI applications

And of course all of this exists — for many businesses — across multiple on-prem, hybrid, cloud, and multi-cloud vendor environments.
Are you starting to see the looming problem? It’s time for a reality check.
The infrastructure reality check
The typical AI infrastructure today suffers from three critical problems:
- Disparate visibility into cost and consumption: You can see parts of the picture, but never the whole thing
- No central enforcement: Cost controls exist in silos without coordination
- Fragmented developer experience: Engineers waste time navigating incompatible systems
This is why, when the report explores “Tactics for improvement,” it finds:
- The most common tactic cited for improving AI cost management is unified visibility
- Clear cost attribution ranked second
- Better collaboration between teams is third
But, given this mess of connectivity to manage, how is this possible?
A path forward: Unified AI cost governance
The solution isn't to stop innovating or slow down. It's to build a unified AI cost governance layer: a single place for real-time cost visibility and enforcement.
From a technical implementation perspective, this means building a single cost control plane that provides:
- Real-time cost analysis across all AI connectivity
- Real-time metering of actual usage
- Usage attribution down to the team, product, or customer level
- Limit enforcement to prevent runaway costs
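Stripped to its essentials, such a control plane reduces to three primitives: meter each usage event, attribute it to an owner, and enforce a cap before the event is admitted. A deliberately minimal Python sketch of those primitives (illustrative only, not Konnect's actual implementation):

```python
from collections import defaultdict

class CostControlPlane:
    """Minimal sketch: meter usage events, attribute them to an
    owner, and enforce budget caps. Illustrative only."""

    def __init__(self) -> None:
        self.spend = defaultdict(float)  # attributed spend per (team, product) key
        self.limits: dict = {}           # budget caps per attribution key

    def set_limit(self, key, usd: float) -> None:
        self.limits[key] = usd

    def record(self, key, usd: float) -> None:
        """Meter one usage event; reject it before it blows the budget."""
        if key in self.limits and self.spend[key] + usd > self.limits[key]:
            raise RuntimeError(f"budget exceeded for {key}")
        self.spend[key] += usd

plane = CostControlPlane()
plane.set_limit(("search-team", "chatbot"), 100.0)
plane.record(("search-team", "chatbot"), 60.0)  # admitted and attributed
```

The key design point is that enforcement happens at record time, not in a monthly review: an event that would exceed the cap is rejected before the cost is incurred.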
This governance layer needs to span your entire AI infrastructure:
- AI Agents
- Kubernetes clusters
- Cloud LLMs
- On-premises LLMs
- MCP servers
- Event streams
- Cloud APIs
- On-premises APIs
- API Gateways
- LLM Gateways
- MCP Gateways
- Event Gateways
- Service Meshes
- Kubernetes Ingress Controllers
The good news? This cost governance control plane has already been built!
This is why we are so excited about OpenMeter and what it brings to the Konnect platform. By combining the existing LLM, MCP, API, Eventing, and microservices runtime infrastructure already in Konnect with the real-time metering, billing, and usage attribution of OpenMeter (now powering Konnect Metering and Billing), you have a single platform that can:
- Track and attribute AI resource consumption and usage
- Set up tiering, metering, and billing for AI resource usage and connectivity
- Enforce consumption limits based on whichever consumption metric you choose
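As a sketch of what tiering over a metered quantity looks like, here's graduated pricing over units of 1K tokens. The tier boundaries and prices are made up for illustration:

```python
# Hypothetical graduated tiers over a metered quantity (units of 1K tokens).
# Boundaries and prices are made up for illustration.
TIERS = [
    (1_000, 0.000),         # first 1M tokens included
    (9_000, 0.003),         # next 9M tokens, priced per 1K
    (float("inf"), 0.002),  # volume rate beyond 10M tokens
]

def monthly_bill(units_1k: float) -> float:
    """Fold the metered usage through the tiers to produce a bill."""
    bill, remaining = 0.0, units_1k
    for size, price_per_unit in TIERS:
        used = min(remaining, size)
        bill += used * price_per_unit
        remaining -= used
        if remaining <= 0:
            break
    return bill
```

A customer who consumes 12M tokens (12,000 units) pays nothing on the first tier, $27 on the second, and $4 on the third: a $31 bill, fully attributable to that customer.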

And all of this is, of course, on top of the security, reliability, performance, and developer experience benefits of using Konnect to build, run, discover, and govern the resources that drive AI, API, eventing, and microservices innovation.
Today, Konnect Metering & Billing is in early access. If you’d like to learn more about the early access program, fill out the form here. We’ll get back to you soon.
The stakes have never been higher: It’s time to get un-fragmented
Take it from the report itself: "As AI transforms from 'nice to have' to 'must have,' the companies that master cost visibility and control will protect their margins while competitors watch profits disappear into untracked infrastructure costs."
Here's what this means practically:
1. Margins & cost efficiency will be THE differentiator
In a world where everyone has access to similar AI capabilities, competitive advantage comes from operational excellence. The companies that can deliver AI-powered experiences profitably will win.
2. You need both governance AND velocity
This isn't about choosing between speed and control. Modern AI Connectivity platforms must deliver both — fast deployment cycles with comprehensive cost governance built in from day one.
3. Unify visibility AND enforcement
Seeing your costs isn't enough. You need real-time enforcement mechanisms that prevent overruns before they impact margins, not months later in a financial review.
4. Act now, not later
Every day without unified AI cost governance is another day of margin erosion. The longer you wait, the more deeply embedded these costs become in your COGS structure, making them exponentially harder to unwind.
The bottom line?
AI innovation is critical. Speed matters. But sustainable AI businesses are built on a foundation of cost visibility and control. Before you accelerate, make sure you can see where you're going—and how much it's actually costing you to get there.
The companies that figure this out will thrive. The ones that don't will find themselves with impressive AI capabilities and vanishing margins.
At Kong, we’re here to help you win the AI cost governance battle. Just give us a ring.
Unleash the power of APIs with Kong Konnect

