The Hidden AI Fragmentation Tax: Why AI Innovation Speed Will Depend on Your AI Program Margins
Everyone's telling you to innovate faster with AI. Move quicker. Ship more features. Deploy more agents. But before we sprint headlong into the AI revolution, we need to have a proper dollars-and-cents conversation that most companies are avoiding.
Disclaimer: Statistics cited in this post were reported in the 2025 State of AI Cost Governance Report by Mavvrik.
AI is eroding margins at an alarming rate
The data is stark: 84% of companies report more than a 6% hit to gross margin from AI costs.
Within that alarming statistic, 58% see a 6–15% reduction, and 26% report erosion of 16% or more. This isn't a rounding error: this is a fundamental business problem threatening the viability of AI-powered products and services.
According to the report, AI costs are already eroding the margins that businesses have worked years to build. Yet most organizations are racing ahead without understanding the true cost of their AI infrastructure.
Why — with all of the promise that AI brings — is this happening at such an alarming rate?
3 Root Causes of AI Cost Chaos
1. Lack of visibility means lack of action
Here's a sobering reality: only about 35% of companies include on-premises components in their AI cost reporting. Even more concerning, roughly half don't include LLM API costs in their tracking — even when AI is a core product component.
When teams are asked what would most improve their cost management, the answer is clear: unified visibility across environments is the top priority, followed closely by clear cost attribution. You can't optimize what you can't see.
2. You can't plan without forecasting
Only 15% of companies can forecast AI costs within ±10% accuracy. Let that sink in. The majority (56%) miss their forecasts by 11–25%, and nearly one in four companies miss by more than 50%.
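To make those misses concrete, here's a minimal sketch of how forecast error is measured. The dollar figures are hypothetical, chosen only to illustrate the bands the report describes:

```python
# Hypothetical example: measuring how far an AI cost forecast missed.
def forecast_error_pct(forecast_usd: float, actual_usd: float) -> float:
    """Absolute forecast error as a percentage of actual spend."""
    return abs(actual_usd - forecast_usd) / actual_usd * 100

# A team budgets $100K of monthly AI spend; actuals land at $130K.
miss = forecast_error_pct(100_000, 130_000)
print(f"{miss:.1f}% miss")  # 23.1% -- squarely in the 11-25% band most companies fall into
```

A 23% miss on a six-figure monthly budget is tens of thousands of dollars of unplanned COGS, every month.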
For CFOs and budget owners trying to protect gross profit targets, this level of unpredictability is a nightmare. As AI grows as a share of cost of goods sold (COGS), the inability to accurately forecast becomes an existential threat to financial planning.
3. Multiple vendors amplify the chaos
A full 61% of companies run AI workloads across a combination of public and private environments and different tools. This hybrid pattern spans all company sizes, including small businesses, and creates exponentially greater difficulty in achieving unified cost reporting and governance.
All of this — mostly driven by lack of a unified approach to managing AI resources and AI connectivity — results in the hidden AI fragmentation tax. Let’s break it down.

Source: 2025 State of AI Cost Governance Report by Mavvrik
Breaking down the hidden fragmentation tax: The LLM token consumption problem
Let’s take an MCP-enabled agentic workflow as our example.
Each MCP client connects to MCP servers. Those servers interact with LLMs. Your end consumers trigger these interactions. And every single one of those touchpoints generates LLM API token consumption that quietly eats into your margins.
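To see how quickly this adds up, here's a back-of-the-envelope sketch. The per-token prices and the fan-out pattern are assumptions for illustration, not any specific provider's pricing:

```python
# Back-of-the-envelope token economics for one agentic request.
# Prices and fan-out are assumptions for illustration only.
PRICE_PER_1K = {"input": 0.003, "output": 0.015}  # USD per 1K tokens (hypothetical)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of a single LLM call at the assumed per-token prices."""
    return (input_tokens / 1_000) * PRICE_PER_1K["input"] + \
           (output_tokens / 1_000) * PRICE_PER_1K["output"]

# One user action fans out: the agent plans, two MCP tool results get
# summarized, and a final answer is generated -- four LLM calls, not one.
hops = [
    ("agent planning",      2_000, 500),
    ("tool result summary", 1_500, 300),
    ("tool result summary", 1_500, 300),
    ("final answer",        4_000, 800),
]
total = sum(request_cost(inp, out) for _, inp, out in hops)
```

With these assumed prices, a single user action costs about $0.056. At one million actions per month, that's roughly $55,000 in token spend, most of it invisible if you only track the final LLM call.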
The 2025 State of AI Cost Governance Report reveals a critical insight: "Even companies that do not charge for AI-enabled products are heavy users of third-party LLMs (73%), meaning token-based costs are quietly reducing gross margins without being offset by direct revenue."
Think about that. Nearly three-quarters of companies are hemorrhaging margin to LLM providers without generating corresponding revenue.
But wait . . . there’s more!
LLMs aren’t the only resource that AI consumes, and they certainly aren’t the only driver of excess networking/egress charges, which the report ranks as the second most common unexpected AI cost.

Source: 2025 State of AI Cost Governance Report by Mavvrik
Agents drive massive consumption of both APIs and real-time data, whether the AI application consumes them directly or goes through an MCP server that consumes them in turn, and organizations must account for all of it.
So, when you map out the full architecture of modern AI applications, you see a sprawling web of connectivity:
- Developers and agents connecting directly to APIs
- Developers and agents connecting directly to LLMs
- Developers and agents connecting directly to MCP servers
- Developers and agents connecting directly to event streams
- MCP servers connecting to LLMs
- MCP servers connecting to APIs
- MCP servers connecting to event streams
- API gateways routing to thousands of APIs
- LLM gateways routing to hundreds of different LLM providers
- MCP gateways routing to MCP servers
- Event gateways connecting to event brokers and event streams
- Service meshes managing the vast array of microservices powering AI applications

And of course all of this exists — for many businesses — across multiple on-prem, hybrid, cloud, and multi-cloud vendor environments.
Are you starting to see the looming problem? It’s time for a reality check.
The infrastructure reality check
The typical AI infrastructure today suffers from three critical problems:
- Disparate visibility into cost and consumption: You can see parts of the picture, but never the whole thing
- No central enforcement: Cost controls exist in silos without coordination
- Fragmented developer experience: Engineers waste time navigating incompatible systems
This is why, when the report explores “Tactics for improvement,” it finds:
- The most common tactic cited for improving AI cost management is unified visibility
- Clear cost attribution ranked second
- Better collaboration between teams is third
But, given this mess of connectivity to manage, how is this possible?
A path forward: Unified AI cost governance
The solution isn't to stop innovating or slow down. It's to build a unified AI cost governance layer: a single place for real-time cost visibility and enforcement.
From a technical implementation perspective, this means building a single cost control plane that provides:
- Real-time cost analysis across all AI connectivity
- Real-time metering of actual usage
- Usage attribution down to the team, product, or customer level
- Limit enforcement to prevent runaway costs
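Stripped to its essentials, such a control plane reduces to three primitives: meter each usage event, attribute it to an owner, and enforce a cap before the event is admitted. A deliberately minimal Python sketch of those primitives (illustrative only, not Konnect's actual implementation):

```python
from collections import defaultdict

class CostControlPlane:
    """Minimal sketch: meter usage events, attribute them to an
    owner, and enforce budget caps. Illustrative only."""

    def __init__(self) -> None:
        self.spend = defaultdict(float)  # attributed spend per (team, product) key
        self.limits: dict = {}           # budget caps per attribution key

    def set_limit(self, key, usd: float) -> None:
        self.limits[key] = usd

    def record(self, key, usd: float) -> None:
        """Meter one usage event; reject it before it blows the budget."""
        if key in self.limits and self.spend[key] + usd > self.limits[key]:
            raise RuntimeError(f"budget exceeded for {key}")
        self.spend[key] += usd

plane = CostControlPlane()
plane.set_limit(("search-team", "chatbot"), 100.0)
plane.record(("search-team", "chatbot"), 60.0)  # admitted and attributed
```

The key design point is that enforcement happens at record time, not in a monthly review: an event that would exceed the cap is rejected before the cost is incurred.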
This governance layer needs to span your entire AI infrastructure:
- AI Agents
- Kubernetes clusters
- Cloud LLMs
- On-premises LLMs
- MCP servers
- Event streams
- Cloud APIs
- On-premises APIs
- API Gateways
- LLM Gateways
- MCP Gateways
- Event Gateways
- Service Meshes
- Kubernetes Ingress Controllers
The good news? This cost governance control plane has already been built!
This is why we are so excited about OpenMeter and what it brings to the Konnect platform. By combining the existing LLM, MCP, API, Eventing, and microservices runtime infrastructure already in Konnect with the real-time metering, billing, and usage attribution of OpenMeter (now powering Konnect Metering and Billing), you have a single platform that can:
- Track and attribute AI resource consumption and usage
- Set up tiering, metering, and billing for AI resource usage and connectivity
- Enforce consumption limits based on whichever consumption metric you choose
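As a sketch of what tiering over a metered quantity looks like, here's graduated pricing over units of 1K tokens. The tier boundaries and prices are made up for illustration:

```python
# Hypothetical graduated tiers over a metered quantity (units of 1K tokens).
# Boundaries and prices are made up for illustration.
TIERS = [
    (1_000, 0.000),         # first 1M tokens included
    (9_000, 0.003),         # next 9M tokens, priced per 1K
    (float("inf"), 0.002),  # volume rate beyond 10M tokens
]

def monthly_bill(units_1k: float) -> float:
    """Fold the metered usage through the tiers to produce a bill."""
    bill, remaining = 0.0, units_1k
    for size, price_per_unit in TIERS:
        used = min(remaining, size)
        bill += used * price_per_unit
        remaining -= used
        if remaining <= 0:
            break
    return bill
```

A customer who consumes 12M tokens (12,000 units) pays nothing on the first tier, $27 on the second, and $4 on the third: a $31 bill, fully attributable to that customer.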

And all of this is, of course, on top of the security, reliability, performance, and developer experience benefits of using Konnect to build, run, discover, and govern the resources that drive AI, API, eventing, and microservices innovation.
Today, Konnect Metering & Billing is in early access. If you’d like to learn more about the early access program, fill out the form here. We’ll get back to you soon.
The stakes have never been higher: It’s time to get un-fragmented
Take it from the report itself: "As AI transforms from 'nice to have' to 'must have,' the companies that master cost visibility and control will protect their margins while competitors watch profits disappear into untracked infrastructure costs."
Here's what this means practically:
1. Margins & cost efficiency will be THE differentiator
In a world where everyone has access to similar AI capabilities, competitive advantage comes from operational excellence. The companies that can deliver AI-powered experiences profitably will win.
2. You need both governance AND velocity
This isn't about choosing between speed and control. Modern AI Connectivity platforms must deliver both — fast deployment cycles with comprehensive cost governance built in from day one.
3. Unify visibility AND enforcement
Seeing your costs isn't enough. You need real-time enforcement mechanisms that prevent overruns before they impact margins, not months later in a financial review.
4. Act now, not later
Every day without unified AI cost governance is another day of margin erosion. The longer you wait, the more deeply embedded these costs become in your COGS structure, making them exponentially harder to unwind.
The bottom line?
AI innovation is critical. Speed matters. But sustainable AI businesses are built on a foundation of cost visibility and control. Before you accelerate, make sure you can see where you're going—and how much it's actually costing you to get there.
The companies that figure this out will thrive. The ones that don't will find themselves with impressive AI capabilities and vanishing margins.
At Kong, we’re here to help you win the AI cost governance battle. Just give us a ring.
Unleash the power of APIs with Kong Konnect

