The big bet is that the same architectural pattern that emerged in cloud-native microservices will reappear in agentic AI. About a decade ago, rising traffic was straining the infrastructure of enterprises and tech companies; containers, Kubernetes, serverless, YAML configuration, CI/CD, and more emerged to decompose monoliths into microservices and build new, scalable systems.
There was a push to break the backend into smaller, self-sufficient pieces of software that could scale elastically. The cloud-native era was born.
This created an explosion of APIs to make it all talk, and HTTP RESTful traffic grew as more and more new apps and devices entered our lives. Initially, developers built connectivity logic — rate limiting, logging, authentication — directly into their microservices using libraries from their preferred programming languages. Then, as microservices grew into the hundreds, reimplementing that logic in every service became R&D-intensive, and the API gateway was born.
The connectivity logic shifted to the proxy layer — abstracted, language-independent, and capable of dispatching the right controls (like rate limits) to each service. Engineering teams returned to building core intellectual property rather than managing traffic. The API management market skyrocketed, and a complete API lifecycle grew around API gateways to help with building, running, discovering, and governing APIs.
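To make the shift concrete, here is a minimal sketch in Python, not any vendor's actual implementation: a hypothetical policy table (the names `POLICIES` and `allow_request` are illustrative) that a gateway consults per service, so rate-limiting logic lives once at the proxy instead of being rewritten inside every microservice.

```python
# A minimal sketch of gateway-side connectivity logic: a per-service
# policy table applied at the proxy, regardless of what language each
# upstream service is written in. All names and limits are hypothetical.
import time
from collections import defaultdict, deque

# Hypothetical per-service policies the gateway dispatches.
POLICIES = {
    "orders":   {"requests_per_minute": 600},
    "payments": {"requests_per_minute": 120},
}

_windows: dict[str, deque] = defaultdict(deque)  # recent request timestamps

def allow_request(service: str) -> bool:
    """Sliding-window rate limit enforced at the proxy layer:
    the upstream service contains no code for this check."""
    limit = POLICIES[service]["requests_per_minute"]
    now = time.monotonic()
    window = _windows[service]
    # Drop timestamps older than 60 seconds, then test the remaining count.
    while window and now - window[0] > 60:
        window.popleft()
    if len(window) >= limit:
        return False  # gateway rejects (e.g., HTTP 429) without touching the backend
    window.append(now)
    return True
```

Tightening a limit here is a one-line config change at the proxy, rather than a code change and redeploy across hundreds of services.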
This pattern is repeating with AI. At first, there were only a handful of LLMs within an organization, and frameworks like LangChain emerged. As companies mature their AI-native efforts, more large, medium, and small language models and agents will enter their infrastructure, driving AI traffic across calls, event streams, and tokens. As before, a new kind of connectivity logic will need to be built. Guardrails won't be reconstructed over and over in backends and LLMOps pipelines; they'll be abstracted into (once again) a new pattern: the AI gateway.
AI gateways aren't going to be the only kind of “AI middleware.” More types of traffic brokers will emerge, from inference at the edge to internal agent-to-agent communication.
The connectivity layer now developing for AI is similar to API gateways, but more complex and semantic in nature. We’ll support ingress and egress traffic for LLMs and MCP servers, applying guardrails from API calls down to individual tokens in a unified experience. For example, request rate limiting will intertwine with token budgets.
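As a hedged illustration of that intertwining (the names `TokenBudget` and `check_llm_call` and the limits are hypothetical, not a real gateway API), the sketch below counts both requests and estimated tokens in the same window: a single oversized prompt can exhaust the token budget long before the request limit would ever trigger.

```python
# A sketch of request rate limits intertwined with token budgets at an
# AI gateway. Names and limits are illustrative, not a product API.
import time
from dataclasses import dataclass, field

@dataclass
class TokenBudget:
    """Per-consumer guardrail combining two units of 'cost'."""
    max_requests_per_min: int = 60       # classic API-style rate limit
    max_tokens_per_min: int = 50_000     # semantic, LLM-specific budget
    _requests: int = 0
    _tokens: int = 0
    _window_start: float = field(default_factory=time.monotonic)

    def check_llm_call(self, estimated_tokens: int) -> bool:
        """Admit a call only if BOTH the request count and the token
        budget allow it; either limit alone can reject the call."""
        now = time.monotonic()
        if now - self._window_start >= 60:   # reset the 1-minute window
            self._requests, self._tokens = 0, 0
            self._window_start = now
        if self._requests + 1 > self.max_requests_per_min:
            return False                     # too many calls
        if self._tokens + estimated_tokens > self.max_tokens_per_min:
            return False                     # calls are cheap; tokens aren't
        self._requests += 1
        self._tokens += estimated_tokens
        return True

budget = TokenBudget()
print(budget.check_llm_call(estimated_tokens=45_000))  # True
print(budget.check_llm_call(estimated_tokens=10_000))  # False: token budget hit
```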
The approach is the same: the connectivity logic will continue to live in the traffic layer, like an airport's control tower. Just as the control tower governs safety, routing, and pacing, the AI gateway governs token usage and agents' access and permissions. API and AI traffic will be dispatched to the right people, machines, services, and agents at the right moment. Intelligently flowing.
The future path for every business is AI.
Kong built the only unified API and AI platform — securing, managing, accelerating, governing, and monetizing the flow of intelligence across every dataset, model, and API — on any cloud.
A quadrillion tokens at a time. We’re deploying the modern railroads: the connectivity layer of AI, built to unleash intelligence securely.
The Age of AI Connectivity has just begun.