The big bet is that the same architectural pattern that emerged in cloud-native microservices will reappear in agentic AI. About a decade ago, rising traffic was straining the infrastructure of enterprises and tech companies; containers, Kubernetes, serverless, YAML configuration, CI/CD, and more emerged to decompose monoliths into microservices and build new, scalable systems.
There was a push to break the backend into smaller, self-sufficient pieces of software that could scale elastically. The cloud-native era was born.
This created an explosion of APIs to make it all talk, and HTTP RESTful traffic grew as more and more new apps and devices entered our lives. Initially, developers built connectivity logic — rate limiting, logging, authentication — directly into their microservices using libraries from their preferred programming languages. Then, as microservices grew into the hundreds, reimplementing that logic in every service became R&D-intensive, and the API gateway was born.
The connectivity logic shifted to the proxy layer — abstracted, language-independent, and capable of dispatching the right controls (like rate limits) to each service. Engineering teams returned to building core intellectual property rather than managing traffic. The API management market skyrocketed, and a complete API lifecycle grew around API gateways to help with building, running, discovering, and governing APIs.
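To make the shift concrete, here is a minimal sketch in Python, not any vendor's actual implementation: a hypothetical policy table (the names `POLICIES` and `allow_request` are illustrative) that a gateway consults per service, so rate-limiting logic lives once at the proxy instead of being rewritten inside every microservice.

```python
# A minimal sketch of gateway-side connectivity logic: a per-service
# policy table applied at the proxy, regardless of what language each
# upstream service is written in. All names and limits are hypothetical.
import time
from collections import defaultdict, deque

# Hypothetical per-service policies the gateway dispatches.
POLICIES = {
    "orders":   {"requests_per_minute": 600},
    "payments": {"requests_per_minute": 120},
}

_windows: dict[str, deque] = defaultdict(deque)  # recent request timestamps

def allow_request(service: str) -> bool:
    """Sliding-window rate limit enforced at the proxy layer:
    the upstream service contains no code for this check."""
    limit = POLICIES[service]["requests_per_minute"]
    now = time.monotonic()
    window = _windows[service]
    # Drop timestamps older than 60 seconds, then test the remaining count.
    while window and now - window[0] > 60:
        window.popleft()
    if len(window) >= limit:
        return False  # gateway rejects (e.g., HTTP 429) without touching the backend
    window.append(now)
    return True
```

Tightening a limit here is a one-line config change at the proxy, rather than a code change and redeploy across hundreds of services.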
This pattern is repeating with AI. At first, there were only a handful of LLMs within an organization, and frameworks like LangChain emerged. As companies mature their AI-native efforts, more large, medium, and small language models and agents will enter their infrastructure, driving AI traffic across calls, event streams, and tokens. As before, a new kind of connectivity logic will need to be built. Guardrails won't be reconstructed over and over in backends and LLMOps pipelines; they'll be abstracted into (once again) a new pattern: the AI gateway.
AI gateways aren't going to be the only kind of “AI middleware.” More types of traffic brokers will emerge, from inference at the edge to internal agent-to-agent communication.
The connectivity layer now developing for AI is similar to API gateways, but more complex and semantic in nature. We’ll support ingress and egress traffic for LLMs and MCP servers, applying guardrails from API calls down to individual tokens in a unified experience. For example, request rate limiting will intertwine with token budgets.
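As a hedged illustration of that intertwining (the names `TokenBudget` and `check_llm_call` and the limits are hypothetical, not a real gateway API), the sketch below counts both requests and estimated tokens in the same window: a single oversized prompt can exhaust the token budget long before the request limit would ever trigger.

```python
# A sketch of request rate limits intertwined with token budgets at an
# AI gateway. Names and limits are illustrative, not a product API.
import time
from dataclasses import dataclass, field

@dataclass
class TokenBudget:
    """Per-consumer guardrail combining two units of 'cost'."""
    max_requests_per_min: int = 60       # classic API-style rate limit
    max_tokens_per_min: int = 50_000     # semantic, LLM-specific budget
    _requests: int = 0
    _tokens: int = 0
    _window_start: float = field(default_factory=time.monotonic)

    def check_llm_call(self, estimated_tokens: int) -> bool:
        """Admit a call only if BOTH the request count and the token
        budget allow it; either limit alone can reject the call."""
        now = time.monotonic()
        if now - self._window_start >= 60:   # reset the 1-minute window
            self._requests, self._tokens = 0, 0
            self._window_start = now
        if self._requests + 1 > self.max_requests_per_min:
            return False                     # too many calls
        if self._tokens + estimated_tokens > self.max_tokens_per_min:
            return False                     # calls are cheap; tokens aren't
        self._requests += 1
        self._tokens += estimated_tokens
        return True

budget = TokenBudget()
print(budget.check_llm_call(estimated_tokens=45_000))  # True
print(budget.check_llm_call(estimated_tokens=10_000))  # False: token budget hit
```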
The approach is the same: the connectivity logic will continue to live in the traffic layer, like an airport's control tower. Just as the control tower governs safety, routing, and pacing, the AI gateway governs token usage and agents' access and permissions. API and AI traffic will be dispatched to the right people, machines, services, and agents at the right moment. Intelligently flowing.
The future path for every business is AI.
Kong built the only unified API and AI platform — securing, managing, accelerating, governing, and monetizing the flow of intelligence across every dataset, model, and API — on any cloud.
A quadrillion tokens at a time. We’re deploying the modern railroads: the connectivity layer of AI, built to unleash intelligence securely.
The Age of AI Connectivity has just begun.