What is an AI Gateway?
The rise of AI and LLMs is revolutionizing the applications we’re building and the customer experiences we’re delivering. This is one of those pivotal moments in our industry where technology evolution crosses an intersection and enters a new paradigm. Past intersections were the rise of mobile, the rise of the cloud, and the rise of microservices, among others. Some people may even say that AI is as revolutionary as the birth of the internet.
Involved as I am in the world of APIs in my role as CTO of Kong, I can’t help but notice that AI usage is driven by APIs as its backbone:
- When we use AI, we do it via an API; even a browser prompt is powered by an underlying API.
- When AI interacts with the world, it does so via APIs, which will drive an exponential increase in the number of APIs as more and more products and services make themselves consumable by AI.
As such, the secular tailwinds of APIs are being strengthened by the adoption of AI. Today, more than 83% of Internet traffic is API traffic. Now that we have a new category of AI traffic, I wonder how much API traffic will grow in the coming years and how much of that traffic will be related to AI. The majority of our API traffic today is already driven by non-browser experiences like mobile and wearable devices, and AI is poised to take a big chunk of net-new API traffic in the world.
This graph shows the types of API traffic. How much API traffic will be captured by AI in the coming years?
Adopting AI
To start using AI in our products, developers and organizations around the world need to develop new best practices and adopt new technologies that establish how AI consumption is governed and secured.
There are many areas that need to be addressed in this regard:
- AI and data security: We must prevent customer and other sensitive data from being fed into AI/LLMs, where it could cause privacy leaks, security escalations, and potential data breaches.
- AI governance: We need to be able to manage how applications and teams are consuming AI across all providers and models, with a single control plane to manage, secure, and observe the AI traffic generated by the organization. Without a control plane for AI that gives this level of visibility and control, the organization is blind to how teams are adopting AI in their products, and whether they’re doing it in the right way.
- Multi-AI adoption: We should be able to leverage the best AI for the job and lower the time it takes to integrate different LLMs and models. Generalist LLMs and specialized LLMs are going to be adopted to cater to different tasks, and developers may adopt different cloud-hosted or open source models based on their requirements, with OSS models rapidly catching up in performance and intelligence.
Open source models are rapidly improving and even surpassing some private models. Source: ARK Invest
As the adoption of AI increases in the organization, we want developers to rapidly iterate and build new capabilities without having to manage the cross-cutting concerns around AI usage themselves. Therefore, to improve the productivity of application teams and to leverage AI securely and responsibly, AI needs to become a service offered by the organization’s core platform, available for consumption by any product that wants to use it. By doing so, we avoid reinventing the proverbial wheel for AI adoption across our teams.
Introducing the AI Gateway pattern
To accelerate the adoption of AI in the organization with the right level of observability, security, and governance, organizations will start adopting an AI gateway that provides a distributed AI egress to any LLM or model that developers may want to use. By doing so, we get a centralized place to manage AI consumption across every team, whether the AI models are hosted in the cloud or self-hosted.
The AI gateway operates in a similar way to a traditional API gateway, with one difference: instead of acting as a reverse proxy that exposes our internal APIs to other clients, it is deployed as an egress proxy for the AI traffic generated by our applications. That traffic is directed either inside or outside the organization, depending on where the backend AI models are hosted (in the cloud or self-hosted).
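To make the egress-proxy idea concrete, here is a minimal sketch in Python of the routing decision such a gateway makes. This is not Kong's implementation; the model names and upstream URLs are illustrative assumptions.

```python
# A minimal sketch of the AI egress idea: the application always calls the
# gateway, and the gateway decides which upstream the traffic egresses to.
# Model names and URLs below are illustrative, not real configuration.

UPSTREAMS = {
    "gpt-4": "https://api.openai.com/v1/chat/completions",  # external, cloud-hosted
    "llama-3-8b": "http://llama.internal:8080/v1/chat",     # internal, self-hosted
}

def route(model: str) -> str:
    """Resolve the logical model requested by the app to a concrete upstream."""
    try:
        return UPSTREAMS[model]
    except KeyError:
        raise ValueError(f"model {model!r} is not enabled on this gateway")

if __name__ == "__main__":
    # The application only knows the logical model name; the gateway decides
    # whether the request leaves the organization or stays inside it.
    for model in ("gpt-4", "llama-3-8b"):
        print(model, "->", route(model))
```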
Some of the many cross-cutting concerns that need to be addressed with AI adoption.
As the adoption of AI increases in the organization, we need to enable our teams to innovate without addressing the same cross-cutting concerns in every application. Without an AI gateway, we risk introducing complexity, fragmentation, security blind spots, shadow IT, and overall lower efficiency and higher costs.
In the same way that we adopt an API gateway to cater to API cross-cutting concerns, we will adopt an AI gateway to do the same for all AI consumption. To simplify the architecture, the AI gateway will ideally be managed by the platform team and offered as an internal core service to every application that needs it. This provides a unified control plane for all AI consumption generated by the organization, which lets us implement security policies and observability collection more quickly, and potentially build developer pipelines that automate onboarding whenever a team needs access to AI in its applications.
AI gateway becomes a core platform service that every team can use.
An AI gateway will support multiple AI backends (like OpenAI, Mistral, LLaMA, Anthropic, etc.) but will still provide one API interface that developers can use to access any AI model they need. We can now manage the security credentials of every AI backend from one place, so our applications don’t need to be updated whenever we rotate or revoke a credential for a third-party AI.
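As a sketch of what this looks like (a simplified illustration, not Kong's actual API; the provider URLs, header names, and environment variables are assumptions), the gateway can hold a provider registry that injects the right credential on the way out:

```python
import os

# Hypothetical provider registry: credentials live only at the gateway, so
# rotating or revoking a key never requires redeploying client applications.
# URLs and header names are simplified for illustration.
PROVIDERS = {
    "openai": {
        "url": "https://api.openai.com/v1/chat/completions",
        "headers": lambda: {"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    },
    "anthropic": {
        "url": "https://api.anthropic.com/v1/messages",
        "headers": lambda: {"x-api-key": os.environ["ANTHROPIC_API_KEY"]},
    },
}

def to_upstream(provider: str, payload: dict) -> dict:
    """Translate a gateway request into a provider request, injecting credentials."""
    p = PROVIDERS[provider]
    return {"url": p["url"], "headers": p["headers"](), "json": payload}

if __name__ == "__main__":
    os.environ.setdefault("OPENAI_API_KEY", "demo-key")  # for the sketch only
    req = to_upstream("openai", {"model": "gpt-4",
                                 "messages": [{"role": "user", "content": "hi"}]})
    print(req["url"], list(req["headers"]))
```

Because the client only names a provider and sends a payload, switching backends or rotating keys happens entirely at the gateway, with no change to the applications.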
The gateway can also implement prompt security, validation, and template generation, so that the prompts themselves can be managed from one control plane and changed without updating the client applications. Prompts are at the core of what we ask AI to do, and being able to control what prompts our applications are allowed to generate is essential for responsible and compliant adoption of AI. We wouldn’t want developers to build an AI integration around restricted topics (political ones, for example), or to mistakenly set the wrong context in a prompt, which could later be exploited by a malicious user.
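Here is a hedged sketch of centrally managed templates plus a simple prompt guard. The template wording and deny-list patterns are made up for illustration; a production guard might use classifiers or far more sophisticated rules.

```python
import re

# Illustrative template managed at the gateway: clients send only the
# variables, never the full prompt, so the wording can change centrally.
TEMPLATE = (
    "You are a support assistant for ACME. Answer only questions about "
    "ACME products. Question: {question}"
)

# Illustrative deny-list; real guards may combine regexes and classifiers.
DENIED = [re.compile(p, re.I)
          for p in (r"\bpolitic", r"ignore (all|previous) instructions")]

def render_prompt(question: str) -> str:
    """Apply the centrally managed template, rejecting disallowed inputs."""
    for pattern in DENIED:
        if pattern.search(question):
            raise PermissionError("prompt rejected by gateway policy")
    return TEMPLATE.format(question=question)

if __name__ == "__main__":
    print(render_prompt("How do I reset my ACME router?"))
```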
Because different teams will most likely use different AI services, the AI gateway can offer a standardized interface to consume multiple models, simplifying the implementation of AI across different models and even the switching between them.
The AI gateway could also implement security overlays like AuthN/Z, rate-limiting, and full API lifecycle governance to further manage how AI is being accessed internally by the teams. At the end of the day, AI traffic is API traffic.
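As a sketch of those overlays (the consumer keys, limit, and window size below are assumptions; a real gateway would use pluggable policies and distributed counters), authentication and rate limiting at the gateway can be as simple as:

```python
import time
from collections import defaultdict

API_KEYS = {"team-a-key": "team-a"}   # hypothetical consumer registry
LIMIT, WINDOW = 10, 60.0              # 10 requests per 60 seconds, per team

_hits = defaultdict(list)

def check_request(api_key: str) -> str:
    """Authenticate the consumer, then apply a fixed-window rate limit."""
    team = API_KEYS.get(api_key)
    if team is None:
        raise PermissionError("unknown consumer")
    now = time.time()
    _hits[team] = [t for t in _hits[team] if now - t < WINDOW]
    if len(_hits[team]) >= LIMIT:
        raise RuntimeError(f"rate limit exceeded for {team}")
    _hits[team].append(now)
    return team

if __name__ == "__main__":
    print(check_request("team-a-key"))  # -> team-a
```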
AI observability can be managed from one place and even exported to third-party log and metrics collectors. And since it’s configured in one place, we can capture the entirety of the AI traffic being generated, making it easier to verify that the data is compliant and that there are no anomalies in usage.
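For example (a minimal sketch; the record fields and the print-as-exporter are stand-ins for whatever collector the organization actually uses), the gateway can emit one structured record per AI call:

```python
import json
import time

def log_ai_request(team: str, provider: str, model: str,
                   prompt_tokens: int, completion_tokens: int,
                   started_at: float) -> None:
    """Emit one structured record per AI call for the metrics/log pipeline."""
    record = {
        "team": team,
        "provider": provider,
        "model": model,
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "latency_ms": round((time.time() - started_at) * 1000, 1),
    }
    print(json.dumps(record))  # stand-in for a real exporter (OTLP, etc.)

if __name__ == "__main__":
    t0 = time.time()
    log_ai_request("team-a", "openai", "gpt-4", 120, 350, t0)
```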
Last but not least, AI models are expensive to run. The AI gateway gives the organization the usage data it needs to learn from its AI consumption and implement cost-reduction initiatives and optimizations.
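A minimal sketch of that idea (the per-1K-token prices are placeholders, not real pricing): aggregate gateway-observed token usage into estimated spend per team.

```python
from collections import Counter

# Placeholder per-1K-token prices; real prices vary by provider and model.
PRICE_PER_1K = {"gpt-4": 0.03, "llama-3-8b": 0.0}  # self-hosted ~ infra cost

spend = Counter()

def record_usage(team: str, model: str, tokens: int) -> None:
    """Accumulate estimated spend per team from gateway-observed traffic."""
    spend[team] += tokens / 1000 * PRICE_PER_1K.get(model, 0.0)

if __name__ == "__main__":
    record_usage("team-a", "gpt-4", 4500)
    record_usage("team-b", "llama-3-8b", 90000)
    print(dict(spend))  # e.g. {'team-a': 0.135, 'team-b': 0.0}
```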
Conclusion
Artificial intelligence, driven by recent developments in GenAI and LLM technologies, is a once-in-a-decade technology revolution poised to disrupt the industry and forever change the application experiences that every organization builds for its customers.
APIs have always been driven by new use cases (think of the mobile revolution, microservices, and so on) and AI further increases the amount of API traffic in the world: either APIs that our applications use to communicate with the AI models themselves, or APIs that the AI models use to interact with the world.
AI consumption is hard. There are serious data compliance ramifications when using models that can be shared across different applications, and perhaps even across different organizations (think of fully managed cloud models hosted in a shared environment). In addition, AI traffic needs to be secured, governed, and observed, just as we already do for all other API traffic entering or leaving the organization. It turns out there is a great deal of infrastructure we need to put in place to use AI in production.
With an AI gateway, we can offer an out-of-the-box solution for developers to use AI in a quick and self-service way, without asking them to build AI infrastructure and all the cross-cutting capabilities it requires. By adopting an AI gateway, the organization has the peace of mind that it has full control of the AI traffic generated by every team and every application, and it finally has the right abstraction in place to ensure that AI is being used in a compliant and responsible way.
Marco Palladino,
CTO at Kong