LangGraph Server
Now that we have the final version of the AI Agent, it's time to build a LangGraph Server based on it. You have multiple deployment options for running a LangGraph Server, but we're going to use our own Minikube cluster with the deployment option called Standalone Container.
For details, you can refer to the links below:
Agent Docker Image
The first step is to create the Docker image for the server. The code below removes the lines where we execute the graph. Another change is the Kong Data Plane address, which now refers to the Kubernetes Service FQDN.
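For reference, inside the cluster the agent reaches the Data Plane through the service's FQDN; the service name and namespace below are assumptions that depend on your Kong Helm release:

```bash
# Hypothetical in-cluster address for the Kong Data Plane; the service name
# ("kong-kong-proxy") and namespace ("kong") depend on your Helm release.
export KONG_DP_URL="http://kong-kong-proxy.kong.svc.cluster.local:80"
```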
langgraph.json
The Docker image requires a “langgraph.json” file declaring the dependencies and the name of the graph variable inside the code, in our case “graph”.
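A minimal version of the file might look like this; the module path (“./agent.py”) is an assumption, while the variable name “graph” matches our code:

```bash
# Write a minimal langgraph.json; "./agent.py" is a placeholder for the
# module that defines the "graph" variable.
cat > langgraph.json <<'EOF'
{
  "dependencies": ["."],
  "graphs": {
    "agent": "./agent.py:graph"
  },
  "env": ".env"
}
EOF
```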
Docker image creation
Create the image with the “langgraph” CLI command, which requires Docker installed in your environment.
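For example (the image tag is illustrative):

```bash
# Build the server image directly with the LangGraph CLI.
langgraph build -t <your-dockerhub-user>/langgraph-agent:1.0
```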
or generate a Dockerfile with the CLI and build the image with Docker directly (a sketch using the CLI's “dockerfile” subcommand):
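```bash
# Emit a Dockerfile, then build it with plain Docker.
langgraph dockerfile Dockerfile
docker build -t <your-dockerhub-user>/langgraph-agent:1.0 .
```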
Push it to Docker Hub:
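Assuming the illustrative tag from the build step:

```bash
# Push the image so the cluster can pull it.
docker push <your-dockerhub-user>/langgraph-agent:1.0
```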
Agent Deployment
Install your LangGraph Server using the Helm Chart available:
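First, add the chart repository; the URL below is the published langchain-ai Helm repo, but verify it against the chart's documentation:

```bash
# Add the LangChain Helm repository and refresh the local index.
helm repo add langchain https://langchain-ai.github.io/helm/
helm repo update
```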
The “values.yaml” file defines the service as “LoadBalancer” to make it accessible from outside the cluster. Currently, LangGraph Server supports only Postgres as its database and Redis as its task queue. The file also specifies resources for the Postgres Kubernetes deployment. Finally, LangGraph Server requires a LangSmith API Key. LangSmith is a platform used to monitor your server. Log in to LangSmith and create your API Key.
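A sketch of the shape such a “values.yaml” might take; the key names below are placeholders, so check the chart's values schema for the real ones:

```bash
cat > values.yaml <<'EOF'
# Placeholder keys illustrating the settings described above; the actual
# names come from the chart's values schema.
config:
  langsmithApiKey: "<your-langsmith-api-key>"  # required by LangGraph Server
apiServer:
  service:
    type: LoadBalancer                         # expose the server externally
postgres:
  resources:                                   # Postgres is the only supported database
    requests:
      cpu: "1"
      memory: "2Gi"
redis: {}                                      # Redis is used as the task queue
EOF
```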
Deploy the LangGraph Server:
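A sketch, assuming the “langgraph-cloud” chart published in the langchain Helm repo; the release name is illustrative:

```bash
# Install the chart with the values file created above.
helm install langgraph-server langchain/langgraph-cloud -f values.yaml
```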
If you want to uninstall it, run:
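```bash
# Remove the release installed above (release name from the install step).
helm uninstall langgraph-server
```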
LangGraph Server API
Once the LangGraph Server is deployed, you can use its API to send requests to your graph.
Look for your assistants with:
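A sketch, assuming the server is reachable at the LoadBalancer address on port 8000 (adjust host and port to your Service definition; on Minikube, the external IP is typically obtained via “minikube tunnel”):

```bash
# Search for registered assistants on the LangGraph Server API.
curl -s http://<external-ip>:8000/assistants/search \
  -H "Content-Type: application/json" \
  -d '{"limit": 10, "offset": 0}'
```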
The expected response is:
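Something along these lines; this is the illustrative shape of an assistant object only, and the actual IDs, names, and timestamps will differ in your deployment:

```
[
  {
    "assistant_id": "<uuid>",
    "graph_id": "agent",
    "name": "agent",
    "config": {},
    "metadata": {"created_by": "system"},
    "version": 1
  }
]
```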
Use the assistant's name to invoke the graph.
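For example, a stateless run that waits for the result; the input shape must match your graph's state schema, and the question shown is illustrative:

```bash
# Invoke the "agent" graph and block until the run completes.
curl -s http://<external-ip>:8000/runs/wait \
  -H "Content-Type: application/json" \
  -d '{
        "assistant_id": "agent",
        "input": {"messages": [{"role": "user", "content": "What is the weather like today?"}]}
      }'
```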
The expected response is:
Kong AI Gateway 3.11 and Support for New GenAI Models
With Kong AI Gateway 3.11, we'll be able to support other GenAI infrastructures besides LLMs, including video, images, etc. The following diagram lists the new modes supported:

Here's an example of a Kong Route declaration with the AI Proxy Advanced plugin enabled to protect OpenAI's text-to-image Dall-E 2 model:
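A sketch in decK declarative format; the field names follow the parameters described below (with “genai-category” written as genai_category, an assumption based on Kong's usual key style), but the exact nesting should be verified against the Kong 3.11 AI Proxy Advanced plugin schema:

```bash
cat > kong.yaml <<'EOF'
# Approximate declaration; verify field names and nesting against the
# Kong 3.11 AI Proxy Advanced plugin schema.
_format_version: "3.0"
services:
  - name: image-generation-service
    url: https://api.openai.com
    routes:
      - name: image-generation-route
        paths: ["/image-generation"]
        plugins:
          - name: ai-proxy-advanced
            config:
              genai_category: image/generation
              targets:
                - route_type: image/v1/images/generations
                  auth:
                    header_name: Authorization
                    header_value: Bearer <OPENAI_API_KEY>
                  model:
                    provider: openai
                    name: dall-e-2
EOF
```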
To support this, Kong AI Gateway 3.11 defines new configuration parameters, such as:
- genai-category: is used to configure the GenAI infrastructure that the gateway protects. Besides image/generation, it supports, for example, text/generation and text/embeddings for regular LLM and embedding models, and audio/speech and audio/transcription for audio-based models implementing speech recognition, audio-to-text, etc.
- route_type: this existing parameter has been extended to support new types, such as:
  - LLM: llm/v1/responses, llm/v1/assistants, llm/v1/files and llm/v1/batches
  - Image: image/v1/images/generations and image/v1/images/edits
  - Audio: audio/v1/audio/speech, audio/v1/audio/transcriptions and audio/v1/audio/translations
  - Realtime: realtime/v1/realtime
Conclusion
This blog post has presented a basic AI Agent built with Kong AI Gateway and LangGraph. Redis was used as the vector database, and a local Ollama instance provided the Embedding Model.
Behind the Gateway, we have three LLM infrastructures (OpenAI, Mistral and Anthropic), and three external functions were used as tools by the AI Agent.
The Gateway was responsible for abstracting the LLM infrastructures and protecting the external functions with specific policies, including Rate Limiting and API Keys.
You can discover all the features available on the Kong AI Gateway page.