Engineering
December 11, 2025
14 min read

Kong AI/MCP Gateway and Kong MCP Server Technical Breakdown

Claudio Acquaviva
Principal Architect, Kong
Jason Matis
Staff Solutions Engineer, Kong

"Too much information running through my brain." When The Police sang this opening line on their 1981 album Ghost in the Machine, they weren't thinking about artificial intelligence, but the sentiment perfectly captures the state of modern AI.

The song warns of an overload of data that parallels how modern AI agents process extensive collections of messages and data. Model Context Protocol (MCP) is designed to manage this potential mess by creating channels for models to receive only the information they need. In a world where AI systems risk becoming "information junkies," MCP acts as a framework that organizes data, allowing AI to operate in a standardized and much cleaner way.

Topics: MCP, Kong Gateway, AI Gateway, AI, OAuth, decK

Table of Contents

  • Kong MCP Gateway Introduction
  • Kong and MCP
  • MCP Communication Fundamentals
  • Kong version
  • MCP Authorization
  • Kong MCP Server
  • AWS Knowledge MCP Server
  • Conclusion


Kong MCP Gateway Introduction

In the Kong Gateway 3.12 release, announced in October 2025, the following MCP capabilities were introduced:

  • AI MCP Proxy plugin: it works as a protocol bridge, translating between MCP and HTTP so that MCP-compatible clients can either call existing APIs or interact with upstream MCP servers through Kong
  • AI MCP OAuth2 plugin: implements the OAuth 2.0 specification for MCP servers
  • The Prometheus plugin has been extended to generate MCP-specific metrics

The Kong AI Gateway sits between the AI applications we build and the MCP Servers and GenAI models we consume.

In fact, the Kong MCP Gateway sits side-by-side with the Kong LLM Gateway. Here is a diagram illustrating the components. As you can see, the Kong AI Gateway implements the fundamental capabilities, while the LLM Gateway and MCP Gateway are responsible for their specific flows: GenAI models and MCP Servers, respectively.

This blog post explores how developers can leverage Kong MCP Gateway to implement better and smarter AI agents. We are going to focus on:

  • Converting Kong Gateway Services into MCP Servers
  • Consuming and protecting existing MCP Servers through Kong MCP Gateway

Kong and MCP

MCP Introduction

We’ve seen a lot of momentum around the Model Context Protocol (MCP), the new standard proposed by Anthropic in November 2024. For a great introduction to MCP, check the blog Michael Field, Principal Technical Product Marketing Manager here at Kong, wrote earlier this year.

As described in its official portal, MCP follows a client-server architecture where an AI application (tool or AI Agent) connects to MCP Servers. The MCP specification defines three participants as illustrated above in the diagram taken from the MCP portal:

  • MCP Host: The AI application (tool, application, agent) that coordinates and manages one or multiple MCP clients
  • MCP Client: A component that maintains a connection to an MCP server and obtains context from an MCP server for the MCP host to use
  • MCP Server: A program that provides context to MCP clients

Each MCP Server is responsible for exposing MCP primitives to the Clients:

  • Tools: Functions that an MCP Client can invoke.
  • Resources: Data sources that provide contextual information to the Client.
  • Prompts: Reusable templates that help structure interactions with language models (e.g., system prompts, few-shot examples).

Check the Architecture Overview page in the MCP documentation portal to learn more about it.

Kong AI/MCP Gateway

Currently, most organizations still don’t have a consistent way to standardize the MCP development process or enforce best practices. As you’d expect, this often results in risky MCP server deployments and, in the long run, longer and more expensive evolution and troubleshooting processes.

With Kong MCP Gateway we expose and secure all MCP Servers in a single place. Also, the MCP Gateway controls the consumption of the MCP Servers with OAuth 2.1, as described in the MCP documentation portal.

MCP Communication Fundamentals

MCP introduces the notion of Layers:

Data layer: defines the protocol for MCP Client and MCP Server communication, based on JSON-RPC 2.0.

Transport Layer: defines the mechanisms that enable data exchange between the MCP Client and MCP Server. It supports two mechanisms:

  • Stdio: based on standard input/output streams. It's helpful for communication between tools like Cursor, which plays the MCP Host role, and a local MCP Server running in Docker, for example.
  • Streamable HTTP: communication between the MCP Client and MCP Server takes place over HTTP.

For the purpose of this blog post, where the Kong AI Gateway mediates the communication between the MCP Client and MCP Server, we are going to focus on the Streamable HTTP mechanism.

JSON-RPC 2.0

To get started, let's exercise the JSON-RPC messages we typically see in MCP communication. To put it simply, JSON-RPC is an RPC mechanism where the parties communicate through JSON-based messages.

For example, here's a simple JSON-RPC server, written in Python, which exposes the well-known WeatherAPI public service. To give it a try, you can call the service directly with the following request. It assumes you have an environment variable set with your WeatherAPI API Key.
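Something like this, where WEATHERAPI_KEY is an assumed name for that environment variable:

```bash
curl -s "http://api.weatherapi.com/v1/current.json?key=$WEATHERAPI_KEY&q=London"
```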

The architecture is quite simple:

The code uses the "json-rpc" Python package, which implements the JSON-RPC specification. Some comments about the code:

  • The json-rpc library defines the “dispatcher” construct where we define the methods supported by the JSON-RPC server.
  • The server uses the “http.server” package so it can be called over the HTTP protocol by tools like Curl.
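Putting those pieces together, here's a minimal sketch of the server (the WEATHERAPI_KEY env var name and port 8080 are assumptions):

```python
import json
import os
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

from jsonrpc import JSONRPCResponseManager, dispatcher


@dispatcher.add_method
def ping():
    return "pong"


@dispatcher.add_method
def add(a, b):
    return a + b


@dispatcher.add_method
def get_current_weather(q):
    # WEATHERAPI_KEY is an assumed name for the env var holding your WeatherAPI API Key
    key = os.environ["WEATHERAPI_KEY"]
    url = f"http://api.weatherapi.com/v1/current.json?key={key}&q={q}"
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read())


class JSONRPCHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers["Content-Length"])).decode()
        # JSONRPCResponseManager validates the Request Object and routes it
        # to the matching dispatcher method
        response = JSONRPCResponseManager.handle(body, dispatcher)
        if response is None:  # a notification: the server must not send a response
            self.send_response(204)
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(response.json.encode())


HTTPServer(("0.0.0.0", 8080), JSONRPCHandler).serve_forever()
```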

After starting the server, you can send requests like these:
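For example, assuming the server runs locally on port 8080:

```bash
curl -s http://localhost:8080 \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc": "2.0", "method": "add", "params": {"a": 1, "b": 2}, "id": 1}'

curl -s http://localhost:8080 \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc": "2.0", "method": "get_current_weather", "params": {"q": "London"}, "id": 1}'
```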

Let's take a look at the messages going back and forth. The last request sends a message like this:
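```json
{"jsonrpc": "2.0", "method": "get_current_weather", "params": {"q": "London"}, "id": 1}
```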

The message is fully compliant with the JSON-RPC 2.0 specification where it defines the Request Object with the following members:

  • jsonrpc: specifies the version of the JSON-RPC protocol; in our case, "2.0".
  • method: contains the name of the method we want to invoke. Note the methods are defined as dispatcher methods in our code.
  • params: contains the structured value that holds the parameters to be used for the invocation. They are defined in our Python functions.
  • id: an identifier defined by the caller. It acts as a correlation key so the client knows which result belongs to which call. When omitted, the request is just a notification and the server must not send a response.

The Server is expected to return a Response Object. That's done by the JSONRPCResponseManager class imported from the same “jsonrpc” package. Here's ours:
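Trimmed for readability (the actual WeatherAPI payload has many more fields, and the values here are illustrative):

```json
{
  "jsonrpc": "2.0",
  "result": {
    "location": {"name": "London", "country": "United Kingdom"},
    "current": {"temp_c": 14.0, "condition": {"text": "Partly cloudy"}}
  },
  "id": 1
}
```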

The Object has the following members:

  • jsonrpc: the same member we found in the Request Object.
  • result: the actual response.
  • error: required on errors, which is not our case.
  • id: this is required and must be the same as the value of the id member in the Request Object. In our case, “1”.

This is a good exercise but does not really implement our goal. What we really want is to have the Kong Data Plane playing the MCP Server role. That means we need to configure the Data Plane with the new MCP Gateway capabilities. However, before doing so, we need to dive a bit deeper into the MCP Transport Layer.

MCP Transport Layer - Streamable HTTP

As stated before, MCP defines two transport mechanisms: Stdio and Streamable HTTP. Since we are going to put the Kong AI/MCP Gateway in the middle of our use cases, we are going to focus on Streamable HTTP.

We can build Streamable HTTP-based MCP Servers using multiple programming languages, for example, TypeScript, Python, or C#. For Python specifically, the FastMCP package is recommended. In fact, the official MCP Python SDK has incorporated FastMCP to simplify MCP Server development.

MCP Server

Here's the MCP Server version, using FastMCP, of our JSON-RPC 2.0 Server. As you can see, the topology and the messages going back and forth are quite similar.
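A minimal sketch (the WEATHERAPI_KEY env var name is again an assumption):

```python
import os
import urllib.request

from mcp.server.fastmcp import FastMCP

# Stateless Streamable HTTP server (see the comments below)
mcp = FastMCP("weather", stateless_http=True)


@mcp.tool()
def ping() -> str:
    """Simple health-check tool."""
    return "pong"


@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b


@mcp.tool()
def get_current_weather(q: str) -> str:
    """Get the current weather for a location from WeatherAPI."""
    key = os.environ["WEATHERAPI_KEY"]  # assumed env var name
    url = f"http://api.weatherapi.com/v1/current.json?key={key}&q={q}"
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode()


if __name__ == "__main__":
    # Streamable HTTP transport, listening on port 8000 by default
    mcp.run(transport="streamable-http")
```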

Some comments here:

  • By default, an MCP Server is stateful, meaning it starts and keeps sessions with MCP Clients. The official MCP documentation describes the Session Management implemented by a Server and supported by the Client. At the same time, the FastMCP constructs allow a stateless implementation. Read the Python SDK documentation to learn more. You can check that in the line mcp = FastMCP("weather", stateless_http=True).
  • The code defines corresponding MCP Tools for each dispatcher method we had in our JSON-RPC 2.0 Server.
  • By default, the MCP Server listens on port 8000.

With the MCP Server running, we can send similar JSON-RPC requests to it. For example, to list all Tools available in the MCP Server, run:
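A sketch, assuming the server runs locally:

```bash
curl -s http://localhost:8000/mcp \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json, text/event-stream' \
  -d '{"jsonrpc": "2.0", "method": "tools/list", "id": 1}'
```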

Note that the Server provides a specific “/mcp” path where we should send requests. Also, the client (in our case, Curl) must accept both “application/json” and “text/event-stream” media types.

The MCP Server provides several methods to expose its capabilities, including Prompts, Resources, and Tools. For example, our MCP Server code defines three Tools: “ping”, “add”, and “get_current_weather”. With the request above, we can send a request to the MCP Server and get the list of its Tools with all their specifications.

Here's the output of the request (reformatted and trimmed for readability):
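```
event: message
data: {"jsonrpc":"2.0","id":1,"result":{"tools":[{"name":"ping", ...}, {"name":"add", ...}, {"name":"get_current_weather","inputSchema":{"type":"object","properties":{"q":{"type":"string"}},"required":["q"]}}]}}
```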

The request and the corresponding response follow the Transport specification for sending messages to a server. 

And this second request calls the “get_current_weather” Tool:
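Again assuming a local server:

```bash
curl -s http://localhost:8000/mcp \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json, text/event-stream' \
  -d '{"jsonrpc": "2.0", "method": "tools/call", "params": {"name": "get_current_weather", "arguments": {"q": "London"}}, "id": 2}'
```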

And the response should follow the Tools specification and be something like:
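```
event: message
data: {"jsonrpc":"2.0","id":2,"result":{"content":[{"type":"text","text":"{\"location\": {\"name\": \"London\", ...}, \"current\": {\"temp_c\": 14.0, ...}}"}],"isError":false}}
```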

Kong version

Now, let's add the Kong API Gateway to the picture. Our goal is to have the Konnect Data Plane keep exposing regular RESTful services while also playing the MCP Server role. To do that, we need the new AI MCP Proxy plugin, available in Kong AI Gateway 3.12.

The new plugin can be configured for the following main use cases:

  • Convert regular RESTful APIs, defined as Kong Gateway Services, into MCP tools. In this use case, the plugin acts as a protocol bridge between MCP and HTTP protocols.
  • Expose grouped MCP Tools as an MCP Server.
  • Proxy MCP requests to existing MCP Servers.

As you can see, for any of these use cases, Kong, playing the MCP Gateway role, provides critical value to the Agent development process, normalizing access to any MCP Tool or MCP Server you might have in your environment.

Convert Kong Gateway Service into MCP Tool

Let's get started converting an existing Kong Gateway Service into an MCP Tool. The following diagram illustrates the architecture:

Consider the following decK (Declarations for Kong) declaration. It defines a fundamental Kong Gateway Service and Kong Route to expose the same WeatherAPI service, this time through the Gateway.

Besides the Kong Service and Route, the declaration enables the Request Transformer Advanced plugin on the Route to inject the WeatherAPI API Key. Again, we have an environment variable with the API Key. The variable is named with the DECK_ prefix so decK can take care of it, as shown in the sketch below.
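A sketch of the declaration (Service and Route names are illustrative; the /weather path matches the MCP Tool URL we use later):

```yaml
_format_version: "3.0"
services:
  - name: weather-service
    url: http://api.weatherapi.com/v1/current.json
    routes:
      - name: weather-route
        paths:
          - /weather
        plugins:
          - name: request-transformer-advanced
            config:
              add:
                querystring:
                  # decK resolves DECK_-prefixed environment variables at sync time
                  - key:${{ env "DECK_WEATHERAPI_KEY" }}
```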

After submitting the declaration, you should be able to consume the Kong Route like this:
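```bash
curl -s "http://<DATA_PLANE_DN>:<DATA_PLANE_PORT>/weather?q=London"
```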

The AI MCP Proxy Plugin

Up to this point, the Kong Gateway Service is not configured to be consumed as an MCP Tool. To do that, we need to extend the decK declaration:
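A sketch of the extended declaration (the AI MCP Proxy plugin's exact tools/parameters schema should be checked against the 3.12 plugin docs; this OpenAPI-style snippet is illustrative):

```yaml
_format_version: "3.0"
services:
  - name: weather-service
    url: http://api.weatherapi.com/v1/current.json
    routes:
      - name: weather-route
        paths:
          - /weather
        plugins:
          - name: request-transformer-advanced
            config:
              add:
                querystring:
                  - key:${{ env "DECK_WEATHERAPI_KEY" }}
          - name: ai-mcp-proxy
            config:
              mode: conversion-listener
              tools:
                - description: Returns the current weather for a given location
                  parameters:
                    # OpenAPI-style parameter description
                    - name: q
                      in: query
                      required: true
                      schema:
                        type: string
```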

This time, the Gateway Service has its Route configured with the AI MCP Proxy plugin. The “conversion-listener” mode means the Gateway Service can be consumed as an MCP Tool. Inside the “tools” section, the “parameters” configuration has an OpenAPI snippet describing the parameters the plugin should send to the Gateway Service.

Unlike our basic MCP Server from before, the AI MCP Proxy plugin implements the full MCP Connection Lifecycle as described in the MCP documentation. The lifecycle comprises three main phases:

  • Initialization: Capability negotiation and protocol version agreement
  • Operation: Normal protocol communication
  • Shutdown: Graceful termination of the connection

Let's send specific requests for each one of these phases:

Initialization
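The first request performs the “initialize” handshake (the protocol version and client info values are illustrative):

```bash
curl -isS http://<DATA_PLANE_DN>:<DATA_PLANE_PORT>/weather \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json, text/event-stream' \
  -d '{"jsonrpc": "2.0", "method": "initialize", "params": {"protocolVersion": "2025-06-18", "capabilities": {}, "clientInfo": {"name": "curl-client", "version": "1.0"}}, "id": 1}'
```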

You should get a response like this. Note that, following the specification, the plugin has generated an “mcp-session-id”, returned as a response header. The next requests should include this header.
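Trimmed, with the session header shown (values are illustrative):

```
HTTP/1.1 200 OK
Content-Type: application/json
mcp-session-id: <SESSION_ID>

{"jsonrpc":"2.0","id":1,"result":{"protocolVersion":"2025-06-18","capabilities":{"tools":{...}},"serverInfo":{...}}}
```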

The second request should be an “initialized” notification:
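```bash
curl -isS http://<DATA_PLANE_DN>:<DATA_PLANE_PORT>/weather \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json, text/event-stream' \
  -H 'mcp-session-id: <SESSION_ID>' \
  -d '{"jsonrpc": "2.0", "method": "notifications/initialized"}'
```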

And here's the response:
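Since the notification carries no id, the Streamable HTTP transport calls for a simple acknowledgment with no body:

```
HTTP/1.1 202 Accepted
```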

Operation

Now, with the MCP Session established, let's send some requests to the MCP Gateway. The first one asks the Gateway to list the available Tools:
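```bash
curl -s http://<DATA_PLANE_DN>:<DATA_PLANE_PORT>/weather \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json, text/event-stream' \
  -H 'mcp-session-id: <SESSION_ID>' \
  -d '{"jsonrpc": "2.0", "method": "tools/list", "id": 2}'
```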

The expected response, formatted, is the following. Note that the plugin names the Tool following the format “weather-route-1”. The parameter is also renamed, following the format “query_q”.
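Trimmed to the relevant fields:

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "result": {
    "tools": [
      {
        "name": "weather-route-1",
        "description": "Returns the current weather for a given location",
        "inputSchema": {
          "type": "object",
          "properties": {
            "query_q": { "type": "string" }
          },
          "required": ["query_q"]
        }
      }
    ]
  }
}
```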

And here's the actual Tool call. The request should reference the Tool name defined by the plugin and the renamed parameter:
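```bash
curl -s http://<DATA_PLANE_DN>:<DATA_PLANE_PORT>/weather \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json, text/event-stream' \
  -H 'mcp-session-id: <SESSION_ID>' \
  -d '{"jsonrpc": "2.0", "method": "tools/call", "params": {"name": "weather-route-1", "arguments": {"query_q": "London"}}, "id": 3}'
```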

The response should look something like this (trimmed):
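```json
{
  "jsonrpc": "2.0",
  "id": 3,
  "result": {
    "content": [
      {
        "type": "text",
        "text": "{\"location\": {\"name\": \"London\", ...}, \"current\": {\"temp_c\": 14.0, ...}}"
      }
    ],
    "isError": false
  }
}
```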

Shutdown

The last thing a client should do is to send a Shutdown message to the MCP Tool to gracefully terminate the connection.
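Per the Streamable HTTP transport, that's an HTTP DELETE carrying the session header:

```bash
curl -isS -X DELETE http://<DATA_PLANE_DN>:<DATA_PLANE_PORT>/weather \
  -H 'mcp-session-id: <SESSION_ID>'
```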

The response should look something like this (the exact status may vary):
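```
HTTP/1.1 200 OK
```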

MCP Client

To get a more realistic client, here's a very simple example of an MCP Client consuming the MCP Tool exposed by the Kong AI Gateway. The MCP_DN and MCP_PORT environment variables define where the Konnect Data Plane is located.
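A minimal sketch using the official MCP Python SDK (the Tool and parameter names match what the plugin generated above):

```python
import asyncio
import os

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

# MCP_DN and MCP_PORT define where the Konnect Data Plane is located
MCP_DN = os.environ.get("MCP_DN", "localhost")
MCP_PORT = os.environ.get("MCP_PORT", "80")
URL = f"http://{MCP_DN}:{MCP_PORT}/weather"


async def main():
    async with streamablehttp_client(URL) as (read_stream, write_stream, _):
        async with ClientSession(read_stream, write_stream) as session:
            # Runs the initialize/initialized handshake for us
            await session.initialize()
            tools = await session.list_tools()
            print("Tools:", [tool.name for tool in tools.tools])
            result = await session.call_tool("weather-route-1", {"query_q": "London"})
            print(result.content)


asyncio.run(main())
```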

MCP Authorization

As you might have noticed, until now there's been no security mechanism protecting the MCP Tool exposed by the Kong AI Gateway. As usual, you can take advantage of the extensive catalog of plugins the Kong API Gateway provides to implement Authentication and Security for your MCP Tool, such as Key Auth, OpenID Connect, Open Policy Agent, etc.

On the other hand, the MCP community has defined Authorization specifications, based on OAuth 2 standards. To be fully compliant with the specs, the Kong AI Gateway provides a second plugin, responsible for implementing those specs: the AI MCP OAuth2 Plugin.

The AI MCP OAuth2 Plugin

To exercise the new plugin, let's extend the decK declaration one more time:
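A sketch of the addition, appended to the Route's plugins list (the AI MCP OAuth2 plugin's exact config field names are assumptions here — check the plugin's 3.12 schema; the DECK_-prefixed env vars map to the values described below):

```yaml
# Appended to the weather-route plugins list; field names are illustrative
- name: ai-mcp-oauth2
  config:
    resource: ${{ env "DECK_MCP_AUTH_URL" }}
    authorization_servers:
      - ${{ env "DECK_KEYCLOAK_AUTHZ_URL" }}
    introspection_endpoint: ${{ env "DECK_KEYCLOAK_INTROSPECTION_URL" }}
    client_id: ${{ env "DECK_CLIENT_ID" }}
    client_secret: ${{ env "DECK_CLIENT_SECRET" }}
```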

The exercise relies on Keycloak as the external Identity Provider. In this sense, Keycloak is responsible for issuing Access Tokens and validating them when the MCP Tool is consumed. The validation follows the OAuth2 Token Introspection specification.

The following diagram shows the relationship between the Kong AI Gateway and the Identity Provider (IdP) for the Introspection flow:

  1. The MCP Consumer presents its credentials (Client ID + Client Secret) to the IdP.
  2. The IdP authenticates the Consumer, issues an Access Token and returns it to the Consumer.
  3. The MCP Consumer sends a request to the Gateway with the Access Token injected.
  4. The AI/MCP Gateway sends a request to the IdP's Introspection Endpoint to get the Access Token validated.
  5. If the Token is still valid the Gateway calls the Tool.

To get Keycloak installed, you can follow the available Quickstarts.

The declaration refers to some Keycloak endpoints and secrets, besides the actual Authorization URL. Let's go through them:

  • MCP_AUTH_URL: that's the MCP Tool URL, exposed by the Kong Data Plane. Considering our declaration, it should be like this: http://<DATA_PLANE_DN>:<DATA_PLANE_PORT>/weather.

  • KEYCLOAK_AUTHZ_URL: that's the Keycloak Authorization endpoint. Assuming an existing Keycloak realm, called “kong”, in our case, it should be something like: http://<KEYCLOAK_DN>:<KEYCLOAK_PORT>/realms/kong/protocol/openid-connect/auth.

  • KEYCLOAK_INTROSPECTION_URL: that's the Keycloak Introspection endpoint. In our case, http://<KEYCLOAK_DN>:<KEYCLOAK_PORT>/realms/kong/protocol/openid-connect/token/introspect.

  • CLIENT_ID and CLIENT_SECRET: these are the secrets of an existing Keycloak Client.

Keycloak provides the standard endpoint http://<KEYCLOAK_DN>:<KEYCLOAK_PORT>/realms/kong/.well-known/openid-configuration where you can get these and several other Keycloak configuration parameters.

Exploring the Introspection Endpoint

To exercise the Introspection Endpoint, let's send some requests to Keycloak, acting first as the Consumer and then as the Gateway.

In the first request, we play the Consumer role, using the Client Credentials Grant to get our Access Token:
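```bash
curl -s http://<KEYCLOAK_DN>:<KEYCLOAK_PORT>/realms/kong/protocol/openid-connect/token \
  -d 'grant_type=client_credentials' \
  -d "client_id=$CLIENT_ID" \
  -d "client_secret=$CLIENT_SECRET"
```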

You can decode the Access Token with:
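For example, with the jwt-cli tool (any JWT decoder works; ACCESS_TOKEN is an assumed variable name):

```bash
jwt decode $ACCESS_TOKEN
```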

You should get an output like this:
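Trimmed to a few standard Keycloak claims (values are illustrative):

```json
{
  "exp": 1765497600,
  "iat": 1765497000,
  "iss": "http://<KEYCLOAK_DN>:<KEYCLOAK_PORT>/realms/kong",
  "typ": "Bearer",
  "azp": "<CLIENT_ID>",
  "scope": "profile email"
}
```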

Now, playing the Gateway role, we are going to consume the Introspection Endpoint asking the IdP to validate the Access Token:
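```bash
curl -s http://<KEYCLOAK_DN>:<KEYCLOAK_PORT>/realms/kong/protocol/openid-connect/token/introspect \
  -u "$CLIENT_ID:$CLIENT_SECRET" \
  -d "token=$ACCESS_TOKEN"
```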

Here's a typical response. The “active” field at the bottom says the token is still valid:
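Trimmed:

```json
{
  "exp": 1765497600,
  "iss": "http://<KEYCLOAK_DN>:<KEYCLOAK_PORT>/realms/kong",
  "client_id": "<CLIENT_ID>",
  "token_type": "Bearer",
  "active": true
}
```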

However, if you wait for the Access Token to expire (10 minutes by default in Keycloak), the Endpoint returns a different output saying so:
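```json
{ "active": false }
```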

MCP Client and OAuth2

Now that the MCP Tool is protected with the AI MCP OAuth2 Plugin, let's evolve our MCP Client accordingly. The code has a new function called “get_access_token”, which is responsible for hitting Keycloak's Token Endpoint to get the Access Token. The token is then used to start the Streamable HTTP Session.

The code checks whether a specific environment variable is defined with an Access Token. If it is, the code uses it to start the Session. One nice way to test this is to run the code, set the environment variable with the token, wait for the timeout, and run the code again. You should see an error message related to the expired token.
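A sketch (the KEYCLOAK_TOKEN_URL and ACCESS_TOKEN env var names are assumptions):

```python
import asyncio
import os

import httpx
from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

MCP_DN = os.environ.get("MCP_DN", "localhost")
MCP_PORT = os.environ.get("MCP_PORT", "80")
URL = f"http://{MCP_DN}:{MCP_PORT}/weather"


def get_access_token() -> str:
    # Reuse a token from the environment if one is set (handy for testing expiry)
    token = os.environ.get("ACCESS_TOKEN")
    if token:
        return token
    # Client Credentials Grant against Keycloak's Token Endpoint, e.g.
    # http://<KEYCLOAK_DN>:<KEYCLOAK_PORT>/realms/kong/protocol/openid-connect/token
    resp = httpx.post(
        os.environ["KEYCLOAK_TOKEN_URL"],
        data={
            "grant_type": "client_credentials",
            "client_id": os.environ["CLIENT_ID"],
            "client_secret": os.environ["CLIENT_SECRET"],
        },
    )
    resp.raise_for_status()
    return resp.json()["access_token"]


async def main():
    # The token is injected as a Bearer header when starting the Session
    headers = {"Authorization": f"Bearer {get_access_token()}"}
    async with streamablehttp_client(URL, headers=headers) as (read_stream, write_stream, _):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()
            result = await session.call_tool("weather-route-1", {"query_q": "London"})
            print(result.content)


asyncio.run(main())
```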

Kong MCP Server

Now we are going to explore the other main use case implemented by the AI MCP Proxy plugin: exposing and protecting an existing MCP Server. We'll use two of them: the Kong MCP Server and the AWS Knowledge MCP Server.

The Kong MCP Server was introduced in April 2025 and is available on GitHub. It allows you to interact with Konnect and get the current configuration of your Control Planes, Kong Objects (Services, Routes, Plugins, Consumers), etc.

Installation

For the purpose of this blog post, you can deploy the Kong MCP Server in Kubernetes using these declarations.

Create a namespace and two secrets: one for the Konnect region and another for your Konnect Access Token (e.g., a PAT):
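A sketch (secret names, keys, and the region value are illustrative; the “kong” namespace matches the Kubernetes FQDN we use later):

```bash
kubectl create namespace kong

kubectl create secret generic konnect-region -n kong \
  --from-literal=region=us

kubectl create secret generic konnect-token -n kong \
  --from-literal=token=$KONNECT_TOKEN
```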

Now submit the declaration for the Kubernetes Deployment and Service, as sketched below. The Service's annotations are specific to Amazon EKS and the AWS Load Balancer Controller. Ideally, the Service should be deployed with an internal Load Balancer; however, to make it consumable even without the Gateway, we are deploying it with an internet-facing NLB (Network Load Balancer).
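A sketch (the container image, ports, and env var names are placeholders; check the Kong MCP Server GitHub repo for the exact values):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-konnect
  namespace: kong
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mcp-konnect
  template:
    metadata:
      labels:
        app: mcp-konnect
    spec:
      containers:
        - name: mcp-konnect
          image: <KONG_MCP_SERVER_IMAGE>  # see the Kong MCP Server repo
          ports:
            - containerPort: 3000  # assumed container port
          env:
            - name: KONNECT_REGION  # assumed env var name
              valueFrom:
                secretKeyRef:
                  name: konnect-region
                  key: region
            - name: KONNECT_ACCESS_TOKEN  # assumed env var name
              valueFrom:
                secretKeyRef:
                  name: konnect-token
                  key: token
---
apiVersion: v1
kind: Service
metadata:
  name: mcp-konnect
  namespace: kong
  annotations:
    # AWS Load Balancer Controller: internet-facing NLB (see note above)
    service.beta.kubernetes.io/aws-load-balancer-type: external
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
    service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
spec:
  type: LoadBalancer
  selector:
    app: mcp-konnect
  ports:
    - port: 30001  # matches the upstream URL in the decK declaration below
      targetPort: 3000
```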

Proxy MCP requests with MCP Gateway

Once you have the Kong MCP Server deployed, you can start consuming it just like you did with the MCP Server we crafted before. However, we want this MCP Server to sit behind the MCP Gateway and get controlled by it.

That's where the second AI MCP Proxy use case comes in: proxying MCP requests to an existing MCP Server. The architecture is illustrated below:

Here’s our new decK declaration:
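A sketch (the Service name mirrors the “mcp-service” we'll see later in Analytics; the Route name is illustrative):

```yaml
_format_version: "3.0"
services:
  - name: mcp-service
    url: http://mcp-konnect.kong:30001/mcp
    routes:
      - name: mcp-route
        paths:
          - /mcp
        plugins:
          - name: ai-mcp-proxy
            config:
              mode: passthrough-listener
```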

The declaration creates a new Kong Gateway Service based on the Kong MCP Server URL. In this case, the Kong MCP Server was deployed in the same Kubernetes cluster as the Kong Data Plane, so we can use the Kong MCP Server's Kubernetes FQDN to connect to it. As expected, the Route is reachable on the “/mcp” path.

The AI MCP Proxy plugin is configured one more time, but this time with the config mode set to “passthrough-listener”, meaning the MCP Gateway listens for incoming MCP requests and proxies them to the upstream URL of the Gateway Service (http://mcp-konnect.kong:30001/mcp). The main benefit is that Kong generates MCP observability metrics for this traffic.

MCP Inspector

After submitting the declaration, we can check the MCP Server. This time, we’ll use the MCP Inspector tool provided by the MCP project. You can run it as a container in your Docker or Podman environment. For example, for Podman, run the following command:
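A sketch, assuming the Inspector's published container image and its default UI (6274) and proxy (6277) ports:

```bash
podman run -d --name mcp-inspector \
  -p 6274:6274 -p 6277:6277 \
  -e HOST=0.0.0.0 \
  ghcr.io/modelcontextprotocol/inspector
```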

If you want to stop it and delete it, run:
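```bash
podman stop mcp-inspector && podman rm mcp-inspector
```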

Check the MCP Inspector logs and get the Session Token:
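```bash
# The startup logs print the session token and a pre-authenticated URL
podman logs mcp-inspector
```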

On macOS, start the application with the following command. You should land on the MCP Inspector home page.
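A sketch (the MCP_PROXY_AUTH_TOKEN query parameter carries the session token taken from the logs):

```bash
open "http://localhost:6274/?MCP_PROXY_AUTH_TOKEN=<SESSION_TOKEN>"
```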

For “Transport Type” choose “Streamable HTTP” and add your Kong Data Plane URL (e.g. http://k8s-kong-kongkong-fae959197b-6192eafba1fb8e62.elb.us-east-2.amazonaws.com/mcp). Make sure you have set the Connection Type as “Via Proxy” so the requests go through the MCP Inspector's MCP Proxy.

Click “Connect”. MCP Inspector should open the “Tools” box. Inside the box, click “List Tools” to see all MCP Tools exposed by the Kong MCP Server. Choose “list_control_planes” and you should see a new box with all parameters available to run the Tool. Click “Run Tool” and see the results:

Kong AI Gateway Analytics

At the same time, Kong AI Gateway is generating observability data related to the MCP Server and Tool consumption. Here's the Kong Gateway Service “mcp-service” Analytics landing page.

AWS Knowledge MCP Server

AWS provides a long list of MCP Servers. A particularly interesting one is the AWS Knowledge MCP Server which provides real-time access to AWS documentation.

A similar decK declaration can be used to consume it:
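A sketch (the upstream URL is the AWS Knowledge MCP Server's public endpoint; Service and Route names are illustrative, with the /aws-mcp path matching the Insomnia example below):

```yaml
_format_version: "3.0"
services:
  - name: aws-mcp-service
    url: https://knowledge-mcp.global.api.aws
    routes:
      - name: aws-mcp-route
        paths:
          - /aws-mcp
        plugins:
          - name: ai-mcp-proxy
            config:
              mode: passthrough-listener
```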

Kong Insomnia

This time, to consume the external MCP Server, we are going to use Insomnia, the AI-native API platform for developers to design and test any endpoint, which now includes MCP client support. Read the documentation to learn more about it.

Inside Insomnia, start a new Project or use an existing one. Click “+” and choose “MCP Client”.

  • For the HTTP box, add the Kong AI Gateway Data Plane URL for the AWS MCP Server we just defined. E.g.: http://k8s-kong-kongkong-fae959197b-6192eafba1fb8e62.elb.us-east-2.amazonaws.com/aws-mcp
  • Click “Connect”. You should see the list of available Tools on your left. Choose “aws__read_documentation”.
  • In the middle pane, inside the “url” box type: “https://docs.aws.amazon.com/eks/latest/userguide/what-is-eks.html” or any AWS documentation link. Click “Call Tool”.
  • You should be able to see the response inside the “Preview” tile:

Kong AI Gateway Analytics Explorer

Return to Kong AI Gateway Analytics options and choose “Explorer”. Change the time period to “Last 15 minutes”, and show “Request Count” per “10 seconds” by “Gateway service”. You should see a diagram showing the consumption of both MCP Tools.

Conclusion

In this blog post, we explored how Kong MCP Gateway can serve as a powerful bridge between AI agents and enterprise systems by exposing internal capabilities through the Model Context Protocol. With Kong’s flexible plugin ecosystem and first-class support for MCP, it becomes straightforward to integrate tools, data sources, and downstream APIs in a secure and scalable way. Beyond simple tool invocation, Kong MCP Gateway enables advanced agentic patterns such as multi-step workflows, contextual enrichment, and safe function calling, while maintaining strong governance over every interaction.

Just like we do with traditional API workloads, the MCP Gateway can be combined with the rich set of Kong plugins to strengthen and enhance AI-based use cases. For example, you can protect MCP tool endpoints using the MCP OAuth2 Plugin.

To explore the full set of capabilities available for building MCP-enabled architectures, visit the Kong AI Gateway product page, or check out the Kong + AWS partnership hub at https://konghq.com/partners/aws to learn more.

