June 19, 2026

Hugo Guerrero

Principal Tech PMM, Kong

*This is part of a three-part series. For the full story, see [Your AI Agent Knows What, It Doesn't Know Why](https://konghq.com/blog/enterprise/durable-commit-log-ai-observability) and [The Case for Kafka as the Agent's Memory Layer](https://konghq.com/blog/engineering/kafka-as-the-agents-memory-layer).*

We made the case for why [agentic AI systems need a durable commit log](https://konghq.com/blog/enterprise/durable-commit-log-ai-observability) rather than static state snapshots. The argument is conceptual: state tells you what is, and a log tells you how it became so. But conceptual arguments only get you so far. At some point you have to answer the question architects actually ask: *what does this look like when you deploy it?*

This post goes deeper. We'll walk through the full reference architecture — the topic catalog, the schema contracts, the two Kong data planes and how they connect, and the complete lifecycle of one agent turn from the moment a request arrives to the moment a downstream projection updates. This is the implementation detail behind the narrative.

## Two data planes, one governed path

The architecture is built around two data planes, both managed by Kong.

The first is the sync data plane, [Kong AI Gateway](https://konghq.com/products/kong-ai-gateway), which handles all synchronous traffic between your agents and the outside world. Every inbound client request, every outbound call to an LLM provider like Anthropic or OpenAI, every [MCP (Model Context Protocol)](https://konghq.com/blog/learning-center/what-is-mcp) tool invocation, and every A2A (agent-to-agent) gRPC call flows through here. Kong AI Gateway enforces authentication via mTLS or OIDC (OpenID Connect), applies role-based access control at the tool level, runs prompt firewall checks, redacts PII (personally identifiable information) before requests reach the model, enforces token-budget and cost guardrails, routes to the right model, and emits distributed traces via OTLP (OpenTelemetry Protocol). None of this requires changes to agent code.

The second is the async data plane, [Kong Event Gateway](https://konghq.com/products/event-gateway), which governs all event-driven communication. It sits in front of your Apache Kafka cluster and enforces who can publish to which topic, validates every incoming event against its registered AsyncAPI schema before it hits the broker, handles PII redaction at the broker side, manages retention and compaction policies per topic, routes failed events to dead-letter queues, and exposes a replay API so you can rewind and reprocess any partition of the log.

The connection between the two planes is what the architecture calls the **tap rail**. Every interaction that flows through Kong AI Gateway — every LLM completion, every tool call, every agent response — gets tapped and published as a structured event to Kong Event Gateway and on to Kafka. This happens at the gateway layer, not in the agent framework. Your agents call the model. Kong AI Gateway proxies the call, captures the full interaction, and publishes it to the commit log. The agent never has to know the log exists, and crucially, the log is complete regardless of which agent framework you're running.
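To make the tap idea concrete, here is a minimal sketch in plain Python. It is not Kong configuration or a Kong API: `tap`, `fake_llm`, and `publish_to_log` are hypothetical names, and an in-memory list stands in for the publish path to Kafka. The only point it illustrates is that capture happens around the call, so the caller's code does not change.

```python
import json
import time
from typing import Callable

def tap(upstream: Callable[[dict], dict],
        publish: Callable[[str, dict], None],
        topic: str) -> Callable[[dict], dict]:
    """Wrap an upstream call so each request/response pair is also published."""
    def proxied(request: dict) -> dict:
        started = time.time()
        response = upstream(request)  # the caller still receives the real response
        publish(topic, {
            "session_id": request.get("session_id"),
            "request": request,
            "response": response,
            "latency_ms": round((time.time() - started) * 1000, 3),
        })
        return response
    return proxied

# Stand-ins for illustration only: a fake model and an in-memory "log".
log = []

def fake_llm(request: dict) -> dict:
    return {"completion": f"echo: {request['prompt']}"}

def publish_to_log(topic: str, event: dict) -> None:
    log.append((topic, json.dumps(event, sort_keys=True)))

call_llm = tap(fake_llm, publish_to_log, topic="llm.completion")
result = call_llm({"session_id": "s-1", "prompt": "hello"})
```

The agent-side code calls `call_llm` exactly as it would call the model directly, and the record lands in `log` as a side effect of the proxying layer.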

Both data planes share a single control plane: Kong Konnect. The same platform that surfaces Kong AI Gateway traffic — every LLM call, every MCP invocation, every A2A message — also surfaces Kong Event Gateway traffic: every topic publish, every schema validation result, every subscription. That unified visibility is what makes cross-cutting governance — consistent policy, unified audit lineage, single-pane observability — actually achievable across your entire AI connectivity surface. No stitching together separate dashboards for your sync and async layers. One control plane sees it all.

## Designing the topic catalog

Before you deploy, you need a deliberate topic catalog. Not a single catch-all topic — a set of topics, each representing a semantically distinct event type in the agent's reasoning lifecycle.

The logical reference architecture (R1) organizes the commit log into two streams. `agent.actions` is the primary action log: every tool call, decision, context retrieval, reasoning step, and final response the agent produces. It's durable, ordered, and replayable by design. `agent.judgments` is the quality evaluation stream: the scores, labels, and rationale produced by the judge LLM as it evaluates each turn. These two streams are logically separate so downstream consumers can subscribe to exactly what they need. Your compliance store needs `agent.actions`. Your quality dashboard needs `agent.judgments`. Your model training pipeline probably wants both.

At the implementation level, the reference architecture defines ten primary topics:

- `agent.invoked`: session start
- `agent.context.retrieved`: data fetched from an external source
- `agent.tool.invoked`: an MCP tool was called
- `agent.tool.result`: the tool's response
- `agent.reasoning.step`: each step in the chain of thought
- `agent.decision`: the agent's decision point
- `agent.response`: the final response before delivery to the caller
- `audit.envelope`: a compliance-grade record of the full interaction
- `judge.scored`: the quality judgment from the judge LLM
- `agent.dlq.*`: dead-letter queues for failed events

Each topic is keyed on `session_id`. This is the most important partitioning decision in the design. When all events from a single agent session land in the same Kafka partition — determined by `hash(session_id)` — you get strict ordering within that session for free. Every tool call follows its invocation. Every decision follows its context. The causal chain is preserved at the storage layer, before any consumer touches it.
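The effect of keying on `session_id` can be sketched in a few lines. This is an illustration under stated assumptions, not Kafka's actual partitioner: CRC32 is a stand-in hash (Kafka's default partitioner uses murmur2 on the key bytes), and `partition_for` and `session_view` are hypothetical helpers. The example models a single topic.

```python
import zlib

NUM_PARTITIONS = 24  # partition count from the reference configuration

def partition_for(session_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition. CRC32 is a stand-in for Kafka's murmur2."""
    return zlib.crc32(session_id.encode("utf-8")) % num_partitions

# Events for one topic, in the order they were produced.
produced = [
    ("sess-a", "step-1"),
    ("sess-b", "step-1"),
    ("sess-a", "step-2"),
    ("sess-a", "step-3"),
]

# Each partition is an append-only list, so arrival order is preserved.
partitions = {}
for session_id, payload in produced:
    partitions.setdefault(partition_for(session_id), []).append((session_id, payload))

def session_view(session_id: str) -> list:
    """Read one session's events back from the single partition that holds them."""
    return [p for sid, p in partitions[partition_for(session_id)] if sid == session_id]
```

Because the same key always hashes to the same partition, a consumer reading that partition sees one session's events in the order they were written, even when other sessions share the partition.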

The operational configuration is: 24 partitions per topic, replication factor of 3, `acks=all` (every write must be acknowledged by all in-sync replicas before the producer gets confirmation back). Retention is infinite for compacted topics and seven days for dead-letter queues. `min.insync.replicas=2` means the cluster can tolerate one broker failure without losing write availability.
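Those numbers can be written down as a small sketch. The config keys (`acks`, `retention.ms`, `min.insync.replicas`, `cleanup.policy`) are standard Kafka settings; the dictionary names and the `tolerable_broker_failures` helper are hypothetical. Kafka log compaction retains only the latest record per key, so the sketch models the action log with unlimited time-based retention instead; treat that choice as an assumption to revisit for your own cluster.

```python
PARTITIONS = 24
REPLICATION_FACTOR = 3
MIN_INSYNC_REPLICAS = 2
SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000

# Producer side: wait for every in-sync replica before treating a write as done.
producer_config = {"acks": "all"}

# Assumption: the action log keeps every record (no time-based deletion).
action_log_topic_config = {
    "retention.ms": "-1",
    "min.insync.replicas": str(MIN_INSYNC_REPLICAS),
}

# Dead-letter queues expire after seven days.
dlq_topic_config = {
    "cleanup.policy": "delete",
    "retention.ms": str(SEVEN_DAYS_MS),
    "min.insync.replicas": str(MIN_INSYNC_REPLICAS),
}

def tolerable_broker_failures(replication_factor: int, min_insync: int) -> int:
    """Replicas that can be lost while acks=all writes still succeed."""
    return replication_factor - min_insync
```

With a replication factor of 3 and `min.insync.replicas=2`, the helper returns 1, which matches the one-broker tolerance described above.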

## Schema governance with AsyncAPI 3.1

Every topic has a registered schema in the schema registry. The architecture uses AsyncAPI 3.1 (a specification for event-driven APIs, roughly the equivalent of OpenAPI for async communication) and enforces BACKWARD compatibility — new schema versions must be readable by consumers built against the previous version.

This constraint is more consequential than it sounds. An agent framework upgrade that changes the shape of an `agent.tool.invoked` event will be caught at publish time if it breaks the registered schema. Kong Event Gateway validates every event at the broker side before it lands. Events that fail schema validation are rejected and routed to the DLQ with the original payload and a validation failure reason attached. Your SIEM consumer, your compliance store, and your vector materializer all built against `ToolInvoked.v1` keep working when `ToolInvoked.v2` ships, because BACKWARD compatibility guarantees that.
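The validate-or-dead-letter behavior can be sketched as follows. This is an illustration only: the field list in `TOOL_INVOKED_V1` is hypothetical (the real `ToolInvoked.v1` contract is not shown in this post), the DLQ topic name is an assumed instance of the `agent.dlq.*` pattern, and a production gateway would validate against the registered AsyncAPI schema rather than a Python dictionary.

```python
# Hypothetical required fields and types, for illustration only.
TOOL_INVOKED_V1 = {
    "event_id": str,
    "session_id": str,
    "tool_name": str,
    "arguments": dict,
}

def validate(event: dict, schema: dict) -> list:
    """Return a list of human-readable validation errors (empty if valid)."""
    errors = []
    for field, expected in schema.items():
        if field not in event:
            errors.append(f"missing field: {field}")
        elif not isinstance(event[field], expected):
            errors.append(f"wrong type for {field}: expected {expected.__name__}")
    return errors

def route(event: dict, schema: dict, topic: str, dlq_topic: str) -> tuple:
    """Send valid events to the topic; wrap invalid ones for the DLQ."""
    errors = validate(event, schema)
    if errors:
        # Keep the original payload and attach the failure reason.
        return dlq_topic, {"payload": event, "reason": "; ".join(errors)}
    return topic, event

good = {"event_id": "e1", "session_id": "s1", "tool_name": "search", "arguments": {}}
bad = {"event_id": "e2", "session_id": "s1", "arguments": "oops"}

good_route = route(good, TOOL_INVOKED_V1, "agent.tool.invoked", "agent.dlq.tool.invoked")
bad_route = route(bad, TOOL_INVOKED_V1, "agent.tool.invoked", "agent.dlq.tool.invoked")
```

The rejected event is not dropped: it arrives in the dead-letter topic intact, alongside the reason it failed, so it can be inspected and replayed once the producer is fixed.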

The schema registry also drives the replay API. When you want to replay a session — to debug a reasoning failure or run a new judge model over historical data — the replay API serves events with their original schema version. Consumers can process them correctly even if the current schema has evolved.
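On the consumer side, handling replayed events in their original schema version might look like the sketch below. Everything specific here is an assumption: the `schema` field on the event, the idea that a hypothetical `ToolInvoked.v2` added an optional `timeout_ms` field, and the `upcast_tool_invoked` and `replay` helpers. The replay API itself is not modeled; a list stands in for the served events.

```python
def upcast_tool_invoked(event: dict) -> dict:
    """Normalize any known version of the event to the (hypothetical) v2 shape."""
    version = event["schema"]
    if version == "ToolInvoked.v1":
        # Assumption: v2 only added an optional field, so v1 upcasts with a default.
        return {**event, "schema": "ToolInvoked.v2", "timeout_ms": None}
    if version == "ToolInvoked.v2":
        return dict(event)
    raise ValueError(f"unknown schema version: {version}")

def replay(events: list, session_id: str) -> list:
    """Return one session's events, in log order, normalized to a single shape."""
    return [upcast_tool_invoked(e) for e in events if e["session_id"] == session_id]

historical = [
    {"schema": "ToolInvoked.v1", "session_id": "s1", "tool_name": "search"},
    {"schema": "ToolInvoked.v2", "session_id": "s1", "tool_name": "fetch", "timeout_ms": 500},
    {"schema": "ToolInvoked.v1", "session_id": "s2", "tool_name": "lookup"},
]

replayed = replay(historical, "s1")
```

A new judge model or debugger can then work against one shape while the log keeps each event exactly as it was written.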

## Walking one agent turn end-to-end

Let's trace a single request through the full stack. One turn. Thirteen steps. This is the anatomy of everything the architecture captures.

**Steps 1–2: Request ingress.**

The client (a browser, a CLI, or an upstream agent) sends a request to Kong AI Gateway over HTTPS with mTLS (mutual TLS, two-way authentication). Kong AI Gateway authenticates the caller, checks rate limits, runs the prompt firewall, applies any PII redaction on the input, and routes the request downstream to the agent runtime via an HTTP/2 sidecar. Every event in the ledger that follows carries four identifiers: `event_id` (a UUIDv7, a time-ordered UUID format that sorts chronologically), `turn_id`, `session_id`, and `causation_id`. The `causation_id` links each event back to the event that triggered it, making the full causal graph reconstructable from the log without any external metadata.
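Here is a minimal sketch of how `causation_id` lets you rebuild a causal chain from the log alone. The `Event` dataclass and `causal_chain` function are hypothetical, the `event_type` field is added for readability, and short string IDs stand in for real UUIDv7 values.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Event:
    event_id: str               # a UUIDv7 in the real envelope; short IDs here
    turn_id: str
    session_id: str
    causation_id: Optional[str]  # event_id of the triggering event, None for the root
    event_type: str

def causal_chain(events: list, event_id: str) -> list:
    """Walk causation_id links back to the root; return event types root-first."""
    by_id = {e.event_id: e for e in events}
    chain = []
    seen = set()
    current = by_id.get(event_id)
    while current is not None and current.event_id not in seen:
        seen.add(current.event_id)  # guard against accidental cycles
        chain.append(current.event_type)
        current = by_id.get(current.causation_id) if current.causation_id else None
    return list(reversed(chain))

turn = [
    Event("e1", "t1", "s1", None, "agent.invoked"),
    Event("e2", "t1", "s1", "e1", "agent.tool.invoked"),
    Event("e3", "t1", "s1", "e2", "agent.tool.result"),
    Event("e4", "t1", "s1", "e3", "agent.response"),
]

why_response = causal_chain(turn, "e4")
```

Starting from the final response, the walk recovers every step that led to it, using nothing but fields carried on the events themselves.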

**Step 3: Agent publishes its invocation.**

The agent runtime starts the turn and immediately publishes to the `agent.invoked` topic via Kong Event Gateway. Schema: `AgentInvoked.v1`. This is the agent's declaration that a turn has begun, logged before any reasoning happens.

**Step 4: Event Gateway commits to Kafka.**

Kong Event Gateway validates the schema, enforces the topic ACL (access control list), and writes the event to the Kafka commit log on the partition determined by `hash(session_id)`. The broker sends back an `ack` once all in-sync replicas confirm the write.
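The ACL-then-append part of this step can be sketched as below. The policy table, principal names, and `commit` function are hypothetical, and a Python list stands in for a partition; the returned offset plays the role of the broker's acknowledgment. Schema validation is omitted here to keep the sketch focused.

```python
from fnmatch import fnmatchcase

# Hypothetical write policy: principal -> topic patterns it may publish to.
WRITE_ACL = {
    "agent-runtime": ["agent.*"],
    "judge-llm": ["judge.scored"],
}

commit_log = []  # stand-in for one partition of the commit log

def may_publish(principal: str, topic: str) -> bool:
    """True if any of the principal's patterns matches the topic name."""
    return any(fnmatchcase(topic, pattern) for pattern in WRITE_ACL.get(principal, []))

def commit(principal: str, topic: str, event: dict) -> int:
    """Enforce the ACL, append the event, and return its offset as the ack."""
    if not may_publish(principal, topic):
        raise PermissionError(f"{principal} may not publish to {topic}")
    commit_log.append((topic, event))
    return len(commit_log) - 1

offset = commit("agent-runtime", "agent.invoked", {"session_id": "s-1"})
```

A principal without a matching pattern never reaches the append, so unauthorized writes leave no trace in the log itself.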

**Steps 5–6: LLM call and tap.**

The agent calls the LLM provider via Kong AI Gateway. Kong AI Gateway enforces the token budget, redacts PII from the prompt, and proxies the request to Anthropic or OpenAI over HTTPS. When the completion comes back, Kong AI Gateway taps it and publishes `llm.completion` → `LlmCompletion.v1` to Kafka. The agent receives its completion. The log receives a structured record of exactly what the model was asked and what it returned — captured at the wire level, not inside the agent.

**Steps 7–9: Tool invocation and tap.**

The agent calls an MCP tool over JSON-RPC. Kong AI Gateway enforces tool-level RBAC before the call goes through. On the way back, Kong AI Gateway taps both the invocation and the result, publishing `agent.tool.invoked` and `agent.tool.result` → `ToolInvoked.v1` to Kafka. Kong Event Gateway writes both events to the commit log. If the session has parallel tool calls in flight, they land in the same partition in the order they complete, preserving the causal sequence.

**Steps 10–11: Judge LLM scores the turn.**

A separate judge LLM running inside ksqlDB subscribes to `agent.*` topics and consumes the turn as events land. The critical distinction: it evaluates, it does not act. It scores reasoning quality, flags anomalies, and publishes `judge.scored` → `JudgeScored.v1` back to Kafka via Kong Event Gateway with three fields: a numeric score, a classification label, and a natural-language rationale.

The log is now a record not just of what the agent did, but how well it did it. Aggregated over time, that's continuous quality trending: drift in reasoning quality shows up as a trend in the scores before it shows up as a production incident. A failure judgment can trigger a downstream alert in Splunk or, if you wire it, become an input event for an automated remediation workflow. The judgment stream also produces curated training data: every scored interaction, labeled by the judge, is a candidate for fine-tuning the next model version.

One caveat worth stating directly: judge models miscalibrate and drift. They need periodic human calibration. Most run asynchronously, which means judgment lag is a real operational variable. The value of the judge loop depends entirely on consistent, schema-governed event output: if the underlying action events are malformed or inconsistent, the judge's scores are meaningless. This is why schema governance isn't optional infrastructure; it's what makes the quality loop reliable.
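As a rough sketch of what quality trending over the judgment stream could look like, the example below compares the mean score of the latest window against the window before it. The `JudgeScored` dataclass mirrors the three fields named above plus a `session_id`; the `drift_detected` function, its window size, and its threshold are assumptions for illustration, not part of the reference architecture.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass(frozen=True)
class JudgeScored:
    session_id: str
    score: float     # numeric score
    label: str       # classification label
    rationale: str   # natural-language rationale

def drift_detected(scores: list, window: int = 3, drop: float = 0.1) -> bool:
    """True when the latest window's mean falls at least `drop` below the previous one."""
    if len(scores) < 2 * window:
        return False  # not enough history to compare two windows
    previous = mean(scores[-2 * window:-window])
    latest = mean(scores[-window:])
    return previous - latest >= drop

judgments = [
    JudgeScored("s1", 0.9, "pass", "grounded in retrieved context"),
    JudgeScored("s2", 0.9, "pass", "correct tool selection"),
    JudgeScored("s3", 0.9, "pass", "complete answer"),
    JudgeScored("s4", 0.7, "warn", "skipped a verification step"),
    JudgeScored("s5", 0.7, "warn", "partially unsupported claim"),
    JudgeScored("s6", 0.7, "warn", "redundant tool calls"),
]

alert = drift_detected([j.score for j in judgments])
```

A check like this only means something if the judge itself is calibrated, which is the caveat above: the trend tells you the scores moved, not whether the agent or the judge is the one drifting.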

**Step 12: Vector materializer updates the read-side store.**

Apache Flink consumes `agent.*` events and upserts embeddings into Pinecone. This is how your vector database stays consistent with the commit log: not as a primary store but as a derived projection that is eventually consistent with the truth. Redis gets updated at the same time with current session state, scoped to a session TTL.

**Step 13: Response returns to caller.**

The agent sends its final response back through Kong AI Gateway over HTTPS with mTLS. Kong AI Gateway returns `InvokeResponse.v1`. One complete turn, 13 governed steps, and a fully reconstructable record of every interaction in between.

## The read-side projections

Everything downstream of the Kafka commit log is a projection — a derived, eventually-consistent view of the truth. It helps to think of the commit log as doing three jobs simultaneously. First, it remembers: every agent action is durably recorded and ordered. Second, it evaluates: the judge LLM loop consumes the action stream, scores it, and publishes judgments back, turning the log into a record of both what the agent did and how well. Third — and this connects to a separate use case — the event backbone can trigger: a failure judgment becomes an input event that kicks off a remediation workflow, closing the loop back to automated response.

The downstream projection layer serves six roles. Pinecone handles vector search (cosine similarity, index `agent-memory-v2`, populated by Flink) — retrieval projection, not the truth itself. Redis holds current session state (key-value, TTL-scoped to the session) — current state, not history. Snowflake serves analytics and trends (ingested via Fivetran from the Kafka stream). S3 handles compliance archival (the `audit.envelope` topic written to object storage with S3 Object Lock for WORM — write-once, read-many — retention, seven-year compliance hold). Splunk runs security monitoring (live tail on `agent.*` with rules and incident alerting). And the judge LLM loop feeds a training pipeline — every scored interaction is a labeled training candidate for fine-tuning.

Each of these subscriptions is governed. Kong Event Gateway enforces read and write policy on the action log and judgment stream, schema governance on every event flowing into a downstream consumer, retention and redaction rules before data hits the broker, PII masking for sensitive data, and geographic and regulatory boundaries — so data that needs to stay within a specific jurisdiction does, by configuration rather than by custom code. If the log is the source of truth for agent behavior, the governance of who can subscribe to that log and what they can see is the foundation for everything built on top.

The key architecture decision here is that none of these stores is the source of truth. Pinecone is fast but derived. Redis is convenient but ephemeral. Snowflake is analytical but lagging. S3 is durable but write-once. The Kafka commit log is the source of truth. If you need to rebuild Pinecone from scratch, you replay the log and re-run the materializer. If your Redis cache is stale, you replay the last N events for a session and rehydrate it. The projections are views. The stream is the database.
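Rehydrating a projection from the log can be sketched as a fold over replayed events. In this illustration a Python list stands in for the replayed stream and a dictionary stands in for the Redis session entry; the event shape (`type`, `payload`) and the `rebuild_session_state` function are hypothetical.

```python
from typing import Optional

def rebuild_session_state(events: list, session_id: str, last_n: Optional[int] = None) -> dict:
    """Fold one session's events (optionally only the last N) into current state."""
    session_events = [e for e in events if e["session_id"] == session_id]
    if last_n is not None:
        session_events = session_events[-last_n:]
    state = {}
    for event in session_events:
        state["last_event_type"] = event["type"]
        state["event_count"] = state.get("event_count", 0) + 1
        if event["type"] == "agent.tool.result":
            state.setdefault("tool_results", []).append(event["payload"])
    return state

session_log = [
    {"session_id": "s1", "type": "agent.invoked", "payload": {}},
    {"session_id": "s2", "type": "agent.invoked", "payload": {}},
    {"session_id": "s1", "type": "agent.tool.result", "payload": {"rows": 3}},
    {"session_id": "s1", "type": "agent.response", "payload": {"text": "done"}},
]

state = rebuild_session_state(session_log, "s1")
```

Because the state is a pure function of the events, you can discard the projection and compute it again at any time; the same pattern applies to re-running a materializer over the full log.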

## What Kong governs versus what you own

The boundary of Kong's governance surface in this architecture is specific and worth stating explicitly. Kong governs the sync data plane (all traffic through Kong AI Gateway), the async data plane (all events through Kong Event Gateway), the Kafka cluster (managed by Kong), and the schema registry. On the sync side, that means authentication, authorization, rate limiting, prompt safety, PII handling, cost control, schema validation, and distributed tracing. On the async side, that means topic access control (read and write policy on the action log and judgment stream), schema governance on every action, decision, and judgment event, retention and redaction rules, PII masking, geographic and regulatory data boundaries, governed subscriptions for every downstream consumer, audit lineage, consumer authentication, and schema versioning. Both surfaces are governed consistently, across every agent and every framework running in your environment.

You own the agent runtime (any framework, any Kubernetes configuration), the read-side projections (your Pinecone account, your Redis cluster, your Snowflake instance), and the stream processors (your ksqlDB deployment, your Flink jobs). Kong gives you the governed event stream to consume from. What you build on top of it is yours.

This separation is the practical answer to the framework question. Your platform team adopts Kong once. Every agent team — whether they're using LangChain, AutoGen, CrewAI, or a custom orchestrator — gets governance, observability, and compliance automatically, because it lives in the infrastructure, not in their code. And because both data planes surface into Konnect, the same control plane that sees every LLM call and MCP invocation also sees every event publish and subscription — giving you a unified view of your entire AI connectivity surface that neither an API gateway alone nor an event gateway alone can provide.

## The architecture is deployable today

This is not a reference architecture for a future version of your stack. The components described here — Apache Kafka on KRaft (Kafka's built-in metadata management mode, which removes the ZooKeeper dependency), Kong AI Gateway, Kong Event Gateway, AsyncAPI 3.1 schema governance, OTLP-based distributed tracing, ksqlDB for streaming evaluation, Flink for materialization — are all production-grade and available today.

The commit log pattern has been the backbone of reliable distributed systems for decades. Applying it to the reasoning traces of AI agents is the same idea, applied to a new kind of workload. The schemas are richer. The consumers are more varied. The governance requirements are newer. But the fundamental insight is unchanged: if you want a system you can trust, you need to be able to reconstruct what it did, in order, from a source of truth that nobody can edit after the fact.

That's what this architecture gives you. One governed data path. A complete causal record. And the infrastructure to turn that record into every downstream capability your AI strategy requires.

*Ready to build this in your environment? Explore [Kong AI Gateway](https://konghq.com/products/kong-ai-gateway) and [Kong Event Gateway](https://konghq.com/products/kong-event-gateway), or reach out to our solutions engineering team for a reference architecture review.*


# Building the Agentic Commit Log: A Technical Blueprint with Apache Kafka and Kong
