# Governing Claude Code: How To Secure Agent Harness Rollouts with Kong AI Gateway
Alex Drag
Head of Product Marketing
The AI coding and Agent Harness approach is no longer experimental. It is likely the most impactful agentic AI use case in production today, and Claude Code is one of the solutions leading the charge. But as engineering teams race to adopt Claude Code across their organizations, a critical question emerges: who's governing all that LLM traffic?
Without centralized LLM governance, every Claude Code session becomes an unmonitored pipeline between your developers, your proprietary codebase, and Anthropic's API. That's a problem with real security, cost, and compliance consequences.
This blog breaks down the situation we currently find ourselves in. We will discuss:
- What Claude Code is and why AI-assisted coding has become the killer app of the agentic era
- What can go wrong when Claude Code rollouts aren't properly governed
- How an AI Gateway (specifically Kong AI Gateway) gives platform teams the control plane they need to manage Claude Code at scale
## What is Claude Code?
Claude Code is Anthropic's agentic coding and agent harness tool. Unlike traditional code-completion assistants that suggest the next line in an editor, Claude Code operates as an autonomous agent that reads entire codebases, edits files across multiple directories, runs terminal commands, executes tests, interprets error messages, and iterates on solutions, and it does all of this from natural language instructions.
Claude Code can run in many places: the terminal, IDEs like VS Code and JetBrains, through a desktop app, and even in the browser. It works with models across the Claude family, including Opus 4.6, Sonnet 4.5, and Haiku 4.5. Enterprise users can also run Claude Code through Amazon Bedrock or Google Cloud Vertex AI instances.
Rather than just passively responding to prompts, it actively searches codebases for context, spawns subagents for parallel work, integrates with external tools via the Model Context Protocol (MCP), and maintains project-level memory through CLAUDE.md configuration files. It doesn't just generate code — it plans, executes, tests, and ships.
Done right, Claude Code is a *major* accelerant for any engineering org.
## AI Coding and Agent Harness: The Killer Use Case for Enterprise AI
There is no shortage of hype around AI agents, but AI-assisted coding through an Agent Harness is where agent capabilities are delivering measurable, production-grade results today.
The impact extends well beyond individual productivity. Startups are shipping faster with leaner teams. Enterprises are rethinking engineering productivity per headcount. Companies across the technology industry have begun adjusting workforce strategies in response to AI-assisted coding gains.
This is promising. And — like we mentioned earlier — the potential is massive.
But so is the risk.
Enterprise-wide rollouts of Claude Code (or any other AI coding tool) don’t just result in productivity gains. They also result in an entirely new category of LLM traffic to manage — traffic that is high-volume, high-context, and deeply intertwined with proprietary source code and business logic. And this brings real business risk.
## The Business Risks of Claude Code Without LLM Governance
Rapid Claude Code adoption *without governance* creates a set of risks that compound quickly across an engineering organization.
**Uncontrolled cost exposure:** Without centralized visibility *and* enforcement, it is remarkably easy for costs to spiral. Developers iterating on complex tasks may burn through substantial token volumes without anyone in finance or platform engineering having a clear picture of spending. There are no built-in organizational budget controls in Claude Code itself — that responsibility falls to whatever sits between the developer and the API.
**Sensitive data leakage:** Claude Code sends code context to Anthropic's servers for processing. That means proprietary source code, business logic, environment variables, API keys, configuration files, and potentially customer data are all transmitted over the network. Security researchers have already identified and patched vulnerabilities in Claude Code (including CVE-2025-54794 for path restriction bypass and CVE-2025-54795 for command injection). Without a governance layer inspecting and controlling what flows to the LLM, organizations have limited ability to enforce data loss prevention policies or prevent sensitive data leakage.
**No audit trail:** Regulated industries require clear records of AI usage — what was sent, what was returned, which models were used, by whom, and when. Claude Code out of the box does not produce the centralized, structured audit logs that compliance teams need. Every ungoverned session is a blind spot that makes it impossible to answer: "Who asked Claude to modify this production configuration?"
**Model and provider lock-in:** Without an abstraction layer, organizations hardcode their tooling and workflows directly to Anthropic's API. If pricing changes, models are deprecated, or the organization wants to evaluate alternative providers, migration becomes expensive and disruptive.
These aren't theoretical risks. They are the predictable consequences of rolling out a powerful, autonomous, API-consuming agent to every developer in an organization without a centralized AI Gateway for LLM governance.
## Why the AI Gateway Pattern Solves Claude Code Governance
The AI gateway pattern addresses these risks by inserting a centralized control plane that governs the relationship between all AI-consuming applications (like Claude Code) and the models they call. It is the same architectural principle that API gateways brought to REST and gRPC traffic, now extended to handle the specific requirements of LLM workloads.
An AI gateway centralizes authentication and access control, so individual developers never need direct access to API keys. It enforces rate limits and token-based budgets to prevent runaway costs. It provides prompt and response logging for auditability. It enables content filtering, PII detection, and data loss prevention at the network layer. It offers observability through pre-built dashboards and AI-specific analytics. And it creates a provider abstraction layer, allowing organizations to route traffic across models and providers without changing client configurations.
**Direct API vs. AI Gateway: A Governance Comparison**
To understand the value of the AI Gateway pattern, compare a direct integration with a gateway-governed one:

| Concern | Direct Anthropic API | Through an AI Gateway |
|---|---|---|
| API keys | Held on every developer workstation | Centrally managed; injected by the gateway |
| Cost control | No organizational budget controls | Per-developer, per-team, and per-project token limits |
| Audit trail | Scattered local logs, blind spots | Centralized, structured logging of every request |
| Data loss prevention | None in the request path | PII detection and redaction before traffic leaves your infrastructure |
| Provider flexibility | Hardcoded to Anthropic's API | Route across providers without client changes |
This pattern has emerged as essential infrastructure. Gartner's Hype Cycle for Generative AI 2025 identifies AI gateways as a critical infrastructure component: no longer optional, but required for scaling AI responsibly. And the [EU AI Act](https://konghq.com/blog/enterprise/eu-ai-act-compliance) and frameworks like the OWASP LLM Governance Checklist point to gateways as central observability and enforcement points for AI traffic.
For Claude Code specifically, the AI gateway pattern is a natural fit because Claude Code already communicates with Anthropic's API over standard HTTP. Routing that traffic through a gateway requires no changes to the developer's workflow — just a configuration change to point Claude Code at the gateway endpoint instead of directly at Anthropic.
## How Kong AI Gateway Governs Claude Code Traffic
Kong AI Gateway extends Kong's mature, battle-tested API management platform to AI workloads. It provides enterprise-grade governance, security, and observability for LLM traffic, including traffic from Claude Code sessions.
Here's how it works in practice.
**Routing Claude Code traffic through Kong AI Gateway**
```shell
ANTHROPIC_BASE_URL=http://localhost:8000/anything \
ANTHROPIC_MODEL=claude-sonnet-4-5-20250929 \
claude
```
On the Kong side, the AI Proxy plugin is configured to forward requests to Anthropic's API with the real API key, the correct model, and the `llm_format: anthropic` setting that ensures schema compatibility between Claude Code's native request format and the gateway. The configuration also supports raising the maximum request body size to 512 KB to handle the large prompts that Claude Code generates when working with substantial codebases.
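To make that configuration step concrete, here is a minimal sketch of an AI Proxy plugin payload as it might be sent to Kong's Admin API. The field names follow the settings the post describes (`llm_format: anthropic`, a larger request body limit, gateway-injected auth), but treat the exact keys and the Admin API path as assumptions to verify against your Kong version's documentation.

```python
import json

# Hypothetical AI Proxy plugin payload for Kong's Admin API.
# Verify field names against your Kong version before relying on them.
plugin_payload = {
    "name": "ai-proxy",
    "config": {
        "route_type": "llm/v1/chat",
        "llm_format": "anthropic",        # keep Claude Code's native request schema
        "max_request_body_size": 524288,  # 512 KB, for large codebase prompts
        "auth": {
            "header_name": "x-api-key",
            "header_value": "<ANTHROPIC_API_KEY>",  # injected by the gateway, never by devs
        },
        "model": {
            "provider": "anthropic",
            "name": "claude-sonnet-4-5-20250929",
        },
    },
}

# POSTing this payload to the Admin API (e.g. to
# http://localhost:8001/routes/<route>/plugins) would attach the plugin.
print(json.dumps(plugin_payload, indent=2))
```

The key design point: the real Anthropic credential lives only in this server-side configuration, so the developer-side environment variables shown above never contain a production API key.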
But what are some of the specific benefits of using Kong AI Gateway to govern your Claude Code rollouts?
**Centralized Authentication and Key Management**
With Kong AI Gateway in the path, individual developers never hold or manage Anthropic API keys. The gateway injects the `x-api-key` header on behalf of the developer, using credentials managed centrally by the platform team. This eliminates credential sprawl, simplifies key rotation, and prevents API key leakage through developer workstations or version control.
**Cost Control and Rate Limiting**
Kong's plugin ecosystem includes rate limiting, token-based quotas, and usage tracking. Platform teams can set per-developer, per-team, or per-project limits on token consumption. The AI Proxy plugin logs token usage statistics — prompt tokens, completion tokens, total tokens, and cost — for every request, giving finance and platform teams the data they need to allocate and manage AI spend. Semantic caching further reduces costs by serving cached responses to semantically similar prompts. With healthy cache hit rates, the token savings add up quickly across an enterprise — and runaway token spend is [already becoming a *massive* problem](https://konghq.com/blog/enterprise/ai-cost-management-stopping-margin-erosion).
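As a back-of-the-envelope illustration of the token-based budgets described above, here is a small Python sketch of the kind of per-team quota a gateway can enforce. The prices are hypothetical placeholders, not Anthropic's actual rates, and the class is a conceptual model rather than Kong's implementation.

```python
from collections import defaultdict

# Hypothetical per-million-token prices; substitute your real rates.
PRICE_PER_M_INPUT = 3.00
PRICE_PER_M_OUTPUT = 15.00

class TokenBudget:
    """Track per-team LLM spend against a monthly budget,
    mirroring the quota enforcement an AI gateway performs."""

    def __init__(self, monthly_budget_usd: float):
        self.budget = monthly_budget_usd
        self.spend = defaultdict(float)

    def record(self, team: str, prompt_tokens: int, completion_tokens: int) -> float:
        """Record one request's cost and return it."""
        cost = (prompt_tokens / 1e6) * PRICE_PER_M_INPUT \
             + (completion_tokens / 1e6) * PRICE_PER_M_OUTPUT
        self.spend[team] += cost
        return cost

    def allow(self, team: str) -> bool:
        """Would the gateway still admit requests from this team?"""
        return self.spend[team] < self.budget

budget = TokenBudget(monthly_budget_usd=500.0)
# 2M input * $3/M + 0.4M output * $15/M = $6 + $6 = $12
budget.record("platform", prompt_tokens=2_000_000, completion_tokens=400_000)
print(round(budget.spend["platform"], 2))  # → 12.0
print(budget.allow("platform"))            # → True
```

In a real deployment the gateway aggregates these counters from the token statistics the AI Proxy plugin logs per request, so finance sees spend without instrumenting any client.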
**Traffic Observability and LLM Audit Logging**
Kong's File Log plugin (or any of its logging integrations) captures the full request and response metadata for every Claude Code session routed through the gateway. A typical log record includes the user agent (`claude-cli`), the model used, token counts, latency metrics (including time-to-first-token), and the provider name. This gives compliance teams a structured, centralized audit trail of all Claude Code usage across the organization, and answers the question of how to audit Claude Code usage enterprise-wide.
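To make the audit trail concrete, here is a Python sketch that extracts those fields from a gateway log record and flattens them into an audit row. The record layout below is illustrative, not verbatim Kong File Log output; check the schema your plugin version actually emits.

```python
import json

# An illustrative (not verbatim) gateway log record containing the
# fields the post calls out: user agent, model, token counts, latency.
raw = json.dumps({
    "request": {"headers": {"user-agent": "claude-cli/2.0.1"}},
    "ai": {
        "proxy": {
            "usage": {"prompt_tokens": 1843, "completion_tokens": 512,
                      "total_tokens": 2355},
            "meta": {"provider_name": "anthropic",
                     "request_model": "claude-sonnet-4-5-20250929",
                     "llm_latency": 2417},
        }
    },
})

record = json.loads(raw)
usage = record["ai"]["proxy"]["usage"]
meta = record["ai"]["proxy"]["meta"]

# Flatten into the row a compliance dashboard or SIEM would ingest.
audit_row = {
    "agent": record["request"]["headers"]["user-agent"],
    "model": meta["request_model"],
    "provider": meta["provider_name"],
    "total_tokens": usage["total_tokens"],
    "latency_ms": meta["llm_latency"],
}
print(audit_row)
```

Shipping rows like this to a central store is what turns scattered per-laptop sessions into an organization-wide answer to "who asked Claude to change what, and when?"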
**Security and Content Filtering**
Kong AI Gateway [supports PII sanitization across 12 languages](https://developer.konghq.com/how-to/protect-sensitive-information-with-ai/), semantic prompt guards for enforcing content policies, and guardrails that determine what behaviors are allowed or blocked. These capabilities can prevent sensitive data — like customer PII, internal API keys, or classified business logic — from being transmitted to Anthropic's API, even if a developer inadvertently includes it in a Claude Code session. For example, you can configure the gateway to detect and redact patterns matching credit card numbers or specific internal project codenames before the request ever leaves your infrastructure.
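The redaction step in that example can be illustrated with a few regular expressions. This is a conceptual Python sketch of what happens in the request path, not Kong's implementation; the key prefix and project codename are made-up examples of the patterns you would define.

```python
import re

# Patterns to scrub before a prompt leaves your infrastructure.
# The codename and key prefix are hypothetical examples.
PATTERNS = [
    (re.compile(r"\b(?:\d[ -]?){13,16}\b"), "[REDACTED_CARD]"),     # card-like numbers
    (re.compile(r"sk-ant-[A-Za-z0-9_-]+"), "[REDACTED_API_KEY]"),   # Anthropic-style keys
    (re.compile(r"\bPROJECT[_-]ORION\b", re.I), "[REDACTED_CODENAME]"),
]

def redact(prompt: str) -> str:
    """Apply every redaction pattern to the outbound prompt."""
    for pattern, replacement in PATTERNS:
        prompt = pattern.sub(replacement, prompt)
    return prompt

prompt = "Charge card 4111 1111 1111 1111 using sk-ant-abc123 for Project_Orion."
print(redact(prompt))
# → Charge card [REDACTED_CARD] using [REDACTED_API_KEY] for [REDACTED_CODENAME].
```

Regex rules catch well-structured secrets; the semantic guards mentioned above complement them by catching sensitive content that has no fixed shape.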
**Provider Abstraction and Model Routing**
Kong AI Gateway provides a universal LLM API that routes traffic across OpenAI, Anthropic, AWS Bedrock, Google Vertex, Azure AI, and other providers. This means organizations can start with Claude Code on Anthropic's API today and seamlessly route traffic to Bedrock-hosted or Vertex-hosted Claude models tomorrow, without changing developer workflows.
## Start Governing Claude Code with an AI Gateway Today
Claude Code is a transformative tool. It makes developers faster, codebases more accessible, and engineering organizations more productive. But like any powerful tool that consumes external APIs, processes sensitive data, and operates autonomously, it requires governance.
An AI gateway is the architectural pattern that makes governance possible without slowing developers down. Kong AI Gateway delivers that pattern with enterprise-grade maturity, a rich plugin ecosystem, and native support for routing and governing Claude Code traffic.
## Frequently Asked Questions

**What is the difference between an AI Gateway and a standard API Gateway?**
A standard API gateway manages REST or gRPC traffic, focusing on connectivity and basic security. An AI Gateway, like Kong, includes specific logic for Large Language Models (LLMs), such as token-based rate limiting (not just request-based), prompt engineering capabilities, PII redaction within prompts, and model routing. It understands the "language" of AI, whereas a standard gateway just sees raw data packets.
**How do I route Claude Code through an API gateway without breaking the CLI?**
You can route Claude Code by changing the `ANTHROPIC_BASE_URL` environment variable to point to your local or hosted Kong AI Gateway endpoint (e.g., `http://localhost:8000/anything`). You must also configure the gateway to accept the request and forward it to Anthropic with the correct authentication headers, as detailed in our setup guide.
**Can Kong AI Gateway prevent developers from sending sensitive code to Anthropic?**
Yes. Kong AI Gateway offers plugins for PII (Personally Identifiable Information) detection and semantic guardrails. You can configure regex patterns or semantic rules to identify sensitive data (like API keys, customer IDs, or specific proprietary code blocks) and block the request or redact the sensitive information before it is sent to the LLM provider.
**Does using an AI Gateway add latency to Claude Code coding sessions?**
The latency added by a high-performance gateway like Kong is negligible (milliseconds) compared to the inference time of the LLM itself (often seconds). Furthermore, features like semantic caching can actually reduce total latency by serving pre-computed answers for repeated queries, making the overall experience faster for developers.
**How does an AI Gateway help with Claude Code audit compliance?**
An AI Gateway acts as a centralized logging point. It captures exactly what prompt was sent, what code was generated, who sent it, and when. Unlike scattered local logs on developer machines, the gateway pushes these logs to your SIEM or analytics platform, creating an immutable audit trail required for frameworks like SOC2, HIPAA, or the EU AI Act.