Governing Claude Code: How To Secure Agent Harness Rollouts with Kong AI Gateway
Alex Drag
Head of Product Marketing
AI coding with an agent harness is no longer experimental. It is likely the most impactful agentic AI use case in production today, and Claude Code is one of the tools leading the charge. But as engineering teams race to adopt Claude Code across their organizations, a critical question emerges: who's governing all that LLM traffic?
Without centralized LLM governance, every Claude Code session becomes an unmonitored pipeline between your developers, your proprietary codebase, and Anthropic's API. That's a problem with real security, cost, and compliance consequences.
This blog breaks down the situation we currently find ourselves in. We will discuss:
What Claude Code is and why AI-assisted coding has become the killer app of the agentic era
What can go wrong when Claude Code rollouts aren't properly governed
How an AI Gateway (specifically Kong AI Gateway) gives platform teams the control plane they need to manage Claude Code at scale
What is Claude Code?
Claude Code is Anthropic's agentic coding and agent harness tool. Unlike traditional code-completion assistants that suggest the next line in an editor, Claude Code operates as an autonomous agent that reads entire codebases, edits files across multiple directories, runs terminal commands, executes tests, interprets error messages, and iterates on solutions–and does it all from natural language instructions.
Claude Code can run in many places: the terminal, IDEs like VS Code and JetBrains, a desktop app, and even the browser. It works with models across the Claude family, including Opus 4.6, Sonnet 4.5, and Haiku 4.5. Enterprise users can also run Claude Code through Amazon Bedrock or Google Cloud Vertex AI.
Rather than just passively responding to prompts, it actively searches codebases for context, spawns subagents for parallel work, integrates with external tools via the Model Context Protocol (MCP), and maintains project-level memory through CLAUDE.md configuration files. It doesn't just generate code — it plans, executes, tests, and ships.
Done right, Claude Code is a major accelerant for any engineering org.
AI Coding and Agent Harness: The Killer Use Case for Enterprise AI
There is no shortage of hype around AI agents, but AI-assisted coding through an Agent Harness is where agent capabilities are delivering measurable, production-grade results today.
But the impact extends well beyond individual productivity. Startups are shipping faster with leaner teams. Enterprises are rethinking engineering productivity per headcount. Companies across the technology industry have begun adjusting workforce strategies in response to AI-assisted coding gains.
This adoption curve shows no signs of slowing. Stack Overflow's 2025 Developer Survey found that 85% of developers are either already using or planning to use AI coding tools. Gartner predicts that by 2027, 75% of hiring processes will include certification or testing for AI proficiency. AI coding agents are becoming standard development infrastructure, not optional enhancements — and there is simply no way around it.
This is promising. And — like we mentioned earlier — the potential is massive.
But so is the risk.
Enterprise-wide rollouts of Claude Code (or any other AI coding tool) don’t just result in productivity gains. They also result in an entirely new category of LLM traffic to manage — traffic that is high-volume, high-context, and deeply intertwined with proprietary source code and business logic. And this brings real business risk.
The Business Risks of Claude Code Without LLM Governance
Rapid Claude Code adoption without governance creates a set of risks that compound quickly across an engineering organization.
Uncontrolled cost exposure: Without centralized visibility and enforcement, it is remarkably easy for costs to spiral. Developers iterating on complex tasks may burn through substantial token volumes without anyone in finance or platform engineering having a clear picture of spending. There are no built-in organizational budget controls in Claude Code itself — that responsibility falls to whatever sits between the developer and the API.
Sensitive data leakage: Claude Code sends code context to Anthropic's servers for processing. That means proprietary source code, business logic, environment variables, API keys, configuration files, and potentially customer data are all transmitted over the network. Security researchers have already identified and patched vulnerabilities in Claude Code (including CVE-2025-54794 for path restriction bypass and CVE-2025-54795 for command injection). Without a governance layer inspecting and controlling what flows to the LLM, organizations have limited ability to enforce data loss prevention policies or prevent sensitive data leakage.
No audit trail: Regulated industries require clear records of AI usage — what was sent, what was returned, which models were used, by whom, and when. Claude Code out of the box does not produce the centralized, structured audit logs that compliance teams need. Every ungoverned session is a blind spot that makes it impossible to answer: "Who asked Claude to modify this production configuration?"
Shadow AI proliferation: When LLM governance is absent, developers find their own paths. Different teams may configure Claude Code differently, use different models, apply different security practices, or bypass organizational controls entirely. This "shadow AI" fragmentation makes it impossible to enforce consistent security policies, accurately forecast costs, or demonstrate compliance.
Model and provider lock-in: Without an abstraction layer, organizations hardcode their tooling and workflows directly to Anthropic's API. If pricing changes, models are deprecated, or the organization wants to evaluate alternative providers, migration becomes expensive and disruptive.
These aren't theoretical risks. They are the predictable consequences of rolling out a powerful, autonomous, API-consuming agent to every developer in an organization without a centralized AI Gateway for LLM governance.
Why the AI Gateway Pattern Solves Claude Code Governance
The AI gateway pattern addresses these risks by inserting a centralized control tower that governs the relationship between all AI-consuming applications (like Claude Code) and the models they call. It is the same architectural principle that API gateways brought to REST and gRPC traffic, now extended to handle the specific requirements of LLM workloads.
An AI gateway centralizes authentication and access control, so individual developers never need direct access to API keys. It enforces rate limits and token-based budgets to prevent runaway costs. It provides prompt and response logging for auditability. It enables content filtering, PII detection, and data loss prevention at the network layer. It offers observability through pre-built dashboards and AI-specific analytics. And it creates a provider abstraction layer, allowing organizations to route traffic across models and providers without changing client configurations.
Direct API vs. AI Gateway: A Governance Comparison
To understand the value of the AI Gateway pattern, compare it to a standard direct integration:
Credential management: Direct integration puts API keys on developer workstations; a gateway keeps keys centralized and injects them per request.
Cost control: Direct integration offers no organizational budget enforcement; a gateway applies rate limits and token-based quotas.
Auditability: Direct integration leaves logs scattered across machines, if they exist at all; a gateway captures a centralized, structured audit trail.
Data protection: Direct integration transmits whatever the client sends; a gateway can filter, redact, or block sensitive content in transit.
Provider flexibility: Direct integration hardcodes tooling to one provider's API; a gateway abstracts providers behind a single endpoint.
This pattern has emerged as essential infrastructure. Gartner's Hype Cycle for Generative AI 2025 identifies AI gateways as a critical infrastructure component — no longer optional, but required for scaling AI responsibly. And the EU AI Act and frameworks like the OWASP LLM Governance Checklist point to API gateways as central observability and enforcement points for AI traffic.
For Claude Code specifically, the AI gateway pattern is a natural fit because Claude Code already communicates with Anthropic's API over standard HTTP. Routing that traffic through a gateway requires no changes to the developer's workflow — just a configuration change to point Claude Code at the gateway endpoint instead of directly at Anthropic.
How Kong AI Gateway Governs Claude Code Traffic
Kong AI Gateway extends Kong's mature, battle-tested API management platform to AI workloads. It provides enterprise-grade governance, security, and observability for LLM traffic, including traffic from Claude Code sessions.
Here's how it works in practice.
Routing Claude Code traffic through Kong AI Gateway
We’ve published a step-by-step guide for routing Claude Code CLI traffic through Kong AI Gateway. The setup involves configuring Claude Code's API key helper to use a dummy key (since Kong handles authentication upstream), then pointing Claude Code at the local Kong gateway endpoint using the ANTHROPIC_BASE_URL environment variable:
ANTHROPIC_BASE_URL=http://localhost:8000/anything \
ANTHROPIC_MODEL=claude-sonnet-4-5-20250929 \
claude
On the Kong side, the AI Proxy plugin is configured to forward requests to Anthropic's API with the real API key, the correct model, and the llm_format: anthropic setting that ensures schema compatibility between Claude Code's native request format and the gateway. The configuration also supports raising the maximum request body size to 512 KB to handle the large prompts that Claude Code generates when working with substantial codebases.
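As a sketch, this is roughly what the Kong side looks like in declarative configuration. The service name, route path, and the vault reference for the key are illustrative, and some provider-specific options may be required in your Kong version; consult the AI Proxy plugin reference for the exact schema.

```yaml
_format_version: "3.0"
services:
  - name: anthropic-claude              # illustrative service name
    url: https://api.anthropic.com
    routes:
      - name: claude-code-route
        paths:
          - /anything                   # matches the ANTHROPIC_BASE_URL path above
plugins:
  - name: ai-proxy
    service: anthropic-claude
    config:
      route_type: llm/v1/chat
      llm_format: anthropic             # keep Claude Code's native request schema
      max_request_body_size: 524288     # 512 KB, for large codebase prompts
      auth:
        header_name: x-api-key
        header_value: "{vault://env/anthropic-api-key}"  # real key held centrally
      model:
        provider: anthropic
        name: claude-sonnet-4-5-20250929
```

Note that the developer-side dummy key never reaches Anthropic; the gateway replaces it with the centrally managed credential on every request.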
But what are some of the specific benefits of using Kong AI Gateway to govern your Claude Code rollouts?
Centralized Authentication and Key Management
With Kong AI Gateway in the path, individual developers never hold or manage Anthropic API keys. The gateway injects the x-api-key header on behalf of the developer, using credentials managed centrally by the platform team. This eliminates credential sprawl, simplifies key rotation, and prevents API key leakage through developer workstations or version control.
Cost Control and Rate Limiting
Kong's plugin ecosystem includes rate limiting, token-based quotas, and usage tracking. Platform teams can set per-developer, per-team, or per-project limits on token consumption. The AI Proxy plugin logs token usage statistics — prompt tokens, completion tokens, total tokens, and cost — for every request, giving finance and platform teams the data they need to allocate and manage AI spend. Semantic caching further reduces costs by serving cached responses to semantically similar prompts. At typical cache hit rates, that translates into meaningful token savings across an enterprise where many developers ask near-identical questions.
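As a simple illustration, a request-rate cap can be scoped to a team with Kong's standard rate-limiting plugin (the consumer name and limits below are made up; token-based quotas use Kong's AI-specific rate-limiting plugin, whose schema differs and is not shown here):

```yaml
plugins:
  - name: rate-limiting
    consumer: dev-team-payments   # hypothetical consumer representing one team
    config:
      minute: 60                  # at most 60 requests per minute
      hour: 1000                  # and 1,000 per hour
      policy: local
```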
Traffic Observability and LLM Audit Logging
Kong's File Log plugin (or any of its logging integrations) captures the full request and response metadata for every Claude Code session routed through the gateway. A typical log record includes the user agent (claude-cli), the model used, token counts, latency metrics (including time-to-first-token), and the provider name. This gives compliance teams a structured, centralized audit trail of all Claude Code usage across the organization — and answers the critical question of how to audit Claude Code usage enterprise-wide.
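A minimal way to start capturing these records is to attach the File Log plugin globally (the path here is illustrative; in production, teams typically ship the same JSON records to a SIEM or analytics platform via one of Kong's other logging plugins instead of a local file):

```yaml
plugins:
  - name: file-log
    config:
      path: /var/log/kong/claude-code-audit.log   # one JSON record per request
```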
Security and Content Filtering
Kong AI Gateway supports PII sanitization across 12 languages, semantic prompt guards for enforcing content policies, and guardrails that determine what behaviors are allowed or blocked. These capabilities can prevent sensitive data — like customer PII, internal API keys, or classified business logic — from being transmitted to Anthropic's API, even if a developer inadvertently includes it in a Claude Code session. For example, you can configure the gateway to detect and redact patterns matching credit card numbers or specific internal project codenames before the request ever leaves your infrastructure.
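As a sketch of the pattern-based approach, Kong's AI Prompt Guard plugin can block requests whose prompts match deny patterns before they leave your infrastructure. The regexes below are simplified illustrations, not production-grade detectors, and the codename is hypothetical:

```yaml
plugins:
  - name: ai-prompt-guard
    config:
      deny_patterns:
        - "\\b(?:\\d[ -]*?){13,16}\\b"   # naive credit-card-like number pattern
        - "PROJECT-PHOENIX"              # hypothetical internal project codename
```

Semantic guards and PII sanitization go further than regex matching, catching sensitive content that simple patterns miss.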
Provider Abstraction and Model Routing
Kong AI Gateway provides a universal LLM API that routes traffic across OpenAI, Anthropic, AWS Bedrock, Google Vertex, Azure AI, and other providers. This means organizations can start with Claude Code on Anthropic's API today and seamlessly route traffic to Bedrock-hosted or Vertex-hosted Claude models tomorrow–without changing developer workflows.
MCP Governance
As Claude Code's integration with the Model Context Protocol deepens–connecting it to external tools like Google Drive, Jira, Slack, and custom internal systems–Kong AI Gateway extends governance to MCP traffic as well. This includes observability into the tools, workflows, and prompts that comprise MCP interactions, as well as the ability to automatically generate secure MCP servers from Kong-managed APIs.
We are already seeing organizations pause MCP rollouts because they lack governance around MCP access to enterprise context and sensitive data. The caution is understandable given the risk, but those organizations will fall behind peers that never had to pause — because they invested in governance from the start.
Start Governing Claude Code with an AI Gateway Today
Claude Code is a transformative tool. It makes developers faster, codebases more accessible, and engineering organizations more productive. But like any powerful tool that consumes external APIs, processes sensitive data, and operates autonomously, it requires governance.
An AI gateway is the architectural pattern that makes governance possible without slowing developers down. Kong AI Gateway delivers that pattern with enterprise-grade maturity, a rich plugin ecosystem, and native support for routing and governing Claude Code traffic.
What is the difference between an AI Gateway and a standard API Gateway?
A standard API gateway manages REST or gRPC traffic, focusing on connectivity and basic security. An AI Gateway, like Kong, includes specific logic for Large Language Models (LLMs), such as token-based rate limiting (not just request-based), prompt engineering capabilities, PII redaction within prompts, and model routing. It understands the "language" of AI, whereas a standard gateway just sees raw data packets.
How do I route Claude Code through an API gateway without breaking the CLI?
You can route Claude Code by changing the ANTHROPIC_BASE_URL environment variable to point to your local or hosted Kong AI Gateway endpoint (e.g., http://localhost:8000/anything). You must also configure the gateway to accept the request and forward it to Anthropic with the correct authentication headers, as detailed in our setup guide.
Can Kong AI Gateway prevent developers from sending sensitive code to Anthropic?
Yes. Kong AI Gateway offers plugins for PII (Personally Identifiable Information) detection and semantic guardrails. You can configure regex patterns or semantic rules to identify sensitive data (like API keys, customer IDs, or specific proprietary code blocks) and block the request or redact the sensitive information before it is sent to the LLM provider.
Does using an AI Gateway add latency to Claude Code coding sessions?
The latency added by a high-performance gateway like Kong is negligible (milliseconds) compared to the inference time of the LLM itself (often seconds). Furthermore, features like semantic caching can actually reduce total latency by serving pre-computed answers for repeated queries, making the overall experience faster for developers.
How does an AI Gateway help with Claude Code audit compliance?
An AI Gateway acts as a centralized logging point. It captures exactly what prompt was sent, what code was generated, who sent it, and when. Unlike scattered local logs on developer machines, the gateway pushes these logs to your SIEM or analytics platform, creating an immutable audit trail required for frameworks like SOC2, HIPAA, or the EU AI Act.