Governing Claude Code: How To Secure Agent Harness Rollouts with Kong AI Gateway
Alex Drag
Head of Product Marketing
AI coding with an agent harness is no longer experimental. It is likely the most impactful agentic AI use case in production today, and Claude Code is one of the tools leading the charge. But as engineering teams race to adopt Claude Code across their organizations, a critical question emerges: who's governing all that LLM traffic?
Without centralized LLM governance, every Claude Code session becomes an unmonitored pipeline between your developers, your proprietary codebase, and Anthropic's API. That's a problem with real security, cost, and compliance consequences.
This blog breaks down the situation we currently find ourselves in. We will discuss:
What Claude Code is and why AI-assisted coding has become the killer app of the agentic era
What can go wrong when Claude Code rollouts aren't properly governed
How an AI Gateway (specifically Kong AI Gateway) gives platform teams the control plane they need to manage Claude Code at scale
What is Claude Code?
Claude Code is Anthropic's agentic coding and agent harness tool. Unlike traditional code-completion assistants that suggest the next line in an editor, Claude Code operates as an autonomous agent that reads entire codebases, edits files across multiple directories, runs terminal commands, executes tests, interprets error messages, and iterates on solutions–and does it all from natural language instructions.
Claude Code can run in many places: the terminal, IDEs like VS Code and JetBrains, a desktop app, and even the browser. It works with models across the Claude family, including Opus 4.6, Sonnet 4.5, and Haiku 4.5. Enterprise users can also run Claude Code through Amazon Bedrock or Google Cloud Vertex AI.
Rather than just passively responding to prompts, it actively searches codebases for context, spawns subagents for parallel work, integrates with external tools via the Model Context Protocol (MCP), and maintains project-level memory through CLAUDE.md configuration files. It doesn't just generate code — it plans, executes, tests, and ships.
Done right, Claude Code is a major accelerant for any engineering org.
AI Coding and Agent Harness: The Killer Use Case for Enterprise AI
There is no shortage of hype around AI agents, but AI-assisted coding through an Agent Harness is where agent capabilities are delivering measurable, production-grade results today.
But the impact extends well beyond individual productivity. Startups are shipping faster with leaner teams. Enterprises are rethinking engineering productivity per headcount. Companies across the technology industry have begun adjusting workforce strategies in response to AI-assisted coding gains.
This adoption curve shows no signs of slowing. Stack Overflow's 2025 Developer Survey found that 85% of developers are either already using or planning to use AI coding tools. Gartner predicts that by 2027, 75% of hiring processes will include certification or testing for AI proficiency. AI coding agents are becoming standard development infrastructure, not optional enhancements — and there is simply no way around it.
This is promising. And — like we mentioned earlier — the potential is massive.
But so is the risk.
Enterprise-wide rollouts of Claude Code (or any other AI coding tool) don’t just result in productivity gains. They also result in an entirely new category of LLM traffic to manage — traffic that is high-volume, high-context, and deeply intertwined with proprietary source code and business logic. And this brings real business risk.
The Business Risks of Claude Code Without LLM Governance
Rapid Claude Code adoption without governance creates a set of risks that compound quickly across an engineering organization.
Uncontrolled cost exposure: Without centralized visibility and enforcement, it is remarkably easy for costs to spiral. Developers iterating on complex tasks may burn through substantial token volumes without anyone in finance or platform engineering having a clear picture of spending. There are no built-in organizational budget controls in Claude Code itself — that responsibility falls to whatever sits between the developer and the API.
Sensitive data leakage: Claude Code sends code context to Anthropic's servers for processing. That means proprietary source code, business logic, environment variables, API keys, configuration files, and potentially customer data are all transmitted over the network. Security researchers have already identified and patched vulnerabilities in Claude Code (including CVE-2025-54794 for path restriction bypass and CVE-2025-54795 for command injection). Without a governance layer inspecting and controlling what flows to the LLM, organizations have limited ability to enforce data loss prevention policies or prevent sensitive data leakage.
No audit trail: Regulated industries require clear records of AI usage — what was sent, what was returned, which models were used, by whom, and when. Claude Code out of the box does not produce the centralized, structured audit logs that compliance teams need. Every ungoverned session is a blind spot that makes it impossible to answer: "Who asked Claude to modify this production configuration?"
Shadow AI proliferation: When LLM governance is absent, developers find their own paths. Different teams may configure Claude Code differently, use different models, apply different security practices, or bypass organizational controls entirely. This "shadow AI" fragmentation makes it impossible to enforce consistent security policies, accurately forecast costs, or demonstrate compliance.
Model and provider lock-in: Without an abstraction layer, organizations hardcode their tooling and workflows directly to Anthropic's API. If pricing changes, models are deprecated, or the organization wants to evaluate alternative providers, migration becomes expensive and disruptive.
These aren't theoretical risks. They are the predictable consequences of rolling out a powerful, autonomous, API-consuming agent to every developer in an organization without a centralized AI Gateway for LLM governance.
Why the AI Gateway Pattern Solves Claude Code Governance
The AI gateway pattern addresses these risks by inserting a centralized control tower that governs the relationship between all AI-consuming applications (like Claude Code) and the models they call. It is the same architectural principle that API gateways brought to REST and gRPC traffic, now extended to handle the specific requirements of LLM workloads.
An AI gateway centralizes authentication and access control, so individual developers never need direct access to API keys. It enforces rate limits and token-based budgets to prevent runaway costs. It provides prompt and response logging for auditability. It enables content filtering, PII detection, and data loss prevention at the network layer. It offers observability through pre-built dashboards and AI-specific analytics. And it creates a provider abstraction layer, allowing organizations to route traffic across models and providers without changing client configurations.
Direct API vs. AI Gateway: A Governance Comparison
To understand the value of the AI Gateway pattern, compare it to a standard direct integration:
Credential management: Direct integration puts API keys on developer workstations; a gateway keeps keys centralized and injects them per request.
Cost control: Direct integration offers no organizational budget enforcement; a gateway applies rate limits and token-based quotas.
Auditability: Direct integration leaves logs scattered across machines, if they exist at all; a gateway captures a centralized, structured audit trail.
Data protection: Direct integration transmits whatever the client sends; a gateway can filter, redact, or block sensitive content in transit.
Provider flexibility: Direct integration hardcodes tooling to one provider's API; a gateway abstracts providers behind a single endpoint.
This pattern has emerged as essential infrastructure. Gartner's Hype Cycle for Generative AI 2025 identifies AI gateways as a critical infrastructure component — no longer optional, but required for scaling AI responsibly. And the EU AI Act and frameworks like the OWASP LLM Governance Checklist point to API gateways as central observability and enforcement points for AI traffic.
For Claude Code specifically, the AI gateway pattern is a natural fit because Claude Code already communicates with Anthropic's API over standard HTTP. Routing that traffic through a gateway requires no changes to the developer's workflow — just a configuration change to point Claude Code at the gateway endpoint instead of directly at Anthropic.
How Kong AI Gateway Governs Claude Code Traffic
Kong AI Gateway extends Kong's mature, battle-tested API management platform to AI workloads. It provides enterprise-grade governance, security, and observability for LLM traffic, including traffic from Claude Code sessions.
Here's how it works in practice.
Routing Claude Code traffic through Kong AI Gateway
We’ve published a step-by-step guide for routing Claude Code CLI traffic through Kong AI Gateway. The setup involves configuring Claude Code's API key helper to use a dummy key (since Kong handles authentication upstream), then pointing Claude Code at the local Kong gateway endpoint using the ANTHROPIC_BASE_URL environment variable:
ANTHROPIC_BASE_URL=http://localhost:8000/anything \
ANTHROPIC_MODEL=claude-sonnet-4-5-20250929 \
claude
On the Kong side, the AI Proxy plugin is configured to forward requests to Anthropic's API with the real API key, the correct model, and the llm_format: anthropic setting that ensures schema compatibility between Claude Code's native request format and the gateway. The configuration also supports raising the maximum request body size to 512 KB to handle the large prompts that Claude Code generates when working with substantial codebases.
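As a sketch, this is roughly what the Kong side looks like in declarative configuration. The service name, route path, and the vault reference for the key are illustrative, and some provider-specific options may be required in your Kong version; consult the AI Proxy plugin reference for the exact schema.

```yaml
_format_version: "3.0"
services:
  - name: anthropic-claude              # illustrative service name
    url: https://api.anthropic.com
    routes:
      - name: claude-code-route
        paths:
          - /anything                   # matches the ANTHROPIC_BASE_URL path above
plugins:
  - name: ai-proxy
    service: anthropic-claude
    config:
      route_type: llm/v1/chat
      llm_format: anthropic             # keep Claude Code's native request schema
      max_request_body_size: 524288     # 512 KB, for large codebase prompts
      auth:
        header_name: x-api-key
        header_value: "{vault://env/anthropic-api-key}"  # real key held centrally
      model:
        provider: anthropic
        name: claude-sonnet-4-5-20250929
```

Note that the developer-side dummy key never reaches Anthropic; the gateway replaces it with the centrally managed credential on every request.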
But what are some of the specific benefits of using Kong AI Gateway to govern your Claude Code rollouts?
Centralized Authentication and Key Management
With Kong AI Gateway in the path, individual developers never hold or manage Anthropic API keys. The gateway injects the x-api-key header on behalf of the developer, using credentials managed centrally by the platform team. This eliminates credential sprawl, simplifies key rotation, and prevents API key leakage through developer workstations or version control.
Cost Control and Rate Limiting
Kong's plugin ecosystem includes rate limiting, token-based quotas, and usage tracking. Platform teams can set per-developer, per-team, or per-project limits on token consumption. The AI Proxy plugin logs token usage statistics — prompt tokens, completion tokens, total tokens, and cost — for every request, giving finance and platform teams the data they need to allocate and manage AI spend. Semantic caching further reduces costs by serving cached responses to semantically similar prompts. At typical cache hit rates, that translates into meaningful token savings across an enterprise where many developers ask near-identical questions.
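As a simple illustration, a request-rate cap can be scoped to a team with Kong's standard rate-limiting plugin (the consumer name and limits below are made up; token-based quotas use Kong's AI-specific rate-limiting plugin, whose schema differs and is not shown here):

```yaml
plugins:
  - name: rate-limiting
    consumer: dev-team-payments   # hypothetical consumer representing one team
    config:
      minute: 60                  # at most 60 requests per minute
      hour: 1000                  # and 1,000 per hour
      policy: local
```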
Traffic Observability and LLM Audit Logging
Kong's File Log plugin (or any of its logging integrations) captures the full request and response metadata for every Claude Code session routed through the gateway. A typical log record includes the user agent (claude-cli), the model used, token counts, latency metrics (including time-to-first-token), and the provider name. This gives compliance teams a structured, centralized audit trail of all Claude Code usage across the organization — and answers the critical question of how to audit Claude Code usage enterprise-wide.
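A minimal way to start capturing these records is to attach the File Log plugin globally (the path here is illustrative; in production, teams typically ship the same JSON records to a SIEM or analytics platform via one of Kong's other logging plugins instead of a local file):

```yaml
plugins:
  - name: file-log
    config:
      path: /var/log/kong/claude-code-audit.log   # one JSON record per request
```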
Security and Content Filtering
Kong AI Gateway supports PII sanitization across 12 languages, semantic prompt guards for enforcing content policies, and guardrails that determine what behaviors are allowed or blocked. These capabilities can prevent sensitive data — like customer PII, internal API keys, or classified business logic — from being transmitted to Anthropic's API, even if a developer inadvertently includes it in a Claude Code session. For example, you can configure the gateway to detect and redact patterns matching credit card numbers or specific internal project codenames before the request ever leaves your infrastructure.
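As a sketch of the pattern-based approach, Kong's AI Prompt Guard plugin can block requests whose prompts match deny patterns before they leave your infrastructure. The regexes below are simplified illustrations, not production-grade detectors, and the codename is hypothetical:

```yaml
plugins:
  - name: ai-prompt-guard
    config:
      deny_patterns:
        - "\\b(?:\\d[ -]*?){13,16}\\b"   # naive credit-card-like number pattern
        - "PROJECT-PHOENIX"              # hypothetical internal project codename
```

Semantic guards and PII sanitization go further than regex matching, catching sensitive content that simple patterns miss.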
Provider Abstraction and Model Routing
Kong AI Gateway provides a universal LLM API that routes traffic across OpenAI, Anthropic, AWS Bedrock, Google Vertex, Azure AI, and other providers. This means organizations can start with Claude Code on Anthropic's API today and seamlessly route traffic to Bedrock-hosted or Vertex-hosted Claude models tomorrow–without changing developer workflows.
MCP Governance
As Claude Code's integration with the Model Context Protocol deepens–connecting it to external tools like Google Drive, Jira, Slack, and custom internal systems–Kong AI Gateway extends governance to MCP traffic as well. This includes observability into the tools, workflows, and prompts that comprise MCP interactions, as well as the ability to automatically generate secure MCP servers from Kong-managed APIs.
We are already seeing organizations pause MCP rollouts because they lack governance around MCP access to enterprise context and sensitive data. The caution is understandable given the risk, but those organizations will fall behind peers that never had to pause — because they invested in governance from the start.
Start Governing Claude Code with an AI Gateway Today
Claude Code is a transformative tool. It makes developers faster, codebases more accessible, and engineering organizations more productive. But like any powerful tool that consumes external APIs, processes sensitive data, and operates autonomously, it requires governance.
An AI gateway is the architectural pattern that makes governance possible without slowing developers down. Kong AI Gateway delivers that pattern with enterprise-grade maturity, a rich plugin ecosystem, and native support for routing and governing Claude Code traffic.
What is the difference between an AI Gateway and a standard API Gateway?
A standard API gateway manages REST or gRPC traffic, focusing on connectivity and basic security. An AI Gateway, like Kong, includes specific logic for Large Language Models (LLMs), such as token-based rate limiting (not just request-based), prompt engineering capabilities, PII redaction within prompts, and model routing. It understands the "language" of AI, whereas a standard gateway just sees raw data packets.
How do I route Claude Code through an API gateway without breaking the CLI?
You can route Claude Code by changing the ANTHROPIC_BASE_URL environment variable to point to your local or hosted Kong AI Gateway endpoint (e.g., http://localhost:8000/anything). You must also configure the gateway to accept the request and forward it to Anthropic with the correct authentication headers, as detailed in our setup guide.
Can Kong AI Gateway prevent developers from sending sensitive code to Anthropic?
Yes. Kong AI Gateway offers plugins for PII (Personally Identifiable Information) detection and semantic guardrails. You can configure regex patterns or semantic rules to identify sensitive data (like API keys, customer IDs, or specific proprietary code blocks) and block the request or redact the sensitive information before it is sent to the LLM provider.
Does using an AI Gateway add latency to Claude Code coding sessions?
The latency added by a high-performance gateway like Kong is negligible (milliseconds) compared to the inference time of the LLM itself (often seconds). Furthermore, features like semantic caching can actually reduce total latency by serving pre-computed answers for repeated queries, making the overall experience faster for developers.
How does an AI Gateway help with Claude Code audit compliance?
An AI Gateway acts as a centralized logging point. It captures exactly what prompt was sent, what code was generated, who sent it, and when. Unlike scattered local logs on developer machines, the gateway pushes these logs to your SIEM or analytics platform, creating an immutable audit trail required for frameworks like SOC2, HIPAA, or the EU AI Act.