Engineering
November 24, 2025
4 min read

AI Voice Agents with Kong AI Gateway and Cerebras

Claudio Acquaviva
Principal Architect, Kong

The integration of next-generation AI workloads — from large language models (LLMs) to speech-to-text and text-to-speech — demands a powerful, secure, and scalable infrastructure. This is especially true for building advanced AI voice agents, which rely on seamless, natural, and highly efficient user interaction. 

This blog post explores how the combined power of Kong AI Gateway and Cerebras AI infrastructure creates an unparalleled foundation for deploying and managing these AI agents at scale. We'll detail a reference architecture that leverages Kong AI Gateway for secure traffic control, policy enforcement, and observability, while utilizing Cerebras' high-performance compute and LLM optimization to orchestrate STT, LLM (like the Qwen-3-32B model), and TTS models in a cohesive, production-ready solution.

Kong Konnect and Cerebras

Kong Gateway is an API gateway and a core component of the Kong Konnect platform. Built on a plugin-based extensibility model, it centralizes essential functions such as proxying, routing, load balancing, and health checking, efficiently managing both microservices and traditional API traffic.

One of Kong Gateway’s greatest strengths lies in its extensible plugin ecosystem, which allows seamless integration of diverse policies and functionalities — including Authentication and Authorization, Rate Limiting, Proxy Caching, Request and Response Transformation, and Traffic Control.
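For illustration, here's a minimal, hypothetical declarative configuration (decK format) that enables one such policy, rate limiting, on a Gateway Service; the Service name, URL, and limits are placeholders:

```yaml
_format_version: "3.0"
services:
  - name: example-service
    url: http://upstream.example.com     # placeholder upstream
    routes:
      - name: example-route
        paths:
          - /example
    plugins:
      - name: rate-limiting              # throttle calls to the upstream service
        config:
          minute: 60                     # allow at most 60 requests per minute
          policy: local
```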

Kong AI Gateway extends Konnect's capabilities to the world of generative AI, including LLMs, and provides a unified way to connect applications with other GenAI infrastructure for video, images, sound, and more.

Besides abstracting the complexity of interacting with diverse GenAI infrastructure through a single and standardized interface, it provides features such as Prompt Engineering, Semantic Processing, RAG, and MCP support. This makes it a key component for organizations building AI agents, enabling developers to experiment with models, optimize costs, and ensure compliance across all AI traffic.

Cerebras provides a cutting-edge computing platform for AI workloads. At the heart of its innovation is the Wafer-Scale Engine (WSE), a powerful processor designed to accelerate deep learning training and inference by orders of magnitude compared to traditional GPU or CPU clusters.

Beyond the hardware, Cerebras offers a complete AI supercomputing solution powered by the Cerebras Software Platform (CSoft), which integrates seamlessly with existing AI frameworks such as PyTorch and TensorFlow. Cerebras also offers Cerebras Cloud, giving users access to its powerful AI compute infrastructure as a service, including multiple GenAI models.

AI voice agents

An AI voice agent offers a seamless, natural, and highly efficient way for users to interact with digital systems through conversation. Typically, AI voice agents rely on speech-to-text (STT) models to convert spoken language into text, while text-to-speech (TTS) models transform the agent’s textual responses back into speech.

Integrating Cerebras LLMs with STT and TTS models behind Kong AI Gateway enables the orchestration of AI voice agents. Developers can route audio streams to STT models for transcription, pass the resulting text through a Cerebras language model for understanding or generation, and send the output to TTS for natural speech synthesis, all governed and monitored by Kong AI Gateway.

The following diagram depicts a reference architecture of the AI voice agent:

As you can see in the diagram, Kong AI Gateway abstracts all GenAI models, including the Cerebras LLM as well as the STT and TTS models. Combined with an extensive list of AI capabilities, such as Prompt Decorator and Semantic Caching, Kong AI Gateway provides an easy-to-use and easy-to-monitor infrastructure that is ideal for AI agent development.

Kong AI Gateway and Cerebras at work

It's easier to see an AI voice agent consuming Kong AI Gateway, the Cerebras LLM, and the STT/TTS models than to read about it. The following video demonstrates a simple AI voice agent following the reference architecture:

The AI agent was written with LiveKit, which provides the infrastructure for capturing, transmitting, and managing bi-directional audio streams between users and the AI agent.

From the AI agent's perspective, all GenAI models are abstracted by Kong AI Gateway. Each model is exposed to the agent through a specific Kong AI Gateway Route.

Here's a snippet of the AI voice agent defining the agent session and referencing the Kong AI Gateway Routes:
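The sketch below shows what that session setup can look like, assuming the LiveKit Agents Python SDK and its OpenAI-compatible plugins; the route paths (/stt, /llm, /tts), the placeholder API key, and the exact parameter names are illustrative assumptions rather than the verbatim demo code:

```python
import os

from livekit.agents import AgentSession
from livekit.plugins import openai

# Address of the Kong AI Gateway Data Plane, e.g. http://localhost:8000
DATA_PLANE_URL = os.environ["DATA_PLANE_URL"]

# Each model is consumed through its own Kong AI Gateway Route.
# The /stt, /llm, and /tts paths are illustrative.
session = AgentSession(
    stt=openai.STT(base_url=f"{DATA_PLANE_URL}/stt",
                   api_key="unused"),   # Speaches.AI STT behind Kong; no real key needed here
    llm=openai.LLM(base_url=f"{DATA_PLANE_URL}/llm",
                   model="qwen-3-32b",
                   api_key="unused"),   # Cerebras LLM behind Kong; the Gateway injects the real key
    tts=openai.TTS(base_url=f"{DATA_PLANE_URL}/tts",
                   api_key="unused"),   # Speaches.AI TTS behind Kong
)
```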

The DATA_PLANE_URL variable points to the Kong AI Gateway Data Plane. The AI agent sends requests to the Gateway using the Routes defined for each model.

On the Gateway side, here's the Kong AI Gateway declaration defining the Gateway Services exposed by the Kong Routes:
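A simplified sketch of what that declarative configuration (decK format) can look like is shown below; the Service names, Route paths, and upstream URLs are illustrative, and the ai-proxy settings assume Cerebras' OpenAI-compatible chat endpoint:

```yaml
_format_version: "3.0"
services:
  # LLM traffic is proxied to Cerebras Cloud through the ai-proxy plugin
  - name: cerebras-llm
    url: https://api.cerebras.ai
    routes:
      - name: llm-route
        paths:
          - /llm
    plugins:
      - name: ai-proxy
        config:
          route_type: llm/v1/chat
          auth:
            header_name: Authorization
            header_value: Bearer <CEREBRAS_API_KEY>    # key stored on the Gateway, never in the agent
          model:
            provider: openai                           # Cerebras exposes an OpenAI-compatible API
            name: qwen-3-32b
            options:
              upstream_url: https://api.cerebras.ai/v1/chat/completions

  # STT and TTS traffic is proxied to the Speaches.AI engine
  - name: speaches-stt
    url: http://speaches.local:8000/v1                 # illustrative Speaches.AI address
    routes:
      - name: stt-route
        paths:
          - /stt
  - name: speaches-tts
    url: http://speaches.local:8000/v1
    routes:
      - name: tts-route
        paths:
          - /tts
```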

Again, for each GenAI model there's a configuration describing how the AI Gateway integrates with it. Note that the Gateway takes care of the Cerebras API key, providing a more secure environment for agent development.

The STT and TTS models referenced by the AI agent are deployed in the Speaches.AI engine and exposed through specific URLs. The LLM, as expected, is fully managed by Cerebras Cloud.

Observability

Both Cerebras and Konnect provide observability capabilities. For example, here's a Cerebras screenshot showing the consumption of the Qwen-3-32B model by the AI voice agent.

Similarly, Konnect provides ready-to-use dashboards and explorer capabilities to monitor how the models are being consumed:

Conclusion

The integration of Kong AI Gateway with Cerebras AI infrastructure creates a powerful foundation for deploying and managing next-generation AI workloads at scale. Kong AI Gateway provides a secure, high-performance entry point for managing APIs and AI model endpoints, ensuring efficient traffic control, observability, and policy enforcement across diverse environments. Combined with Cerebras’ large-scale compute capabilities and LLM optimization, this architecture enables seamless orchestration of advanced AI services such as speech-to-text (STT), text-to-speech (TTS), and language understanding models.

Contact sales@konghq.com and sales@cerebras.com if you have questions or need support.

Our next blog post will describe in detail how to configure both Kong AI Gateway and Cerebras. Register for both Kong Konnect and Cerebras to get a trial and start experimenting with both technologies.
