Blog
  • AI Gateway
  • AI Security
  • AIOps
  • API Security
  • API Gateway
|
    • API Management
    • API Development
    • API Design
    • Automation
    • Service Mesh
    • Insomnia
    • View All Blogs
  1. Home
  2. Blog
  3. Product Releases
  4. Introducing the Insomnia AI Runner: Accelerate and secure GenAI traffic to one or more LLMs
Product Releases
September 11, 2024
4 min read

Introducing the Insomnia AI Runner: Accelerate and secure GenAI traffic to one or more LLMs

Marco Palladino
CTO and Co-Founder of Kong

Today with the release of Insomnia 10, we are quite stoked to also announce a brand new offering in Insomnia, the AI Runner, a managed SaaS service that provides developers with the ability to accelerate and secure LLM traffic for their applications. This capability is the first of a new class of developer infrastructure products that will complement Insomnia’s existing developer tooling capabilities for APIs.

The AI Runner enables developers to accelerate LLM traffic by up to 20x with semantic caching while also securing LLM traffic with out-of-the-box AI guardrails. You can also use the AI Runner to consume multiple LLMs with a single OpenAI-compatible interface. By doing so, you can build faster user experiences powered by AI that are more secure and easier to build, and it only takes a few seconds to use.

All Insomnia users can get started with the AI Runner for free.

Security and acceleration in one line of code

With the Insomnia AI Runner, you can create as many “AI Runners” as you need to accelerate and secure your LLM traffic. The Insomnia AI Runner sits in the execution path of your LLM traffic for GenAI, and it accelerates all LLM traffic with semantic caching while also securing your traffic with guardrails that you can apply in one click.

You can create as many AI Runners as you need - each one with their own configuration.

You can create as many “AI Runners” as you need, and each one will provision a URL that you can use in your applications by simply changing one line of code to point to the new URL.

Migrating to the AI Runner is extremely easy, simply point your line of code to it.

By doing so, it becomes extremely easy to migrate existing applications written in vanilla GenAI integrations, or via frameworks like LangChain and others.

Accelerate AI with semantic caching

The AI Runner is able to understand the intent and meaning of the prompts you are sending through it. If it finds two similar prompts, it will return a copy of the cached content instead of making an upstream request to the LLM you are consuming, even when similar prompts are using different words.

With semantic caching, the Insomnia AI Runner can accelerate all GenAI traffic significantly. In the chart above, the lower the value, the lower the latency.


To understand the nuances between two different prompts, the AI Runner gives you the ability to set a similarity threshold to determine if cached content should be returned or not. A stronger similarity threshold will result in more cache hits and higher performance, but it can also result in prompts with wide variances being interpreted as having the same meaning. On the other hand, a lower threshold will understand more nuances between the prompts, but it will return a lower hit ratio.

You can easily configure the AI Runner’s similarity threshold.


Additionally, you can configure the caching time to live (TTL) for each AI Runner, as well as store credentials for your LLM within the AI Runner itself. This makes it so that you don’t need to update your applications when you want to modify your credentials, as it will be applied on the fly by the AI Runner.

Secure AI with out-of-the-box guardrails

It is crucial to ensure that AI traffic follows specific guidelines for improving security, reducing mishandling of sensitive customer information, and returning better responses.

As such, the AI Runner ships with AI guardrails out of the box. This makes it easier to protect your LLM traffic against security attacks while ensuring that personal and sensitive data is not returned by the LLMs.

Out-of-the-box AI guardrails are available and ready to use for your AI traffic.


By allowing you to select exactly which guardrails you want to apply for each AI Runner, Insomnia makes it easier to create secure AI experiences, with less coding.

In the future, we will allow you to easily create your own guardrails, too.

Built for developers, powered by Konnect

Under the hood, the new AI Runner is powered by a subset of features provided by Kong’s AI Gateway technology. It runs on the enterprise infrastructure provided by Kong Konnect, which is currently powering hundreds of enterprise organizations across the world, including those operating in highly regulated industries. 

The Insomnia AI Runner is powered by Kong AI Gateway, running on Kong Konnect.

It is entirely possible to self-host your own version of the AI Runner by deploying Kong’s AI Gateway directly (you can contact sales to learn more) and - by doing so - gain access to even more AI features that are currently unavailable in the Insomnia AI Runner.

Get started for free

You can get started for free with AI Runner today.

AIInsomniaKong GatewayKong Konnect

More on this topic

Videos

Cigna's API Gateway Journey with Kong Konnect

Videos

Kong Gateway 3.8

See Kong in action

Accelerate deployments, reduce vulnerabilities, and gain real-time visibility. 

Get a Demo
Topics
AIInsomniaKong GatewayKong Konnect
Share on Social
Marco Palladino
CTO and Co-Founder of Kong

Recommended posts

An Early Christmas Present for the AI C-Suite: Metering & Billing Comes to Kong Konnect

Kong Logo
Product ReleasesDecember 18, 2025

The AI boom has a dirty secret: for most enterprises, it's bleeding money. Every LLM call, every agent invocation, every API request that powers your AI products — they all cost something. And right now, most organizations have no idea what they're

Alex Drag

Liabilities into Assets: Konnect Metering & Billing, Powered by OpenMeter

Kong Logo
Product ReleasesOctober 14, 2025

Picture this: you’ve spent months building a slick API that developers love, and it’s already humming behind the scenes in production. But every time someone calls your endpoint, what happens? You get an invisible hit across multiple cost centers, i

Dan Temkin

Can You Trust What You’re Shipping? You Will with Insomnia v12

Kong Logo
Product ReleasesOctober 13, 2025

AI Assist: Clean commits, transparent teams Building trust starts with small things, like making sure every commit tells the right story. That’s where Insomnia’s v12 AI Commit capability comes in.  Developers want to write code. It’s what they’re go

Haley Giuliano

Building a First-Class Kubernetes Experience in Kong Konnect

Kong Logo
Product ReleasesSeptember 18, 2025

Simplify operations and scale with confidence To unlock Kubernetes’ full potential, many enterprises are relying on three key building blocks available in Kong Konnect today: Kubernetes Ingress Controllers: Ingress controllers are used for managing

Adam Jiroun

Multi-Cloud API and AI Infra Gets Smarter: Managed Redis for Kong DCGW

Kong Logo
Product ReleasesSeptember 16, 2025

Global, multi-cloud agentic infrastructure Modern enterprises are embracing multi-cloud strategies to avoid vendor lock-in, optimize costs, and ensure resilience. Yet managing API infrastructure (which also happens to be AI infrastructure) across mu

Alex Drag

Announcing terraform-provider-konnect v3

Kong Logo
Product ReleasesAugust 22, 2025

It’s been almost a year since we released our  Konnect Terraform provider . In that time we’ve seen over 300,000 installs, have 1.7 times as many resources available, and have expanded the provider to include data sources to enable federated managem

Michael Heap

Kong AI/MCP Gateway and Kong MCP Server Technical Breakdown

Kong Logo
EngineeringDecember 11, 2025

In the latest Kong Gateway 3.12 release , announced October 2025, specific MCP capabilities have been released: AI MCP Proxy plugin: it works as a protocol bridge, translating between MCP and HTTP so that MCP-compatible clients can either call exi

Jason Matis

Ready to see Kong in action?

Get a personalized walkthrough of Kong's platform tailored to your architecture, use cases, and scale requirements.

Get a Demo
Powering the API world

Increase developer productivity, security, and performance at scale with the unified platform for API management, AI gateways, service mesh, and ingress controller.

Sign up for Kong newsletter

    • Platform
    • Kong Konnect
    • Kong Gateway
    • Kong AI Gateway
    • Kong Insomnia
    • Developer Portal
    • Gateway Manager
    • Cloud Gateway
    • Get a Demo
    • Explore More
    • Open Banking API Solutions
    • API Governance Solutions
    • Istio API Gateway Integration
    • Kubernetes API Management
    • API Gateway: Build vs Buy
    • Kong vs Postman
    • Kong vs MuleSoft
    • Kong vs Apigee
    • Documentation
    • Kong Konnect Docs
    • Kong Gateway Docs
    • Kong Mesh Docs
    • Kong AI Gateway
    • Kong Insomnia Docs
    • Kong Plugin Hub
    • Open Source
    • Kong Gateway
    • Kuma
    • Insomnia
    • Kong Community
    • Company
    • About Kong
    • Customers
    • Careers
    • Press
    • Events
    • Contact
    • Pricing
  • Terms
  • Privacy
  • Trust and Compliance
  • © Kong Inc. 2025