Introducing LLM Analytics in Kong Konnect for GenAI Traffic
We’re pleased to announce the new LLM Usage reporting feature in Advanced Analytics, designed to help organizations manage their large language model (LLM) usage. This feature offers insights into token consumption, costs, and latency, allowing businesses to optimize their AI investments. By enabling comparisons across LLM providers and models, it helps organizations make better-informed decisions and manage budgets more effectively.
The challenges of increasing AI adoption
As businesses increasingly integrate AI into their operations, they encounter a variety of challenges. While AI offers substantial benefits, such as improved efficiency and enhanced customer experiences, the rapid adoption also brings complexities that organizations must navigate.
- Visibility: Many organizations struggle to gain a clear understanding of how AI, particularly LLMs, is being utilized. This lack of visibility can lead to inefficiencies and missed opportunities for optimization.
- Cost management: As AI usage grows across the organization, managing costs becomes critical. Organizations must monitor token usage and associated expenses to ensure they're maximizing their investment in AI while minimizing unnecessary spending.
- Compliance and governance: Legal and compliance teams must ensure that LLM usage aligns with organizational policies and industry regulations, which requires visibility into usage patterns and costs.
Analyze AI traffic with Konnect Advanced Analytics
With the release of Kong API Gateway 3.8, organizations can now leverage Konnect’s Advanced Analytics to get insights into AI traffic. By providing a comprehensive view of their AI usage, this new feature empowers organizations to make data-driven decisions, optimize costs, and improve the overall performance of their AI-powered solutions.
Here are a few example use cases where you can use Advanced Analytics for AI traffic:
- Using Advanced Analytics, organizations can compare cost and response times of various AI providers, enabling data-driven decisions on which provider to use.
- By leveraging detailed records of AI requests, organizations can create comprehensive audit trails.
- Using Advanced Analytics, organizations can track AI usage and costs per department, enabling fair allocation of AI resources and budgeting based on actual usage and value generated.
- By comparing performance metrics (latency, token usage) across different versions of AI models, organizations can make informed decisions about rolling out updates or rolling back to previous versions if issues are detected.
Users can now toggle between API Usage and LLM Usage views in Explorer, allowing for tailored metrics, groupings, and filter options specific to AI traffic analysis. The LLM Usage feature introduces critical new metrics, including LLM Latency, Costs, Prompt Tokens, and Completion Tokens, while advanced filtering and grouping options now encompass Provider, Request Model, and Response Model, enabling granular analysis.
Additionally, API Requests now displays detailed LLM Insights for AI-related traffic that has been proxied through the Kong AI Gateway, providing a comprehensive audit trail with rich metadata.
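For traffic to show up with LLM Insights, it needs to be proxied through the Kong AI Gateway. As a rough sketch, that typically means attaching the AI Proxy plugin to a route in declarative configuration, something like the fragment below. Treat this as illustrative: the exact fields depend on your Gateway version and provider, and the model name and API key placeholder are assumptions, so consult the AI Proxy plugin documentation for the authoritative schema.

```yaml
# Illustrative AI Proxy plugin configuration (fields may vary by
# Gateway version and provider; this is a sketch, not a reference).
plugins:
  - name: ai-proxy
    config:
      route_type: llm/v1/chat
      auth:
        header_name: Authorization
        header_value: Bearer <YOUR_PROVIDER_API_KEY>   # placeholder
      model:
        provider: openai
        name: gpt-4o   # assumed model name for the example
```

Once requests flow through a route configured this way, the token counts, latency, and model metadata that power the LLM Usage views are captured automatically.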
How to get started
Head over to Kong Konnect and navigate to "Analytics" on your left nav bar. Go to "Explorer" and you'll see a new “From” selector at the top. Switch from "API Usage" to "LLM Usage."
This new switch helps you focus on one dataset at a time — either API or LLM usage. Since these datasets have different characteristics and metrics, separating them reduces information overload and makes analysis easier.
Once you select "LLM Usage," you'll see new LLM-specific metrics and dimensions, such as token usage and model types. This view is tailored to help you understand and optimize your LLM operations within the familiar Explorer interface.
Now let's explore token usage across different providers. This analysis gives you valuable insights into potential costs and helps identify which providers are more cost-effective. To do this:
- Pick a suitable visualization, such as a bar chart.
- Select “Total Token Count” as your metric.
- Choose “Provider” as your group by dimension.
This view will clearly display total token usage per provider, allowing for quick comparisons. Keep in mind that higher token usage typically means higher costs, making this information crucial for optimizing your LLM expenses and provider selection.
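To make the aggregation concrete, here is a minimal Python sketch of the same computation Explorer performs: summing prompt and completion tokens per provider. The record shape and field names (`provider`, `prompt_tokens`, `completion_tokens`) are illustrative assumptions, not the Konnect Analytics API schema.

```python
from collections import defaultdict

# Illustrative per-request records; field names are hypothetical,
# not the actual Konnect Analytics schema.
requests = [
    {"provider": "openai", "prompt_tokens": 120, "completion_tokens": 350},
    {"provider": "openai", "prompt_tokens": 80, "completion_tokens": 410},
    {"provider": "anthropic", "prompt_tokens": 150, "completion_tokens": 300},
]

def total_tokens_by_provider(records):
    """Sum prompt + completion tokens per provider, i.e. the
    'Total Token Count' metric grouped by 'Provider'."""
    totals = defaultdict(int)
    for r in records:
        totals[r["provider"]] += r["prompt_tokens"] + r["completion_tokens"]
    return dict(totals)

print(total_tokens_by_provider(requests))
# → {'openai': 960, 'anthropic': 450}
```

Multiplying each provider's total by its per-token price then gives a first-order cost comparison.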
To add another layer of analysis, include the Control Plane dimension. This will break down LLM usage by environment, team, or line of business. By doing so, you'll gain a comprehensive view of how different parts of your organization are utilizing LLMs. This analysis empowers you to make data-driven decisions about resource allocation and optimization across your entire ecosystem, potentially uncovering opportunities for cost savings or areas needing additional resources.
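The two-dimension breakdown works the same way: group by (control plane, provider) instead of provider alone. A short sketch under the same assumptions as before (hypothetical field names, not the Konnect schema):

```python
from collections import defaultdict

# Hypothetical records tagged with the control plane they came from.
requests = [
    {"control_plane": "prod", "provider": "openai",
     "prompt_tokens": 200, "completion_tokens": 600},
    {"control_plane": "prod", "provider": "anthropic",
     "prompt_tokens": 100, "completion_tokens": 250},
    {"control_plane": "dev", "provider": "openai",
     "prompt_tokens": 50, "completion_tokens": 90},
]

def tokens_by_plane_and_provider(records):
    """Sum total tokens per (control plane, provider) pair, mirroring
    a group-by on both the Control Plane and Provider dimensions."""
    totals = defaultdict(int)
    for r in records:
        key = (r["control_plane"], r["provider"])
        totals[key] += r["prompt_tokens"] + r["completion_tokens"]
    return dict(totals)

print(tokens_by_plane_and_provider(requests))
# → {('prod', 'openai'): 800, ('prod', 'anthropic'): 350, ('dev', 'openai'): 140}
```

If control planes map to teams or environments, this is effectively a chargeback report: each key's total is that team's share of token spend.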
Start using Advanced Analytics in Kong Konnect Plus today and unlock the full potential of your AI traffic data, empowering your organization to make informed decisions, optimize costs, and enhance performance across all AI initiatives.