TELUS Builds a Multi-Cloud, AI-Enabled, Unified API Platform with Kong

Canada’s national telecom reengineers its API ecosystem with Kong, powering resilient networks and accelerating AI-driven innovation across wireless, fiber, and digital services.

100% enterprise-wide visibility across all onboarded APIs

>90% of platform dashboards and insights generated using AI agents built on top of Kong

TELUS Communications, one of Canada’s largest telecom companies, operates a network spanning three time zones and delivers wireless, fibre, and digital services to 18 million customers across retail, healthcare, agriculture, and connected home experiences.

Background

TELUS is one of Canada’s largest technology companies, operating across three time zones, multiple data centers, and multi-cloud regions to deliver services across its 5G network, fibre infrastructure, and digital businesses, including TELUS Health, TELUS Agriculture, and TELUS Digital. Within the CIO organization alone, TELUS supports 500+ applications, managed by 3,000 employees and 120+ development teams, powering millions of daily customer interactions across retail, healthcare, billing, logistics, and device provisioning.

TELUS made an early bet on API platforms more than six years ago and adopted Kong Gateway to unify traffic across its highly distributed environment.

But as the company modernized and expanded its digital footprint, a critical issue emerged: the API environment was scaling faster than the organization’s ability to govern, observe, and stress-test it. TELUS’s challenge was clear: modernize how it operates and governs APIs, create a unified platform experience, and prepare the organization for an AI-accelerated future without slowing down innovation. At API Summit 2025, Adam Smith, Director, CAPS Team, and Ahmed Khalifa, Principal Architect, shared how they overcame these challenges to create a unified API platform that sets them up for a successful AI future.

“Just because you have an API gateway doesn’t mean you’re going to have really good APIs. And being in the cloud doesn’t guarantee good software.”

Ahmed Khalifa

Principal Architect

Challenge

Like many large enterprises, TELUS’s API environment reflected the complexity of the organization behind it. Years of modernization and business expansion had produced a highly distributed technology landscape:

Workloads running across multiple cloud providers
Internal applications hosted across several on-prem data centers
A mix of modern microservices, legacy APIs, and high-traffic transactional systems
An engineering organization distributed across three VP groups

“You need the network team, the cloud team, the identity provider team, the developer tooling team. If any one of them isn’t aligned, APIs don’t work reliably,” said Ahmed Khalifa, Principal Architect at TELUS.

This fragmentation had a direct operational impact. Kong Gateway had visibility into every onboarded API, every consumer, every payload, every timeout, and every failure, but no single team was responsible for translating that data into organization-level insight. Teams monitored what they owned, but no one monitored the portfolio. As a result, several problems emerged:

Slow APIs went unnoticed until business impact was visible
API owners didn’t know their consumers or how changes affected them
Performance tests didn’t reflect real user behavior
Minor latency spikes (like a 50–60 ms jump across the organization) went undetected by individual teams
Legacy APIs accumulated technical debt without centralized visibility
High-traffic events amplified issues across teams that weren’t coordinated

Performance testing during Black Friday preparations revealed another structural weakness: synthetic UI automation didn’t behave like real customers.

During actual peak events, TELUS would see 1,000 real users each making five requests, but scripted load tests generated five test users each making 1,000 calls — an entirely different stress profile.
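The gap between the two profiles is easy to miss because both issue the same 5,000 requests. The Python sketch below compares them under a simplifying assumption that each user has at most one request in flight at a time. The `LoadProfile` class and its fields are illustrative names, not TELUS tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadProfile:
    users: int              # simulated users active at the same time
    requests_per_user: int  # requests each user sends, one after another

    @property
    def total_requests(self) -> int:
        return self.users * self.requests_per_user

    @property
    def distinct_sessions(self) -> int:
        # Every user needs its own login and session state.
        return self.users

    @property
    def peak_concurrency(self) -> int:
        # Assumption: at most one in-flight request per user.
        return self.users


real_peak = LoadProfile(users=1_000, requests_per_user=5)
scripted = LoadProfile(users=5, requests_per_user=1_000)

for name, profile in (("real peak", real_peak), ("scripted test", scripted)):
    print(
        f"{name}: total={profile.total_requests} "
        f"sessions={profile.distinct_sessions} "
        f"peak_concurrency={profile.peak_concurrency}"
    )
```

Under this model the request totals match, but the real peak creates 200 times as many sessions and concurrent callers as the scripted test, so login, session, and connection handling see far more pressure.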

“Machines cannot simulate proper Black Friday load very well. Teams thought their APIs were fast until we compared them to real traffic profiles, and the difference was massive.”

Ahmed Khalifa

Principal Architect

TELUS needed to modernize not just its APIs, but its operating model by consolidating fragmented ownership, creating shared visibility, and establishing performance and reliability practices that scaled across 120+ teams.

Solution

TELUS reorganized its platform strategy around a single objective: build a unified API operating model that improves reliability without slowing down innovation. After years of dealing with a fragmented approach where architecture, engineering, operations, cloud, and security each worked from their own playbook, the company consolidated these teams into an integrated platform organization. This alignment allowed TELUS to deliver consistent infrastructure patterns, streamlined developer workflows, and end-to-end API governance across hundreds of applications.

A critical component of this strategy was leveraging Kong Gateway as the insight engine for the organization. The API gateway provided TELUS with a rich view of runtime behavior: latency patterns, failure rates, consumer–owner relationships, timeout configurations, and even unused or “dead” APIs. Instead of using these insights only for operational firefighting, TELUS began treating the gateway as a source of organizational intelligence.

“We see too much — who owns the API, who consumes it, how often they’re calling it, what the payload looks like. We even see dead APIs,” Khalifa said. “Once we consolidated that data, we could finally show teams what was really happening across TELUS.”
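As a rough illustration of how consolidated gateway data can expose consumer relationships and dead APIs, the sketch below aggregates a simplified request log. The registry, the log shape, and all API, team, and consumer names are hypothetical. The source does not describe TELUS's actual data model or pipeline.

```python
from collections import defaultdict

# Hypothetical registry of onboarded APIs and their owning teams.
REGISTERED_APIS = {
    "billing-v2": "team-billing",
    "device-provisioning": "team-devices",
    "legacy-orders": "team-orders",
}

# Hypothetical, simplified request log entries: (api, consumer).
REQUEST_LOG = [
    ("billing-v2", "retail-web"),
    ("billing-v2", "mobile-app"),
    ("billing-v2", "retail-web"),
    ("device-provisioning", "retail-web"),
]


def summarize(registered, log):
    """Return (call counts per API and consumer, registered APIs with no traffic)."""
    calls = defaultdict(lambda: defaultdict(int))
    for api, consumer in log:
        calls[api][consumer] += 1
    dead = sorted(api for api in registered if api not in calls)
    usage = {api: dict(consumers) for api, consumers in calls.items()}
    return usage, dead


usage, dead_apis = summarize(REGISTERED_APIS, REQUEST_LOG)
for api, consumers in usage.items():
    print(f"{api} (owner: {REGISTERED_APIS[api]}) consumers: {consumers}")
print("dead APIs:", dead_apis)
```

An API that is registered but never appears in the log window is flagged as dead. In practice the window length would need to be long enough to cover infrequent batch consumers.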

With Kong, TELUS built a suite of internal dashboards and analytics that gave executives and engineering leaders an enterprise-level view of API performance, rather than isolated, team-by-team monitoring. Weighted latency scores, error distributions, and configuration insights were updated daily, giving TELUS the ability to spot issues early on, including subtle 50–60 ms latency spikes across the entire API fleet that no individual team could detect on its own.
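The source does not define how the weighted latency scores are computed. One plausible reading, shown here as a sketch only, is a traffic-weighted mean across APIs compared day over day. The sample figures, the `weighted_latency_ms` helper, and the 50 ms alert threshold are all assumptions for illustration.

```python
def weighted_latency_ms(samples):
    """Traffic-weighted mean latency; samples are (requests, mean_latency_ms)."""
    total_requests = sum(requests for requests, _ in samples)
    if total_requests == 0:
        raise ValueError("no traffic to score")
    weighted_sum = sum(requests * latency for requests, latency in samples)
    return weighted_sum / total_requests


# Hypothetical daily figures per API: (request count, mean latency in ms).
yesterday = [(10_000, 120.0), (5_000, 200.0), (1_000, 400.0)]
today = [(10_000, 175.0), (5_000, 255.0), (1_000, 455.0)]

baseline = weighted_latency_ms(yesterday)
current = weighted_latency_ms(today)
shift = current - baseline

SHIFT_ALERT_MS = 50.0  # assumed threshold, not a TELUS figure

print(f"baseline={baseline:.1f} ms current={current:.1f} ms shift={shift:.1f} ms")
if shift >= SHIFT_ALERT_MS:
    print("fleet-wide latency shift detected")
```

Each API in the sample slows by the same 55 ms, which a single team might dismiss as noise. The fleet-level score moves by the same amount and crosses the threshold, which is the kind of signal only a portfolio view provides.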

This data became the foundation for deeper collaboration. TELUS built tools that paired API owners with their consumers, enabling conversations that previously never happened. Suddenly, when an API slowed down at night or behaved unpredictably under load, both sides could see it in the same dashboard and work together to resolve it. What began as an observability improvement quickly evolved into a cultural shift.

Just as importantly, TELUS transformed its performance-testing approach. Historically, testing relied on frontend scripts that didn’t reflect real user behavior during Black Friday or major device launches. The team designed new, API-centric performance tests that pushed services directly with varied payloads and realistic concurrency models.
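A minimal sketch of what an API-centric test with many short-lived users and varied payloads might look like is shown below. It uses only the Python standard library, and `call_api` is a stand-in for a real HTTP call. The payload shape, user counts, and status codes are assumptions, not details from TELUS.

```python
import asyncio
import random


async def call_api(payload: dict) -> int:
    """Stand-in for a real HTTP request; returns a status code."""
    await asyncio.sleep(0.001)  # simulated network and service time
    # Assumed rule for the stub: oversized payloads are rejected.
    return 200 if payload["items"] <= 40 else 413


async def simulated_user(user_id: int, requests: int, rng: random.Random) -> list:
    """One user sending a handful of requests with varied payload sizes."""
    statuses = []
    for _ in range(requests):
        payload = {"user": user_id, "items": rng.randint(1, 50)}
        statuses.append(await call_api(payload))
    return statuses


async def run_load(users: int, requests_per_user: int, seed: int = 0) -> list:
    """Run all users concurrently and flatten their status codes."""
    rng = random.Random(seed)
    per_user = await asyncio.gather(
        *(simulated_user(u, requests_per_user, rng) for u in range(users))
    )
    return [status for statuses in per_user for status in statuses]


statuses = asyncio.run(run_load(users=200, requests_per_user=5))
print(len(statuses), "requests,", statuses.count(413), "rejected")
```

The design choice is to model many users with few requests each, rather than a few users looping many times, so the concurrency shape resembles the peak traffic described earlier.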

After proving value during late-night test windows, the team persuaded business leadership to allow live “game day” performance tests during actual business hours. These game days became a breakthrough. Each one exposed issues that would have triggered outages during high-traffic events — caching misconfigurations, under-tested login flows, unrealistic load profiles, and APIs that behaved differently under real customer stress.

“The difference between tests and reality was massive,” Khalifa said. “Running game days during business hours lets us finally see the real behavior and fix problems before they hurt us.”

TELUS also embraced AI to accelerate platform development and bring API intelligence directly to developers. The team adopted GitHub Copilot in early 2023 and expanded to intelligent agents like Aider, Cursor, Roo, and Cline in 2024. They built MCP servers on top of the gateway’s data, enabling developers to ask their IDE simple questions:

Who owns this API?

How fast is it?

Is it REST or SOAP?

Who else uses it?

They then receive a full API profile instantly.
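The source does not show how these MCP servers are implemented. As a hedged sketch, the lookup behind such a tool could be a plain function over a catalog derived from gateway data, which an MCP server would then expose to the IDE agent. The `ApiProfile` type, the `CATALOG` contents, and `describe_api` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ApiProfile:
    name: str
    owner: str
    style: str              # "REST" or "SOAP"
    p95_latency_ms: float
    consumers: list = field(default_factory=list)


# Hypothetical catalog; in practice this would be built from gateway data.
CATALOG = {
    "billing-v2": ApiProfile(
        "billing-v2", "team-billing", "REST", 180.0, ["retail-web", "mobile-app"]
    ),
    "legacy-orders": ApiProfile(
        "legacy-orders", "team-orders", "SOAP", 950.0, ["warehouse-batch"]
    ),
}


def describe_api(name: str) -> str:
    """Answer owner, speed, style, and consumer questions in one reply."""
    profile = CATALOG.get(name)
    if profile is None:
        return f"No onboarded API named {name!r}."
    consumers = ", ".join(profile.consumers) or "no known consumers"
    return (
        f"{profile.name}: owned by {profile.owner}, {profile.style}, "
        f"p95 latency {profile.p95_latency_ms:.0f} ms, used by {consumers}."
    )


print(describe_api("billing-v2"))
print(describe_api("unknown-api"))
```

Returning one combined profile rather than four separate answers keeps the agent's round trips low and gives the developer the full context in a single response.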

These initiatives didn’t just modernize API governance; they redefined how TELUS builds, tests, and operates software at scale.

“Roughly 90% of TELUS’s internal dashboards and platform tools are now generated or enhanced by AI agents, turning what used to take weeks into hours.”

Ahmed Khalifa

Principal Architect

Results

TELUS’s transformation delivered significant, measurable improvements across system reliability, developer productivity, and organizational behavior.

One of the most visible improvements was Black Friday readiness. After multiple game nights and game days, TELUS entered its next major sales event with unprecedented confidence. The platform team identified and resolved issues long before they reached customers, eliminating the slowdowns and cascading delays that had haunted previous years. Where past events triggered multitier incidents, the new process enabled predictable, stable performance even under extreme demand.

The enterprise-wide dashboards built on top of Kong elevated TELUS’s observability posture dramatically. Leaders now have access to weighted latency metrics, high-resolution error patterns, and configuration insights across hundreds of APIs. Teams act faster because the data is clear, shared, and anchored in real behavior. Issues that once sparked debates — “Is this really a problem?” — are now surfaced with objective evidence and fixed collaboratively.

The new visibility also strengthened accountability. By linking API ownership to consumption and runtime behavior, TELUS eliminated the ambiguity that used to slow down remediation. Teams understand not just how their APIs behave but who relies on them and what happens if they break.

TELUS also experienced a marked improvement in developer experience. With AI agents integrated into IDEs and dashboards auto-generated through MCP-enabled tooling, developers no longer dig through documentation or guess which API to use. Instead, they receive accurate, context-rich recommendations based on real production data. This reduces onboarding time, avoids reliance on outdated services, and prevents teams from unintentionally duplicating functionality that already exists.

High-quality performance testing is another area where TELUS achieved a milestone. Running live load tests during daytime traffic, which was once considered too risky, became a safe, repeatable practice. Each game day revealed previously hidden edge cases, allowing teams to tune upstream services, adjust timeouts, fix payload inefficiencies, and patch brittle integrations. The organization learned that performance isn’t a one-time fix; it’s a continuous feedback loop that improves as teams collaborate around shared data.

The adoption of AI also accelerated delivery cycles. TELUS reports that most dashboards and operational tools shown at API Summit were generated by AI agents. Previously manual work — like building reports, surfacing insights, and generating alerts — is now automated, freeing engineers to focus on higher-value problems. And because developers can query platform data directly through agents, they make smarter decisions and reduce misconfigurations.

Equally important is the cultural evolution. TELUS transformed its API platform team into a center of excellence, not a gatekeeper.

Through education sessions on timeouts, payload design, and proper API behavior, and through active collaboration with application teams, the platform group built trust. They now act as partners who help teams ship safer, more resilient software rather than enforcing rules from the sidelines.

“We’re not just maintaining a gateway. We’re part of how APIs are built across TELUS. The more we collaborated with teams, the more they trusted us and the more they saw we were helping them avoid problems before they happened.”

Ahmed Khalifa

Principal Architect

Together, these changes have delivered meaningful business value. Key outcomes include:

Significantly improved Black Friday reliability and predictability
Enterprise-wide visibility across 500+ applications and hundreds of APIs
Faster development cycles through AI-generated dashboards and IDE-integrated API intelligence
Stronger collaboration between API producers and consumers
Safer performance testing, enabling live daytime “game days” that reveal real bottlenecks
A more resilient, high-confidence platform ready for modern AI-driven use cases

TELUS’s modernization journey shows that reliability, visibility, and innovation can scale together. By standardizing APIs with Kong, embracing AI-enabled tooling, and shifting from reactive monitoring to proactive engineering, TELUS has built a platform capable of supporting both the pace of modern development and the pressure of Canada-wide peak traffic events.

