American Airlines Dev Experience Takes Off With Service Mesh

January 26, 2022

Karl Haworth

Kubernetes is hard. Last year, we started the developer experience product at American Airlines. As we moved into the latter half of 2020 and on into 2021, we wanted to tackle Kubernetes app deployments. We aimed to make it easy for users to do the right things, no matter how difficult those tasks were. Along our Kubernetes journey, we created reproducible patterns that application teams can use to make things even easier.

In this post, I'll explain how we at American Airlines piloted our developer experience and how Kuma service mesh is helping us to push our new shared hosting services. Fasten your seatbelts and prepare for takeoff.

Developer Experience Platform: Runway

We built our developer experience platform, Runway, on Spotify’s Backstage platform. Backstage brings a solid base for an enterprise framework that we could build upon. The developer experience team at American Airlines loves open source technology, and we look to the open source marketplace to help increase our engineering standards. American Airlines wants to be part of thriving, open source communities.

Why We Chose Kuma for Runway

We spent quite a bit of time evaluating service mesh technologies and discussing them with technical leaders. Kuma stood out from the pack with its batteries-included approach, specifically:

Traffic policies for teams whose workloads need to be isolated
Multiple meshes, with explicit traffic rules for sensitive data
The Kuma gateway, which will allow us to ensure API security on our applications
Cross-region and cross-provider support
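
To make the isolation use case concrete, here is a minimal sketch in Python that builds a Kuma TrafficPermission manifest as a plain dictionary. The mesh name, policy name and service names are hypothetical and are not taken from American Airlines' clusters. The resource shape follows Kuma's source/destination TrafficPermission policy in Kubernetes mode.

```python
import json


def traffic_permission(name, mesh, source_service, destination_service):
    """Build a Kuma TrafficPermission manifest (Kubernetes mode) as a dict.

    Only traffic from source_service to destination_service is allowed
    by this policy. All values used in the example below are hypothetical.
    """
    return {
        "apiVersion": "kuma.io/v1alpha1",
        "kind": "TrafficPermission",
        "mesh": mesh,
        "metadata": {"name": name},
        "spec": {
            "sources": [{"match": {"kuma.io/service": source_service}}],
            "destinations": [{"match": {"kuma.io/service": destination_service}}],
        },
    }


# Hypothetical example: one service in a dedicated mesh may call another.
policy = traffic_permission(
    name="checkout-to-ledger",
    mesh="payments",
    source_service="checkout_shop_svc_8080",
    destination_service="ledger_finance_svc_8080",
)
print(json.dumps(policy, indent=2))
```

Keeping sensitive workloads in their own mesh and listing allowed sources explicitly means anything not named in a policy stays isolated once the mesh's allow-all permission is removed.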

The developer experience team at American Airlines and Kuma seem to share similar goals. We both want to make things easy with minimal upfront configuration.

Runway Architecture Overview

Below is a simplified overview of our infrastructure. We start with our security delivery content providers and global traffic managers. We have a Kubernetes provider with multiple clusters in multiple regions. And we have Kuma installed on all of our clusters with a Kuma global control plane.

We deploy Kuma using GitHub Actions, which let us deploy Kuma to both our global clusters and our remote zones. We also use GitHub issues and pull request comments to document our clusters' life cycles.

For additional automation, we simplified our global traffic management strategy. In the past, internet-facing applications had many hurdles to clear. We’ve simplified the process while keeping proper security in place.

Whenever we find a new ingress on our clusters using Kubernetes DevOps, the system notifies our global traffic management (GTM) provider. As apps change clusters, our global traffic manager solution tracks them from cluster to cluster. As apps are removed, the GTM property is removed or its IPs are adjusted. Using Kuma, we can put services anywhere within the mesh and be sure they just work with very minimal configuration. The application teams barely know about the service mesh, which works with the automation we created.
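
The create, update and delete behavior described above can be sketched as a small reconciliation function. This is an illustration only, not the actual GTM sync code. The function name, the data shapes, the hostnames and the IP addresses are all assumptions made for the example.

```python
def plan_gtm_changes(ingresses, gtm_properties):
    """Compare observed ingresses with GTM state and return the changes needed.

    ingresses:      {hostname: set of cluster IPs currently serving it}
    gtm_properties: {hostname: set of IPs registered with the GTM provider}
    Both shapes are hypothetical and chosen to keep the sketch small.
    """
    # New ingress that the GTM provider does not know about yet.
    creates = {h: ips for h, ips in ingresses.items() if h not in gtm_properties}
    # App moved or scaled across clusters, so the registered IPs are stale.
    updates = {
        h: ips
        for h, ips in ingresses.items()
        if h in gtm_properties and gtm_properties[h] != ips
    }
    # App was removed from every cluster, so its property should go away.
    deletes = sorted(h for h in gtm_properties if h not in ingresses)
    return creates, updates, deletes


observed = {
    "booking.example.com": {"10.0.1.5", "10.0.2.5"},
    "checkin.example.com": {"10.0.3.7"},
}
registered = {
    "booking.example.com": {"10.0.1.5"},
    "legacy.example.com": {"10.0.9.9"},
}
creates, updates, deletes = plan_gtm_changes(observed, registered)
print("create:", sorted(creates))
print("update:", sorted(updates))
print("delete:", deletes)
```

Computing a plan from the two states, rather than reacting to individual events, means a missed notification is corrected on the next comparison.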

In the diagram below, you can see our custom operator calling our GTM Sync Service application for GTM creates, updates and deletes. Thanks to the Kuma service mesh, the GTM Sync Service can exist on any cluster, and it does not have to sit side by side with our operator.

About Runway

We’ve used Runway as a base to encourage open source at American Airlines.

The platform has made it easy even for non-developers at American Airlines to get started, contribute and build their team's service requests into the platform. Some of the highlights of our InnerSource contributions include:

The API management team created built-in management capabilities for APIs right within Runway
The data engineering team built powerful workflows to allow teams access to data lake items
The InnerSource team made their marketplace inside Runway to encourage the re-use of components and code
The Enterprise Managed File Transfer group built plugins to view information on our file transfer jobs, and they’re continuing to push forward with job creation
The Compute-as-a-service team built a tool to create and view information on virtual machines

Runway: Create App

A large portion of what we tried to achieve centers on our Create app. This is how applications join our clusters and our mesh in a friendly way. We wanted to make it easy for application developers at American Airlines to launch their apps into our Kubernetes environment without having to worry about the underlying infrastructure: just click and deploy. We have options for creating new apps and for deploying existing apps into our mesh, and we have big plans to extend these blueprints.

Abstraction Layers in Runway

Runway: Main UI

Runway's abstraction layers help our app teams easily onboard onto our clusters and mesh. Runway itself gives us the user interface through which we provide those abstractions. The Runway Kubernetes Operator takes a small subset of YAML and expands it into Kubernetes resources. We aimed for "describe your app in 10 lines of YAML or fewer," and we came pretty close.
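
As a rough illustration of that expansion step, the sketch below turns a short app description into standard Kubernetes manifests. It is not the Runway operator's code. The field names in the short description (name, namespace, image, port, host, minReplicas, maxReplicas) and all the values are hypothetical, and a real operator would also handle the Kuma-related resources.

```python
def expand_web_app(app):
    """Expand a short, hypothetical app description into Kubernetes manifests."""
    name = app["name"]
    selector = {"app": name}

    def meta():
        # Fresh dict per resource so the manifests do not share state.
        return {"name": name, "namespace": app["namespace"], "labels": dict(selector)}

    container = {
        "name": name,
        "image": app["image"],
        "ports": [{"containerPort": app["port"]}],
    }
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": meta(),
        "spec": {
            "replicas": app["minReplicas"],
            "selector": {"matchLabels": dict(selector)},
            "template": {
                "metadata": {"labels": dict(selector)},
                "spec": {"containers": [container]},
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": meta(),
        "spec": {
            "selector": dict(selector),
            "ports": [{"port": 80, "targetPort": app["port"]}],
        },
    }
    autoscaler = {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": meta(),
        "spec": {
            "scaleTargetRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": name},
            "minReplicas": app["minReplicas"],
            "maxReplicas": app["maxReplicas"],
        },
    }
    backend = {"service": {"name": name, "port": {"number": 80}}}
    ingress = {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": meta(),
        "spec": {
            "rules": [{
                "host": app["host"],
                "http": {"paths": [{"path": "/", "pathType": "Prefix", "backend": backend}]},
            }]
        },
    }
    return [deployment, service, autoscaler, ingress]


# Hypothetical seven-field description standing in for the short YAML file.
web_app = {
    "name": "seat-map",
    "namespace": "demo",
    "image": "registry.example.com/seat-map:1.0.0",
    "port": 8080,
    "host": "seat-map.example.com",
    "minReplicas": 2,
    "maxReplicas": 6,
}
manifests = expand_web_app(web_app)
print([m["kind"] for m in manifests])
```

Because the expansion lives in one place, changing a default there changes what every app receives on its next reconciliation, without app teams editing their own files.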

Runway Kubernetes Operator

This operator is also awesome because we can change how app teams are deploying apps without teams being fully aware. We updated the operator when we needed to add Kuma locality-aware routing labels. Doing so ensured all apps deployed in our ecosystem were gracefully updated. We also automatically update our global traffic managers based on ingress changes. That way, teams no longer have to manually fill out requests for GTM services or security services provided by our delivery partners.

Argo CD for Deployments

With Argo CD for our deployments, teams don’t have to know much about deploying an app to Kubernetes. They simply update a YAML file in a repo, and the deployment is updated everywhere within three minutes. Rollbacks are easy as well.

Cluster Service

Our cluster service allows teams to securely create namespaces and projects through a Rancher platform. It also applies Kuma annotations needed at the namespace level.
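
For illustration, a namespace carrying Kuma annotations could be built as shown below. This is a sketch, not the cluster service's implementation. The namespace and mesh names are hypothetical, and the choice of these two annotation keys is an assumption: `kuma.io/sidecar-injection` and `kuma.io/mesh` are the keys Kuma uses to enable sidecar injection and select a mesh, but depending on the Kuma version the injection key may need to be set as a label instead of an annotation.

```python
def mesh_namespace(name, mesh):
    """Build a Namespace manifest that opts its pods into a Kuma mesh.

    Assumption: sidecar injection and mesh membership are both set as
    namespace annotations, which may differ by Kuma version.
    """
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": name,
            "annotations": {
                "kuma.io/sidecar-injection": "enabled",
                "kuma.io/mesh": mesh,
            },
        },
    }


# Hypothetical team namespace joined to a hypothetical mesh.
namespace = mesh_namespace("team-loyalty", "default")
print(namespace["metadata"]["annotations"])
```

Setting this once at the namespace level is what lets app teams deploy ordinary workloads and still end up inside the mesh.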

GitHub Repos

Our GitHub repos allow us to control the code and use GitHub Actions using workflow dispatch to deploy our infrastructure. We also document our traffic policies and infrastructure as code in GitHub repos.
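
A workflow that declares a workflow_dispatch trigger can be started through GitHub's REST API. The sketch below only builds the request and does not send it. The organization, repository, workflow file name, input name and token are all hypothetical placeholders, not American Airlines' real values.

```python
import json
import urllib.request


def build_dispatch_request(owner, repo, workflow_file, ref, inputs, token):
    """Build (but do not send) a GitHub workflow_dispatch API request."""
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/workflows/{workflow_file}/dispatches"
    )
    # "ref" is the branch or tag to run on; "inputs" must match the
    # inputs declared under workflow_dispatch in the workflow file.
    body = json.dumps({"ref": ref, "inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )


request = build_dispatch_request(
    owner="example-org",
    repo="mesh-infrastructure",
    workflow_file="deploy-kuma.yml",
    ref="main",
    inputs={"zone": "us-east-1"},
    token="<token>",
)
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```

Triggering deployments this way keeps the run history, inputs and logs attached to the repository that holds the infrastructure code.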

GitHub Internal Actions

Our GitHub internal actions allow us to reduce duplication of code chunks. Teams can easily find reusable actions through the InnerSource marketplace hosted on Runway.

Rancher

Rancher gives us a way to abstract user management across our clusters, and it lets us introduce new cloud providers to support our Kubernetes clusters with minimal effort.

App Creation: Only 6 Minutes for a Full Deployment

A full deployment through Create app takes only six minutes. That includes building a Docker image, security scanning, Kuma resource creation and setting up the global traffic management property with our traffic manager.

Argo CD Deployment

Argo CD deployed the app, which you can see in the image below. We have a custom-defined web app in the second box from the left. That web app has expanded into normal Kubernetes resources such as services, external services, deployments, a horizontal pod autoscaler and an ingress. If we had secrets or config maps defined, you’d see those as well.

The Kubernetes resources are created from the Runway platform by the Runway Operator. Ten lines of YAML produced all of those resources, which is far less than it usually takes.

We partnered with a few app teams to run through our full workflow and deploy production apps to our infrastructure. One team in particular has a pretty cool new app coming out to enhance our customers’ onboarding experience. The team, which typically deploys many microservices, has reduced the time to deploy a new app from days to hours.

Outside teams with less existing infrastructure typically see deployment time drop from months to hours.

The team has also seen an increase in app portability and less reliance on our cloud partners. That’s a huge plus when we’re trying to support the world’s largest airline and utilize multiple cloud providers for our applications.

Typically, teams in our organization need to be Kubernetes experts to launch apps to clusters, but not this team. Thanks to the amazing efforts of our contributing members of developer experience, the team only had to master containerization to move onto the new infrastructure, and that’s it. They don’t need to know much about Kubernetes.

We also gave teams a detailed view of the service mesh using the Kuma tracing integration. Our customers can trace traffic from the cluster ingress through the mesh to the app. This gives teams a much deeper insight into what’s going on throughout the entire lifespan of a transaction, without any blind spots. It is another batteries-included item from Kuma.

Conclusion

We’ve created abstractions to help users avoid being overwhelmed with yet another technology layer. Some of those include:

Blueprints that provide a starting point for teams with batteries included
The operator lets teams hook apps into clusters and into the service mesh without even realizing it, and it reduces the number of lines of YAML required to launch an app into Kubernetes or GitHub Actions
Automation lets our team automate common policies and infrastructure creation
The global traffic management service we built makes it easy to update ingress pointers to access our apps within our mesh
The Kubernetes infrastructure makes it easy to move apps between our cloud providers
Kuma makes it super easy to control traffic permissions, routing and gateways, as it's easy to set up and spans multiple regions and service providers without manual steps
Kuma makes it easy to communicate with services wherever they reside
And with Kuma, we didn’t need to seek external components - we simply used Kuma

For even more details on Runway and how we've leveraged Kuma, check out my "Create App Lifecycle in Runway" demo in my full Kong Summit session video recording.

As we look towards the future, we’d like to surface more information about Kuma service mesh policies and global traffic management to our customers right in Runway, so that teams can easily find the information they’re looking for.

We continue enhancing the developer experience by looking for modern open source technologies like Kuma and exploring Kuma's capabilities using policies.

Learn more from our Kuma Community call recording below.