Behind the Scenes: Mesh Manager Architecture
How did the Kong Mesh team design and bring Kong Mesh into Kong Konnect? In this blog post, we’re going to dive into this question to understand what’s going on behind the scenes with Mesh Manager.
Introduction
Mesh Manager was officially launched in Kong Konnect in September 2023. For those unfamiliar with the product, Mesh Manager is Kong's SaaS offering of Kong Mesh, which is itself built on Kuma, a CNCF project. In this post, we'll walk through the steps and obstacles encountered in integrating Kong Mesh into Kong Konnect to provide users with the Mesh Manager experience.
The mesh team faced three primary challenges in incorporating Kong Mesh into Kong Konnect.
- Provide an abstraction in Mesh Manager that is easy for Konnect Engineering teams to operate.
- Tackle multi-tenancy. Multi-tenancy had to align with the existing Kong Konnect framework, and the same notion had to extend down to control plane-to-zone communication and mesh resources.
- Address zone connectivity. Zone connectivity to the global control plane and the existing Kuma APIs needed to behave the same way they do in a self-managed deployment.
Before diving into solutions, let’s start by clearly defining the problem statement and understanding how Kong Mesh is used.
Defining the problem statement
Kong Mesh operates with three distinct layers: the Global Control Plane, the Remote Zone Control Plane, and the Data Plane. Within Kong Konnect, the global control plane is offered as the SaaS solution, providing users with a unified platform for management and governance. Essentially, this means we provide a streamlined, centralized hub for managing Kong Mesh control planes, while you, as a customer, deploy Kong Mesh zones and sidecars within your own infrastructure.
Why would you want this? Without veering too far off-topic: a large enterprise's engineering org typically starts with something like three environments. For Kong Mesh, that equates to three global control planes. The problem compounds as the business grows and new teams or verticals need their own isolation and requirements. Depending on how you deploy it, the Kong Mesh global control plane must either be hosted on a Kubernetes cluster independent from the zones or run in Universal mode with PostgreSQL as the database. Either way, you take on the day-2 operations that come with running a new platform in a production-ready state.
From an operational perspective, this can be cumbersome. With Mesh Manager, the experience changes dramatically: when you need a new global control plane, you simply make an API call to Kong Konnect and it's provisioned. The operational work that comes with managing your own infrastructure is abstracted away.
To make this a reality, the team set technical objectives for each of the three challenges listed above.
The dynamic duo: Mesh Manager's architecture
When Kong Mesh was onboarded into Kong Konnect, two services were introduced to the Kong Konnect family: Kong Mesh itself, deployed as a global control plane in Universal mode with PostgreSQL as the datastore, and a net-new service, the mesh virtual control plane manager (vcp-manager). These two services work in tandem, along with existing Kong Konnect platform services that provide authentication and authorization capabilities, to bring you Mesh Manager.
Rather than designing Kong Mesh specifically to accommodate multiple global control planes, a majority of the core behavior is consolidated in the mesh vcp-manager. This architecture was selected for two reasons:
- to reduce the overhead of maintaining Kuma, the CNCF project, and Kong Mesh Enterprise, the on-prem version, in addition to Mesh Manager
- to be able to easily shard Kong Mesh deployments in the future
Behind the scenes, Kong Mesh is still the acting global control plane. When a user creates mesh configuration or a zone establishes a connection, the request is predominantly handled by Kong Mesh. The role of the vcp-manager is exactly what the name suggests: it provides an API entry point for users to create, update, and delete virtual global control planes, as well as to provision zones (more on zone provisioning shortly). It maintains the relationship between users and their global control planes and propagates global control plane IDs to Kong Mesh. Kong Mesh, in turn, implements multi-tenancy by treating the global control plane ID as the tenant and associating mesh resources with this ID.
In essence, the vcp-manager generates a virtual control plane, accessible through Kong Konnect APIs and maintains the multi-tenant relationship, while Kong Mesh oversees the management of mesh resources and zone connectivity, utilizing the tenant provisioned by the vcp-manager.
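To make the division of labor concrete, here is a minimal in-memory sketch of the relationship described above. Everything in it is an assumption for illustration: the class names `VcpManager` and `MeshStore`, their methods, and the use of a UUID as the control plane ID are hypothetical and are not taken from the actual services.

```python
import uuid
from collections import defaultdict


class MeshStore:
    """Hypothetical stand-in for Kong Mesh: mesh resources keyed by tenant ID."""

    def __init__(self):
        self._known_tenants = set()
        self._resources = defaultdict(dict)  # tenant ID -> {resource name: spec}

    def register_tenant(self, tenant_id):
        self._known_tenants.add(tenant_id)

    def put_resource(self, tenant_id, name, spec):
        # Resources can only be attached to a tenant that was propagated here.
        if tenant_id not in self._known_tenants:
            raise KeyError(f"unknown tenant {tenant_id}")
        self._resources[tenant_id][name] = spec

    def list_resources(self, tenant_id):
        return sorted(self._resources.get(tenant_id, {}))


class VcpManager:
    """Hypothetical stand-in for the vcp-manager: owns the user-to-control-plane mapping."""

    def __init__(self, mesh_store):
        self._mesh_store = mesh_store
        self._owners = {}  # control plane ID -> owning organization

    def create_control_plane(self, org_id):
        cp_id = str(uuid.uuid4())
        self._owners[cp_id] = org_id
        # Propagate the new control plane ID so the store treats it as a tenant.
        self._mesh_store.register_tenant(cp_id)
        return cp_id

    def control_planes_for(self, org_id):
        return [cp for cp, owner in self._owners.items() if owner == org_id]


store = MeshStore()
manager = VcpManager(store)
cp_a = manager.create_control_plane("org-a")
cp_b = manager.create_control_plane("org-b")
store.put_resource(cp_a, "default", {"type": "Mesh"})
print(store.list_resources(cp_a))  # only org-a's control plane sees the mesh
print(store.list_resources(cp_b))  # org-b's control plane sees nothing
```

The point of the sketch is the direction of the dependency: the manager decides which control planes exist and who owns them, while the store only needs to know a tenant ID.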
Row Level Security (RLS) is also in play here, restricting access to global control planes and Kong Mesh resources to their owners. If you are unfamiliar with RLS, it is becoming a staple in the engineering world because it is a pragmatic mechanism for implementing multi-tenancy securely. It avoids the need to over-engineer the database design, along with the operational burden that comes with that, while still ensuring isolation among users.
Overall, this architecture has several key advantages, the most critical being that it greatly reduced the cost of releasing Mesh Manager, since most of the product is still Kong Mesh. It's a secure multi-tenant solution, future-proof, operationally efficient from a support perspective, and easily integrates with existing platform services for components such as authentication and authorization.
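For readers new to the idea, the sketch below emulates the core of RLS in plain Python: a single tenant predicate is applied to every read and write, so individual queries cannot forget the filter. This is only an emulation of the concept, not the database feature itself (PostgreSQL declares such rules with `ALTER TABLE ... ENABLE ROW LEVEL SECURITY` and `CREATE POLICY`), and the `Row` and `TenantScopedTable` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    tenant_id: str
    name: str


class TenantScopedTable:
    """Emulates a table with one row-level policy: tenant_id must match the caller."""

    def __init__(self):
        self._rows = []

    @staticmethod
    def _policy(row, current_tenant):
        # The single rule that every operation passes through.
        return row.tenant_id == current_tenant

    def insert(self, current_tenant, row):
        if not self._policy(row, current_tenant):
            raise PermissionError("row violates tenant policy")
        self._rows.append(row)

    def select(self, current_tenant):
        # Rows owned by other tenants are silently invisible.
        return [r for r in self._rows if self._policy(r, current_tenant)]


table = TenantScopedTable()
table.insert("cp-1", Row("cp-1", "mesh-default"))
table.insert("cp-2", Row("cp-2", "mesh-payments"))
print([r.name for r in table.select("cp-1")])
```

Because the rule lives in one place rather than in each query, all tenants can share one schema without any tenant seeing another's rows.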
Supporting zone connection to control plane
However, merely supporting global control planes within a multi-tenant structure wasn’t the end game; more work was required in order to allow zones to establish connectivity to the global control planes that are now hosted in Kong Konnect. To understand why, we need to look a little closer at the zone-to-global cp connectivity model in the on-prem version of Kong Mesh.
Kong Mesh uses JSON Web Tokens (JWTs) to authenticate zone control planes to the global control plane. This ensures that only approved zone control planes can connect to the global control plane.
The steps are pretty straightforward: an administrator invokes the generate-zone-token endpoint on the global control plane, which issues a JWT for the new zone, signed with the global control plane's own private key. The zone then uses this token to authenticate with the global control plane when it establishes a connection.
This model runs into a complication in Konnect. Since Konnect employs its own authentication system for issuing PATs (Personal Access Tokens) and SATs (System Account Access Tokens), a zone connect request carrying a token that Konnect did not issue would be blocked immediately by the Kong Gateway that secures Kong Konnect.
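The issue-then-verify flow can be sketched with the standard library alone. Note the substitution: the post describes signing with a private key, which implies an asymmetric algorithm, whereas this sketch swaps in HMAC-SHA256 (HS256) with a shared secret so that it runs without third-party packages. The `zone` claim name and both function names are assumptions for illustration, not Kuma's actual token format.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(secret: bytes, signing_input: str) -> str:
    digest = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return _b64url(digest)


def issue_zone_token(secret: bytes, zone: str) -> str:
    """Global control plane side: issue a signed token naming the zone."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps({"zone": zone}).encode())
    signing_input = f"{header}.{payload}"
    return f"{signing_input}.{_sign(secret, signing_input)}"


def verify_zone_token(secret: bytes, token: str) -> str:
    """Global control plane side: check the signature, return the zone name."""
    header, payload, signature = token.split(".")
    expected = _sign(secret, f"{header}.{payload}")
    if not hmac.compare_digest(signature, expected):
        raise ValueError("invalid signature")
    return json.loads(_b64url_decode(payload))["zone"]


secret = b"global-cp-signing-key"  # placeholder value
token = issue_zone_token(secret, "zone-east")
print(verify_zone_token(secret, token))
```

A real deployment would also check expiry and other claims; they are omitted here to keep the flow visible.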
What's the solution then?
The only significant change is that Kong Mesh does not issue the token directly. Instead, a new endpoint called provisionZone was added to Kong Mesh to integrate with the Kong Konnect token issuance and validation service. This endpoint uses Kong Konnect's authentication and authorization services to produce a SAT (System Account Access Token) for the zone. All the metadata that Kuma's open core model uses to store zone resources is still in place. In summary, when a zone tries to establish a connection with a Mesh Manager control plane, a token is still required, but instead of a JWT we provide the Kong Konnect-provisioned SAT.
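The delegation described above can be sketched as follows. This is a toy model under stated assumptions: `AuthService` is a hypothetical stand-in for the platform's token service, the token is modeled as an opaque random string (the real SAT format is not described in this post), and the method names are invented for illustration.

```python
import secrets


class AuthService:
    """Hypothetical stand-in for the platform's token issuance and validation."""

    def __init__(self):
        self._tokens = {}  # token -> (control plane ID, zone name)

    def issue_system_token(self, cp_id, zone):
        token = secrets.token_urlsafe(32)
        self._tokens[token] = (cp_id, zone)
        return token

    def validate(self, token):
        # Returns the identity bound to the token, or None if it is unknown.
        return self._tokens.get(token)


class GlobalControlPlane:
    """Keeps zone metadata itself but delegates tokens to the auth service."""

    def __init__(self, auth):
        self._auth = auth
        self._zones = {}  # (control plane ID, zone name) -> metadata

    def provision_zone(self, cp_id, zone):
        # Zone metadata is still stored here; only token issuance is delegated.
        self._zones[(cp_id, zone)] = {"connected": False}
        return self._auth.issue_system_token(cp_id, zone)

    def connect_zone(self, token):
        identity = self._auth.validate(token)
        if identity is None or identity not in self._zones:
            raise PermissionError("zone token rejected")
        self._zones[identity]["connected"] = True
        return identity


auth = AuthService()
global_cp = GlobalControlPlane(auth)
sat = global_cp.provision_zone("cp-1", "zone-east")
print(global_cp.connect_zone(sat))
```

The zone's side of the exchange is unchanged in shape: it still presents a token when connecting, and only the issuer and validator differ.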
Bringing it all together
Once you have reached this step in the Mesh Manager journey, it is business as usual. You have a global control plane in Konnect, and you have deployed a remote zone control plane that has established a connection (some at Kong may even say a Konnection). You can now deploy Kuma sidecars into your infrastructure and manage your mesh policies via Konnect.
Meet the team and acknowledgements
Not enough can be said about the incredible work the Mesh Team accomplishes. They have triple duties supporting open-source Kuma, Kong Mesh Enterprise, and now Mesh Manager. Mesh Manager itself required a lot of thought and consideration when it came to the architecture, future-proofing it, and best handling the complex world of open-source CNCF projects.
We want to take a moment to introduce the mesh team as it exists today at Kong.
We’d like to give a shout-out to the two exceptional Senior Frontend Engineers on the team: Phillip Rudloff and John Cowen. They've been making contributions to align the UI experience of Mesh Manager with Konnect, while also reducing any additional overhead of supporting Kong Mesh and Kuma.
We would also like to recognize the backend engineering team and managers: Bart Smykla, Ilya Lobkov, Jakub Dyszkiewicz, Jay Chen, Krzysztof Słonka, Łukasz Dziedziak, Mike Beaumont, Marcin Skalski, Charly Molter, and John Harris.
We're genuinely interested in the feedback from our users. Your feedback is instrumental in understanding gaps and subsequent enhancements. If you want to give Mesh Manager a try, the easiest way is to get started for free with Kong Konnect. Let us know what you think!
Also, if you love working on service meshes, API gateways, or SRE, why not check out Kong's job openings, which span these and many other roles, and consider joining us?