What's Coming in Kuma 2.5?
We are excited to announce the latest minor release of Kuma: 2.5.0.
We’re in the final stretch of publishing this release, but we believe the features we’re landing are important enough to start teasing them now. Let’s focus on three main features:
- Delta KDS is now the default. This complete rewrite of the syncing protocol between the global and zone control planes makes multi-zone deployments more efficient and scalable.
- The simplest and most expressive locality-aware load balancing API in the service mesh space.
- A performance improvement for users who leverage MeshTrafficPermission to restrict access to their services.
Delta KDS as default
Kong recently launched Kong Mesh Manager, the SaaS offering to manage Kong Mesh, the enterprise distribution of Kuma. To achieve this, it was necessary to move away from the state-of-the-world syncing protocol between the global and zone control planes.
So contributors from Kong enhanced Kuma with a new protocol called “Delta KDS”.
The main change is that it only sends the resources that have changed instead of an entire snapshot of the environment on every update. If you want to understand the details, see the MADR.
After incubating it for a release, it is now running in production on Konnect, and we believe it is time for everyone to benefit from this stronger, more efficient, and more scalable protocol.
The transition from the old to the new protocol is seamless, so you don’t have to worry about it. You will also be able to revert to the v1 protocol for the next 2 minor versions by setting KUMA_EXPERIMENTAL_KDS_DELTA_ENABLED=false.
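For example, if you deploy the control plane with the Helm chart, you could set this variable through its environment variable overrides. This is only a sketch of a values file, assuming the chart’s controlPlane.envVars setting:

```yaml
# values.yaml (sketch): revert the global control plane to the previous KDS protocol
controlPlane:
  envVars:
    KUMA_EXPERIMENTAL_KDS_DELTA_ENABLED: "false"
```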
New locality-aware load balancing API
Since its inception, Kuma has strived to be a service mesh that achieves both simplicity and flexibility. We demonstrated this with the new policy API introduced in 2.0.0.
Another area where Kuma keeps innovating is multi-zone support. We are strongly convinced that, for a service mesh to demonstrate its full utility, it must be able to interconnect clusters and control traffic between them easily.
With 2.5.0 we’re introducing locality-aware load balancing as a feature in the existing MeshLoadBalancingStrategy policy.
This feature is divided into two parts:
- Locality-aware load balancing within a zone, which can for example help with cost savings for clusters spread across AWS availability zones.
- Locality-aware load balancing across zones, which can help with reliability, compliance, and security.
Zone locality-aware load-balancing
As usual, this all relies on data plane tags: you specify a list of tags in order of preference, and Kuma will first pick the endpoints that share the same tag values as the client proxy. For example:
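Here is a minimal sketch of what such a policy could look like. The service name backend is illustrative, and the tag keys k8s.io/node and k8s.io/az are assumptions: whichever tags you use must be present on your data planes.

```yaml
type: MeshLoadBalancingStrategy
name: backend-local-lb
mesh: default
spec:
  targetRef:
    kind: Mesh
  to:
    - targetRef:
        kind: MeshService
        name: backend
      default:
        localityAwareness:
          localZone:
            affinityTags:
              # Most preferred first: same node, then same availability zone
              - key: k8s.io/node
              - key: k8s.io/az
```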
Here, traffic to the backend service will first try to stay on the same node; if no instances exist on that node, it will look for instances in the same availability zone; finally, if that is still not possible, it will route to any other endpoint.
Cross-Zone load-balancing
When routing across zones, many things are at stake:
Firstly, not all services will have the same policy. Thankfully, Kuma’s expressive policy matching API allows very granular control over this.
Secondly, a lot of cross-zone routing is symmetric; for example, us-east fails over to us-west and vice versa. We wanted to build an API expressive enough that you don’t have to write things twice in these cases.
Thirdly, excluding is as important as including. We’ve recently seen a growing number of constraints around data sovereignty, and we wanted to make it possible to have a global mesh with local data.
This leads us to the following API:
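As a sketch (zone names are illustrative), a policy with one symmetric pair of zones and a catch-all rule could look like this:

```yaml
type: MeshLoadBalancingStrategy
name: cross-zone-failover
mesh: default
spec:
  targetRef:
    kind: Mesh
  to:
    - targetRef:
        kind: Mesh
      default:
        localityAwareness:
          crossZone:
            failover:
              # us-east and us-west fail over to each other; one rule covers both directions
              - from:
                  zones: ["us-east", "us-west"]
                to:
                  type: Only
                  zones: ["us-east", "us-west"]
              # any other zone can fail over to any remaining zone
              - to:
                  type: Any
```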
We always favor traffic in the same zone. Once there are not enough healthy instances in the local zone, we start load balancing to other zones, following the order of the entries in “failover”. Each rule has two sections:
- “From”, which restricts the rule to a subset of source zones.
- “To”, which defines which zones to route to. There are 4 different types of “to” rules:
  - Any: all zones that haven’t been used already,
  - Only: only a subset of zones,
  - AnyExcept: all zones except a specified subset,
  - None: don’t fail over to anything else.
We’re now going to show how we can use this new policy to define a fairly advanced setup (a sketch follows the list below):
- Proxies in zone-1 should fail over to zone-2 and vice versa
- Proxies in zone-3 should never fail over outside of zone-3, except for traffic to service-a
- All other zones should fail over anywhere except to zone-3
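A sketch of one way to express this, with a mesh-wide rule and a more specific rule for service-a (zone and service names are illustrative):

```yaml
type: MeshLoadBalancingStrategy
name: advanced-cross-zone
mesh: default
spec:
  targetRef:
    kind: Mesh
  to:
    # Default behavior for all services in the mesh
    - targetRef:
        kind: Mesh
      default:
        localityAwareness:
          crossZone:
            failover:
              - from:
                  zones: ["zone-1"]
                to:
                  type: Only
                  zones: ["zone-2"]
              - from:
                  zones: ["zone-2"]
                to:
                  type: Only
                  zones: ["zone-1"]
              # zone-3 never fails over outside of itself
              - from:
                  zones: ["zone-3"]
                to:
                  type: None
              # everyone else fails over anywhere except zone-3
              - to:
                  type: AnyExcept
                  zones: ["zone-3"]
    # Traffic to service-a is allowed to leave zone-3
    - targetRef:
        kind: MeshService
        name: service-a
      default:
        localityAwareness:
          crossZone:
            failover:
              - from:
                  zones: ["zone-1"]
                to:
                  type: Only
                  zones: ["zone-2"]
              - from:
                  zones: ["zone-2"]
                to:
                  type: Only
                  zones: ["zone-1"]
              - to:
                  type: AnyExcept
                  zones: ["zone-3"]
```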
We believe this expressiveness will let you encode even your most advanced routing constraints in a compact way. To fully understand what’s happening, you can always use the inspect API and its graphical representation in the GUI.
For full details, you can check the MADR.
Reachable Services inference through MeshTrafficPermission
One of the core scalability issues of service meshes is the trade-off between being able to reach any service and having a proxy configuration that grows with the number of services in your mesh.
Our answer to this has been reachable services, which must be manually added to each proxy and can therefore be cumbersome to maintain.
However, in some environments, MeshTrafficPermission is already leveraged for security purposes. There is no point in configuring connectivity to a service you are not allowed to access, so Kuma now prunes that configuration automatically. We believe the performance improvements of this feature can be substantial, even with coarse-grained permissions at the namespace level.
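For instance, a coarse-grained, namespace-level permission could look like the sketch below; the namespaces and the k8s.kuma.io/namespace tag are illustrative assumptions:

```yaml
type: MeshTrafficPermission
name: allow-frontend-namespace
mesh: default
spec:
  # Services in the backend namespace...
  targetRef:
    kind: MeshSubset
    tags:
      k8s.kuma.io/namespace: backend
  from:
    # ...may only be reached from the frontend namespace
    - targetRef:
        kind: MeshSubset
        tags:
          k8s.kuma.io/namespace: frontend
      default:
        action: Allow
```

With auto reachable services enabled, proxies outside the frontend namespace no longer receive the configuration for these backend services.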
To leverage this feature, make sure to enable:
KUMA_EXPERIMENTAL_AUTO_REACHABLE_SERVICES=true
You can also combine this feature with explicitly configured reachableServices to obtain the best performance possible.
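As a reminder, on Kubernetes reachableServices are configured per data plane with the kuma.io/transparent-proxying-reachable-services annotation; the pod and service names in this sketch are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: frontend
  annotations:
    # Comma-separated list of the only services this proxy needs to reach
    kuma.io/transparent-proxying-reachable-services: "backend_kuma-demo_svc_3001,postgres_kuma-demo_svc_5432"
spec:
  containers:
    - name: frontend
      image: example/frontend:latest
```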
Last but not least, TargetRef-based policies are now GA
We’ve been working on these new policies for the past year. They have proven their expressiveness and power, and we would now like to move them out of beta so users can leverage them as much as possible.
While this release will be out in a few days, you can already try it out with: curl https://kuma.io/installer.sh | VERSION=2.5.0-preview.v51e90c3 sh -
You can already check out the docs here: https://kuma.io/docs/2.5.x/