A service mesh is a mechanism for managing communications between the various individual services that make up modern applications in a microservice-based system.
When a service mesh is applied, all inter-service communication is routed through proxies, which can be used to implement networking features such as encryption and load balancing. The service mesh decouples the network logic from the application or business logic of each microservice so that it can be implemented and managed consistently across the whole system.
The dedicated infrastructure layer of the service mesh complements an API gateway. However, a service mesh only handles communication between the services that make up a system, while an API gateway decouples the underlying system from the API that is exposed to clients (which can be other systems within the organization or external clients).
The difference between an API gateway and a service mesh is sometimes characterized as north-south (API gateway) versus east-west (service mesh) communication, but that's not strictly accurate.
While the service mesh pattern was designed to handle network connectivity between microservices, it can also be applied to other architectures (monolithic, mini-services, serverless) wherever there are multiple services communicating across a network.
In this article, we'll explore what a service mesh is, how it works, and how it can efficiently handle challenges like routing or load balancing. We'll also learn about the different components of a service mesh and look at the benefits it brings, the challenges it imposes, and how you can overcome them.
Why Do You Need a Service Mesh?
Once upon a time, programs were developed and deployed as a single application. Known as monolithic architecture, this traditional approach works fine for simple applications, but it becomes a burden as applications grow complex and the codebase swells.
That's why modern organizations typically move from a monolithic architecture to a microservices architecture. By allowing teams to work independently on small parts of the application (called microservices), applications can be developed modularly as a collection of services. But as the number of services grows, you need to ensure that the connections between services are fast, smooth, and resilient.
The data plane is a network proxy replicated alongside each microservice (known as a sidecar), which manages all inbound and outbound network traffic on behalf of the microservice. As part of this, it may perform service discovery, load balancing, security, and reliability functions. The service and its sidecar should be deployed on the same host and, if your deployment is containerized, in the same pod.
The Data Plane
The data plane is made up of services running alongside sidecar proxies. The services handle business logic, while proxies sit between them and other services. All traffic, both ingress and egress, flows through the proxies, which are responsible for routing (proxying) traffic to other services.
Fig. 1: Service mesh data plane
Many features provided by the service mesh work at the request level, making sidecars Layer 7 proxies. By operating at this layer, service meshes can provide capabilities like intelligent load balancing based on observed latency, as well as sensible and consistent request timeouts.
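To make latency-based load balancing more concrete, here is a minimal sketch of one possible approach: keep an exponentially weighted moving average (EWMA) of each replica's response time and send the next request to the replica that currently looks fastest. The class name, the `alpha` parameter, and the replica names are hypothetical, and real proxies use more refined algorithms.

```python
class LatencyAwareBalancer:
    """Pick the replica with the lowest smoothed (EWMA) latency."""

    def __init__(self, replicas, alpha=0.3):
        self.alpha = alpha  # weight given to the newest observation
        # None means "no latency observed yet" for that replica
        self.latency = {replica: None for replica in replicas}

    def record(self, replica, seconds):
        """Fold a newly observed response time into the moving average."""
        previous = self.latency[replica]
        if previous is None:
            self.latency[replica] = seconds
        else:
            self.latency[replica] = (
                self.alpha * seconds + (1 - self.alpha) * previous
            )

    def pick(self):
        """Return an unmeasured replica if one exists, else the fastest one."""
        return min(
            self.latency,
            key=lambda r: (self.latency[r] is not None, self.latency[r] or 0.0),
        )
```

Because the proxy sees every request and response, it can feed `record` without the application knowing that any measurement takes place.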
Sidecars also provide functionality at the connection level. For example, they can implement mutual Transport Layer Security (mTLS), which allows each party in the communication to validate the certificate of the other.
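The sketch below shows what "mutual" means in practice, using Python's standard `ssl` module. It illustrates the idea rather than how any particular mesh configures its proxies: the function names are hypothetical, and the certificate paths are optional placeholders that a mesh would normally issue and rotate for you.

```python
import ssl


def server_context(cert_file=None, key_file=None, ca_file=None):
    """TLS settings for the receiving side of an mTLS connection."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    # This is the "mutual" part: clients without a valid certificate are rejected
    ctx.verify_mode = ssl.CERT_REQUIRED
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)  # CA that signs client certs
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


def client_context(cert_file=None, key_file=None, ca_file=None):
    """TLS settings for the calling side of an mTLS connection."""
    # Server certificate and hostname verification are enabled by default
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    if cert_file:
        # The client presents its own certificate so the server can verify it
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx
```

With a service mesh, equivalent settings live in the sidecars, so the services themselves can keep speaking plain HTTP or TCP to their local proxy.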
The Control Plane
Proxies need to be configured. This is done through the control plane, which consists of several services that provide administrative functionality over the service mesh. The control plane provides an interface for configuring the behavior of the data plane, and it allows the proxies to coordinate their actions.
Fig. 2: Service mesh architecture
Operators interact with the service mesh through the control plane by using a CLI or API. For example, operators work through the control plane to define routing rules, create circuit breakers, or enforce access control.
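As a toy model of this relationship, the sketch below has a control plane that stores policies and pushes the full set to every registered proxy whenever an operator changes one. All class, method, and policy names here are hypothetical. Real control planes distribute configuration over dedicated APIs rather than in-process method calls.

```python
class SidecarProxy:
    """Data plane element: applies whatever configuration it is given."""

    def __init__(self, service):
        self.service = service
        self.config = {}

    def apply(self, config):
        self.config = dict(config)  # keep a private copy of the policies


class ControlPlane:
    """Holds mesh-wide policies and keeps all proxies in sync with them."""

    def __init__(self):
        self.proxies = []
        self.policies = {}

    def register(self, proxy):
        self.proxies.append(proxy)
        proxy.apply(self.policies)  # new proxies start with current policies

    def set_policy(self, name, value):
        self.policies[name] = value
        for proxy in self.proxies:  # push the change to every sidecar
            proxy.apply(self.policies)
```

The point of the model is that an operator changes a policy once, in one place, and every sidecar ends up with the same configuration.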
Depending on the implementation, you can use the control plane to export observability data such as metrics, logs, and traces.
Distributed systems split applications into distinct services. While this type of architecture has many advantages (such as faster development, easier changes, and greater resilience), it also introduces some challenges (such as service discovery, efficient routing, and intelligent load balancing). Many of these challenges are cross-cutting, requiring multiple services to implement the same functionality independently and redundantly.
The service mesh layer allows applications to offload those implementations to the platform level. It also provides distinct advantages in the areas of observability, reliability, and security.
The service mesh can enhance observability. A system is considered observable if you can understand its internal state and health based on its external outputs. As the connective tissue between services, a service mesh can provide data about the system's behavior with minimal changes to application code.
By default, sidecar proxies provide metrics, such as request latency and error counts. Proxies can also generate traces automatically; services only need to forward the necessary headers, which improves the visibility of requests flowing through the system. Offloading observability concerns to the service mesh ensures consistent, useful observability.
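Forwarding trace headers is usually the one observability task left to the service itself. A minimal sketch of what that might look like follows. The header list is an assumption that mixes the W3C Trace Context and B3 conventions, and the exact set you need depends on your mesh and tracing backend.

```python
# Assumed set of trace-related headers; check what your mesh actually uses
TRACE_HEADERS = (
    "traceparent",
    "tracestate",
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
)


def propagate_trace_headers(incoming, outgoing=None):
    """Copy trace headers from an incoming request onto an outgoing one."""
    result = dict(outgoing or {})
    # HTTP header names are case-insensitive, so compare in lowercase
    lowered = {name.lower(): value for name, value in incoming.items()}
    for name in TRACE_HEADERS:
        if name in lowered:
            result[name] = lowered[name]
    return result
```

Everything else in the incoming request (cookies, authorization headers, and so on) is deliberately left out, so only the tracing context travels downstream.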
The service mesh also helps improve system reliability. By offloading fault tolerance concerns to a service mesh, services can focus on differentiating business logic. The service mesh can handle retrying requests transparently, without other services even being aware if dependencies are experiencing issues.
Service meshes also safeguard service reliability by enforcing timeouts on long-running requests. They can ensure services don't get overloaded by using techniques like circuit breaking. Because the service mesh has a holistic view of the system, it can decide which techniques are necessary to maintain reliability.
The service mesh simplifies the adoption of secure communication practices. It helps platforms establish a zero trust security model, which assumes that no entity (even one within the network) is blindly trusted.
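The transparent retry behavior described above can be sketched as a small wrapper around an outbound call. This is an illustration under assumptions, not how any specific proxy implements retries: `send` stands in for the real request, the parameter names are hypothetical, and timeouts are represented only by catching `TimeoutError`.

```python
import time


def call_with_retries(send, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call send() up to `attempts` times with exponential backoff."""
    last_error = None
    for attempt in range(attempts):
        try:
            return send()
        except (ConnectionError, TimeoutError) as err:
            last_error = err
            if attempt < attempts - 1:
                # Wait 0.1s, 0.2s, 0.4s, ... before the next attempt
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

Retries like this are only safe for requests that can be repeated without side effects, which is one reason retry policies are typically configured per route rather than applied blindly.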
Service meshes can take on the responsibility of authenticating, controlling access to, and encrypting traffic between services. By employing techniques like mTLS, a service mesh can ensure secure communication between services. It can also help services mitigate issues like service impersonation, unauthorized access, or packet sniffing. These all take place at the platform level, without business applications intervening.
Different service mesh implementations have different feature sets. Some promote simplicity, while others focus on capabilities. For example, some implementations focus on having lightweight sidecar proxies, while others offer chaos engineering capabilities. It's important to take these characteristics into consideration when choosing a service mesh implementation.
Service Mesh Challenges
Despite all its benefits, the service mesh also comes with some caveats, which can present some challenges. When adopting a service mesh, you should consider the following.
Added complexity: Adding a service mesh to a platform introduces another layer to your overall architecture. Regardless of which service mesh implementation you select, this will incur extra management costs. An extra set of services (such as the control plane) must be managed, and sidecar proxies must be configured and injected.
Resource consumption: A sidecar proxy accompanies each application replica. These proxies consume resources (such as CPU and memory), so the total overhead grows linearly with the number of application replicas.
Security loopholes: Bugs or configuration mistakes at the service mesh layer can create security threats. For example, a wrong configuration can expose an internal service to the outside world.
Debugging: An extra layer can make it harder to pinpoint issues. Traffic flowing through proxies adds extra network hops, which can obscure the root cause of some problems.
The threshold at which the advantages of a service mesh outweigh its disadvantages varies from organization to organization. When you're considering adopting a service mesh, it is crucial to know where it excels, what it can offer, and when it can be counterproductive.
What Challenges Does Service Mesh Address?
The loosely coupled nature of services characterizes modern microservice architecture. But as the number of services and requests between them increases, the demands on the platform rise.
Service meshes can help to address these demands by providing solutions to common problems.
In microservice architectures, individual services must be able to find one another to communicate. Service meshes provide this functionality through a discovery layer. By registering services into the service mesh, other services can discover them by name and initiate communication.
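Conceptually, the discovery layer is a registry that maps service names to the addresses of their running instances. The sketch below is a minimal in-memory version of that idea. The class, method names, and addresses are hypothetical, and a real mesh also tracks instance health and keeps the registry up to date automatically.

```python
class ServiceRegistry:
    """Map service names to the addresses of their running instances."""

    def __init__(self):
        self._instances = {}

    def register(self, name, address):
        self._instances.setdefault(name, set()).add(address)

    def deregister(self, name, address):
        self._instances.get(name, set()).discard(address)

    def resolve(self, name):
        """Return all known addresses for a service, in a stable order."""
        addresses = sorted(self._instances.get(name, ()))
        if not addresses:
            raise LookupError(f"no instances registered for {name!r}")
        return addresses
```

A caller only needs the logical name (for example `registry.resolve("payments")`), so instances can come and go without any change to the calling service.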
Service meshes also allow independent scaling of services through easy and transparent load balancing between service replicas. In addition, they can provide load balancing algorithms ranging from simple round-robin balancing to more sophisticated algorithms, such as weighted or least requests.
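Two of the algorithms mentioned above can be sketched in a few lines each. These are simplified illustrations with hypothetical class names. Production proxies add weighting, health checks, and concurrency safety on top.

```python
import itertools


class RoundRobin:
    """Hand out replicas in a fixed rotation."""

    def __init__(self, replicas):
        self._cycle = itertools.cycle(list(replicas))

    def pick(self):
        return next(self._cycle)


class LeastRequests:
    """Send each request to the replica with the fewest in-flight requests."""

    def __init__(self, replicas):
        self.in_flight = {replica: 0 for replica in replicas}

    def pick(self):
        replica = min(self.in_flight, key=self.in_flight.get)
        self.in_flight[replica] += 1  # the request is now in flight
        return replica

    def done(self, replica):
        self.in_flight[replica] -= 1  # the request has completed
```

Round-robin works well when requests cost roughly the same, while least-requests adapts better when some requests are much slower than others.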
Routing is another area where service meshes can benefit applications. Simpler architectures and processes don't require sophisticated routing. However, as platforms grow, this need emerges. Practices like A/B testing or canary deployments require that platforms can redirect requests to specific services (or service versions).
By using a service mesh, platforms can redirect users to specific service versions or direct requests based on specific headers or other criteria.
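A routing rule of this kind might combine an explicit header override with a percentage-based canary split, as in the sketch below. The `x-canary` header, the version labels, and the function name are all hypothetical. In a real mesh you would declare such rules as configuration rather than write them in code.

```python
import hashlib


def choose_version(headers, user_id, canary_percent=10):
    """Decide which service version should handle a request."""
    # Rule 1: an explicit header always routes to the canary version
    if headers.get("x-canary") == "true":
        return "v2"
    # Rule 2: hash the user ID into a bucket from 0 to 99, so the same
    # user consistently lands on the same version across requests
    digest = hashlib.sha256(user_id.encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return "v2" if bucket < canary_percent else "v1"
```

Raising `canary_percent` step by step shifts more users to the new version, and setting it to 0 rolls everyone back without a redeploy.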
Distributed systems need to handle fault-tolerance scenarios. When an application crashes, it can cause cascading failures unless the system handles such scenarios gracefully. Service meshes improve system reliability by taking on techniques like circuit breaking, so that individual services don't have to implement them.
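For readers unfamiliar with circuit breaking, here is a minimal sketch of the pattern. After a number of consecutive failures the breaker "opens" and rejects calls immediately, then allows a trial request once a cool-down period has passed. The thresholds and names are hypothetical, and mesh implementations typically track failures per upstream host with more nuanced rules.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected without contacting the upstream."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after  # cool-down in seconds
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open, request rejected")
            # Cool-down elapsed: fall through and allow one trial request
        try:
            result = func()
        except Exception:
            self.failures += 1
            # A failed trial request, or too many failures, (re)opens the circuit
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count
        self.failures = 0
        self.opened_at = None
        return result
```

Failing fast while the circuit is open gives a struggling service time to recover instead of burying it under retries, which is how the pattern limits cascading failures.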
By splitting services into microservices, engineers now must collect data from different sources to understand overall application behavior. Collecting data consistently from what could be hundreds of distributed services is no small task. In addition, that data needs to be correlated so that a system can be observable.
Service meshes handle this concern by collecting data about individual services, as well as the interactions between them, improving observability.
Secure communication is another major concern in distributed systems. Splitting applications into multiple services increases the attack surface, potentially exposing applications to multiple attack vectors. Requests need to be authenticated, authorized, and encrypted. A service mesh can fulfill these requirements by employing techniques like role-based access control (RBAC), thereby managing secure communication at the platform level. It would be impractical and error-prone to address all of these concerns at the service level alone, and doing so would reduce much of the value of distributed systems.
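To illustrate the RBAC idea, the sketch below checks whether a calling service holds a role that permits a given request. The service names, roles, and permission tables are entirely hypothetical. In a mesh, the caller's identity would usually come from its mTLS certificate, and the rules would be declared as policy rather than hard-coded.

```python
# Hypothetical bindings: which roles each calling service holds
ROLE_BINDINGS = {
    "cart": {"shopper"},
    "payments-worker": {"billing"},
}

# Hypothetical permissions: which (target service, method) pairs a role allows
ROLE_PERMISSIONS = {
    "shopper": {("stock", "GET")},
    "billing": {("payments", "GET"), ("payments", "POST")},
}


def is_allowed(source, target, method):
    """Return True if any role held by `source` permits the request."""
    roles = ROLE_BINDINGS.get(source, set())
    return any(
        (target, method) in ROLE_PERMISSIONS.get(role, set())
        for role in roles
    )
```

Note that anything not explicitly permitted is denied, which matches the zero trust stance of never trusting a caller by default.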
Service meshes alleviate this complexity by offloading these concerns to the platform level. Services can focus on business logic while the service mesh takes care of the common system-level concerns.
Service Mesh vs. Microservices
What's the difference between service mesh and microservices?
Microservices refers to an application made up of multiple interconnected and loosely coupled services that work together to deliver application functionality (as opposed to a single monolithic system).
Service mesh is a technology pattern that can be applied to a microservice-based system to manage networked communication between services. With a service mesh, the networking functionality is decoupled from the services' application logic, which means it can be managed independently.
In a microservice architecture, an application is broken up into multiple loosely coupled services that communicate over a network. Each microservice is responsible for a discrete element of business logic. For example, an online shopping system might include individual services to handle stock control, shopping cart management, and payments.
Microservices provide several advantages over a traditional, monolithic design.
As each service is developed and deployed independently, teams can embrace the benefits of agile practices and roll out updates more frequently. Individual services can be scaled independently, and if one service fails, it does not take the rest of the system down with it.
Manage Network Communication
Service mesh was introduced as a way of managing the communication between the services in a microservice-based system.
As the individual services are often written in different languages, implementing network logic as part of each service can duplicate effort. Even if the same code is reused by different microservices, there's a risk of inconsistencies, as changes must be prioritized and implemented by each team alongside enhancements to the microservice's core functionality.
Just as a microservice architecture enables multiple teams to work concurrently on different services and deploy them separately, using a service mesh enables those teams to focus on delivering business logic and not concern themselves with the implementation of networking functionality. With a service mesh, network communication between services within a microservice-based system is implemented and managed consistently.
How to Implement Service Mesh
Setting up a service mesh requires a suitable proxy to be deployed to each instance of the services that need to communicate across the network and a separate control plane element to be installed to configure and manage those proxies.
In this article, we've considered how distributed systems introduce multiple challenges that applications need to address. However, addressing these challenges at the individual service level can be complex, time-consuming, error-prone, and redundant.
A service mesh provides common solutions to these challenges, and these solutions are developed and maintained at the platform level. The service mesh also benefits applications by contributing to improved observability, reliability, and security.
Kong Mesh is an enterprise service mesh from Kong that is based on Kuma and built on top of Envoy. It's easy to set up and scale, and it supports various environments. It addresses the issues described in this article and allows applications to focus on differentiating business features.