Debugging Applications in Production with Service Mesh
As an application developer, have you ever had to troubleshoot an issue that only happens in production? Bugs can appear once your application is released into the wild, and they can be extremely difficult to debug when you cannot reproduce them without production data. In this blog, I am going to show you how to safely send production traffic to development versions of your applications deployed on a service mesh, to help you better debug and build production-proof releases. For this how-to, we will use the Kuma service mesh and a Kubernetes cluster.
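Before diving in, here is the core idea in miniature. The sketch below (plain Python with hypothetical handler functions, not the actual Envoy mechanism) shows what traffic mirroring means: every request is answered by the production handler, a copy is forwarded to the development handler, and the development response is discarded so callers are never affected by it.

```python
def handle_production(request: dict) -> dict:
    # Stand-in for the production service.
    return {"status": 200, "body": f"prod saw {request['path']}"}

def handle_development(request: dict, log: list) -> dict:
    # Stand-in for the development release candidate; it may log extra
    # debug detail or even misbehave without impacting production callers.
    log.append(f"dev saw {request['path']}")
    return {"status": 500, "body": "dev bug!"}

def mirror(request: dict, dev_log: list) -> dict:
    primary = handle_production(request)
    try:
        handle_development(dict(request), dev_log)  # copy; response ignored
    except Exception:
        pass  # mirror failures must never reach the caller
    return primary  # only the production response is returned

dev_log = []
response = mirror({"path": "/requests"}, dev_log)
print(response["status"])  # 200: the production status, whatever dev did
print(dev_log)             # the dev copy still saw the real request
```

The key property is that the caller's response depends only on the production handler; the mirrored copy is fire-and-forget.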
To begin, let’s install the necessary prerequisite software. First, connect to your Kubernetes cluster and create and label the namespaces for the applications. Then clone the git repository containing the files for this tutorial, change into the project directory, and install Kuma:
% kubectl create ns myblog
% kubectl create ns kong
% kubectl label namespaces myblog kong kuma.io/sidecar-injection=enabled --overwrite=true
namespace/myblog labeled
namespace/kong labeled
% git clone https://github.com/Kong/blog-traffic-mirror.git
% cd blog-traffic-mirror
blog-traffic-mirror % curl -L https://kuma.io/installer.sh | VERSION=2.0.0 sh -
blog-traffic-mirror % export PATH=$PATH:./kuma-2.0.0/bin
blog-traffic-mirror % kumactl install control-plane | kubectl apply -f -
blog-traffic-mirror % kubectl get po -n kuma-system -w
NAME READY STATUS RESTARTS AGE
kuma-control-plane-7d567c7fb9-xs2fx 0/1 Running 0 2s
kuma-control-plane-7d567c7fb9-xs2fx 1/1 Running 0 11s
^C
Note: At the time of this writing, Kuma is at version 2.0.0, so you may want to change VERSION to reflect the latest version.
Let’s verify Kuma installed successfully by port forwarding and sending a request to Kuma’s HTTP API. In a separate terminal window, execute the following commands:
blog-traffic-mirror % kubectl port-forward svc/kuma-control-plane -n kuma-system 5681:5681 &
blog-traffic-mirror % curl http://localhost:5681/
{
"hostname": "kuma-control-plane-76d76dbc4b-f556m",
"tagline": "Kuma",
"version": "2.0.0",
"instanceId": "kuma-control-plane-76d76dbc4b-f556m-0c77",
"clusterId": "0c4430a0-d2dc-4cfe-bb65-21bd734013dc",
"gui": "The gui is available at /gui"
}
Now let’s install all of the applications we are going to use. We are going to use Kong Gateway for ingress traffic into the mesh and mockbin as the application:
blog-traffic-mirror % kubectl apply -f https://bit.ly/kong-ingress-dbless
blog-traffic-mirror % kubectl get po -n kong -w
NAME READY STATUS RESTARTS AGE
ingress-kong-7c4b795d5d-p5gln 0/3 Init:0/1 0 3s
ingress-kong-7c4b795d5d-p5gln 0/3 PodInitializing 0 5s
ingress-kong-7c4b795d5d-p5gln 0/3 Running 0 6s
ingress-kong-7c4b795d5d-p5gln 1/3 Running 0 10s
ingress-kong-7c4b795d5d-p5gln 2/3 Running 0 10s
ingress-kong-7c4b795d5d-p5gln 3/3 Running 0 20s
^C
Now verify Kong Gateway is a service on the mesh:
blog-traffic-mirror % curl http://localhost:5681/meshes/default/service-insights/kong-proxy_kong_svc_80
{
"type": "ServiceInsight",
"mesh": "default",
"name": "kong-proxy_kong_svc_80",
"creationTime": "2022-11-03T16:42:32Z",
"modificationTime": "2022-11-03T16:42:32Z",
"status": "online",
"dataplanes": {
"total": 1,
"online": 1
}
}
Next, we will install two versions of mockbin. The first version is “production”, and the second version will be our development release candidate:
blog-traffic-mirror % kubectl apply -f mockbin.yaml -n myblog
blog-traffic-mirror % kubectl get po -n myblog -w
NAME READY STATUS RESTARTS AGE
mockbin-588cfb4499-hlg6g 0/2 Init:0/1 0 1s
mockbin-v2-85f4d85655-zndt2 0/2 Init:0/1 0 2s
mockbin-588cfb4499-hlg6g 0/2 PodInitializing 0 3s
mockbin-v2-85f4d85655-zndt2 0/2 PodInitializing 0 4s
mockbin-v2-85f4d85655-zndt2 1/2 Running 0 5s
mockbin-588cfb4499-hlg6g 1/2 Running 0 4s
mockbin-v2-85f4d85655-zndt2 2/2 Running 0 10s
mockbin-588cfb4499-hlg6g 2/2 Running 0 10s
^C
Verify both versions of the mockbin service are part of the mesh:
blog-traffic-mirror % curl http://localhost:5681/meshes/default/service-insights/
{
...
{
"type": "ServiceInsight",
"mesh": "default",
"name": "mockbin-v2_myblog_svc_3000",
"creationTime": "2022-11-11T16:05:15Z",
"modificationTime": "2022-11-11T16:05:15Z",
"status": "online",
"dataplanes": {
"total": 1,
"online": 1
},
"addressPort": "mockbin-v2_myblog_svc_3000.mesh:3000"
},
{
"type": "ServiceInsight",
"mesh": "default",
"name": "mockbin_myblog_svc_3000",
"creationTime": "2022-11-11T16:05:15Z",
"modificationTime": "2022-11-11T16:05:15Z",
"status": "online",
"dataplanes": {
"total": 1,
"online": 1
},
"addressPort": "mockbin_myblog_svc_3000.mesh:3000"
}
],
"next": null
}
After our applications are installed, we need a way to observe traffic, so let’s install the built-in Kuma observability stack:
blog-traffic-mirror % kumactl install observability | kubectl apply -f -
blog-traffic-mirror % kubectl get po -n mesh-observability -w
NAME READY STATUS RESTARTS AGE
...
grafana-597d56c4c9-h4wnj 2/2 Running 0 21s
loki-promtail-zzkdh 1/1 Running 0 20s
loki-promtail-922d5 1/1 Running 0 20s
prometheus-server-675bb58486-58q6p 2/3 Running 0 25s
prometheus-server-675bb58486-58q6p 3/3 Running 0 30s
loki-0 1/1 Running 0 70s
^C
After the reference observability stack is installed, we will configure the mesh metrics backend and review Kuma’s built-in Grafana dashboards.
Run the below commands to configure the metrics backend on the default mesh and port forward to Grafana:
blog-traffic-mirror % kubectl apply -f mesh.yaml
blog-traffic-mirror % kubectl port-forward svc/grafana 8080:80 -n mesh-observability &
In order to use metrics in Kuma, you must first enable a metrics “backend”. Setting the backend type to “prometheus” instructs Kuma to expose metrics from every proxy inside the mesh. The below snippet shows the configuration applied to the default mesh:
apiVersion: kuma.io/v1alpha1
kind: Mesh
metadata:
  name: default
spec:
  metrics:
    enabledBackend: prometheus-1
    backends:
      - name: prometheus-1
        type: prometheus
Click on Grafana and log in using the default credentials (admin/admin). After you are logged into the admin console, find the Kuma CP dashboard, and you should see something similar to the below screenshot.
Next, create an Ingress for the mockbin application so it can be called at an external address:
blog-traffic-mirror % kubectl apply -f ingress.yaml -n myblog
ingress.networking.k8s.io/mockbin created
Get the external address of the mockbin service:
blog-traffic-mirror % kubectl get ing -n myblog
NAME CLASS HOSTS ADDRESS PORTS AGE
mockbin <none> * 34.173.171.154 3000 18s
Verify the mockbin service works:
blog-traffic-mirror % curl http://34.173.171.154/mockbin/requests
{
"startedDateTime": "2022-11-11T16:13:45.978Z",
"clientIPAddress": "10.48.2.1",
"method": "GET",
"url": "http://34.122.238.196/requests",
"httpVersion": "HTTP/1.1",
"cookies": {},
"headers": {
"host": "mockbin.myblog.svc.3000.mesh",
"x-forwarded-for": "10.48.2.1",
"x-forwarded-proto": "http",
"x-forwarded-host": "34.122.238.196",
"x-forwarded-port": "80",
"x-forwarded-path": "/mockbin/requests",
"x-forwarded-prefix": "/mockbin",
"x-real-ip": "10.48.2.1",
"user-agent": "curl/7.79.1",
"accept": "*/*",
"x-request-id": "7fa88513-f3bc-4407-b001-5fa020f782b1",
"x-envoy-expected-rq-timeout-ms": "15000",
"x-envoy-internal": "true"
},
"queryString": {},
"postData": {
"mimeType": "application/octet-stream",
"text": "",
"params": []
},
"headersSize": 429,
"bodySize": 0
}
Let’s generate some traffic so our Grafana dashboard has some data:
blog-traffic-mirror % while true; do curl http://34.173.171.154/mockbin/requests; sleep 3; done &
After the above command executes for a bit, find the Kuma Service dashboard in Grafana and select mockbin_myblog_svc_3000 from the “Service” dropdown. You should see data in the Incoming and Outgoing panels:
Next, let’s check version 2 of the mockbin service and make sure no traffic is flowing to that service. Select mockbin-v2_myblog_svc_3000 from the “Service” dropdown:
Now that we have determined no traffic is flowing to version 2, let’s mirror the traffic there:
blog-traffic-mirror % kubectl apply -f traffic-mirror.yaml
Wait for a bit and you should start seeing inbound traffic flowing on the mockbin-v2_myblog_svc_3000 service:
Let’s look at the Kuma policy we just applied. The policy is of type “ProxyTemplate”, a powerful Kuma feature that lets the user send custom configuration directly to the Envoy proxies in the mesh. The below snippet applies the ProxyTemplate to the “mockbin_myblog_svc_3000” service.
apiVersion: kuma.io/v1alpha1
kind: ProxyTemplate
mesh: default
metadata:
  name: traffic-mirror
spec:
  selectors:
    - match:
        kuma.io/service: mockbin_myblog_svc_3000
The below snippet patches the Envoy HTTP Connection Manager by matching certain criteria. In this case, the criteria is an inbound network filter of type “envoy.filters.network.http_connection_manager” tagged with kuma.io/service: mockbin_myblog_svc_3000.
conf:
  imports:
    - default-proxy
  modifications:
    - networkFilter:
        operation: patch
        match:
          name: envoy.filters.network.http_connection_manager
          origin: inbound
          listenerTags:
            kuma.io/service: mockbin_myblog_svc_3000
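To build intuition for what a “patch” modification does, here is a toy sketch in Python (hypothetical data structures, not Kuma’s actual implementation): find the network filter whose name and listener tags match, then merge the supplied value into its configuration, leaving all other filters untouched.

```python
def patch_filters(filters, match_name, match_tags, value):
    # Return a new filter list where every filter matching both the name
    # and all of the required tags has `value` merged into its config.
    patched = []
    for f in filters:
        if f["name"] == match_name and match_tags.items() <= f["tags"].items():
            merged = {**f, "config": {**f.get("config", {}), **value}}
            patched.append(merged)
        else:
            patched.append(f)  # non-matching filters pass through unchanged
    return patched

filters = [
    {"name": "envoy.filters.network.http_connection_manager",
     "tags": {"kuma.io/service": "mockbin_myblog_svc_3000"},
     "config": {"stat_prefix": "mockbin"}},
    {"name": "envoy.filters.network.tcp_proxy",
     "tags": {"kuma.io/service": "other_svc"}},
]

result = patch_filters(
    filters,
    "envoy.filters.network.http_connection_manager",
    {"kuma.io/service": "mockbin_myblog_svc_3000"},
    {"route_config": {"name": "inbound:mockbin_myblog_svc_3000"}},
)
print(result[0]["config"])  # existing config plus the patched-in route_config
```

The merge is additive: the HTTP Connection Manager keeps its existing settings and gains the new route configuration.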
Once Kuma finds the matching network filter, it applies the below value, which creates a route in Envoy. The route defines a virtual host that sends traffic arriving for the domain “mockbin.myblog.svc.3000.mesh” with path prefix “/requests” to the local Envoy “cluster”, i.e. localhost:3000. For more information, see the Envoy documentation on route matching.
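To make the matching concrete, here is a simplified sketch in Python (an approximation, not Envoy’s actual matcher): the request’s Host header is first matched against a virtual host’s domains, then the path is matched against each route’s prefix to pick a cluster.

```python
# A table shaped like the virtual_hosts section of the route config.
VIRTUAL_HOSTS = [
    {
        "domains": ["mockbin.myblog.svc.3000.mesh"],
        "routes": [{"prefix": "/requests", "cluster": "localhost:3000"}],
    }
]

def match_route(host: str, path: str):
    # Domain match first, then the first route whose prefix matches the path.
    for vh in VIRTUAL_HOSTS:
        if host in vh["domains"] or "*" in vh["domains"]:
            for r in vh["routes"]:
                if path.startswith(r["prefix"]):
                    return r["cluster"]
    return None  # no virtual host or route matched

print(match_route("mockbin.myblog.svc.3000.mesh", "/requests"))  # localhost:3000
print(match_route("mockbin.myblog.svc.3000.mesh", "/other"))     # None
```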
In order to send traffic to version 2 of the mockbin service, we need to set an option on the Envoy route called “request_mirror_policies”. The request_mirror_policies option tells Envoy to mirror a configurable fraction of traffic to the “mockbin-v2_myblog_svc_3000” cluster. In this example, we mirror 100% of the traffic. Envoy will not return the response from version 2 to the calling service, which allows us to safely run development code in production:
value: |
  name: envoy.filters.network.http_connection_manager
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
    stat_prefix: mockbin_myblog_svc_3000
    route_config:
      name: inbound:mockbin_myblog_svc_3000
      virtual_hosts:
        - name: mockbin_myblog_svc_3000
          domains: ["mockbin.myblog.svc.3000.mesh"]
          routes:
            - match:
                prefix: "/requests"
              route:
                cluster: localhost:3000
                request_mirror_policies:
                  - cluster: mockbin-v2_myblog_svc_3000
                    runtime_fraction:
                      default_value:
                        numerator: 100
The complete policy is located in the traffic-mirror.yaml file.
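The runtime_fraction semantics can be sketched with a short simulation (plain Python, assuming the Envoy default denominator of HUNDRED, i.e. 100): each request is sampled independently, so numerator: 100 mirrors every request, while a smaller numerator mirrors roughly that percentage.

```python
import random

def should_mirror(numerator: int, denominator: int = 100) -> bool:
    # Each request is independently sampled against the configured fraction.
    return random.randrange(denominator) < numerator

# numerator: 100 mirrors every request.
mirrored = sum(should_mirror(100) for _ in range(1000))
print(mirrored)  # 1000

# numerator: 25 mirrors about a quarter of requests.
sampled = sum(should_mirror(25) for _ in range(100000))
print(sampled)  # roughly 25000
```

Lowering the numerator is a practical way to limit the load placed on the development version while still feeding it real production traffic.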
Now that we have production traffic flowing to a development version of the application, we can troubleshoot problems that are difficult to reproduce without production data.
Let’s see this in action. First, get the pod names:
blog-traffic-mirror % kubectl get po -n myblog
NAME READY STATUS RESTARTS AGE
mockbin-588cfb4499-hlg6g 2/2 Running 0 13m
mockbin-v2-85f4d85655-zndt2 2/2 Running 0 13m
Let’s inspect the logs for version 1 of our mockbin service:
blog-traffic-mirror % kubectl logs mockbin-588cfb4499-hlg6g -n myblog -c mockbin
GET /requests 200 11.614 ms - 861
GET /requests 200 2.040 ms - 861
GET /requests 200 1.552 ms - 870
GET /requests 200 1.218 ms - 861
GET /requests 200 1.597 ms - 870
GET /requests 200 1.071 ms - 861
GET /requests 200 0.989 ms - 870
GET /requests 200 1.174 ms - 861
GET /requests 200 1.201 ms - 870
GET /requests 200 1.227 ms - 861
Now, let’s inspect the logs for version 2 of our mockbin service. In version 2, we have inserted additional logging information, which, in conjunction with production data, allows us to reproduce and identify the source of the problem:
blog-traffic-mirror % kubectl logs mockbin-v2-85f4d85655-zndt2 -n myblog -c mockbin
************ Hello from mockbin-traffic-mirror ************
GET /requests 200 1.043 ms - 809
************ Hello from mockbin-traffic-mirror ************
GET /requests 200 1.252 ms - 800
************ Hello from mockbin-traffic-mirror ************
GET /requests 200 1.150 ms - 800
************ Hello from mockbin-traffic-mirror ************
GET /requests 200 1.464 ms - 809
************ Hello from mockbin-traffic-mirror ************
GET /requests 200 1.090 ms - 800
************ Hello from mockbin-traffic-mirror ************
GET /requests 200 1.626 ms - 800
************ Hello from mockbin-traffic-mirror ************
GET /requests 200 1.178 ms - 800
************ Hello from mockbin-traffic-mirror ************
GET /requests 200 1.368 ms - 800
Congratulations, you have successfully and safely mirrored production application traffic into a development version of the application to help you better debug and build production-proof releases.