February 14, 2019

Observability For Your Microservices Using Kong and Kubernetes

Harry Bagdi

Read the latest version: APM With Prometheus and Grafana on Kubernetes Ingress

Archived post below.

In the modern SaaS world, observability is key to running software reliably, managing risks and deriving business value from the code that you're shipping. To measure how your service is performing, you record Service Level Indicators (SLIs) or metrics, and alert whenever performance, correctness or availability is affected.

Very broadly, application monitoring falls into two categories: white box and black box monitoring. These terms mean exactly what they sound like.

White box monitoring provides visibility into the internals of your applications. It can include things like thread stats, GC stats, internal performance counters and timing data. White box monitoring usually requires instrumenting your application, meaning it requires some modifications to your code. But it can be extremely helpful in figuring out the root cause of an outage or a bottleneck. The (sharply dropping) cost of instrumenting your applications pays off very quickly with an increased understanding of how your application performs in a variety of scenarios and allows you to make reasonable trade-offs with concrete data.

Black box monitoring means treating the application as a black box, sending it various inputs and observing it from the outside to see how it responds. Because it doesn't require instrumenting your code and can be implemented from outside your application, black box monitoring can give a simple picture of performance that can be standardized across multiple applications. When implemented in a microservice architecture, black box monitoring can give an operator a similar view of services as the services have of each other.

The two types of monitoring serve different purposes, and it's important to include both of them in your systems. In this post, we outline how to implement black box monitoring, with the understanding that combining both types of monitoring will give you a complete picture of your application health. Kong allows users to easily implement black box monitoring because it sits between the consumers of a service and the service itself. This allows it to collect the same black box metrics for every service it sits in front of, providing uniformity and preventing repetition.
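To make the idea concrete, here is a minimal Python sketch of a proxy-style wrapper that records status codes and latency for any service it fronts, without changing the service's code. It is only an illustration of the principle, not how Kong is implemented; MeasuringProxy, billing_handler and comments_handler are hypothetical names.

```python
import time
from collections import defaultdict


class MeasuringProxy:
    """Records the same black box metrics for every service it fronts."""

    def __init__(self):
        self.status_counts = defaultdict(int)   # (service, status) -> request count
        self.latencies_ms = defaultdict(list)   # service -> observed latencies in ms

    def forward(self, service, handler, request):
        # Time the call from the outside; the handler itself is not modified.
        start = time.perf_counter()
        status, body = handler(request)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        self.status_counts[(service, status)] += 1
        self.latencies_ms[service].append(elapsed_ms)
        return status, body


# Hypothetical stand-ins for real upstream services.
def billing_handler(request):
    return 200, "billing ok"


def comments_handler(request):
    return 500, "comments failed"


proxy = MeasuringProxy()
proxy.forward("billing", billing_handler, "/status")
proxy.forward("billing", billing_handler, "/status")
proxy.forward("comments", comments_handler, "/status")
```

Because the measurement lives in the wrapper, adding a fourth service would produce the same metrics with no extra instrumentation work.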

In this tutorial, we will explain how you can leverage the Prometheus monitoring stack in conjunction with Kong to get black box metrics and observability for all of your services. We chose Prometheus because we use it quite a bit, but this guide can be applied to other solutions like StatsD, Datadog, Graphite, InfluxDB, etc. We will be deploying all of our components on Kubernetes. Buckle up!


We will be setting up the following architecture as part of this guide.

Here, on the right, we have a few services running, which we would like to monitor. We also have Prometheus, which collects and indexes monitoring data, and Grafana, which graphs the monitoring data.

We're going to deploy Kong as a Kubernetes Ingress Controller, meaning we'll be configuring Kong using Kubernetes resources and Kong will route all traffic inbound for our application from the outside world. It is also possible to set up routing rules in Kong to proxy traffic.


You'll need a few things before we start setting up our services:

  • Kubernetes cluster: You can use Minikube or a GKE cluster for the purpose of this tutorial. We run a Kubernetes cluster v1.18.x.
  • Helm: We will be using Helm to install all of our components. The helm CLI should be available on your workstation. The commands in this guide use Helm 3 syntax (for example, --create-namespace), so Tiller is not required on your cluster. You can follow Helm's quickstart guide to set up helm.

Once you have Kubernetes and Helm set up, you're good to proceed.

Caution: Some settings in this guide are tweaked to keep this guide simple. These settings are not meant for production use.

Install Prometheus and Grafana


We will install Prometheus with a scrape interval of 10 seconds to have fine-grained data points for all metrics. We'll install both Prometheus and Grafana in a dedicated "monitoring" namespace.

To install Prometheus, execute the following:

helm install prometheus stable/prometheus --namespace monitoring --create-namespace --values --version 11.5.0


Grafana is installed with the following values for the Helm chart (see comments for explanation):

To install Grafana, go ahead and execute the following:

helm install grafana stable/grafana --namespace monitoring --values --version 5.3.0

Set Up Kong

Next, we will install Kong, if you don't already have it installed in your Kubernetes cluster.
We chose to use the Kong Ingress Controller for this purpose since it allows us to configure Kong using Kubernetes itself. You can also choose to install Kong as an application and configure it using Kong's Admin API.

Run the following commands to create the Kong namespace and then install Kong using the helm chart:

helm install kong kong/kong --namespace kong --create-namespace --set ingressController.installCRDs=false --values --version 1.7.0

The helm chart values we use here are:

It will take a few minutes to get all pods in the running state as images are pulled down and components start up.

Enable Prometheus Plugin in Kong

Next, once Kong is running, we will create a Custom Resource in Kubernetes to enable the Prometheus plugin in Kong. This configures Kong to collect metrics for all requests proxied via Kong and expose them to Prometheus.

Execute the following to enable the Prometheus plugin for all requests:

Set Up Port Forwards

Now, we will gain access to the components we just deployed. In a production environment, you would have a Kubernetes Service with external IP or load balancer, which would allow you to access Prometheus, Grafana and Kong. For demo purposes, we will set up port-forwarding using kubectl to get access. Please do not do this in production.

Open a new terminal and execute the following commands:

Access Grafana Dashboard

To access Grafana, you need to get the password for the admin user.

Execute the following to read the password and take a note of it:

Now, browse to http://localhost:3000 and log in with the username "admin" and the password you just noted. You should be logged in to Grafana and find that Kong's Grafana Dashboard is sitting there waiting for you.

Set Up Services

Now that we have all the monitoring components set up, we will spin up some services for demo purposes and set up Ingress routing for them.

Install Services

We will set up three services: billing, invoice and comments.
Execute the following to spin these services up:

kubectl apply -f

Install Ingress for the Services

Next, once the services are up and running, we will create Ingress routing rules in Kubernetes. This will configure Kong to proxy traffic destined for these services correctly.

Execute the following:

Let's Create Some Traffic

We're done configuring our services and proxies. Time to see if our setup works or catches fire.
Execute the following in a new terminal:

Since we have already enabled the Prometheus plugin in Kong to collect metrics for requests proxied via Kong, we should see metrics coming through in the Grafana dashboard.

You should be able to see metrics related to the traffic flowing through our services.
Try tweaking the above script to send different traffic patterns and see how the metrics change.
The upstream services are httpbin instances, meaning you can use a variety of endpoints to shape your traffic.
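As one way to shape traffic, the Python sketch below builds a list of httpbin /status/<code> URLs per service and requests each of them a number of times. It rests on assumptions that are not stated in this post: that the Kong proxy is reachable at http://localhost:8000 and that the Ingress rules route /billing, /invoice and /comments to the matching services. Adjust both to your setup. build_urls and send_traffic are hypothetical helper names.

```python
import urllib.error
import urllib.request


def build_urls(base_url, plan):
    """Expand {service: [status codes]} into httpbin-style /status/<code> URLs."""
    return [
        f"{base_url}/{service}/status/{code}"
        for service, codes in plan.items()
        for code in codes
    ]


def send_traffic(urls, rounds):
    """Request every URL `rounds` times and count the status codes seen."""
    seen = {}
    for _ in range(rounds):
        for url in urls:
            try:
                with urllib.request.urlopen(url, timeout=5) as response:
                    status = response.status
            except urllib.error.HTTPError as err:
                # urllib raises HTTPError for 4xx and 5xx responses.
                status = err.code
            seen[status] = seen.get(status, 0) + 1
    return seen


# Assumed base URL and paths; see the note above.
plan = {"billing": [200, 501], "invoice": [201, 404], "comments": [200]}
urls = build_urls("http://localhost:8000", plan)
```

Calling send_traffic(urls, rounds=100) sends the requests and returns a count per status code. Changing the codes in plan is a quick way to raise or lower the error rate you see in Grafana.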

Metrics Collected

Request Latencies of Various Services

Kong collects latency data of how long your services take to respond to requests.
One can use this data to alert the on-call engineer if the latency goes beyond a certain threshold. For example, let's say you have an SLA that your APIs will respond with a latency of less than 20 milliseconds for 95% of the requests. You could configure Prometheus to alert based on the following query:

histogram_quantile(0.95, sum(rate(kong_latency_bucket{type="request"}[1m])) by (le,service)) > 20

The query calculates the 95th percentile of the total request latency (or duration) for all of your services and alerts you if it is more than 20 milliseconds. The "type" label in this query is "request", which tracks the latency added by Kong and the service. You can switch this to "upstream" to track latency added by the service only. Prometheus is really flexible and well documented, so we won't go into details of setting up alerts here, but you'll be able to find them in the Prometheus documentation.
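The percentile returned by histogram_quantile is an estimate: Prometheus finds the bucket that contains the requested rank and interpolates linearly inside it. The Python sketch below mirrors that documented behaviour in simplified form. The bucket bounds and counts are made-up values for illustration, not Kong's actual buckets or real measurements.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    A simplified version of the linear interpolation that Prometheus
    documents for histogram_quantile(); not every edge case is covered.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The rank falls in the +Inf bucket: report the highest finite bound.
                return prev_bound
            # Interpolate linearly inside the bucket that contains the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical cumulative buckets: (upper bound in ms, requests at or below it).
sample_buckets = [(5, 40), (10, 70), (20, 90), (50, 98), (100, 100), (math.inf, 100)]
p95 = histogram_quantile(0.95, sample_buckets)
alert_fires = p95 > 20
```

With these sample numbers only 90% of requests finish within 20 ms, so the estimated 95th percentile lands in the 20 to 50 ms bucket and the alert condition is true. The estimate can only be as precise as the bucket boundaries allow.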

Kong Proxy Latency

Kong also collects metrics about its own performance. The following query is similar to the previous one but gives us insight into the latency added by Kong:

histogram_quantile(0.90, sum(rate(kong_latency_bucket{type="kong"}[1m])) by (le,service)) > 2

Error Rates

Another important metric to track is the rate of errors and requests your services are serving. The timeseries kong_http_status collects HTTP status code metrics for each service.

This metric can help you track the rate of errors for each of your services:

sum(rate(kong_http_status{code=~"5[0-9]{2}"}[1m])) by (service)

You can also calculate the percentage of requests in any duration that are errors. Try to come up with a query to derive that result.
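If you want to check your answer, one possible approach is to divide the 5xx rate by the total rate for each service and multiply by 100. The Python sketch below shows that arithmetic on made-up per-second rates; error_percentage is a hypothetical helper and the numbers are not real measurements.

```python
from collections import defaultdict


def error_percentage(rates):
    """Share of 5xx responses per service, as a percentage.

    rates: {(service, status_code): requests_per_second}
    Equivalent in spirit to dividing the summed 5xx rate by the summed
    rate of all status codes, grouped by service.
    """
    total = defaultdict(float)
    errors = defaultdict(float)
    for (service, code), rate in rates.items():
        total[service] += rate
        if 500 <= int(code) <= 599:
            errors[service] += rate
    return {
        service: 100.0 * errors[service] / total[service]
        for service in total
        if total[service] > 0
    }


# Hypothetical per-second rates, keyed like the kong_http_status labels.
sample_rates = {
    ("billing", "200"): 9.0,
    ("billing", "501"): 1.0,
    ("invoice", "201"): 4.0,
    ("invoice", "404"): 4.0,
}
percentages = error_percentage(sample_rates)
```

Note that the 404 responses for invoice do not count as errors here, because the query in this section only matches 5xx codes.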

Please note that all HTTP status codes are indexed, meaning you could use the data to learn about your typical traffic pattern and identify problems. For example, a sudden rise in 404 response codes could indicate that client code is requesting an endpoint that was removed in a recent deploy.

Request Rate and Bandwidth

One can derive the total request rate for each of your services or across your Kubernetes cluster using the kong_http_status timeseries.
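Following the pattern of the error-rate query, a query such as sum(rate(kong_http_status[1m])) by (service) is one way to get the per-service request rate. The Python sketch below shows the underlying arithmetic on two made-up counter snapshots taken 60 seconds apart. It is deliberately simplified and ignores the counter resets and extrapolation that Prometheus' rate() handles; request_rate is a hypothetical helper.

```python
def request_rate(earlier, later, window_seconds):
    """Per-service requests per second from two counter snapshots.

    earlier / later: {(service, status_code): cumulative request count}
    """
    rates = {}
    for (service, code), count in later.items():
        increase = count - earlier.get((service, code), 0)
        rates[service] = rates.get(service, 0.0) + increase / window_seconds
    return rates


# Hypothetical counter values, keyed like the kong_http_status labels.
earlier = {("billing", "200"): 1000, ("billing", "501"): 50, ("invoice", "201"): 400}
later = {("billing", "200"): 1600, ("billing", "501"): 110, ("invoice", "201"): 700}

per_service = request_rate(earlier, later, window_seconds=60)
cluster_total = sum(per_service.values())
```

Summing the per-service values gives the rate across everything Kong proxies, which corresponds to dropping the by (service) grouping in the query.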

Another metric that Kong keeps track of is the amount of network bandwidth (kong_bandwidth) being consumed. This gives you an estimate of how request/response sizes correlate with other behaviours in your infrastructure.

With these metrics, you should be able to gain quite a bit of insight and implement strategies like the RED method (Requests, Errors and Durations) for monitoring.

And that's it. You now have metrics for the services running inside your Kubernetes cluster and have much more visibility into your applications, which you gained using only configuration. Since you now have Kong set up in your Kubernetes cluster, you might want to check out its other plugin-enabled functionality: authentication, logging, transformations, load balancing, circuit-breaking, and much more, which you can now easily use with very little additional setup.

If you have any questions, please reach out to Kong's helpful community members via Kong Nation.

Happy Konging!

Thanks to Kevin Chen, Judith Malnick, Robert Paprocki and Marco Palladino for reviewing drafts of this post!