
Observability For Your Microservices Using Kong and Kubernetes

Read the latest version: APM With Prometheus and Grafana on Kubernetes Ingress

Archived post below.

In the modern SaaS world, observability is key to running software reliably, managing risks and deriving business value out of the code that you’re shipping. To measure how your service is performing, you record Service Level Indicators (SLIs) or metrics, and alert whenever performance, correctness or availability is affected.

Very broadly, application monitoring falls into two categories: white box and black box monitoring. These terms mean exactly what they sound like.

White box monitoring provides visibility into the internals of your applications. It can include things like thread stats, GC stats, internal performance counters and timing data. White box monitoring usually requires instrumenting your application, meaning it requires some modifications to your code. But it can be extremely helpful in figuring out the root cause of an outage or a bottleneck. The (sharply dropping) cost of instrumenting your applications pays off very quickly with an increased understanding of how your application performs in a variety of scenarios, and it allows you to make reasonable trade-offs with concrete data.

Black box monitoring means treating the application as a black box, sending it various inputs and observing it from the outside to see how it responds. Because it doesn’t require instrumenting your code and can be implemented from outside your application, black box monitoring can give a simple picture of performance that can be standardized across multiple applications. When implemented in a microservice architecture, black box monitoring can give an operator a similar view of services as the services have of each other.

The two types of monitoring serve different purposes and it’s important to include both of them in your systems. In this blog, we outline how to implement black box monitoring, with the understanding that combining both types of monitoring will give you a complete picture of your application health. Kong allows users to easily implement black box monitoring because it sits between the consumers of a service and the service itself. This allows it to collect the same black box metrics for every service it sits in front of, providing uniformity and preventing repetition.

In this tutorial, we will explain how you can leverage the Prometheus monitoring stack in conjunction with Kong, to get black box metrics and observability for all of your services. We choose Prometheus, since we use it quite a bit, but this guide can be applied to other solutions like StatsD, Datadog, Graphite, InfluxDB etc. We will be deploying all of our components on Kubernetes. Buckle up!


We will be setting up the following architecture as part of this guide.

Here, on the right, we have a few services running, which we would like to monitor. We also have Prometheus, which collects and indexes monitoring data, and Grafana, which graphs the monitoring data.

We’re going to deploy Kong as a Kubernetes Ingress Controller, meaning we’ll be configuring Kong using Kubernetes resources and Kong will route all traffic inbound for our application from the outside world. It is also possible to set up routing rules in Kong to proxy traffic.


You’ll need a few things before we start setting up our services:

  • Kubernetes cluster: You can use Minikube or a GKE cluster for the purpose of this tutorial. We run Kubernetes v1.18.x.
  • Helm: We will be using Helm to install all of our components. The commands in this guide use Helm 3 syntax (for example, --create-namespace), so you only need the helm CLI on your workstation; Helm 3 does not require Tiller in the cluster. You can follow Helm’s quickstart guide to set up helm.

Once you have Kubernetes and Helm set up, you’re good to proceed.
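If you want to confirm the prerequisites before going further, a small check like the one below can help. This is only a sketch: the helper names require_tool and check_prerequisites are our own for illustration and are not part of kubectl or Helm.

```shell
# Hypothetical helper: report whether a command-line tool is on the PATH.
require_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# Check every tool passed as an argument; return non-zero if any is missing.
check_prerequisites() {
  status=0
  for tool in "$@"; do
    require_tool "$tool" || status=1
  done
  return "$status"
}
```

For this guide you would call it as: check_prerequisites kubectl helm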

Caution: Some settings in this guide are tweaked to keep this guide simple. These settings are not meant for Production usage.

Install Prometheus and Grafana


We will install Prometheus with a scrape interval of 10 seconds to have fine grained data points for all metrics. We’ll install both Prometheus and Grafana in a dedicated ‘monitoring’ namespace.
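The install commands in this guide reference charts as stable/prometheus, stable/grafana and kong/kong, which assumes the matching chart repositories are already registered with your Helm client. If they are not, something like the following sketch should register them. The function name add_helm_repos is hypothetical, and the repository URLs are our assumption (the archived Helm stable charts and Kong's chart repository); check them against your own setup.

```shell
# Register the chart repositories referenced by the install commands.
# Assumption: "stable" is the archived Helm stable chart repository and
# "kong" is Kong's own chart repository.
add_helm_repos() {
  helm repo add stable https://charts.helm.sh/stable &&
  helm repo add kong https://charts.konghq.com &&
  helm repo update
}
```

Run add_helm_repos once before the first helm install.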

To install Prometheus, execute the following:
helm install prometheus stable/prometheus --namespace monitoring --create-namespace --values https://bit.ly/2RgzDtg --version 11.5.0


Grafana is installed with the following values for the Helm chart (see comments for explanation):

persistence:
  enabled: true  # enable persistence using Persistent Volumes
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:  # configure Grafana to read metrics from Prometheus
    - name: Prometheus
      type: prometheus
      url: http://prometheus-server # Since Prometheus is deployed in
      access: proxy    # the same namespace, this resolves
                       # to the Prometheus server we installed previously
      isDefault: true  # The default data source is Prometheus

dashboardProviders:
  dashboardproviders.yaml:
    apiVersion: 1
    providers:
    - name: 'default' # Configure a dashboard provider file to
      orgId: 1        # put the Kong dashboard into.
      folder: ''
      type: file
      disableDeletion: false
      editable: true
      options:
        path: /var/lib/grafana/dashboards/default
dashboards:
  default:
    kong-dash:      # the key name here is arbitrary
      gnetId: 7424  # Install the following Grafana dashboard in the
      revision: 5   # instance: https://grafana.com/dashboards/7424
      datasource: Prometheus

To install Grafana, go ahead and execute the following:
helm install grafana stable/grafana --namespace monitoring --values https://bit.ly/2YllNrj --version 5.3.0

Set Up Kong

Next, we will install Kong, if you don’t already have it installed in your Kubernetes cluster.
We chose to use the Kong Ingress Controller for this purpose since it allows us to configure Kong using Kubernetes itself. You can also choose to install Kong as an application and configure it using Kong’s Admin API.

Run the following command to create the kong namespace and install Kong using the Helm chart:

helm install kong kong/kong --namespace kong --create-namespace --set ingressController.installCRDs=false --values https://bit.ly/kongvalues541 --version 1.7.0

The helm chart values we use here are:

admin:
  useTLS: false     # Metrics for Prometheus are available
readinessProbe:     # on the Admin API. By default, Prometheus
  httpGet:          # scrapes are HTTP and not HTTPS.
    scheme: HTTP    # Admin API should be on TLS in production.
livenessProbe:
  httpGet:
    scheme: HTTP
ingressController:  # enable Kong as an Ingress controller
  enabled: true
podAnnotations:
  prometheus.io/scrape: "true" # Ask Prometheus to scrape the
  prometheus.io/port: "9542"   # Kong pods for metrics

It will take a few minutes to get all pods in the running state as images are pulled down and components start up.
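Rather than polling by hand, you can block until the pods report Ready. The sketch below wraps kubectl wait; the function name wait_for_pods and the 300s default timeout are our own choices. One caveat: with --all, any pod that never becomes Ready (for example a completed Job pod) will make the command run until the timeout.

```shell
# Hypothetical helper: block until every pod in a namespace is Ready.
# $1 = namespace, $2 = optional timeout (defaults to 300s).
wait_for_pods() {
  kubectl wait --namespace "$1" --for=condition=Ready pods --all --timeout="${2:-300s}"
}
```

For example: wait_for_pods monitoring, then wait_for_pods kong.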

Enable Prometheus Plugin in Kong

Next, once Kong is running, we will create a Custom Resource in Kubernetes to enable the Prometheus plugin in Kong. This configures Kong to collect metrics for all requests proxied via Kong and expose them to Prometheus.

Execute the following to enable the Prometheus plugin for all requests:

echo "apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
  labels:
    global: \"true\"
  name: prometheus
plugin: prometheus
" | kubectl apply -f -

Set Up Port Forwards

Now, we will gain access to the components we just deployed. In a production environment, you would have a Kubernetes Service with external IP or load balancer, which would allow you to access Prometheus, Grafana and Kong. For demo purposes, we will set up port-forwarding using kubectl to get access. Please do not do this in production.

Open a new terminal and execute the following commands:

POD_NAME=$(kubectl get pods --namespace monitoring -l "app=prometheus,component=server" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace monitoring port-forward "$POD_NAME" 9090 &

# You can access Prometheus in your browser at localhost:9090

POD_NAME=$(kubectl get pods --namespace monitoring -l "app.kubernetes.io/name=grafana" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace monitoring port-forward $POD_NAME 3000 &

# You can access Grafana in your browser at localhost:3000
# We will get around to getting admin credentials in just a minute.

POD_NAME=$(kubectl get pods --namespace kong -l "app.kubernetes.io/instance=kong" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace kong port-forward $POD_NAME 8000 &

# The Kong proxy port is now available on localhost port 8000.
# We are using the plain-text HTTP proxy for the purpose of this
# demo.
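Because the port-forwards run in the background, they may not be ready the instant the commands return. A small polling helper such as the one below can confirm each endpoint answers before you continue. This is a sketch; wait_for_url is a hypothetical name, and the default of 30 attempts is arbitrary.

```shell
# Hypothetical helper: poll a URL until it answers or attempts run out.
# $1 = URL, $2 = optional number of attempts (defaults to 30).
wait_for_url() {
  url=$1
  attempts=${2:-30}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -s silences progress output, -o /dev/null discards the body.
    # Any HTTP response, even a 404 from Kong, means the forward is up.
    if curl -s -o /dev/null "$url"; then
      echo "ready: $url"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "timed out: $url" >&2
  return 1
}
```

For example: wait_for_url http://localhost:9090, wait_for_url http://localhost:3000 and wait_for_url http://localhost:8000.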

Access Grafana Dashboard

To access Grafana, you need to get the password for the admin user.

Execute the following to read the password and take a note of it:

kubectl get secret --namespace monitoring grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo 

Now, browse to http://localhost:3000 and fill in username as “admin” and password as what you just noted above. You should be logged in to Grafana and find that Kong’s Grafana Dashboard is sitting there waiting for you.

Set Up Services

Now that we have all the components for monitoring set up, we will spin up some services for demo purposes and set up Ingress routing for them.

Install Services

We will set up three services: billing, invoice and comments.
Execute the following to spin these services up:
kubectl apply -f https://bit.ly/2Z9LmuM

Install Ingress for the Services

Next, once the services are up and running, we will create Ingress routing rules in Kubernetes. This will configure Kong to proxy traffic destined for these services correctly.

Execute the following:

echo "apiVersion: configuration.konghq.com/v1
kind: KongIngress
metadata:
  name: strip-path
route:
  strip_path: true
" | kubectl apply -f -

echo "apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    configuration.konghq.com: strip-path
  name: sample-ingresses
spec:
  rules:
  - http:
     paths:
     - path: /billing
       backend:
         serviceName: billing
         servicePort: 80
     - path: /comments
       backend:
         serviceName: comments
         servicePort: 80
     - path: /invoice
       backend:
         serviceName: invoice
         servicePort: 80" | kubectl apply -f -

Let’s Create Some Traffic

We’re done configuring our services and proxies. Time to see if our setup works or catches fire.
Execute the following in a new terminal:

while true;
do
  curl http://localhost:8000/billing/status/200
  curl http://localhost:8000/billing/status/501
  curl http://localhost:8000/invoice/status/201
  curl http://localhost:8000/invoice/status/404
  curl http://localhost:8000/comments/status/200
  curl http://localhost:8000/comments/status/200
  sleep 0.01
done

Since we have already enabled Prometheus plugin in Kong to collect metrics for requests proxied via Kong, we should see metrics coming through in the Grafana dashboard.

You should be able to see metrics related to the traffic flowing through our services.
Try tweaking the above script to send different traffic patterns and see how the metrics change.
The upstream services are httpbin instances, meaning you can use a variety of endpoints to shape your traffic.
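To experiment with different traffic mixes, it can be easier to wrap the requests in a function. The sketch below assumes the Kong proxy is still forwarded to localhost:8000 and relies on httpbin's /status/<code> endpoint used in the loop above; the name send_traffic is hypothetical.

```shell
# Hypothetical helper: send traffic to one service through the Kong proxy.
# $1 = service path (billing, invoice or comments)
# $2 = number of rounds
# remaining arguments = httpbin status codes to request in each round
send_traffic() {
  service=$1
  count=$2
  shift 2
  i=0
  while [ "$i" -lt "$count" ]; do
    for code in "$@"; do
      # -s hides progress output; -o /dev/null discards the response body.
      curl -s -o /dev/null "http://localhost:8000/$service/status/$code"
    done
    i=$((i + 1))
  done
}
```

For example, send_traffic billing 100 200 200 200 500 runs 100 rounds in which one request in four returns a 500, which should show up as a rise in the billing error rate.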

Metrics collected

Request Latencies of Various Services


Kong collects latency data of how long your services take to respond to requests.
You can use this data to alert the on-call engineer if the latency goes beyond a certain threshold. For example, let’s say you have an SLA that your APIs will respond with a latency of less than 20 milliseconds for 95% of the requests. You could configure Prometheus to alert based on the following query:
histogram_quantile(0.95, sum(rate(kong_latency_bucket{type="request"}[1m])) by (le,service)) > 20

The query calculates the 95th percentile of the total request latency (or duration) for all of your services and alerts you if it is more than 20 milliseconds. The “type” label in this query is “request”, which tracks the latency added by Kong and the service. You can switch this to “upstream” to track latency added by the service only. Prometheus is really flexible and well documented, so we won’t go into details of setting up alerts here, but you’ll be able to find them in the Prometheus documentation.
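Before wiring a query into an alert, you may want to try it from the command line. The sketch below sends an instant query to the Prometheus HTTP API (/api/v1/query). It assumes Prometheus is reachable on localhost:9090 through the port-forward set up earlier; the names PROM_URL, prom_query and P95_QUERY are our own.

```shell
# Assumption: Prometheus is reachable through the earlier port-forward.
PROM_URL=${PROM_URL:-http://localhost:9090}

# Run an instant PromQL query and print the raw JSON response.
prom_query() {
  # -G sends the data as URL query parameters; --data-urlencode escapes
  # the braces, quotes and brackets in the expression.
  curl -s -G "$PROM_URL/api/v1/query" --data-urlencode "query=$1"
}

# The 95th percentile request latency per service: the query from the text
# above without the "> 20" alert threshold.
P95_QUERY='histogram_quantile(0.95, sum(rate(kong_latency_bucket{type="request"}[1m])) by (le,service))'
```

Calling prom_query "$P95_QUERY" returns one value per service, which you can compare against the 20 millisecond threshold.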

Kong Proxy Latency


Kong also collects metrics about its own performance. The following query is similar to the previous one but gives us insight into latency added by Kong:
histogram_quantile(0.90, sum(rate(kong_latency_bucket{type="kong"}[1m])) by (le,service)) > 2

Error Rates


Another important metric to track is the rate of errors and requests your services are serving. The timeseries kong_http_status collects HTTP status code metrics for each service.

This metric can help you track the rate of errors for each of your services:
sum(rate(kong_http_status{code=~"5[0-9]{2}"}[1m])) by (service)

You can also calculate the percentage of requests in any duration that are errors. Try to come up with a query to derive that result.
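If you want to compare notes, here is one possible answer, written as a small helper that prints the PromQL expression for a given window. It is a sketch, not the only way to do it: it divides the 5xx rate by the total rate per service, and the function name error_percentage_query is hypothetical.

```shell
# Print a PromQL expression for 5xx responses as a percentage of all
# responses, per service. $1 = optional rate window (defaults to 1m).
error_percentage_query() {
  window=${1:-1m}
  printf '100 * sum(rate(kong_http_status{code=~"5[0-9]{2}"}[%s])) by (service) / sum(rate(kong_http_status[%s])) by (service)\n' "$window" "$window"
}
```

Note that a service with no 5xx responses in the window has no series on the left-hand side of the division, so it will be absent from the result rather than reported as 0.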

Please note that all HTTP status codes are indexed, meaning you could use the data to learn about your typical traffic pattern and identify problems. For example, a sudden rise in 404 response codes could indicate client code requesting an endpoint that was removed in a recent deploy.

Request Rate and Bandwidth


You can derive the total request rate for each of your services, or across your Kubernetes cluster, using the kong_http_status timeseries.
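As a sketch of what such a query could look like, the helper below prints a PromQL expression that sums the per-second rate of kong_http_status, either grouped by a label or as a single total. The function name request_rate_query is our own.

```shell
# Print a PromQL expression for requests per second.
# $1 = optional rate window (defaults to 1m)
# $2 = optional label to group by, e.g. "service"; omit for a single total
request_rate_query() {
  window=${1:-1m}
  group=${2:-}
  if [ -n "$group" ]; then
    printf 'sum(rate(kong_http_status[%s])) by (%s)\n' "$window" "$group"
  else
    printf 'sum(rate(kong_http_status[%s]))\n' "$window"
  fi
}
```

For example, request_rate_query 1m service gives the per-service rate, while request_rate_query gives the total across everything Kong proxies.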


Another metric that Kong keeps track of is the amount of network bandwidth (kong_bandwidth) being consumed. This gives you an estimate of how request/response sizes correlate with other behaviours in your infrastructure.

With these metrics, you should be able to gain quite a bit of insight and implement strategies like the RED method (Requests, Errors and Durations) for monitoring.

And that’s it. You now have metrics for the services running inside your Kubernetes cluster and have much more visibility into your applications, which you gained using only configuration. Since you now have Kong set up in your Kubernetes cluster, you might want to check out its other plugin-enabled functionality: authentication, logging, transformations, load balancing, circuit-breaking, and much more, which you can now easily use with very little additional setup.

If you have any questions, please reach out to Kong’s helpful community members via Kong Nation.

Happy Konging!


Thanks to Kevin Chen, Judith Malnick, Robert Paprocki and Marco Palladino for reviewing drafts of this post!
