Engineering
November 14, 2022

How to Use Prometheus to Monitor Kong Gateway

Jun Ouyang

Observability is a critical part of Kong's API Gateway. In this post, we'll describe two options to monitor Kong Gateway using Prometheus.

Prometheus is an open source systems monitoring and alerting toolkit, originally built at SoundCloud, that is now widely adopted. StatsD began as a simple daemon developed by Etsy to aggregate and summarize application metrics. The Prometheus project provides a StatsD exporter that collects metrics sent in StatsD format and exposes them to Prometheus.

Kong Gateway supports both approaches for integrating with Prometheus: Prometheus can pull metrics directly from the gateway, or a StatsD exporter can sit in between to offload some work from the gateway.

Kong Gateway is built on top of OpenResty/Nginx, which uses a multi-process, single-threaded architecture. To collect and aggregate metrics from the different worker processes, we implement the Prometheus plugin with shared memory.

Nginx handles requests in a non-blocking way and is normally very efficient. However, every read and write to the shared memory takes a mutex lock around the critical section, and while that lock is held it can block the other worker processes from processing requests. When the plugin is used to monitor metrics with high cardinality, it can affect Kong Gateway performance significantly, especially long-tail latencies. We recommend using the StatsD plugin as an alternative for such use cases.

We’ll explain how to use these two plugins in the following sections.

1. Prometheus

In a previous version of Kong Gateway, we found some performance issues with the Prometheus plugin. For example, one of our enterprise customers found in their production environment that requests from Prometheus to pull metrics caused sporadic latency spikes for other requests, sometimes as much as three seconds.

As the investigation progressed, we found that the Prometheus plugin collected metrics with some expensive function calls because it stored many high-cardinality metrics in Nginx's shared memory. As a result, each time the Prometheus server performed its periodic pull, it triggered high overhead in Nginx and affected real request latency. (See the issues on GitHub for more information.)

So, unlike older releases, in Kong Gateway 3.0 the Prometheus plugin doesn't export status code, latency, bandwidth, and upstream health check metrics by default, which avoids the costly overhead of collecting them.

These metrics need to be incremented or reset over the life of each connection, and each carries several labels, so they take up a lot of shared memory. All of that memory must be traversed when Prometheus polls for metrics, which leads to higher latencies. They can still be turned on manually by setting the config options status_code_metrics, latency_metrics, bandwidth_metrics, and upstream_health_metrics respectively. Enabling those metrics will affect performance if you have a lot of Kong entities, so we recommend using the StatsD plugin with its push model in those cases.


1. Bootstrap Kong config
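As a rough sketch of this step, assuming a database-backed Kong Gateway with a PostgreSQL instance already reachable locally, the snippet below writes a minimal `kong.conf` and runs the migrations only if the `kong` CLI is installed. The database host, user, password, database name, and listen addresses are placeholder assumptions, not values from this post. `status_listen` is included because the Status API also serves `/metrics`, which lets Prometheus scrape without touching the Admin API.

```shell
# Write a minimal kong.conf. Every value here is an illustrative assumption.
cat > kong.conf <<'EOF'
database = postgres
pg_host = 127.0.0.1
pg_port = 5432
pg_user = kong
pg_password = kong
pg_database = kong
admin_listen = 127.0.0.1:8001
status_listen = 127.0.0.1:8100
EOF

# Run the migrations and start Kong only when the kong CLI is available.
if command -v kong >/dev/null 2>&1; then
  kong migrations bootstrap -c kong.conf && kong start -c kong.conf
else
  echo "kong CLI not found; kong.conf was written but nothing was started"
fi
```

If you run Kong in containers or in DB-less mode, the equivalent settings would be supplied through your own deployment tooling instead.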

2. Enable Prometheus plugin
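One possible way to enable the plugin globally is through the Admin API. The sketch below assumes the Admin API is reachable at `http://localhost:8001` (override it with the `KONG_ADMIN_URL` variable, a name chosen here for illustration) and turns on the four optional metric groups discussed above. Leave them out if you only need the default metrics, since they are the ones that add overhead.

```shell
KONG_ADMIN_URL="${KONG_ADMIN_URL:-http://localhost:8001}"

# Plugin payload: the four config flags are the opt-in metric groups in Kong 3.0.
cat > prometheus-plugin.json <<'EOF'
{
  "name": "prometheus",
  "config": {
    "status_code_metrics": true,
    "latency_metrics": true,
    "bandwidth_metrics": true,
    "upstream_health_metrics": true
  }
}
EOF

# Post the payload only when an Admin API answers at $KONG_ADMIN_URL.
if curl -fsS -o /dev/null "$KONG_ADMIN_URL/status" 2>/dev/null; then
  curl -fsS -X POST "$KONG_ADMIN_URL/plugins" \
    -H 'Content-Type: application/json' \
    --data @prometheus-plugin.json
else
  echo "Admin API not reachable at $KONG_ADMIN_URL; payload left in prometheus-plugin.json"
fi
```

Posting to `/plugins` applies the plugin to every service; scoping it to a single service or route is also possible if you want to limit cardinality.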

3. Configure Prometheus
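A minimal scrape configuration might look like the following. It assumes Kong's Status API listens on `127.0.0.1:8100` (set through `status_listen`); the Admin API port serves the same `/metrics` path if you prefer to scrape there. The job name and the 15-second interval are arbitrary choices for this sketch.

```shell
# Write a Prometheus config that scrapes Kong's /metrics endpoint.
cat > prometheus.yml <<'EOF'
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: kong
    metrics_path: /metrics
    static_configs:
      - targets: ['127.0.0.1:8100']
EOF

# Validate the file when promtool (shipped with Prometheus) is installed.
if command -v promtool >/dev/null 2>&1; then
  promtool check config prometheus.yml
fi
```

A longer scrape interval reduces how often the shared memory is traversed, which can help if you have enabled the optional high-cardinality metrics.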

4. Import dashboard

Import the Kong dashboard for the Prometheus plugin into your Grafana instance.

2. StatsD

In Kong 3.0, we moved all statsd-advanced functionality into the StatsD plugin, so community users can now use the StatsD plugin for more complex functionality that was previously available only in Enterprise offerings. When the StatsD plugin pushes metrics, Kong doesn't need to store the metric contents in memory, so the plugin has minimal impact on the proxy path. The only overhead is using an OpenResty cosocket to send data asynchronously over the network. However, a middleware service (statsd_exporter) needs to be deployed to collect and aggregate the metrics sent by Kong.

1. Start statsd_exporter

While it is common to run centralized StatsD servers, the exporter works best as a sidecar. This lets you scale the StatsD exporter horizontally as you add more Gateway nodes, whereas a single instance can become a bottleneck.

Here is an example of a mapping yaml file for statsd_exporter:
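The sketch below writes an illustrative `config.yaml`. The StatsD metric names follow the `kong.service.<service_identifier>.*` pattern used by the StatsD plugin in Kong 3.0 with the default `kong` prefix, and the Prometheus metric and label names on the right-hand side are made up for this example. Treat the whole mapping as an assumption and check it against the metrics your gateway actually emits.

```shell
# Write a statsd_exporter mapping file; "*" matches one dot-separated component.
cat > config.yaml <<'EOF'
mappings:
  # Request count per service
  - match: "kong.service.*.request.count"
    name: "kong_statsd_requests_total"
    labels:
      service: "$1"
  # Status code count per service
  - match: "kong.service.*.status.*"
    name: "kong_statsd_status_total"
    labels:
      service: "$1"
      code: "$2"
  # Total request latency per service, exposed as a histogram
  - match: "kong.service.*.latency"
    name: "kong_statsd_latency_ms"
    observer_type: histogram
    labels:
      service: "$1"
EOF
```

Turning the service identifier and status code into labels keeps the number of distinct Prometheus metric names small while still letting you filter by service.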

Using the StatsD plugin with the above mapping provides slightly different metrics than the Prometheus plugin.

Then start statsd_exporter and configure Prometheus to collect metrics from it:

./statsd_exporter --statsd.mapping-config=./config.yaml
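For the Prometheus side, a scrape job along these lines should work, assuming the exporter runs on the same host with its default web listen address of `:9102`. The file name and job name are arbitrary choices for this sketch.

```shell
# Write a Prometheus config that scrapes the statsd_exporter sidecar.
cat > prometheus-statsd.yml <<'EOF'
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: kong_statsd_exporter
    static_configs:
      - targets: ['127.0.0.1:9102']
EOF

# Validate the file when promtool (shipped with Prometheus) is installed.
if command -v promtool >/dev/null 2>&1; then
  promtool check config prometheus-statsd.yml
fi
```

With one exporter per Gateway node, you would list each sidecar as a target or rely on your usual service discovery.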

2. Bootstrap Kong config

Bootstrap the Kong Gateway config.

3. Configure the StatsD plugin
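A minimal sketch of this step through the Admin API follows. It assumes the Admin API is at `http://localhost:8001` (overridable with the illustrative `KONG_ADMIN_URL` variable) and that statsd_exporter is listening for StatsD traffic on its default port 9125 on the same host. Note that the plugin's own default port is 8125, so the port has to be set explicitly.

```shell
KONG_ADMIN_URL="${KONG_ADMIN_URL:-http://localhost:8001}"

# Plugin payload: point the StatsD plugin at the local statsd_exporter.
cat > statsd-plugin.json <<'EOF'
{
  "name": "statsd",
  "config": {
    "host": "127.0.0.1",
    "port": 9125
  }
}
EOF

# Post the payload only when an Admin API answers at $KONG_ADMIN_URL.
if curl -fsS -o /dev/null "$KONG_ADMIN_URL/status" 2>/dev/null; then
  curl -fsS -X POST "$KONG_ADMIN_URL/plugins" \
    -H 'Content-Type: application/json' \
    --data @statsd-plugin.json
else
  echo "Admin API not reachable at $KONG_ADMIN_URL; payload left in statsd-plugin.json"
fi
```

Pointing the plugin at `127.0.0.1` matches the sidecar layout recommended above, where each Gateway node talks to its own exporter.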

4. Import dashboard

Import the Kong statsd exporter dashboard into your Grafana instance.