By on May 26, 2022

Advanced rate limiting policies with Kong and Hazelcast

Kong Gateway integrates with Hazelcast to implement advanced rate limiting policies beyond the fundamental options provided by the Kong plugins, with benefits including distributed compute, security, zero downtime, and cloud-agnostic deployment.

As organizations increasingly rely on APIs as building blocks for mission-critical business functions, how they build and manage APIs to deliver on expectations of high performance and reliability will be a “make or break” factor for success. Mission-critical API services need to be scalable, reliable, and always available. Yet continued increases in transaction rates and consumption can overwhelm API services and even cause unexpected service blackouts, leading to business SLA violations and frustrated customers. Hazelcast can increase the performance and reliability of the Kong service connectivity platform by providing advanced rate limiting capabilities.

Kong and API gateway policies

One of the main responsibilities of any API gateway is to implement critical and complex policies, offloading them from the services sitting behind it so that those services can focus on their business logic.

From this standpoint, an API gateway needs to be not just lightweight but, just as importantly, extensible.

Kong Gateway is a reverse proxy that lets you expose your APIs and control the exposure with a plethora of policies. Running in front of any RESTful API, Kong Gateway can be extended through modules and plugins that implement those policies. This sets the foundations for a modular architecture, where plugins can be enabled and executed at runtime.

Kong Gateway provides an extensive list of ready-to-use plugins. Here are just a few of them:

  • Authentication: plugins to implement all sorts of authentication mechanisms such as OpenID Connect, Basic Authentication, LDAP, Mutual TLS (mTLS), API Key, etc.
  • Transformations: plugins to transform requests before routing them to the upstreams, transform responses before returning them to the Consumers, expose GraphQL upstreams as a REST API, turn requests into Kafka messages in a Topic of an existing Kafka-based event processing infrastructure, etc.
  • Serverless: integration with AWS Lambda and Azure Functions.
  • Analytics and monitoring: to provide metrics to external systems, including Datadog and Prometheus.
  • Traffic control: plugins to implement Canary Releases, Mocking endpoints, Routing policies based on request headers, Rate limiting for requests and responses, etc.
  • Proxy caching: to cache commonly requested responses in the Gateway.

You can also create your own custom plugins. The Plugin Development Guides and PDK references cover creating plugins in several languages, including Go, Python, Node.js, and Lua.

Extending Kong Gateway capabilities with Hazelcast

Hazelcast is a real-time data platform for fast, stateful, data-intensive workloads on-premises, at the edge, or as a fully managed cloud service. Hazelcast provides linearly scalable, distributed in-memory data and compute capability to Kong. It can underpin various plugins, including rate limiting, response limiting, proxy caching, transformation, serverless, industrializing machine learning, and more.

Hazelcast is a multi-threaded data platform that partitions data equally across all nodes within the cluster, which lets it distribute workloads (data and computation) across all member nodes. This is a scalable architecture in which the cluster rebalances workloads automatically for linear scalability. Furthermore, the distributed architecture of Hazelcast allows Kong to share context and data across multiple instances to achieve maximum performance.

Hazelcast advantage with Kong

A NoSQL database is a popular option for providing caching capability for Kong. Compared with Hazelcast, however, a NoSQL database does not provide any compute capability and is less performant and scalable at higher load. A key reason for this is the primary-secondary architecture, which requires that all updates be processed through the primary node only. This makes the primary node a bottleneck, and read replicas cannot be relied on to provide the latest version of the data during peak load.

Thus, a data cluster that uses a primary-secondary architecture is only as performant as the primary node itself. Adding secondary nodes only provides additional read capability, and only with eventual consistency. Additionally, if the platform is single-threaded, it is not equipped to use multi-core servers efficiently.

The Hazelcast Platform is more feature-rich than in-memory, cache-only platforms. The ability of Hazelcast to perform grid computing (server-side compute) provides extra performance and flexibility over the current Rate Limiting plugin, which does the computation locally and then has to synchronize the result across the platform. Related data can be grouped in the same partition, and computation tasks can be submitted to the cluster member where that data is located. This keeps computation fast and limits data movement, making optimum use of network bandwidth. With all relevant information in RAM, there is no need to traverse a network to remote storage for transaction processing. The difference in speed is significant: minutes vs. sub-millisecond response times for complex transactions done millions of times per second.

Kong Gateway and rate limiting policies

Rate limiting policies protect your upstream services from overuse by limiting the number of requests each consumer can send to the API. There are different ways to implement Rate Limiting using a variety of algorithms. Check the “How to Design a Scalable Rate Limiting Algorithm with Kong API” blog post for more information.

Besides the plugins listed previously, Kong currently provides two plugins specifically for implementing Rate Limiting policies:

The Kong Gateway Rate Limiting plugin

The Rate Limiting plugin supports configuration options including:

  • Localization: Local (rate limiting counters are stored locally in-memory on the Cluster node) or Redis (counters are stored on a Redis server and shared across the nodes).
  • Scope: policies can be applied to Kong Services, Routes or Consumers.
  • Number of limits: multiple limits can be configured to define policies based on seconds, minutes, hours, days, months, or years.
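As a rough illustration of these options, the sketch below builds the kind of JSON body that could be sent to the Kong Admin API to enable the plugin. The `name`, `policy`, and time-unit fields follow the `rate-limiting` plugin's configuration; the helper function and the limit values are made up for this example, and the Redis connection fields are left out because they differ between Kong versions.

```python
import json

# Time units the Rate Limiting plugin accepts as limits.
TIME_UNITS = ("second", "minute", "hour", "day", "month", "year")


def rate_limiting_payload(policy="local", **limits):
    """Build a plugin body for the Admin API (hypothetical helper)."""
    # Only the two storage options discussed in this post are accepted here.
    if policy not in ("local", "redis"):
        raise ValueError(f"unsupported policy: {policy}")
    unknown = set(limits) - set(TIME_UNITS)
    if unknown:
        raise ValueError(f"unknown time units: {sorted(unknown)}")
    if not limits:
        raise ValueError("at least one limit is required")
    return {"name": "rate-limiting", "config": {"policy": policy, **limits}}


# Example values only: 5 requests per minute and 100 per hour.
payload = rate_limiting_payload(policy="local", minute=5, hour=100)
print(json.dumps(payload, indent=2))
```

The scope is chosen by where the body is posted, for example to the `plugins` collection of a Service, Route, or Consumer in the Admin API.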

For a more detailed tutorial, check out “Protecting Services With Kong Gateway Rate Limiting.”

A typical Kong Control Plane and Data Plane deployment with an external server to store and share the Rate Limiting counters would look like this:

As the Kong Data Plane scales out, the new instances will work with the same counters shared by the Rate Limiting Clusters.

The Kong Gateway Rate Limiting Advanced plugin

The Rate Limiting plugin implements the fixed window algorithm: it limits the number of requests within a fixed time unit. One drawback is that a burst of requests straddling the boundary between the current window and the next can briefly exceed the intended rate, because each window counts its share of the burst separately.
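The boundary problem can be seen in a minimal fixed window counter. This is a simplified sketch for illustration, not the plugin's implementation; the class name, the limit of 5, and the 60-second window are assumptions.

```python
class FixedWindowLimiter:
    """Counts requests per fixed window of `window` seconds."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.counts = {}  # window index -> requests accepted in that window

    def allow(self, now):
        index = int(now // self.window)
        used = self.counts.get(index, 0)
        if used >= self.limit:
            return False
        self.counts[index] = used + 1
        return True


limiter = FixedWindowLimiter(limit=5, window=60)
# Five requests at the end of the first minute, five at the start of the next.
timestamps = [59.0] * 5 + [60.0] * 5
accepted = sum(limiter.allow(t) for t in timestamps)
print(accepted)  # 10
```

All ten requests are accepted within about one second, even though the limit is five per minute, because they fall into two different windows.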

Compared with the standard Rate Limiting plugin, Rate Limiting Advanced adds support for the sliding window algorithm. This algorithm takes the previous window's counter into account. For example, if the current window is 25% through, we weight the previous window’s count by 75%. Check the Rate Limiting Library for the description of the algorithm used by the plugin.
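The weighting described above can be written as a small function. This is a sketch of the weighted estimate as the paragraph describes it, not code taken from the plugin; the function name, the counts, and the limit of 10 are illustrative assumptions.

```python
def sliding_window_estimate(previous_count, current_count, elapsed_fraction):
    """Estimate the request count over a sliding window.

    elapsed_fraction is how far we are into the current window (0.0 to 1.0).
    """
    if not 0.0 <= elapsed_fraction <= 1.0:
        raise ValueError("elapsed_fraction must be between 0 and 1")
    # The previous window counts only for the part the sliding window still covers.
    previous_weight = 1.0 - elapsed_fraction
    return current_count + previous_count * previous_weight


# 25% into the current window: the previous window's 8 requests are weighted by 75%.
estimate = sliding_window_estimate(previous_count=8, current_count=2, elapsed_fraction=0.25)
print(estimate)  # 2 + 8 * 0.75 = 8.0
allowed = estimate < 10  # assumed limit of 10 requests per window
```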

Another important capability of the Rate Limiting Advanced plugin, aimed at low latency, is an in-memory table of the counters. The table can be synchronized across the Cluster, using either asynchronous or synchronous updates, with the Rate Limiting Servers responsible for storing the counters.
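The idea of answering from a local table and reconciling with a shared store can be sketched as follows. This is a simplified model under stated assumptions, not the plugin's code: `SharedStore` is an in-process stand-in for the external counter servers, and `LocalCounters`, `sync`, and the key `consumer-1` are hypothetical names.

```python
from collections import defaultdict


class SharedStore:
    """Stand-in for the external counter store (hypothetical, in-process)."""

    def __init__(self):
        self.totals = defaultdict(int)

    def add(self, key, delta):
        self.totals[key] += delta
        return self.totals[key]


class LocalCounters:
    """Per-node counters answered from memory and pushed out on sync()."""

    def __init__(self, store):
        self.store = store
        self.known = defaultdict(int)    # last total seen in the shared store
        self.pending = defaultdict(int)  # local increments not yet pushed

    def increment(self, key):
        self.pending[key] += 1
        return self.known[key] + self.pending[key]

    def sync(self):
        # Push local deltas and refresh every key this node has seen.
        for key in set(self.known) | set(self.pending):
            delta = self.pending.pop(key, 0)
            self.known[key] = self.store.add(key, delta)


store = SharedStore()
node_a, node_b = LocalCounters(store), LocalCounters(store)
for _ in range(3):
    node_a.increment("consumer-1")
for _ in range(2):
    node_b.increment("consumer-1")
node_a.sync()
node_b.sync()
node_a.sync()  # a second sync lets node A pick up node B's increments
print(node_a.known["consumer-1"], node_b.known["consumer-1"])  # 5 5
```

In this model, calling `sync()` on a timer corresponds to asynchronous updates, where nodes may briefly undercount, and calling it after every increment corresponds to synchronous updates, which trade latency for accuracy.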

Hazelcast Rate Limiting plugin

The rate limiting capabilities provided by both Kong plugins support a variety of policies customers typically have. For enterprise customers with advanced requirements, Hazelcast and Kong have developed a new plugin.

Hazelcast allows data and data structures to be shared consistently across multiple Kong data planes, which promotes reuse of data and helps rationalize APIs. The data planes can also submit all or part of their computing logic to the Hazelcast cluster. This is particularly important for rate limiting, because the rules can be complex and depend on customer tier, location, product, current load, and so on, all of which require a view of context across Kong. As a shared data layer, Hazelcast can gather context across multiple Kong instances to enable broader or more complex rate limiting rules efficiently.
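To make the notion of a context-aware rule concrete, here is a minimal sketch of a limit that depends on customer tier and on a load figure shared across gateways. Everything in it is hypothetical: the tier names, the base limits, the 0.8 load threshold, and the halving policy are invented for illustration and do not describe the Hazelcast plugin.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestContext:
    tier: str            # hypothetical tier name, e.g. "free" or "gold"
    cluster_load: float  # 0.0 (idle) to 1.0 (saturated), shared across gateways


# Hypothetical per-minute base limits keyed by customer tier.
BASE_LIMITS = {"free": 60, "gold": 600}


def effective_limit(ctx):
    """Return the per-minute limit for this request context."""
    # Unknown tiers fall back to the most restrictive base limit.
    limit = BASE_LIMITS.get(ctx.tier, BASE_LIMITS["free"])
    # Halve the limit for non-gold customers when the shared load is high.
    if ctx.cluster_load > 0.8 and ctx.tier != "gold":
        limit //= 2
    return limit


print(effective_limit(RequestContext("free", 0.9)))  # 30
print(effective_limit(RequestContext("gold", 0.9)))  # 600
```

The point of the sketch is that `cluster_load` has to come from somewhere every gateway instance can see, which is the role the shared data layer plays in the text.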

In many cases, a rate limiting plugin is used to reduce the burden of the workload on the backend systems. This forces API consumers to compromise on their ability to provide data to end users in an agile, real-time manner. An alternative architecture with Kong is to use Hazelcast as a system accelerator where the in-memory computing and stream processing capabilities can boost Kong into a highly responsive, real-time API endpoint without forcing consumers to limit access to data.


The new Hazelcast Rate Limiting plugin was designed to address advanced policies for enterprise-class applications. Hazelcast provides a platform to execute complex, context-aware rate limiting rules across multiple Kong instances without overloading the API Gateway with computation or the need to store additional data. It is a scalable in-memory computing platform that complements Kong HQ with fast data and computation capabilities for high-performance data processing at scale. The platform allows users to apply its real-time computational capabilities to enable more complex rate limiting policies.

On the other hand, Kong provides a highly scalable and extensible API Gateway with a long list of plugins that can be combined with the Rate Limiting plugin. Feel free to add new plugins and experiment further by implementing policies like caching, log processing, OIDC-based authentication, canary, GraphQL integration, and more.

The two technologies can be used together for cloud native applications, supporting multiple platforms for hybrid deployments.
