API Gateways for High Availability (HA) Clusters

While an API gateway provides a single point of access for your application’s APIs, that doesn’t mean it should be a single point of failure. Just as the microservices that make up your application can be scaled according to demand, your API gateway also needs to scale so you can increase throughput and maintain high availability for a consistent service.

What is a High Availability Cluster?

One of the many benefits of a microservices architecture over the traditional monolith approach to application design is the ability to scale the individual microservices as required for performance and availability. In production scenarios requiring high availability, uptime is always limited by the least available service in the stack. If you’re using an API gateway to provide a single entry point for clients to access your backend services, you don’t want it to become a single point of failure for your production system. This is where a high availability cluster for your API gateway comes in. 

An API gateway requires a server to host the service and a data store to store the configuration details. A cluster of API gateways consists of multiple instances of the service running on separate servers (nodes) and a distributed data store. How many nodes you need in the cluster and where those nodes are located depends on the level of availability you need.

Where are HA clusters used?

For many of today’s enterprises, any downtime for online services, whether planned or unplanned, is to be avoided at all costs. In the context of public-facing services, including online shopping, store inventories and banking, the widespread use of cloud hosting has reduced users’ tolerance for system outages. Where API usage is monetized, such as market comparison sites and online payments, the impact of downtime can be financial as well as reputational. Measures such as timeouts on calls and rate limiting can help manage high volumes of traffic, but they won’t prevent the entire application from going offline if the server hosting the gateway fails.

High availability API gateway clusters provide consistency for your application’s APIs, ensuring clients always receive a response. If one node in the cluster fails, another takes over, continuing to route requests to the relevant backend APIs and consolidating the responses as needed. Combined with a scalable microservice architecture, a high availability API gateway cluster ensures your application can handle large volumes of traffic and react to unexpected spikes, while remaining resilient to hardware failures.

For internal applications with API endpoints exposed only to consumers within the organization, planned downtime may be acceptable. However, hardware failures can occur unexpectedly, and if usage is very high, an API gateway cluster may be necessary to handle the volume of traffic without slow-downs or outages.

High Availability Clusters for Production Environments

Your application design may lend itself to a single API gateway that provides public-facing endpoints for all your APIs, or multiple API gateways optimized for different types of use case, such as browsers and mobile apps, IoT devices and integrations with internal systems. You can create a cluster of gateway nodes for a single API gateway; setups with multiple API gateways will require a cluster for each type of gateway.

Depending on the gateway provider, you may be able to host the nodes on premises, in a private data center or in a public cloud. Locating gateway nodes in multiple regions and clouds helps to ensure uptime in the event of physical damage in one location or failure of a particular cloud provider. 

The exact requirements for setting up the cluster will depend on the API gateway you’re using. If all nodes in the cluster are active, you’ll need to add a load balancer in front of the API gateways to distribute traffic across all nodes. This can use a simple round-robin approach or apply a weighting based on the response time of each node. If the cluster consists of a primary (active) node and multiple secondary (passive) nodes, failover logic is required to detect when the primary node has failed and to promote one of the secondary nodes to be the new primary.
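To make the active-active option concrete, the sketch below shows an NGINX load balancer fronting three gateway nodes using round-robin with optional weights. The hostnames and weights are illustrative assumptions rather than values from any particular deployment, and port 8000 is assumed to be the gateway’s proxy port (Kong’s default).

    # nginx.conf (sketch): distribute traffic across the gateway nodes.
    # Round-robin is the default; weights shift more traffic to a node,
    # roughly approximating the response-time weighting described above.
    upstream kong_gateway_cluster {
        server gateway-1.internal:8000 weight=2;
        server gateway-2.internal:8000;
        server gateway-3.internal:8000;
    }

    server {
        listen 80;
        location / {
            proxy_pass http://kong_gateway_cluster;
            proxy_set_header Host $host;
        }
    }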

An API gateway requires a data store to hold the configuration details and any other data that needs to be persisted. The nodes in the cluster may share a single data store or each node may connect to a separate data store, with changes replicated between them. In either case, the data store should also be replicated for high availability, ideally across multiple regions. If each gateway node maintains a cache to improve performance, those caches also need to be kept in sync as part of the cluster configuration.

Configuring API Gateways into a High Availability Cluster

The Kong API Gateway is designed to make setting up a high availability cluster as simple as possible. Each node consists of the Kong Server, and nodes are added to the same cluster by connecting them to the same data store. Kong recommends using the Apache Cassandra data store for high availability scenarios, as it supports multi-region setups and provides a failure-tolerant architecture, ensuring the data store itself does not become a single point of failure. 
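In practice, joining a node to a Kong cluster mostly means pointing it at the shared datastore. The following is a minimal sketch using the environment-variable form of Kong’s configuration properties; the contact points and keyspace name are placeholder assumptions, and the exact migrations command can vary between Kong versions.

    # Sketch: configure a Kong node to use the shared Cassandra datastore.
    export KONG_DATABASE=cassandra
    export KONG_CASSANDRA_CONTACT_POINTS=cassandra-1.internal,cassandra-2.internal,cassandra-3.internal
    export KONG_CASSANDRA_KEYSPACE=kong
    export KONG_CASSANDRA_CONSISTENCY=LOCAL_QUORUM  # quorum-based consistency for a replicated keyspace

    kong migrations bootstrap  # run once against the shared datastore, from the first node
    kong start                 # every node started against the same datastore joins the same cluster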

Because all nodes in a cluster are connected to the same data store, there is no need to replicate the gateway configuration settings. Each node in a Kong API Gateway cluster maintains a local cache to minimize traffic to the data store and maximize performance. The only additional configuration required when setting up a high availability cluster is the cache update timing and frequency.
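As a sketch, the relevant cache properties can be set per node, either in kong.conf or as environment variables; the values below are illustrative, not recommendations.

    # Sketch: cache tuning for a clustered deployment (illustrative values).
    export KONG_DB_UPDATE_FREQUENCY=10   # poll the datastore for changes every 10 seconds (default is 5)
    export KONG_DB_UPDATE_PROPAGATION=2  # wait 2 seconds before purging a cached entity, giving the
                                         # change time to replicate across the Cassandra nodes (default is 0)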

Configuring a High Availability Cluster with Multiple API Gateway Nodes

To set up a high availability cluster of API gateways with the Kong API Gateway:

  1. Set up an Apache Cassandra database cluster, with replication to multiple regions as required (see the keyspace sketch after this list). Consider setting the cassandra_consistency value to QUORUM or LOCAL_QUORUM, according to the number of nodes and data centers in the database cluster. If you’re not replicating the datastore, you can use a PostgreSQL database instead.
  2. Install and start Kong API Gateway on the first server node for the cluster. Kong is fully platform agnostic and can be installed in any cloud or datacenter. During installation, configure Apache Cassandra as the datastore and point the node at your Cassandra cluster’s contact points.
  3. Use the Admin API or Kong Manager (for Kong Enterprise users) to add APIs to the gateway and specify the route for each API, as shown in the example Admin API calls after this list. Do the same for any plugins required, such as authentication, logging and rate limiting. These configuration details are stored in the datastore.
  4. Each node in the cluster maintains a cache to minimize requests to the datastore and improve performance. The cache design means the Kong API Gateway cluster is eventually consistent. You can specify the timings and frequency for cache updates according to the balance you want to strike between consistency and performance. By default, each Kong API Gateway node polls the datastore for updates every five seconds and applies no propagation delay, so changes are not given time to propagate through the database cluster.
    1. To improve performance, increase db_update_frequency in each node’s Kong configuration so that it polls the datastore for updates less frequently.
    2. Ensure the propagation delay (db_update_propagation) is sufficient to allow updates to propagate through all nodes in the datastore cluster. (Cassandra is eventually consistent, so this value must not be set to zero; otherwise the Kong API Gateway nodes will invalidate their caches only to repopulate them with an out-of-date value.)
  5. Install Kong API Gateway on as many additional server nodes as required and connect each of them to the same datastore as above. The configuration details and plugins will be cached by each node and updated according to the frequency and timings you have specified.
  6. Set up a load balancer to distribute traffic across all nodes in the API gateway cluster.
  7. Test the cluster and tweak the cache update settings to balance performance against consistency as required.
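For step 1, one common approach is to create the Kong keyspace yourself with NetworkTopologyStrategy so that it replicates across data centers before Kong runs its migrations. The CQL below is a sketch only; the keyspace name, data center names and replication factors are assumptions that must match your own Cassandra topology.

    -- CQL sketch: a Kong keyspace replicated across two data centers
    CREATE KEYSPACE IF NOT EXISTS kong
      WITH replication = {
        'class': 'NetworkTopologyStrategy',
        'eu-west': 3,
        'us-east': 3
      };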
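For step 3, registering an upstream API and a plugin through the Admin API looks roughly like the following. It assumes the Admin API is reachable on its default port (8001) on one of the nodes and that your Kong version models APIs as services and routes; the service name, upstream URL, path and rate limit are placeholders.

    # Sketch: register a service, a route and a rate-limiting plugin.
    # Run against any node; the configuration is written to the shared datastore
    # and picked up by the other nodes when their caches next update.
    curl -i -X POST http://localhost:8001/services \
      --data name=orders-service \
      --data url=http://orders.internal:5000

    curl -i -X POST http://localhost:8001/services/orders-service/routes \
      --data 'paths[]=/orders'

    curl -i -X POST http://localhost:8001/services/orders-service/plugins \
      --data name=rate-limiting \
      --data config.minute=100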

You can set up additional Kong API Gateway clusters to support different API gateway configurations designed for particular types of client.

Configuring a Cluster with a Single API Gateway Node

If you only have one Kong API Gateway server, it forms a cluster of one node. Follow the steps described above, but you will not need to configure db_update_propagation unless the data store is replicated. As there is only one API gateway node, there is no need to add a load balancer to distribute traffic.
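As a sketch, a single node backed by PostgreSQL only needs the datastore connection details; the host, credentials and database name below are placeholders.

    # Sketch: a single Kong node backed by PostgreSQL.
    export KONG_DATABASE=postgres
    export KONG_PG_HOST=postgres.internal
    export KONG_PG_USER=kong
    export KONG_PG_PASSWORD=change-me
    export KONG_PG_DATABASE=kong

    kong migrations bootstrap
    kong start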

For more details about setting up a high availability cluster for a Kong API Gateway, see the Clustering Reference.

Want to learn more?

Request a demo to talk to our experts, get answers to your questions and explore your needs.