While an API gateway provides a single point of access for your application’s APIs, that doesn’t mean it should be a single point of failure. Just as the microservices that make up your application can be scaled according to demand, your API gateway needs to scale so you can increase bandwidth and provide high availability for a consistent service.
What is a High Availability Cluster?
One of the many benefits of a microservices architecture over the traditional monolith approach to application design is the ability to scale the individual microservices as required for performance and availability. In production scenarios requiring high availability, uptime is always limited by the least available service in the stack. If you’re using an API gateway to provide a single entry point for clients to access your backend services, you don’t want it to become a single point of failure for your production system. This is where a high availability cluster for your API gateway comes in.
An API gateway requires a server to host the service and a data store to hold its configuration details. A cluster of API gateways consists of multiple instances of the service running on separate servers (nodes) and a distributed data store. How many nodes you need in the cluster, and where those nodes are located, depend on the level of availability you need.
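To make the relationship between node count and availability concrete, here is a minimal sketch of the arithmetic. It assumes that node failures are independent and that the cluster counts as available while at least one node is up; real deployments rarely meet either assumption perfectly, so treat the numbers as rough guidance. The function names are illustrative only.

```python
def serial_availability(availabilities):
    """Availability of a chain in which every component must be up."""
    result = 1.0
    for availability in availabilities:
        result *= availability
    return result


def cluster_availability(node_availability, nodes):
    """Availability of a cluster that is up while at least one node is up.

    Assumes node failures are independent of each other.
    """
    return 1.0 - (1.0 - node_availability) ** nodes


def nodes_needed(node_availability, target):
    """Smallest node count whose cluster availability meets the target."""
    if not 0.0 < node_availability < 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be between 0 and 1")
    nodes = 1
    while cluster_availability(node_availability, nodes) < target:
        nodes += 1
    return nodes
```

Under these assumptions, nodes_needed(0.99, 0.99999) returns 3: two nodes at 99% availability fall just short of "five nines", and a third closes the gap. serial_availability shows the opposite effect: a gateway in front of a backend can never be more available than the product of the two.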
Where are High Availability (HA) Clusters Used?
For many of today’s enterprises, any downtime for online services, whether planned or unplanned, is to be avoided at all costs. In the context of public-facing services, including online shopping, store inventories and banking, the widespread use of cloud hosting has reduced users’ tolerance for system outages. Where API usage is monetized, such as market comparison sites and online payments, the impact of downtime can be financial as well as reputational. Measures such as timeouts on calls and rate limiting can help manage high volumes of traffic, but they won’t prevent the entire application from going offline if the server hosting the gateway fails.
High availability API gateway clusters keep your application’s APIs consistently reachable, so that clients continue to receive responses even when part of the cluster is down. If one node in the cluster fails, another takes over, continuing to route requests to the relevant backend APIs and consolidating the responses as needed. Combined with a scalable microservice architecture, a high availability API gateway cluster helps your application handle large volumes of traffic and react to unexpected spikes, while being resilient to hardware failures.
For internal applications with API endpoints exposed only to consumers within the organization, planned downtime may be acceptable. However, hardware failures can occur unexpectedly, and if usage is very high, an API gateway cluster may be necessary to handle the volume of traffic without slow-downs or outages.
High Availability Clusters for Production Environments
Your application design may lend itself to a single API gateway that provides public-facing endpoints for all your APIs, or multiple API gateways optimized for different types of use case, such as browsers and mobile apps, IoT devices and integrations with internal systems. You can create a cluster of gateway nodes for a single API gateway; setups with multiple API gateways will require a cluster for each type of gateway.
Depending on the gateway provider, you may be able to host the nodes on premises, in a private data center or in a public cloud. Locating gateway nodes in multiple regions and clouds helps to ensure uptime in the event of physical damage in one location or failure of a particular cloud provider.
The exact requirements for setting up the cluster will depend on the API gateway you’re using. If all nodes in the cluster are active, you’ll need to add a load balancer in front of the API gateways to distribute traffic across all nodes. This can use a simple round-robin approach or apply a weighting based on the response time of each node. If the cluster consists of a primary (active) node and multiple secondary (passive) nodes, application logic is required to determine when the primary node has failed and which secondary node should be made the new primary.
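The two cluster layouts described above can be sketched in a few lines of Python. This is an illustration rather than how any particular load balancer or gateway implements them: the node names, the idea of weighting by the inverse of average response time, and the "promote the first healthy secondary" rule are all assumptions chosen for simplicity.

```python
import itertools
import random


def round_robin(nodes):
    """Return an iterator that hands out nodes in a fixed rotation."""
    return itertools.cycle(nodes)


def response_time_weights(avg_response_ms):
    """Weight each node by the inverse of its average response time.

    Faster nodes get a larger share of traffic; weights sum to 1.
    """
    inverse = {node: 1.0 / ms for node, ms in avg_response_ms.items()}
    total = sum(inverse.values())
    return {node: value / total for node, value in inverse.items()}


def pick_weighted(weights, rng=random):
    """Pick one node at random, in proportion to its weight."""
    nodes = list(weights)
    return rng.choices(nodes, weights=[weights[n] for n in nodes], k=1)[0]


def elect_primary(nodes, is_healthy, current_primary):
    """Active/passive failover: keep the primary while it is healthy,
    otherwise promote the first healthy secondary (an assumed policy)."""
    if is_healthy(current_primary):
        return current_primary
    for node in nodes:
        if node != current_primary and is_healthy(node):
            return node
    raise RuntimeError("no healthy gateway node available")
```

In an active/active cluster you would call next() on the round_robin iterator (or pick_weighted) once per request. In an active/passive cluster, elect_primary would run whenever a health check reports a change; how the health check itself works is left out here and depends on your gateway.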
An API gateway requires a data store to hold the configuration details and any other data that needs to be persisted. The nodes in the cluster may share a single data store or each node may connect to a separate data store, with changes replicated between them. In either case, the data store should also be replicated for high availability, ideally across multiple regions. If each gateway node maintains a cache to improve performance, those caches also need to be kept in sync as part of the cluster configuration.
Configuring API Gateways into a High Availability Cluster
The Kong API Gateway is designed to make setting up a high availability cluster as simple as possible. Each node consists of the Kong Gateway, and nodes are added to the same cluster by connecting them to the same data store.
Because all nodes in a cluster are connected to the same data store, there is no need to copy the gateway configuration to each node by hand. When you set up the first Kong Gateway node in the cluster, just configure the gateway settings as normal using the Admin API or Kong Manager. The settings are saved in the data store, and other nodes read them from there when they are added.
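As a rough illustration of configuring the first node through the Admin API, the sketch below builds two JSON requests: one that registers a service and one that attaches a route to it. The Admin API address (localhost on port 8001), the service name and the upstream URL are assumptions for the example, and the helper functions are hypothetical. The requests are only constructed here, not sent, so you can inspect them before pointing them at a real node.

```python
import json
import urllib.request

# Assumption: the Admin API is reachable locally on port 8001.
ADMIN_API = "http://localhost:8001"


def build_request(path, payload):
    """Build (but do not send) a JSON POST request for the Admin API."""
    return urllib.request.Request(
        url=ADMIN_API + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def service_request(name, upstream_url):
    """Request that registers a backend service with the gateway."""
    return build_request("/services", {"name": name, "url": upstream_url})


def route_request(service_name, paths):
    """Request that exposes the named service on the given paths."""
    return build_request(f"/services/{service_name}/routes", {"paths": paths})
```

Passing each request to urllib.request.urlopen would send it to the node. Because the resulting configuration lives in the shared data store, it only needs to be applied once rather than once per node.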
Each node in a Kong API Gateway cluster maintains a local cache to minimize traffic to the data store and maximize performance. The cache design means the Kong API Gateway cluster is eventually consistent. The only additional configuration required when setting up a high availability cluster is the timing and frequency of cache updates. These settings determine how often each node polls the database to check for changes to the gateway configuration (services, routes, plugins, etc.). By default, the cache settings are tuned for consistency, but you can adjust them for better performance.
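To see why a polling cache makes the cluster eventually consistent, consider the toy simulation below. It is a simplified model and not Kong's actual implementation: the DataStore and GatewayNode classes, the version counter and the manual clock are all invented for illustration. In Kong itself, the polling interval is governed by the db_update_frequency setting in the gateway configuration.

```python
class DataStore:
    """Shared configuration store with a version counter (illustrative)."""

    def __init__(self, config):
        self.config = dict(config)
        self.version = 0

    def update(self, key, value):
        self.config[key] = value
        self.version += 1


class GatewayNode:
    """Gateway node that serves lookups from a local cache and refreshes
    that cache by polling the store at a fixed interval."""

    def __init__(self, store, poll_interval):
        self.store = store
        self.poll_interval = poll_interval
        self.cache = dict(store.config)
        self.cached_version = store.version
        self.last_poll = 0.0

    def tick(self, now):
        """Poll the store if the interval has elapsed; reload a stale cache."""
        if now - self.last_poll >= self.poll_interval:
            self.last_poll = now
            if self.cached_version != self.store.version:
                self.cache = dict(self.store.config)
                self.cached_version = self.store.version

    def lookup(self, key):
        """Answer from the local cache without touching the store."""
        return self.cache.get(key)
```

If one node polls every second and another every five seconds, a change written to the store is visible on the first node after one second but not on the second until five seconds have passed. A shorter interval narrows that window at the cost of more traffic to the data store, which is the trade-off the cache settings let you tune.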
To learn more about setting up a high availability cluster for a Kong API Gateway, take a look at the Clustering Reference.