NGINX/OpenResty Event Handling Strategy for CPU Efficiency
Kong Gateway is built on Nginx/OpenResty because of its master/worker architecture: the master forks multiple worker processes, which lets Kong use every CPU core efficiently. It is the powerhouse within Kong. But the one-master, many-workers architecture has an annoying drawback. Each worker is an isolated process, so workers cannot easily share state with each other, and Kong Gateway has a lot of information to communicate between workers, such as router rebuilds, health check results, and cluster sync. We had to find a way to solve this problem.
The solution is an event propagation mechanism. Before 3.0, Kong used a library named "lua-resty-worker-events"; in 3.0 we introduced a new library named "lua-resty-events" (looks similar, huh?).
This blog will cover how these two libraries work and why we decided to develop a new events library.
lua-resty-worker-events
OpenResty has a useful feature called shared memory, a memory area that every Nginx worker can read and write, which makes it a natural way to exchange information between Nginx processes.
How it works
lua-resty-worker-events is built entirely on shared memory.
Its mechanism is very simple: to notify other workers, a worker writes one or more event items into shared memory. Meanwhile, every worker runs a timer that polls shared memory for new events. lua-resty-worker-events is a core component of Kong Gateway, and many important features depend on it heavily, such as health checks, cluster sync, and route/target updates. It served us from version 0.10 all the way to 2.8.
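To make the polling model concrete, here is a minimal sketch in plain Lua. It is not the library's code: a Lua table stands in for the shared memory zone, and the names `shm`, `post_event`, `poll_events`, and `new_worker` are hypothetical. Each worker remembers the id of the last event it handled and, on every poll, reads whatever has been appended since then.

```lua
-- Simplified model of polling-based event delivery (an illustration only).
-- A plain table stands in for the shared memory zone; real workers would
-- use a lock-protected shared dictionary instead.
local shm = { last_id = 0, events = {} }

-- Hypothetical helper: append an event and bump the global counter.
local function post_event(source, name, data)
  shm.last_id = shm.last_id + 1
  shm.events[shm.last_id] = { source = source, name = name, data = data }
  return shm.last_id
end

-- Hypothetical helper: a worker handles every event newer than the last
-- id it has seen. In a real worker this would be driven by a timer.
local function poll_events(worker)
  local handled = 0
  while worker.seen_id < shm.last_id do
    worker.seen_id = worker.seen_id + 1
    worker.handler(shm.events[worker.seen_id])
    handled = handled + 1
  end
  return handled
end

local function new_worker(id)
  local w = { id = id, seen_id = 0, received = {} }
  w.handler = function(ev)
    w.received[#w.received + 1] = ev.name
  end
  return w
end

local w1, w2 = new_worker(1), new_worker(2)

post_event("balancer", "targets_updated", { upstream = "example" })
post_event("router", "rebuild", nil)

-- Nothing is delivered until each worker's next poll.
local n1 = poll_events(w1)
local n2 = poll_events(w2)
local idle = poll_events(w1) -- an idle poll still has to check the counter

print(n1, n2, idle)
```

Each worker picks up both events on its first poll, and the extra poll on the first worker finds nothing. In this model, delivery latency is bounded by the polling interval, and every poll costs a check whether or not anything was posted.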
Where it falls short
The problem, though, is that while Kong has moved forward over the years, lua-resty-worker-events has stayed where it started.
Here are some of the disadvantages:
- Every access to shared memory must take a lock, and the lock overhead is very high when there are many workers (36+).
- It polls for events with the OpenResty API ngx.timer.at, so it spends CPU even when no events occur.
- It checks for events only once per second, which is inefficient and cannot respond as quickly as we expected.
- Because the shared memory zone has a limited size, posting too many events overflows it and the subsequent events are lost.
lua-resty-events
In this era of cloud native deployments and huge traffic flows, we needed to change the situation. We needed a new event propagation mechanism that could deliver events to each worker more cheaply and efficiently.
In fact, Nginx itself already has an efficient event mechanism: the event loop, built on Linux epoll and Nginx's asynchronous callbacks. With this mechanism alone, Nginx can handle 10k to 100k concurrent connections, and it works as the core of Kong Gateway.
So we had an interesting idea: why not drop shared memory and turn Nginx itself into a dedicated event notification system? We wrote some POC code to verify the idea and then implemented a new events library called lua-resty-events.
How it works
Basically, this library is a classic publish/subscribe system, a widely used and proven mechanism for asynchronous message/event dispatch.
We have three roles in this Pub/Sub system:
- event-broker: a service that runs in one and only one Nginx worker.
- event-publisher: any Nginx worker that wants to post an event.
- event-subscriber: each Nginx worker that wishes to receive events.
The last two roles are normally played by the same Nginx worker, which we can also call an event-user or event-client. Correspondingly, the event-broker can be called the event-server.
The event-broker listens on a UNIX domain socket, and each event-user connects to it with a Lua cosocket (OpenResty).
So when an event-user wants to publish an event, all it needs to do is send a serialized message to the event-broker over the socket. The event-broker receives it right away, with no polling delay, and then broadcasts it to all event-users. Thanks to Nginx's efficiency, the event-users receive the event message almost simultaneously, check it, and run their handlers.
This process needs neither shared memory nor ngx.timer, so it eliminates the cost of locking and polling. The result is fast and responsive, and all the work rests on the underlying framework, thanks to Nginx and OpenResty.
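The roles can be sketched in plain Lua as follows. This is only an in-memory model, not the library's implementation: a per-client queue stands in for each socket connection, and the names `new_broker`, `Broker:connect`, `Broker:publish`, and `drain` are hypothetical.

```lua
-- In-memory model of the broker/user roles (an illustration only).
-- Real event-users talk to the broker over a UNIX domain socket; here a
-- per-client queue stands in for each connection.
local Broker = {}
Broker.__index = Broker

local function new_broker()
  return setmetatable({ clients = {} }, Broker)
end

-- An event-user "connects" and gets its own outgoing queue.
function Broker:connect(worker_id)
  local client = { worker_id = worker_id, queue = {} }
  self.clients[#self.clients + 1] = client
  return client
end

-- The broker receives one message and fans it out to every connection,
-- including the publisher's own.
function Broker:publish(msg)
  for _, client in ipairs(self.clients) do
    client.queue[#client.queue + 1] = msg
  end
end

-- An event-user drains its queue and runs its callback once per message.
local function drain(client, callback)
  local count = #client.queue
  for i = 1, count do
    callback(client.queue[i])
  end
  client.queue = {}
  return count
end

local broker = new_broker()
local users = {}
for id = 0, 3 do
  users[#users + 1] = broker:connect(id)
end

-- Any worker can publish; every connected worker receives the event.
broker:publish("router:rebuild")

local delivered = 0
for _, user in ipairs(users) do
  delivered = delivered + drain(user, function(msg)
    assert(msg == "router:rebuild")
  end)
end
print(delivered)
```

With four connected workers, a single publish results in four deliveries. The important property is that a worker does nothing until a message actually arrives on its connection, in contrast to waking up on a timer to look for work.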
Details of Implementation
The first thing we should mention is how to implement event-broker.
UNIX domain sockets are a built-in feature of Nginx and are easy to configure, for example:
server {
    listen unix:/path/to/events.sock;
}
We chose UNIX domain socket for two reasons:
- It has less overhead than the TCP/IP stack.
- There is no need to worry about port conflicts.
But Nginx accepts each connection in whichever worker process happens to pick it up, so event-users would end up connected to different event-brokers, which makes them hard to manage and makes it hard to broadcast to all of them. The key is to restrict Nginx so that only one worker serves clients on this socket.
The lua-resty-events library contains a small piece of C code that does this job. It borrows from the Nginx source code and can disable specific listening sockets, and it is exposed to Lua land through an FFI interface. The rest is easy: in OpenResty's init_worker phase, we call this FFI function to disable listening in every worker except one, and that worker becomes the event-broker process.
local is_broker = ngx_worker_id() == opts.broker_id
-- only keep listening enabled on the broker's worker id
if not is_broker then
    disable_listening(opts.listening)
end
Implementation of communication protocol
The event propagation mechanism must work for both the HTTP subsystem and the stream subsystem. WebSocket is a good binary protocol in the OpenResty HTTP subsystem, but it does not work in the stream subsystem, so we had to design our own protocol to transmit messages.
The protocol in lua-resty-events is modeled on the lua-resty-websocket library; you can think of it as a simplified version.
In the stream subsystem, the protocol uses the raw socket directly. In the HTTP subsystem, it does the same thing as lua-resty-websocket: it sends an HTTP upgrade request and then relies on the 101 response code to switch to stream communication.
sock:connect(addr)
-- in the HTTP subsystem, ask the broker to upgrade the connection first
if subsystem == "http" then
    local req = "GET / HTTP/1.1\r\n" ..
                "Host: localhost\r\n" ..
                "Connection: Upgrade\r\n" ..
                "Upgrade: Kong-Worker-Events/1\r\n\r\n"
    sock:send(req)
end
After this handshake, we define a binary message format:
Len (2 bytes) + Payload (up to 65535 bytes).
It's simple, but it is enough to carry serialized event messages. It is also invisible to outside users, so it can be changed at any time.
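A length-prefixed format like this can be encoded and decoded in a few lines of plain Lua. The sketch below is an illustration rather than the library's code: the function names `encode_frame` and `decode_frame` are hypothetical, and the byte order of the length field (big-endian here) is an assumption, since the description above does not specify it.

```lua
-- Length-prefixed framing: Len (2 bytes) + Payload (up to 65535 bytes).
-- Assumption: the length is an unsigned big-endian integer; the byte
-- order is not stated in the text.
local MAX_PAYLOAD_LEN = 65535

local function encode_frame(payload)
  local len = #payload
  if len > MAX_PAYLOAD_LEN then
    return nil, "payload too big"
  end
  local hi = math.floor(len / 256)
  local lo = len % 256
  return string.char(hi, lo) .. payload
end

-- Decodes one frame from the start of `data`.
-- Returns the payload and the unconsumed remainder, or nil and an error
-- when `data` does not yet hold a complete frame.
local function decode_frame(data)
  if #data < 2 then
    return nil, "incomplete header"
  end
  local hi, lo = string.byte(data, 1, 2)
  local len = hi * 256 + lo
  if #data < 2 + len then
    return nil, "incomplete payload"
  end
  return string.sub(data, 3, 2 + len), string.sub(data, 3 + len)
end

-- Two frames back to back, as they might appear on a stream socket.
local frame = assert(encode_frame("hello"))
local stream = frame .. assert(encode_frame("world!"))

local first, rest = decode_frame(stream)
local second, tail = decode_frame(rest)
print(#frame, first, second, #tail)
```

The 2-byte length is what caps the payload at 65535 bytes: that is the largest value two bytes can represent. Because every frame announces its own size, the receiver can split a continuous byte stream back into messages without any delimiter.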
The points above are the main things to note about our lua-resty-events library. Compared to lua-resty-worker-events, we also made some other changes, such as using string.buffer, ngx.semaphore, and ngx.thread.
If you are interested in this library please view its GitHub repository: https://github.com/Kong/lua-resty-events.
Benchmarking performance
To check the performance of the lua-resty-events library, we set up a test environment to generate load on OpenResty 1.21.4.1.
For this benchmark, we used three AWS c4.8xlarge VMs, each with 36 cores, 60GB of memory, and 1000M network bandwidth: one for the wrk client, one for the OpenResty server, and one for the Nginx backend.
We also customized the nginx.conf for OpenResty:
- worker_processes auto
- worker_rlimit_nofile 500000
- worker_connections 400000
- lua_socket_log_errors off
- access_log off
- A timer started in the init_worker phase posts a large number of events; the event callback is just an empty function.
We set up two test cases:
- Post 1000 events every 0.5 seconds in timer context
- Post 10000 events every 0.1 seconds in timer context
Then we ran wrk to apply load. The test command is:
wrk -c 10000 -t 36 -d 3m --latency http://172.31.23.50:8080
The results:
Test case     RPS                            P99 latency (ms)
              resty-events   worker-events   resty-events   worker-events
1000/0.5s     404281.11      404294.46       958.94         1180
10000/0.1s    403709.14      262086.92       942.29         1310

As you can see, in the first case the RPS of lua-resty-events and lua-resty-worker-events is almost the same, but in the second case lua-resty-events is clearly better, with RPS about 54% higher (403709.14 vs. 262086.92). This shows that lua-resty-events performs better under a massive volume of events.
In both cases, the P99 latency of lua-resty-events is lower than that of lua-resty-worker-events.
We also observed that in the second case the CPU usage of lua-resty-worker-events was about 70%-90%, while lua-resty-events stayed at a stable 50% or so. The reason is that lua-resty-worker-events spends extra CPU manipulating shared memory, a cost that lua-resty-events does not pay.
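For readers who want to check the relative differences themselves, the short plain-Lua calculation below works them out from the reported figures for the heavier test case. The table layout and the helper name `percent_diff` are just for illustration.

```lua
-- Relative differences computed from the reported benchmark figures.
-- Each pair is { resty-events, worker-events }.
local results = {
  ["1000/0.5s"]  = { rps = { 404281.11, 404294.46 }, p99 = { 958.94, 1180 } },
  ["10000/0.1s"] = { rps = { 403709.14, 262086.92 }, p99 = { 942.29, 1310 } },
}

-- Percentage by which `a` differs from the baseline `b`.
local function percent_diff(a, b)
  return (a - b) / b * 100
end

local heavy = results["10000/0.1s"]
local rps_gain = percent_diff(heavy.rps[1], heavy.rps[2])
local p99_change = percent_diff(heavy.p99[1], heavy.p99[2])

print(string.format("RPS: %+.1f%%, P99 latency: %+.1f%%", rps_gain, p99_change))
-- prints "RPS: +54.0%, P99 latency: -28.1%"
```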
What's next
The lua-resty-events library is the successor of the lua-resty-worker-events library. It implements event propagation with completely different technology: a UNIX domain socket, a small C code trick, and a customized binary protocol.
Thanks to these changes, the lua-resty-events library is faster and more efficient than its predecessor. It has been integrated into Kong 3.0, which will make Kong Gateway even more powerful. The library is still under active development, and we will continue to improve it and add more features.
Please don't hesitate to share your thoughts and experiences with us at Kong Nation. If you encounter any bugs or issues with the library, please open an Issue on our GitHub repository as well.
Blog post co-authored by: Chrono Luo and Guanlan Dai