There is a well-known problem called c10k. The essence of it is handling 10000 concurrent clients on a single server. This problem was posed in 1999 by Dan Kegel and at that time it made the industry rethink the way web servers were handling connections. The then-state-of-the-art solution of allocating a thread for each client started to fall apart in the face of the upcoming web scale. Nginx was born to solve this problem by embracing the event-driven I/O model provided by the shiny new epoll system call (on Linux).
Times were different back then; now we can get a really beefy server with a 10G network card, 32 cores and 256 GiB of RAM that can easily handle that number of clients, so c10k is not much of a problem even with threaded I/O. But anyway, I wanted to check how various solutions like threads and non-blocking async I/O would handle it, so I started to write some silly servers in my c10k repo, and then I got stuck because I needed some tools to test my implementations.
Basically, I needed a c10k client. And I actually wrote a couple – one in Go and the other in C with libuv. I'm also going to write one in Python 3 with asyncio.
While writing each client I found two peculiarities – how to make it bad and how to make it slow.
By making it bad I mean making it really c10k – creating a lot of connections to the server, thus saturating its resources.
I started with the client in Go and quickly stumbled upon the first roadblock. I made 10 concurrent HTTP requests with the plain "net/http" client, but there were only 2 TCP connections to the server.
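The naive client was doing roughly this (a minimal sketch reconstructed from the description above, not the exact code from the repo; the final sleep is only there so the process stays alive long enough to inspect its sockets):

package main

import (
    "io"
    "net/http"
    "sync"
    "time"
)

func main() {
    var wg sync.WaitGroup
    for i := 0; i < 10; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            // Plain net/http GET; the default Transport reuses connections.
            resp, err := http.Get("http://127.0.0.1/index.html")
            if err != nil {
                return
            }
            defer resp.Body.Close()
            io.Copy(io.Discard, resp.Body) // drain the body so the connection can be reused
        }()
    }
    wg.Wait()
    time.Sleep(time.Minute) // keep the process (and its idle connections) around for lsof
}

And lsof confirms it – only two established connections to the server: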
$ lsof -p $(pgrep go-client) -n -P
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
go-client 11959 avd cwd DIR 253,0 4096 1183846 /home/avd/go/src/github.com/dzeban/c10k
go-client 11959 avd rtd DIR 253,0 4096 2 /
go-client 11959 avd txt REG 253,0 6240125 1186984 /home/avd/go/src/github.com/dzeban/c10k/go-client
go-client 11959 avd mem REG 253,0 2066456 3151328 /usr/lib64/libc-2.26.so
go-client 11959 avd mem REG 253,0 149360 3152802 /usr/lib64/libpthread-2.26.so
go-client 11959 avd mem REG 253,0 178464 3151302 /usr/lib64/ld-2.26.so
go-client 11959 avd 0u CHR 136,0 0t0 3 /dev/pts/0
go-client 11959 avd 1u CHR 136,0 0t0 3 /dev/pts/0
go-client 11959 avd 2u CHR 136,0 0t0 3 /dev/pts/0
go-client 11959 avd 4u a_inode 0,13 0 12735 [eventpoll]
go-client 11959 avd 8u IPv4 68232 0t0 TCP 127.0.0.1:55224->127.0.0.1:80 (ESTABLISHED)
go-client 11959 avd 10u IPv4 68235 0t0 TCP 127.0.0.1:55230->127.0.0.1:80 (ESTABLISHED)
The same with ss¹:
$ ss -tnp dst 127.0.0.1:80
State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB 0 0 127.0.0.1:55224 127.0.0.1:80 users:(("go-client",pid=11959,fd=8))
ESTAB 0 0 127.0.0.1:55230 127.0.0.1:80 users:(("go-client",pid=11959,fd=10))
The reason for this is quite simple – HTTP 1.1 uses persistent connections (keep-alive) so that clients avoid the overhead of a TCP handshake on each HTTP request. Go's "net/http" fully implements this logic – it multiplexes multiple requests over a handful of TCP connections. This behavior can be tuned via Transport.
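If I wanted to tune it, that would look roughly like this (just a fragment to illustrate the Transport knobs, not what my client ends up doing):

// Hypothetical tuning of connection reuse via http.Transport.
tr := &http.Transport{
    MaxIdleConnsPerHost: 10,   // how many idle connections to keep per host
    DisableKeepAlives:   true, // true forces a new TCP connection per request
}
client := &http.Client{Transport: tr}
resp, err := client.Get("http://127.0.0.1/index.html")
if err == nil {
    resp.Body.Close()
}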
But I don't need to tune it, I need to avoid it. And we can avoid it by explicitly creating a TCP connection via net.Dial and then sending a single request over that connection. Here is the function that does it; it runs concurrently inside a dedicated goroutine.
// request dials the server directly so that every request gets its own TCP
// connection, bypassing net/http connection reuse. The delay parameter is
// used later in this post to emulate a slow client.
func request(addr string, delay int, wg *sync.WaitGroup) {
    defer wg.Done()

    conn, err := net.Dial("tcp", addr)
    if err != nil {
        log.Fatal("dial error ", err)
    }

    // Build a bare HTTP request and write it over our own connection.
    req, err := http.NewRequest("GET", "/index.html", nil)
    if err != nil {
        log.Fatal("failed to create http request")
    }
    req.Host = "localhost"

    err = req.Write(conn)
    if err != nil {
        log.Fatal("failed to send http request")
    }

    // Read the status line of the response.
    _, err = bufio.NewReader(conn).ReadString('\n')
    if err != nil {
        log.Fatal("read error ", err)
    }
}
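For completeness, the function is driven from main roughly like this (a sketch; in my actual client the number of connections, the address and the delay come from command-line flags):

var wg sync.WaitGroup
for i := 0; i < 10000; i++ {
    wg.Add(1)
    go request("127.0.0.1:80", delay, &wg) // delay comes into play later, when we make the client slow
}
wg.Wait()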
Let's check that it's working:
$ lsof -p $(pgrep go-client) -n -P
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
go-client 12231 avd cwd DIR 253,0 4096 1183846 /home/avd/go/src/github.com/dzeban/c10k
go-client 12231 avd rtd DIR 253,0 4096 2 /
go-client 12231 avd txt REG 253,0 6167884 1186984 /home/avd/go/src/github.com/dzeban/c10k/go-client
go-client 12231 avd mem REG 253,0 2066456 3151328 /usr/lib64/libc-2.26.so
go-client 12231 avd mem REG 253,0 149360 3152802 /usr/lib64/libpthread-2.26.so
go-client 12231 avd mem REG 253,0 178464 3151302 /usr/lib64/ld-2.26.so
go-client 12231 avd 0u CHR 136,0 0t0 3 /dev/pts/0
go-client 12231 avd 1u CHR 136,0 0t0 3 /dev/pts/0
go-client 12231 avd 2u CHR 136,0 0t0 3 /dev/pts/0
go-client 12231 avd 3u IPv4 71768 0t0 TCP 127.0.0.1:55256->127.0.0.1:80 (ESTABLISHED)
go-client 12231 avd 4u a_inode 0,13 0 12735 [eventpoll]
go-client 12231 avd 5u IPv4 73753 0t0 TCP 127.0.0.1:55258->127.0.0.1:80 (ESTABLISHED)
go-client 12231 avd 6u IPv4 71769 0t0 TCP 127.0.0.1:55266->127.0.0.1:80 (ESTABLISHED)
go-client 12231 avd 7u IPv4 71770 0t0 TCP 127.0.0.1:55264->127.0.0.1:80 (ESTABLISHED)
go-client 12231 avd 8u IPv4 73754 0t0 TCP 127.0.0.1:55260->127.0.0.1:80 (ESTABLISHED)
go-client 12231 avd 9u IPv4 71771 0t0 TCP 127.0.0.1:55262->127.0.0.1:80 (ESTABLISHED)
go-client 12231 avd 10u IPv4 71774 0t0 TCP 127.0.0.1:55268->127.0.0.1:80 (ESTABLISHED)
go-client 12231 avd 11u IPv4 73755 0t0 TCP 127.0.0.1:55270->127.0.0.1:80 (ESTABLISHED)
go-client 12231 avd 12u IPv4 71775 0t0 TCP 127.0.0.1:55272->127.0.0.1:80 (ESTABLISHED)
go-client 12231 avd 13u IPv4 73758 0t0 TCP 127.0.0.1:55274->127.0.0.1:80 (ESTABLISHED)
$ ss -tnp dst 127.0.0.1:80
State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB 0 0 127.0.0.1:55260 127.0.0.1:80 users:(("go-client",pid=12231,fd=8))
ESTAB 0 0 127.0.0.1:55262 127.0.0.1:80 users:(("go-client",pid=12231,fd=9))
ESTAB 0 0 127.0.0.1:55270 127.0.0.1:80 users:(("go-client",pid=12231,fd=11))
ESTAB 0 0 127.0.0.1:55266 127.0.0.1:80 users:(("go-client",pid=12231,fd=6))
ESTAB 0 0 127.0.0.1:55256 127.0.0.1:80 users:(("go-client",pid=12231,fd=3))
ESTAB 0 0 127.0.0.1:55272 127.0.0.1:80 users:(("go-client",pid=12231,fd=12))
ESTAB 0 0 127.0.0.1:55258 127.0.0.1:80 users:(("go-client",pid=12231,fd=5))
ESTAB 0 0 127.0.0.1:55268 127.0.0.1:80 users:(("go-client",pid=12231,fd=10))
ESTAB 0 0 127.0.0.1:55264 127.0.0.1:80 users:(("go-client",pid=12231,fd=7))
ESTAB 0 0 127.0.0.1:55274 127.0.0.1:80 users:(("go-client",pid=12231,fd=13))
I also decided to make a C client built on top of libuv for a convenient event loop.
In my C client, there is no HTTP library, so we're making raw TCP connections from the start. It works well by creating a connection for each request, so it doesn't have the problem (more like a feature :-) of the Go client. But when it finishes reading the response it gets stuck and doesn't return control to the event loop until a very long timeout.
Here is the response reading callback where it seems to get stuck:
static void on_read(uv_stream_t* stream, ssize_t nread, const uv_buf_t* buf)
{
    if (nread > 0) {
        /* Got another chunk of the response, just dump it. */
        printf("%s", buf->base);
    } else if (nread == UV_EOF) {
        /* Server closed the connection - tear down our side. */
        log("close stream");
        uv_connect_t *conn = uv_handle_get_data((uv_handle_t *)stream);
        uv_close((uv_handle_t *)stream, free_close_cb);
        free(conn);
    } else {
        return_uv_err(nread);
    }

    free(buf->base);
}
It appears that we're stuck here and wait for some (quite long) time until we finally get EOF.
This "quite long time" is actually the HTTP keepalive timeout set in nginx, and by default it's 75 seconds.
We can control it on the client side, though, with the Connection and Keep-Alive HTTP headers, which are part of HTTP 1.1.
And that's the only sane solution, because on the libuv side I had no good way to decide when to close the connection – I don't receive EOF, since EOF arrives only when the connection is actually closed by the server.
So what is happening is that my client creates a connection, sends a request, nginx replies, and then nginx keeps the connection open because it waits for subsequent requests. Tinkering with libuv showed me that, and that's why I love making things in C – you have to dig really deep and really understand how things work.
So to fix these hanging requests I just set the Connection: close header to force a new connection for each request from the same client and to disable HTTP keepalive. As an alternative, I could insist on HTTP 1.0, where there is no keep-alive.
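The C client just puts Connection: close straight into the raw request bytes it writes. In the Go client the same thing can be done by marking the request as one-shot before writing it; as far as I can tell, setting req.Close makes net/http emit the Connection: close header (setting the header explicitly works too). A sketch of the relevant part of request():

req, err := http.NewRequest("GET", "/index.html", nil)
if err != nil {
    log.Fatal("failed to create http request")
}
req.Host = "localhost"
req.Close = true // emit "Connection: close" so the server drops the connection after the response
err = req.Write(conn)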
Now that it's creating lots of connections, let's make it keep those connections open for a client-specified delay, to appear as a slow client.
I needed to make it slow because I wanted my server to spend some time handling the requests while avoiding putting sleeps in the server code.
Initially, I thought of making reading on the client side slow, i.e. reading one byte at a time or delaying reading of the server response. Interestingly, neither of these solutions worked.
I tested my client against nginx by watching the access log with the $request_time variable. Needless to say, all of my requests were served in 0.000 seconds. Whatever delay I inserted, nginx seemed to ignore it.
I started to figure out why by tweaking various parts of the request-response pipeline like the number of connections, response size, etc.
Finally, I was able to see my delay only when nginx was serving a really big file, like 30 MB, and that's when it clicked.
The whole reason for this delay-ignoring behavior was socket buffers. Socket buffers are, well, buffers for sockets; in other words, it's the piece of memory where the Linux kernel buffers network requests and responses for performance reasons – to send data in big chunks over the network and to mitigate slow clients – and also for other things like TCP retransmission. Socket buffers are like the page cache – all network I/O (with the page cache it's disk I/O) goes through them unless that is explicitly bypassed.
So in my case, when nginx received a request, the response written by the send/write syscall was merely stored in the socket buffer, but from nginx's point of view it was already sent. Only when the response was too large to fit into the socket buffer would nginx block in the syscall and wait until the client's delay had elapsed and the socket buffer had been read and freed for the next portion of data.
You can check and tune the size of the socket buffers in /proc/sys/net/ipv4/tcp_rmem and /proc/sys/net/ipv4/tcp_wmem.
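For example (commands only – each file holds three values: min, default and max buffer size in bytes, and the actual numbers vary between kernels; they can be changed with sysctl -w):

$ cat /proc/sys/net/ipv4/tcp_rmem
$ cat /proc/sys/net/ipv4/tcp_wmem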
So after figuring this out, I inserted the delay after establishing the connection and before sending a request.
This way the server will keep client connections around (yay, c10k!) for the client-specified delay.
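In the Go client this boils down to a single sleep inside request(), right between net.Dial and writing the request (a sketch, assuming the delay is given in seconds):

conn, err := net.Dial("tcp", addr)
if err != nil {
    log.Fatal("dial error ", err)
}

// Emulate a slow client: hold the freshly established connection idle
// for the client-specified delay before sending the request.
time.Sleep(time.Duration(delay) * time.Second)

// ...and only after that build and write the HTTP request over conn, as before.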
So in the end, I have 2 c10k clients – one written in Go and the other written in C with libuv. The Python 3 client is on its way.
All of these clients connect to the HTTP server, wait for a specified delay and then send a GET request with the Connection: close header.
This makes the HTTP server keep a dedicated connection for each request and spend some time waiting, which emulates I/O.
That’s how my c10k clients work.
¹ ss stands for socket statistics; it's a more versatile tool for inspecting sockets than netstat.