epoll, kqueue, io_uring

epoll for Linux, kqueue for BSDs and macOS, io_uring for modern Linux. Same problem, three eras of answers.

The trio

| API | OS | Year | Throughput at 10k fds |
| --- | --- | --- | --- |
| select | Everything | 1983 | Hits FD_SETSIZE wall |
| poll | POSIX | 1986 | Linear scan, slow |
| epoll | Linux 2.6+ | 2002 | Solved C10K |
| kqueue | FreeBSD 4.1, macOS | 2000 | Same shape as epoll, also watches signals, timers, fs |
| io_uring | Linux 5.1+ | 2019 | Solved C10M and storage too |

epoll and kqueue answer "which fd is ready." io_uring answers "do this operation and tell me when it's done."

epoll, three syscalls

  • epoll_create1(0): returns an epoll fd.
  • epoll_ctl(epfd, op, fd, event): add, modify, or remove an fd from the interest set.
  • epoll_wait(epfd, events, max, timeout): returns ready events.

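Edge-triggered (EPOLLET) reports an fd only when it becomes ready, which cuts repeat wakeups, but it requires non-blocking fds and "read until EAGAIN" discipline: stop early and the leftover data produces no further event. Level-triggered, the default, keeps reporting the fd while data remains, so it is safer.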
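
The difference between the two modes is easier to see by counting wakeups. The sketch below uses Python's standard select.epoll wrapper (Linux only) over a socketpair; the function names count_wakeups, drain, and edge_triggered_read are made up for this illustration, and the 4-byte reads are deliberately too small so that data is left behind.

```python
import select
import socket


def count_wakeups(edge_triggered):
    """Send 8 bytes, read only 4 per wakeup, count how often epoll reports the fd."""
    reader, writer = socket.socketpair()
    reader.setblocking(False)
    mask = select.EPOLLIN | (select.EPOLLET if edge_triggered else 0)
    wakeups = 0
    with select.epoll() as ep:                  # epoll_create1
        ep.register(reader.fileno(), mask)      # epoll_ctl(EPOLL_CTL_ADD)
        writer.send(b"12345678")
        while ep.poll(0.1):                     # epoll_wait, 100 ms timeout
            wakeups += 1
            reader.recv(4)                      # deliberately leave data behind
    reader.close()
    writer.close()
    return wakeups


def drain(sock):
    """The "read until EAGAIN" discipline that edge-triggered mode needs."""
    chunks = []
    while True:
        try:
            chunk = sock.recv(4)
        except BlockingIOError:                 # EAGAIN / EWOULDBLOCK
            break
        if not chunk:                           # peer closed the connection
            break
        chunks.append(chunk)
    return b"".join(chunks)


def edge_triggered_read():
    """One edge-triggered wakeup, fully drained before waiting again."""
    reader, writer = socket.socketpair()
    reader.setblocking(False)
    data = b""
    with select.epoll() as ep:
        ep.register(reader.fileno(), select.EPOLLIN | select.EPOLLET)
        writer.send(b"12345678")
        for fd, events in ep.poll(0.1):
            if events & select.EPOLLIN:
                data += drain(reader)
    reader.close()
    writer.close()
    return data
```

With level-triggered registration, count_wakeups(False) sees the fd reported twice, once per partial read. With EPOLLET, count_wakeups(True) sees it once, and the remaining 4 bytes sit unread until new data arrives, which is why edge_triggered_read drains the socket on every wakeup.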

kqueue, two syscalls

  • kqueue(): returns a kq fd.
  • kevent(kq, changelist, n, eventlist, m, timeout): register interest AND fetch events in one call.

kqueue also watches: signals, timers, file changes (vnode events), process exit (proc events). On macOS it is the native mechanism behind most network servers and many file watchers.
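
As a rough illustration of the "register and fetch in one call" shape, here is a sketch using Python's standard select.kqueue wrapper, which exists only on BSD-family systems and macOS. The function name kqueue_wait_readable is hypothetical; kq.control is Python's name for the kevent() call.

```python
import select
import socket


def kqueue_wait_readable(timeout=1.0):
    """Register read interest and fetch the resulting event with a single kevent() call."""
    reader, writer = socket.socketpair()
    kq = select.kqueue()                                    # kqueue()
    try:
        change = select.kevent(
            reader.fileno(),
            filter=select.KQ_FILTER_READ,
            flags=select.KQ_EV_ADD,
        )
        writer.send(b"ping")
        # One call: apply the changelist AND return up to 1 ready event.
        events = kq.control([change], 1, timeout)
        data = b""
        for ev in events:
            if ev.ident == reader.fileno() and ev.filter == select.KQ_FILTER_READ:
                data = reader.recv(1024)
        return data
    finally:
        kq.close()
        reader.close()
        writer.close()
```

The same changelist mechanism accepts other filters, such as KQ_FILTER_VNODE or KQ_FILTER_PROC, which is how one kqueue covers the non-socket events listed above.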

io_uring, ring buffers

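Two mmap'd queues shared with the kernel. Userspace writes ops into the submission ring (SQ); the kernel writes results into the completion ring (CQ). One io_uring_enter syscall submits a whole batch and can also wait for completions. In SQPOLL mode, a kernel thread polls the SQ, so while that thread is awake you can submit without any syscall at all; once it goes idle, an io_uring_enter call is needed to wake it.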
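
The submit-then-reap flow can be modeled in a few lines. What follows is a toy simulation in plain Python, not the real io_uring interface: ToyRing, Sqe, Cqe, submit, enter, and reap are all invented for this sketch, the "kernel" is just a method call, and the queues are ordinary deques instead of shared memory. It only shows the shape of the protocol: queue several operations, cross the boundary once, then match completions to requests by user_data.

```python
import os
from collections import deque
from dataclasses import dataclass
from typing import Callable


@dataclass
class Sqe:
    """Toy submission queue entry: an operation plus a caller-chosen tag."""
    op: Callable[[], int]
    user_data: int


@dataclass
class Cqe:
    """Toy completion queue entry: the tag and a result (negative errno on failure)."""
    user_data: int
    res: int


class ToyRing:
    def __init__(self, entries):
        self.entries = entries
        self.sq = deque()        # submission ring
        self.cq = deque()        # completion ring
        self.enter_calls = 0     # how many times we "crossed into the kernel"

    def submit(self, op, user_data):
        """Userspace side: queue an op without any syscall."""
        if len(self.sq) >= self.entries:
            raise BufferError("submission ring full")
        self.sq.append(Sqe(op, user_data))

    def enter(self):
        """Stand-in for io_uring_enter: consume every queued SQE in one call."""
        self.enter_calls += 1
        done = 0
        while self.sq:
            sqe = self.sq.popleft()
            try:
                res = sqe.op()
            except OSError as exc:
                res = -(exc.errno or 1)
            self.cq.append(Cqe(sqe.user_data, res))
            done += 1
        return done

    def reap(self):
        """Userspace side: drain completions, again without a syscall."""
        while self.cq:
            yield self.cq.popleft()


def demo():
    r, w = os.pipe()
    ring = ToyRing(entries=8)
    for tag, msg in enumerate([b"a", b"bb", b"ccc"]):
        ring.submit(lambda m=msg: os.write(w, m), user_data=tag)
    ring.enter()                                   # one boundary crossing for the batch
    results = {c.user_data: c.res for c in ring.reap()}
    os.close(r)
    os.close(w)
    return results, ring.enter_calls
```

In this model, demo() queues three writes and completes all of them with a single enter call; each completion carries the user_data tag of the request it answers, which is how real code matches results to requests when they complete out of order.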

Supported operations: read, write, recv, send, accept, connect, openat, close, fsync, fallocate, splice, sendmsg/recvmsg. Supported features: linked ops, fixed buffers, fixed files. Most I/O operations are covered, though each opcode needs a kernel new enough to include it.

io_uring: shared rings between userspace and kernel, batched submission, batched completion.

When to use which

  • Network server on Linux, broad compat: epoll. Battle-tested. Every framework supports it.
  • macOS or BSD development: kqueue. Native, fast, also handles non-fd events.
  • High-throughput Linux server, recent kernel (5.10+): io_uring. Especially for storage or proxies pushing millions of ops/sec.
  • Cross-platform code: use a library (libuv, mio, libevent) that abstracts over the platform backends. Most wrap epoll and kqueue; io_uring support varies by library.
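
To show what such an abstraction looks like from the caller's side, here is a small sketch using Python's standard selectors module. It is not one of the libraries named above, but it plays the same role: DefaultSelector picks epoll on Linux and kqueue on BSDs and macOS, falling back to poll or select elsewhere. It does not cover io_uring. The function name echo_once is hypothetical.

```python
import selectors
import socket


def echo_once(payload):
    """Wait for readability through whichever backend the platform offers."""
    sel = selectors.DefaultSelector()
    a, b = socket.socketpair()
    a.setblocking(False)
    b.setblocking(False)
    sel.register(b, selectors.EVENT_READ)
    a.send(payload)
    received = b""
    for key, mask in sel.select(timeout=1.0):
        if mask & selectors.EVENT_READ:
            received = key.fileobj.recv(1024)
    backend = type(sel).__name__    # e.g. EpollSelector or KqueueSelector
    sel.close()
    a.close()
    b.close()
    return backend, received
```

The calling code never names epoll or kqueue; only the returned backend name reveals which one was selected at runtime.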

The interview answer

"epoll is the Linux solution for I/O multiplexing: register fds once, get O(K) wakeups for K ready fds. kqueue is the BSD equivalent with a broader scope. io_uring is the modern Linux answer that goes beyond readiness notification: you submit operations via a shared ring, the kernel does them, and writes completions back. One syscall handles thousands of ops. It's also the right answer for async disk I/O, which epoll never solved properly."
