In revision.
Crisp5 min readGo deeper →

CPU scheduling algorithms

Linux uses CFS (now EEVDF) for fairness; real-time policies SCHED_FIFO and SCHED_DEADLINE for hard deadlines.

The answer

Linux runs the Completely Fair Scheduler (CFS) for normal tasks, replaced by EEVDF (Earliest Eligible Virtual Deadline First) in kernel 6.6+. Both target fairness: every runnable task gets a proportional share of CPU based on its weight (nice value). For deterministic latency you opt into a real-time class: SCHED_FIFO (run until you block or someone higher priority arrives), SCHED_RR (FIFO with time slices), or SCHED_DEADLINE (specify period, deadline, runtime, the kernel guarantees you meet it).

Scheduler picks run in O(log N) time via a red-black tree keyed by virtual runtime (CFS) or virtual deadline (EEVDF).

Scheduling classes, priority order

  1. SCHED_DEADLINE: bandwidth-reserved real-time, EDF + CBS.
  2. SCHED_FIFO and SCHED_RR: priority-based real-time (1-99).
  3. SCHED_NORMAL: CFS/EEVDF. Default for everything.
  4. SCHED_BATCH: like NORMAL but assumes CPU-bound, less wakeup boost.
  5. SCHED_IDLE: only runs when nothing else wants to.

A higher class always wins. Within NORMAL, nice value (-20 to +19) controls weight. Each step of nice changes CPU share by ~1.25x. Nice -20 vs nice 0 means roughly 10x the CPU share.

How CFS works

Each task has a virtual runtime (vruntime). Lower vruntime means "less CPU consumed, more deserving." The scheduler picks the leftmost task in a red-black tree ordered by vruntime, runs it for up to sched_min_granularity_ns (default 2.25ms), then re-checks.

Higher-priority tasks (lower nice) accumulate vruntime more slowly, so they stay leftmost longer. This is the "fair" in CFS.

How EEVDF differs

EEVDF (in 6.6+) adds a per-task latency knob. CFS only had weight; EEVDF lets you say "this task should get its share within X ms of becoming runnable." Latency-sensitive tasks get scheduled sooner without changing their total CPU share.

CFS/EEVDF loop: pick leftmost task, run, update vruntime, reinsert.

Real-time the right way

SCHED_FIFO and SCHED_RR have 99 priority levels (1 lowest, 99 highest). A FIFO task at any priority runs until it voluntarily yields, blocks, or a higher-priority RT task arrives. Never use SCHED_FIFO for CPU-bound code unless you have configured RT throttling (/proc/sys/kernel/sched_rt_runtime_us), or you will deadlock the box.

SCHED_DEADLINE is the modern choice for hard real-time. You declare runtime, deadline, period. The kernel admits the reservation if there is capacity, then guarantees you runtime ns of CPU within every period. Cleaner contract than priority juggling.

The interview answer

"Linux uses CFS, now EEVDF in 6.6, for general workloads. It picks the leftmost task in a red-black tree by virtual runtime, gives O(log N) selection, and tries to be fair according to nice values. For real-time work there's SCHED_FIFO, SCHED_RR, and the modern SCHED_DEADLINE. I avoid SCHED_FIFO unless I really need it because it will starve everything if your code spins. Most tuning is just adjusting nice or using cgroups for fairness across groups."

Learn more