Python memory model and reference counting
CPython uses reference counting with a backup cycle collector. Objects are freed as soon as their refcount hits zero; cycles need the GC to break them.
CPython manages memory in two layers: a per-object reference count and a generational cycle collector. Most objects are freed by refcount alone; the GC only runs to catch reference cycles that refcounting can't resolve.
Reference counting
Every PyObject has a refcount field. Operations that create a reference (x = obj, lst.append(obj), passing as argument) increment it. Operations that release one (del x, function return, replacing a list element) decrement it. When the count hits zero, the object is freed immediately and its references are decremented in turn (which can cascade).
import sys
x = []
sys.getrefcount(x) # 2: x itself plus the argument to getrefcountPros:
- Deterministic. Objects die exactly when they become unreachable.
- Low pause times. Freeing is incremental, spread across normal operations.
- No need for stop-the-world scans for most memory.
Cons:
- Overhead on every assignment.
- Cannot collect reference cycles. If A references B and B references A, both have refcount 1 even with no external references.
The cycle collector
Python has a backup garbage collector that runs periodically to find cycles. It tracks "container" objects (lists, dicts, classes, anything that can hold references to other Python objects). Atomic objects (int, str, float) are not tracked because they can't form cycles.
The cycle GC is generational with three generations (0, 1, 2). New container objects start in gen 0. Survivors get promoted. Gen 0 is collected often, gen 2 rarely.
import gc
gc.collect() # Force a full collection
gc.get_count() # Allocations since last collection per generation
gc.disable() # Turn off automatic collectionWhy both
Refcounting handles the common case fast. Most Python objects die simply: a variable goes out of scope, refcount drops to zero, memory freed. No scan, no pause.
The cycle collector handles the rare case. Doubly-linked lists, parent/child references, callback closures - any structure where two objects reference each other. Without the GC these would leak.
Practical notes
- Use
weakrefto refer to an object without preventing its collection. Caches, observers, parent backrefs. __slots__on classes skips the per-instance dict, saving memory and avoiding cycles via attribute holes.- The
__del__method is called when refcount hits zero. Avoid heavy work there; it runs in the middle of whatever was decrementing the count. - Free-threaded Python (PEP 703) keeps reference counting but makes operations atomic, with a biased-refcount optimization for the owning thread.
The model: refcount for the simple case, GC for the cyclic case. Together they keep Python memory-safe without manual free.
Learn more
- DocsPython docs: Memory Managementpython.org
- DocsPython docs: gc modulepython.org
- Talk