Idempotency keys

How Stripe-style idempotency actually works, the transactional traps, and how to retrofit it into an existing API.

Every distributed system has retries. Retries exist because networks lie. The client thinks the request failed. The server actually committed it. The retry doubles the charge, creates the second invoice, sends the duplicate email. Idempotency keys are how you make retries safe without sacrificing throughput or correctness.

What idempotency means precisely

A function f is idempotent if f(f(x)) = f(x). In APIs, this translates to: calling the endpoint N times with the same input produces the same result as calling it once. Notably, the result includes both the server state and the response body. A real-world idempotent endpoint:

Creates the resource exactly once.
Returns the exact same response body every time, including the same generated IDs and timestamps.
Produces each observable side effect exactly once (one webhook, one Stripe charge, one email).

By specification, HTTP's GET, PUT, and DELETE are idempotent. POST is not. In practice, all of them benefit from explicit idempotency keys because real systems have races, partial failures, and eventually consistent caches that violate the spec.

The Stripe pattern

Client supplies Idempotency-Key: <UUID> header. Server flow:

Read key from header. If missing on a critical endpoint, reject with 400 (or generate one, log a warning).
Hash the request body.
Look up the key in the idempotency store.
If found and request hash matches and status is succeeded, return cached response.
If found and request hash matches and status is in_progress, return 409 or wait.
If found and request hash does NOT match, return 422 (key reuse with different body is a client bug).
If not found, insert with status=in_progress, do the work, update with status=succeeded and the response body, return.
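
The steps above can be sketched as a single function. This is a minimal illustration, not Stripe's implementation: the in-memory `store` dict, the `handle` and `request_hash` names, and the `(status, body)` return shape are all assumptions made for the example. A real store needs an atomic insert-if-absent, which a plain dict does not give you across processes.

```python
import hashlib
import json

# Hypothetical in-memory store: key -> {"hash", "status", "response"}.
store = {}

def request_hash(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the body."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def handle(key, body, do_work):
    """Return (http_status, response_body) following the flow above."""
    if not key:
        return 400, {"error": "Idempotency-Key header is required"}
    digest = request_hash(body)
    record = store.get(key)
    if record is not None:
        if record["hash"] != digest:
            return 422, {"error": "key reused with a different body"}
        if record["status"] == "in_progress":
            return 409, {"error": "original request still in progress"}
        return 200, record["response"]  # replay the cached response
    # Not found: claim the key, do the work, then store the result.
    store[key] = {"hash": digest, "status": "in_progress", "response": None}
    response = do_work(body)
    store[key].update(status="succeeded", response=response)
    return 200, response
```

Calling `handle("k1", {"amount": 100}, work)` twice runs `work` once and returns the same body both times. The same key with `{"amount": 1000}` gets a 422.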

The TTL is typically 24 hours. Long enough for any legitimate retry, short enough that storage stays manageable.

[Figure: the full idempotency lookup and execution flow]

The transactional trap, in detail

Naive implementation:

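def create_charge_naive(K):
    # Everything, including the external call, runs inside one DB transaction.
    with db.transaction():
        db.insert(idempotency_keys, key=K, status='in_progress')
        # External side effect: a DB rollback cannot undo it.
        charge = stripe.create_charge(amount=5000)
        db.update(idempotency_keys, key=K, status='succeeded', response=charge)
    return charge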

What happens if stripe.create_charge succeeds but the connection drops before db.update? The transaction rolls back. The idempotency row vanishes. The retry sees no key, calls Stripe again, charges twice. You just lost the customer.

Three ways to fix it.

Fix 1: split the transaction

Commit the idempotency row before the side effect.

# Transaction 1: commit the key before touching Stripe
with db.transaction():
    db.insert(idempotency_keys, key=K, status='in_progress')

# Side effect, outside any transaction
charge = stripe.create_charge(amount=5000)

# Transaction 2: record the outcome
with db.transaction():
    db.update(idempotency_keys, key=K, status='succeeded', response=charge)

Now if Stripe succeeds but the second transaction fails, the next retry sees in_progress and knows to recover. The "recover" step is the hard part: you need to either replay the Stripe call with the same idempotency key you sent the first time (Stripe returns the stored result instead of charging again, but only if the original call carried a key) or reconcile manually.

Fix 2: outbox pattern

Insert the idempotency row and a side-effect record in the same transaction. A background worker reads the side-effect record, performs the work, and marks it complete. The work itself is now decoupled from the request. See the transactional outbox section.
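
A minimal sketch of the outbox idea, using sqlite3 as a stand-in for your real database. The table layout, `accept_request`, `run_worker`, and the `perform` callback are hypothetical names for illustration. Note the gap that remains: if the worker dies after `perform` but before marking the row done, `perform` runs again, so the side effect itself still has to be idempotent.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE idempotency_keys (
        key TEXT PRIMARY KEY, status TEXT NOT NULL, response TEXT);
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY, idem_key TEXT NOT NULL,
        payload TEXT NOT NULL, done INTEGER NOT NULL DEFAULT 0);
""")

def accept_request(key, payload):
    """Record the key and the pending side effect in ONE transaction."""
    with conn:  # commits on success, rolls back on exception
        conn.execute(
            "INSERT INTO idempotency_keys (key, status) VALUES (?, 'in_progress')",
            (key,))
        conn.execute(
            "INSERT INTO outbox (idem_key, payload) VALUES (?, ?)",
            (key, json.dumps(payload)))

def run_worker(perform):
    """Process pending outbox rows; `perform` is the actual side effect."""
    rows = conn.execute(
        "SELECT id, idem_key, payload FROM outbox WHERE done = 0").fetchall()
    for row_id, key, payload in rows:
        result = perform(key, json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET done = 1 WHERE id = ?", (row_id,))
            conn.execute(
                "UPDATE idempotency_keys SET status = 'succeeded', response = ? "
                "WHERE key = ?",
                (json.dumps(result), key))
```

A duplicate `accept_request` with the same key fails on the primary key and rolls back, so no second outbox row is written.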

Fix 3: pass the key downstream

The cleanest pattern. Forward the same idempotency key to every downstream that supports it. Stripe accepts Idempotency-Key. Many internal services should too. Now even if you call Stripe twice, Stripe dedupes on its side. End-to-end idempotency.
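
To make the effect concrete, here is a toy downstream that dedupes on the key it receives. `FakeDownstream` and `charge_customer` are invented for this example and stand in for any service that honors an idempotency key. Nothing here is Stripe's API.

```python
class FakeDownstream:
    """Stand-in for a service that dedupes on an idempotency key."""

    def __init__(self):
        self.results = {}
        self.charges_made = 0

    def create_charge(self, amount, idempotency_key):
        # A repeated key returns the stored result without charging again.
        if idempotency_key in self.results:
            return self.results[idempotency_key]
        self.charges_made += 1
        result = {"id": f"ch_{self.charges_made}", "amount": amount}
        self.results[idempotency_key] = result
        return result

def charge_customer(downstream, incoming_key, amount):
    # Forward the caller's key unchanged so a retry maps to the same charge.
    return downstream.create_charge(amount, idempotency_key=incoming_key)
```

Calling `charge_customer` twice with the same key yields one charge and two identical responses, even if your own idempotency row was lost in between.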

In production I combine fix 2 and fix 3.

What to hash

The request body, fully, including the timestamp if the client sent one. If the client sends Idempotency-Key: K with body {amount: 100} and then Idempotency-Key: K with body {amount: 1000}, that is suspicious. Either reject it with 422 (the status the IETF Idempotency-Key draft recommends; Stripe likewise rejects the mismatch with an error) or treat it as a new request (some APIs do this).

Do NOT hash:

The auth header (the same logical request might be made by a fresh token).
Tracing headers (X-Request-ID, traceparent).
The idempotency key itself.

Hash with SHA-256. Store the hex digest alongside the key.
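
A small sketch of the hashing step. Canonicalizing the JSON (sorted keys, no extra whitespace) is my assumption, so that two byte-different encodings of the same body hash the same. Hashing the raw request bytes is a stricter alternative. `request_fingerprint` is a hypothetical name.

```python
import hashlib
import json

def request_fingerprint(body: dict) -> str:
    """Hex SHA-256 of the body only; headers and the key are never included."""
    canonical = json.dumps(
        body, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because only the body goes in, a refreshed auth token or a new trace ID cannot change the fingerprint.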

Storage choices

Postgres

A dedicated table. Primary key on key. TTL via a cron job that deletes rows older than 24h.

Pros: transactional with your business data. Free. Cons: row contention on hot keys, but rarely a problem for idempotency.
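
A sketch of the table, the atomic claim, and the cleanup job. It runs on sqlite3 so it is self-contained. The `INSERT ... ON CONFLICT DO NOTHING` statement is also valid in Postgres, but the column types would differ there, and `claim` and `purge_expired` are hypothetical names.

```python
import sqlite3
import time

TTL_SECONDS = 24 * 60 * 60

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE idempotency_keys (
        key TEXT PRIMARY KEY,
        request_hash TEXT NOT NULL,
        status TEXT NOT NULL,
        response TEXT,
        created_at REAL NOT NULL)
""")

def claim(key, request_hash, now=None):
    """Return True if this caller inserted the row, False if it existed."""
    now = time.time() if now is None else now
    with conn:
        cur = conn.execute(
            "INSERT INTO idempotency_keys (key, request_hash, status, created_at) "
            "VALUES (?, ?, 'in_progress', ?) ON CONFLICT(key) DO NOTHING",
            (key, request_hash, now))
        return cur.rowcount == 1

def purge_expired(now=None):
    """The cron job: delete rows older than the TTL, return how many."""
    now = time.time() if now is None else now
    with conn:
        cur = conn.execute(
            "DELETE FROM idempotency_keys WHERE created_at < ?",
            (now - TTL_SECONDS,))
        return cur.rowcount
```

The primary key does the deduplication: two concurrent requests with the same key cannot both see `claim` return True.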

Redis

SET key payload EX 86400 NX is atomic and TTL-aware.

Pros: fast, automatic expiry. Cons: not transactional with your DB. You have to be careful that the Redis state matches the DB state.
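
The same claim in Python, assuming a redis-py style client whose `set` accepts `ex` and `nx` keyword arguments and returns None when NX blocks the write. `claim_key` is a hypothetical helper. Pass it something like `redis.Redis()`.

```python
def claim_key(client, key, payload, ttl_seconds=86400):
    """Return True if the key was newly set, False if it already existed."""
    # Equivalent to: SET key payload EX 86400 NX
    # With nx=True the client returns a falsy value if the key exists.
    return bool(client.set(key, payload, ex=ttl_seconds, nx=True))
```

This only covers the claim. Keeping the Redis entry consistent with what your database actually committed is still your problem.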

DynamoDB

Stripe's choice for some services. Conditional puts handle the "insert if not exists" atomically. TTL is native.

For most teams: Postgres until you hit scale problems, then DynamoDB or Redis.

In-progress handling

When a retry arrives and the original is still running, you have three choices.

Reject with 409. Simple. Client must retry later. Works for short operations.

Block and poll. Server holds the connection, polls the DB for status, returns the response when ready. Works for medium-length operations (10-30s). Costs server resources.

Return 202 with a status URL. Async pattern. Client polls. Best for long operations.

Stripe uses 409 with a hint that the original is processing. The SDK retries with exponential backoff.
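
On the client side, the 409 path could look like the sketch below. `call_with_retry` and the `send(key)` callback returning `(status, body)` are assumptions for illustration, and the backoff uses full jitter, which is my choice rather than anything an SDK is known to do.

```python
import random
import time

def call_with_retry(send, key, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send(key) until it stops returning 409 or attempts run out."""
    for attempt in range(max_attempts):
        status, body = send(key)
        if status != 409:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff with full jitter: 0..base_delay * 2^attempt.
            sleep(random.uniform(0, base_delay * 2 ** attempt))
    return status, body  # still 409 after the last attempt
```

The key never changes between attempts. That is the whole point: every retry refers to the same logical request.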

Failure responses

A controversial choice: do you cache failures?

Cache them (Stripe). The same key gets the same response, including stored error responses such as a 500. Predictable. Forces the client to use a new key for a new attempt.

Don't cache them. Failed calls can be retried with the same key. Convenient, but you have to be careful that the "failure" was truly idempotent (the side effect did not partially happen).

My recommendation: cache them and force new keys for retries. It is the only model that holds up under partial failure.

Retrofitting idempotency

Existing API without keys. Steps:

Add Idempotency-Key as an optional header. Log when present, don't enforce.
Identify mutating endpoints. Add idempotency table.
Wrap handlers in middleware that does the lookup-execute-store cycle.
Update SDKs to send keys by default.
After 2-4 weeks of telemetry, make the header required on critical endpoints.

Avoid the big-bang. Roll out per endpoint.
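
A per-endpoint rollout could be a decorator with an `enforce` switch, roughly as below. This is a deliberately thin sketch: the `idempotent` decorator, the `(headers, body)` handler signature, and the in-memory `_responses` dict are hypothetical, and it skips the request-hash check and in_progress state.

```python
import functools
import logging

log = logging.getLogger("idempotency")
_responses = {}  # hypothetical store: (endpoint, key) -> response

def idempotent(enforce=False):
    """Wrap a handler(headers, body); enforce=False is the log-only phase."""
    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(headers, body):
            key = headers.get("Idempotency-Key")
            if key is None:
                if enforce:
                    return 400, {"error": "Idempotency-Key required"}
                log.warning("%s called without Idempotency-Key", handler.__name__)
                return handler(headers, body)
            slot = (handler.__name__, key)  # scope keys per endpoint
            if slot in _responses:
                return _responses[slot]
            response = handler(headers, body)
            _responses[slot] = response
            return response
        return wrapper
    return decorator
```

Start every endpoint at `@idempotent(enforce=False)`, watch the warnings, then flip individual endpoints to `enforce=True` once their clients send keys.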

Distributed idempotency

When multiple services participate in a transaction, you need consistent keys across hops.

Pattern: each incoming request has a key K. Outgoing calls use derived keys: K + ":" + downstream_service + ":" + step. Deterministic. If the workflow retries from step 3, the downstream call for step 3 sees the same key it saw last time.
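
The derivation is a one-liner. `derive_key` is a hypothetical helper name, and the service and step labels in the usage note are made up.

```python
def derive_key(incoming_key: str, downstream_service: str, step: str) -> str:
    """Deterministic key for one downstream call within a workflow."""
    return f"{incoming_key}:{downstream_service}:{step}"
```

`derive_key("K", "billing", "3")` always yields `"K:billing:3"`, so a workflow that restarts at step 3 presents the billing service with the key it has already seen.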

This composes through orchestration engines like Temporal or AWS Step Functions, which handle derivation automatically.

Operational notes

Monitor key reuse. Alert when the same key is used with a different request hash. Usually a client bug.
Monitor in-progress age. Keys stuck in_progress for >5 minutes mean a worker died mid-request. Recover or mark failed.
Don't reuse keys after expiry. A common bug: the client retries with the same key after the TTL has expired. From the server's view, this is a brand-new request. The TTL must outlast any reasonable client retry window.
Test by killing pods mid-request. Real idempotency holds under pod kills, network partitions, and DB failovers. If you have never tested those scenarios, you do not actually have idempotency.
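
The in-progress age check can be sketched as a small scan. The record shape (dicts with `key`, `status`, `started_at`) and the `find_stuck` name are assumptions. In practice this would be a query against whatever store you chose.

```python
import time

STUCK_AFTER_SECONDS = 5 * 60

def find_stuck(records, now=None):
    """Return keys that have been in_progress for longer than the threshold."""
    now = time.time() if now is None else now
    return [
        r["key"]
        for r in records
        if r["status"] == "in_progress"
        and now - r["started_at"] > STUCK_AFTER_SECONDS
    ]
```

Feed the result to an alert, or to a recovery job that reconciles each key and marks it succeeded or failed.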

What I would tell a junior engineer

Idempotency is the cheapest insurance policy in distributed systems. It costs one table, one header, and a middleware. It prevents an entire category of customer-facing nightmares. Add it to every mutating endpoint, even if you think the client will never retry. The client will retry. The client always retries.
