Chapter 42·Intermediate·8 min read
Background Jobs and Queues: Doing Work Later
A plain-English guide to background jobs and queues — why slow work shouldn't block a request, how a queue and workers decouple producers from consumers, retries, and idempotency.
June 30, 2026
Some work is just too slow to do while a user waits. Resizing a video, sending ten thousand emails, generating a report: make the user sit through that inside their request and the experience is miserable, and the request often times out. The fix is to do the work later, in the background, using a queue. This pattern quietly powers a large share of what backends do.
The problem: slow work blocks the request
A normal request is synchronous: the user waits for it to finish. That's fine for a quick database read, but disastrous for genuinely slow work.
The insight is that the user only needs to know the work was accepted, not completed. So: accept it, respond immediately, and do the actual work after.
The queue: a buffer between asking and doing
A queue is the mechanism. Instead of doing slow work inline, your app drops a job describing it onto the queue and responds right away. Separately, workers pull jobs off the queue and execute them.
This splits the system into two roles that run independently:
| Role | Responsibility |
|---|---|
| Producer | Your app — creates jobs and pushes them to the queue |
| Consumer (worker) | Pulls jobs and performs the actual work |
Because the two are decoupled, they can scale and fail independently. Slow workers don't slow down the app; a flood of jobs doesn't crash it.
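The two roles can be sketched in a few lines of Python. This is a minimal illustration, not a production setup: it uses the standard library's in-process `queue.Queue` as a stand-in for a real queue service, and the names `handle_request`, `worker`, and the `resize_video` task are hypothetical.

```python
import queue
import threading

jobs = queue.Queue()   # in-process stand-in for a real queue service
results = []           # records finished work so we can observe it


def handle_request(video_name):
    """Producer: enqueue a job description and respond right away."""
    jobs.put({"task": "resize_video", "video": video_name})
    return {"status": "accepted"}   # the work has NOT run yet


def worker():
    """Consumer: pull jobs off the queue and perform the actual work."""
    while True:
        job = jobs.get()
        if job is None:             # sentinel value: shut the worker down
            jobs.task_done()
            break
        results.append(f"resized {job['video']}")  # pretend this is slow
        jobs.task_done()


response = handle_request("holiday.mp4")   # returns before any work is done

t = threading.Thread(target=worker)
t.start()
jobs.put(None)   # ask the worker to stop once earlier jobs are handled
jobs.join()      # block until every queued item has been processed
t.join()
```

Note that `handle_request` returns before the worker thread even starts. The producer only needs the queue to accept the job; it never calls the worker directly.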
Smoothing spikes
A queue does more than move work off the request path — it absorbs bursts. When traffic spikes, jobs pile up in the queue rather than overwhelming your system all at once. Workers drain the backlog at a steady, survivable rate.
Without the queue, that spike hits your system directly and can knock it over. With it, the excess simply waits — the work takes longer to finish, but nothing falls down. This is one of the main tools for building systems that survive load they weren't provisioned for.
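To make the smoothing effect concrete, here is a small tick-based simulation. It is a sketch under simple assumptions: time moves in discrete ticks, the workers can finish a fixed number of jobs per tick, and the arrival numbers are invented for illustration.

```python
from collections import deque


def simulate(arrivals_per_tick, worker_capacity):
    """Return the queue depth at the end of each tick.

    arrivals_per_tick: number of new jobs arriving in each tick.
    worker_capacity: maximum number of jobs the workers finish per tick.
    """
    backlog = deque()
    depths = []
    next_id = 0
    for arrivals in arrivals_per_tick:
        for _ in range(arrivals):            # the producer enqueues new jobs
            backlog.append(next_id)
            next_id += 1
        for _ in range(min(worker_capacity, len(backlog))):
            backlog.popleft()                # workers drain at a steady rate
        depths.append(len(backlog))
    return depths


# A burst of 10 jobs in one tick, then quiet; workers finish 3 jobs per tick.
depths = simulate([1, 10, 0, 0, 0, 0], worker_capacity=3)
print(depths)  # [0, 7, 4, 1, 0, 0]
```

The workers never handle more than 3 jobs in a tick, even during the burst. The spike shows up as a temporary backlog of 7 that drains over the following ticks, rather than as 10 simultaneous units of work.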
Retries: failure isn't fatal
A big bonus of queuing work: failures can be retried. If a worker tries to send an email and the email service is briefly down, the job isn't lost; it goes back on the queue to be attempted again later. Inline work has no such luxury: a failure there goes straight back to the user as an error.
This makes background work substantially more robust than synchronous work. Transient failures, such as a flaky API or a momentary network blip, heal themselves through retries instead of becoming user-facing errors. Most queue systems support automatic retries with backoff (waiting longer between successive attempts) and a dead-letter queue, where jobs that keep failing are set aside for inspection rather than retried forever.
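The retry behavior described above can be sketched as a small loop. This is an illustration under assumptions, not how any particular queue system implements it: `MAX_ATTEMPTS`, `BASE_DELAY`, `run_with_retries`, and the `flaky_send` handler are all hypothetical, and a real system would schedule the delayed retry rather than sleep inside the loop.

```python
import time
from collections import deque

MAX_ATTEMPTS = 3     # assumed limit before a job is given up on
BASE_DELAY = 0.01    # seconds; kept tiny so the demo runs quickly


def run_with_retries(jobs, handler, sleep=time.sleep):
    """Process jobs, retrying failures with exponential backoff.

    Jobs that fail MAX_ATTEMPTS times are moved to a dead-letter list.
    """
    pending = deque((job, 1) for job in jobs)   # (job, attempt number)
    done, dead_letter = [], []
    while pending:
        job, attempt = pending.popleft()
        try:
            handler(job)
            done.append(job)
        except Exception:
            if attempt >= MAX_ATTEMPTS:
                dead_letter.append(job)          # keep for inspection
            else:
                # Simplification: sleeping here blocks the other jobs too.
                sleep(BASE_DELAY * 2 ** (attempt - 1))  # 1x, 2x, 4x, ...
                pending.append((job, attempt + 1))      # back on the queue
    return done, dead_letter


calls = {}


def flaky_send(job):
    """Hypothetical handler: 'welcome' fails once, 'broken' always fails."""
    calls[job] = calls.get(job, 0) + 1
    if job == "welcome" and calls[job] == 1:
        raise ConnectionError("email service briefly down")
    if job == "broken":
        raise ValueError("malformed address")


done, dead_letter = run_with_retries(["welcome", "broken", "digest"], flaky_send)
```

Here `welcome` succeeds on its second attempt, while `broken` exhausts its three attempts and lands in `dead_letter` instead of cycling forever.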
Idempotency: make repeats safe
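Retries create one critical obligation. If a job can run more than once, because it failed partway or because its success acknowledgment got lost, then running it twice must not cause harm. A job that charges a customer's card, for example, must not charge it a second time just because the first attempt was never confirmed. A job designed this way is idempotent.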
Designing idempotent jobs — often by giving each job a unique ID and recording which IDs have completed — is the discipline that makes queues reliable rather than a source of duplicate-action bugs.
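A minimal sketch of the completed-ID approach follows. The job shape, the `send_receipt` function, and the in-memory `completed_ids` set are assumptions for illustration; a real system would keep the completed IDs in durable storage shared by all workers.

```python
completed_ids = set()   # assumption: durable, shared storage in production
emails_sent = []        # stands in for the real side effect


def send_receipt(job):
    """Idempotent job: a repeated delivery of the same job ID is a no-op."""
    if job["id"] in completed_ids:
        return "skipped"
    emails_sent.append(job["to"])    # the side effect
    completed_ids.add(job["id"])     # record completion
    return "sent"


job = {"id": "job-123", "to": "user@example.com"}
first = send_receipt(job)
second = send_receipt(job)   # simulated retry or duplicate delivery
```

The second call returns `"skipped"` and no second email goes out. One caveat: if a worker crashes after the side effect but before the ID is recorded, a retry will still repeat the side effect, so real implementations try to make those two steps atomic or rely on a downstream service that deduplicates by the same ID.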
Recap
- Queues let you do slow work later, so requests respond immediately instead of blocking on it.
- A producer enqueues jobs; independent workers (consumers) pull and run them — decoupled and separately scalable.
- Queues absorb spikes, turning a dangerous burst into a manageable backlog workers drain over time.
- Queued jobs can be retried, making background work far more robust against transient failures.
- Retries demand idempotency — design jobs so running them twice causes no harm.
We've built a capable backend. The final question is getting it running reliably for real users — deployment. Continue to Deploying a Backend.