Background Jobs and Queues: Doing Work Later

A plain-English guide to background jobs and queues — why slow work shouldn't block a request, how a queue and workers decouple producers from consumers, retries, and idempotency.

Some work is just too slow to do while a user waits. Resizing a video, sending ten thousand emails, generating a report — make the user sit through that inside their request and the experience is miserable (and often times out). The fix is to do the work later, in the background, using a queue. This pattern quietly powers a huge fraction of what backends actually do.

The problem: slow work blocks the request

A normal request is synchronous: the user waits for it to finish. That's fine for a quick database read, but disastrous for genuinely slow work.

The insight is that the user only needs to know the work was accepted, not completed. So: accept it, respond immediately, and do the actual work after.

The queue: a buffer between asking and doing

A queue is the mechanism. Instead of doing slow work inline, your app drops a job describing it onto the queue and responds right away. Separately, workers pull jobs off the queue and execute them.

[Figure: Request → Enqueue a job → Respond immediately → Worker runs the job later. The request returns fast; a worker does the slow part later.]

This splits the system into two roles that run independently:

Role	Responsibility
Producer	Your app — creates jobs and pushes them to the queue
Consumer (worker)	Pulls jobs and performs the actual work

Because the two are decoupled, they can scale and fail independently. Slow workers don't slow down the app; a flood of jobs doesn't crash it.
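The two roles can be sketched in a few lines of Python. This is only an illustration, not how a production system would be wired: it uses the standard library's in-process `queue.Queue` and a thread in place of a real queue service and separate worker processes, and the names `handle_request`, `worker`, and `run_demo` are made up for the example.

```python
import queue
import threading
import time


def handle_request(jobs, report_id):
    """Producer: describe the work, enqueue it, and respond right away."""
    jobs.put({"type": "generate_report", "report_id": report_id})
    return {"status": "accepted", "report_id": report_id}


def worker(jobs, finished):
    """Consumer: pull jobs off the queue and do the slow part."""
    while True:
        job = jobs.get()          # blocks until a job is available
        if job is None:           # sentinel value: no more work, shut down
            break
        time.sleep(0.01)          # stand-in for genuinely slow work
        finished.append(job["report_id"])


def run_demo():
    jobs = queue.Queue()
    finished = []
    thread = threading.Thread(target=worker, args=(jobs, finished))
    thread.start()
    # Each "request" returns immediately, before its job has run.
    responses = [handle_request(jobs, n) for n in range(3)]
    jobs.put(None)                # tell the worker to stop after the backlog
    thread.join()
    return responses, finished
```

Calling `run_demo()` returns three "accepted" responses plus the list of report IDs the worker finished afterwards. The producer never waits on the slow work; it only waits on the cheap `put`.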

Smoothing spikes

A queue does more than move work off the request path — it absorbs bursts. When traffic spikes, jobs pile up in the queue rather than overwhelming your system all at once. Workers drain the backlog at a steady, survivable rate.

[Figure: Incoming burst (spike), worker capacity (steady), queued backlog (buffered). A queue turns a spike into a manageable backlog.]

Without the queue, that spike hits your system directly and can knock it over. With it, the excess simply waits — the work takes longer to finish, but nothing falls down. This is one of the main tools for building systems that survive load they weren't provisioned for.
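A toy simulation makes the buffering effect concrete. The numbers below are invented for illustration, and `simulate_backlog` is a hypothetical helper: it assumes workers finish a fixed number of jobs per time step, which real workers only approximate.

```python
def simulate_backlog(arrivals, capacity):
    """Return the queue depth at the end of each time step.

    arrivals: number of jobs arriving in each step.
    capacity: number of jobs the workers can finish per step.
    """
    backlog = 0
    depths = []
    for arriving in arrivals:
        backlog += arriving                  # new jobs join the queue
        backlog -= min(backlog, capacity)    # workers drain at a steady rate
        depths.append(backlog)
    return depths


# Steady traffic of 5 jobs per step, with one burst of 50.
depths = simulate_backlog([5, 5, 50, 5, 5, 0, 0, 0], capacity=10)
# depths == [0, 0, 40, 35, 30, 20, 10, 0]
```

The burst of 50 never reaches the workers all at once. It shows up as a backlog of 40 that shrinks step by step until the queue is empty again.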

Retries: failure isn't fatal

A big bonus of queuing work: failures can be retried. If a worker tries to send an email and the email service is briefly down, the job isn't lost. It goes back on the queue to be attempted again later. Inline work has no such luxury: a failure there surfaces immediately as an error to the user.

This makes background work substantially more robust than synchronous work. Transient failures — a flaky API, a momentary network blip — heal themselves through retries instead of becoming user-facing errors. Most queue systems support automatic retries with backoff, and a "dead-letter" place for jobs that keep failing so they can be inspected rather than retried forever.
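Retry with backoff and a dead-letter list might look like the following minimal sketch. Real queue systems usually provide this behaviour themselves. The function `run_with_retries`, its parameters, and the doubling delay schedule are assumptions made for this example, not any particular library's API.

```python
import time


def run_with_retries(job, handler, dead_letters, max_attempts=3,
                     base_delay=0.5, sleep=time.sleep):
    """Run handler(job), retrying failures with exponential backoff.

    Returns True on success. After max_attempts failures the job is
    appended to dead_letters for inspection and False is returned.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            handler(job)
            return True
        except Exception as exc:
            if attempt == max_attempts:
                # Give up: park the job instead of retrying forever.
                dead_letters.append({"job": job, "error": repr(exc)})
                return False
            # Wait base_delay, then 2x, then 4x, ... before the next try.
            sleep(base_delay * 2 ** (attempt - 1))
    return False
```

The `sleep` parameter exists only so the delay can be swapped out in tests. A handler that fails twice and then succeeds gets through on the third attempt, while one that keeps failing ends up in `dead_letters`.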

Idempotency: make repeats safe

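Retries create one critical obligation. A job can run more than once: it may fail partway through and be retried, or it may succeed but have its acknowledgement lost, so the queue delivers it again. Running it twice must therefore cause no harm; for example, it must not charge a card or send an email a second time. A job designed this way is idempotent.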

Designing idempotent jobs — often by giving each job a unique ID and recording which IDs have completed — is the discipline that makes queues reliable rather than a source of duplicate-action bugs.
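The unique-ID approach can be sketched as below. This is a simplified illustration: `IdempotentWorker` is a hypothetical class, and it keeps completed IDs in an in-memory set, whereas a real system would need durable storage and would have to record the ID together with the side effect so the two cannot drift apart.

```python
class IdempotentWorker:
    """Runs each job at most once, keyed by the job's unique ID."""

    def __init__(self):
        self.completed_ids = set()   # assumption: durable storage in practice
        self.emails_sent = []        # stand-in for the real side effect

    def run(self, job):
        if job["id"] in self.completed_ids:
            return "skipped"         # duplicate delivery: do nothing
        self.emails_sent.append(job["to"])   # perform the side effect once
        self.completed_ids.add(job["id"])    # remember that it happened
        return "done"


worker = IdempotentWorker()
job = {"id": "job-42", "to": "user@example.com"}
first = worker.run(job)     # "done"
second = worker.run(job)    # "skipped": the retry causes no second email
```

Delivering the same job twice sends only one email, which is exactly the property retries depend on.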

Recap

Queues let you do slow work later, so requests respond immediately instead of blocking on it.
A producer enqueues jobs; independent workers (consumers) pull and run them — decoupled and separately scalable.
Queues absorb spikes, turning a dangerous burst into a manageable backlog workers drain over time.
Queued jobs can be retried, making background work far more robust against transient failures.
Retries demand idempotency — design jobs so running them twice causes no harm.

We've built a capable backend. The final question is getting it running reliably for real users — deployment. Continue to Deploying a Backend.