A founder describes the requirement: "We need to send emails after signup, generate reports nightly, and call an external API when a webhook fires."
The room nods. Someone says, "We'll need a queue. RabbitMQ? SQS? BullMQ?"
For 80% of background-job workloads, the right answer is none of those. A jobs table in Postgres, a worker that polls it, and a cron schedule for the recurring stuff. Same database you're already running. Same auth. Same backups.
This post is that pattern. What it can do, when it's enough, what you give up vs. a real queue service, and how to wire it up.
The pattern
Three pieces:
- A
jobstable in Postgres with a status column and a payload. - A worker process that polls the table for jobs to run.
- A cron schedule (or a webhook) that creates new job rows when there's work.
That's the whole architecture. Maybe 200 lines of code total for the worker. No new infrastructure, no new vendor, no new auth model.
Why this works
Postgres has every primitive you need to build a queue:
- `SELECT ... FOR UPDATE SKIP LOCKED` — atomic claim of a job row. Multiple workers can poll concurrently without claiming the same job.
- JSON columns — store arbitrary payload structure without a schema change per job type.
- Transactions — claim a job, do the work, mark it complete, all in one transaction. If anything fails, the claim is released.
- Indexes — order by
created_atorpriorityis just an index scan.
The thing people worry about ("Postgres isn't a real queue") is real at high scale. Not at the scales most apps actually run at. Below 10,000 jobs per minute (and most apps are well below 100/min), Postgres handles this without breaking a sweat.
A working implementation
The schema:
CREATE TABLE jobs (
id BIGSERIAL PRIMARY KEY,
kind TEXT NOT NULL,
payload JSONB NOT NULL,
status TEXT NOT NULL DEFAULT 'pending',
attempts INT NOT NULL DEFAULT 0,
max_attempts INT NOT NULL DEFAULT 3,
run_after TIMESTAMPTZ NOT NULL DEFAULT NOW(),
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
started_at TIMESTAMPTZ,
completed_at TIMESTAMPTZ,
error TEXT
);
CREATE INDEX idx_jobs_pending ON jobs (run_after) WHERE status = 'pending';The worker loop (TypeScript):
async function processNextJob(db) {
const job = await db.transaction(async (tx) => {
const [claimed] = await tx.execute(sql`
UPDATE jobs SET status = 'running', started_at = NOW(), attempts = attempts + 1
WHERE id = (
SELECT id FROM jobs
WHERE status = 'pending' AND run_after <= NOW()
ORDER BY run_after
LIMIT 1
FOR UPDATE SKIP LOCKED
)
RETURNING *
`);
return claimed;
});
if (!job) return false;
try {
await handlers[job.kind](job.payload);
await db.execute(sql`
UPDATE jobs SET status = 'completed', completed_at = NOW() WHERE id = ${job.id}
`);
} catch (err) {
const next = job.attempts >= job.max_attempts
? { status: 'failed' }
: { status: 'pending', run_after: new Date(Date.now() + 60_000 * 2 ** job.attempts) };
await db.execute(sql`
UPDATE jobs SET status = ${next.status}, run_after = ${next.run_after ?? job.run_after}, error = ${err.message}
WHERE id = ${job.id}
`);
}
return true;
}
async function workerLoop(db) {
while (true) {
const worked = await processNextJob(db);
if (!worked) await new Promise((r) => setTimeout(r, 1000));
}
}That's the worker. Run it as a separate process (or a fly.io machine, or a Railway service, or a long-lived Vercel function). One worker handles the throughput most apps need; spin up two or three for parallel work.
To enqueue a job from anywhere in the app:
await db.insert(jobs).values({
kind: 'send-welcome-email',
payload: { userId, email },
});Done. The worker picks it up on the next poll.
What you get
This pattern includes most of what a "real" queue offers:
- Retries with exponential backoff — see the
2 ** attemptsin the error branch. - Dead letter handling — jobs that fail
max_attemptstimes move to status'failed'. Query them withSELECT * FROM jobs WHERE status = 'failed'. - Scheduled jobs — set
run_afterto a future timestamp; the worker won't pick it up until then. - Job prioritisation — add a
prioritycolumn, change theORDER BY. - Atomic claim —
SKIP LOCKEDprevents two workers from running the same job. - Observability —
SELECT status, COUNT(*) FROM jobs GROUP BY statusis your dashboard.
The things you don't get out of the box:
- Push notifications to workers. Workers poll. The 1-second sleep in the loop adds up to ~1s of latency on average. For most use cases this is fine; for sub-second response, use Postgres
LISTEN/NOTIFYto wake the worker. - Geographically distributed queues. All workers hit the same Postgres. Cross-region setups need a real queue.
- Massive fan-out at high QPS. SQS handles tens of thousands per second; Postgres job tables tap out around hundreds per second per worker.
A cron schedule for recurring jobs
For "nightly reports" or "every 5 minutes" jobs, you don't need a separate cron service. Two options:
Option A: A scheduled function on your platform.
Vercel Cron, Cloudflare Cron Triggers, Fly Machines with a cron-style trigger. Calls an endpoint on schedule. The endpoint just inserts a job row.
// /api/cron/nightly-reports
export async function GET() {
await db.insert(jobs).values({ kind: 'generate-nightly-report', payload: {} });
return Response.json({ ok: true });
}Option B: pg_cron extension.
Postgres has a pg_cron extension that schedules SQL directly:
SELECT cron.schedule('nightly-reports', '0 2 * * *', $$
INSERT INTO jobs (kind, payload) VALUES ('generate-nightly-report', '{}')
$$);Available on Supabase, Neon (in beta), and self-hosted. Avoid on RDS Postgres (not supported as of writing).
Either works. We default to the platform cron because it's universal; pg_cron is nicer if you're already on a database that supports it.
When this is enough
Specific signals that the Postgres pattern is the right call:
- Your job rate is under a few hundred per second peak.
- Most jobs complete in under a minute (long jobs are fine too, but you have to handle worker restarts).
- Workers run in a single region (or two, with Postgres replicas in each).
- You're already using Postgres for the application data.
- The team would rather not learn a new vendor's API and dashboard.
Most apps fit all five. The total operational complexity is "we have one more long-running process called the worker." That's it.
When you need a real queue
The signals to graduate:
- Job throughput above ~5,000/second sustained. Postgres can't keep up at that rate. Time for SQS, Kafka, or NATS.
- You need cross-region fan-out. Multiple data centres consuming the same queue. Postgres replication isn't built for this.
- You need at-most-once delivery semantics. Postgres gives at-least-once (combined with idempotent handlers). If you need exactly-once for billing or similar, a queue with deduplication.
- You need pub-sub (multiple consumers each get every message). Postgres job tables are work-queue, not pub-sub. Use Kafka, NATS, Redis Streams.
- You need streaming time-windowed operations. Aggregations over rolling 60-second windows. Stream processors are the right tool.
If none of these apply, the queue service is overhead.
A concrete example
A B2B SaaS we worked with had this exact "we need a queue" conversation. The actual jobs:
- Send welcome email after signup. ~50/day.
- Process uploaded CSV (parse, validate, insert). ~10/day, can take 30 seconds each.
- Send daily digest emails. ~2,000/day, batched in one 30-minute run.
- Sync to a third-party CRM when a customer record changes. ~100/day.
- Generate weekly analytics report. ~10/week.
Total: ~3,000 jobs/day average. Peak: 100/minute during the digest window.
The team had drafted an architecture using SQS for jobs, Lambda for processing, DynamoDB for job state, and CloudWatch for monitoring. Five AWS services.
We built the Postgres pattern instead. One table, one worker process running on Fly. Three months later it's handling all five workflows reliably. The team has never thought about job infrastructure again.
The savings: ~$80/month on AWS bills, plus the engineering time that would have gone to AWS-service plumbing.
Idempotency, the underrated discipline
Whatever queue you use (Postgres or otherwise), make your job handlers idempotent. Same job run twice = same result. This is non-negotiable.
How: use the job's natural key to detect duplicates.
async function sendWelcomeEmail({ userId }) {
const user = await db.query.users.findFirst({ where: eq(users.id, userId) });
if (user.welcomeEmailSent) return; // idempotent
await emailService.send({ to: user.email, ... });
await db.update(users).set({ welcomeEmailSent: true }).where(eq(users.id, userId));
}With idempotent handlers, retries are safe and the at-least-once guarantee of the queue is enough. Without idempotency, a retry can double-send the welcome email. Don't ship background jobs without this discipline.
A summary
For most apps with background-job needs:
- Postgres jobs table + worker process + cron schedule.
- 200 lines of code total.
- Idempotent handlers.
- Re-evaluate at 5,000 jobs/second or when fan-out becomes a real need.
The "you need a queue" reflex is from a different era. In 2026, the right default for most apps is to use the database you're already running.
If you want a web app with this pattern baked in from the start, see how we work on web apps.