Back to Journal
netorigoen

Webhook + outbox pattern in the ERP — exactly-once between Finance and Logistics

Finance → Logistics exactly-once: outbox table, idempotency key, exp_backoff retry, consumer-side processed_events. 18 months, zero duplicates.

A classic integration bug: the Finance module issues an invoice and sends a webhook to the Logistics module ("invoice paid, you can start shipping"), but webhook delivery fails — network blip, Logistics deploy in progress, whatever. Finance retries. Two packages ship. The customer gets two invoices. The complaint lands on the CTO's desk. In this article: the way out — outbox table + idempotency key + retry pattern, as we use it on Netorigo between Finance and Logistics.

Why not direct HTTP

The naive solution: Finance sends an HTTP POST directly to Logistics. On error, retry. But two problems:

  1. Transactional consistency is missing. Finance sends the webhook inside its DB transaction — if the transaction rolls back (because of some other error), the webhook has already gone out and Logistics kicks off processing for a non-existent invoice.
  2. Retries are not rolling. If Logistics is unreachable for 30 seconds, all of Finance's billing actions stall for that duration.

Next step: outbox table.

The outbox table

The Finance module records every event (e.g. InvoicePaidEvent) NOT via direct HTTP but with an INSERT into a finance_outbox table — IN THE SAME TRANSACTION as the one updating invoices.status = 'paid'. One SQL transaction, two INSERT/UPDATE: if anything rolls back, both roll back.

The table:

CREATE TABLE finance_outbox (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  event_type TEXT NOT NULL, -- 'invoice.paid', 'invoice.refunded', etc.
  payload JSONB NOT NULL,
  idempotency_key TEXT NOT NULL UNIQUE,
  delivery_status TEXT NOT NULL DEFAULT 'pending', -- pending | delivered | failed
  delivery_attempts INT NOT NULL DEFAULT 0,
  last_attempt_at TIMESTAMPTZ,
  next_attempt_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  delivered_at TIMESTAMPTZ,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX ON finance_outbox (delivery_status, next_attempt_at);

The idempotency_key is the stable identifier of the event — e.g. invoice.paid:<invoice_id>:<payment_id>. If Finance tries to send two paid events for the same (invoice_id, payment_id) pair, the UNIQUE constraint prevents it.

The delivery worker

A separate NestJS scheduler @Cron('*/5 * * * * *') (every 5 seconds) reads the outbox:

SELECT * FROM finance_outbox
WHERE delivery_status = 'pending'
  AND next_attempt_at <= NOW()
ORDER BY created_at
LIMIT 100
FOR UPDATE SKIP LOCKED;

FOR UPDATE SKIP LOCKED ensures two worker instances don't try to send the same event. For each row the worker resolves the target webhook URL (https://logistics.netorigo.svc/webhooks/finance), POSTs the payload with X-Idempotency-Key header.

On 2xx: delivery_status = 'delivered', delivered_at = NOW(). On 5xx or timeout: delivery_attempts += 1, next_attempt_at = NOW() + exp_backoff(attempts) (5s, 30s, 5min, 30min, 4h). After 8 attempts: delivery_status = 'failed', alert on a Slack channel.

The consumer side: using the idempotency key

The Logistics module's webhook controller does this:

@Post('/webhooks/finance')
async handleFinanceWebhook(@Headers('x-idempotency-key') key: string, @Body() body: any) {
  const existing = await this.processedEventsRepo.findOne({ idempotency_key: key });
  if (existing) {
    // Already processed, respond 200
    return { status: 'ok', already_processed: true };
  }

  await this.processedEventsRepo.insert({
    idempotency_key: key,
    received_at: new Date(),
    payload: body,
  });

  // Now process it
  await this.shipmentService.handleInvoicePaid(body);
  return { status: 'ok' };
}

logistics_processed_events.idempotency_key is UNIQUE — if the Finance worker sends the event twice (because the first 200 response was lost on timeout), the second attempt finds the row via processedEventsRepo.findOne and Logistics doesn't start shipping a second time.

This is the exactly-once at-the-consumer pattern: Finance sends at-least-once, Logistics processes exactly-once.

Cleanup

finance_outbox rows with delivered = true are archived (finance_outbox_archive) after 30 days, deleted after 1 year. logistics_processed_events rows are deleted after 90 days — by then the Finance worker won't be sending retries any more.

The full round trip

Finance writes invoice.paid → finance_outbox INSERT (same TX) → cron worker picks up the row → HTTP POST to Logistics with X-Idempotency-Key → Logistics processed_events check + INSERT (same TX) → shipment-state changes to waiting → 200 OK → Finance marks outbox row delivered.

If anything stalls — Logistics is offline, network is down, Finance restarts, Logistics restarts — the flow resumes on the next cron iteration with an exactly-once guarantee.

Closing

The webhook + outbox + idempotency-key pattern is the most reliable integration foundation for modular ERPs. A 30-row table + a 100-line scheduler + a 20-line consumer guard solve a problem that produces weekly support tickets in classic integrations. 18 months of production: zero duplicate invoices, zero lost shipments — the pattern works.