Engineering

How to architect a real-time inventory system across sales channels

Batch inventory sync breaks the moment a brand sells anywhere except its own site. Here's how to architect an event-driven inventory system that scales.

Ryan, Co-founder, Engineering · 4 min read

The hardest engineering problem in a multi-channel consumer brand isn’t the storefront. It’s keeping every channel honest about what’s actually in the warehouse, at the moment a buyer is choosing whether to click “buy”.

Batch sync (a cron every 15 minutes pulling deltas) is the default. It also breaks the day your brand goes from one channel to three. Here’s how to architect a real-time inventory system that holds up.

1. Why batch sync breaks at scale

A 15-minute lag sounds fine until you do the math:

  • 2 channels, 500 SKUs, 30 sales/hour → ~12 oversells per week
  • 5 channels, 2k SKUs, 200 sales/hour → ~9 oversells per hour

Oversells aren’t a UX nit. They’re refunds, expedited shipping costs, customer-service load, and review damage that compounds.

At three channels and above, the cost of batch sync exceeds the cost of rebuilding it properly. Don’t wait for the customer-service team to tell you.

2. Pick one source of truth, then commit

The first architectural decision: which system owns the canonical quantity? Three sensible answers:

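Source of truth     | When it’s right                          | Trade-off
--------------------|------------------------------------------|------------------------------
WMS                 | You own the warehouse                    | Slower to integrate channels
ERP                 | Multi-warehouse, multi-entity            | Heavy, slow to change
Dedicated IMS layer | You want speed and channel independence  | One more system to operate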

Whatever you pick, the rule is: only the source of truth writes. Every other system reads from it. The moment a channel writes its own quantity, you have two sources of truth and you’ve lost.
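One way to make the single-writer rule hard to break by accident is to hand channel code a read-only view of the inventory. The sketch below is a minimal illustration, not a prescribed design: `InventoryReader`, `InventoryWriter`, `InMemoryInventory`, `renderStockBadge`, and the SKU are all hypothetical names, and the in-memory map stands in for whichever WMS, ERP, or IMS you chose.

```typescript
// Hypothetical sketch: all names here are illustrative, not from a real library.

// What every channel is allowed to see.
interface InventoryReader {
  getOnHand(sku: string): number;
}

// What only the source of truth is allowed to do.
interface InventoryWriter extends InventoryReader {
  adjust(sku: string, delta: number): void;
}

// In-memory stand-in for the system that owns the canonical quantity.
class InMemoryInventory implements InventoryWriter {
  private readonly onHand = new Map<string, number>();

  getOnHand(sku: string): number {
    return this.onHand.get(sku) ?? 0;
  }

  adjust(sku: string, delta: number): void {
    const next = this.getOnHand(sku) + delta;
    if (next < 0) {
      throw new Error(`on-hand for ${sku} would go negative`);
    }
    this.onHand.set(sku, next);
  }
}

// Channel code is typed against the reader only, so a call to
// inventory.adjust(...) here is rejected by the compiler.
function renderStockBadge(inventory: InventoryReader, sku: string): string {
  const qty = inventory.getOnHand(sku);
  return qty > 0 ? `${qty} in stock` : "sold out";
}

const sourceOfTruth = new InMemoryInventory();
sourceOfTruth.adjust("TEE-BLK-M", 5);
const badge = renderStockBadge(sourceOfTruth, "TEE-BLK-M");
```

This is a compile-time guard only. In a real deployment the same rule would also need to be enforced at the boundary between systems, for example with read-only credentials for channel integrations.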

3. Event-driven sync: the actual architecture

The skeleton:

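```typescript
// Inventory events emitted by the source of truth
type InventoryEvent =
  | { type: "stock.received"; sku: string; qty: number; warehouse: string }
  | { type: "stock.reserved"; sku: string; qty: number; orderId: string }
  | { type: "stock.released"; sku: string; qty: number; orderId: string }
  | { type: "stock.shipped"; sku: string; qty: number; orderId: string };

// Placeholders, not real library APIs: getAvailable reads the computed
// quantity from the source of truth, and shopify stands in for your own
// wrapper around the Shopify client.
declare function getAvailable(sku: string): Promise<number>;
declare const shopify: {
  inventory: { update(sku: string, qty: number): Promise<void> };
};

// Each channel runs a consumer. It pushes the absolute available quantity,
// not a delta, so handling the same event twice cannot double-count.
async function syncShopify(event: InventoryEvent): Promise<void> {
  const available = await getAvailable(event.sku);
  await shopify.inventory.update(event.sku, available);
}
```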

The shape:

  1. Source of truth emits events on stock changes (received, reserved, shipped, released).
  2. A queue holds them (Kafka, NATS, AWS Kinesis, Vercel Queues - pick by ops budget).
  3. One consumer per channel projects events to that channel’s API.
  4. Each consumer is idempotent, so events can be replayed without harm.
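Idempotency in step 4 can be sketched as follows, for a consumer that applies deltas rather than pushing absolute quantities. This is a minimal illustration under two assumptions that are not in the event type shown earlier: every event carries a unique `eventId`, and a shipped event both removes stock and consumes its reservation. `StockEvent`, `ChannelProjection`, and `bump` are hypothetical names.

```typescript
// Assumption: each event carries a unique eventId for deduplication.
type StockEvent = {
  eventId: string;
  type: "stock.received" | "stock.reserved" | "stock.released" | "stock.shipped";
  sku: string;
  qty: number;
};

// Adds delta to the counter for a SKU, treating a missing entry as 0.
function bump(counts: Map<string, number>, sku: string, delta: number): void {
  counts.set(sku, (counts.get(sku) ?? 0) + delta);
}

class ChannelProjection {
  private readonly seen = new Set<string>();
  private readonly onHand = new Map<string, number>();
  private readonly reserved = new Map<string, number>();

  apply(event: StockEvent): void {
    // A replayed or redelivered event has already been counted: skip it.
    if (this.seen.has(event.eventId)) return;
    this.seen.add(event.eventId);

    switch (event.type) {
      case "stock.received":
        bump(this.onHand, event.sku, event.qty);
        break;
      case "stock.reserved":
        bump(this.reserved, event.sku, event.qty);
        break;
      case "stock.released":
        bump(this.reserved, event.sku, -event.qty);
        break;
      case "stock.shipped":
        // Assumption: shipping consumes the reservation and the stock.
        bump(this.onHand, event.sku, -event.qty);
        bump(this.reserved, event.sku, -event.qty);
        break;
    }
  }

  available(sku: string): number {
    return (this.onHand.get(sku) ?? 0) - (this.reserved.get(sku) ?? 0);
  }
}

const events: StockEvent[] = [
  { eventId: "e1", type: "stock.received", sku: "TEE-BLK-M", qty: 10 },
  { eventId: "e2", type: "stock.reserved", sku: "TEE-BLK-M", qty: 3 },
  { eventId: "e3", type: "stock.shipped", sku: "TEE-BLK-M", qty: 2 },
];

const projection = new ChannelProjection();
events.forEach((e) => projection.apply(e));
const firstPass = projection.available("TEE-BLK-M");
events.forEach((e) => projection.apply(e)); // full replay changes nothing
const afterReplay = projection.available("TEE-BLK-M");
```

In production the set of seen event IDs would likely live in durable storage next to the projection, so that deduplication survives a consumer restart.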

4. Handle reservations, not just stock

A common bug: the system tracks “on hand” but not “reserved”. A buyer adds to cart, the page says 5 available, two other buyers also see 5 available, they all check out. Oversell.

Fix:

  • Available = On hand − Reserved − Pending
  • Reservations expire (5 minutes for cart, 30 for checkout)
  • Expirations are themselves events that release stock

This turns inventory into a small state machine. It is tedious to code, but operationally it is the difference between “we never oversold” and “we apologise to ten customers a week”.
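A minimal sketch of that state machine, assuming one reservation per order and a clock passed in by the caller so the behavior is easy to test. `ReservationBook`, `Stage`, and `TTL_MS` are hypothetical names. The Pending term from the formula above is left out because its definition depends on your order flow.

```typescript
type Stage = "cart" | "checkout";

// TTLs from the list above: 5 minutes for cart, 30 for checkout.
const TTL_MS: Record<Stage, number> = {
  cart: 5 * 60_000,
  checkout: 30 * 60_000,
};

interface Reservation {
  orderId: string;
  sku: string;
  qty: number;
  expiresAt: number; // epoch milliseconds
}

class ReservationBook {
  private readonly onHand = new Map<string, number>();
  private readonly reservations = new Map<string, Reservation>();

  receive(sku: string, qty: number): void {
    this.onHand.set(sku, (this.onHand.get(sku) ?? 0) + qty);
  }

  // Available is computed from on-hand and live reservations, never stored.
  available(sku: string): number {
    let reserved = 0;
    this.reservations.forEach((r) => {
      if (r.sku === sku) reserved += r.qty;
    });
    return (this.onHand.get(sku) ?? 0) - reserved;
  }

  // Returns false instead of overselling when stock is insufficient.
  reserve(orderId: string, sku: string, qty: number, stage: Stage, now: number): boolean {
    this.expire(now);
    if (qty > this.available(sku)) return false;
    this.reservations.set(orderId, { orderId, sku, qty, expiresAt: now + TTL_MS[stage] });
    return true;
  }

  // Drops expired reservations and returns them, so the caller can emit
  // one stock.released event per entry.
  expire(now: number): Reservation[] {
    const released: Reservation[] = [];
    this.reservations.forEach((r, orderId) => {
      if (r.expiresAt <= now) {
        this.reservations.delete(orderId);
        released.push(r);
      }
    });
    return released;
  }
}

const book = new ReservationBook();
book.receive("TEE-BLK-M", 5);
const first = book.reserve("order-a", "TEE-BLK-M", 3, "cart", 0);
const second = book.reserve("order-b", "TEE-BLK-M", 3, "cart", 1_000);
const availableWhileHeld = book.available("TEE-BLK-M");
const released = book.expire(5 * 60_000);
const availableAfterExpiry = book.available("TEE-BLK-M");
```

Here the second buyer is refused because only 2 units remain unreserved, and the stock comes back once the first cart reservation passes its 5-minute TTL.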

Key points

  • Available is computed, not stored
  • Reservations have TTLs and release events
  • Reconciliation jobs run hourly to catch drift between projection and source
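The reconciliation job in the last point could look roughly like this. It is a sketch under simplifying assumptions: both sides are already loaded as SKU-to-quantity maps, a SKU missing from the channel counts as quantity 0, and SKUs that exist only on the channel are ignored. `reconcile` and `Drift` are hypothetical names.

```typescript
interface Drift {
  sku: string;
  expected: number; // quantity according to the source of truth
  actual: number;   // quantity the channel currently shows
}

interface ReconciliationReport {
  drifts: Drift[];
  driftRate: number; // mismatched SKUs divided by SKUs checked
}

function reconcile(
  source: Map<string, number>,
  channel: Map<string, number>,
): ReconciliationReport {
  const drifts: Drift[] = [];
  source.forEach((expected, sku) => {
    const actual = channel.get(sku) ?? 0;
    if (actual !== expected) drifts.push({ sku, expected, actual });
  });
  const driftRate = source.size === 0 ? 0 : drifts.length / source.size;
  return { drifts, driftRate };
}

const sourceQuantities = new Map<string, number>([
  ["TEE-BLK-M", 5],
  ["TEE-BLK-L", 2],
  ["CAP-RED", 0],
  ["MUG-WHT", 7],
]);
const channelQuantities = new Map<string, number>([
  ["TEE-BLK-M", 5],
  ["TEE-BLK-L", 3], // drifted
  ["MUG-WHT", 7],
]);

const report = reconcile(sourceQuantities, channelQuantities);
```

Each entry in `report.drifts` is a candidate for a corrective push from the source of truth to the channel, and `report.driftRate` feeds the drift metric.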

5. Observability: what to actually log

Three metrics make or break the system in incident triage:

  1. Event lag: how stale is each channel’s projection? Measured as (now − last_consumed_event_at) per channel.
  2. Drift rate: how often does reconciliation find a mismatch?
  3. Reservation pressure: how much of “on hand” is reserved right now?

If event lag goes above 60 seconds, you have a degraded channel. If drift rate goes above 0.1%, you have a bug. If reservation pressure goes above 70% during checkout peak, you need to shorten cart TTLs.
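Those three thresholds can be turned into a small triage check. The sketch below assumes a per-channel snapshot with the fields shown. `ChannelSnapshot`, `LIMITS`, `triage`, and the alert names are hypothetical, and the check does not know whether it is running during checkout peak, so the caller would have to gate the reservation-pressure alert on that.

```typescript
interface ChannelSnapshot {
  channel: string;
  lastConsumedEventAt: number; // epoch milliseconds
  skusChecked: number;         // SKUs compared in the last reconciliation
  skusMismatched: number;      // SKUs where projection and source differed
  onHand: number;              // units on hand
  reserved: number;            // units currently reserved
}

type Alert = "degraded_channel" | "drift_bug" | "shorten_cart_ttl";

// Thresholds taken from the paragraph above.
const LIMITS = {
  eventLagSeconds: 60,
  driftRate: 0.001, // 0.1%
  reservationPressure: 0.7, // 70%
};

function triage(s: ChannelSnapshot, nowMs: number): Alert[] {
  const alerts: Alert[] = [];

  const lagSeconds = (nowMs - s.lastConsumedEventAt) / 1000;
  const driftRate = s.skusChecked > 0 ? s.skusMismatched / s.skusChecked : 0;
  const pressure = s.onHand > 0 ? s.reserved / s.onHand : 0;

  if (lagSeconds > LIMITS.eventLagSeconds) alerts.push("degraded_channel");
  if (driftRate > LIMITS.driftRate) alerts.push("drift_bug");
  if (pressure > LIMITS.reservationPressure) alerts.push("shorten_cart_ttl");

  return alerts;
}

const alerts = triage(
  {
    channel: "shopify",
    lastConsumedEventAt: 10_000,
    skusChecked: 2_000,
    skusMismatched: 1, // 0.05% drift, under the limit
    onHand: 100,
    reserved: 80,      // 80% pressure, over the limit
  },
  100_000,             // 90 seconds after the last consumed event
);
```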

Takeaway

Multi-channel inventory at scale is not a sync problem. It’s a state machine problem. Pick one source of truth, emit events, run idempotent projections per channel, and instrument lag and drift. Get this right and the rest of the storefront stack falls into place.

The Nomu platform ships this pattern by default: the inventory layer, the agent-readable stock surface, and the channel projections are all event-driven from day one. Book a demo if you want to see how the pieces fit together.

Ryan

Co-founder, Engineering

Co-founder and engineer at Nomu working on platform infrastructure — runtime, data pipelines, and the agent-readable surfaces every storefront exposes. Writes about backend engineering for commerce, AI infrastructure, and the systems decisions that make a brand scalable.