SystemDesign Pro
© 2026 SystemDesign Pro. All rights reserved.
Tags: messaging, failure-handling, replay, operations

Dead-Letter Queue (DLQ)

Isolate repeatedly failing messages for triage without blocking healthy traffic.

Definition

A dead-letter queue (DLQ) stores messages that have exhausted their retry policy, so the main stream can keep flowing while the failures are investigated.
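
The definition above can be illustrated with a minimal in-memory sketch. It assumes a retry budget of three attempts and a hypothetical `handle` function that rejects payloads without an `amount` field; neither comes from a specific broker or library. A real consumer would also wait between attempts.

```python
from collections import deque

MAX_ATTEMPTS = 3  # assumed retry budget for this sketch


def handle(message):
    """Hypothetical handler: rejects payloads that lack an 'amount' field."""
    if "amount" not in message:
        raise ValueError("missing amount")
    return message["amount"]


def consume(main_queue, dlq, handler, max_attempts=MAX_ATTEMPTS):
    """Drain main_queue; park messages that fail max_attempts times in dlq."""
    processed = []
    while main_queue:
        message = main_queue.popleft()
        for attempt in range(1, max_attempts + 1):
            try:
                processed.append(handler(message))
                break  # success: stop retrying this message
            except Exception as exc:
                if attempt == max_attempts:
                    # Retry budget exhausted: quarantine and move on.
                    dlq.append(
                        {"message": message, "error": repr(exc), "attempts": attempt}
                    )
    return processed


main_queue = deque([{"id": 1, "amount": 10}, {"id": 2}, {"id": 3, "amount": 30}])
dlq = deque()
processed = consume(main_queue, dlq, handle)
print(processed)  # [10, 30]
print(len(dlq))   # 1
```

The poison message with `id` 2 ends up in `dlq`, while the healthy message behind it is still processed.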

When To Use
  • Asynchronous event pipelines where some payloads can be poison messages (payloads that fail on every attempt).
  • Workflows requiring bounded retries and operator visibility.
  • Systems needing replay after targeted remediation.
When Not To Use
  • Workloads where every message must succeed before upstream can proceed, such as strictly ordered streams.
  • Teams without clear ownership or runbooks for draining and replaying the DLQ.
  • Scenarios lacking message context needed for root-cause analysis.
Tradeoffs
  • Protects main throughput, but can accumulate large failure debt.
  • Improves resilience, while adding replay and governance complexity.
  • Avoids global pipeline stalls, but requires strong observability discipline.
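
One way to make the observability tradeoff concrete is a periodic health check on DLQ depth and on the age of the oldest entry. The sketch below is an assumption-laden illustration: the `DlqThresholds` values and the `dlq_alerts` helper are hypothetical, and timestamps are plain seconds rather than values from a real metrics system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DlqThresholds:
    max_depth: int = 100              # assumed alert threshold
    max_oldest_age_s: float = 3600.0  # assumed: one hour


def dlq_alerts(enqueued_at, now, thresholds=DlqThresholds()):
    """Return alert names given each DLQ entry's enqueue time in seconds."""
    alerts = []
    if len(enqueued_at) > thresholds.max_depth:
        alerts.append("depth")
    if enqueued_at and now - min(enqueued_at) > thresholds.max_oldest_age_s:
        alerts.append("oldest_age")
    return alerts


stale = dlq_alerts([0.0, 100.0, 4000.0], now=5000.0)
print(stale)  # ['oldest_age']
```

Alerting on the oldest entry's age, and not only on depth, catches a small DLQ that nobody is draining.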
Common Failure Modes
  • The DLQ grows silently and becomes an unbounded cost center.
  • Replaying without first fixing the root cause sends the same poison messages back into the DLQ in a loop.
  • Insufficient payload metadata blocks actionable triage.
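
The last two failure modes can be addressed together in a sketch like the one below: each dead letter carries enough context for triage, and replay is limited to error types that have a deployed fix, with a cap on replay attempts. The `DeadLetter` fields, `MAX_REPLAYS`, and the `replay` helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

MAX_REPLAYS = 2  # assumed cap to stop poison loops


@dataclass
class DeadLetter:
    payload: dict
    source_topic: str    # where the message came from
    error_type: str      # classification used for targeted replay
    error_message: str   # last error seen, for triage
    attempts: int        # attempts made before quarantine
    replay_count: int = 0


def replay(dlq, handler, fixed_error_types, max_replays=MAX_REPLAYS):
    """Replay entries whose error type has a fix; return (succeeded, remaining)."""
    succeeded, remaining = [], []
    for entry in dlq:
        eligible = (
            entry.error_type in fixed_error_types
            and entry.replay_count < max_replays
        )
        if not eligible:
            remaining.append(entry)  # leave untouched for further triage
            continue
        try:
            handler(entry.payload)
            succeeded.append(entry)
        except Exception as exc:
            entry.replay_count += 1
            entry.error_message = repr(exc)
            remaining.append(entry)
    return succeeded, remaining


def handler(payload):
    """Hypothetical handler after a schema fix; still requires 'amount'."""
    if payload.get("amount") is None:
        raise ValueError("amount is required")


entries = [
    DeadLetter({"id": 1, "amount": 5}, "orders", "SchemaError", "unknown field", 3),
    DeadLetter({"id": 2}, "orders", "SchemaError", "unknown field", 3),
    DeadLetter({"id": 3, "amount": 7}, "orders", "TimeoutError", "upstream timed out", 3),
]
succeeded, remaining = replay(entries, handler, fixed_error_types={"SchemaError"})
print(len(succeeded), len(remaining))  # 1 2
```

The entry that still fails stays in the DLQ with an incremented `replay_count`, so it stops being replayed once the cap is reached.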
Interview Framing
Use this structure when the interviewer asks for this pattern explicitly.

Specify retry policy, quarantine criteria, replay safeguards, and DLQ SLOs/ownership model.
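
When stating a retry policy and quarantine criteria, it can help to write them down as data. The following sketch assumes exponential backoff with a cap and treats `ValueError` and `KeyError` as non-retryable; the `RetryPolicy` class and all of its defaults are illustrative assumptions, not a prescribed configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetryPolicy:
    max_attempts: int = 5        # assumed retry budget
    base_delay_s: float = 1.0    # assumed first backoff delay
    max_delay_s: float = 30.0    # assumed backoff cap
    non_retryable: tuple = (ValueError, KeyError)  # assumed poison indicators

    def delay_before(self, retry_number):
        """Seconds to wait before the given retry (1-based), doubling up to the cap."""
        return min(self.base_delay_s * 2 ** (retry_number - 1), self.max_delay_s)

    def should_quarantine(self, error, attempts_made):
        """Quarantine at once for non-retryable errors, else when the budget is spent."""
        return isinstance(error, self.non_retryable) or attempts_made >= self.max_attempts


policy = RetryPolicy()
delays = [policy.delay_before(n) for n in range(1, policy.max_attempts)]
print(delays)  # [1.0, 2.0, 4.0, 8.0]
```

This sketch omits jitter for readability; production policies commonly add random jitter to each delay so that retries from many consumers do not synchronize.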

Related Project Deep Dives

Serverless Event Router with Dead-Letter Intelligence
Design a serverless event routing system using AWS EventBridge patterns with content-based routing, intelligent retry strategies, dead-letter queue analytics, and poison pill handling for mission-critical event-driven architectures.
Level: beginner (Premium)
Event Replay Platform for Debugging Microservices
Design an event replay platform that allows developers to capture, store, and replay events from microservices for debugging and testing purposes. Enable time-travel debugging across distributed systems.
Level: beginner (Premium)
Change Data Capture (CDC) Pipeline
Design a system that captures database changes in real-time and streams them to downstream systems with schema evolution support, exactly-once delivery, and multi-database compatibility.
Level: intermediate (Premium)

Related Concepts

Backpressure
Control producer rate based on downstream capacity to avoid queue explosions and cascading failures.
Idempotency Keys
Guarantee repeated client retries do not create duplicate side effects.
Exactly-Once Processing (Practical)
Achieve effective exactly-once outcomes via idempotency, transactions, and dedup rather than magic guarantees.