resilience, reliability, latency, service-mesh
Circuit Breaker
Protect services from cascading failures by short-circuiting calls to unhealthy dependencies.
Definition
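A circuit breaker monitors the error rate and latency of calls to a downstream dependency. When health degrades past a configured threshold, the breaker opens and rejects requests for a cooldown period, then moves to half-open and admits trial requests to test whether the dependency has recovered.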
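The state machine described in the definition can be sketched in a few lines. This is a minimal, single-threaded illustration, not a production implementation: the class name `CircuitBreaker`, the exception `CircuitOpenError`, and the parameters `failure_threshold`, `reset_timeout`, and `clock` are all assumptions made for this example, and it trips on consecutive failures only (latency is not measured).

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open (hypothetical name)."""


class CircuitBreaker:
    """Illustrative breaker: trips after N consecutive failures. Not thread-safe."""

    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock  # injectable so the cooldown can be tested without sleeping
        self._failures = 0
        self._opened_at: Optional[float] = None
        self.state = "closed"

    def call(self, fn: Callable[[], T]) -> T:
        if self.state == "open":
            if self._clock() - self._opened_at < self.reset_timeout:
                # Still cooling down: reject without touching the dependency.
                raise CircuitOpenError("dependency marked unhealthy")
            self.state = "half_open"  # cooldown elapsed: admit one trial call
        try:
            result = fn()
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_failure(self) -> None:
        self._failures += 1
        # A failed trial call reopens immediately; otherwise wait for the threshold.
        if self.state == "half_open" or self._failures >= self.failure_threshold:
            self.state = "open"
            self._opened_at = self._clock()

    def _on_success(self) -> None:
        self._failures = 0
        self.state = "closed"
```

A caller would wrap each outbound request, for example `breaker.call(lambda: fetch_profile(user_id))`, where `fetch_profile` stands in for whatever client call the service makes. A real deployment would also need locking (or a per-worker breaker) and a limit on concurrent half-open trials.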
When To Use
- Synchronous service-to-service calls with bounded latency budgets.
- Critical APIs where dependency failure can exhaust thread or connection pools.
- Systems requiring graceful degradation paths.
When Not To Use
- Fully asynchronous decoupled workflows where queues already absorb spikes.
- Local in-process dependencies with negligible failure probability.
- Call paths with no fallback behavior, where an open circuit still produces user-facing hard failures.
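The last point is easier to see in code: a breaker only helps users if something useful is returned while the circuit is open. The helper below is a sketch under stated assumptions; `call_with_fallback` and its three callables are hypothetical names, and the breaker itself is reduced to an `is_open` predicate so the example stays self-contained.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_fallback(is_open: Callable[[], bool],
                       primary: Callable[[], T],
                       fallback: Callable[[], T]) -> T:
    """Return a degraded result when the circuit is open or the primary call fails."""
    if is_open():
        # Short-circuit: the dependency is not called at all.
        return fallback()
    try:
        return primary()
    except Exception:
        # In a full implementation the breaker would also record this failure.
        return fallback()
```

Typical fallbacks are a cached response, a static default, or a reduced feature set. If none of these is acceptable for a given call path, the breaker changes how the request fails but not whether it fails.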
Tradeoffs
- Contains the blast radius, but may reject requests that would have succeeded while the circuit is open.
- Improves resilience, but requires threshold tuning and observability rigor.
- Protects capacity, while introducing policy complexity per dependency.
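To make the threshold-tuning tradeoff concrete, here is one common way to phrase a trip condition: a failure rate over a sliding window, gated by a minimum call volume so that a handful of errors during quiet periods does not open the circuit. The class `RollingFailureRate` and the parameters `window_seconds`, `min_calls`, and `max_failure_rate` are assumptions for this sketch; the default values are placeholders, not recommendations.

```python
from collections import deque
from typing import Deque, Tuple


class RollingFailureRate:
    """Tracks call outcomes over a sliding time window (illustrative only)."""

    def __init__(self, window_seconds: float = 10.0, min_calls: int = 20,
                 max_failure_rate: float = 0.5) -> None:
        self.window_seconds = window_seconds
        self.min_calls = min_calls
        self.max_failure_rate = max_failure_rate
        self._events: Deque[Tuple[float, bool]] = deque()  # (timestamp, failed)

    def record(self, now: float, failed: bool) -> None:
        self._events.append((now, failed))
        self._evict(now)

    def should_trip(self, now: float) -> bool:
        self._evict(now)
        total = len(self._events)
        if total < self.min_calls:
            return False  # too little traffic to judge; avoids tripping on noise
        failures = sum(1 for _, failed in self._events if failed)
        return failures / total >= self.max_failure_rate

    def _evict(self, now: float) -> None:
        # Drop outcomes that have aged out of the window.
        while self._events and now - self._events[0][0] > self.window_seconds:
            self._events.popleft()
```

Each of the three parameters is a tuning knob per dependency, which is where the policy complexity noted above comes from. The values are usually derived from observed error rates and traffic volume rather than chosen up front.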
Common Failure Modes
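- Misconfigured thresholds cause the breaker to flap between open and closed, which destabilizes latency.
- A missing fallback strategy leaves users with persistent errors while the circuit is open.
- Breaker state that is not partition-aware blocks healthy regional paths when only one region is failing.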
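The partition-awareness problem can be illustrated by keying breaker state on both the dependency and the partition, so that failures in one region do not open the circuit for another. This is a simplified sketch: `PartitionedBreakers`, the `(dependency, region)` key, and the region names in the usage note are hypothetical, and the cooldown and half-open logic is omitted to keep the focus on keying.

```python
from typing import Dict, Tuple

Key = Tuple[str, str]  # (dependency, region); both identifiers are illustrative


class PartitionedBreakers:
    """Keeps an independent consecutive-failure count per dependency and region."""

    def __init__(self, failure_threshold: int = 3) -> None:
        self.failure_threshold = failure_threshold
        self._consecutive_failures: Dict[Key, int] = {}

    def record(self, dependency: str, region: str, ok: bool) -> None:
        key = (dependency, region)
        if ok:
            self._consecutive_failures[key] = 0
        else:
            self._consecutive_failures[key] = self._consecutive_failures.get(key, 0) + 1

    def is_open(self, dependency: str, region: str) -> bool:
        # Unknown keys count as healthy (zero failures).
        return self._consecutive_failures.get((dependency, region), 0) >= self.failure_threshold
```

With this keying, repeated failures recorded for `("payments", "eu-west")` leave `("payments", "us-east")` closed. The right partition key depends on how the dependency actually fails; region, availability zone, shard, or host are all plausible choices.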
Interview Framing
Use this structure when the interviewer asks for this pattern explicitly.
Describe the closed, open, and half-open state transitions, the fallback strategy, and the metrics and alerts that govern tuning.
Related Project Deep Dives
API Rate-Limiting as a Multi-Region Service
Design a globally consistent rate limiting service with low latency and multi-region enforcement.
Serverless Event Router with Dead-Letter Intelligence
Design a serverless event routing system using AWS EventBridge patterns with content-based routing, intelligent retry strategies, dead-letter queue analytics, and poison pill handling for mission-critical event-driven architectures.
Related Concepts
Backpressure
Control producer rate based on downstream capacity to avoid queue explosions and cascading failures.
Dead-Letter Queue (DLQ)
Isolate repeatedly failing messages for triage without blocking healthy traffic.
Leader Election
Select a single coordinator for shared work while preserving failover safety.