The Saga Pattern: Managing Distributed Transactions Without 2PC

Implementing the saga pattern with choreography and orchestration approaches, compensating transactions, and a real e-commerce order flow example.

Akhil Sharma

February 6, 2026

11 min read

In a monolith, you wrap related operations in a database transaction — either everything succeeds or everything rolls back. In microservices, each service has its own database. There's no shared transaction boundary. When creating an order requires reserving inventory (service A), charging payment (service B), and scheduling shipment (service C), how do you handle partial failures?

Two-phase commit (2PC) is the textbook answer. It's also the wrong answer for most microservice architectures: it requires all participants to be available, holds locks across services until the coordinator has committed, and turns the coordinator into a throughput bottleneck. The saga pattern is the practical alternative.

Saga Basics

A saga is a sequence of local transactions. Each local transaction updates its own service's database and publishes an event or message that triggers the next step. If a step fails, the saga executes compensating transactions for all previously completed steps, in reverse order, to undo their effects.

Key insight: compensating transactions don't "undo" in the database ROLLBACK sense. They're new transactions that semantically reverse the effect. An inventory reservation is compensated by releasing the reservation. A payment charge is compensated by a refund.
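The mechanics above can be sketched in a few lines. This is a minimal in-process illustration, not a production implementation: `SagaStep`, `run_saga`, and the in-memory `log` are hypothetical names, and the lambdas stand in for real local transactions against each service's database.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]      # the local transaction
    compensate: Callable[[], None]  # a new transaction that semantically reverses it


def run_saga(steps: List[SagaStep]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[SagaStep] = []
    for step in steps:
        try:
            step.action()
            completed.append(step)
        except Exception:
            # The failed step never committed, so only earlier steps are compensated.
            for done in reversed(completed):
                done.compensate()
            return False
    return True


# Hypothetical stand-ins for real services; each lambda represents a local transaction.
log: List[str] = []


def charge_payment() -> None:
    raise RuntimeError("card declined")


order_saga = [
    SagaStep("reserve_inventory",
             lambda: log.append("inventory reserved"),
             lambda: log.append("inventory released")),
    SagaStep("charge_payment",
             charge_payment,
             lambda: log.append("payment refunded")),
]

succeeded = run_saga(order_saga)
```

Here the payment step fails, so the saga releases the inventory reservation and never issues a refund, because no charge was ever committed.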

Choreography: Event-Driven Sagas

Each service listens for events and decides what to do next. No central coordinator. Services communicate through events.
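A rough sketch of a choreographed order flow follows. It assumes an in-process `EventBus` as a stand-in for a real message broker, and the event names (`OrderCreated`, `InventoryReserved`, `PaymentFailed`, and so on) and the "decline charges above 100" rule are invented for the demo.

```python
from collections import defaultdict, deque
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]


class EventBus:
    """Hypothetical in-process stand-in for a message broker."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[Event], None]]] = defaultdict(list)
        self._queue: deque = deque()
        self.history: List[str] = []

    def subscribe(self, event_type: str, handler: Callable[[Event], None]) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: Event) -> None:
        self._queue.append((event_type, payload))

    def run(self) -> None:
        # Deliver events in FIFO order until no service emits anything new.
        while self._queue:
            event_type, payload = self._queue.popleft()
            self.history.append(event_type)
            for handler in self._handlers[event_type]:
                handler(payload)


bus = EventBus()
stock = {"sku-1": 1}


def inventory_on_order_created(event: Event) -> None:
    sku = event["sku"]
    if stock.get(sku, 0) > 0:
        stock[sku] -= 1
        bus.publish("InventoryReserved", event)
    else:
        bus.publish("InventoryRejected", event)


def payment_on_inventory_reserved(event: Event) -> None:
    # Assumption for the demo: charges above 100 are declined.
    if event["amount"] > 100:
        bus.publish("PaymentFailed", event)
    else:
        bus.publish("PaymentCharged", event)


def inventory_on_payment_failed(event: Event) -> None:
    # Compensating transaction: release the reservation made earlier.
    stock[event["sku"]] += 1
    bus.publish("InventoryReleased", event)


bus.subscribe("OrderCreated", inventory_on_order_created)
bus.subscribe("InventoryReserved", payment_on_inventory_reserved)
bus.subscribe("PaymentFailed", inventory_on_payment_failed)

bus.publish("OrderCreated", {"order_id": "o-1", "sku": "sku-1", "amount": 250})
bus.run()
```

No handler knows about any other service. The inventory service reacts to `PaymentFailed` on its own, which is what keeps coupling loose and also what scatters the flow across services.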

Choreography Trade-offs

Pros:

  • Simple to implement for small sagas (3-4 steps)
  • Loose coupling — services only know about events, not each other
  • No single point of failure

Cons:

  • Hard to understand the overall flow — logic is scattered across services
  • Difficult to add new steps (must update event listeners in multiple services)
  • No central place to see saga status or handle timeouts
  • Cyclic event dependencies can create infinite loops

Orchestration: Centralized Saga Coordinator

A saga orchestrator, also called a saga execution coordinator (SEC), manages the flow. It sends commands to services and listens for their responses, deciding what to do next based on a state machine.
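One possible shape for such an orchestrator is sketched below. It is a simplification under stated assumptions: commands are plain callables that return `True` for a success reply (a real orchestrator would send messages and persist state between replies), and `OrderSagaOrchestrator`, `SagaInstance`, `SagaStatus`, and the `record` helper are all hypothetical names.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List, Tuple

Command = Callable[[Dict], bool]  # returns True when the service replies with success


class SagaStatus(Enum):
    RUNNING = "running"
    COMPENSATING = "compensating"
    COMPLETED = "completed"
    COMPENSATED = "compensated"


@dataclass
class SagaInstance:
    saga_id: str
    context: Dict
    status: SagaStatus = SagaStatus.RUNNING
    completed_steps: List[str] = field(default_factory=list)


class OrderSagaOrchestrator:
    def __init__(self, steps: List[Tuple[str, Command, Command]]) -> None:
        # Each entry: (step name, forward command, compensating command).
        self.steps = steps
        self.sagas: Dict[str, SagaInstance] = {}  # central place to query status

    def start(self, saga_id: str, context: Dict) -> SagaInstance:
        saga = SagaInstance(saga_id, context)
        self.sagas[saga_id] = saga
        for name, command, _ in self.steps:
            if command(saga.context):
                saga.completed_steps.append(name)
            else:
                self._compensate(saga)
                return saga
        saga.status = SagaStatus.COMPLETED
        return saga

    def _compensate(self, saga: SagaInstance) -> None:
        saga.status = SagaStatus.COMPENSATING
        compensations = {name: comp for name, _, comp in self.steps}
        # Reverse the completed steps, most recent first.
        for name in reversed(saga.completed_steps):
            compensations[name](saga.context)
        saga.status = SagaStatus.COMPENSATED


calls: List[str] = []


def record(label: str, ok: bool = True) -> Command:
    """Build a fake service command that logs its call and returns `ok`."""
    def command(context: Dict) -> bool:
        calls.append(label)
        return ok
    return command


orchestrator = OrderSagaOrchestrator([
    ("inventory", record("reserve_inventory"), record("release_inventory")),
    ("payment", record("charge_payment"), record("refund_payment")),
    ("shipping", record("schedule_shipment", ok=False), record("cancel_shipment")),
])
saga = orchestrator.start("order-42", {"sku": "sku-1", "amount": 30})
```

Because the step list and the saga records live in one object, reordering steps or checking the status of `order-42` is a local change or a dictionary lookup rather than a cross-service investigation.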

Orchestration Trade-offs

Pros:

  • Clear overview of the saga flow in one place
  • Easy to add, remove, or reorder steps
  • Centralized error handling and compensation logic
  • Easy to query saga status

Cons:

  • The orchestrator is a single point of failure (mitigate with HA deployment)
  • Risk of the orchestrator becoming a "god service" that knows too much about other services
  • Tighter coupling between the orchestrator and participant services

Designing Compensating Transactions

Not every action is easily compensatable. Consider:

| Action | Compensating Transaction | Complexity |
| --- | --- | --- |
| Reserve inventory | Release reservation | Easy |
| Debit account | Credit account (refund) | Easy |
| Send email | Send correction/cancellation email | Possible but imperfect |
| Charge credit card | Refund (but transaction fees aren't refunded) | Lossy |
| Ship physical package | Initiate return process | Hard, slow |
| Delete data | Cannot restore if not backed up | Impossible |

Pivot vs retriable transactions: Some saga steps are pivot transactions. A pivot is the point of no return: once it succeeds, the saga must run to completion. Steps before the pivot are compensatable, so a failure there can still be reversed. Steps after the pivot are retriable: they are never compensated, so they must eventually succeed, with retries.
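The three step categories can be made explicit in code. The sketch below is an illustration only: `run_with_pivot` is a hypothetical helper, treating the payment charge as the pivot is an assumption rather than something the pattern dictates, and a real system would wait with backoff between retries instead of looping immediately.

```python
from typing import Callable, List, Tuple

Action = Callable[[], bool]  # True on success


def run_with_pivot(
    compensatable: List[Tuple[Action, Callable[[], None]]],
    pivot: Action,
    retriable: List[Action],
    max_attempts: int = 5,
) -> str:
    undo_stack: List[Callable[[], None]] = []

    def roll_back() -> str:
        for undo in reversed(undo_stack):
            undo()
        return "compensated"

    # Phase 1: steps before the pivot can still be reversed.
    for action, compensate in compensatable:
        if not action():
            return roll_back()
        undo_stack.append(compensate)

    # Phase 2: the pivot is the go/no-go decision.
    if not pivot():
        return roll_back()

    # Phase 3: past the pivot nothing is compensated; each step is retried.
    for action in retriable:
        for _ in range(max_attempts):
            if action():
                break
        else:
            # Retries exhausted: escalate instead of compensating.
            return "needs_intervention"
    return "completed"


attempts = {"ship": 0}
events: List[str] = []


def schedule_shipment() -> bool:
    # Simulated flaky service: succeeds on the third attempt.
    attempts["ship"] += 1
    return attempts["ship"] >= 3


result = run_with_pivot(
    compensatable=[(lambda: True, lambda: events.append("release_inventory"))],
    pivot=lambda: True,  # assumed pivot: the payment charge
    retriable=[schedule_shipment],
)
```

In this run the shipment step fails twice and then succeeds, so the saga completes without touching the compensation for the inventory reservation.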

Timeout Handling

Sagas can get stuck: a service might be down, or a message might be lost. You need timeout detection so that a stalled saga is retried or compensated instead of waiting forever.
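A simple way to detect stuck sagas is a periodic scan over saga records for any that are still running but have not progressed recently. The sketch below assumes each saga stores an `updated_at` timestamp; `SagaRecord`, `find_stuck_sagas`, and the five-minute default are hypothetical choices, and what to do with a stuck saga (retry the current step or start compensation) is left to the caller.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List


@dataclass
class SagaRecord:
    saga_id: str
    status: str           # e.g. "running", "completed", "compensated"
    current_step: str
    updated_at: datetime  # last time the saga made progress


def find_stuck_sagas(
    sagas: Iterable[SagaRecord],
    now: datetime,
    timeout: timedelta = timedelta(minutes=5),
) -> List[SagaRecord]:
    """Return running sagas that have made no progress within `timeout`."""
    return [
        saga for saga in sagas
        if saga.status == "running" and now - saga.updated_at > timeout
    ]


now = datetime(2026, 2, 6, 12, 0, tzinfo=timezone.utc)
records = [
    SagaRecord("s-1", "running", "charge_payment", now - timedelta(minutes=12)),
    SagaRecord("s-2", "running", "reserve_inventory", now - timedelta(minutes=1)),
    SagaRecord("s-3", "completed", "schedule_shipment", now - timedelta(hours=2)),
]
stuck_ids = [saga.saga_id for saga in find_stuck_sagas(records, now)]
```

Only `s-1` is flagged: `s-2` is still within the timeout, and `s-3` is old but already finished. With orchestration this scan runs against the orchestrator's own store; with choreography there is no single store to scan, which is the gap noted in the trade-offs above.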

Choreography vs Orchestration: Decision Guide

For most production systems with more than 3 steps, orchestration is the better choice. The centralized view of saga state and explicit compensation logic outweighs the coupling trade-off. Use choreography for simple, well-understood flows where the team wants maximum decoupling.

2PC vs Saga: When Each Applies

| Criteria | 2PC | Saga |
| --- | --- | --- |
| Consistency | Strong (ACID) | Eventual |
| Availability | Low (all participants must be up) | High (tolerates partial failure) |
| Latency | High (lock held during prepare phase) | Lower (no distributed locks) |
| Scalability | Poor (coordinator bottleneck) | Good |
| Use case | Databases within one datacenter | Microservices across network |

Use 2PC when you need strong consistency between a small number of transactional resources in the same datacenter (e.g., writing to PostgreSQL and a message queue, provided both support a prepare/commit protocol). Use sagas for everything else in a microservice architecture.

The saga pattern accepts that distributed transactions can't provide the same guarantees as local transactions. Instead of pretending otherwise, it designs for partial failure with explicit compensation logic. The result is a system that's more resilient, more scalable, and honest about its consistency guarantees.
