You’re Reading Part 2 of a 3-Part Series on Paxos Consensus Algorithms in Distributed Systems.


In part 1 of this series, we looked at why consensus is such a tricky problem in distributed systems and how Paxos provides a way out. Through Alice and Bob’s battle for a lock, we saw how Paxos uses majority agreement to make decisions that can’t be undone once chosen.


Even when nodes fail, recover, or rejoin, the system still converges safely on one value. That’s the magic of Paxos—it keeps things consistent in an inconsistent world. In Part 2, we’ll dive into the messier edge cases and see how Paxos still manages to hold things together.


How Paxos Handles Edge Cases


In Part 1, we saw Paxos work smoothly: Alice proposed a value, the nodes accepted it, and even when Bob joined later, the algorithm forced him to carry forward Alice’s decision. Real systems, however, aren’t always this tidy. Messages can get lost, nodes can crash, and multiple proposers might compete simultaneously.


Let’s walk through a few messy scenarios with our familiar friends, Alice and Bob.


Edge Case 1 – Lost Commit (Alice’s Proposal Stalls)


Alice once again proposes AliceLock with proposal number 1001.
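A quick aside on those numbers: proposal numbers must be unique and totally ordered across all proposers. This series doesn't pin down a scheme, but one common approach (sketched below purely as an illustration, not anything the protocol mandates) combines a round counter with a per-proposer id:

```python
def proposal_number(round_num: int, proposer_id: int, max_proposers: int = 10) -> int:
    """One common scheme for unique, increasing proposal numbers:
    a higher round always dominates, and two proposers can never
    collide because each owns a distinct id."""
    return round_num * max_proposers + proposer_id

alice_id, bob_id = 1, 2
print(proposal_number(100, alice_id))  # 1001: Alice's number in this walkthrough
print(proposal_number(200, bob_id))    # 2002: a later, higher number for Bob
```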





Several things can go wrong in this round. Let's unpack each failure:


- Node 4 never sends back its final commit message (network glitch): as long as a majority accepted, AliceLock is already chosen. Only the acknowledgment is missing; Alice may not learn the outcome, but the decision itself stands.

- Node 4 goes down after accepting but before replying: acceptors persist their accepted state before responding, so when Node 4 recovers it still holds (n=1001, AliceLock). Again, the decision survives.

- Alice (the proposer) disappears: the chosen value lives on the acceptors, not on Alice. Any future proposer that runs a prepare round against a majority will discover AliceLock and be forced to carry it forward.



Now Bob arrives with BobLock (n=2001). By this time, let's say nodes 2 and 3 are back online. When Bob runs his prepare phase, any majority he contacts overlaps with the majority that accepted AliceLock, so at least one promise reports that value, and Bob (and through him, nodes 2 and 3) eventually learns that Alice holds the lock.



Lesson: Even if a commit acknowledgment is lost, Paxos ensures safety: Bob cannot override AliceLock. But liveness suffers; Bob's own value, BobLock, never makes progress, since his round is forced to re-propose AliceLock.
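To make this concrete, here is a minimal Python sketch of the acceptor side (the class and method names are my own illustration, not a reference implementation). The detail doing the work in this edge case is that a promise carries back any value the acceptor has already accepted:

```python
class Acceptor:
    """A single Paxos acceptor. In a real system this state is written to
    stable storage before replying, which is why Node 4 still holds
    AliceLock after a crash."""

    def __init__(self):
        self.promised_n = 0        # highest proposal number we promised
        self.accepted_n = None     # number of the proposal we accepted, if any
        self.accepted_value = None

    def handle_prepare(self, n):
        # Phase 1: promise to ignore anything numbered below n, and report
        # any value we already accepted so the proposer must adopt it.
        if n > self.promised_n:
            self.promised_n = n
            return ("promise", self.accepted_n, self.accepted_value)
        return ("reject", self.promised_n, None)

    def handle_accept(self, n, value):
        # Phase 2: accept only if we haven't promised a higher number.
        if n >= self.promised_n:
            self.promised_n = n
            self.accepted_n, self.accepted_value = n, value
            return "accepted"
        return "rejected"

node4 = Acceptor()
node4.handle_prepare(1001)
node4.handle_accept(1001, "AliceLock")   # Alice's value lands
print(node4.handle_prepare(2001))        # Bob's prepare...
# -> ('promise', 1001, 'AliceLock'): Bob now knows he must carry AliceLock
```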


Edge Case 2 – Dueling Proposers (Alice and Bob Propose Simultaneously)




Step 1 – First accepts

Say Node 1 accepts Alice's proposal first, while Node 5 accepts Bob's. Now both Alice and Bob have one vote each.


Step 2 – Other nodes respond differently

Say Node 2 sides with Alice while Node 4 sides with Bob.

So far: Alice holds Nodes 1 and 2, Bob holds Nodes 4 and 5, and Node 3 has not yet responded to either proposer.


Step 3 – Node 3 crashes


Before hearing from either proposer, Node 3 goes down.

This leaves a 2–2 split: Alice with Nodes 1 and 2, Bob with Nodes 4 and 5, and the tiebreaker offline.

Neither proposer can form a majority (3 out of 5) with Node 3 down.


Step 4 – Stalemate (temporary)

Both proposers are stuck: each holds two votes, and neither can complete its round without a third acceptor.

Lesson: Safety is preserved. No conflicting value has been committed, because neither proposer reached a quorum.
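The arithmetic behind the stalemate is plain majority counting. A quick sketch (the `majority` helper is just for illustration):

```python
def majority(cluster_size: int) -> int:
    """Smallest number of acceptors that forms a quorum."""
    return cluster_size // 2 + 1

votes = {"AliceLock": 2, "BobLock": 2}   # Node 3 is offline
quorum = majority(5)                     # 3 of 5

for value, count in votes.items():
    status = "chosen" if count >= quorum else "stalled"
    print(f"{value}: {count}/{quorum} -> {status}")
# AliceLock: 2/3 -> stalled
# BobLock: 2/3 -> stalled
```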


Step 5 – Retry with a higher number

Suppose Bob retries with a new proposal number n=2002.


So Bob learns from the promises: Nodes 4 and 5 already accepted BobLock under his earlier number (say n=2001), while Nodes 1 and 2 accepted AliceLock under a lower number.

By Paxos rules, he must carry forward the value of the highest-numbered accepted proposal, which here is BobLock.
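That "carry forward" rule fits in a few lines. Here is a hedged sketch of the proposer's value-selection step after Phase 1 (the `Promise` shape and `choose_value` name are my own, not part of any standard API):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Promise:
    accepted_n: Optional[int]     # number of the sender's accepted proposal, if any
    accepted_value: Optional[str]

def choose_value(promises: list[Promise], my_value: str) -> str:
    """Paxos rule: if any acceptor in the quorum already accepted a value,
    propose the value with the highest accepted proposal number;
    otherwise we are free to propose our own."""
    accepted = [p for p in promises if p.accepted_n is not None]
    if not accepted:
        return my_value
    return max(accepted, key=lambda p: p.accepted_n).accepted_value

# Bob's retry with n=2002: two nodes report BobLock (n=2001),
# two report AliceLock under a lower number (say n=1001).
promises = [Promise(2001, "BobLock"), Promise(2001, "BobLock"),
            Promise(1001, "AliceLock"), Promise(1001, "AliceLock")]
print(choose_value(promises, "BobLock"))  # -> BobLock
```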


Step 6 – Consensus reached

Bob sends accept requests with (n=2002, BobLock). The four live nodes accept, comfortably clearing the quorum of three.

Final decision: BobLock is chosen.


Edge Case 3 – Minority Partition (No Quorum)



Suppose a network partition occurs and only 2 out of 5 nodes are reachable (say nodes 4 and 5).


Alice proposes AliceLock (say n=4001) to node 4; at the same time, Bob proposes BobLock (n=4002) to node 5. Each proposer can reach at most 2 of the 5 acceptors, which is short of the quorum of 3.


Result: No value is chosen.

Lesson: Paxos prioritizes safety over availability. With fewer than a majority of nodes reachable, the system cannot make progress. This is why Paxos-based systems may stall under minority partitions; it is the price of never committing conflicting values.
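Here is an illustrative check of why both proposals stall (the reachability set stands in for real message sends; all names are hypothetical):

```python
def can_reach_quorum(reachable: set[int], cluster: set[int]) -> bool:
    """Phase 1 under a partition: promises can only come from reachable
    acceptors, but a quorum is a majority of the WHOLE cluster."""
    quorum = len(cluster) // 2 + 1
    return len(reachable) >= quorum   # assume every reachable node promises

cluster = {1, 2, 3, 4, 5}
print(can_reach_quorum({4, 5}, cluster))     # False: 2 < 3, both proposals stall
print(can_reach_quorum({1, 2, 3}, cluster))  # True: a majority side can proceed
```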


Edge Case 4 – Out-of-Order / Delayed Messages

Now consider message delays. Say Alice starts a round with n=5001, but her accept messages get stuck in the network. Meanwhile, Bob completes a full round with a higher number (say n=5002), so every node has promised 5002.

But later, the delayed accept message from Alice (n=5001, AliceLock) arrives at Node 1. Node 1 has already promised 5002, so it simply rejects the stale message.


Lesson: Paxos tolerates asynchronous, delayed, and reordered messages. Outdated proposals are ignored once a higher-numbered promise exists, preserving safety.
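A tiny trace of that rejection, reusing the scenario's numbers (the function is illustrative, and a full acceptor would also record the value when it does accept):

```python
# Node 1's durable state after promising Bob's higher-numbered round.
promised_n = 5002

def on_accept(n: int, value: str) -> str:
    """Reject any accept request numbered below our promise."""
    if n < promised_n:
        return f"rejected stale accept (n={n}, {value})"
    return f"accepted (n={n}, {value})"

print(on_accept(5001, "AliceLock"))  # rejected: a higher promise exists
print(on_accept(5002, "BobLock"))    # accepted
```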


Wrapping Up


So far, we've seen Paxos handle a range of real-world messiness:

- Lost commit acknowledgments and crashed proposers (Edge Case 1)
- Dueling proposers splitting the vote (Edge Case 2)
- Minority partitions that block quorum (Edge Case 3)
- Delayed and out-of-order messages (Edge Case 4)



Paxos guarantees one thing above all else: safety is never compromised. But this comes at the cost of liveness in certain situations: proposers can starve, partitions can halt progress, and competing proposers can livelock one another by repeatedly preempting each other's rounds.


In Part 3, we’ll explore how Raft (and Multi-Paxos) address these practical challenges, making leader-based consensus simpler and more efficient in real-world deployments.