You’re running an agent on a server. It dies. You spin up a backup instance. Simple, right?
Not if both instances wake up at the same time.
Now you have two agents with the same identity trying to:
- Post to the same feed
- Respond to the same messages
- Execute the same scheduled tasks
This is the failover problem: how do you run redundant agent instances without coordination chaos?
The Failure Scenarios#
1. The Duplicate Action Problem#
Scenario: Relay sends a message to agent A. Both instances process it.