Clio develop
The XRP Ledger API server.
Decides which node in the cluster should be the writer based on cluster state.
#include <WriterDecider.hpp>
Public Member Functions
| Return type | Member | Description |
| | WriterDecider (boost::asio::thread_pool &ctx, std::unique_ptr< etl::WriterStateInterface > writerState, std::chrono::steady_clock::duration recoveryTime=kRECOVERY_TIME) | Constructs a WriterDecider. |
| void | onNewState (ClioNode::CUuid selfId, std::shared_ptr< Backend::ClusterData const > clusterData) | Handles cluster state changes and decides whether this node should be the writer. |
Static Public Attributes
| Type | Attribute |
| static constexpr std::chrono::steady_clock::duration | kRECOVERY_TIME = std::chrono::hours{1} |
Decides which node in the cluster should be the writer based on cluster state.
This class monitors cluster state changes and determines whether the current node should act as the writer to the database.
All non-ReadOnly nodes are sorted by UUID. The first node with etlStarted and cacheIsFull is elected writer. If no fully-ready node exists, the first node with etlStarted is chosen. All others give up writing.
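The election rule above can be illustrated with a small, self-contained sketch. It is not Clio's implementation: `NodeInfo`, `electWriter`, and `shouldWrite` are hypothetical names, UUIDs are modelled as strings compared lexicographically, and returning "no writer" when no node has started ETL is an assumption, since that case is not described here.

```cpp
#include <algorithm>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a node's published state; the real ClioNode
// type is not shown in this documentation.
struct NodeInfo {
    std::string uuid;          // assumption: UUIDs order like plain strings
    bool readOnly = false;
    bool etlStarted = false;
    bool cacheIsFull = false;
};

// Returns the UUID of the elected writer, or std::nullopt if no
// non-ReadOnly node has started ETL (assumed behaviour for that case).
std::optional<std::string> electWriter(std::vector<NodeInfo> nodes)
{
    // ReadOnly nodes never take part in the election.
    nodes.erase(
        std::remove_if(nodes.begin(), nodes.end(), [](NodeInfo const& n) { return n.readOnly; }),
        nodes.end()
    );

    // Candidates are considered in UUID order.
    std::sort(nodes.begin(), nodes.end(), [](NodeInfo const& a, NodeInfo const& b) {
        return a.uuid < b.uuid;
    });

    // First choice: the first fully-ready node.
    auto it = std::find_if(nodes.begin(), nodes.end(), [](NodeInfo const& n) {
        return n.etlStarted && n.cacheIsFull;
    });

    // Second choice: the first node that has at least started ETL.
    if (it == nodes.end()) {
        it = std::find_if(nodes.begin(), nodes.end(), [](NodeInfo const& n) { return n.etlStarted; });
    }

    if (it == nodes.end())
        return std::nullopt;
    return it->uuid;
}

// A node writes only if it is the elected writer; all others give up writing.
bool shouldWrite(std::string const& selfId, std::vector<NodeInfo> const& nodes)
{
    auto const writer = electWriter(nodes);
    return writer.has_value() && *writer == selfId;
}
```

Because every node sorts the same snapshot the same way, each one can reach the same answer locally without exchanging extra messages, which is presumably why the ordering is defined on UUIDs.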
Fallback is the slower but more reliable mechanism based on database write-conflict detection (a node waits ~10 s of DB silence before writing). The cluster enters fallback whenever any non-ReadOnly node publishes DbRole::Fallback — for example during a rolling upgrade when an old node without cluster-coordination support is present.
To avoid the cluster staying in fallback indefinitely, a recovery timer is started when this node enters fallback. After the timer fires the node enters DbRole::FallbackRecovery and coordinates with peers to return to election mode. If any peer is already in FallbackRecovery, the node joins immediately (contagion rule), cancelling its own pending timer.
Nodes in FallbackRecovery continue the fallback write-race so there is no write availability gap during the coordination phase.
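The recovery timer and the contagion rule can be sketched as a small state holder. This is only an illustration under stated assumptions: `RecoveryTracker` and the `Role` enum (including the `Election` value) are hypothetical, and time is passed in explicitly instead of using a real asynchronous timer so that the behaviour is easy to follow.

```cpp
#include <algorithm>
#include <chrono>
#include <optional>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical role enum; only Fallback and FallbackRecovery are named in
// this documentation (as DbRole values).
enum class Role { Election, Fallback, FallbackRecovery };

class RecoveryTracker {
public:
    explicit RecoveryTracker(Clock::duration recoveryTime) : recoveryTime_{recoveryTime} {}

    Role role() const { return role_; }
    bool timerRunning() const { return deadline_.has_value(); }

    // Entering fallback starts the recovery timer.
    void enterFallback(Clock::time_point now)
    {
        role_ = Role::Fallback;
        deadline_ = now + recoveryTime_;
    }

    // Contagion rule: a Fallback node that sees any peer in FallbackRecovery
    // joins immediately and cancels its own pending timer.
    void onPeerRoles(std::vector<Role> const& peers)
    {
        bool const peerRecovering =
            std::find(peers.begin(), peers.end(), Role::FallbackRecovery) != peers.end();
        if (role_ == Role::Fallback && peerRecovering) {
            role_ = Role::FallbackRecovery;
            deadline_.reset();
        }
    }

    // Timer expiry moves a Fallback node into FallbackRecovery.
    void tick(Clock::time_point now)
    {
        if (role_ == Role::Fallback && deadline_ && now >= *deadline_) {
            role_ = Role::FallbackRecovery;
            deadline_.reset();
        }
    }

    // Fallback and FallbackRecovery both keep taking part in the fallback
    // write-race, so there is no write availability gap.
    bool inWriteRace() const { return role_ != Role::Election; }

private:
    Clock::duration recoveryTime_;
    Role role_ = Role::Election;
    std::optional<Clock::time_point> deadline_;
};
```

Passing a short `recoveryTime` (seconds instead of the one-hour default) is what makes this kind of logic practical to exercise in tests.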
cluster::WriterDecider::WriterDecider(boost::asio::thread_pool& ctx, std::unique_ptr<etl::WriterStateInterface> writerState, std::chrono::steady_clock::duration recoveryTime = kRECOVERY_TIME)
Constructs a WriterDecider.
| ctx | Thread pool for executing asynchronous operations |
| writerState | Writer state interface for controlling write operations |
| recoveryTime | How long to wait in Fallback before attempting recovery (defaults to kRECOVERY_TIME; pass a short duration in tests) |
void cluster::WriterDecider::onNewState(ClioNode::CUuid selfId, std::shared_ptr<Backend::ClusterData const> clusterData)
Handles cluster state changes and decides whether this node should be the writer.
Spawns an asynchronous task that applies the state machine described in the class documentation. Decisions are based on the clusterData snapshot:
- If clusterData has no value (communication failure), no action is taken.
- If this node is ReadOnly, writing is given up unconditionally.
- If this node is in Fallback and a FallbackRecovery node is visible, the contagion rule applies: this node also enters FallbackRecovery and the recovery timer is cancelled.
- If this node is in Fallback and the recovery timer is not running, it is started (handles the case where fallback was triggered externally, e.g. by Monitor).
- If this node is in FallbackRecovery and no Fallback nodes are visible, the recovery coordination is complete: writing is given up and the fallback recovery flag is cleared so the node enters election mode on the next cycle.
- If any Fallback node is visible, this node switches to Fallback and the recovery timer is started.
- Otherwise, the first fully-ready (etlStarted && cacheIsFull) non-ReadOnly node is elected writer.
| selfId | The UUID of the current node |
| clusterData | Shared pointer to current cluster data; may be empty if communication failed |
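The order in which these conditions are checked can be sketched as a pure function over a simplified snapshot. This is a hedged illustration, not the actual body of `onNewState`: `Role`, `Action`, `Snapshot`, and `decide` are hypothetical, `peers` is assumed to hold the roles of the other visible nodes, and the "keep current role" branch covers combinations that the list above does not describe.

```cpp
#include <algorithm>
#include <optional>
#include <vector>

// Hypothetical types for illustration only.
enum class Role { ReadOnly, Election, Fallback, FallbackRecovery };

enum class Action {
    None,                // do nothing this cycle
    GiveUpWriting,       // ReadOnly node never writes
    JoinRecovery,        // contagion: enter FallbackRecovery, cancel timer
    StartRecoveryTimer,  // in Fallback but timer not yet running
    FinishRecovery,      // give up writing, clear recovery flag
    SwitchToFallback,    // a Fallback peer is visible
    RunElection          // normal election among ready nodes
};

struct Snapshot {
    Role self;
    bool timerRunning;
    std::vector<Role> peers;  // assumption: roles of the other visible nodes
};

Action decide(std::optional<Snapshot> const& s)
{
    // No cluster data (communication failure): take no action.
    if (!s)
        return Action::None;

    if (s->self == Role::ReadOnly)
        return Action::GiveUpWriting;

    auto const sees = [&](Role r) {
        return std::find(s->peers.begin(), s->peers.end(), r) != s->peers.end();
    };

    if (s->self == Role::Fallback && sees(Role::FallbackRecovery))
        return Action::JoinRecovery;
    if (s->self == Role::Fallback && !s->timerRunning)
        return Action::StartRecoveryTimer;
    if (s->self == Role::FallbackRecovery && !sees(Role::Fallback))
        return Action::FinishRecovery;

    // Assumption: remaining Fallback / FallbackRecovery cases keep their
    // current role; the documentation does not spell these out.
    if (s->self == Role::Fallback || s->self == Role::FallbackRecovery)
        return Action::None;

    if (sees(Role::Fallback))
        return Action::SwitchToFallback;

    return Action::RunElection;
}
```

Keeping the decision separate from its side effects (starting timers, giving up writing) is a design choice of this sketch; it makes the precedence of the rules easy to test in isolation.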