LogCabin
|
Interprets and executes operations that have been committed into the Raft log. More...
#include <StateMachine.h>
Classes | |
struct | Session |
Tracks state for a particular client. More... | |
Public Types | |
enum | { MIN_SUPPORTED_VERSION, MAX_SUPPORTED_VERSION } |
typedef Protocol::Client::StateMachineCommand | Command |
typedef Protocol::Client::StateMachineQuery | Query |
Public Member Functions | |
StateMachine (std::shared_ptr< RaftConsensus > consensus, Core::Config &config, Globals &globals) | |
~StateMachine () | |
bool | query (const Query::Request &request, Query::Response &response) const |
Called by ClientService to execute read-only queries on the state machine. | |
void | updateServerStats (Protocol::ServerStats &serverStats) const |
Add information about the state machine state to the given structure. | |
void | wait (uint64_t index) const |
Return once the state machine has applied at least the given entry. | |
bool | waitForResponse (uint64_t logIndex, const Command::Request &command, Command::Response &response) const |
Called by ClientService to get a response for a read-write command on the state machine. | |
bool | isTakingSnapshot () const |
Return true if the server is currently taking a snapshot and false otherwise. | |
void | startTakingSnapshot () |
If the state machine is not taking a snapshot, this starts one. | |
void | stopTakingSnapshot () |
If the server is currently taking a snapshot, abort it. | |
std::chrono::nanoseconds | getInhibit () const |
Return the time for which the state machine will not take any automated snapshots. | |
void | setInhibit (std::chrono::nanoseconds duration) |
Disable automated snapshots for the given duration. | |
Private Types | |
typedef Core::Time::SteadyClock | Clock |
Clock used by watchdog timer thread. | |
typedef Clock::time_point | TimePoint |
Point in time of Clock. | |
Private Member Functions | |
void | apply (const RaftConsensus::Entry &entry) |
Invoked once per committed entry from the Raft log. | |
void | applyThreadMain () |
Main function for thread that waits for new commands from Raft. | |
void | serializeSessions (SnapshotStateMachine::Header &header) const |
Return the sessions table as a protobuf message for writing into a snapshot. | |
void | expireResponses (Session &session, uint64_t firstOutstandingRPC) |
Update the session and clean up unnecessary responses. | |
void | expireSessions (uint64_t clusterTime) |
Remove old sessions. | |
uint16_t | getVersion (uint64_t logIndex) const |
Return the version of the state machine behavior as of the given log index. | |
void | killSnapshotProcess (Core::HoldingMutex holdingMutex, int signum) |
If there is a current snapshot process, send it a signal and return immediately. | |
void | loadSessions (const SnapshotStateMachine::Header &header) |
Restore the sessions table from a snapshot. | |
void | loadSnapshot (Core::ProtoBuf::InputStream &stream) |
Read all of the state machine state from a snapshot file (including version, sessions, and tree). | |
void | loadVersionHistory (const SnapshotStateMachine::Header &header) |
Restore the versionHistory table from a snapshot. | |
void | serializeVersionHistory (SnapshotStateMachine::Header &header) const |
Return the versionHistory table as a protobuf message for writing into a snapshot. | |
bool | shouldTakeSnapshot (uint64_t lastIncludedIndex) const |
Return true if it is time to create a new snapshot. | |
void | snapshotThreadMain () |
Main function for thread that calls takeSnapshot when appropriate. | |
void | snapshotWatchdogThreadMain () |
Main function for thread that checks the progress of the child process. | |
void | takeSnapshot (uint64_t lastIncludedIndex, std::unique_lock< Core::Mutex > &lockGuard) |
Called by snapshotThreadMain to actually take the snapshot. | |
void | warnUnknownRequest (const google::protobuf::Message &request, const char *reason) const |
Called to log a debug message if appropriate when the state machine encounters a query or command that is not understood by the current running version. | |
Private Attributes | |
std::shared_ptr< RaftConsensus > | consensus |
Consensus module from which this state machine pulls commands and snapshots. | |
Globals & | globals |
Server-wide globals. | |
uint64_t | snapshotBlockPercentage |
Used for testing the snapshot watchdog thread. | |
uint64_t | snapshotMinLogSize |
Size in bytes of smallest log to snapshot. | |
uint64_t | snapshotRatio |
Maximum log size as multiple of last snapshot size until server should snapshot. | |
std::chrono::nanoseconds | snapshotWatchdogInterval |
After this much time has elapsed without any progress, the snapshot watchdog thread will kill the snapshotting process. | |
uint64_t | sessionTimeoutNanos |
The time interval after which to remove an inactive client session, in nanoseconds of cluster time. | |
std::chrono::milliseconds | unknownRequestMessageBackoff |
The state machine logs messages when it receives a command or query that is not understood in the current running version. | |
Core::Mutex | mutex |
Protects against concurrent access for all members of this class (except 'consensus', which is itself a monitor. | |
Core::ConditionVariable | entriesApplied |
Notified when lastApplied changes after some entry got applied. | |
Core::ConditionVariable | snapshotSuggested |
Notified when shouldTakeSnapshot(lastApplied) becomes true. | |
Core::ConditionVariable | snapshotStarted |
Notified when a snapshot process is forked. | |
Core::ConditionVariable | snapshotCompleted |
Notified when a snapshot process is joined. | |
bool | exiting |
applyThread sets this to true to signal that the server is shutting down. | |
pid_t | childPid |
The PID of snapshotThread's child process, if any. | |
uint64_t | lastApplied |
The index of the last log entry that this state machine has applied. | |
TimePoint | lastUnknownRequestMessage |
The time when warnUnknownRequest() last printed a debug message. | |
uint64_t | numUnknownRequests |
Total number of commands/queries that this state machine either did not understand or could not process because they were introduced in a newer version. | |
uint64_t | numUnknownRequestsSinceLastMessage |
The number of debug messages suppressed by warnUnknownRequest() since lastUnknownRequestMessage. | |
uint64_t | numSnapshotsAttempted |
The number of times a snapshot has been started. | |
uint64_t | numSnapshotsFailed |
The number of times a snapshot child process has failed to exit cleanly. | |
uint64_t | numRedundantAdvanceVersionEntries |
The number of times a log entry was processed to advance the state machine's running version, but the state machine was already at that version. | |
uint64_t | numRejectedAdvanceVersionEntries |
The number of times a log entry was processed to advance the state machine's running version, but the state machine was already at a larger version. | |
uint64_t | numSuccessfulAdvanceVersionEntries |
The number of times a log entry was processed to successfully advance the state machine's running version, where the state machine was previously at a smaller version. | |
uint64_t | numTotalAdvanceVersionEntries |
The number of times any log entry to advance the state machine's running version was processed. | |
bool | isSnapshotRequested |
Set to true when an administrator has asked the server to take a snapshot; set to false once the server starts any snapshot. | |
TimePoint | maySnapshotAt |
The time at which the server may begin to take automated snapshots. | |
std::unordered_map< uint64_t, Session > | sessions |
Client ID to Session map. | |
Tree::Tree | tree |
The hierarchical key-value store. | |
std::map< uint64_t, uint16_t > | versionHistory |
The log position when the state machine was updated to each new version. | |
std::unique_ptr < Storage::SnapshotFile::Writer > | writer |
The file that the snapshot is being written into. | |
std::thread | applyThread |
Repeatedly calls into the consensus module to get commands to process and applies them. | |
std::thread | snapshotThread |
Takes snapshots with the help of a child process. | |
std::thread | snapshotWatchdogThread |
Watches the child process to make sure it's writing to writer, and kills it otherwise. |
Interprets and executes operations that have been committed into the Raft log.
Version history:
Definition at line 49 of file StateMachine.h.
typedef Protocol::Client::StateMachineCommand LogCabin::Server::StateMachine::Command |
Definition at line 51 of file StateMachine.h.
typedef Protocol::Client::StateMachineQuery LogCabin::Server::StateMachine::Query |
Definition at line 52 of file StateMachine.h.
typedef Core::Time::SteadyClock LogCabin::Server::StateMachine::Clock [private] |
Clock used by watchdog timer thread.
Definition at line 149 of file StateMachine.h.
typedef Clock::time_point LogCabin::Server::StateMachine::TimePoint [private] |
Point in time of Clock.
Definition at line 154 of file StateMachine.h.
anonymous enum |
Definition at line 54 of file StateMachine.h.
LogCabin::Server::StateMachine::StateMachine | ( | std::shared_ptr< RaftConsensus > | consensus, |
Core::Config & | config, | ||
Globals & | globals | ||
) |
Definition at line 43 of file StateMachine.cc.
Definition at line 106 of file StateMachine.cc.
bool LogCabin::Server::StateMachine::query | ( | const Query::Request & | request, |
Query::Response & | response | ||
) | const |
Called by ClientService to execute read-only queries on the state machine.
Definition at line 121 of file StateMachine.cc.
void LogCabin::Server::StateMachine::updateServerStats | ( | Protocol::ServerStats & | serverStats | ) | const |
Add information about the state machine state to the given structure.
Definition at line 136 of file StateMachine.cc.
void LogCabin::Server::StateMachine::wait | ( | uint64_t | index | ) | const |
Return once the state machine has applied at least the given entry.
Definition at line 165 of file StateMachine.cc.
bool LogCabin::Server::StateMachine::waitForResponse | ( | uint64_t | logIndex, |
const Command::Request & | command, | ||
Command::Response & | response | ||
) | const |
Called by ClientService to get a response for a read-write command on the state machine.
logIndex | The index in the log where the command was committed. | |
command | The request. | |
[out] | response | If the return value is true, the response will be filled in here. Otherwise, this will be unmodified. |
Definition at line 173 of file StateMachine.cc.
bool LogCabin::Server::StateMachine::isTakingSnapshot | ( | ) | const |
Return true if the server is currently taking a snapshot and false otherwise.
Definition at line 228 of file StateMachine.cc.
If the state machine is not taking a snapshot, this starts one.
This method will return after the snapshot has been started (it may have already completed).
Definition at line 235 of file StateMachine.cc.
If the server is currently taking a snapshot, abort it.
This method will return after the existing snapshot has been stopped (it's possible that another snapshot will have already started).
Definition at line 253 of file StateMachine.cc.
std::chrono::nanoseconds LogCabin::Server::StateMachine::getInhibit | ( | ) | const |
Return the time for which the state machine will not take any automated snapshots.
Definition at line 267 of file StateMachine.cc.
void LogCabin::Server::StateMachine::setInhibit | ( | std::chrono::nanoseconds | duration | ) |
Disable automated snapshots for the given duration.
duration | If zero or negative, immediately enables automated snapshotting. If positive, disables automated snapshotting for the given duration (or until a later call to setInhibit()). Note that this will not stop the current snapshot; the caller should invoke stopTakingSnapshot() separately if desired. |
Definition at line 279 of file StateMachine.cc.
void LogCabin::Server::StateMachine::apply | ( | const RaftConsensus::Entry & | entry | ) | [private] |
Invoked once per committed entry from the Raft log.
Definition at line 301 of file StateMachine.cc.
void LogCabin::Server::StateMachine::applyThreadMain | ( | ) | [private] |
Main function for thread that waits for new commands from Raft.
Definition at line 386 of file StateMachine.cc.
void LogCabin::Server::StateMachine::serializeSessions | ( | SnapshotStateMachine::Header & | header | ) | const [private] |
Return the sessions table as a protobuf message for writing into a snapshot.
Definition at line 427 of file StateMachine.cc.
void LogCabin::Server::StateMachine::expireResponses | ( | Session & | session, |
uint64_t | firstOutstandingRPC | ||
) | [private] |
Update the session and clean up unnecessary responses.
session | Affected session. |
firstOutstandingRPC | New value for the first outstanding RPC for a session. |
Definition at line 446 of file StateMachine.cc.
void LogCabin::Server::StateMachine::expireSessions | ( | uint64_t | clusterTime | ) | [private] |
Remove old sessions.
clusterTime | Sessions are kept if they have been modified during the last timeout period going backwards from the given time. |
Definition at line 461 of file StateMachine.cc.
uint16_t LogCabin::Server::StateMachine::getVersion | ( | uint64_t | logIndex | ) | const [private] |
Return the version of the state machine behavior as of the given log index.
Note that this is based on versionHistory internally, so if you're changing that variable at the given index, update it first.
Definition at line 482 of file StateMachine.cc.
void LogCabin::Server::StateMachine::killSnapshotProcess | ( | Core::HoldingMutex | holdingMutex, |
int | signum | ||
) | [private] |
If there is a current snapshot process, send it a signal and return immediately.
Definition at line 490 of file StateMachine.cc.
void LogCabin::Server::StateMachine::loadSessions | ( | const SnapshotStateMachine::Header & | header | ) | [private] |
Restore the sessions table from a snapshot.
Definition at line 505 of file StateMachine.cc.
void LogCabin::Server::StateMachine::loadSnapshot | ( | Core::ProtoBuf::InputStream & | stream | ) | [private] |
Read all of the state machine state from a snapshot file (including version, sessions, and tree).
Definition at line 524 of file StateMachine.cc.
void LogCabin::Server::StateMachine::loadVersionHistory | ( | const SnapshotStateMachine::Header & | header | ) | [private] |
Restore the versionHistory table from a snapshot.
Definition at line 555 of file StateMachine.cc.
void LogCabin::Server::StateMachine::serializeVersionHistory | ( | SnapshotStateMachine::Header & | header | ) | const [private] |
Return the versionHistory table as a protobuf message for writing into a snapshot.
Definition at line 579 of file StateMachine.cc.
bool LogCabin::Server::StateMachine::shouldTakeSnapshot | ( | uint64_t | lastIncludedIndex | ) | const [private] |
Return true if it is time to create a new snapshot.
This is called by applyThread as an optimization to avoid waking up snapshotThread upon applying every single entry.
Definition at line 593 of file StateMachine.cc.
void LogCabin::Server::StateMachine::snapshotThreadMain | ( | ) | [private] |
Main function for thread that calls takeSnapshot when appropriate.
Definition at line 623 of file StateMachine.cc.
void LogCabin::Server::StateMachine::snapshotWatchdogThreadMain | ( | ) | [private] |
Main function for thread that checks the progress of the child process.
Definition at line 652 of file StateMachine.cc.
void LogCabin::Server::StateMachine::takeSnapshot | ( | uint64_t | lastIncludedIndex, |
std::unique_lock< Core::Mutex > & | lockGuard | ||
) | [private] |
Called by snapshotThreadMain to actually take the snapshot.
Definition at line 720 of file StateMachine.cc.
void LogCabin::Server::StateMachine::warnUnknownRequest | ( | const google::protobuf::Message & | request, |
const char * | reason | ||
) | const [private] |
Called to log a debug message if appropriate when the state machine encounters a query or command that is not understood by the current running version.
request | Problematic command/query. |
reason | Explains why 'request' is problematic. Should complete the sentence "This version of the state machine (%lu) " + reason, and it should not contain end punctuation. |
Definition at line 807 of file StateMachine.cc.
std::shared_ptr<RaftConsensus> LogCabin::Server::StateMachine::consensus [private] |
Consensus module from which this state machine pulls commands and snapshots.
Definition at line 268 of file StateMachine.h.
Globals& LogCabin::Server::StateMachine::globals [private] |
Server-wide globals.
Used to unblock signal handlers in child process.
Definition at line 273 of file StateMachine.h.
uint64_t LogCabin::Server::StateMachine::snapshotBlockPercentage [private] |
Used for testing the snapshot watchdog thread.
The probability that a snapshotting process will deadlock on purpose before starting, as a percentage.
Definition at line 280 of file StateMachine.h.
uint64_t LogCabin::Server::StateMachine::snapshotMinLogSize [private] |
Size in bytes of smallest log to snapshot.
Definition at line 285 of file StateMachine.h.
uint64_t LogCabin::Server::StateMachine::snapshotRatio [private] |
Maximum log size as multiple of last snapshot size until server should snapshot.
Definition at line 291 of file StateMachine.h.
std::chrono::nanoseconds LogCabin::Server::StateMachine::snapshotWatchdogInterval [private] |
After this much time has elapsed without any progress, the snapshot watchdog thread will kill the snapshotting process.
A special value of 0 disables the watchdog entirely.
Definition at line 298 of file StateMachine.h.
uint64_t LogCabin::Server::StateMachine::sessionTimeoutNanos [private] |
The time interval after which to remove an inactive client session, in nanoseconds of cluster time.
Definition at line 304 of file StateMachine.h.
std::chrono::milliseconds LogCabin::Server::StateMachine::unknownRequestMessageBackoff [private] |
The state machine logs messages when it receives a command or query that is not understood in the current running version.
This controls the minimum interval between such messages to prevent spamming the debug log.
Definition at line 312 of file StateMachine.h.
Core::Mutex LogCabin::Server::StateMachine::mutex [mutable, private] |
Protects against concurrent access for all members of this class (except 'consensus', which is itself a monitor.
Definition at line 318 of file StateMachine.h.
Core::ConditionVariable LogCabin::Server::StateMachine::entriesApplied [mutable, private] |
Notified when lastApplied changes after some entry got applied.
Also notified upon exiting. This is used for client threads to wait; see wait().
Definition at line 325 of file StateMachine.h.
Core::ConditionVariable LogCabin::Server::StateMachine::snapshotSuggested [mutable, private] |
Notified when shouldTakeSnapshot(lastApplied) becomes true.
Also notified upon exiting and when maySnapshotAt changes. This is used for snapshotThread to wake up only when necessary.
Definition at line 332 of file StateMachine.h.
Core::ConditionVariable LogCabin::Server::StateMachine::snapshotStarted [mutable, private] |
Notified when a snapshot process is forked.
Also notified upon exiting. This is used so that the watchdog thread knows to begin checking the progress of the child process, and also in startTakingSnapshot().
Definition at line 340 of file StateMachine.h.
Core::ConditionVariable LogCabin::Server::StateMachine::snapshotCompleted [mutable, private] |
Notified when a snapshot process is joined.
Also notified upon exiting. This is used so that stopTakingSnapshot() knows when it's done.
Definition at line 347 of file StateMachine.h.
bool LogCabin::Server::StateMachine::exiting [private] |
applyThread sets this to true to signal that the server is shutting down.
Definition at line 353 of file StateMachine.h.
pid_t LogCabin::Server::StateMachine::childPid [private] |
The PID of snapshotThread's child process, if any.
This is used by applyThread to signal exits: if applyThread is exiting, it sends SIGTERM to this child process. A childPid of 0 indicates that there is no child process.
Definition at line 361 of file StateMachine.h.
uint64_t LogCabin::Server::StateMachine::lastApplied [private] |
The index of the last log entry that this state machine has applied.
This variable is only written to by applyThread, so applyThread is free to access this variable without holding 'mutex'. Other readers must hold 'mutex'.
Definition at line 369 of file StateMachine.h.
TimePoint LogCabin::Server::StateMachine::lastUnknownRequestMessage [mutable, private] |
The time when warnUnknownRequest() last printed a debug message.
Used to prevent spamming the debug log.
Definition at line 375 of file StateMachine.h.
uint64_t LogCabin::Server::StateMachine::numUnknownRequests [mutable, private] |
Total number of commands/queries that this state machine either did not understand or could not process because they were introduced in a newer version.
Definition at line 382 of file StateMachine.h.
uint64_t LogCabin::Server::StateMachine::numUnknownRequestsSinceLastMessage [mutable, private] |
The number of debug messages suppressed by warnUnknownRequest() since lastUnknownRequestMessage.
Used to prevent spamming the debug log.
Definition at line 388 of file StateMachine.h.
uint64_t LogCabin::Server::StateMachine::numSnapshotsAttempted [private] |
The number of times a snapshot has been started.
In addition to being a useful stat, the watchdog thread uses this to know whether it's been watching the same snapshot or whether a new one has been started, and startTakingSnapshot() waits for this to change before returning.
Definition at line 397 of file StateMachine.h.
uint64_t LogCabin::Server::StateMachine::numSnapshotsFailed [private] |
The number of times a snapshot child process has failed to exit cleanly.
Definition at line 402 of file StateMachine.h.
uint64_t LogCabin::Server::StateMachine::numRedundantAdvanceVersionEntries [private] |
The number of times a log entry was processed to advance the state machine's running version, but the state machine was already at that version.
Definition at line 409 of file StateMachine.h.
uint64_t LogCabin::Server::StateMachine::numRejectedAdvanceVersionEntries [private] |
The number of times a log entry was processed to advance the state machine's running version, but the state machine was already at a larger version.
Definition at line 416 of file StateMachine.h.
uint64_t LogCabin::Server::StateMachine::numSuccessfulAdvanceVersionEntries [private] |
The number of times a log entry was processed to successfully advance the state machine's running version, where the state machine was previously at a smaller version.
Definition at line 423 of file StateMachine.h.
uint64_t LogCabin::Server::StateMachine::numTotalAdvanceVersionEntries [private] |
The number of times any log entry to advance the state machine's running version was processed.
Should be the sum of redundant, rejected, and successful counts.
Definition at line 430 of file StateMachine.h.
bool LogCabin::Server::StateMachine::isSnapshotRequested [private] |
Set to true when an administrator has asked the server to take a snapshot; set to false once the server starts any snapshot.
Snapshots that are requested due to this flag are permitted to begin even if automated snapshots have been inhibited with setInhibit().
Definition at line 438 of file StateMachine.h.
The time at which the server may begin to take automated snapshots.
Normally this is set to some time in the past. When automated snapshots are inhibited with setInhibit(), this will be set to a future time.
Definition at line 445 of file StateMachine.h.
std::unordered_map<uint64_t, Session> LogCabin::Server::StateMachine::sessions [private] |
Definition at line 481 of file StateMachine.h.
The hierarchical key-value store.
Used in readOnlyTreeRPC and readWriteTreeRPC.
Definition at line 487 of file StateMachine.h.
std::map<uint64_t, uint16_t> LogCabin::Server::StateMachine::versionHistory [private] |
The log position when the state machine was updated to each new version.
First component: log index. Second component: version number. Used to evolve state machine over time.
This is used by getResponse() to determine the running version at a given log index (to determine whether a command would have been applied), and it's used elsewhere to determine the state machine's current running version.
Invariant: the pair (index 0, version 1) is always present.
Definition at line 501 of file StateMachine.h.
std::unique_ptr<Storage::SnapshotFile::Writer> LogCabin::Server::StateMachine::writer [private] |
The file that the snapshot is being written into.
Also used by to track the progress of the child process for the watchdog thread. This is non-empty if and only if childPid > 0.
Definition at line 508 of file StateMachine.h.
std::thread LogCabin::Server::StateMachine::applyThread [private] |
Repeatedly calls into the consensus module to get commands to process and applies them.
Definition at line 514 of file StateMachine.h.
std::thread LogCabin::Server::StateMachine::snapshotThread [private] |
Takes snapshots with the help of a child process.
Definition at line 519 of file StateMachine.h.
std::thread LogCabin::Server::StateMachine::snapshotWatchdogThread [private] |
Watches the child process to make sure it's writing to writer, and kills it otherwise.
This is to detect any possible deadlock that might occur if a thread in the parent at the time of the fork held a lock that the child process then tried to access. See https://github.com/logcabin/logcabin/issues/121 for more rationale.
Definition at line 528 of file StateMachine.h.