Extend the unicast based recovery algorithm to do replication policy check #11996
base: main
Conversation
Result of foundationdb-pr-clang-ide on Linux CentOS 7
Result of foundationdb-pr-clang-arm on Linux CentOS 7
Result of foundationdb-pr-clang on Linux CentOS 7
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
Result of foundationdb-pr on Linux CentOS 7
This was because of failures in "AccessTenant"-related tests (with error "traced too many lines") and clog-related tests (I see a transaction retried many times without progress; not sure about the real error).
Result of foundationdb-pr-clang-ide on Linux CentOS 7
Result of foundationdb-pr-clang-arm on Linux CentOS 7
Result of foundationdb-pr-clang on Linux CentOS 7
Result of foundationdb-pr on Linux CentOS 7
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
@@ -2538,15 +2539,16 @@ ACTOR Future<Void> TagPartitionedLogSystem::epochEnd(Reference<AsyncVar<Referenc
 Version minDV = std::numeric_limits<Version>::max();
 Version maxEnd = 0;
 state std::vector<Future<Void>> changes;
-state std::vector<std::tuple<int, std::vector<TLogLockResult>>> logGroupResults;
+state std::vector<std::tuple<int, std::vector<TLogLockResult>, bool>> logGroupResults;
The tuple has become hard to read. Can it be made into a structure instead? I think it would improve the code.
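For illustration only, a minimal sketch of the kind of struct the reviewer is asking for in place of the tuple; the type and field names here are hypothetical and not taken from the PR:

// Hypothetical replacement for std::tuple<int, std::vector<TLogLockResult>, bool>.
// Names are illustrative only; the PR may choose different ones.
struct LogGroupResult {
	int logGroupIndex;                       // which log group these results came from
	std::vector<TLogLockResult> lockResults; // lock replies gathered from the group's tlogs
	bool policySatisfied;                    // outcome of the replication policy check
};

state std::vector<LogGroupResult> logGroupResults;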
// policy. We are doing it this way as checking it this way is more efficient
// than checking on an individual version basis (which would require us to
// build the nonavailable log server set for each version in the unavaialble
// version list).
this comment is a little hard to follow. Maybe something more like this?
// At least (N - replicationFactor + 1) log servers must be available.
// Otherwise, the unavailable log servers alone would not be sufficient
// to satisfy the replication policy.
//
// @note This check is intentionally more restrictive than necessary.
// Instead of verifying whether the unavailable log servers within
// the specific set that received the version satisfy the replication policy,
// we check whether the entire set of unavailable log servers meets the policy.
//
// This approach is chosen because it is computationally more efficient.
// Checking availability on a per-version basis would require constructing
// a unique set of unavailable log servers for each version in the unavailable
// version list, which would add significant overhead.
I'll update the comment. Thanks for the edit!
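For context, a rough sketch of the kind of check the suggested comment describes, modeled on the LocalityGroup-plus-policy-validate pattern used elsewhere in recovery; the variable names (logSet, lockResults, tLogPolicy) are assumed to be in scope and this is not necessarily the PR's actual code:

// Sketch only: collect the localities of the non-reporting tlogs and test
// whether they alone could satisfy the replication policy. If they cannot,
// no version can be durable exclusively on them, so recovery may proceed
// using the available tlogs.
LocalityGroup unResponsiveSet;
for (int t = 0; t < (int)logSet->logServers.size(); t++) {
	if (!lockResults[t].isReady() || lockResults[t].isError()) {
		unResponsiveSet.add(logSet->tLogLocalities[t]);
	}
}
bool canIgnoreNonReporting = !unResponsiveSet.validate(logSet->tLogPolicy);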
Extend the version vector/unicast based recovery algorithm to do the replication policy check while deciding whether a version can be recovered from the set of available log servers. This will make the algorithm compatible with the non-unicast/"main" algorithm while handling non-reporting log servers during recovery.
Test that exposed this issue:
build_output/bin/fdbserver -r simulation --crash -f /root/src/foundationdb/tests/slow/RyowCorrectness.toml -b off -s 29779152
A "getRange()" call was getting blocked because recovery was not completing, which was because "replication_factor" number of log servers were not reporting during recovery. But these set non-reporting log servers were not completing the replication policy, so extending the recovery algorithm to do the replication policy check allowed recovery to progress and the test to succeed.
Note that this extension can make recovery progress only in cases where the non-reporting log servers do not meet the replication policy, but it makes the algorithm compatible with "main" when handling such scenarios.
Testing:
Id (with version vector disabled): 20250305-205711-sre-b53cba5eecb4dadb (started).
Code-Reviewer Section
The general pull request guidelines can be found here.
Please check each of the following things and check all boxes before accepting a PR.
For Release-Branches
If this PR is made against a release-branch, please also check the following:
release-branch or main (if this is the youngest branch)