[release-3.6] *: support DowngradeInfo field in maintenance.Status API #19460

Open
wants to merge 4 commits into
base: release-3.6
from


k8s-infra-cherrypick-robot

This is an automated cherry-pick of #19451

/assign ahrtr

1. Update DowngradeUpgradeMembersByID

During a downgrade, the desired cluster version should be the target version.
During an upgrade, the desired cluster version should be determined by the
minimum binary version among the members.

2. Remove AssertProcessLogs from DowngradeEnable

The log message "The server is ready to downgrade" appears only when the storage
version monitor detects a mismatch between the cluster and storage versions.

If traffic is insufficient to trigger a commit or if an auto-commit occurs right
after reading the storage version, the monitor may fail to update it, leading
to errors like:

```bash
"msg":"failed to update storage version","cluster-version":"3.6.0",
"error":"cannot detect storage schema version: missing confstate information"
```

Given this, we should remove the AssertProcessLogs statement.

Similar to etcd-io#19313
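
For readers following item 1, here is a minimal Go sketch of the version rule it describes. It is not the code from this PR: the `version` type and the `desiredClusterVersion` function are hypothetical names used only for illustration, and the real test helper presumably works with etcd's own version types.

```go
package main

// version is a hypothetical major.minor pair used only for this sketch.
type version struct{ major, minor int }

// less reports whether v sorts before o.
func (v version) less(o version) bool {
	if v.major != o.major {
		return v.major < o.major
	}
	return v.minor < o.minor
}

// desiredClusterVersion applies the rule from the description:
// a downgrade expects the target version, while an upgrade expects
// the lowest binary version among the members.
func desiredClusterVersion(isDowngrade bool, target version, memberBinaries []version) version {
	if isDowngrade || len(memberBinaries) == 0 {
		return target
	}
	lowest := memberBinaries[0]
	for _, v := range memberBinaries[1:] {
		if v.less(lowest) {
			lowest = v
		}
	}
	return lowest
}
```

Under this sketch, a three-member cluster with binaries 3.6, 3.5, and 3.6 that is being upgraded to 3.6 is still expected to report 3.5 until the last member is replaced.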

Signed-off-by: Wei Fu <[email protected]>
@k8s-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: k8s-infra-cherrypick-robot
Once this PR has been reviewed and has the lgtm label, please assign spzala for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot

Hi @k8s-infra-cherrypick-robot. Thanks for your PR.

I'm waiting for an etcd-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ahrtr
Member

ahrtr commented Feb 21, 2025

/ok-to-test

@ahrtr
Member

ahrtr commented Feb 21, 2025

@fuweid Could you please update the 3.6 changelog as well?

@k8s-ci-robot

@k8s-infra-cherrypick-robot: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-etcd-e2e-amd64 | 5329549 | link | true | /test pull-etcd-e2e-amd64 |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.



codecov bot commented Feb 21, 2025

Codecov Report

Attention: Patch coverage is 50.00000% with 5 lines in your changes missing coverage. Please review.

Project coverage is 68.90%. Comparing base (0f89474) to head (5329549).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| etcdctl/ctlv3/command/printer.go | 0.00% | 3 Missing ⚠️ |
| etcdctl/ctlv3/command/printer_fields.go | 0.00% | 2 Missing ⚠️ |

Additional details and impacted files

| Files with missing lines | Coverage Δ |
| --- | --- |
| server/etcdserver/api/v3rpc/maintenance.go | 74.26% <100.00%> (+0.77%) ⬆️ |
| etcdctl/ctlv3/command/printer_fields.go | 0.00% <0.00%> (ø) |
| etcdctl/ctlv3/command/printer.go | 0.00% <0.00%> (ø) |

... and 19 files with indirect coverage changes

@@               Coverage Diff               @@
##           release-3.6   #19460      +/-   ##
===============================================
- Coverage        68.99%   68.90%   -0.10%     
===============================================
  Files              420      420              
  Lines            35753    35762       +9     
===============================================
- Hits             24669    24642      -27     
- Misses            9662     9693      +31     
- Partials          1422     1427       +5     

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@ahrtr
Member

ahrtr commented Feb 21, 2025

TestDowngradeCancellationAfterDowngrading1InClusterOf3 failed.

https://prow.k8s.io/view/gs/kubernetes-ci-logs/pr-logs/pull/etcd-io_etcd/19460/pull-etcd-e2e-amd64/1893043923723489280

After upgrading the member to 3.6, the test logged the following error message many times:

"got": {"etcdserver":"3.6.0-rc.0","etcdcluster":"3.6.0","storage":"3.6.0"}, "want": {"etcdserver":"3.6.0","etcdcluster":"3.5.0","storage":""}}

It looks like a test bug. The server returned the correct version, but the test case was expecting the wrong cluster version (3.5.0). cc @fuweid

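As a rough illustration of how the got/want payloads in the comment above differ, the sketch below decodes both and lists the fields that do not match. The `versionInfo` struct and `diffVersions` function are hypothetical names, not helpers from the etcd test framework. A strict field-by-field comparison like this one flags all three fields; the comment singles out `etcdcluster`, so the real test comparison is presumably more lenient about the pre-release suffix and the empty storage value (that leniency is an assumption here, not something this thread confirms).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// versionInfo mirrors the three keys visible in the logged payloads.
type versionInfo struct {
	Server  string `json:"etcdserver"`
	Cluster string `json:"etcdcluster"`
	Storage string `json:"storage"`
}

// diffVersions returns the JSON keys whose values differ between
// the two payloads, using a strict string comparison.
func diffVersions(gotJSON, wantJSON string) ([]string, error) {
	var got, want versionInfo
	if err := json.Unmarshal([]byte(gotJSON), &got); err != nil {
		return nil, fmt.Errorf("parse got: %w", err)
	}
	if err := json.Unmarshal([]byte(wantJSON), &want); err != nil {
		return nil, fmt.Errorf("parse want: %w", err)
	}
	var diffs []string
	if got.Server != want.Server {
		diffs = append(diffs, "etcdserver")
	}
	if got.Cluster != want.Cluster {
		diffs = append(diffs, "etcdcluster")
	}
	if got.Storage != want.Storage {
		diffs = append(diffs, "storage")
	}
	return diffs, nil
}
```
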
@fuweid
Member

fuweid commented Feb 21, 2025

@ahrtr it's a test issue related to the DowngradeUpgradeMembersByID change. I'll file a pull request later.

@ahrtr
Member

ahrtr commented Feb 21, 2025

The following code snippet looks problematic:

    clusterVersion := targetVersion.String()
    if !isDowngrade && len(membersToChange) != len(clus.Procs) {
    	clusterVersion = currentVersion.String()
    }

The implementation should be something like this:

    clusterVersion := targetVersion.String()
    if !isDowngrade && len(membersToChange) != len(clus.Procs) && downgradeNotCancelled {
    	clusterVersion = currentVersion.String()
    }
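
To see why the extra condition matters, the suggestion can be wrapped in a small pure function. This is a sketch, not the eventual fix: `expectedClusterVersion` is a hypothetical name, and its `changed` and `total` parameters are stand-ins for `len(membersToChange)` and `len(clus.Procs)`. Only `downgradeNotCancelled` is taken from the proposal above.

```go
package main

// expectedClusterVersion returns the cluster version the test should
// expect after changing `changed` of `total` members.
//
// A partial upgrade keeps the cluster at the current (lower) version
// only while the downgrade is still in effect. Once the downgrade has
// been cancelled, the cluster is expected to report the target version.
func expectedClusterVersion(target, current string, isDowngrade bool, changed, total int, downgradeNotCancelled bool) string {
	clusterVersion := target
	if !isDowngrade && changed != total && downgradeNotCancelled {
		clusterVersion = current
	}
	return clusterVersion
}
```

With one of three members being upgraded after a cancelled downgrade, passing `downgradeNotCancelled = false` yields the target version (3.6.0), which matches the cluster version the server reported in the failing run. Without the extra flag, the same inputs produce 3.5.0, the value the test was wrongly waiting for.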

@ahrtr
Member

ahrtr commented Feb 21, 2025

> @ahrtr it's a test issue related to the DowngradeUpgradeMembersByID change. I'll file a pull request later.

OK, please raise the PR on main. After that PR is merged, can you please manually backport both PRs to release-3.6?

@fuweid
Member

fuweid commented Feb 21, 2025

> > @ahrtr it's a test issue related to the DowngradeUpgradeMembersByID change. I'll file a pull request later.
>
> OK, please raise the PR on main. After that PR is merged, can you please manually backport both PRs to release-3.6?

Yes, will do.
