docs(hep): harvester upgrade v2 by starbops · Pull Request #9161 · harvester/harvester · GitHub

Conversation

starbops
Member

Problem:

Solution:

Related Issue(s):

#7101
#7112

Test plan:

Additional documentation or context

Signed-off-by: Zespre Chang <zespre.chang@suse.com>
@starbops starbops force-pushed the hep-upgrade-enhancement branch from d736a7f to 734d03b Compare September 17, 2025 00:21
@starbops starbops marked this pull request as ready for review October 3, 2025 02:56
@starbops starbops force-pushed the hep-upgrade-enhancement branch from 8232e46 to 9437748 Compare October 3, 2025 02:58
- add upgrade & version crds
- add test plan

Signed-off-by: Zespre Chang <zespre.chang@suse.com>
@starbops starbops force-pushed the hep-upgrade-enhancement branch from 9437748 to 55e6217 Compare October 3, 2025 03:02
Signed-off-by: Zespre Chang <zespre.chang@suse.com>
Signed-off-by: Zespre Chang <zespre.chang@suse.com>
@Vicente-Cheng
Contributor

Hey folks,
Please help review the upgrade v2 design.

cc @ibrokethecloud, @ihcsim, @w13915984028

Member
@w13915984028 w13915984028 left a comment

Thanks for the big and well-written HEP.

A few concerns for open discussion:
(1) The alternative solution of upgrade-repo.
(2) The decoupling from Rancher: some hidden dependencies might only surface during the development/testing stage.
(3) Given Harvester's 4-month release plan, as noted in the HEP, more thought is needed on how to deliver the feature stage by stage.


#### Stage 1 - Transition Period (version N-1 to N upgrades, where N is the debut minor version of Harvester Upgrade V2)

The inner workings of Upgrade Manager in this stage will be essentially the same as before; the significant difference is that it is built and packaged separately from the primary Harvester artifact. It will also have its own Deployment in contrast to the main Harvester controller manager Deployment. The main concerns will be:
Member
@w13915984028 w13915984028 Oct 21, 2025

It seems to introduce another requirement:

An upgrade matrix between Harvester/UpgradeShim and Upgrade Manager, to ensure that their versions match.
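
To make the concern concrete, such a matrix could be as small as a lookup table that the shim consults before handing over to Upgrade Manager. The following is only a sketch under assumptions: the version strings, the `compatMatrix` table, and the `IsCompatible` helper are all hypothetical and do not come from the HEP.

```go
package main

import "fmt"

// compatMatrix maps a Harvester minor version to the Upgrade Manager
// versions it is assumed to work with. Every entry here is hypothetical
// and only illustrates the shape of the data.
var compatMatrix = map[string][]string{
	"v1.7": {"v0.1", "v0.2"},
	"v1.8": {"v0.2"},
}

// IsCompatible reports whether the (Harvester, Upgrade Manager) pair
// is listed in the matrix. Unknown Harvester versions are rejected.
func IsCompatible(harvester, manager string) bool {
	for _, m := range compatMatrix[harvester] {
		if m == manager {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(IsCompatible("v1.7", "v0.2")) // true
	fmt.Println(IsCompatible("v1.8", "v0.1")) // false
}
```

Whether the table lives in the shim, in the Version CR, or in the Upgrade Manager image itself is an open design question; the sketch only shows that the check is cheap once the data exists.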

#### Plan 1 - Static Pods with BackingImage Disk Files

The main idea is to leverage static pods, Longhorn BackingImage disk files, and hostPath volumes. The `minNumberOfCopies` field of the BackingImage for the ISO image is set to the number of nodes of the cluster (subject to change) to distribute the disk files to all nodes. By mounting the BackingImage disk file directly to a specific path on the host filesystem for each node, static pods on each node can access the ISO image content using the hostPath volume.
Member

The `minNumberOfCopies` field: ... I am not sure it works effectively on a massive cluster with, say, 100 nodes.

1. (Optional) Preloading the Upgrade Manager container image
1. Deploying Upgrade Manager

### Upgrade Repository
Member

Neither Plan 1 nor Plan 2 seems robust enough.

1. Node drain and RKE2 upgrade, corresponds to the **drain** stage
1. OS upgrade, corresponds to the **post-drain** stage

Since the hook mechanism in Rancher v2prov is no longer usable, Upgrade Manager relies on Plan CRs backed by System Upgrade Controller to execute node-upgrade tasks. One of the hidden advantages of this change is that it allows the operating system upgrade to be separated from the overall upgrade, thus enabling true zero-downtime (for nodes) upgrades. Another one is that it unblocks the way for users to have granular control over the node-upgrade order through node selectors. The new node-upgrade phase looks like the following:
Member

it allows the operating system upgrade to be separated from the overall upgrade

How is this decided: does Harvester or the user make the decision?

Is the OS upgrade simply skipped when it is not necessary, or is it selectively done in another upgrade?
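
One way to picture the separation being asked about is two independent System Upgrade Controller Plans, each with its own node selector, so that the OS plan can be omitted or applied later. The sketch below builds such Plans as plain maps and prints them as JSON. The `NewPlan` helper, the label keys, image names, versions, namespace, and service account name are all assumptions for illustration; only the general Plan fields (`concurrency`, `nodeSelector`, `serviceAccountName`, `drain`, `upgrade`, `version`) follow the upstream `upgrade.cattle.io/v1` schema as I understand it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// NewPlan builds a minimal upgrade.cattle.io/v1 Plan as a generic map.
// The namespace and service account name are assumptions.
func NewPlan(name, image, version string, nodeLabels map[string]string, drain bool) map[string]any {
	spec := map[string]any{
		"concurrency":        1,
		"nodeSelector":       map[string]any{"matchLabels": nodeLabels},
		"serviceAccountName": "system-upgrade",
		"version":            version,
		"upgrade":            map[string]any{"image": image},
	}
	if drain {
		// Only the drain-stage plan evicts workloads from the node.
		spec["drain"] = map[string]any{"force": true}
	}
	return map[string]any{
		"apiVersion": "upgrade.cattle.io/v1",
		"kind":       "Plan",
		"metadata":   map[string]any{"name": name, "namespace": "cattle-system"},
		"spec":       spec,
	}
}

func main() {
	// Hypothetical labels: a node opts in to each stage separately.
	rke2 := NewPlan("node-rke2", "example/rke2-upgrade", "v0.0.1",
		map[string]string{"upgrade.example.io/rke2": "true"}, true)
	osPlan := NewPlan("node-os", "example/os-upgrade", "v0.0.1",
		map[string]string{"upgrade.example.io/os": "true"}, false)

	for _, p := range []map[string]any{rke2, osPlan} {
		out, err := json.MarshalIndent(p, "", "  ")
		if err != nil {
			panic(err)
		}
		fmt.Println(string(out))
	}
}
```

Under this shape, "skip the OS upgrade" means not creating the second Plan (or not labeling any node for it), which still leaves open who makes that call.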

The challenge of switching to SUC Plans is that the node-upgrade phase is no longer executed as a control loop, but as a one-time task. Upgrade Manager needs to reconcile Plans instead of machine-plan Secrets, and there will be no other entities, such as the embedded Rancher, which abstracts the lifecycle management of downstream clusters, to organize the upgrade nuances.

> [!IMPORTANT]
> The infinite-retry vs. fail-fast will be a key topic for discussion.
Member

This does not seem to be an easy or straightforward decision; we might need to copy those hacks or patches from Rancher to cover various cases.


Another significant aspect is installation. Do we rely on Rancher System Agent to bootstrap clusters? What are the impacts if we drop it entirely? Do we need to develop a new installation method to fill the gap left by the decommissioning of the Rancher System Agent?

It appears that we do rely on the Rancher System Agent to bootstrap nodes except for the initial one. If we remove the Rancher System Agent, there might not be issues in day 2 management; however, we cannot live without it when it comes to cluster bootstrapping. If we leave it as is, the Rancher System Agent generates numerous logs containing error messages (because it is no longer able to communicate with the Rancher Manager) and does nothing.
Member

because it is no longer able to communicate with the Rancher Manager

The embedded Rancher deployment is still inside Harvester. What is the reason that it can't communicate? Is it due to the following?

Decouple the cluster itself with the embedded Rancher Manager by removing the kubernetesVersion and rkeConfig fields from the local provisioning cluster CR
