Upgrade from 5.x to 6.0: Difference between revisions

From Proxmox VE

Revision as of 10:15, 23 March 2020

Introduction

Proxmox VE 6.x introduces several new major features. Carefully plan the upgrade, make and verify backups before beginning, and test extensively. Depending on the existing configuration, several manual steps—including some downtime—may be required.

Note: A valid and tested backup is always needed before starting the upgrade process. Test the backup beforehand in a test lab setup.

In case the system is customized and/or uses additional packages or any other third party repositories/packages, ensure those packages are also upgraded to and compatible with Debian Buster.

In general, there are two ways to upgrade a Proxmox VE 5.x system to Proxmox VE 6.x:

  • A new installation on new hardware (and restoring VMs from the backup)
  • An in-place upgrade via apt (step-by-step)

In both cases emptying the browser cache and reloading the GUI page is required after the upgrade.

New installation

  • Back up all VMs and containers to external storage (see Backup and Restore).
  • Back up all files in /etc (required: the files in /etc/pve, /etc/passwd, /etc/network/interfaces and /etc/resolv.conf, plus anything that deviates from a default installation).
  • Install Proxmox VE from the ISO (this will delete all data on the existing host).
  • Rebuild your cluster, if applicable.
  • Restore the file /etc/pve/storage.cfg (this will make the external storage used for backup available).
  • Restore firewall configs /etc/pve/firewall/ and /etc/pve/nodes/<node>/host.fw (if applicable).
  • Restore full VMs from backups (see Backup and Restore).
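
The /etc backup from the list above could be sketched as follows. This is only a minimal sketch, not part of the official procedure: the function name backup_etc and the archive path in the usage note are hypothetical, and it assumes that tar is available and that the destination is on external storage.

```shell
# Hypothetical helper: archive a configuration directory (e.g. /etc)
# into a gzip-compressed tarball. Both arguments are illustrative.
backup_etc() {
    src_dir="$1"    # directory to archive, e.g. /etc
    archive="$2"    # destination file, ideally on external storage
    # -C changes into the parent directory so the archive holds relative paths
    tar -czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")"
}
```

A call such as `backup_etc /etc /mnt/backup/etc-backup.tar.gz` (path assumed) would then produce an archive whose contents can be listed with `tar -tzf` to verify it before reinstalling.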

Administrators comfortable with the command line can follow the procedure Bypassing backup and restore when upgrading if all VMs/CTs are on one shared storage.

In-place upgrade

In-place upgrades are done with apt. Familiarity with apt is required to proceed with this upgrade mechanism.

Preconditions

  • Upgrade to the latest version of Proxmox VE 5.4.
  • Reliable access to all configured storage.
  • A healthy cluster.
  • Valid and tested backup of all VMs and CTs (in case something goes wrong).
  • Correct configuration of the repository.
  • At least 1GB free disk space at root mount point.
  • Ceph: after the Proxmox VE upgrade is complete, upgrade the Ceph cluster to Nautilus by following the guide Ceph Luminous to Nautilus.
  • Check known upgrade issues
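
The free disk space precondition can be checked with a few lines of shell. The following is a sketch under assumptions: the helper names free_mib and has_free_space are hypothetical, and the 1024 MiB default simply mirrors the "at least 1GB" requirement above.

```shell
# Hypothetical helper: print the free space (in MiB) of the file system
# that contains the given path, based on POSIX df output in 1 KiB blocks.
free_mib() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Succeeds if at least the given number of MiB is free (default: 1024).
has_free_space() {
    [ "$(free_mib "$1")" -ge "${2:-1024}" ]
}
```

For example, `has_free_space / || echo "not enough space on /"` would print a message if the root mount point has less than 1 GiB available.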

Testing the Upgrade

An upgrade test can easily be performed using a standalone server first. Install the Proxmox VE 5.4 ISO on some test hardware; then upgrade this installation to the latest minor version of Proxmox VE 5.4 (see Package repositories). To replicate the production setup as closely as possible, copy or create all relevant configurations to the test machine. Then start the upgrade. It is also possible to install Proxmox VE 5.4 in a VM and test the upgrade in this environment.

Actions step-by-step

The following actions need to be done on the command line of each Proxmox VE node in your cluster (via console or ssh; preferably via console to avoid interrupted ssh connections). Remember: make sure that a valid backup of all VMs and CTs has been created before proceeding.

Continuously use the pve5to6 checklist script

A small checklist program named pve5to6 is included in the latest Proxmox VE 5.4 packages. The program will provide hints and warnings about potential issues before, during and after the upgrade process. One can call it by executing:

 pve5to6

This script only checks and reports things. By default, no changes to the system are made, so none of the issues will be fixed automatically. Keep in mind that Proxmox VE can be heavily customized, so the script may not recognize every possible problem of a particular setup!

It is recommended to re-run the script after each attempt to fix an issue. This ensures that the actions taken actually fixed the respective warning.
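
To compare runs, the output can be saved and the number of reported issues counted. The sketch below assumes that pve5to6 reports problems on lines starting with "WARN:" or "FAIL:"; verify this against the actual output of your version. The function name count_issues and the log path in the usage note are hypothetical.

```shell
# Count issue lines in pve5to6 output read from stdin.
# Assumption: issues are reported on lines starting with "WARN:" or "FAIL:".
count_issues() {
    # grep -c prints the count even when it is 0, but then exits non-zero
    grep -cE '^(WARN|FAIL):' || true
}
```

It could be used as `pve5to6 | tee /root/pve5to6.log | count_issues`, repeating the call after each fix until the count no longer changes or reaches zero.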

Cluster: always upgrade to Corosync 3 first

With Corosync 3 the on-the-wire format has changed. It is now incompatible with Corosync 2.x because the underlying multicast UDP stack was replaced with kronosnet. Configuration files generated by Proxmox VE 5.2 or newer are already compatible with the new Corosync 3.x (at least enough to process the upgrade without any issues).

Important Note: before the upgrade, stop all HA management services first—no matter which way you choose for upgrading to Corosync 3. Stopping all HA services ensures that no cluster nodes get fenced during the upgrade. This also means that there will not be any HA functionality available for the short duration of the Corosync upgrade.

First, make sure that all warnings that are reported by the checklist script and not related to Corosync are fixed or determined to be benign/false positives. Next, stop the local resource manager "pve-ha-lrm" on each node. Only after it has been stopped on all nodes, stop the cluster resource manager "pve-ha-crm" on each node. Use the GUI (Node -> Services) or the CLI. To stop the local resource manager, run the following command on each node:

systemctl stop pve-ha-lrm

Only after the above was done for all nodes, run the following on each node:

systemctl stop pve-ha-crm
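
For clusters with many nodes, the required order (pve-ha-lrm everywhere first, then pve-ha-crm everywhere) can be scripted. This is a sketch only: the helpers run and stop_ha_services are hypothetical, root SSH access to every node is assumed, and by default the commands are only printed rather than executed.

```shell
# Hypothetical helper: run a command, or just print it when DRY_RUN=1
# (the default here, so nothing is executed by accident).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Stop pve-ha-lrm on every node first, then pve-ha-crm on every node.
# Node names are passed as arguments; root SSH access is assumed.
stop_ha_services() {
    for node in "$@"; do
        run ssh "root@$node" systemctl stop pve-ha-lrm
    done
    for node in "$@"; do
        run ssh "root@$node" systemctl stop pve-ha-crm
    done
}
```

Calling `stop_ha_services node1 node2` prints the four commands in order; setting `DRY_RUN=0` beforehand would actually run them. The two separate loops are what enforce the ordering.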

Then add the Proxmox Corosync 3 Stretch repository:

echo "deb http://download.proxmox.com/debian/corosync-3/ stretch main" > /etc/apt/sources.list.d/corosync3.list

and run

apt update

Then make sure again that only corosync, kronosnet and their libraries will be updated or newly installed:

apt list --upgradeable
Listing... Done
corosync/stable 3.0.2-pve2~bpo9 amd64 [upgradable from: 2.4.4-pve1]
libcmap4/stable 3.0.2-pve2~bpo9 amd64 [upgradable from: 2.4.4-pve1]
libcorosync-common4/stable 3.0.2-pve2~bpo9 amd64 [upgradable from: 2.4.4-pve1]
libcpg4/stable 3.0.2-pve2~bpo9 amd64 [upgradable from: 2.4.4-pve1]
libqb0/stable 1.0.5-1~bpo9+2 amd64 [upgradable from: 1.0.3-1~bpo9]
libquorum5/stable 3.0.2-pve2~bpo9 amd64 [upgradable from: 2.4.4-pve1]
libvotequorum8/stable 3.0.2-pve2~bpo9 amd64 [upgradable from: 2.4.4-pve1]
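
This check can be partly automated by filtering the listing against an allow list. The sketch below is not an official tool: the function name unexpected_upgrades is hypothetical, the first seven package names are taken from the listing above, and libknet1 and libnozzle1 are assumed names for the kronosnet libraries that may need adjusting.

```shell
# Allow list as an extended regular expression. The first seven names come
# from the example listing; libknet1 and libnozzle1 are assumed kronosnet
# library names and may need adjusting.
COROSYNC_PKGS='corosync|libcmap4|libcorosync-common4|libcpg4|libqb0|libquorum5|libvotequorum8|libknet1|libnozzle1'

# Read `apt list --upgradeable` output on stdin and print every package
# name that is not on the allow list. Prints nothing if all is as expected.
unexpected_upgrades() {
    grep '/' | cut -d/ -f1 | grep -Ev "^(${COROSYNC_PKGS})\$" || true
}
```

Running `apt list --upgradeable 2>/dev/null | unexpected_upgrades` should print nothing on a node that is ready; any package name it prints deserves a closer look before continuing.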

There are two ways to proceed with the Corosync upgrade:

  • Upgrade nodes one by one. Initially, the newly upgraded node(s) will not be quorate on their own. Once at least half of the nodes plus one have been upgraded, the upgraded partition will become quorate and the not-yet-upgraded partition will lose quorum. Once all nodes have been upgraded, they should form a healthy, quorate cluster again.
  • Upgrade all nodes simultaneously, e.g. using parallel ssh/screen/tmux.
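
For the one-by-one approach, the point at which the upgraded partition becomes quorate can be worked out in advance. The following sketch assumes every node has exactly one vote and no external vote (such as a QDevice) is involved; the function name quorum_votes is hypothetical.

```shell
# Number of votes needed for quorum in a cluster of N nodes with one vote
# each: more than half, i.e. floor(N / 2) + 1.
quorum_votes() {
    echo $(( $1 / 2 + 1 ))
}
```

In a five-node cluster, `quorum_votes 5` gives 3, so the upgraded partition becomes quorate once the third node has been upgraded, and the two remaining Corosync 2.x nodes lose quorum at that moment.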

Note: changes to any VM/CT or the cluster in general are not allowed for the duration of the upgrade!

Pre-download the upgrade to corosync-3 on all nodes, e.g., with:

 apt dist-upgrade --download-only

Then run the actual upgrade on all nodes:

 apt dist-upgrade

At any point in this procedure, the local view of the cluster quorum on a node can be verified with:

 pvecm status

Once the update to Corosync 3.x is done on all nodes, restart the local resource manager and cluster resource manager on all nodes:

 systemctl start pve-ha-lrm
 systemctl start pve-ha-crm

Move important Virtual Machines and Containers

If any VMs and CTs need to keep running for the duration of the upgrade, migrate them away from the node that is currently being upgraded. A migration of a VM or CT from an older version of Proxmox VE to a newer version will always work. A migration from a newer Proxmox VE version to an older version may work, but is in general not supported. Keep this in mind when planning your cluster upgrade.

Update the configured APT repositories

First, make sure that the system is running using the latest Proxmox VE 5.4 packages:

apt update
apt dist-upgrade

Update all Debian repository entries to Buster.

sed -i 's/stretch/buster/g' /etc/apt/sources.list

Disable all Proxmox VE 5.x repositories. This includes the pve-enterprise repository, the pve-no-subscription repository and the pvetest repository.

To do so, add the # symbol at the start of the respective lines in the /etc/apt/sources.list.d/pve-enterprise.list and /etc/apt/sources.list files to comment out these repositories. See Package_Repositories.
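
Commenting out the entries can also be done with sed instead of an editor. This is a sketch under assumptions: the helper name disable_repo_lines is hypothetical, the pattern must not contain a slash, and you should review the resulting files before running apt update.

```shell
# Hypothetical helper: comment out every active "deb" line in a sources
# file that matches the given extended regular expression (no "/" allowed).
disable_repo_lines() {
    file="$1"
    pattern="$2"
    tmp="$(mktemp)" &&
    # Prefix matching, not yet commented lines with "# "
    sed -E "/^[[:space:]]*deb[[:space:]].*(${pattern})/ s/^/# /" "$file" > "$tmp" &&
    # Write back through the original file to keep its permissions
    cat "$tmp" > "$file" &&
    rm -f "$tmp"
}
```

For example, `disable_repo_lines /etc/apt/sources.list 'pve-enterprise|pve-no-subscription|pvetest'` would comment out the three Proxmox VE repositories named above while leaving the Debian entries and already commented lines untouched.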

Add the Proxmox VE 6 Package Repository

echo "deb https://enterprise.proxmox.com/debian/pve buster pve-enterprise" > /etc/apt/sources.list.d/pve-enterprise.list

For the no-subscription repository see Package Repositories. It can be something like:

sed -i -e 's/stretch/buster/g' /etc/apt/sources.list.d/pve-install-repo.list 

(Ceph only) Replace ceph.com repositories with proxmox.com ceph repositories

echo "deb http://download.proxmox.com/debian/ceph-luminous buster main" > /etc/apt/sources.list.d/ceph.list

If there is a backports line, remove it. The upgrade has not been tested with packages from the backports repository installed.
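
To find such lines, the active (uncommented) entries can be searched across all source files. The helper name active_repo_lines below is hypothetical, and the sketch assumes file names without a colon.

```shell
# Hypothetical helper: print active (uncommented) APT source lines that
# contain the given word, searching the listed files.
active_repo_lines() {
    word="$1"
    shift
    # /dev/null makes grep always prefix hits with the file name and keeps
    # it from waiting on stdin if no file is given.
    grep -- "$word" "$@" /dev/null | grep -Ev '^[^:]*:[[:space:]]*#' || true
}
```

A call such as `active_repo_lines backports /etc/apt/sources.list /etc/apt/sources.list.d/*.list` lists the entries to remove. The same helper with the word `stretch` shows entries that were missed when switching the repositories to Buster.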

Update the repositories data:

apt update

Upgrade the system to Debian Buster and Proxmox VE 6.0

This action can take 60 minutes or more, depending on the system performance. On high-performance servers with SSD storage, the dist-upgrade can be finished in about 5 minutes.

Start with this step to get the initial set of upgraded packages:

 apt dist-upgrade

During the steps above, you may be asked whether new package versions should replace existing configuration files. These prompts are not relevant to the Proxmox VE upgrade itself, so you can choose whichever option suits your setup.

Reboot the system in order to use the new PVE kernel

After the Proxmox VE upgrade

For Clusters

  • Remove the extra Corosync 3 repository that was used to upgrade Corosync on PVE 5 / Stretch, if not done already. If you followed the steps here, you can simply execute the following command to do so:
rm /etc/apt/sources.list.d/corosync3.list

For Hyperconverged Ceph

Now you should upgrade the Ceph cluster to the Nautilus release, following the article Ceph Luminous to Nautilus.

Checklist issues

proxmox-ve package is too old

Check the configured package repository entries (see Package_Repositories) and run

apt update

followed by

apt dist-upgrade

to get the latest PVE 5.x packages before upgrading to PVE 6.x.

corosync 2.x installed, cluster-wide upgrade to 3.x needed!

See the section "Cluster: always upgrade to Corosync 3 first".

Known upgrade issues

General

As a Debian-based distribution, Proxmox VE is affected by most issues and changes affecting Debian. Make sure to read the Upgrade specific issues for buster.

Pay particular attention to "OpenSSL default version and security level raised" (normally checked by the pve5to6 tool) and "Semantics for using environment variables for su changed".

No 'root' Password set

The root account must have a password set (one that you remember). If it does not, you will not be able to log in as root again, because the sudo package will be uninstalled during the upgrade. If you used the official Proxmox VE or Debian installer and did not remove the root password after the installation, you are fine.
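
Whether root has a password can be checked before the upgrade. The sketch below inspects a shadow-format file; the function name root_has_password is hypothetical, and it treats an empty field or one starting with "!" or "*" as "no usable password".

```shell
# Succeeds if the root entry in a shadow-format file has a usable
# password hash (non-empty and not starting with "!" or "*").
root_has_password() {
    awk -F: '
        $1 == "root" { found = 1; ok = ($2 != "" && $2 !~ /^[!*]/) }
        END { exit !(found && ok) }
    ' "${1:-/etc/shadow}"
}
```

Run as root, `root_has_password || echo "set a root password first"` prints a reminder if needed. Alternatively, `passwd -S root` shows the account status in its second field (P for a usable password, L for locked, NP for no password).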

GlusterFS

At least the current version of Gluster (6.5-1, available at gluster.org) has conflicts with our packages. The upgrade has to be done with the option

-o Dpkg::Options::="--force-overwrite"

if this version is installed. This might be true for future versions as well.

Troubleshooting

Failing upgrade to "buster"

Make sure that the repository configuration for Buster is correct.

If there was a network failure and the upgrade was only partially completed, try to repair the situation with

apt -f install

Unable to boot due to grub failure

See Recover From Grub Failure

External links