Upgrade from 5.x to 6.0: Difference between revisions
Revision as of 16:54, 12 May 2020
Introduction
Proxmox VE 6.x introduces several new major features. Carefully plan the upgrade, make and verify backups before beginning, and test extensively. Depending on the existing configuration, several manual steps—including some downtime—may be required.
Note: A valid and tested backup is always needed before starting the upgrade process. Test the backup beforehand in a test lab setup.
In case the system is customized and/or uses additional packages or any other third party repositories/packages, ensure those packages are also upgraded to and compatible with Debian Buster.
In general, there are two ways to upgrade a Proxmox VE 5.x system to Proxmox VE 6.x:
- A new installation on new hardware (and restoring VMs from the backup)
- An in-place upgrade via apt (step-by-step)
In both cases emptying the browser cache and reloading the GUI page is required after the upgrade.
New installation
- Backup all VMs and containers to external storage (see Backup and Restore).
- Backup all files in /etc (required: files in /etc/pve, as well as /etc/passwd, /etc/network/interfaces, /etc/resolv.conf, as well as anything deviating from a default installation)
- Install Proxmox VE from the ISO (this will delete all data on the existing host).
- Rebuild your cluster, if applicable.
- Restore the file /etc/pve/storage.cfg (this will make the external storage used for backup available).
- Restore firewall configs /etc/pve/firewall/ and /etc/pve/nodes/<node>/host.fw (if applicable).
- Restore full VMs from backups (see Backup and Restore).
Administrators comfortable with the command line can follow the procedure Bypassing backup and restore when upgrading if all VMs/CTs are on one shared storage.
In-place upgrade
In-place upgrades are done with apt. Familiarity with apt is required to proceed with this upgrade mechanism.
Preconditions
- Upgrade to the latest version of Proxmox VE 5.4.
- Reliable access to all configured storage.
- A healthy cluster.
- Valid and tested backup of all VMs and CTs (in case something goes wrong).
- Correct configuration of the repository.
- At least 1GB free disk space at root mount point.
- Ceph: upgrade the Ceph cluster to Nautilus only after you have upgraded to Proxmox VE 6.x; follow the guide Ceph Luminous to Nautilus
- Check known upgrade issues
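The free disk space precondition can be checked from the command line. The following is a minimal sketch; the function name has_free_space is hypothetical, and the default threshold of 1048576 KiB assumes the "1GB" above means 1 GiB.

```shell
# Hypothetical helper: succeed if the filesystem holding $1 has at least
# $2 KiB available (default: 1048576 KiB = 1 GiB).
has_free_space() {
    path=${1:-/}
    need_kib=${2:-1048576}
    # df -Pk prints POSIX-format output in 1024-byte blocks;
    # column 4 of the second line is the available space.
    avail_kib=$(df -Pk "$path" | awk 'NR == 2 { print $4 }')
    [ "$avail_kib" -ge "$need_kib" ]
}
```

For example, has_free_space / || echo "not enough free space on /" prints a warning when the root mount point is too full.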
Testing the Upgrade
An upgrade test can easily be performed using a standalone server first. Install the Proxmox VE 5.4 ISO on some test hardware; then upgrade this installation to the latest minor version of Proxmox VE 5.4 (see Package repositories). To replicate the production setup as closely as possible, copy or create all relevant configurations to the test machine. Then start the upgrade. It is also possible to install Proxmox VE 5.4 in a VM and test the upgrade in this environment.
Actions step-by-step
The following actions need to be done on the command line of each Proxmox VE node in your cluster (via console or ssh; preferably via console to avoid interrupted ssh connections). Remember: make sure that a valid backup of all VMs and CTs has been created before proceeding.
Continuously use the pve5to6 checklist script
A small checklist program named pve5to6 is included in the latest Proxmox VE 5.4 packages. The program will provide hints and warnings about potential issues before, during and after the upgrade process. One can call it by executing:
pve5to6
This script only checks and reports things. By default, no changes to the system are made, so none of the issues will be fixed automatically. Keep in mind that Proxmox VE can be heavily customized, so the script may not recognize all the possible problems of a particular setup!
It is recommended to re-run the script after each attempt to fix an issue. This ensures that the actions taken actually fixed the respective warning.
Cluster: always upgrade to Corosync 3 first
With Corosync 3 the on-the-wire format has changed. It is now incompatible with Corosync 2.x because the underlying multicast UDP stack was replaced with kronosnet. Configuration files generated by Proxmox VE 5.2 or newer are already compatible with the new Corosync 3.x (at least enough to process the upgrade without any issues).
Important Note: before the upgrade, stop all HA management services first—no matter which way you choose for upgrading to Corosync 3. Stopping all HA services ensures that no cluster nodes get fenced during the upgrade. This also means that there will not be any HA functionality available for the short duration of the Corosync upgrade.
First, make sure that all warnings that are reported by the checklist script and not related to Corosync are fixed or determined to be benign/false positives. Next, stop the local resource manager "pve-ha-lrm" on each node. Only after it has been stopped on every node, also stop the cluster resource manager "pve-ha-crm" on each node. Use the GUI (Node -> Services) or the CLI by running the following command on each node:
systemctl stop pve-ha-lrm
Only after the above was done for all nodes, run the following on each node:
systemctl stop pve-ha-crm
Then add the Proxmox Corosync 3 Stretch repository:
echo "deb http://download.proxmox.com/debian/corosync-3/ stretch main" > /etc/apt/sources.list.d/corosync3.list
and run
apt update
Then make sure again that only corosync, kronosnet and their libraries will be updated or newly installed:
apt list --upgradeable
Listing... Done
corosync/stable 3.0.2-pve2~bpo9 amd64 [upgradable from: 2.4.4-pve1]
libcmap4/stable 3.0.2-pve2~bpo9 amd64 [upgradable from: 2.4.4-pve1]
libcorosync-common4/stable 3.0.2-pve2~bpo9 amd64 [upgradable from: 2.4.4-pve1]
libcpg4/stable 3.0.2-pve2~bpo9 amd64 [upgradable from: 2.4.4-pve1]
libqb0/stable 1.0.5-1~bpo9+2 amd64 [upgradable from: 1.0.3-1~bpo9]
libquorum5/stable 3.0.2-pve2~bpo9 amd64 [upgradable from: 2.4.4-pve1]
libvotequorum8/stable 3.0.2-pve2~bpo9 amd64 [upgradable from: 2.4.4-pve1]
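On nodes with many packages, comparing this list by eye is error-prone. The following sketch filters the listing against the package names shown above. The function name unexpected_upgrades is hypothetical, and the allowlist is taken only from the sample output, so it may need extending for other setups.

```shell
# Hypothetical helper: read `apt list --upgradable` output on stdin and
# print the name of every package that is NOT in the expected Corosync 3 set.
unexpected_upgrades() {
    awk -F/ '
        /^Listing/ || NF < 2 { next }   # skip the header and blank lines
        $1 !~ /^(corosync|libcmap4|libcorosync-common4|libcpg4|libqb0|libquorum5|libvotequorum8)$/ { print $1 }
    '
}
```

Running apt list --upgradable 2>/dev/null | unexpected_upgrades should print nothing if only the expected packages are pending.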
There are two ways to proceed with the Corosync upgrade:
- Upgrade nodes one by one. Initially, the newly upgraded node(s) will not be quorate on their own. Once at least half of the nodes plus one have been upgraded, the upgraded partition will become quorate and the not-yet-upgraded partition will lose quorum. Once all nodes have been upgraded, they should form a healthy, quorate cluster again.
- Upgrade all nodes simultaneously, e.g. using parallel ssh/screen/tmux.
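To plan a one-by-one upgrade, it helps to know how many nodes must be upgraded before the new partition becomes quorate. The sketch below assumes the default of one vote per node and a strict-majority quorum; the function name quorum_votes is hypothetical.

```shell
# Votes needed for quorum in a cluster of $1 nodes with one vote each:
# a strict majority, i.e. floor(n / 2) + 1.
quorum_votes() {
    echo $(( $1 / 2 + 1 ))
}

quorum_votes 5   # prints 3
quorum_votes 4   # prints 3
```

In a five-node cluster, for instance, the upgraded partition becomes quorate once the third node is running Corosync 3.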
Note: changes to any VM/CT or the cluster in general are not allowed for the duration of the upgrade!
Pre-download the upgrade to corosync-3 on all nodes, e.g., with:
apt dist-upgrade --download-only
Then run the actual upgrade on all nodes:
apt dist-upgrade
At any point in this procedure, the local view of the cluster quorum on a node can be verified with:
pvecm status
Once the update to Corosync 3.x is done on all nodes, restart the local resource manager and cluster resource manager on all nodes:
systemctl start pve-ha-lrm
systemctl start pve-ha-crm
Move important Virtual Machines and Containers
If any VMs and CTs need to keep running for the duration of the upgrade, migrate them away from the node that is currently upgraded. A migration of a VM or CT from an older version of Proxmox VE to a newer version will always work. A migration from a newer Proxmox VE version to an older version may work, but is in general not supported. Keep this in mind when planning your cluster upgrade.
Update the configured APT repositories
First, make sure that the system is running using the latest Proxmox VE 5.4 packages:
apt update
apt dist-upgrade
Update all Debian repository entries to Buster.
sed -i 's/stretch/buster/g' /etc/apt/sources.list
Disable all Proxmox VE 5.x repositories. This includes the pve-enterprise repository, the pve-no-subscription repository and the pvetest repository.
To do so add the # symbol to comment out these repositories in the /etc/apt/sources.list.d/pve-enterprise.list and /etc/apt/sources.list files. See Package_Repositories
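Commenting out the entries can also be scripted. This is a minimal sketch, assuming each repository line ends with its component name (pve-enterprise, pve-no-subscription or pvetest) and that GNU sed is available, as it is on Debian. The function name disable_pve_repos is hypothetical.

```shell
# Hypothetical helper: comment out active "deb" lines in file $1 whose
# last field is one of the Proxmox VE repository components.
disable_pve_repos() {
    sed -i -E '/^[[:space:]]*deb[[:space:]].*[[:space:]](pve-enterprise|pve-no-subscription|pvetest)[[:space:]]*$/ s/^/# /' "$1"
}
```

It would be called once per file, for example disable_pve_repos /etc/apt/sources.list. Review the result before continuing, since non-standard entries will not match the pattern.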
Add the Proxmox VE 6 Package Repository
echo "deb https://enterprise.proxmox.com/debian/pve buster pve-enterprise" > /etc/apt/sources.list.d/pve-enterprise.list
For the no-subscription repository see Package Repositories. It can be something like:
sed -i -e 's/stretch/buster/g' /etc/apt/sources.list.d/pve-install-repo.list
(Ceph only) Replace ceph.com repositories with proxmox.com ceph repositories
echo "deb http://download.proxmox.com/debian/ceph-luminous buster main" > /etc/apt/sources.list.d/ceph.list
If there is a backports line, remove it - currently, the upgrade has not been tested with packages from the backports repository installed.
Update the repositories data:
apt update
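Before starting the dist-upgrade, it can be worth confirming that no active repository line still points at Stretch. The following is a sketch using GNU grep; the function name remaining_stretch_entries is hypothetical. On cluster nodes, the corosync3.list entry added earlier will still be reported, which is expected until that file is removed.

```shell
# List active (uncommented) APT source lines that still mention "stretch".
# $1: directory holding sources.list and sources.list.d (default: /etc/apt).
# Returns non-zero when no such line is found.
remaining_stretch_entries() {
    dir=${1:-/etc/apt}
    grep -rHn --include='*.list' -E '^[[:space:]]*deb(-src)?[[:space:]].*stretch' "$dir" 2>/dev/null
}
```

Calling remaining_stretch_entries without arguments prints each offending file name, line number and line.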
Upgrade the system to Debian Buster and Proxmox VE 6.0
This action will take some time depending on the system performance - up to 60 min or more. On high-performance servers with SSD storage, the dist-upgrade can be finished in 5 minutes.
Start with this step to get the initial set of upgraded packages:
apt dist-upgrade
During the steps above, you may be asked to approve some of the new packages replacing configuration files. They are not relevant to the Proxmox VE upgrade, so you can choose what you want to do.
Reboot the system in order to use the new PVE kernel
After the Proxmox VE upgrade
For Clusters
- Remove the extra Corosync 3 repository used to upgrade Corosync on PVE 5 / Stretch, if not done already. If you followed the steps here, you can simply execute the following command to do so:
rm /etc/apt/sources.list.d/corosync3.list
For Hyperconverged Ceph
Now you should upgrade the Ceph cluster to the Nautilus release, following the article Ceph Luminous to Nautilus.
Checklist issues
proxmox-ve package is too old
Check the configured package repository entries (see Package_Repositories) and run
apt update
followed by
apt dist-upgrade
to get the latest PVE 5.x packages before upgrading to PVE 6.x
corosync 2.x installed, cluster-wide upgrade to 3.x needed!
See section Upgrade to corosync 3 first
Known upgrade issues
General
As a Debian-based distribution, Proxmox VE is affected by most issues and changes affecting Debian. Be sure to read the Upgrade specific issues for buster,
especially "OpenSSL default version and security level raised" (normally checked by the pve5to6 tool) and "Semantics for using environment variables for su changed".
Upgrade wants to remove package 'proxmox-ve'
If you have installed Proxmox VE on top of Debian Stretch, you may have installed the package 'linux-image-amd64' which conflicts with current 6.x setups. To solve this you have to remove this package with
apt remove linux-image-amd64
before the dist-upgrade.
No 'root' Password set
The root account must have a password set (that you remember). If not, the sudo package will be uninstalled during the upgrade, and you will not be able to log in again as root if root has no password set. If you used the official Proxmox VE or Debian installer and did not remove the password after the installation, you are fine.
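Whether root has a usable password can be read from the status line printed by passwd -S, whose second field is P (usable password), L (locked) or NP (no password). The following sketch interprets such a line; the function name root_has_password is hypothetical.

```shell
# Interpret one line of `passwd -S` output passed as $1.
# Succeeds only if the second field is "P" (usable password).
root_has_password() {
    status=$(printf '%s\n' "$1" | awk '{ print $2 }')
    [ "$status" = "P" ]
}

# Usage on a real system (run as root):
#   root_has_password "$(passwd -S root)" || echo "set a root password first"
```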
GlusterFS
At least the current version of Gluster (6.5-1, available at gluster.org) has conflicts with our packages. The upgrade has to be done with the option
-o Dpkg::Options::="--force-overwrite"
if this version is installed. This might be true for future versions as well.
Qemu Virtual Machines booting with OVMF (EFI)
An issue with EFI disks was fixed in qemu-server 6.1-12. An EFI disk on a storage which doesn't allow for small (128 KB) images (for example: Ceph, ZFS, LVM(thin)) was not correctly mapped to the VM. While this is fixed now, such existing setups need manual intervention.
- You do not have to do anything if your EFI disk is using qcow2 or "raw" on a file-based storage.
- Before the upgrade, make sure that on your ESP, the EFI boot binary exists at \EFI\BOOT\BOOTX64.EFI (the default EFI boot fallback).
  - Windows and some Linux VMs using systemd-boot should do that automatically
- After the upgrade, you can recreate the EFI boot entries (e.g., with efibootmgr) and delete the BOOTX64.EFI again (if it did not exist before).
- If you already upgraded and the VM does not boot, start the VM and press ESC in the console to get into the OVMF menu. Then select "Boot Maintenance Manager" -> "Boot From File" and choose the disk with the EFI System Partition. Now you can navigate to the EFI executable, for example EFI/debian/grubx64.efi for Debian or EFI/fedora/shimx64-fedora.efi for Fedora. Once you have found the correct one, boot from it and then restore the EFI boot entry; see your distribution's documentation for this, or use the OVMF firmware GUI.
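Inside a Linux guest, the presence of the fallback binary can be checked with a short script. This is a sketch under assumptions: the ESP mount point (/boot/efi by default here) and the function name has_efi_fallback are hypothetical, and the efibootmgr line in the comment uses placeholder disk, partition and label values.

```shell
# $1: mount point of the guest's EFI System Partition (assumed /boot/efi).
# FAT file names are case-insensitive, so the path is matched without
# regard to case, exactly three levels below the mount point.
has_efi_fallback() {
    esp=${1:-/boot/efi}
    [ -n "$(find "$esp" -mindepth 3 -maxdepth 3 -type f -ipath '*/efi/boot/bootx64.efi' 2>/dev/null)" ]
}

# Recreating a boot entry after the upgrade could look like this
# (disk, partition number and label are placeholders, adjust to the guest):
#   efibootmgr -c -d /dev/sda -p 1 -L "debian" -l '\EFI\debian\grubx64.efi'
```

If has_efi_fallback fails, copying the distribution's boot loader to EFI/BOOT/BOOTX64.EFI on the ESP before the upgrade provides the fallback described above.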
Troubleshooting
Failing upgrade to "buster"
Make sure that the repository configuration for Buster is correct.
If there was a network failure and the upgrade was only partially completed, try to repair the situation with
apt -f install
Unable to boot due to grub failure
OVH server blocked
If you use OVH and cannot access your server anymore after the upgrade (no GUI, no SSH), read this tutorial in the Proxmox VE forum.