Upgrade from 3.x to 4.0

Revision as of 09:53, 31 March 2017 by Thomas Lamprecht (talk | contribs) (→‎New installation: elaborate how to finish this procedure, avoid open ending)

Introduction

Proxmox VE 4.x introduces major new features, therefore the upgrade must be carefully planned and tested. Depending on your existing configuration, several manual steps are required, including some downtime. NEVER start the upgrade process without a valid backup and without first testing the upgrade in a test lab setup.

Major upgrades for V4.x:

  • OpenVZ is removed, a conversion via backup/restore to LXC is needed
  • New corosync version, therefore clusters have to be re-established
  • New HA manager (replacing RGmanager, involving a complete HA re-configuration)

If you run a customized installation and/or you installed additional packages, for example for distributed storage like Ceph or sheepdog, DRBD or any other third-party packages, you need to make sure that you also upgrade these packages to Debian Jessie.

V4.x supports only the new DRBD9 which is not backwards compatible with the 8.x version and is considered only a technology preview.

Generally speaking, there are two ways to move from 3.x to 4.x:

  • New installation on new hardware (and restore VMs from backup) - safest way, recommended!
  • In-place upgrade via apt, step by step

In both cases, clear the browser's cache after the upgrade and reload the GUI page; otherwise you may see a lot of glitches.

New installation

  • Backup all VMs and containers to external media (see Backup and Restore)
  • Backup all files in /etc. You will need various files in /etc/pve, as well as /etc/passwd, /etc/network/interfaces, /etc/resolv.conf and others, depending on what has been changed from the defaults.
  • Install Proxmox VE from ISO (this will wipe all data on the existing host)
  • Rebuild the cluster if you had any
  • Restore the file /etc/pve/storage.cfg (this will re-map and make available any external media you used for backup)
  • Restore firewall configs /etc/pve/firewall/ and /etc/pve/nodes/<node>/host.fw (if relevant)
  • Restore full VMs from Backups (see Backup and Restore)
  • Restore/Convert containers (see Convert OpenVZ to LXC)
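
The backup of /etc from the list above could be scripted roughly as follows. This is a minimal sketch, not part of the official procedure: the function name backup_etc and the destination path in the example are hypothetical and must be adapted to your external media.

```shell
# Sketch: archive a configuration directory (for example /etc) into a
# dated tarball. backup_etc and both arguments are illustrative names.
backup_etc() {
    src_dir=$1      # directory to back up, e.g. /etc
    dest_dir=$2     # target directory, e.g. a mounted external disk
    archive="$dest_dir/etc-backup-$(date +%Y%m%d).tar.gz"
    mkdir -p "$dest_dir" || return 1
    # -C keeps the paths inside the archive relative to the source directory
    tar -czf "$archive" -C "$src_dir" . || return 1
    # print the archive path so the caller can verify or copy it
    echo "$archive"
}

# Example (run as root on the old host, before installing from ISO):
#   backup_etc /etc /mnt/backup
```

Check the archive with tar -tzf before wiping the host, and keep it on media that the installer will not touch.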

Bypassing Backup and Restore

The following is only for advanced users who are familiar with the Proxmox VE configuration files!

Since backup and restore can be a time-consuming process, a faster method is described below. It is possible only

  • for KVM (i.e. not for containers)
  • if the virtual disk(s) of the VM(s) are located on a storage which is not touched by the installation process (e.g. NFS on an external server)

The steps

  • Backup all VMs
  • Restore full VMs from Backups

will be replaced by

  • Backup the <vmid>.conf file(s) of the respective machine(s); they are located under /etc/pve/nodes/<nodename>/lxc/ and /etc/pve/nodes/<nodename>/qemu-server/ respectively
  • Backup the entries in the storage configuration ( /etc/pve/storage.cfg ) that belong to shared and untouched storages: simply copy the respective lines and append them once to the newly built cluster's /etc/pve/storage.cfg
  • Restore <vmid>.conf file(s) for the respective machine(s)

Note: /etc/pve/lxc/ and /etc/pve/qemu-server/ are virtual symlinks to the current node's lxc and qemu-server directories.

After you have restored the VM configs and the external shared storage configuration, so that the storage is accessible under the same name in the new cluster, you should be able to start the VMs again. No additional reboot is required.
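
Saving the VM configuration files, as described in the steps above, might look like the following. This is a sketch under assumptions: save_vm_configs is a made-up helper name and the backup directory is whatever location you choose outside the host being reinstalled.

```shell
# Sketch: copy all <vmid>.conf files of one node to a backup directory.
# save_vm_configs and its arguments are illustrative names.
save_vm_configs() {
    conf_dir=$1     # e.g. /etc/pve/nodes/<nodename>/qemu-server
    backup_dir=$2   # e.g. a directory on external media
    mkdir -p "$backup_dir" || return 1
    for conf in "$conf_dir"/*.conf; do
        [ -e "$conf" ] || continue    # no match: the glob stays literal
        cp -p "$conf" "$backup_dir/" || return 1
    done
}

# Example:
#   save_vm_configs /etc/pve/qemu-server /mnt/backup/qemu-server
```

Restoring is the reverse copy into /etc/pve/qemu-server/ on the new node, once the shared storage entries are back in /etc/pve/storage.cfg.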

In-place upgrade

In-place upgrades are done with apt, so make sure that you are familiar with apt before you start here.

Preconditions

  • upgraded to latest V3.4 version
  • reliable access to all configured storages
  • healthy cluster
  • no VM or CT running (note: VM live migration from a 3.4 node to a 4.x node or vice versa is NOT possible)
  • valid backup of all OpenVZ containers (needed for the conversion to LXC)
  • valid backup of all VMs (only needed if something goes wrong)
  • correct repository configuration (both the wheezy and the jessie repositories must be accessible)
  • at least 1GB free disk space at root mount point
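
The free-space precondition from the list above can be checked from the command line. The helper below is a minimal sketch; the name has_free_kb is an assumption, and it relies on the POSIX output format of df, where the fourth column is the available space.

```shell
# Sketch: succeed only if the filesystem holding a path has at least
# the given number of kilobytes available. The function name is illustrative.
has_free_kb() {
    path=$1
    min_kb=$2
    # df -P prints one POSIX-format line per filesystem; column 4 is "Available"
    avail_kb=$(df -Pk "$path" | awk 'NR == 2 {print $4}')
    [ "$avail_kb" -ge "$min_kb" ]
}

# Example: 1 GB is 1048576 KB; check the root mount point
#   has_free_kb / 1048576 || echo "not enough free space on /"
```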

Actions Step by Step

Everything has to be done on each Proxmox node's command line (via console or ssh; preferably via console, to rule out interrupted ssh connections). Some of the steps are optional. If a whole cluster is to be upgraded, note down the cluster name and the HA configuration (failover domains, fencing etc.), since these have to be re-created after the upgrade via the new web GUI. Again, make sure that you have a valid backup of all CTs and VMs before you start.

Tip: It is advisable to perform a dry-run of the upgrade first. Install the PVE 3.4 ISO on testing hardware, upgrade this installation to the latest minor version of PVE 3.4 using the test repo (see Package repositories), then copy/create the relevant configurations on the test machine to replicate your production setup as closely as possible.

Remove Proxmox VE 3.x packages in order to avoid dependency errors

First make sure that your current installation is "clean" by running

apt-get update && apt-get dist-upgrade

Then start the removal:

apt-get remove proxmox-ve-2.6.32 pve-manager corosync-pve openais-pve redhat-cluster-pve pve-cluster pve-firmware 

Adapt the repository locations so that they all point to jessie, then update the apt database, e.g.:

sed -i 's/wheezy/jessie/g' /etc/apt/sources.list
sed -i 's/wheezy/jessie/g' /etc/apt/sources.list.d/pve-enterprise.list
apt-get update

If there is a backports line, remove it. Currently, pve-manager and ceph-common have unmet dependencies with regard to package versions in the jessie backports repo.
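
The repository switch and the backports removal described above can be combined in one small helper. This is only a sketch: the name switch_to_jessie is made up for illustration, and it assumes that every backports entry contains the word "backports" on its line. The original file is kept with a .bak suffix so the change can be reverted.

```shell
# Sketch: rewrite one APT sources file from wheezy to jessie, dropping
# any backports lines and keeping the original as <file>.bak.
# switch_to_jessie is an illustrative name.
switch_to_jessie() {
    list_file=$1
    [ -f "$list_file" ] || return 0   # nothing to do if the file is absent
    # delete backports lines first, then rename the release on the rest
    sed -i.bak -e '/backports/d' -e 's/wheezy/jessie/g' "$list_file"
}

# Example:
#   switch_to_jessie /etc/apt/sources.list
#   switch_to_jessie /etc/apt/sources.list.d/pve-enterprise.list
#   apt-get update
```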

In case Ceph server is used: Ceph repositories for jessie can be found at http://download.ceph.com, therefore /etc/apt/sources.list.d/ceph.list will contain e.g.:

 deb http://download.ceph.com/debian-hammer jessie main

You also need to add the Ceph repository key to apt. For details, check the wiki on ceph.com.

Install the new kernel

First check which kernel version is currently the newest:

apt-cache search pve-kernel | sort

At the moment (September 2016) it is 4.4.19-1. Install it:

apt-get install pve-kernel-4.4.19-1-pve pve-firmware
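
Picking the newest kernel package out of the search result can also be automated. The filter below is a sketch under assumptions: latest_pve_kernel is a hypothetical name, it expects each apt-cache line to start with the package name, and it only considers names of the form pve-kernel-<version>-pve as seen above. It uses the version sort (-V) of GNU sort.

```shell
# Sketch: read `apt-cache search pve-kernel` output on stdin and print
# the highest-versioned pve-kernel-<version>-pve package name.
# latest_pve_kernel is an illustrative name.
latest_pve_kernel() {
    awk '{print $1}' \
        | grep -E '^pve-kernel-[0-9]+\.[0-9]+\.[0-9]+-[0-9]+-pve$' \
        | sort -V \
        | tail -n 1
}

# Example:
#   apt-cache search pve-kernel | latest_pve_kernel
```

Review the printed name before passing it to apt-get install; the highest version number is not necessarily the one you want to run.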

Upgrade the basic system to Debian Jessie

This step takes some time. Depending on the system's performance, it can take up to 60 minutes or even more; on SSDs, the dist-upgrade can be finished in 5 minutes.

Start with this step to get the initial set of upgraded packages.

apt-get upgrade

Once that's done, move on to the remaining packages to upgrade, with:

apt-get dist-upgrade

During either of the above steps, you may be asked whether new package versions should replace existing configuration files. Decide as you see fit; these prompts are not relevant to the Proxmox upgrade.

Reboot the system in order to activate the new kernel.
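
After the reboot it is worth confirming that the base system really is Jessie before continuing. A minimal sketch, assuming the usual /etc/debian_version file (Jessie is Debian 8); the function name is_jessie is illustrative:

```shell
# Sketch: report whether a Debian version file describes Jessie (8.x).
# is_jessie is an illustrative name; the file defaults to /etc/debian_version.
is_jessie() {
    version_file=${1:-/etc/debian_version}
    case "$(cat "$version_file" 2>/dev/null)" in
        8.*|jessie*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Example, after the reboot:
#   is_jessie && echo "base system is Jessie"
#   uname -r    # should now report the newly installed 4.x kernel
```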

Install Proxmox VE 4.x

Finally, install the new Proxmox VE 4.x packages with one single command:

apt-get install proxmox-ve

Then you should purge the configuration files of packages which are no longer needed. Note: purging vzctl will delete all files in /var/lib/vz/private and /var/lib/vz/root, so only run this if you have backed up your OpenVZ containers.

dpkg --purge vzctl
dpkg --purge redhat-cluster-pve

Remove the old kernel (not a must, but recommended). The kernel version has to be adapted to the one that is actually installed, and there can be more than one old kernel; use dpkg --list | grep pve-kernel to find any 2.6.* kernels to remove. For example:

apt-get remove pve-kernel-2.6.*
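
To see exactly which old kernel packages are still installed before removing them, the dpkg listing can be filtered. This is a sketch: list_old_pve_kernels is a hypothetical name, and it assumes the standard dpkg --list layout with the status in the first column and the package name in the second.

```shell
# Sketch: read `dpkg --list` output on stdin and print the names of
# installed 2.6.x Proxmox kernel packages. The function name is illustrative.
list_old_pve_kernels() {
    # "ii" in column 1 marks a fully installed package; column 2 is its name
    awk '$1 == "ii" && $2 ~ /^pve-kernel-2\.6\./ {print $2}'
}

# Example:
#   dpkg --list | list_old_pve_kernels
```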

Finally, reboot and test if all is working as expected.

Optional: OpenVZ conversion

Convert the previously backed-up containers to LXC, following the HowTo on Convert OpenVZ to LXC.

You can also remove the obsolete OpenVZ container data from your local storage:

rm -f /etc/pve/openvz/<ct-id>.conf
rm -R <storage-path>/private/*

Cluster upgrade

It is not possible to mix Proxmox VE 3.x and earlier with Proxmox VE 4.x in one cluster.

Due to the new corosync 2.x, the cluster has to be re-established. Please use the same cluster name.

  • at the first node:

pvecm create <clustername>

  • at all other nodes:

pvecm add <first-node-IP> -force

The HA configuration (failover, fencing etc.) has to be re-configured manually; this is now supported from the web GUI, see High Availability Cluster 4.x.

After upgrading the last node remove the V3.x cluster data:

rm /etc/pve/cluster.conf
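
Because the old cluster.conf is the only record of the former HA settings, you may prefer to keep a copy outside /etc/pve before deleting it. A minimal sketch; archive_then_remove and the target directory in the example are hypothetical names:

```shell
# Sketch: keep a copy of a file elsewhere before deleting it.
# archive_then_remove is an illustrative name.
archive_then_remove() {
    file=$1        # e.g. /etc/pve/cluster.conf
    keep_dir=$2    # e.g. /root/pve3-config
    [ -f "$file" ] || return 0
    mkdir -p "$keep_dir" || return 1
    # remove the original only if the copy succeeded
    cp -p "$file" "$keep_dir/" && rm "$file"
}

# Example:
#   archive_then_remove /etc/pve/cluster.conf /root/pve3-config
```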

Troubleshooting

  • Failing upgrade to latest Proxmox VE 3.x or removal of old packages:

Make sure that the original repository configuration (for wheezy) is correct. The change to "jessie" repositories has to be done after the removal of old Proxmox VE.

In case Ceph is used: note that the repository URL has recently changed to http://download.ceph.com/

  • Failing upgrade to "jessie"

Make sure that the repository configuration for jessie is correct.

If there was a network failure and the upgrade was only partially completed, try to repair the situation with

apt-get -fy install

  • Unable to boot due to grub failure

See Recover_From_Grub_Failure

External links