[PVE-User] PVE 3.4 to 4.1 upgrade terror history

Alain Péan alain.pean at lpn.cnrs.fr
Mon Jan 11 10:40:20 CET 2016


On 11/01/2016 10:08, Eneko Lacunza wrote:
> This is a failed upgrade report from last Friday.
>
> We have a 4-node Proxmox cluster, with 3 of them running ceph MON and 
> 3 osd daemons each. All nodes are updated to the latest PVE 3.4:
>
> node1: vms + ceph mon + 3xceph osd
> node2: vms + ceph mon + 3xceph osd
> node3: vms + ceph mon + 3xceph osd
> node4: vms (actually 0 right now) + vzdumps (nfs)
>
> To upgrade node2, we first moved all VMs running on "node2" to node1 & 
> node3, then followed the wiki upgrade guide:
> https://pve.proxmox.com/wiki/Upgrade_from_3.x_to_4.0
>
> All was going quite well. We installed pve-kernel-4.2.2-1-pve and not 
> the latest available, which I think was the mistake.
>
> The problem was that after "apt-get dist-upgrade" the server would not 
> boot; it just kernel panicked. We tried booting with a PVE 3.4 kernel, 
> but the new userland (systemd) wasn't able to boot with it.
>
> We tried various things (even installing the latest PVE kernel .deb 
> with Debian 8 rescue pendrives) but weren't able to fix it, so finally 
> had to reinstall the server from scratch with the PVE 4.1 ISO.
>
> Reinstalling with ISO was successful and we have even recovered the 
> ceph OSD on that node.
>
> This node2 server is a Dell T610. Do you think the 4.2.2-1 kernel was 
> faulty for this server? Maybe updating the wiki with the latest PVE 4.1 
> kernel could help others?
>
> Maybe it could be a good idea to also install the Debian 8 kernel, so 
> that there is another option to boot in case the PVE kernel doesn't work?

Hi Eneko,

I did the upgrade myself over the 1 January weekend (a long weekend with 
nobody around...). I installed the latest available kernel and also 
pve-firmware, which is important for getting the correct drivers for the 
network cards and RAID controllers of our servers (Dell PE R710 and R630).

What I did was search for the available kernels and install the latest 
one, after updating the repositories to Jessie:
# apt-cache search pve-kernel
pve-kernel-4.2.3-2-pve - The Proxmox PVE Kernel Image
pve-kernel-4.2.2-1-pve - The Proxmox PVE Kernel Image
pve-kernel-4.2.3-1-pve - The Proxmox PVE Kernel Image
pve-kernel-4.2.6-1-pve - The Proxmox PVE Kernel Image
pve-firmware - Binary firmware code for the pve-kernel
pve-kernel-2.6.32-40-pve - The Proxmox PVE Kernel Image
pve-kernel-2.6.32-43-pve - The Proxmox PVE Kernel Image
pve-kernel-2.6.32-37-pve - The Proxmox PVE Kernel Image
pve-kernel-2.6.32-39-pve - The Proxmox PVE Kernel Image

Then:
# apt-get install pve-kernel-4.2.6-1-pve pve-firmware

Then upgrade to Jessie:
# apt-get dist-upgrade

as stated in the documentation.

My servers rebooted fine after that.
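
If you want to script the "pick the latest kernel" step instead of reading 
the list by eye, something like the following might do. This is only a 
rough sketch: the function name newest_pve_kernel is made up for this 
example, and it assumes GNU sort with the -V (version sort) option, which 
is close to, but not the same as, dpkg's own version comparison.

```shell
#!/bin/sh
# Hypothetical helper (not part of Proxmox): read "apt-cache search"
# style output on stdin and print the pve-kernel package with the
# highest version number.
newest_pve_kernel() {
    # Keep only the package name of versioned kernel packages
    # (this drops pve-firmware and the description text),
    # then version-sort and take the last entry.
    grep -o '^pve-kernel-[0-9][^ ]*' | sort -V | tail -n 1
}

# Intended use on a node whose repositories already point to Jessie:
#   kernel=$(apt-cache search pve-kernel | newest_pve_kernel)
#   apt-get install "$kernel" pve-firmware
```

Fed the list above, it should select pve-kernel-4.2.6-1-pve. Check the 
result before installing anything, of course.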

I did have problems, but they were due to our network configuration (we 
use a proxy, and the DNS servers are themselves virtualized) and to the 
fact that we had to rebuild the cluster because of the upgrade from 
corosync 1.x to 2.x. We don't use Ceph for now; our VMs are stored 
locally or on a Dell EqualLogic NAS over iSCSI.

Alain

-- 
System/Network Administrator
Laboratoire de Photonique et Nanostructures (LPN/CNRS - UPR20)
Centre de Recherche Alcatel Data IV - Marcoussis
route de Nozay - 91460 Marcoussis
Tel : 01-69-63-61-34



