Ceph Hammer to Jewel
Introduction
This HOWTO explains the upgrade from Ceph Hammer to Jewel (10.2.5 or higher). We strongly recommend that you upgrade the cluster node by node.
Assumption
In this HOWTO we assume that all nodes run the latest Proxmox VE 4.4 or higher and that Ceph is on version Hammer.
Preparation
Change the current Ceph repositories from Hammer to Jewel.
sed -i 's/hammer/jewel/' /etc/apt/sources.list.d/ceph.list
For more information, see Ceph Packages.
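The repository switch can be wrapped so that a backup is kept and the result is checked. This is a minimal sketch, not part of Proxmox VE or Ceph: the helper name switch_ceph_repo and the backup suffix .hammer.bak are assumptions made for illustration.

```shell
# Sketch: switch a Ceph APT source file from hammer to jewel, keeping a backup.
# switch_ceph_repo is a hypothetical helper, not a Proxmox VE or Ceph command.
switch_ceph_repo() {
    list_file="${1:-/etc/apt/sources.list.d/ceph.list}"

    # Keep a copy so the change can be reverted if the upgrade is postponed.
    cp -p "$list_file" "$list_file.hammer.bak" || return 1

    # Same substitution as in the HOWTO.
    sed -i 's/hammer/jewel/' "$list_file" || return 1

    # Fail loudly if a hammer entry is still present.
    if grep -q 'hammer' "$list_file"; then
        echo "hammer still referenced in $list_file" >&2
        return 1
    fi

    # Print the resulting jewel entries for a visual check.
    grep 'jewel' "$list_file"
}
```

Called without an argument, the helper works on /etc/apt/sources.list.d/ceph.list and prints the resulting jewel lines.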
Upgrade
Upgrade the node with the following commands.
apt-get update && apt-get dist-upgrade
This upgrades the packages from all repositories configured on your node.
Stop daemons
To prevent rebalancing, set the noout flag on the OSDs. This can be done in the GUI on the OSD tab or with this command:
ceph osd set noout
Kill all OSDs on this node. This can only be done on the command line.
killall ceph-osd
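Before you continue, it can help to confirm that the OSD processes have really exited. The following is a minimal sketch; the helper name wait_for_osds_stopped and the 30-second default are assumptions, not part of the original procedure.

```shell
# Sketch: wait until no ceph-osd process is left on this node.
# wait_for_osds_stopped is a hypothetical helper; the default timeout is an assumption.
wait_for_osds_stopped() {
    timeout_s="${1:-30}"
    waited=0
    # pgrep -x matches the exact process name and exits non-zero when nothing matches.
    while pgrep -x ceph-osd >/dev/null 2>&1; do
        if [ "$waited" -ge "$timeout_s" ]; then
            echo "ceph-osd still running after ${timeout_s}s" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

A possible use is to call wait_for_osds_stopped right after killall ceph-osd and to stop the monitor only once it returns successfully.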
Stop the Monitor on this node. To get the <UNIQUE ID> you can use the tab completion.
systemctl stop ceph-mon.<MON-ID>.<UNIQUE ID>.service
Set permission
Since Infernalis, Ceph runs its daemons as the user 'ceph' instead of root. This improves security but requires changing the ownership of some directories.
chown ceph: -R /var/lib/ceph/
chown :root -R /var/log/ceph/
The log directory keeps root as its group so that root can still rotate the logs.
The following commands must be executed for every OSD on this node. <OSD-ID> is the number of the OSD on this node; you can get it from the GUI or from "ceph osd tree".
readlink -f /var/lib/ceph/osd/ceph-<OSD-ID>/journal
chown ceph: <output of the command before>
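With several OSDs on a node, the journal lookup can be done in a loop. This is a minimal sketch that assumes the OSD directories follow the /var/lib/ceph/osd/ceph-<OSD-ID> layout used above; the helper name osd_journal_targets is hypothetical.

```shell
# Sketch: print the resolved journal target of every OSD below a base directory.
# osd_journal_targets is a hypothetical helper name.
osd_journal_targets() {
    base="${1:-/var/lib/ceph/osd}"
    for osd_dir in "$base"/ceph-*; do
        # Skip the unexpanded glob and OSD directories without a journal.
        [ -e "$osd_dir/journal" ] || continue
        # Resolve a symlinked journal to the real file or device.
        readlink -f "$osd_dir/journal"
    done
}

# Usage on a node (run as root):
#   osd_journal_targets | while read -r target; do chown ceph: "$target"; done
```

The helper only prints paths, so the list can be reviewed before the chown is applied.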
Start the daemon
To ensure that Ceph starts up in the correct order, perform the following steps.
cp /usr/share/doc/pve-manager/examples/ceph.service /etc/systemd/system/ceph.service
systemctl daemon-reload
systemctl enable ceph.service
The first daemon to start is the monitor. From now on we use systemd.
systemctl start ceph-mon@<MON-ID>.service
systemctl enable ceph-mon@<MON-ID>.service
Then start all OSDs on this node.
systemctl start ceph-osd@<OSD-ID>.service
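To avoid typing the start command once per OSD, the IDs can be derived from the OSD directory names. This is a minimal sketch assuming the ceph-<OSD-ID> directory naming shown earlier; the helper name local_osd_ids is hypothetical.

```shell
# Sketch: list the OSD IDs found on this node from the directory names ceph-<OSD-ID>.
# local_osd_ids is a hypothetical helper name.
local_osd_ids() {
    base="${1:-/var/lib/ceph/osd}"
    for osd_dir in "$base"/ceph-*; do
        # Skip the unexpanded glob when no OSD directory exists.
        [ -d "$osd_dir" ] || continue
        # Strip everything up to and including "ceph-" to leave the ID.
        echo "${osd_dir##*/ceph-}"
    done
}

# Usage on a node:
#   for id in $(local_osd_ids); do systemctl start "ceph-osd@${id}.service"; done
```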
After the daemons on your node have started successfully, unset the 'noout' flag, either in the GUI or with this command:
ceph osd unset noout
Now check whether your Ceph cluster is healthy.
ceph -s
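Recovery after restarting the daemons can take a while, so the health check may need to be repeated. The following is a minimal sketch that polls "ceph health"; the helper name wait_for_health_ok, the attempt count, and the interval are assumptions. A cluster that stays in HEALTH_WARN for another reason makes the helper time out, so read the printed status.

```shell
# Sketch: poll "ceph health" until it reports HEALTH_OK or the attempts run out.
# wait_for_health_ok and its default values are assumptions for illustration.
wait_for_health_ok() {
    attempts="${1:-30}"
    interval="${2:-10}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        status=$(ceph health 2>/dev/null)
        case "$status" in
            HEALTH_OK*) echo "$status"; return 0 ;;
        esac
        # Report the current state on stderr so the reason for waiting is visible.
        echo "attempt $((i + 1)): ${status:-no answer from cluster}" >&2
        i=$((i + 1))
        sleep "$interval"
    done
    return 1
}
```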
The last step is to purge the package ceph. It does no harm if this package stays installed, but you will then get a daily error message from logrotate.
apt-get purge ceph
apt-get install ceph-mon ceph-osd libleveldb1 libsnappy1
If the cluster is healthy, you can continue.
Upgrade all nodes
Now repeat these steps on the other nodes in your cluster until all nodes are upgraded. Start over with Preparation.
Set the tunables
You will now get a warning that your Ceph cluster has 'legacy tunables'.
ceph osd crush tunables hammer
ceph osd set require_jewel_osds
To get rid of this warning, read the following link carefully. If you decide to set the tunables, read the text below before you apply the change with the commands above.
Ceph warning when tunables are non optimal
If you set the tunables while your Ceph cluster is in use, do it when the cluster has the least load. Also try to use the backfill option to slow this process down.
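One possible reading of "the backfill option" is to lower the backfill and recovery concurrency on all OSDs at runtime before changing the tunables. This is a minimal sketch under that assumption; the helper name throttle_backfill and the values of 1 are illustrative, not a recommendation from this HOWTO. Settings applied with injectargs last only until the OSD daemons restart.

```shell
# Sketch: lower backfill and recovery concurrency on all OSDs before changing tunables.
# throttle_backfill is a hypothetical helper; the default values of 1 are an assumption.
throttle_backfill() {
    max_backfills="${1:-1}"
    recovery_active="${2:-1}"
    # injectargs changes the running daemons only; nothing is written to ceph.conf.
    ceph tell 'osd.*' injectargs \
        "--osd-max-backfills ${max_backfills} --osd-recovery-max-active ${recovery_active}"
}
```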
Important: Do not set the tunables to optimal, because krbd will then stop working. Set them to hammer if you want to use krbd (needed for containers).