Ceph Hammer to Jewel

From Proxmox VE
Revision as of 09:51, 5 January 2017 by Martin (talk | contribs) (page created)

Introduction

This HOWTO explains the upgrade from Ceph Hammer to Jewel (10.2.5 or higher). We strongly recommend that you upgrade the cluster node by node.

Assumption

In this HOWTO we assume that all nodes run the latest Proxmox VE 4.4 or higher and that Ceph is on version Hammer.

Preparation

Change the current Ceph repositories from Hammer to Jewel.

sed -i 's/hammer/jewel/' /etc/apt/sources.list.d/ceph.list
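Before editing the file in place, it can be useful to see which lines the substitution would change. The following is a minimal sketch; the function name `preview_repo_switch` is hypothetical and not part of any Proxmox VE tool. It runs the same substitution without `-i`, so the file itself stays untouched.

```shell
# Hypothetical helper: print only the lines that the hammer -> jewel
# substitution would change, without modifying the file.
preview_repo_switch() {
    list_file="${1:-/etc/apt/sources.list.d/ceph.list}"
    # -n suppresses normal output; the trailing "p" prints changed lines only.
    sed -n 's/hammer/jewel/p' "$list_file"
}
```

Calling `preview_repo_switch` without an argument reads the default path used above. If it prints nothing, the file contains no "hammer" entry to replace.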

For more information, see Ceph Packages.

Upgrade

Upgrade the node with the following commands.

apt-get update && apt-get dist-upgrade
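After the upgrade it may be worth confirming that the node now has a Jewel binary installed. A minimal sketch, assuming the usual `ceph version X.Y.Z (...)` output format of `ceph --version`; the function name `installed_is_jewel` is hypothetical. Note that this checks the installed binary only, not the version of daemons that are still running.

```shell
# Hypothetical helper: succeeds only if the installed ceph binary
# reports a 10.2.x (Jewel) version string.
installed_is_jewel() {
    case "$(ceph --version)" in
        "ceph version 10.2."*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `installed_is_jewel && echo "Jewel packages installed"` prints the message only when the check passes.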

This upgrades the packages from all repositories configured on your node.

Stop daemons

To prevent rebalancing, set the noout flag on the OSDs. This can be done in the GUI on the OSD tab or with this command:

ceph osd set noout
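To confirm that the flag is active before stopping any daemon, the cluster flags can be read back from `ceph osd dump`. A minimal sketch; the function name `noout_is_set` is hypothetical, and it assumes the dump contains a line beginning with `flags` that lists the active flags.

```shell
# Hypothetical helper: succeeds if "noout" appears in the cluster flags.
noout_is_set() {
    # Keep only the "flags ..." line, then look for noout as a whole word.
    ceph osd dump | grep '^flags' | grep -qw noout
}
```

The same check can be reused later: after the flag has been removed again, `noout_is_set` should fail.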

Kill all OSDs on this node. This can only be done on the command line.

killall ceph-osd
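The OSD processes may need a moment to shut down. The sketch below waits until no `ceph-osd` process is left before you continue; the function name `wait_for_osd_exit` and its two parameters (number of tries, seconds between tries) are hypothetical.

```shell
# Hypothetical helper: poll until no ceph-osd process remains.
# Returns 1 if processes are still present after the given number of tries.
wait_for_osd_exit() {
    tries="${1:-30}"
    interval="${2:-1}"
    # pgrep -x matches the exact process name and succeeds while a match exists.
    while pgrep -x ceph-osd >/dev/null; do
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] || return 1
        sleep "$interval"
    done
    return 0
}
```

A call such as `wait_for_osd_exit 60 1 || echo "ceph-osd still running"` gives the daemons about a minute to exit.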

Stop the Monitor on this node. You can use tab completion to get the <UNIQUE ID>.

systemctl stop ceph-mon.<MON-ID>.<UNIQUE ID>.service

Set permission

Since Infernalis, Ceph runs its daemons as the user 'ceph' instead of root. This improves security but requires changing the permissions on some directories.

chown ceph: -R /var/lib/ceph/
chown :root -R /var/log/ceph/

root must still have access to the log directory in order to rotate the logs.

The following commands must be executed for every OSD on this node. <OSD-ID> is the number of the OSD on this node; you can get it from the GUI or from "ceph osd tree".

readlink -f /var/lib/ceph/osd/ceph-<OSD-ID>/journal
chown ceph: <output of the command before>
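On a node with several OSDs, the two commands above can be combined into a loop. A minimal sketch, assuming the OSD directories follow the `/var/lib/ceph/osd/ceph-<OSD-ID>` layout shown above; the function name `list_osd_journals` is hypothetical. It only prints the resolved journal targets so that they can be reviewed before any ownership is changed.

```shell
# Hypothetical helper: print the resolved journal target of every OSD
# directory found under the given base path.
list_osd_journals() {
    osd_base="${1:-/var/lib/ceph/osd}"
    for journal in "$osd_base"/ceph-*/journal; do
        # Skip OSD directories without a journal entry (and the
        # unexpanded pattern if no directory matches at all).
        [ -e "$journal" ] || continue
        readlink -f "$journal"
    done
}

# Usage (as root), after reviewing the printed list:
# list_osd_journals | while IFS= read -r target; do chown ceph: "$target"; done
```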

Start the daemon

To ensure that Ceph starts up in the correct order, perform the following steps.

cp /usr/share/doc/pve-manager/examples/ceph.service /etc/systemd/system/ceph.service
systemctl daemon-reload
systemctl enable ceph.service

The first daemon to start is the Monitor. From now on it is managed with systemd.

systemctl start ceph-mon@<MON-ID>.service
systemctl enable ceph-mon@<MON-ID>.service

Then start all OSDs on this node.

systemctl start ceph-osd@<OSD-ID>.service
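Instead of typing each OSD-ID by hand, the IDs can be derived from the OSD directories on the node. A minimal sketch, assuming the `/var/lib/ceph/osd/ceph-<OSD-ID>` directory layout used earlier in this HOWTO; the function name `list_local_osd_ids` is hypothetical.

```shell
# Hypothetical helper: print the ID of every OSD directory on this node.
list_local_osd_ids() {
    osd_base="${1:-/var/lib/ceph/osd}"
    for dir in "$osd_base"/ceph-*; do
        [ -d "$dir" ] || continue
        # Strip everything up to and including the last "/ceph-".
        echo "${dir##*/ceph-}"
    done
}

# Usage (as root):
# for id in $(list_local_osd_ids); do systemctl start "ceph-osd@$id.service"; done
```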

After the daemons on your node have started successfully, remove the 'noout' flag, either in the GUI or with this command:

ceph osd unset noout

Now check whether your Ceph cluster is healthy.

ceph -s
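Recovery after restarting the daemons can take a while, so the check may need to be repeated. The sketch below polls `ceph health` until it reports `HEALTH_OK` or a retry limit is reached; the function name `wait_for_health` and its parameters (number of tries, seconds between tries) are hypothetical. It waits only for `HEALTH_OK`, so if a warning persists for another reason, inspect the output of `ceph -s` manually.

```shell
# Hypothetical helper: poll the cluster health until it is HEALTH_OK.
# Returns 1 if HEALTH_OK was not reached within the given number of tries.
wait_for_health() {
    tries="${1:-30}"
    interval="${2:-10}"
    while [ "$tries" -gt 0 ]; do
        status=$(ceph health)
        case "$status" in
            HEALTH_OK*) echo "$status"; return 0 ;;
        esac
        echo "waiting: $status"
        tries=$((tries - 1))
        sleep "$interval"
    done
    return 1
}
```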

The last step is to purge the ceph package. Leaving the package installed does no harm, but logrotate will then report an error message every day.

apt-get purge ceph
apt-get install ceph-mon ceph-osd libleveldb1 libsnappy1

If the cluster is healthy, you can continue with the next node.

Upgrade all nodes

Now repeat these steps on the other nodes in your cluster until all nodes are upgraded. Start over with Preparation.

Set the tunables

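Once all nodes are upgraded, the cluster warns that it has 'legacy tunables'. The commands below set the hammer tunables profile and the require_jewel_osds flag; read the notes that follow before you run them.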

ceph osd crush tunables hammer
ceph osd set require_jewel_osds

To remove this warning, read the following link carefully. If you decide to set the tunables, read the text below before you apply the change.

Ceph warning when tunables are non optimal

If you set the tunables while your Ceph cluster is in use, do so when the cluster load is lowest. Also consider using the backfill options to slow the process down.

Ceph Backfill
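One way to slow backfilling down is to lower `osd_max_backfills` on all running OSDs at runtime. This is a sketch under assumptions, not a recommendation for specific values: the function name `throttle_backfill` is hypothetical, and the value 1 is only an illustrative default. Settings injected this way apply to the running daemons and are not kept across a daemon restart.

```shell
# Hypothetical helper: limit concurrent backfills on every running OSD.
throttle_backfill() {
    max_backfills="${1:-1}"
    # Quote 'osd.*' so the shell does not expand it as a filename pattern.
    ceph tell 'osd.*' injectargs "--osd-max-backfills $max_backfills"
}
```

Run `throttle_backfill 1` before changing the tunables, and raise the value again once the cluster has finished rebalancing.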

Important: Do not set the tunables to optimal, because krbd will then not work. Set them to hammer if you want to use krbd (needed for containers).