[PVE-User] Boot disk corruption after Ceph OSD destroy with cleanup

Fri Mar 22 08:35:58 CET 2019

On Thu, Mar 21, 2019 at 03:58:53PM +0100, Eneko Lacunza wrote:
> Hi all,
> 
> We have removed an OSD disk from a server in our office cluster, removing
> partitions (with --cleanup 1) and that has made the server unable to boot
> (we have seen this in 2 servers in a row...)
> 
> Looking at the command output:
> 
> --- cut ---
> root@sanmarko:~# pveceph osd destroy 5 --cleanup 1
> destroy OSD osd.5
> Remove osd.5 from the CRUSH map
> Remove the osd.5 authentication key.
> Remove OSD osd.5
> Unmount OSD osd.5 from  /var/lib/ceph/osd/ceph-5
> remove partition /dev/sda1 (disk '/dev/sda', partnum 1)
> The operation has completed successfully.
> remove partition /dev/sdd7 (disk '/dev/sdd', partnum 7)
> Warning: The kernel is still using the old partition table.
> The new table will be used at the next reboot or after you
> run partprobe(8) or kpartx(8)
> The operation has completed successfully.
> wipe disk: /dev/sda
> 200+0 records in
> 200+0 records out
> 209715200 bytes (210 MB, 200 MiB) copied, 1.29266 s, 162 MB/s
> wipe disk: /dev/sdd
> 200+0 records in
> 200+0 records out
> 209715200 bytes (210 MB, 200 MiB) copied, 1.00753 s, 208 MB/s
> --- cut ---
> 
> The boot disk is an SSD; note that the script says it is wiping /dev/sdd!!
> Shouldn't it do that only to the journal partition (/dev/sdd7)?
> 
> This cluster is on PVE 5.3.
Could you please update? I suppose you don't have pve-manager version
5.3-10 or newer installed yet; the issue has been fixed there.

If you do have it and the issue still persists, please post the output
of 'pveversion -v'.
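
As a rough sketch of how the installed version could be checked on the
node before running the destroy again: the helper name version_at_least
is hypothetical, and GNU 'sort -V' only approximates Debian version
ordering (on a real node 'dpkg --compare-versions' is the exact tool).
The package query and upgrade commands are left as comments so the
sketch has no side effects.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if version $1 is greater than or equal to $2.
# Relies on GNU sort -V, which approximates Debian version ordering.
version_at_least() {
    have=$1
    want=$2
    # The lowest of the two versions must be the wanted one.
    lowest=$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)
    [ "$lowest" = "$want" ]
}

# Example use on a PVE node (assumes the package is named pve-manager):
#   installed=$(dpkg-query -W -f='${Version}' pve-manager)
#   if version_at_least "$installed" 5.3-10; then
#       echo "pve-manager $installed is 5.3-10 or newer"
#   else
#       echo "pve-manager $installed is older than 5.3-10"
#       echo "update with: apt update && apt dist-upgrade"
#   fi
```

If the check reports an older version, updating the node first and only
then destroying further OSDs with --cleanup would be the safer order.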

Thanks.

--
Cheers,
Alwin