[PVE-User] Boot disk corruption after Ceph OSD destroy with cleanup

Eneko Lacunza elacunza at binovo.es
Fri Mar 22 09:03:22 CET 2019


Hi Alwin,

On 22/3/19 at 8:35, Alwin Antreich wrote:
> On Thu, Mar 21, 2019 at 03:58:53PM +0100, Eneko Lacunza wrote:
>> We have removed an OSD disk from a server in our office cluster, removing
>> partitions (with --cleanup 1) and that has made the server unable to boot
>> (we have seen this in 2 servers in a row...)
>>
>> Looking at the command output:
>>
>> --- cut ---
>> root@sanmarko:~# pveceph osd destroy 5 --cleanup 1
>> destroy OSD osd.5
>> Remove osd.5 from the CRUSH map
>> Remove the osd.5 authentication key.
>> Remove OSD osd.5
>> Unmount OSD osd.5 from  /var/lib/ceph/osd/ceph-5
>> remove partition /dev/sda1 (disk '/dev/sda', partnum 1)
>> The operation has completed successfully.
>> remove partition /dev/sdd7 (disk '/dev/sdd', partnum 7)
>> Warning: The kernel is still using the old partition table.
>> The new table will be used at the next reboot or after you
>> run partprobe(8) or kpartx(8)
>> The operation has completed successfully.
>> wipe disk: /dev/sda
>> 200+0 records in
>> 200+0 records out
>> 209715200 bytes (210 MB, 200 MiB) copied, 1.29266 s, 162 MB/s
>> wipe disk: /dev/sdd
>> 200+0 records in
>> 200+0 records out
>> 209715200 bytes (210 MB, 200 MiB) copied, 1.00753 s, 208 MB/s
>> --- cut ---
>>
>> The boot disk is an SSD; note that the script says it is wiping /dev/sdd!!
>> Shouldn't it do that only to the journal partition (/dev/sdd7)?
>>
>> This cluster is on PVE 5.3 .
> Can you please update? I suppose you don't have pve-manager version
> 5.3-10 or newer installed yet; the issue has been fixed there.
>
> If you do and the issue still persists, then please post the output of
> 'pveversion -v'.
Seems both servers were on 5.3-8, thanks for the hint.
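
For anyone hitting the same thing, here is a minimal sketch of a check to run
before using 'pveceph osd destroy ... --cleanup 1'. It compares the installed
pve-manager version against 5.3-10, the version Alwin names as containing the
fix. The helper name version_ge is made up for illustration, and it assumes
GNU 'sort -V' orders these version strings the way dpkg would (that holds for
5.3-8 vs 5.3-10, but not for every Debian version string); on a real node
'dpkg --compare-versions' is the more faithful comparison.

```shell
# Hypothetical helper: succeeds if version $1 >= version $2.
# Relies on GNU sort -V, which only approximates Debian version ordering.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Version containing the fix, as stated in Alwin's reply.
fixed="5.3-10"

# Installed pve-manager version; left empty when dpkg-query or the
# package is not available on this machine.
installed="$(dpkg-query -W -f='${Version}' pve-manager 2>/dev/null || true)"

if [ -z "$installed" ]; then
    echo "pve-manager not found; cannot check"
elif version_ge "$installed" "$fixed"; then
    echo "pve-manager $installed: includes the fix"
else
    echo "pve-manager $installed: older than $fixed, update before using --cleanup"
fi
```

With our servers on 5.3-8, this would have taken the last branch.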

Maybe it would be helpful if you could publish release notes for each
package push made to pve-enterprise/pve-no-subscription (perhaps by
capturing the changed packages' changelogs?), so that this kind of grave
(even if corner-case) problem is better communicated when the fix isn't
first released in a point release.

Thanks a lot
Eneko

-- 
Technical Director
Binovo IT Human Project, S.L.
Telf. 943569206
Astigarraga bidea 2, 2º izq. oficina 11; 20180 Oiartzun (Gipuzkoa)
www.binovo.es



