[PVE-User] Some errors in Ceph - PVE6

Eneko Lacunza elacunza at binovo.es
Mon Mar 30 09:09:18 CEST 2020


Hi Gilberto,

Generally, when Ceph is rebalancing or doing similar background work, you 
have to wait until it finishes. Some operations can take hours.
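
For example, you can follow the progress with the standard Ceph status 
commands (nothing cluster-specific assumed here):

    ceph -s    # one-shot status: degraded objects, recovery rate, progress
    ceph -w    # prints the status once, then follows cluster events live

Once all PGs are back to active+clean, the rebalance is done.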

Also, try not to change Ceph parameters without being sure and without 
first researching the documentation and mailing lists. This is a new 
cluster and you have already done things most Ceph users won't do until 
some years after the initial setup :)

I suggest that the next time you have to remove OSD disks from all 
servers, you do it as follows (see the command sketch after the link 
below):

1. Out the OSDs (one by one, for minimum impact).
2. Wait for the rebalancing to finish.
3. Down + remove the OSDs.

https://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/#removing-osds-manual
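
Roughly, per OSD that manual procedure boils down to this (just a sketch; 
{id} stands for the OSD number, and the stop command must be run on the 
node that hosts that OSD):

    ceph osd out osd.{id}                            # 1. mark out, rebalancing starts
    ceph -s                                          # 2. wait until all PGs are active+clean again
    systemctl stop ceph-osd@{id}                     # 3. stop the daemon on its node
    ceph osd purge osd.{id} --yes-i-really-mean-it   # 3. remove it from CRUSH, auth and the OSD map

Then move on to the next OSD, so the data always keeps enough copies 
during the process.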

Cheers
Eneko

On 29/3/20 at 16:46, Gilberto Nunes wrote:
> Hi guys
>
> I have installed Proxmox 6 and activated 3 servers with PVE 6 and Ceph.
> In each of these 3 servers, I have 4 HDDs:
> 3x 500 GB SAS
> 1x 2 TB SAS
> However, we need to remove these 3 500 GB disks from each server... So I did
> out and stop on them, and I am waiting for the rebalance, but it is taking
> too long...
> I get this message:
> Reduced data availability: 2 pgs inactive, 2 pgs down
> pg 1.3a is down, acting [11,9,10]
> pg 1.23a is down, acting [11,9,10]
> (OSDs 11, 9 and 10 are the 2 TB SAS HDDs)
> And
> too many PGs per OSD (571 > max 250)
> I already tried to decrease the number of PGs to 256:
> ceph osd pool set VMS pg_num 256
> but it seems to have had no effect at all:
> ceph osd pool get VMS pg_num
> pg_num: 571
>
> Now, the situation is this:
>
> ceph -s
>   cluster:
>     id:     93c55c6b-ce64-4e1a-92bc-0bc529d695f2
>     health: HEALTH_WARN
>             Reduced data availability: 2 pgs inactive, 2 pgs down
>             Degraded data redundancy: 6913/836472 objects degraded (0.826%),
> 18 pgs degraded, 19 pgs undersized
>             too many PGs per OSD (571 > max 250)
>
>   services:
>     mon: 5 daemons, quorum pve3,pve4,pve5,pve7,pve6 (age 51m)
>     mgr: pve3(active, since 39h), standbys: pve5, pve7, pve6, pve4
>     osd: 12 osds: 3 up (since 16m), 3 in (since 16m); 19 remapped pgs
>
>   data:
>     pools:   1 pools, 571 pgs
>     objects: 278.82k objects, 1.1 TiB
>     usage:   2.9 TiB used, 2.5 TiB / 5.5 TiB avail
>     pgs:     0.350% pgs not active
>              6913/836472 objects degraded (0.826%)
>              550 active+clean
>              17  active+undersized+degraded+remapped+backfill_wait
>              2   down
>              1   active+undersized+degraded+remapped+backfilling
>              1   active+undersized+remapped+backfill_wait
>
>   io:
>     client:   15 KiB/s rd, 1.0 MiB/s wr, 3 op/s rd, 102 op/s wr
>     recovery: 15 MiB/s, 3 objects/s
>
>   progress:
>     Rebalancing after osd.2 marked out
>       [=============================.]
>     Rebalancing after osd.7 marked out
>       [============================..]
>     Rebalancing after osd.6 marked out
>       [==========....................]
>
>
> Do I need to do something, or just let Ceph do this work?
>
> Thanks a lot!
>
> Cheers
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user


-- 
Technical Director
Binovo IT Human Project, S.L.
Telf. 943569206
Astigarragako bidea 2, 2º izq. oficina 11; 20180 Oiartzun (Gipuzkoa)
www.binovo.es



