[PVE-User] Ceph - PG expected clone missing

Karsten Becker karsten.becker at ecologic.eu
Mon Feb 19 18:54:38 CET 2018


Hmm, a little bit:


> 2018-02-19 14:29:50.309181 7fc3e82ef700  1 osd.29 pg_epoch: 48372 pg[10.7b9( v 48371'1976510 (48031'1975009,48371'1976510] local-lis/les=48362/48363 n=1816 ec=36999/1069 lis/c 48362/48362 les/c/f 48363/48371/12767 48361/48372/48372) [29,10,22] r=0 lpr=48372 pi=[48362,48372)/1 luod=0'0 crt=48371'1976510 mlcod 0'0 active] start_peering_interval up [29,10,22] -> [29,10,22], acting [10,22,32] -> [29,10,22], acting_primary 10 -> 29, up_primary 29 -> 29, role -1 -> 0, features acting 2305244844532236283 upacting 2305244844532236283
> 2018-02-19 14:29:50.309317 7fc3e82ef700  1 osd.29 pg_epoch: 48372 pg[10.7b9( v 48371'1976510 (48031'1975009,48371'1976510] local-lis/les=48362/48363 n=1816 ec=36999/1069 lis/c 48362/48362 les/c/f 48363/48371/12767 48361/48372/48372) [29,10,22] r=0 lpr=48372 pi=[48362,48372)/1 crt=48371'1976510 mlcod 0'0 inconsistent] state<Start>: transitioning to Primary
> 2018-02-19 14:30:34.445237 7fc3e6aec700  0 log_channel(cluster) log [DBG] : 10.7b9 repair starts
> 2018-02-19 14:31:07.147350 7fc3e6aec700 -1 osd.29 pg_epoch: 48373 pg[10.7b9( v 48373'1976520 (48031'1975009,48373'1976520] local-lis/les=48372/48373 n=1816 ec=36999/1069 lis/c 48372/48372 les/c/f 48373/48373/12767 48361/48372/48372) [29,10,22] r=0 lpr=48372 crt=48373'1976520 lcod 48373'1976519 mlcod 48373'1976519 active+clean+scrubbing+deep+inconsistent+repair] _scan_snaps no head for 10:9deb7da1:::rbd_data.966489238e1f29.0000000000004619:18 (have MIN)
> 2018-02-19 14:31:23.281765 7fc3e6aec700 -1 log_channel(cluster) log [ERR] : repair 10.7b9 10:9defb021:::rbd_data.2313975238e1f29.000000000002cbb5:head expected clone 10:9defb021:::rbd_data.2313975238e1f29.000000000002cbb5:64e 1 missing
> 2018-02-19 14:31:23.281780 7fc3e6aec700  0 log_channel(cluster) log [INF] : repair 10.7b9 10:9defb021:::rbd_data.2313975238e1f29.000000000002cbb5:head 1 missing clone(s)
> 2018-02-19 14:32:05.166585 7fc3e6aec700 -1 log_channel(cluster) log [ERR] : 10.7b9 repair 1 errors, 0 fixed
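
To find out which RBD image and snapshot the error refers to, the "expected clone ... missing" line can be taken apart mechanically. What follows is only a minimal Python sketch: the names MissingClone and parse_missing_clone are made up for illustration, and it assumes that the object ID is printed as pool:hash:namespace:key:name:snap, with the object index and the trailing snapshot ID in hexadecimal.

```python
import re
from typing import NamedTuple, Optional


class MissingClone(NamedTuple):
    """Fields pulled out of one 'expected clone ... missing' line (hypothetical type)."""
    pgid: str
    pool: int
    object_name: str
    image_prefix: str
    object_index: int
    snap_id: int


# Assumed layout of the object ID: pool:hash:namespace:key:name:snap
_PATTERN = re.compile(
    r"repair (?P<pgid>\S+) "
    r"(?P<pool>\d+):[0-9a-f]+:[^:]*:[^:]*:(?P<name>[^:]+):head "
    r"expected clone \S+:(?P<snap>[0-9a-f]+) \d+ missing"
)


def parse_missing_clone(line: str) -> Optional[MissingClone]:
    """Return the parsed fields, or None if the line is not a missing-clone error."""
    m = _PATTERN.search(line)
    if m is None:
        return None
    name = m.group("name")
    # rbd_data.<image id>.<object index in hex>
    prefix, _, index_hex = name.rpartition(".")
    return MissingClone(
        pgid=m.group("pgid"),
        pool=int(m.group("pool")),
        object_name=name,
        image_prefix=prefix,
        object_index=int(index_hex, 16),
        # Assumption: the snapshot ID after the last colon is hexadecimal.
        snap_id=int(m.group("snap"), 16),
    )
```

For the error line above, this gives the image prefix rbd_data.2313975238e1f29 and, under the hexadecimal assumption, snapshot ID 0x64e = 1614. The prefix can be compared with the block_name_prefix that "rbd info" prints for each image in the pool.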


This part should be the additional info that may help:

> c/f 48373/48373/12767 48361/48372/48372) [29,10,22] r=0 lpr=48372 crt=48373'1976520 lcod 48373'1976519 mlcod 48373'1976519 active+clean+scrubbing+deep+inconsistent+repair] _scan_snaps no head
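
The PG state and the OSD set in such a line are easier to compare across log entries once they are split into fields. Here is a small Python sketch; the helper names pg_state_flags and osd_set_and_role are hypothetical, and it assumes that the state is the "+"-joined token directly before the closing "]" and that the OSD list is the bracketed list directly before "r=".

```python
import re
from typing import List, Optional, Tuple

# Assumption: the PG state is the last token inside pg[...], e.g. "active+clean".
_STATE_RE = re.compile(r"\s([a-z_]+(?:\+[a-z_]+)*)\]")
# Assumption: the OSD set is printed as "[a,b,c]" right before "r=<role>".
_SET_RE = re.compile(r"\[(\d+(?:,\d+)*)\] r=(-?\d+)")


def pg_state_flags(line: str) -> List[str]:
    """Split the PG state into its individual flags; empty list if none is found."""
    m = _STATE_RE.search(line)
    return m.group(1).split("+") if m else []


def osd_set_and_role(line: str) -> Optional[Tuple[List[int], int]]:
    """Return (list of OSD IDs, role of the logging OSD), or None if absent."""
    m = _SET_RE.search(line)
    if m is None:
        return None
    return [int(x) for x in m.group(1).split(",")], int(m.group(2))
```

Applied to the excerpt above, the flags come out as active, clean, scrubbing, deep, inconsistent and repair, with OSDs 29, 10 and 22 and role 0 for osd.29.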


During the night, an automated qm snapshot of a Windows Server VM seems to
have failed. But it's suboptimal if this crashes Ceph in this way...


Best
Karsten





On 19.02.2018 16:01, Alwin Antreich wrote:
> Hi Karsten,
> 
> On Mon, Feb 19, 2018 at 02:36:41PM +0100, Karsten Becker wrote:
>> Hi,
>>
>> I have one damaged PG in my Ceph cluster. All OSDs are BlueStore. How do I
>> fix this?
>>
>>
>>> 2018-02-19 14:30:24.371058 mon.0 [ERR] overall HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg inconsistent
>>> 2018-02-19 14:30:37.733236 mon.0 [ERR] Health check update: Possible data damage: 1 pg inconsistent, 1 pg repair (PG_DAMAGED)
>>> 2018-02-19 14:31:24.371286 mon.0 [ERR] overall HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg inconsistent, 1 pg repair
>>> 2018-02-19 14:31:23.281772 osd.29 [ERR] repair 10.7b9 10:9defb021:::rbd_data.2313975238e1f29.000000000002cbb5:head expected clone 10:9defb021:::rbd_data.2313975238e1f29.000000000002cbb5:64e 1 missing
>>> 2018-02-19 14:31:23.281784 osd.29 [INF] repair 10.7b9 10:9defb021:::rbd_data.2313975238e1f29.000000000002cbb5:head 1 missing clone(s)
>>> 2018-02-19 14:32:05.166591 osd.29 [ERR] 10.7b9 repair 1 errors, 0 fixed
>>> 2018-02-19 14:32:05.580906 mon.0 [ERR] Health check update: Possible data damage: 1 pg inconsistent (PG_DAMAGED)
>>
>>
>> "ceph pg repair 10.7b9" fails and is not able to fix it. A manually
>> started scrub, "ceph pg scrub 10.7b9", does not fix it either.
>>
>> size=3 min_size=2... if it's of interest.
>>
>> Any help appreciated.
>>
>> Best from Berlin/Germany
>> Karsten
>>
> Check your osd.29, the disk may be faulty.
> 
> Can you see more in the log of the osd.29?
> 
> --
> Cheers,
> Alwin


Ecologic Institut gemeinnuetzige GmbH
Pfalzburger Str. 43/44, D-10717 Berlin
Geschaeftsfuehrerin / Director: Dr. Camilla Bausch
Sitz der Gesellschaft / Registered Office: Berlin (Germany)
Registergericht / Court of Registration: Amtsgericht Berlin (Charlottenburg), HRB 57947


