[PVE-User] Ceph and cold bootstrap...

Marco Gaiarin gaio at sv.lnf.it
Fri Sep 16 10:53:21 CEST 2016


I'm testing some error conditions on my test Ceph storage cluster.

Today I booted it up (a cold boot; it had been off since yesterday).

The log says:

 2016-09-16 09:38:38.015517 mon.0 10.27.251.7:6789/0 59 : cluster [INF] mon.0 calling new monitor election
 2016-09-16 09:38:38.078034 mon.1 10.27.251.11:6789/0 59 : cluster [INF] mon.1 calling new monitor election
 2016-09-16 09:38:38.551916 mon.1 10.27.251.11:6789/0 62 : cluster [WRN] message from mon.0 was stamped 0.125730s in the future, clocks not synchronized
 2016-09-16 09:38:38.555582 mon.0 10.27.251.7:6789/0 62 : cluster [INF] mon.0 at 0 won leader election with quorum 0,1
 2016-09-16 09:38:38.583642 mon.0 10.27.251.7:6789/0 63 : cluster [INF] HEALTH_OK
 2016-09-16 09:38:38.641007 mon.0 10.27.251.7:6789/0 64 : cluster [WRN] mon.1 10.27.251.11:6789/0 clock skew 0.14422s > max 0.05s
 2016-09-16 09:38:38.677682 mon.0 10.27.251.7:6789/0 65 : cluster [INF] monmap e2: 2 mons at {0=10.27.251.7:6789/0,1=10.27.251.11:6789/0}
 2016-09-16 09:38:38.679335 mon.0 10.27.251.7:6789/0 66 : cluster [INF] pgmap v4383: 192 pgs: 192 active+clean; 7582 MB data, 14938 MB used, 1837 GB / 1852 GB avail
 2016-09-16 09:38:38.679402 mon.0 10.27.251.7:6789/0 67 : cluster [INF] mdsmap e1: 0/0/0 up
 2016-09-16 09:38:38.679475 mon.0 10.27.251.7:6789/0 68 : cluster [INF] osdmap e20: 2 osds: 1 up, 2 in
 2016-09-16 09:38:38.797378 mon.0 10.27.251.7:6789/0 69 : cluster [INF] pgmap v4384: 192 pgs: 102 stale+active+clean, 90 active+clean; 7582 MB data, 14938 MB used, 1837 GB / 1852 GB avail
 2016-09-16 09:38:44.976103 mon.1 10.27.251.11:6789/0 65 : cluster [WRN] message from mon.0 was stamped 0.074651s in the future, clocks not synchronized
 2016-09-16 09:39:38.583757 mon.0 10.27.251.7:6789/0 72 : cluster [INF] HEALTH_WARN; 102 pgs stale; 1/2 in osds are down
 2016-09-16 09:43:42.025626 mon.0 10.27.251.7:6789/0 73 : cluster [INF] osd.1 out (down for 303.348204)
 2016-09-16 09:43:42.064945 mon.0 10.27.251.7:6789/0 74 : cluster [INF] osdmap e21: 2 osds: 1 up, 1 in
 2016-09-16 09:43:42.164823 mon.0 10.27.251.7:6789/0 75 : cluster [INF] pgmap v4385: 192 pgs: 102 stale+active+clean, 90 active+clean; 7582 MB data, 7484 MB used, 918 GB / 926 GB avail
 2016-09-16 09:44:38.584772 mon.0 10.27.251.7:6789/0 83 : cluster [INF] HEALTH_WARN; 102 pgs stale; 102 pgs stuck stale; too many PGs per OSD (384 > max 300)
 2016-09-16 09:45:32.454684 mon.0 10.27.251.7:6789/0 105 : cluster [WRN] message from mon.1 was stamped 0.050064s in the future, clocks not synchronized

My test VM, which is set to start automatically, appeared to be in 'limbo'
(Proxmox said it was started, but it was not).
One of the two OSDs was in the down/out state.


I waited a while for the clocks to synchronize, then started the OSD that
was down; after that I was able to start the VM.
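
A quick way to see whether the cluster has settled after boot is to poll the
status in JSON form. Just a sketch, assuming the ceph CLI is available on the
node and the JSON layout of this (pre-Luminous) release; field names changed
in later releases:

 # Print overall health and how many OSDs are up/in, via 'ceph status'.
 import json, subprocess

 status = json.loads(subprocess.check_output(
     ["ceph", "status", "--format", "json"]))
 print(status["health"]["overall_status"])
 osdmap = status["osdmap"]["osdmap"]
 print("%d up / %d in of %d osds" % (
     osdmap["num_up_osds"], osdmap["num_in_osds"], osdmap["num_osds"]))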


Considering that a little clock skew at boot can be normal: is this a
consequence of having only two OSDs (and/or monitors), or could it be a
common error condition?

Is there some way to prevent this, or to fix it automatically (with a
script?) at boot?
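
What I have in mind is something along these lines (a completely untested
sketch, assuming the ceph CLI and systemd-managed OSDs, i.e. ceph-osd@<id>;
a sysvinit setup would need a different start command): wait until the
monitors stop complaining about clock skew, then bring any OSD that is still
down back up and in:

 # Boot-time helper sketch: wait for the clock-skew warning to clear,
 # then restart and mark 'in' any OSD the map still reports as down.
 import json, subprocess, time

 def ceph_json(*args):
     out = subprocess.check_output(("ceph",) + args + ("--format", "json"))
     return json.loads(out)

 # Give NTP up to ~10 minutes to converge before touching the OSDs;
 # the crude string check just looks for the skew warning in 'ceph health'.
 for _ in range(60):
     if "clock skew" not in json.dumps(ceph_json("health")):
         break
     time.sleep(10)

 for osd in ceph_json("osd", "dump")["osds"]:
     if not osd["up"]:
         osd_id = osd["osd"]
         subprocess.call(["systemctl", "start", "ceph-osd@%d" % osd_id])
         subprocess.call(["ceph", "osd", "in", str(osd_id)])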


Thanks.

-- 
dott. Marco Gaiarin				        GNUPG Key ID: 240A3D66
  Associazione ``La Nostra Famiglia''          http://www.lanostrafamiglia.it/
  Polo FVG   -   Via della Bontà, 7 - 33078   -   San Vito al Tagliamento (PN)
  marco.gaiarin(at)lanostrafamiglia.it   t +39-0434-842711   f +39-0434-842797

		Donate your 5 PER MILLE to LA NOSTRA FAMIGLIA!
    http://www.lanostrafamiglia.it/25/index.php/component/k2/item/123
	(tax code 00307430132, category ONLUS or RICERCA SANITARIA)


