[PVE-User] PVE, Ceph, OSD in stop/out state: how to restart from commandline?

Marco Gaiarin gaio at sv.lnf.it
Wed Jun 7 11:17:09 CEST 2017


Mandi! Fabian Grünbichler
  On that day, you wrote...

> OSDs are supposed to be enabled by UDEV rules automatically. This does
> not work on all systems, so PVE installs a ceph.service which triggers a
> scan for OSDs on all available disks.
> Either calling "systemctl restart ceph.service" or "ceph-disk
> activate-all" should start all available OSDs which haven't been started
> yet.

Good, I had missed that.
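For the archives, that boils down to something like this (just a sketch; osd.5 is one of the OSD ids on this node, and the per-OSD line is the same sysvinit call that ceph-disk itself runs, as visible in the logs below):

  # re-scan and activate any OSD partitions that UDEV missed on boot
  systemctl restart ceph.service
  # or, equivalently
  ceph-disk activate-all

  # start a single OSD by id
  service ceph --cluster ceph start osd.5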


> The reason why you are not seeing ceph-osd@X systemd units for OSDs
> which haven't been available on this boot is that these units are
> purposely lost on a reboot, and only re-enabled for the current boot
> when ceph-disk starts the OSD (in systemd speak, they are "runtime"
> enabled). This kind of makes sense, since an OSD service can only be
> started if its disk is there, and if the disk is there it is supposed to
> have already been started via the UDEV rule.
> Which PVE and Ceph versions are you on? Is there anything out of the
> ordinary about your setup? Could you provide a log of the boot where the
> OSDs failed to start? The ceph.service should catch all the OSDs missed
> by UDEV on boot..

I'm using the latest PVE 4.4 with Ceph Hammer.
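On the runtime-enabled ceph-osd@X units: if I understand correctly, something like this should show their state on the current boot (a sketch, using the ceph-osd@X naming from your mail; as far as I can tell from the logs below, on Hammer the OSDs instead get generated ceph-osd.N.<timestamp> units from the sysvinit wrapper):

  # list every ceph unit systemd knows about on this boot
  systemctl list-units 'ceph*' --all
  # "enabled-runtime" here would mean the unit was re-enabled for this boot only
  systemctl is-enabled ceph-osd@2.service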

As I said, the power went off several times, at least twice. Looking at the
logs (syslog):


1) Main power outage, the server shut down:

Jun  2 19:07:29 vedovanera systemd[1]: Stopping /bin/bash -c ulimit -n 32768;  /usr/bin/ceph-osd -i 4 --pid-file /var/run/ceph/osd.4.pid -c /etc/ceph/ceph.conf --cluster ceph -f...
Jun  2 19:07:29 vedovanera systemd[1]: Stopping /bin/bash -c ulimit -n 32768;  /usr/bin/ceph-osd -i 5 --pid-file /var/run/ceph/osd.5.pid -c /etc/ceph/ceph.conf --cluster ceph -f...
Jun  2 19:07:29 vedovanera systemd[1]: Stopping /bin/bash -c ulimit -n 32768;  /usr/bin/ceph-osd -i 3 --pid-file /var/run/ceph/osd.3.pid -c /etc/ceph/ceph.conf --cluster ceph -f...
Jun  2 19:07:29 vedovanera systemd[1]: Stopping /bin/bash -c ulimit -n 32768;  /usr/bin/ceph-osd -i 2 --pid-file /var/run/ceph/osd.2.pid -c /etc/ceph/ceph.conf --cluster ceph -f...
Jun  2 19:07:29 vedovanera systemd[1]: Stopping /bin/bash -c ulimit -n 32768;  /usr/bin/ceph-mon -i 1 --pid-file /var/run/ceph/mon.1.pid -c /etc/ceph/ceph.conf --cluster ceph -f...
Jun  2 19:07:29 vedovanera systemd[1]: Stopped /bin/bash -c ulimit -n 32768;  /usr/bin/ceph-osd -i 4 --pid-file /var/run/ceph/osd.4.pid -c /etc/ceph/ceph.conf --cluster ceph -f.
Jun  2 19:07:29 vedovanera systemd[1]: Stopped /bin/bash -c ulimit -n 32768;  /usr/bin/ceph-osd -i 3 --pid-file /var/run/ceph/osd.3.pid -c /etc/ceph/ceph.conf --cluster ceph -f.
Jun  2 19:07:29 vedovanera systemd[1]: Stopped /bin/bash -c ulimit -n 32768;  /usr/bin/ceph-osd -i 2 --pid-file /var/run/ceph/osd.2.pid -c /etc/ceph/ceph.conf --cluster ceph -f.
Jun  2 19:07:29 vedovanera systemd[1]: Stopped /bin/bash -c ulimit -n 32768;  /usr/bin/ceph-mon -i 1 --pid-file /var/run/ceph/mon.1.pid -c /etc/ceph/ceph.conf --cluster ceph -f.
Jun  2 19:07:29 vedovanera systemd[1]: Stopped /bin/bash -c ulimit -n 32768;  /usr/bin/ceph-osd -i 5 --pid-file /var/run/ceph/osd.5.pid -c /etc/ceph/ceph.conf --cluster ceph -f.


2) Power came back just long enough to boot the server, then the server
stopped again. But here only one OSD started.
[ I don't know whether the power came back more than once here, but it was
  not on long enough to write any logs. ]

Jun  3 22:10:51 vedovanera ceph[1893]: === mon.1 ===
Jun  3 22:10:51 vedovanera ceph[1893]: Starting Ceph mon.1 on vedovanera...
Jun  3 22:10:51 vedovanera systemd[1]: Starting /bin/bash -c ulimit -n 32768;  /usr/bin/ceph-mon -i 1 --pid-file /var/run/ceph/mon.1.pid -c /etc/ceph/ceph.conf --cluster ceph -f...
Jun  3 22:10:51 vedovanera systemd[1]: Started /bin/bash -c ulimit -n 32768;  /usr/bin/ceph-mon -i 1 --pid-file /var/run/ceph/mon.1.pid -c /etc/ceph/ceph.conf --cluster ceph -f.
Jun  3 22:10:51 vedovanera ceph[1893]: Running as unit ceph-mon.1.1496520651.342882341.service.
Jun  3 22:10:51 vedovanera ceph[1893]: Starting ceph-create-keys on vedovanera...
Jun  3 22:10:51 vedovanera kernel: [   12.777843] ip6_tables: (C) 2000-2006 Netfilter Core Team
Jun  3 22:10:51 vedovanera kernel: [   12.788040] ip_set: protocol 6
Jun  3 22:10:51 vedovanera kernel: [   12.949441] XFS (sdc1): Mounting V4 Filesystem
Jun  3 22:10:52 vedovanera kernel: [   13.114011] XFS (sdc1): Ending clean mount
Jun  3 22:10:52 vedovanera ceph[1893]: === osd.3 ===
Jun  3 22:10:52 vedovanera ceph[1893]: 2017-06-03 22:10:52.370824 7f5e9c313700  0 -- :/3217267806 >> 10.27.251.8:6789/0 pipe(0x7f5e98061590 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5e9805c1f0).fault
Jun  3 22:10:53 vedovanera bash[2122]: starting mon.1 rank 1 at 10.27.251.8:6789/0 mon_data /var/lib/ceph/mon/ceph-1 fsid 8794c124-c2ec-4e81-8631-742992159bd6
Jun  3 22:10:56 vedovanera ceph[1893]: 2017-06-03 22:10:56.541384 7f5e9c111700  0 -- 10.27.251.8:0/3217267806 >> 10.27.251.9:6789/0 pipe(0x7f5e8c000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5e8c004ef0).fault
Jun  3 22:11:02 vedovanera ceph[1893]: 2017-06-03 22:11:02.541210 7f5e9c313700  0 -- 10.27.251.8:0/3217267806 >> 10.27.251.7:6789/0 pipe(0x7f5e8c000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5e8c006470).fault
Jun  3 22:11:08 vedovanera ceph[1893]: 2017-06-03 22:11:08.541237 7f5e9c212700  0 -- 10.27.251.8:0/3217267806 >> 10.27.251.9:6789/0 pipe(0x7f5e8c000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5e8c004ea0).fault
Jun  3 22:11:11 vedovanera ceph[1893]: 2017-06-03 22:11:11.541285 7f5e9c313700  0 -- 10.27.251.8:0/3217267806 >> 10.27.251.11:6789/0 pipe(0x7f5e8c008350 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5e8c00c5f0).fault
Jun  3 22:11:14 vedovanera ceph[1893]: 2017-06-03 22:11:14.541215 7f5e9c212700  0 -- 10.27.251.8:0/3217267806 >> 10.27.251.12:6789/0 pipe(0x7f5e8c000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5e8c004ea0).fault
Jun  3 22:11:17 vedovanera ceph[1893]: 2017-06-03 22:11:17.541237 7f5e9c313700  0 -- 10.27.251.8:0/3217267806 >> 10.27.251.7:6789/0 pipe(0x7f5e8c008350 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5e8c00c5f0).fault
Jun  3 22:11:20 vedovanera ceph[1893]: 2017-06-03 22:11:20.541254 7f5e9c212700  0 -- 10.27.251.8:0/3217267806 >> 10.27.251.11:6789/0 pipe(0x7f5e8c000c00 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5e8c004ea0).fault
Jun  3 22:11:22 vedovanera ceph[1893]: failed: 'timeout 30 /usr/bin/ceph -c /etc/ceph/ceph.conf --name=osd.3 --keyring=/var/lib/ceph/osd/ceph-3/keyring osd crush create-or-move -- 3 1.82 host=vedovanera root=default'
Jun  3 22:11:22 vedovanera ceph[1893]: ceph-disk: Error: ceph osd start failed: Command '['/usr/sbin/service', 'ceph', '--cluster', 'ceph', 'start', 'osd.3']' returned non-zero exit status 1
Jun  3 22:11:22 vedovanera kernel: [   43.350784] XFS (sdd1): Mounting V4 Filesystem
Jun  3 22:11:22 vedovanera kernel: [   43.447046] XFS (sdd1): Ending clean mount
Jun  3 22:11:22 vedovanera ceph[1893]: === osd.5 ===
Jun  3 22:11:26 vedovanera ceph[1893]: 2017-06-03 22:11:26.541307 7fb7b7fff700  0 -- 10.27.251.8:0/279741970 >> 10.27.251.9:6789/0 pipe(0x7fb7b0000c00 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7fb7b0004ef0).fault
Jun  3 22:11:32 vedovanera ceph[1893]: 2017-06-03 22:11:32.541205 7fb7bc2a8700  0 -- 10.27.251.8:0/279741970 >> 10.27.251.7:6789/0 pipe(0x7fb7b0000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fb7b0006470).fault
Jun  3 22:11:37 vedovanera ceph[1893]: 2017-06-03 22:11:37.570726 7fb7bc1a7700  0 -- 10.27.251.8:0/279741970 >> 10.27.251.9:6789/0 pipe(0x7fb7b0000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fb7b0004ea0).fault
Jun  3 22:11:41 vedovanera ceph[1893]: 2017-06-03 22:11:41.541195 7fb7bc2a8700  0 -- 10.27.251.8:0/279741970 >> 10.27.251.11:6789/0 pipe(0x7fb7b0008350 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7fb7b000c5f0).fault
Jun  3 22:11:44 vedovanera ceph[1893]: 2017-06-03 22:11:44.541230 7fb7bc1a7700  0 -- 10.27.251.8:0/279741970 >> 10.27.251.12:6789/0 pipe(0x7fb7b0000c00 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7fb7b0004ea0).fault
Jun  3 22:11:47 vedovanera ceph[1893]: 2017-06-03 22:11:47.541209 7fb7bc2a8700  0 -- 10.27.251.8:0/279741970 >> 10.27.251.7:6789/0 pipe(0x7fb7b0008350 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7fb7b000c5f0).fault
Jun  3 22:11:50 vedovanera ceph[1893]: 2017-06-03 22:11:50.541203 7fb7bc1a7700  0 -- 10.27.251.8:0/279741970 >> 10.27.251.11:6789/0 pipe(0x7fb7b0000c00 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7fb7b0004ea0).fault
Jun  3 22:11:52 vedovanera ceph[1893]: failed: 'timeout 30 /usr/bin/ceph -c /etc/ceph/ceph.conf --name=osd.5 --keyring=/var/lib/ceph/osd/ceph-5/keyring osd crush create-or-move -- 5 0.91 host=vedovanera root=default'
Jun  3 22:11:52 vedovanera ceph[1893]: ceph-disk: Error: ceph osd start failed: Command '['/usr/sbin/service', 'ceph', '--cluster', 'ceph', 'start', 'osd.5']' returned non-zero exit status 1
Jun  3 22:11:52 vedovanera kernel: [   73.636441] XFS (sdb1): Mounting V4 Filesystem
Jun  3 22:11:52 vedovanera kernel: [   73.734257] XFS (sdb1): Ending clean mount
Jun  3 22:11:52 vedovanera ceph[1893]: === osd.4 ===
Jun  3 22:11:53 vedovanera ceph[1893]: 2017-06-03 22:11:53.541240 7f3b187db700  0 -- :/416680248 >> 10.27.251.11:6789/0 pipe(0x7f3b14061590 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f3b1405c1f0).fault
Jun  3 22:12:02 vedovanera ceph[1893]: 2017-06-03 22:12:02.541217 7f3b186da700  0 -- 10.27.251.8:0/416680248 >> 10.27.251.7:6789/0 pipe(0x7f3b08000da0 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f3b08005040).fault
Jun  3 22:12:11 vedovanera ceph[1893]: 2017-06-03 22:12:11.541200 7f3b187db700  0 -- 10.27.251.8:0/416680248 >> 10.27.251.11:6789/0 pipe(0x7f3b08006e20 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f3b0800b0c0).fault
Jun  3 22:12:14 vedovanera ceph[1893]: 2017-06-03 22:12:14.541196 7f3b186da700  0 -- 10.27.251.8:0/416680248 >> 10.27.251.12:6789/0 pipe(0x7f3b08000da0 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f3b08005040).fault
Jun  3 22:12:16 vedovanera ceph[1893]: 2017-06-03 22:12:16.860344 7f3b187db700  0 -- 10.27.251.8:0/416680248 >> 10.27.251.7:6789/0 pipe(0x7f3b08006e20 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f3b0800b0c0).fault
Jun  3 22:12:18 vedovanera systemd[1]: Stopping /bin/bash -c ulimit -n 32768;  /usr/bin/ceph-mon -i 1 --pid-file /var/run/ceph/mon.1.pid -c /etc/ceph/ceph.conf --cluster ceph -f...
Jun  3 22:12:18 vedovanera systemd[1]: Stopped /bin/bash -c ulimit -n 32768;  /usr/bin/ceph-mon -i 1 --pid-file /var/run/ceph/mon.1.pid -c /etc/ceph/ceph.conf --cluster ceph -f.
Jun  3 22:12:20 vedovanera ceph[1893]: 2017-06-03 22:12:20.541149 7f3b186da700  0 -- 10.27.251.8:0/416680248 >> 10.27.251.11:6789/0 pipe(0x7f3b08000da0 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f3b08005040).fault

3) Power came back:

Jun  4 15:36:15 vedovanera ceph[1901]: === mon.1 ===
Jun  4 15:36:15 vedovanera ceph[1901]: Starting Ceph mon.1 on vedovanera...
Jun  4 15:36:15 vedovanera systemd[1]: Starting /bin/bash -c ulimit -n 32768;  /usr/bin/ceph-mon -i 1 --pid-file /var/run/ceph/mon.1.pid -c /etc/ceph/ceph.conf --cluster ceph -f...
Jun  4 15:36:15 vedovanera systemd[1]: Started /bin/bash -c ulimit -n 32768;  /usr/bin/ceph-mon -i 1 --pid-file /var/run/ceph/mon.1.pid -c /etc/ceph/ceph.conf --cluster ceph -f.
Jun  4 15:36:15 vedovanera ceph[1901]: Running as unit ceph-mon.1.1496583375.326412133.service.
Jun  4 15:36:15 vedovanera ceph[1901]: Starting ceph-create-keys on vedovanera...
Jun  4 15:36:15 vedovanera bash[2118]: starting mon.1 rank 1 at 10.27.251.8:6789/0 mon_data /var/lib/ceph/mon/ceph-1 fsid 8794c124-c2ec-4e81-8631-742992159bd6
Jun  4 15:36:15 vedovanera kernel: [   11.701050] XFS (sda1): Mounting V4 Filesystem
Jun  4 15:36:15 vedovanera kernel: [   11.818819] ip6_tables: (C) 2000-2006 Netfilter Core Team
Jun  4 15:36:15 vedovanera kernel: [   11.828320] ip_set: protocol 6
Jun  4 15:36:15 vedovanera kernel: [   11.831759] XFS (sda1): Ending clean mount
Jun  4 15:36:15 vedovanera ceph[1901]: === osd.2 ===
Jun  4 15:36:18 vedovanera ceph[1901]: 2017-06-04 15:36:18.541236 7f60ac3d7700  0 -- :/1951895097 >> 10.27.251.7:6789/0 pipe(0x7f60a8061590 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f60a805c1f0).fault
Jun  4 15:36:21 vedovanera ceph[1901]: 2017-06-04 15:36:21.541317 7f60ac2d6700  0 -- :/1951895097 >> 10.27.251.9:6789/0 pipe(0x7f609c000c00 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f609c004ef0).fault
Jun  4 15:36:27 vedovanera ceph[1901]: 2017-06-04 15:36:27.541272 7f60ac1d5700  0 -- 10.27.251.8:0/1951895097 >> 10.27.251.7:6789/0 pipe(0x7f609c000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f609c006470).fault
Jun  4 15:36:33 vedovanera ceph[1901]: 2017-06-04 15:36:33.541255 7f60ac3d7700  0 -- 10.27.251.8:0/1951895097 >> 10.27.251.9:6789/0 pipe(0x7f609c000c00 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f609c0057d0).fault
Jun  4 15:36:36 vedovanera ceph[1901]: 2017-06-04 15:36:36.541283 7f60ac1d5700  0 -- 10.27.251.8:0/1951895097 >> 10.27.251.11:6789/0 pipe(0x7f609c008350 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f609c00c5f0).fault
Jun  4 15:36:39 vedovanera ceph[1901]: 2017-06-04 15:36:39.541286 7f60ac3d7700  0 -- 10.27.251.8:0/1951895097 >> 10.27.251.12:6789/0 pipe(0x7f609c000c00 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f609c005bb0).fault
Jun  4 15:36:42 vedovanera ceph[1901]: 2017-06-04 15:36:42.541222 7f60ac1d5700  0 -- 10.27.251.8:0/1951895097 >> 10.27.251.7:6789/0 pipe(0x7f609c008350 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f609c00c5f0).fault
Jun  4 15:36:45 vedovanera ceph[1901]: 2017-06-04 15:36:45.541256 7f60ac3d7700  0 -- 10.27.251.8:0/1951895097 >> 10.27.251.11:6789/0 pipe(0x7f609c000c00 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f609c00fbf0).fault
Jun  4 15:36:45 vedovanera ceph[1901]: failed: 'timeout 30 /usr/bin/ceph -c /etc/ceph/ceph.conf --name=osd.2 --keyring=/var/lib/ceph/osd/ceph-2/keyring osd crush create-or-move -- 2 1.82 host=vedovanera root=default'
Jun  4 15:36:45 vedovanera ceph[1901]: ceph-disk: Error: ceph osd start failed: Command '['/usr/sbin/service', 'ceph', '--cluster', 'ceph', 'start', 'osd.2']' returned non-zero exit status 1
Jun  4 15:36:46 vedovanera kernel: [   42.122846] XFS (sdc1): Mounting V4 Filesystem
Jun  4 15:36:46 vedovanera kernel: [   42.281840] XFS (sdc1): Ending clean mount
Jun  4 15:36:46 vedovanera ceph[1901]: === osd.3 ===
Jun  4 15:36:48 vedovanera ceph[1901]: 2017-06-04 15:36:48.541288 7f368c4dc700  0 -- :/1031853535 >> 10.27.251.7:6789/0 pipe(0x7f3688061590 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f368805c1f0).fault
Jun  4 15:36:51 vedovanera ceph[1901]: 2017-06-04 15:36:51.541359 7f368c3db700  0 -- :/1031853535 >> 10.27.251.9:6789/0 pipe(0x7f367c000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f367c004ef0).fault
Jun  4 15:36:57 vedovanera ceph[1901]: 2017-06-04 15:36:57.541267 7f368c2da700  0 -- 10.27.251.8:0/1031853535 >> 10.27.251.7:6789/0 pipe(0x7f367c000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f367c006470).fault
Jun  4 15:37:06 vedovanera ceph[1901]: 2017-06-04 15:37:06.541330 7f368c3db700  0 -- 10.27.251.8:0/1031853535 >> 10.27.251.11:6789/0 pipe(0x7f367c0080e0 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f367c00e0d0).fault
Jun  4 15:37:09 vedovanera ceph[1901]: 2017-06-04 15:37:09.541277 7f368c2da700  0 -- 10.27.251.8:0/1031853535 >> 10.27.251.12:6789/0 pipe(0x7f367c000c00 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f367c00eb30).fault
Jun  4 15:37:12 vedovanera ceph[1901]: 2017-06-04 15:37:12.541280 7f368c3db700  0 -- 10.27.251.8:0/1031853535 >> 10.27.251.7:6789/0 pipe(0x7f367c0080e0 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f367c0103e0).fault
Jun  4 15:37:15 vedovanera ceph[1901]: 2017-06-04 15:37:15.541249 7f368c2da700  0 -- 10.27.251.8:0/1031853535 >> 10.27.251.11:6789/0 pipe(0x7f367c000c00 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f367c012120).fault
Jun  4 15:37:16 vedovanera ceph[1901]: failed: 'timeout 30 /usr/bin/ceph -c /etc/ceph/ceph.conf --name=osd.3 --keyring=/var/lib/ceph/osd/ceph-3/keyring osd crush create-or-move -- 3 1.82 host=vedovanera root=default'
Jun  4 15:37:16 vedovanera ceph[1901]: ceph-disk: Error: ceph osd start failed: Command '['/usr/sbin/service', 'ceph', '--cluster', 'ceph', 'start', 'osd.3']' returned non-zero exit status 1
Jun  4 15:37:16 vedovanera kernel: [   72.458631] XFS (sdd1): Mounting V4 Filesystem
Jun  4 15:37:16 vedovanera kernel: [   72.538849] XFS (sdd1): Ending clean mount
Jun  4 15:37:16 vedovanera ceph[1901]: === osd.5 ===
Jun  4 15:37:27 vedovanera ceph[1901]: 2017-06-04 15:37:27.545272 7fc898df7700  0 -- 10.27.251.8:0/3502424686 >> 10.27.251.7:6789/0 pipe(0x7fc88c000da0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fc88c005040).fault
Jun  4 15:37:36 vedovanera ceph[1901]: 2017-06-04 15:37:36.545352 7fc898ef8700  0 -- 10.27.251.8:0/3502424686 >> 10.27.251.11:6789/0 pipe(0x7fc88c006e20 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7fc88c00e020).fault
Jun  4 15:37:39 vedovanera ceph[1901]: 2017-06-04 15:37:39.545309 7fc898df7700  0 -- 10.27.251.8:0/3502424686 >> 10.27.251.12:6789/0 pipe(0x7fc88c000da0 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7fc88c00eb90).fault
Jun  4 15:37:42 vedovanera ceph[1901]: 2017-06-04 15:37:42.545271 7fc898ef8700  0 -- 10.27.251.8:0/3502424686 >> 10.27.251.7:6789/0 pipe(0x7fc88c006e20 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fc88c0104a0).fault
Jun  4 15:37:45 vedovanera ceph[1901]: 2017-06-04 15:37:45.545256 7fc898df7700  0 -- 10.27.251.8:0/3502424686 >> 10.27.251.11:6789/0 pipe(0x7fc88c000da0 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7fc88c00cbc0).fault
Jun  4 15:37:46 vedovanera ceph[1901]: failed: 'timeout 30 /usr/bin/ceph -c /etc/ceph/ceph.conf --name=osd.5 --keyring=/var/lib/ceph/osd/ceph-5/keyring osd crush create-or-move -- 5 0.91 host=vedovanera root=default'
Jun  4 15:37:46 vedovanera ceph[1901]: ceph-disk: Error: ceph osd start failed: Command '['/usr/sbin/service', 'ceph', '--cluster', 'ceph', 'start', 'osd.5']' returned non-zero exit status 1
Jun  4 15:37:46 vedovanera kernel: [  102.738714] XFS (sdb1): Mounting V4 Filesystem
Jun  4 15:37:46 vedovanera kernel: [  102.814995] XFS (sdb1): Ending clean mount
Jun  4 15:37:46 vedovanera ceph[1901]: === osd.4 ===
Jun  4 15:37:55 vedovanera ceph[1901]: 2017-06-04 15:37:55.921892 7f0c142f7700  0 -- 10.27.251.8:0/398750406 >> 10.27.251.7:6789/0 pipe(0x7f0c00000da0 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f0c00005040).fault
Jun  4 15:38:01 vedovanera ceph[1901]: 2017-06-04 15:38:01.215940 7f0c154fb700 -1 monclient: _check_auth_rotating possible clock skew, rotating keys expired way too early (before 2017-06-04 14:38:01.215935)
Jun  4 15:38:01 vedovanera ceph[1901]: create-or-move updated item name 'osd.4' weight 0.91 at location {host=vedovanera,root=default} to crush map
Jun  4 15:38:01 vedovanera ceph[1901]: Starting Ceph osd.4 on vedovanera...
Jun  4 15:38:01 vedovanera ceph[1901]: Running as unit ceph-osd.4.1496583466.847479891.service.
Jun  4 15:38:01 vedovanera systemd[1]: Starting /bin/bash -c ulimit -n 32768;  /usr/bin/ceph-osd -i 4 --pid-file /var/run/ceph/osd.4.pid -c /etc/ceph/ceph.conf --cluster ceph -f...
Jun  4 15:38:01 vedovanera ceph[1901]: ceph-disk: Error: One or more partitions failed to activate
Jun  4 15:38:01 vedovanera systemd[1]: Started /bin/bash -c ulimit -n 32768;  /usr/bin/ceph-osd -i 4 --pid-file /var/run/ceph/osd.4.pid -c /etc/ceph/ceph.conf --cluster ceph -f.
Jun  4 15:38:01 vedovanera bash[4907]: starting osd.4 at :/0 osd_data /var/lib/ceph/osd/ceph-4 /var/lib/ceph/osd/ceph-4/journal
Jun  4 15:38:03 vedovanera systemd[1]: Startup finished in 2.081s (kernel) + 1min 57.357s (userspace) = 1min 59.439s.

4) I started the faulty OSDs via the web interface:

Jun  4 16:02:30 vedovanera systemd[1]: Starting /bin/bash -c ulimit -n 32768;  /usr/bin/ceph-osd -i 5 --pid-file /var/run/ceph/osd.5.pid -c /etc/pve/ceph.conf --cluster ceph -f...
Jun  4 16:02:30 vedovanera systemd[1]: Started /bin/bash -c ulimit -n 32768;  /usr/bin/ceph-osd -i 5 --pid-file /var/run/ceph/osd.5.pid -c /etc/pve/ceph.conf --cluster ceph -f.
Jun  4 16:02:30 vedovanera bash[16353]: starting osd.5 at :/0 osd_data /var/lib/ceph/osd/ceph-5 /var/lib/ceph/osd/ceph-5/journal
Jun  4 16:03:34 vedovanera systemd[1]: Starting /bin/bash -c ulimit -n 32768;  /usr/bin/ceph-osd -i 2 --pid-file /var/run/ceph/osd.2.pid -c /etc/pve/ceph.conf --cluster ceph -f...
Jun  4 16:03:34 vedovanera systemd[1]: Started /bin/bash -c ulimit -n 32768;  /usr/bin/ceph-osd -i 2 --pid-file /var/run/ceph/osd.2.pid -c /etc/pve/ceph.conf --cluster ceph -f.
Jun  4 16:03:34 vedovanera bash[17125]: starting osd.2 at :/0 osd_data /var/lib/ceph/osd/ceph-2 /var/lib/ceph/osd/ceph-2/journal
Jun  4 16:04:43 vedovanera systemd[1]: Starting /bin/bash -c ulimit -n 32768;  /usr/bin/ceph-osd -i 3 --pid-file /var/run/ceph/osd.3.pid -c /etc/pve/ceph.conf --cluster ceph -f...
Jun  4 16:04:43 vedovanera systemd[1]: Started /bin/bash -c ulimit -n 32768;  /usr/bin/ceph-osd -i 3 --pid-file /var/run/ceph/osd.3.pid -c /etc/pve/ceph.conf --cluster ceph -f.
Jun  4 16:04:43 vedovanera bash[18009]: starting osd.3 at :/0 osd_data /var/lib/ceph/osd/ceph-3 /var/lib/ceph/osd/ceph-3/journal
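
From the shell, the result can then be checked with the usual Ceph commands, e.g.:

  # cluster health plus the up/down, in/out state of every OSD
  ceph -s
  ceph osd tree
  # just the OSD counts
  ceph osd stat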



Looking at the log for osd.2, for example, there are no ''intermediate''
entries, e.g.:

2017-06-02 19:07:20.970337 7f4065e40700  0 log_channel(cluster) log [INF] : 3.fe deep-scrub starts
2017-06-02 19:07:22.474508 7f4065e40700  0 log_channel(cluster) log [INF] : 3.fe deep-scrub ok
2017-06-02 19:07:27.283142 7f4060f29700  0 -- 10.27.251.8:6801/3278 >> 10.27.251.7:6805/12122 pipe(0x26581000 sd=37 :40028 s=2 pgs=65 cs=1 l=0 c=0x3be1700).fault with nothing to send, going to standby
2017-06-02 19:07:27.286888 7f405c3e5700  0 -- 10.27.251.8:0/3278 >> 10.27.251.7:6807/12122 pipe(0x27da2000 sd=47 :0 s=1 pgs=0 cs=0 l=1 c=0x307343c0).fault
2017-06-02 19:07:27.286934 7f405c4e6700  0 -- 10.27.251.8:0/3278 >> 10.27.251.7:6806/12122 pipe(0x27d90000 sd=149 :0 s=1 pgs=0 cs=0 l=1 c=0x1c611760).fault
2017-06-02 19:07:27.287907 7f406142e700  0 -- 10.27.251.8:6801/3278 >> 10.27.251.7:6821/11698 pipe(0x26541000 sd=27 :47418 s=2 pgs=66 cs=1 l=0 c=0x3be1180).fault with nothing to send, going to standby
2017-06-02 19:07:27.290979 7f405e809700  0 -- 10.27.251.8:0/3278 >> 10.27.251.7:6822/11698 pipe(0x199c3000 sd=152 :0 s=1 pgs=0 cs=0 l=1 c=0x1c612100).fault
2017-06-02 19:07:27.291636 7f405f213700  0 -- 10.27.251.8:0/3278 >> 10.27.251.7:6823/11698 pipe(0x27a50000 sd=153 :0 s=1 pgs=0 cs=0 l=1 c=0x1c612ec0).fault
2017-06-02 19:07:27.301191 7f405b9db700  0 -- 10.27.251.8:0/3278 >> 10.27.251.7:6826/12444 pipe(0x13bec000 sd=152 :0 s=1 pgs=0 cs=0 l=1 c=0x1c6123c0).fault
2017-06-02 19:07:27.301287 7f4060e28700  0 -- 10.27.251.8:6801/3278 >> 10.27.251.7:6825/12444 pipe(0x266a6000 sd=65 :43868 s=2 pgs=63 cs=1 l=0 c=0x3be1860).fault with nothing to send, going to standby
2017-06-02 19:07:27.301317 7f405b5d7700  0 -- 10.27.251.8:0/3278 >> 10.27.251.7:6827/12444 pipe(0x27c94000 sd=153 :0 s=1 pgs=0 cs=0 l=1 c=0x1bce9860).fault
2017-06-02 19:07:27.329514 7f406122c700  0 -- 10.27.251.8:6801/3278 >> 10.27.251.7:6817/11123 pipe(0x5166000 sd=29 :33652 s=2 pgs=72 cs=1 l=0 c=0x3be0ec0).fault, initiating reconnect
2017-06-02 19:07:27.331358 7f407e0f5700  0 -- 10.27.251.8:6801/3278 >> 10.27.251.7:6817/11123 pipe(0x5166000 sd=29 :33652 s=1 pgs=72 cs=2 l=0 c=0x3be0ec0).fault
2017-06-02 19:07:27.334450 7f405bee0700  0 -- 10.27.251.8:0/3278 >> 10.27.251.7:6818/11123 pipe(0x27deb000 sd=154 :0 s=1 pgs=0 cs=0 l=1 c=0x1bce6f20).fault
2017-06-02 19:07:27.334506 7f405bddf700  0 -- 10.27.251.8:0/3278 >> 10.27.251.7:6819/11123 pipe(0x27de1000 sd=155 :0 s=1 pgs=0 cs=0 l=1 c=0x1bce8c00).fault
2017-06-04 16:03:34.410880 7feecc646880  0 ceph version 0.94.10 (b1e0532418e4631af01acbc0cedd426f1905f4af), process ceph-osd, pid 17130
2017-06-04 16:03:34.503807 7feecc646880  0 filestore(/var/lib/ceph/osd/ceph-2) backend xfs (magic 0x58465342)
2017-06-04 16:03:34.791701 7feecc646880  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-2) detect_features: FIEMAP ioctl is supported and appears to work
2017-06-04 16:03:34.791712 7feecc646880  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-2) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2017-06-04 16:03:34.807974 7feecc646880  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-2) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2017-06-04 16:03:34.808049 7feecc646880  0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) detect_feature: extsize is disabled by conf
2017-06-04 16:03:36.686401 7feecc646880  0 filestore(/var/lib/ceph/osd/ceph-2) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2017-06-04 16:03:42.786238 7feecc646880  1 journal _open /var/lib/ceph/osd/ceph-2/journal fd 20: 49964625920 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-06-04 16:03:42.901178 7feecc646880  1 journal _open /var/lib/ceph/osd/ceph-2/journal fd 20: 49964625920 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-06-04 16:03:42.942584 7feecc646880  0 <cls> cls/hello/cls_hello.cc:271: loading cls_hello
2017-06-04 16:03:43.034138 7feecc646880  0 osd.2 4002 crush map has features 1107558400, adjusting msgr requires for clients
2017-06-04 16:03:43.034150 7feecc646880  0 osd.2 4002 crush map has features 1107558400 was 8705, adjusting msgr requires for mons
2017-06-04 16:03:43.034162 7feecc646880  0 osd.2 4002 crush map has features 1107558400, adjusting msgr requires for osds
2017-06-04 16:03:43.034173 7feecc646880  0 osd.2 4002 load_pgs
2017-06-04 16:04:15.169521 7feecc646880  0 osd.2 4002 load_pgs opened 253 pgs
2017-06-04 16:04:15.178051 7feecc646880 -1 osd.2 4002 log_to_monitors {default=true}
2017-06-04 16:04:15.201447 7feeb6454700  0 osd.2 4002 ignoring osdmap until we have initialized


I hope this is useful. Thanks.

-- 
dott. Marco Gaiarin				        GNUPG Key ID: 240A3D66
  Associazione ``La Nostra Famiglia''          http://www.lanostrafamiglia.it/
  Polo FVG   -   Via della Bontà, 7 - 33078   -   San Vito al Tagliamento (PN)
  marco.gaiarin(at)lanostrafamiglia.it   t +39-0434-842711   f +39-0434-842797

		Donate your 5 PER MILLE to LA NOSTRA FAMIGLIA!
      http://www.lanostrafamiglia.it/index.php/it/sostienici/5x1000
	(tax code 00307430132, category ONLUS or RICERCA SANITARIA)


