[PVE-User] Cephfs starting 2nd MDS

Ronny Aasen ronny+pve-user at aasen.cx
Wed Aug 8 09:20:02 CEST 2018


Your ceph.conf references mds.1 (id = 1),
but your command starts the mds with id = scvirt03,

so the block in ceph.conf is not used.
Replace [mds.1] with [mds.scvirt03].

Btw: iirc you cannot use purely numerical IDs for MDS daemons anymore
(for some versions now), so mds.1 would not be a valid name either.
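
For example (untested, just reusing the host and keyring path from the
mds-section you posted below):

[mds.scvirt03]
         host = scvirt03
         keyring = /var/lib/ceph/mds/ceph-scvirt03/keyring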


kind regards
Ronny Aasen


On 08. aug. 2018 07:54, Vadim Bulst wrote:
> Hi Alwin,
> 
> thanks for your advice, but no success. Still the same error.
> 
> mds-section:
> 
> [mds.1]
>          host = scvirt03
>          keyring = /var/lib/ceph/mds/ceph-scvirt03/keyring
> 
> Vadim
> 
> 
> On 07.08.2018 15:30, Alwin Antreich wrote:
>> Hello Vadim,
>>
>> On Tue, Aug 7, 2018, 12:13 Vadim Bulst <vadim.bulst at bbz.uni-leipzig.de>
>> wrote:
>>
>>> Dear list,
>>>
>>> I'm trying to bring up a second mds with no luck.
>>>
>>> This is what my ceph.conf looks like:
>>>
>>> [global]
>>>
>>>             auth client required = cephx
>>>             auth cluster required = cephx
>>>             auth service required = cephx
>>>             cluster network = 10.10.144.0/24
>>>             filestore xattr use omap = true
>>>             fsid = 5349724e-fa96-4fd6-8e44-8da2a39253f7
>>>             keyring = /etc/pve/priv/$cluster.$name.keyring
>>>             osd journal size = 5120
>>>             osd pool default min size = 1
>>>             public network = 172.18.144.0/24
>>>             mon allow pool delete = true
>>>
>>> [osd]
>>>             keyring = /var/lib/ceph/osd/ceph-$id/keyring
>>>
>>> [mon.2]
>>>             host = scvirt03
>>>             mon addr = 172.18.144.243:6789
>>>
>>> [mon.0]
>>>             host = scvirt01
>>>             mon addr = 172.18.144.241:6789
>>> [mon.1]
>>>             host = scvirt02
>>>             mon addr = 172.18.144.242:6789
>>>
>>> [mds.0]
>>>            host = scvirt02
>>> [mds.1]
>>>            host = scvirt03
>>>
>>>
>>> I did the following to set up the service:
>>>
>>> apt install ceph-mds
>>>
>>> mkdir /var/lib/ceph/mds
>>>
>>> mkdir /var/lib/ceph/mds/ceph-$(hostname -s)
>>>
>>> chown -R ceph:ceph /var/lib/ceph/mds
>>>
>>> chmod -R 0750 /var/lib/ceph/mds
>>>
>>> ceph auth get-or-create mds.$(hostname -s) mon 'allow profile mds' mgr
>>> 'allow profile mds' osd 'allow rwx' mds 'allow' >
>>> /var/lib/ceph/mds/ceph-$(hostname -s)/keyring
>>>
>>> chmod -R 0600 /var/lib/ceph/mds/ceph-$(hostname -s)/keyring
>>>
>>> systemctl enable ceph-mds@$(hostname -s).service
>>>
>>> systemctl start ceph-mds@$(hostname -s).service
>>>
>>>
>>> The service will not start. I used the same procedure for the first
>>> MDS, which is running with no problems.
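>>>
>>> For reference, the status output below was collected with the first of
>>> these commands; the second should give a longer log if needed (assuming
>>> standard systemd/journald tooling):
>>>
>>> systemctl status -l ceph-mds@$(hostname -s).service
>>>
>>> journalctl -u ceph-mds@$(hostname -s) --no-pager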
>>>
>>> 1st mds:
>>>
>>> root@scvirt02:/home/urzadmin# systemctl status -l ceph-mds@$(hostname
>>> -s).service
>>> ceph-mds@scvirt02.service - Ceph metadata server daemon
>>>       Loaded: loaded (/lib/systemd/system/ceph-mds@.service; enabled;
>>> vendor preset: enabled)
>>>      Drop-In: /lib/systemd/system/ceph-mds@.service.d
>>>               └─ceph-after-pve-cluster.conf
>>>       Active: active (running) since Thu 2018-06-07 13:08:58 CEST; 2
>>> months 0 days ago
>>>     Main PID: 612704 (ceph-mds)
>>>       CGroup:
>>> /system.slice/system-ceph\x2dmds.slice/ceph-mds@scvirt02.service
>>>               └─612704 /usr/bin/ceph-mds -f --cluster ceph --id scvirt02
>>> --setuser ceph --setgroup ceph
>>>
>>> Jul 29 06:25:01 scvirt02 ceph-mds[612704]: 2018-07-29 06:25:01.792601
>>> 7f6e4bae0700 -1 received  signal: Hangup from  PID: 3831071 task name:
>>> killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  
>>> UID: 0
>>> Jul 30 06:25:02 scvirt02 ceph-mds[612704]: 2018-07-30 06:25:02.081591
>>> 7f6e4bae0700 -1 received  signal: Hangup from  PID: 184355 task name:
>>> killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  
>>> UID: 0
>>> Jul 31 06:25:01 scvirt02 ceph-mds[612704]: 2018-07-31 06:25:01.448571
>>> 7f6e4bae0700 -1 received  signal: Hangup from  PID: 731440 task name:
>>> killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  
>>> UID: 0
>>> Aug 01 06:25:01 scvirt02 ceph-mds[612704]: 2018-08-01 06:25:01.274541
>>> 7f6e4bae0700 -1 received  signal: Hangup from  PID: 1278492 task name:
>>> killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  
>>> UID: 0
>>> Aug 02 06:25:02 scvirt02 ceph-mds[612704]: 2018-08-02 06:25:02.009054
>>> 7f6e4bae0700 -1 received  signal: Hangup from  PID: 1825500 task name:
>>> killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  
>>> UID: 0
>>> Aug 03 06:25:02 scvirt02 ceph-mds[612704]: 2018-08-03 06:25:02.042845
>>> 7f6e4bae0700 -1 received  signal: Hangup from  PID: 2372815 task name:
>>> killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  
>>> UID: 0
>>> Aug 04 06:25:01 scvirt02 ceph-mds[612704]: 2018-08-04 06:25:01.404619
>>> 7f6e4bae0700 -1 received  signal: Hangup from  PID: 2919837 task name:
>>> killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  
>>> UID: 0
>>> Aug 05 06:25:01 scvirt02 ceph-mds[612704]: 2018-08-05 06:25:01.214749
>>> 7f6e4bae0700 -1 received  signal: Hangup from  PID: 3467000 task name:
>>> killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  
>>> UID: 0
>>> Aug 06 06:25:01 scvirt02 ceph-mds[612704]: 2018-08-06 06:25:01.149512
>>> 7f6e4bae0700 -1 received  signal: Hangup from  PID: 4014197 task name:
>>> killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  
>>> UID: 0
>>> Aug 07 06:25:01 scvirt02 ceph-mds[612704]: 2018-08-07 06:25:01.863104
>>> 7f6e4bae0700 -1 received  signal: Hangup from  PID: 367698 task name:
>>> killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  
>>> UID: 0
>>>
>>> 2nd mds:
>>>
>>> root@scvirt03:/home/urzadmin# systemctl status -l ceph-mds@$(hostname
>>> -s).service
>>> ceph-mds@scvirt03.service - Ceph metadata server daemon
>>>       Loaded: loaded (/lib/systemd/system/ceph-mds@.service; enabled;
>>> vendor preset: enabled)
>>>      Drop-In: /lib/systemd/system/ceph-mds@.service.d
>>>               └─ceph-after-pve-cluster.conf
>>>       Active: inactive (dead) since Tue 2018-08-07 10:27:18 CEST; 1h
>>> 38min ago
>>>      Process: 3620063 ExecStart=/usr/bin/ceph-mds -f --cluster ${CLUSTER}
>>>               --id scvirt03 --setuser ceph --setgroup ceph
>>>               (code=exited, status=0/SUCCESS)
>>>     Main PID: 3620063 (code=exited, status=0/SUCCESS)
>>>
>>> Aug 07 10:27:17 scvirt03 systemd[1]: Started Ceph metadata server daemon.
>>> Aug 07 10:27:18 scvirt03 ceph-mds[3620063]: starting mds.scvirt03 at -
>>> Aug 07 10:27:18 scvirt03 ceph-mds[3620063]: 2018-08-07 10:27:18.008338
>>> 7f6be03816c0 -1 auth: unable to find a keyring on
>>> /etc/pve/priv/ceph.mds.scvirt03.keyring: (13) Permission denied
>>> Aug 07 10:27:18 scvirt03 ceph-mds[3620063]: 2018-08-07 10:27:18.008351
>>> 7f6be03816c0 -1 mds.scvirt03 ERROR: failed to get monmap: (13)
>>> Permission denied
>>>
>>>
>>> content of /etc/pve/priv
>>>
>>> root@scvirt03:/home/urzadmin# ls -la /etc/pve/priv/
>>> total 5
>>> drwx------ 2 root www-data    0 Apr 15  2017 .
>>> drwxr-xr-x 2 root www-data    0 Jan  1  1970 ..
>>> -rw------- 1 root www-data 1675 Apr 15  2017 authkey.key
>>> -rw------- 1 root www-data 1976 Jul  6 15:41 authorized_keys
>>> drwx------ 2 root www-data    0 Apr 16  2017 ceph
>>> -rw------- 1 root www-data   63 Apr 15  2017 ceph.client.admin.keyring
>>> -rw------- 1 root www-data  214 Apr 15  2017 ceph.mon.keyring
>>> -rw------- 1 root www-data 4224 Jul  6 15:41 known_hosts
>>> drwx------ 2 root www-data    0 Apr 15  2017 lock
>>> -rw------- 1 root www-data 3243 Apr 15  2017 pve-root-ca.key
>>> -rw------- 1 root www-data    3 Jul  6 15:41 pve-root-ca.srl
>>> -rw------- 1 root www-data   36 May 23 13:03 urzbackup.cred
>>>
>>>
>>> What could be the reason this failure?
>>>
>> The ceph user has no permission to access the keyring under /etc/pve.
>> Add an [mds] section to ceph.conf pointing to the keyring, similar to
>> the OSD one. This way the MDS will find the key in its working
>> directory.
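>>
>> Something like this should do it, mirroring the existing [osd] entry
>> (untested sketch; $id expands to the daemon id, here scvirt03):
>>
>> [mds]
>>          keyring = /var/lib/ceph/mds/ceph-$id/keyring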
>>
>> Cheers,
>> Alwin
>>



