[PVE-User] Proxmox CEPH 6 servers failures!

Gilberto Nunes gilberto.nunes32 at gmail.com
Fri Oct 5 18:26:39 CEST 2018


>> How many monitor nodes do you have, and where are they located?

Before
SIDE-A
pve-ceph01 - 1 mon
pve-ceph02 - 1 mon
pve-ceph03 - 1 mon

SIDE-B
pve-ceph04 - 1 mon
pve-ceph05 - 1 mon
pve-ceph06 - 1 mon

Now
SIDE-A
pve-ceph01 - 1 mon
pve-ceph02 - 1 mon
pve-ceph03 - 1 mon

SIDE-B
pve-ceph04 - 1 mon
pve-ceph05 - 1 mon
pve-ceph06 - < I removed this monitor >

https://imageshack.com/a/img923/4214/i2ugyC.png
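
For the record, this is roughly how the removal and the check can be done
(just a sketch; the exact pveceph subcommand differs between PVE versions,
so treat that part as an assumption):

# on pve-ceph06: remove the monitor via Proxmox
pveceph destroymon pve-ceph06    # newer releases: pveceph mon destroy pve-ceph06
# or with plain ceph: ceph mon remove pve-ceph06  (then stop the mon service)

# verify that the monmap now lists 5 mons and check the current quorum
ceph mon stat
ceph quorum_status --format json-pretty
ceph -s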


---
Gilberto Nunes Ferreira

(47) 3025-5907
(47) 99676-7530 - Whatsapp / Telegram

Skype: gilberto.nunes36





On Fri, 5 Oct 2018 at 13:16, Lindsay Mathieson <lindsay.mathieson at gmail.com> wrote:

> Your Ceph cluster requires quorum to operate, and that is based on your
> monitor nodes, not the OSD ones, which your earlier diagram doesn't detail.
>
> How many monitor nodes do you have, and where are they located?
>
> nb. You should only have an odd number of monitor nodes.
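>
> (Rough arithmetic behind that: a monitor quorum needs a strict majority,
> i.e. floor(N/2)+1 mons. With 6 mons that is 4, so losing the 3 mons on one
> side stops the whole cluster; with 5 mons it is 3, so the side that keeps
> its 3 monitors can stay up on its own.)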
>
>
> On 5/10/2018 10:53 PM, Gilberto Nunes wrote:
> > Folks...
> >
> > My CEPH servers are all in the same network: 10.10.10.0/24...
> > There is an optic-fiber channel between the buildings, buildA and buildB,
> > just to identify them!
> > When I first created the cluster, the 3 servers in buildB went down, and
> > the remaining ceph servers continued to work properly...
> > I do not understand why this cannot happen anymore!
> > Sorry if I sound like a newbie! I am still learning about this!
> > ---
> > Gilberto Nunes Ferreira
> >
> > (47) 3025-5907
> > (47) 99676-7530 - Whatsapp / Telegram
> >
> > Skype: gilberto.nunes36
> >
> >
> >
> >
> >
> > On Fri, 5 Oct 2018 at 09:44, Marcus Haarmann <marcus.haarmann at midoco.de> wrote:
> >
> >> Gilberto,
> >>
> >> the underlying problem is a ceph problem and not related to VMs or
> >> Proxmox.
> >> The ceph system requires a majority of monitor nodes to be active.
> >> Your setup seems to have 3 mon nodes, which results in a loss of quorum
> >> when two of these servers are gone.
> >> Check "ceph -s" on each side to see whether ceph reacts at all.
> >> If not, probably not enough mons are present.
> >>
> >> Also, when one side is down you should see that some OSD instances are
> >> missing.
> >> In this case, ceph might be up, but your VMs, which are spread over the
> >> OSD disks, might block because the primary storage is not accessible.
> >> The distribution of data over the OSD instances is steered by the crush
> >> map.
> >> You should make sure to have enough copies configured and the crush map
> >> set up in such a way that each side of your cluster holds at least one
> >> copy.
> >> In case the crush map is mis-configured, all copies of your data may be
> >> on the wrong side, resulting in proxmox not being able to access the VM
> >> data.
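> >>
> >> A rough sketch of what that could look like (the bucket names "side-a" /
> >> "side-b", the rule name and <pool> are only placeholders here):
> >>
> >> # group each building's hosts under its own datacenter bucket
> >> ceph osd crush add-bucket side-a datacenter
> >> ceph osd crush move side-a root=default
> >> ceph osd crush move pve-ceph01 datacenter=side-a
> >> # ...same for pve-ceph02/03, and a side-b bucket for pve-ceph04/05/06
> >>
> >> # rule (in the decompiled crush map) that puts 2 copies on each side
> >> rule replicated_two_sides {
> >>         id 1
> >>         type replicated
> >>         min_size 2
> >>         max_size 4
> >>         step take default
> >>         step choose firstn 2 type datacenter
> >>         step chooseleaf firstn 2 type host
> >>         step emit
> >> }
> >>
> >> # assign the rule and the sizes to the pool
> >> ceph osd pool set <pool> crush_rule replicated_two_sides
> >> ceph osd pool set <pool> size 4
> >> ceph osd pool set <pool> min_size 2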
> >>
> >> Marcus Haarmann
> >>
> >>
> >> Von: "Gilberto Nunes" <gilberto.nunes32 at gmail.com>
> >> An: "pve-user" <pve-user at pve.proxmox.com>
> >> Gesendet: Freitag, 5. Oktober 2018 14:31:20
> >> Betreff: Re: [PVE-User] Proxmox CEPH 6 servers failures!
> >>
> >> Nice... Perhaps if I create a VM on Proxmox01 and Proxmox02 and join
> >> those VMs to the Ceph cluster, could I solve the quorum problem?
> >> ---
> >> Gilberto Nunes Ferreira
> >>
> >> (47) 3025-5907
> >> (47) 99676-7530 - Whatsapp / Telegram
> >>
> >> Skype: gilberto.nunes36
> >>
> >>
> >>
> >>
> >>
> >> On Fri, 5 Oct 2018 at 09:23, dorsy <dorsyka at yahoo.com> wrote:
> >>
> >>> Your question has already been answered. You need a majority to have
> >>> quorum.
> >>> On 2018. 10. 05. 14:10, Gilberto Nunes wrote:
> >>>> Hi
> >>>> Perhaps this can help:
> >>>>
> >>>> https://imageshack.com/a/img921/6208/X7ha8R.png
> >>>>
> >>>> I was thinking about it, and perhaps if I deploy a VM on each side
> >>>> with Proxmox and add those VMs to the CEPH cluster, maybe this can
> >>>> help!
> >>>>
> >>>> thanks
> >>>> ---
> >>>> Gilberto Nunes Ferreira
> >>>>
> >>>> (47) 3025-5907
> >>>> (47) 99676-7530 - Whatsapp / Telegram
> >>>>
> >>>> Skype: gilberto.nunes36
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Fri, 5 Oct 2018 at 03:55, Alexandre DERUMIER <aderumier at odiso.com>
> >>>> wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> Can you resend your diagram, because it's impossible to read.
> >>>>>
> >>>>>
> >>>>> but you need to have quorum on the monitors to have the cluster
> >>>>> working.
> >>>>>
> >>>>> ----- Original Message -----
> >>>>> From: "Gilberto Nunes" <gilberto.nunes32 at gmail.com>
> >>>>> To: "proxmoxve" <pve-user at pve.proxmox.com>
> >>>>> Sent: Thursday, 4 October 2018 22:05:16
> >>>>> Subject: [PVE-User] Proxmox CEPH 6 servers failures!
> >>>>>
> >>>>> Hi there
> >>>>>
> >>>>> I have something like this:
> >>>>>
> >>>>> CEPH01 ----|                                       |---- CEPH04
> >>>>>            |                                       |
> >>>>> CEPH02 ----|------------- Optic Fiber -------------|---- CEPH05
> >>>>>            |                                       |
> >>>>> CEPH03 ----|                                       |---- CEPH06
> >>>>>
> >>>>> Sometimes, when the optic fiber does not work and just CEPH01, CEPH02
> >>>>> and CEPH03 remain, the entire cluster fails!
> >>>>> I cannot figure out the cause!
> >>>>>
> >>>>> ceph.conf
> >>>>>
> >>>>> [global]
> >>>>>      auth client required = cephx
> >>>>>      auth cluster required = cephx
> >>>>>      auth service required = cephx
> >>>>>      cluster network = 10.10.10.0/24
> >>>>>      fsid = e67534b4-0a66-48db-ad6f-aa0868e962d8
> >>>>>      keyring = /etc/pve/priv/$cluster.$name.keyring
> >>>>>      mon allow pool delete = true
> >>>>>      osd journal size = 5120
> >>>>>      osd pool default min size = 2
> >>>>>      osd pool default size = 3
> >>>>>      public network = 10.10.10.0/24
> >>>>>
> >>>>> [osd]
> >>>>>      keyring = /var/lib/ceph/osd/ceph-$id/keyring
> >>>>>
> >>>>> [mon.pve-ceph01]
> >>>>>      host = pve-ceph01
> >>>>>      mon addr = 10.10.10.100:6789
> >>>>>      mon osd allow primary affinity = true
> >>>>>
> >>>>> [mon.pve-ceph02]
> >>>>>      host = pve-ceph02
> >>>>>      mon addr = 10.10.10.110:6789
> >>>>>      mon osd allow primary affinity = true
> >>>>>
> >>>>> [mon.pve-ceph03]
> >>>>>      host = pve-ceph03
> >>>>>      mon addr = 10.10.10.120:6789
> >>>>>      mon osd allow primary affinity = true
> >>>>>
> >>>>> [mon.pve-ceph04]
> >>>>>      host = pve-ceph04
> >>>>>      mon addr = 10.10.10.130:6789
> >>>>>      mon osd allow primary affinity = true
> >>>>>
> >>>>> [mon.pve-ceph05]
> >>>>>      host = pve-ceph05
> >>>>>      mon addr = 10.10.10.140:6789
> >>>>>      mon osd allow primary affinity = true
> >>>>>
> >>>>> [mon.pve-ceph06]
> >>>>>      host = pve-ceph06
> >>>>>      mon addr = 10.10.10.150:6789
> >>>>>      mon osd allow primary affinity = true
> >>>>>
> >>>>> Any help will be welcome!
> >>>>>
> >>>>> ---
> >>>>> Gilberto Nunes Ferreira
> >>>>>
> >>>>> (47) 3025-5907
> >>>>> (47) 99676-7530 - Whatsapp / Telegram
> >>>>>
> >>>>> Skype: gilberto.nunes36
>
>
> --
> Lindsay
>


