[PVE-User] I lost the cluster communication in a 10 nodes cluster

proxmox at elchaka.de proxmox at elchaka.de
Sat Oct 27 10:39:35 CEST 2018


Hi Denis,

I don't know why it happens, but I would try switching to unicast. This helped me in the past when I had a similar issue. I thought it had to be an issue with the kernel, where some default value had changed...

- Mehmet 

On 19 October 2018 09:36:11 CEST, Alwin Antreich <aa at ipnerd.net> wrote:
>Hi,
>
>On Thu, Oct 18, 2018, 17:24 Denis Morejon <denis.morejon at etecsa.cu>
>wrote:
>
>> I lost the cluster communication again.
>>
>> I have been using Proxmox since version 1, and this is the first time it
>> bothers me so much!
>>
>> - All 10 nodes have the same version
>>
>> (pve-manager/5.2-9/4b30e8f9 (running kernel: 4.13.13-2-pve))
>>
>Is there a reason why you use an old kernel? 4.15.x is now the main
>kernel.
>
>
>> - They all have the same date / time (a time mismatch is one of the things
>> that could make it lose communication)
>>
>> - The environment is identical (no new switch, no new server)
>>
>>
>> And why did all these nodes lose communication at the same time? With 10
>> nodes, at least 5 would have to have problems to lose quorum and then the
>> connection. Is that true?
>>
>Actually, (10/2)-1 = 4 nodes can have trouble without losing quorum;
>one partition needs to be bigger than the other.
>
>
>> I think it is something related to this proxmox version.
>>
>> What to do ?
>>
>As Thomas stated, check your multicast traffic. Corosync uses multicast for
>its cluster communication, and the cluster filesystem sits on top of
>corosync. So, if corosync is not working, neither is pmxcfs.
>
>--
>Cheers,
>Alwin
>_______________________________________________
>pve-user mailing list
>pve-user at pve.proxmox.com
>https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
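
The quorum arithmetic from the quoted reply can be written out as a small sketch. It assumes one vote per node and a plain majority rule, with no special options such as a tie-breaker device; the function names are made up for illustration and are not part of any Proxmox or corosync API.

```python
def quorum_votes(total_nodes: int) -> int:
    """Votes needed for a strict majority (assumption: one vote per node)."""
    return total_nodes // 2 + 1


def max_failures(total_nodes: int) -> int:
    """How many nodes can drop out while the remainder still has quorum."""
    return total_nodes - quorum_votes(total_nodes)


def has_quorum(partition_size: int, total_nodes: int) -> bool:
    """True if a partition of this size holds a strict majority."""
    return partition_size >= quorum_votes(total_nodes)


if __name__ == "__main__":
    nodes = 10
    print("votes needed:", quorum_votes(nodes))
    print("tolerated failures:", max_failures(nodes))
    print("5/5 split keeps quorum:", has_quorum(5, nodes))
```

Under these assumptions a 10-node cluster needs 6 votes, so 4 nodes may fail, and an even 5/5 split leaves neither half quorate.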


