[PVE-User] Multicast problems with Intel X540 - 10Gtek network card?

Eneko Lacunza elacunza at binovo.es
Wed Dec 5 11:08:33 CET 2018


Hi Ronny,

On 4/12/18 at 20:03, Ronny Aasen wrote:
> vmbr10 is a bridge (or a switch by another name)
It is a bridge in proxmox1 host.
> If you want the switch to work reliably with multicast, you probably 
> need to enable the multicast querier:
> echo 1 > /sys/devices/virtual/net/vmbr0/bridge/multicast_querier
I tried enabling the querier on the Dell switch, but it didn't fix the issue...
>
> Or you can disable snooping, so that it treats multicast as broadcast:
> echo 0 > /sys/devices/virtual/net/vmbr0/bridge/multicast_snooping
>
> This problem with multicast traffic may also lead to unreliable IPv6 
> ND and ND-RA usage.
> https://pve.proxmox.com/wiki/Multicast_notes has some more notes and 
> examples around multicast_querier
>

Yes, I read that page, but the issue didn't make sense to me (and still 
doesn't... ;)  )
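
For the archive: if either of those settings does turn out to be needed, a 
minimal sketch of making it persistent (assuming the bridge is vmbr0, as in 
my config quoted below) is a post-up line in /etc/network/interfaces:

auto vmbr0
iface vmbr0 inet static
    address  192.168.0.201
    netmask  255.255.255.0
    gateway  192.168.0.100
    bridge_ports eth4
    bridge_stp off
    bridge_fd 0
    # let the bridge act as IGMP querier so snooping tables stay populated
    post-up echo 1 > /sys/devices/virtual/net/vmbr0/bridge/multicast_querier
    # ...or instead disable snooping, flooding multicast like broadcast:
    # post-up echo 0 > /sys/devices/virtual/net/vmbr0/bridge/multicast_snooping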

It seems that using a network port on another switch, although not used 
for multicast, was confusing something...
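
In case it is useful to anyone else, the vmbr10 stanza now looks roughly 
like this (a sketch; only bridge_ports changed, the address is the same as 
in the old eth3 setup quoted below):

auto vmbr10
iface vmbr10 inet static
    address  192.168.10.201
    netmask  255.255.255.0
    # VLAN 10 sub-interface on the X540 port instead of the Broadcom eth3
    bridge_ports eth4.10
    bridge_stp off
    bridge_fd 0

After that it was an ifdown/ifup of vmbr10 and, roughly, "systemctl restart 
corosync pve-cluster" on the node.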

Thanks a lot
Eneko

> kind regards
> Ronny Aasen
>
>
>
> On 04.12.2018 17:54, Eneko Lacunza wrote:
>> Hi all,
>>
>> Seems I found the solution.
>>
>> eth3 on proxmox1 is a Broadcom 1 Gbit card connected to the HPE switch; 
>> it is VLAN 10 untagged on the switch end.
>>
>> I changed the vmbr10 bridge to use eth4.10 on the X540 card, and 
>> after an ifdown/ifup and a restart of corosync and pve-cluster, 
>> everything now seems good; the cluster is stable and omping is happy 
>> too after 10 minutes :)
>>
>> It is strange, because the multicast traffic is on the VLAN 1 network...
>>
>> Cheers and thanks a lot
>> Eneko
>>
>> On 4/12/18 at 16:18, Eneko Lacunza wrote:
>>>
>>> Hi Marcus,
>>>
>>> On 4/12/18 at 16:09, Marcus Haarmann wrote:
>>>> Hi,
>>>>
>>>> You did not provide details about your configuration.
>>>> How is the network card set up? Bonding?
>>>> Send your /etc/network/interfaces details.
>>>> If bonding is active, check if the mode is correct in 
>>>> /proc/net/bonding.
>>>> We encountered differences between /etc/network/interfaces setup 
>>>> and resulting mode.
>>>> Also, check your switch configuration, VLAN setup, MTU etc.
>>> Yes, sorry about that. I have double-checked the switch and all 3 
>>> nodes' SFP+ ports have the same configuration.
>>>
>>> /etc/network/interfaces on the proxmox1 node:
>>> auto lo
>>> iface lo inet loopback
>>> iface eth0 inet manual
>>> iface eth1 inet manual
>>> iface eth2 inet manual
>>> iface eth3 inet manual
>>> iface eth4 inet manual
>>> iface eth5 inet manual
>>>
>>> auto vmbr10
>>> iface vmbr10 inet static
>>>     address  192.168.10.201
>>>     netmask  255.255.255.0
>>>     bridge_ports eth3
>>>     bridge_stp off
>>>     bridge_fd 0
>>>
>>> auto vmbr0
>>> iface vmbr0 inet static
>>>     address  192.168.0.201
>>>     netmask  255.255.255.0
>>>     gateway  192.168.0.100
>>>     bridge_ports eth4
>>>     bridge_stp off
>>>     bridge_fd 0
>>>
>>> auto eth4.100
>>> iface eth4.100 inet static
>>>     address 10.0.2.1
>>>     netmask 255.255.255.0
>>>     up ip addr add 10.0.3.1/24 dev eth4.100
>>>
>>> Cluster is running on vmbr0 network (192.168.0.0/24)
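
(To double-check which network/address corosync is actually bound to, these 
generic commands should help; nothing here is specific to this setup:)

# show corosync ring status and the locally bound address
corosync-cfgtool -s
# show cluster membership and quorum as Proxmox sees it
pvecm status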
>>>
>>> Cheers
>>>
>>>>
>>>> Marcus Haarmann
>>>>
>>>>
>>>> From: "Eneko Lacunza" <elacunza at binovo.es>
>>>> To: "pve-user" <pve-user at pve.proxmox.com>
>>>> Sent: Tuesday, 4 December 2018 15:57:10
>>>> Subject: [PVE-User] Multicast problems with Intel X540 - 10Gtek 
>>>> network card?
>>>>
>>>> Hi all,
>>>>
>>>> We have just updated a 3-node Proxmox cluster from 3.4 to 5.2, Ceph
>>>> Hammer to Luminous, and the network from 1 Gbit to 10 Gbit... one of the
>>>> three Proxmox nodes is new too :)
>>>>
>>>> Generally all was good and VMs are working well. :-)
>>>>
>>>> BUT, we have some problems with the cluster; the proxmox1 node joins and
>>>> then drops from the cluster after about 4 minutes.
>>>>
>>>> All multicast tests
>>>> (https://pve.proxmox.com/wiki/Multicast_notes#Using_omping_to_test_multicast)
>>>> run fine except the last one:
>>>>
>>>> *** proxmox1:
>>>>
>>>> root at proxmox1:~# omping -c 600 -i 1 -F -q proxmox1 proxmox3 proxmox4
>>>>
>>>> proxmox3 : waiting for response msg
>>>>
>>>> proxmox4 : waiting for response msg
>>>>
>>>> proxmox3 : joined (S,G) = (*, 232.43.211.234), pinging
>>>>
>>>> proxmox4 : joined (S,G) = (*, 232.43.211.234), pinging
>>>>
>>>> proxmox3 : given amount of query messages was sent
>>>>
>>>> proxmox4 : given amount of query messages was sent
>>>>
>>>> proxmox3 : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev 
>>>> = 0.073/0.184/0.390/0.061
>>>>
>>>> proxmox3 : multicast, xmt/rcv/%loss = 600/262/56%, 
>>>> min/avg/max/std-dev = 0.092/0.207/0.421/0.068
>>>>
>>>> proxmox4 : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev 
>>>> = 0.049/0.167/0.369/0.059
>>>>
>>>> proxmox4 : multicast, xmt/rcv/%loss = 600/262/56%, 
>>>> min/avg/max/std-dev = 0.063/0.185/0.386/0.064
>>>>
>>>>
>>>> *** proxmox3:
>>>>
>>>> root at proxmox3:/etc# omping -c 600 -i 1 -F -q proxmox1 proxmox3 proxmox4
>>>>
>>>> proxmox1 : waiting for response msg
>>>>
>>>> proxmox4 : waiting for response msg
>>>>
>>>> proxmox4 : joined (S,G) = (*, 232.43.211.234), pinging
>>>>
>>>> proxmox1 : waiting for response msg
>>>>
>>>> proxmox1 : joined (S,G) = (*, 232.43.211.234), pinging
>>>>
>>>> proxmox4 : given amount of query messages was sent
>>>>
>>>> proxmox1 : given amount of query messages was sent
>>>>
>>>> proxmox1 : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev 
>>>> = 0.083/0.193/1.030/0.055
>>>>
>>>> proxmox1 : multicast, xmt/rcv/%loss = 600/600/0%, 
>>>> min/avg/max/std-dev = 0.102/0.209/1.050/0.054
>>>>
>>>> proxmox4 : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev 
>>>> = 0.041/0.108/0.172/0.026
>>>>
>>>> proxmox4 : multicast, xmt/rcv/%loss = 600/600/0%, 
>>>> min/avg/max/std-dev = 0.048/0.123/0.190/0.030
>>>>
>>>>
>>>> *** proxmox4:
>>>>
>>>> root at proxmox4:~# omping -c 600 -i 1 -F -q proxmox1 proxmox3 proxmox4
>>>>
>>>> proxmox1 : waiting for response msg
>>>>
>>>> proxmox3 : waiting for response msg
>>>>
>>>> proxmox1 : waiting for response msg
>>>>
>>>> proxmox3 : waiting for response msg
>>>>
>>>> proxmox3 : joined (S,G) = (*, 232.43.211.234), pinging
>>>>
>>>> proxmox1 : joined (S,G) = (*, 232.43.211.234), pinging
>>>>
>>>> proxmox1 : given amount of query messages was sent
>>>>
>>>> proxmox3 : given amount of query messages was sent
>>>>
>>>> proxmox1 : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev 
>>>> = 0.085/0.188/0.356/0.040
>>>>
>>>> proxmox1 : multicast, xmt/rcv/%loss = 600/600/0%, 
>>>> min/avg/max/std-dev = 0.114/0.208/0.377/0.041
>>>>
>>>> proxmox3 : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev 
>>>> = 0.048/0.117/0.289/0.023
>>>>
>>>> proxmox3 : multicast, xmt/rcv/%loss = 600/600/0%, 
>>>> min/avg/max/std-dev = 0.064/0.134/0.290/0.026
>>>>
>>>>
>>>> OK, so it seems we have a network problem on the proxmox1 node. The
>>>> network cards are as follows:
>>>>
>>>> - proxmox1: Intel X540 (10Gtek)
>>>> - proxmox3: Intel X710 (Intel)
>>>> - proxmox4: Intel X710 (Intel)
>>>>
>>>> Switch is Dell N1224T-ON.
>>>>
>>>> Does anyone have experience with Intel X540-based network cards, the
>>>> Linux ixgbe network driver, or the manufacturer 10Gtek?
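
(Side note, in case it helps anyone debugging something similar: a quick, 
generic way to see which driver and firmware the card reports. The interface 
name is an assumption, taking eth4 as one of the X540 ports on proxmox1:)

# list the Ethernet controllers the kernel sees
lspci | grep -i ethernet
# show driver, driver version and NIC firmware version (ixgbe for the X540)
ethtool -i eth4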
>>>>
>>>> If we move corosync communication to the 1 Gbit network cards (Broadcom)
>>>> connected to an old HPE 1800-24G switch, the cluster is stable...
>>>>
>>>> We also have a cluster with a Dell N1224T-ON switch and X710 network
>>>> cards running without issues.
>>>>
>>>> Thanks a lot
>>>> Eneko
>>>>
>>>>
>>>
>>>
>>
>>
>
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user


-- 
Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943569206
Astigarraga bidea 2, 2º izq. oficina 11; 20180 Oiartzun (Gipuzkoa)
www.binovo.es



