[PVE-User] Multicast problems with Intel X540 - 10Gtek network card?

Ronny Aasen ronny+pve-user at aasen.cx
Tue Dec 4 20:03:17 CET 2018


vmbr10 is a bridge (or a switch by another name).
If you want the switch to work reliably with multicast, you probably need
to enable the multicast querier:

echo 1 > /sys/devices/virtual/net/vmbr0/bridge/multicast_querier

Or you can disable snooping, so that the bridge treats multicast like broadcast:

echo 0 > /sys/devices/virtual/net/vmbr0/bridge/multicast_snooping
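
Note that values written to /sys do not survive a reboot. One way to make
whichever setting you choose persistent (a rough sketch, assuming ifupdown
and the vmbr0 stanza quoted further down in this thread) is a post-up line
in /etc/network/interfaces:

auto vmbr0
iface vmbr0 inet static
    address  192.168.0.201
    netmask  255.255.255.0
    gateway  192.168.0.100
    bridge_ports eth4
    bridge_stp off
    bridge_fd 0
    # pick one of the two, matching the commands above
    post-up echo 1 > /sys/devices/virtual/net/vmbr0/bridge/multicast_querier
    # post-up echo 0 > /sys/devices/virtual/net/vmbr0/bridge/multicast_snooping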

This problem with multicast traffic may also lead to unreliable IPv6 ND
and RA behaviour.
https://pve.proxmox.com/wiki/Multicast_notes has some more notes and
examples around multicast_querier.
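
To see what a bridge is currently doing, the state can be read straight
from sysfs, and the long omping test quoted below can be repeated
afterwards to confirm multicast stays up (a small sketch, using the same
hosts as in the original mail):

# show querier/snooping state (0 or 1) for every bridge on the node
grep -H . /sys/devices/virtual/net/*/bridge/multicast_querier \
          /sys/devices/virtual/net/*/bridge/multicast_snooping

# repeat on all three nodes at roughly the same time
omping -c 600 -i 1 -F -q proxmox1 proxmox3 proxmox4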

kind regards
Ronny Aasen



On 04.12.2018 17:54, Eneko Lacunza wrote:
> Hi all,
>
> Seems I found the solution.
>
> eth3 on proxmox1 is a Broadcom 1 Gbit card connected to the HPE switch; it 
> is untagged in VLAN 10 on the switch end.
>
> I changed the vmbr10 bridge to use eth4.10 on the X540 card, and after 
> ifdown/ifup and restarting corosync and pve-cluster everything now seems 
> good; the cluster is stable and omping is happy too after 10 minutes :)
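>
> Roughly, the new vmbr10 stanza looks like this (just a sketch of the 
> change; eth4.10 is the VLAN 10 subinterface on the X540 port, addresses 
> as in the config quoted below):
>
> auto vmbr10
> iface vmbr10 inet static
>     address  192.168.10.201
>     netmask  255.255.255.0
>     bridge_ports eth4.10
>     bridge_stp off
>     bridge_fd 0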
>
> It is strange because multicast is on VLAN 1 network...
>
> Cheers and thanks a lot
> Eneko
>
> On 4/12/18 at 16:18, Eneko Lacunza wrote:
>>
>> hi Marcus,
>>
>> On 4/12/18 at 16:09, Marcus Haarmann wrote:
>>> Hi,
>>>
>>> you did not provide details about your configuration.
>>> How is the network card set up? Bonding?
>>> Send your /etc/network/interfaces details.
>>> If bonding is active, check if the mode is correct in 
>>> /proc/net/bonding.
>>> We encountered differences between the /etc/network/interfaces setup and 
>>> the resulting mode.
>>> Also check your switch configuration, VLAN setup, MTU, etc.
>> Yes, sorry about that. I have double-checked the switch and all 3 
>> nodes' SFP+ ports have the same configuration.
>>
>> /etc/network/interfaces on the proxmox1 node:
>> auto lo
>> iface lo inet loopback
>> iface eth0 inet manual
>> iface eth1 inet manual
>> iface eth2 inet manual
>> iface eth3 inet manual
>> iface eth4 inet manual
>> iface eth5 inet manual
>>
>> auto vmbr10
>> iface vmbr10 inet static
>>     address  192.168.10.201
>>     netmask  255.255.255.0
>>     bridge_ports eth3
>>     bridge_stp off
>>     bridge_fd 0
>>
>> auto vmbr0
>> iface vmbr0 inet static
>>     address  192.168.0.201
>>     netmask  255.255.255.0
>>     gateway  192.168.0.100
>>     bridge_ports eth4
>>     bridge_stp off
>>     bridge_fd 0
>>
>> auto eth4.100
>> iface eth4.100 inet static
>>     address 10.0.2.1
>>     netmask 255.255.255.0
>>     up ip addr add 10.0.3.1/24 dev eth4.100
>>
>> The cluster is running on the vmbr0 network (192.168.0.0/24).
>>
>> Cheers
>>
>>>
>>> Marcus Haarmann
>>>
>>>
>>> From: "Eneko Lacunza" <elacunza at binovo.es>
>>> To: "pve-user" <pve-user at pve.proxmox.com>
>>> Sent: Tuesday, 4 December 2018 15:57:10
>>> Subject: [PVE-User] Multicast problems with Intel X540 - 10Gtek 
>>> network card?
>>>
>>> Hi all,
>>>
>>> We have just updated a 3-node Proxmox cluster from 3.4 to 5.2, Ceph
>>> Hammer to Luminous, and the network from 1 Gbit to 10 Gbit... one of the
>>> three Proxmox nodes is new too :)
>>>
>>> Generally all was good and VMs are working well. :-)
>>>
>>> BUT, we have some problems with the cluster; the proxmox1 node joins and
>>> then drops from the cluster after about 4 minutes.
>>>
>>> All multicast tests
>>> https://pve.proxmox.com/wiki/Multicast_notes#Using_omping_to_test_multicast 
>>>
>>> run fine except the last one:
>>>
>>> *** proxmox1:
>>>
>>> root at proxmox1:~# omping -c 600 -i 1 -F -q proxmox1 proxmox3 proxmox4
>>>
>>> proxmox3 : waiting for response msg
>>>
>>> proxmox4 : waiting for response msg
>>>
>>> proxmox3 : joined (S,G) = (*, 232.43.211.234), pinging
>>>
>>> proxmox4 : joined (S,G) = (*, 232.43.211.234), pinging
>>>
>>> proxmox3 : given amount of query messages was sent
>>>
>>> proxmox4 : given amount of query messages was sent
>>>
>>> proxmox3 : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev 
>>> = 0.073/0.184/0.390/0.061
>>>
>>> proxmox3 : multicast, xmt/rcv/%loss = 600/262/56%, 
>>> min/avg/max/std-dev = 0.092/0.207/0.421/0.068
>>>
>>> proxmox4 : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev 
>>> = 0.049/0.167/0.369/0.059
>>>
>>> proxmox4 : multicast, xmt/rcv/%loss = 600/262/56%, 
>>> min/avg/max/std-dev = 0.063/0.185/0.386/0.064
>>>
>>>
>>> *** proxmox3:
>>>
>>> root at proxmox3:/etc# omping -c 600 -i 1 -F -q proxmox1 proxmox3 proxmox4
>>>
>>> proxmox1 : waiting for response msg
>>>
>>> proxmox4 : waiting for response msg
>>>
>>> proxmox4 : joined (S,G) = (*, 232.43.211.234), pinging
>>>
>>> proxmox1 : waiting for response msg
>>>
>>> proxmox1 : joined (S,G) = (*, 232.43.211.234), pinging
>>>
>>> proxmox4 : given amount of query messages was sent
>>>
>>> proxmox1 : given amount of query messages was sent
>>>
>>> proxmox1 : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev 
>>> = 0.083/0.193/1.030/0.055
>>>
>>> proxmox1 : multicast, xmt/rcv/%loss = 600/600/0%, 
>>> min/avg/max/std-dev = 0.102/0.209/1.050/0.054
>>>
>>> proxmox4 : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev 
>>> = 0.041/0.108/0.172/0.026
>>>
>>> proxmox4 : multicast, xmt/rcv/%loss = 600/600/0%, 
>>> min/avg/max/std-dev = 0.048/0.123/0.190/0.030
>>>
>>>
>>> *** root at proxmox4:~# omping -c 600 -i 1 -F -q proxmox1 proxmox3 
>>> proxmox4
>>>
>>> proxmox1 : waiting for response msg
>>>
>>> proxmox3 : waiting for response msg
>>>
>>> proxmox1 : waiting for response msg
>>>
>>> proxmox3 : waiting for response msg
>>>
>>> proxmox3 : joined (S,G) = (*, 232.43.211.234), pinging
>>>
>>> proxmox1 : joined (S,G) = (*, 232.43.211.234), pinging
>>>
>>> proxmox1 : given amount of query messages was sent
>>>
>>> proxmox3 : given amount of query messages was sent
>>>
>>> proxmox1 : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev 
>>> = 0.085/0.188/0.356/0.040
>>>
>>> proxmox1 : multicast, xmt/rcv/%loss = 600/600/0%, 
>>> min/avg/max/std-dev = 0.114/0.208/0.377/0.041
>>>
>>> proxmox3 : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev 
>>> = 0.048/0.117/0.289/0.023
>>>
>>> proxmox3 : multicast, xmt/rcv/%loss = 600/600/0%, 
>>> min/avg/max/std-dev = 0.064/0.134/0.290/0.026
>>>
>>>
>>> OK, so it seems we have a network problem on the proxmox1 node. The
>>> network cards are as follows:
>>>
>>> - proxmox1: Intel X540 (10Gtek)
>>> - proxmox3: Intel X710 (Intel)
>>> - proxmox4: Intel X710 (Intel)
>>>
>>> Switch is Dell N1224T-ON.
>>>
>>> Does anyone have experience with Intel X540 chip network cards, the Linux
>>> ixgbe network driver, or the manufacturer 10Gtek?
>>>
>>> If we change corosync communication to the 1 Gbit network cards (Broadcom)
>>> connected to an old HPE 1800-24G switch, the cluster is stable...
>>>
>>> We also have a running cluster with a Dell N1224T-ON switch and X710
>>> network cards without issues.
>>>
>>> Thanks a lot
>>> Eneko
>>>
>>>
>>
>>
>
>



