[PVE-User] Multicast problems with Intel X540 - 10Gtek network card?

Eneko Lacunza elacunza at binovo.es
Tue Dec 4 16:18:25 CET 2018


Hi Marcus,

On 4/12/18 at 16:09, Marcus Haarmann wrote:
> Hi,
>
> You did not provide details about your configuration.
> How is the network card set up? Bonding?
> Send your /etc/network/interfaces details.
> If bonding is active, check if the mode is correct in /proc/net/bonding.
> We have encountered differences between the /etc/network/interfaces setup and the resulting mode.
> Also check your switch configuration, VLAN setup, MTU, etc.
Yes, sorry about that. I have double-checked the switch and all 3 nodes'
SFP+ ports have the same configuration.
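
For reference, the port settings can be compared with something like the
following (eth4 is assumed here to be the SFP+ port behind vmbr0; adjust to
whichever interface actually carries the cluster traffic):

     ip -d link show eth4     # MTU and bridge membership
     ethtool eth4             # negotiated speed/duplex on the SFP+ link
     ethtool -i eth4          # driver (ixgbe for the X540) and firmware version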

/etc/network/interfaces on the proxmox1 node:
auto lo
iface lo inet loopback
iface eth0 inet manual
iface eth1 inet manual
iface eth2 inet manual
iface eth3 inet manual
iface eth4 inet manual
iface eth5 inet manual

auto vmbr10
iface vmbr10 inet static
     address  192.168.10.201
     netmask  255.255.255.0
     bridge_ports eth3
     bridge_stp off
     bridge_fd 0

auto vmbr0
iface vmbr0 inet static
     address  192.168.0.201
     netmask  255.255.255.0
     gateway  192.168.0.100
     bridge_ports eth4
     bridge_stp off
     bridge_fd 0

auto eth4.100
iface eth4.100 inet static
     address 10.0.2.1
     netmask 255.255.255.0
     up ip addr add 10.0.3.1/24 dev eth4.100

The cluster is running on the vmbr0 network (192.168.0.0/24).
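
In case it helps, the corosync binding and the bridge's multicast handling
can be double-checked with something like this (PVE 5.x default paths
assumed; vmbr0 is the cluster bridge here). As the PVE Multicast notes wiki
page explains, IGMP snooping without an active querier is a common reason
for multicast stopping after a few minutes:

     grep -E 'bindnetaddr|mcastaddr' /etc/pve/corosync.conf   # network and multicast group corosync uses
     corosync-cmapctl | grep totem.interface                  # runtime view of the same values
     cat /sys/class/net/vmbr0/bridge/multicast_snooping       # 1 = IGMP snooping enabled on the bridge
     cat /sys/class/net/vmbr0/bridge/multicast_querier        # 0 = bridge is not acting as IGMP querier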

Cheers

>
> Marcus Haarmann
>
>
> From: "Eneko Lacunza" <elacunza at binovo.es>
> To: "pve-user" <pve-user at pve.proxmox.com>
> Sent: Tuesday, 4 December 2018 15:57:10
> Subject: [PVE-User] Multicast problems with Intel X540 - 10Gtek network card?
>
> Hi all,
>
> We have just updated a 3-node Proxmox cluster from 3.4 to 5.2, Ceph
> Hammer to Luminous and the network from 1 Gbit to 10 Gbit... one of the
> three Proxmox nodes is new too :)
>
> Generally all was good and VMs are working well. :-)
>
> BUT we have some problems with the cluster: the proxmox1 node joins and
> then, after about 4 minutes, drops from the cluster.
>
> All multicast tests
> https://pve.proxmox.com/wiki/Multicast_notes#Using_omping_to_test_multicast
> run fine except the last one:
>
> *** proxmox1:
>
> root at proxmox1:~# omping -c 600 -i 1 -F -q proxmox1 proxmox3 proxmox4
>
> proxmox3 : waiting for response msg
>
> proxmox4 : waiting for response msg
>
> proxmox3 : joined (S,G) = (*, 232.43.211.234), pinging
>
> proxmox4 : joined (S,G) = (*, 232.43.211.234), pinging
>
> proxmox3 : given amount of query messages was sent
>
> proxmox4 : given amount of query messages was sent
>
> proxmox3 : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.073/0.184/0.390/0.061
>
> proxmox3 : multicast, xmt/rcv/%loss = 600/262/56%, min/avg/max/std-dev = 0.092/0.207/0.421/0.068
>
> proxmox4 : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.049/0.167/0.369/0.059
>
> proxmox4 : multicast, xmt/rcv/%loss = 600/262/56%, min/avg/max/std-dev = 0.063/0.185/0.386/0.064
>
>
> *** proxmox3:
>
> root at proxmox3:/etc# omping -c 600 -i 1 -F -q proxmox1 proxmox3 proxmox4
>
> proxmox1 : waiting for response msg
>
> proxmox4 : waiting for response msg
>
> proxmox4 : joined (S,G) = (*, 232.43.211.234), pinging
>
> proxmox1 : waiting for response msg
>
> proxmox1 : joined (S,G) = (*, 232.43.211.234), pinging
>
> proxmox4 : given amount of query messages was sent
>
> proxmox1 : given amount of query messages was sent
>
> proxmox1 : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.083/0.193/1.030/0.055
>
> proxmox1 : multicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.102/0.209/1.050/0.054
>
> proxmox4 : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.041/0.108/0.172/0.026
>
> proxmox4 : multicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.048/0.123/0.190/0.030
>
>
> *** proxmox4:
>
> root at proxmox4:~# omping -c 600 -i 1 -F -q proxmox1 proxmox3 proxmox4
>
> proxmox1 : waiting for response msg
>
> proxmox3 : waiting for response msg
>
> proxmox1 : waiting for response msg
>
> proxmox3 : waiting for response msg
>
> proxmox3 : joined (S,G) = (*, 232.43.211.234), pinging
>
> proxmox1 : joined (S,G) = (*, 232.43.211.234), pinging
>
> proxmox1 : given amount of query messages was sent
>
> proxmox3 : given amount of query messages was sent
>
> proxmox1 : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.085/0.188/0.356/0.040
>
> proxmox1 : multicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.114/0.208/0.377/0.041
>
> proxmox3 : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.048/0.117/0.289/0.023
>
> proxmox3 : multicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.064/0.134/0.290/0.026
>
>
> OK, so it seems we have a network problem on the proxmox1 node. The network
> cards are as follows:
>
> - proxmox1: Intel X540 (10Gtek)
> - proxmox3: Intel X710 (Intel)
> - proxmox4: Intel X710 (Intel)
>
> Switch is Dell N1224T-ON.
>
> Does anyone have experience with Intel X540-based network cards, the Linux
> ixgbe driver, or the manufacturer 10Gtek?
>
> If we move corosync communication to the 1 Gbit network cards (Broadcom)
> connected to an old HPE 1800-24G switch, the cluster is stable...
>
> We also have a cluster running with a Dell N1224T-ON switch and X710
> network cards without issues.
>
> Thanks a lot
> Eneko
>
>


-- 
Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943569206
Astigarraga bidea 2, 2º izq. oficina 11; 20180 Oiartzun (Gipuzkoa)
www.binovo.es



