Multicast notes: Difference between revisions

From Proxmox VE
 
(55 intermediate revisions by 13 users not shown)
{{Note|Articles about Proxmox VE 2.0}}  
{{Note|Proxmox VE 6.x uses corosync with Kronosnet, which currently does not support multicast.}}


= Introduction =
 
== Introduction ==


Multicast allows a single transmission to be delivered to multiple servers at the same time.  


This is the basis for cluster communications in Proxmox VE 2.0.  
This is the basis for cluster communications in Proxmox VE 2.0 through Proxmox VE 5.4, which use corosync and cman, and it applies to any other solution that uses those clustering tools.
 
'''Note''': Proxmox VE 6.0 uses corosync 3 which switched out the underlying transport stack with Kronosnet (knet). Kronosnet currently only supports unicast.
 
If multicast does not work in your network infrastructure, you should fix it so that it does.  If all else fails, use unicast instead, but beware of the node count limitations with unicast.
 
=== IGMP snooping ===
 
IGMP snooping prevents flooding multicast traffic to all ports in the broadcast domain by only allowing traffic destined for ports which have solicited such traffic.  IGMP snooping is a feature offered by most major switch manufacturers and is often enabled by default on switches.  In order for a switch to properly snoop the IGMP traffic, there must be an IGMP querier on the network.  '''If no querier is present, IGMP snooping will actively prevent ALL IGMP/Multicast traffic from being delivered!'''
 
If IGMP snooping is disabled, all multicast traffic will be delivered to all ports which may add unnecessary load, potentially allowing a denial of service attack.
 
=== IGMP querier ===
 
An IGMP querier is a multicast router that generates IGMP queries.  IGMP snooping relies on these queries, which are unconditionally forwarded to all ports: the replies from the destination ports are what build the internal tables that tell the switch which traffic to forward.
 
An IGMP querier can be enabled on your router, on your switch, or even on Linux bridges.


If multicast does not work in your network infrastructure, use unicast instead.
== Configuring IGMP/Multicast ==


= Troubleshooting =
=== Ensuring IGMP Snooping and Querier are enabled on your network (recommended) ===


not all hosting companies allow multicast traffic.
==== Juniper - JunOS ====


Some switches have multicast disabled by default.
Juniper EX switches, by default, enable IGMP snooping on all vlans as can be seen by this config snippet:
<pre>
[edit protocols]
user@switch# show igmp-snooping
vlan all;
</pre>


== test if multicast is working between two nodes with omping==
However, the IGMP querier is not enabled by default.  If you are already using RVIs (Routed Virtual Interfaces) on your switch, you can enable IGMP v2 on the interface, which enables the querier.  However, most administrators do not use RVIs in all vlans on their switches, in which case the querier should be configured on the router instead.  The config setting below is the same on Juniper EX switches using RVIs as it is on Juniper SRX service gateways/routers, and effectively enables the IGMP querier on the specified interface/vlan.  Note that you must set this on all vlans which require multicast:
<pre>
set protocols igmp $iface version 2
</pre>


aptitude install omping
==== Cisco ====


start omping on all nodes with the following command and check the output, e.g:
On Cisco switches, IGMP snooping is enabled by default. You do have to enable an IGMP snooping querier though:
<pre>
ip igmp snooping querier
</pre>


omping node1 node2 node3
This will enable it for all vlans. You can verify that it is enabled:
<pre>
show ip igmp snooping querier
Vlan      IP Address              IGMP Version  Port           
-------------------------------------------------------------
1        172.16.34.4              v2            Switch                 
2        172.16.34.4              v2            Switch                 
3        172.16.34.4              v2            Switch                 
</pre>


== test if multicast is working between two nodes with ssmping==
==== HP - ProCurve ====


Copied from a post by e100 on forum .
HP ProCurve switches have IGMP disabled on all vlans by default, as can be seen with this command:
<pre>
# show ip igmp
</pre>


*this uses '''ssmping'''
Likewise, the IGMP querier is not enabled by default. When IGMP is enabled on a vlan, the ProCurve negotiates with other devices over which one becomes the querier; according to the RFC, the device with the lowest IP address wins.
Note you must set this on all vlans which require multicast! (vlan 30 used for demo):
<pre>
# conf t
(config)# vlan 30
(vlan-30)# ip igmp high-priority-forward
</pre>


Install this on all nodes .
To verify:
<pre>
# sh ip igmp 30       


  aptitude install ssmping
  Status and Counters - IP Multicast (IGMP) Status


run this on Node A:  
VLAN ID : 30
VLAN Name : Proxmox
Querier Address : This switch is Querier


  ssmpingd
  Active Group Addresses Reports Queries Querier Access Port
  ---------------------- ------- ------- -------------------
  239.192.105.237        214020 0                         
</pre>
====Netgear====
Using web/gui:


then on Node B:
=====Per VLAN=====
Enable IGMP snooping.


asmping 224.0.2.1 ip_for_NODE_A_here
Enable IGMP snooping on your VLANs under the IGMP VLAN configuration.


example output
Enable multicast router mode on the ports that uplink to the other switches.
<pre>asmping joined (S,G) = (*,224.0.2.234)
pinging 192.168.8.6 from 192.168.8.5
  unicast from 192.168.8.6, seq=1 dist=0 time=0.221 ms
  unicast from 192.168.8.6, seq=2 dist=0 time=0.229 ms
multicast from 192.168.8.6, seq=2 dist=0 time=0.261 ms
  unicast from 192.168.8.6, seq=3 dist=0 time=0.198 ms
multicast from 192.168.8.6, seq=3 dist=0 time=0.213 ms
  unicast from 192.168.8.6, seq=4 dist=0 time=0.234 ms
multicast from 192.168.8.6, seq=4 dist=0 time=0.248 ms
  unicast from 192.168.8.6, seq=5 dist=0 time=0.249 ms
multicast from 192.168.8.6, seq=5 dist=0 time=0.263 ms
  unicast from 192.168.8.6, seq=6 dist=0 time=0.250 ms
multicast from 192.168.8.6, seq=6 dist=0 time=0.264 ms
  unicast from 192.168.8.6, seq=7 dist=0 time=0.245 ms
multicast from 192.168.8.6, seq=7 dist=0 time=0.260 ms
</pre>
for more information see


man ssmping
Enable IGMP Querier


and
Leave the global address at 0.0.0.0.


less /usr/share/doc/ssmping/README.gz
Set instead a Querier IP address per VLAN under the Querier VLAN configuration (VLAN10=1.1.1.10 and VLAN15=1.1.1.15.  


== ssmping notes ==
Next switch VLAN10=2.2.2.10 and VLAN15=2.2.2.15, etc).


*there are a few other programs included in ssmping which may be of use. here is a list of the files in the package:
Make sure “Querier Election Participation Mode” is enabled for each VLAN.


apt-file list ssmping
OR
<pre>ssmping: /usr/bin/asmping
ssmping: /usr/bin/mcfirst
ssmping: /usr/bin/ssmping
ssmping: /usr/bin/ssmpingd
ssmping: /usr/share/doc/ssmping/README.gz
ssmping: /usr/share/doc/ssmping/changelog.Debian.gz
ssmping: /usr/share/doc/ssmping/copyright
ssmping: /usr/share/man/man1/asmping.1.gz
ssmping: /usr/share/man/man1/mcfirst.1.gz
ssmping: /usr/share/man/man1/ssmping.1.gz
ssmping: /usr/share/man/man1/ssmpingd.1.gz
</pre>
*If you want to use apt-file do this:


aptitude install apt-file
=====Global=====
apt-file update
Enable IGMP snooping.


then set up a cronjob to do ''apt-file update'' weekly or monthly ..
Enable IGMP snooping on your ports under the IGMP interface configuration.


== cman & iptables ==
Enable multicast router mode on the ports that uplink to the other switches.
In case ''cman'' crashes with ''cpg_send_message failed: 9'' add those to your rule set:
<pre>iptables -A INPUT -m addrtype --dst-type MULTICAST -j ACCEPT
iptables -A INPUT -p udp -m state --state NEW -m multiport --dports 5404,5405 -j ACCEPT
</pre>


= Use unicast instead of multicast =
Enable IGMP Querier


Unicast is a technology for sending messages to a single network destination. In corosync, unicast is implemented as UDP-unicast (UDPU). Because it generates more network traffic than multicast, the number of supported nodes is limited; do not use it with more than 4 cluster nodes.
Set a global Querier IP address (1.1.1.1 and next switch 2.2.2.2, etc.)


*just create the cluster as usual (pvecm create ...)
==== Brocade ====
*follow this howto to create a cluster.conf.new [[Fencing#General_HowTo_for_editing_the_cluster.conf]]
*add the new '''transport="udpu"''' in /etc/pve/cluster.conf.new
<source lang="xml"><cman keyfile="/var/lib/pve-cluster/corosync.authkey" transport="udpu"/></source>
*activate via GUI
*add all nodes you want to join in /etc/hosts and reboot
*before you add a node, make sure you add all other nodes in /etc/hosts


= Multicast with Infiniband =
==== Linux: Enabling Multicast querier on bridges ====
If your router or switch does not support enabling a multicast querier, and you are using a classic linux bridge (not Open vSwitch), then you can enable the multicast querier on the Linux bridge by adding this statement to your /etc/network/interfaces bridge configuration:
<pre>
  post-up ( echo 1 > /sys/devices/virtual/net/$IFACE/bridge/multicast_querier )
</pre>


IP over Infiniband (IPoIB) supports Multicast but Multicast traffic is limited to 2044 Bytes when using connected mode even if you set a larger MTU on the IPoIB interface.
=== Disabling IGMP Snooping (not recommended) ===


Corosync has a setting, netmtu, that defaults to 1500 making it compatible with connected mode Infiniband.
==== Juniper - JunOS ====
<pre>
set protocols igmp-snooping vlan all disable
</pre>


== Changing netmtu ==
==== Cisco Managed Switches  ====
<pre>
# conf t
# no ip igmp snooping
</pre>


Changing the netmtu can increase throughput '''The following information is untested.'''
==== HP - ProCurve ====
Disabling IGMP must be done on every vlan where it is enabled.
<pre>
# conf t
(config)# vlan 30
(vlan-30)# no ip igmp
</pre>


Edit the /etc/pve/cluster.conf file Add the section: <source lang="xml">
==== Linux: Disabling Multicast snooping on bridges ====
<totem netmtu="2044" />
</source>


<br> <source lang="xml">
Snooping should be enabled on either the router / switch or on the linux bridge, but it may not work if enabled on both.  If you have a hosting provider that has igmp snooping enabled on the multicast switch, it may be necessary to disable snooping on the linux bridge. In that case use:
<?xml version="1.0"?>
<pre>
<cluster name="clustername" config_version="2">
   post-up ( echo 1 > /sys/devices/virtual/net/$IFACE/bridge/multicast_querier )
   <totem netmtu="2044" />
   post-up ( echo 0 > /sys/class/net/$IFACE/bridge/multicast_snooping )
   <cman keyfile="/var/lib/pve-cluster/corosync.authkey">
</pre>
  </cman>


  <clusternodes>
== Multicast with Infiniband ==
  <clusternode name="node1" votes="1" nodeid="1"/>
  <clusternode name="node2" votes="1" nodeid="2"/>
  <clusternode name="node3" votes="1" nodeid="3"/></clusternodes>


</cluster>
IP over Infiniband (IPoIB) supports Multicast but Multicast traffic is limited to 2043 Bytes when using connected mode even if you set a larger MTU on the IPoIB interface.
</source>


<br>
Corosync has a setting, netmtu, that defaults to 1500 making it compatible with connected mode Infiniband.


= Netgear Managed Switches  =
== Using omping to test multicast ==


the following are pics of setting to get multicast working on our netgear 7300 series switches. for more information see http://documentation.netgear.com/gs700at/enu/202-10360-01/GS700AT%20Series%20UG-06-18.html
Start omping on all nodes with one of the following commands and check the output.
The precise version sends 10000 packets at an interval of 1 ms:
omping -c 10000 -i 0.001 -F -q node1 node2 node3
The crude version prints much more detail:
omping node1 node2 node3
To find the multicast address on Proxmox VE 4.x, run:
corosync-cmapctl -g totem.interface.0.mcastaddr
Then pass that multicast address to omping:
  omping -m yourmulticastadress node1 node2 node3


<br> [[Image:Multicast-netgear-1.png]]
== Troubleshooting ==
=== Diagnosis from first principles ===
These instructions assume you aren't using unicast UDP (transport="udpu"); I've tried to note where that will make a difference.


[[Image:Multicast-netgear-2.png]]
If you are already experiencing issues, the steps taken to diagnose the problem may make the problem worse in the short term.


[[Image:Multicast-netgear-3.png]]
If you have poor-quality (or even just misconfigured) ethernet switches, some of these tests may crash your entire network, but at least then you'll know where the source of the problem is...


[[File:NetGear-multicast-save-and-apply.png]]
# Ensure all the nodes are in the same subnet.
## If you aren't clear on networking, this boils down to: ''do all your nodes use the same IP address for their default gateway?''<br />
## If you are deliberately using UDPU transport, this is not a hard requirement, but even in that case, having your hosts in the same subnet will make your task significantly easier.
# Ensure all the nodes can (unicast) ping each other without any packet loss at moderately-high packet rates.
## Test using "ping -f".
## Your network needs to be robust enough to have '''''all''''' your nodes flood-pinging each other simultaneously with < 1% packet loss.
# Ensure all the nodes can resolve each other's hostnames.
## The previous test should have taken care of this if you used hostnames instead of IP addresses.
## Otherwise use nslookup(1) or dig(1) to test DNS, or host(1) or ping(1) to test if you're relying on /etc/hosts.
## Theoretically, this shouldn't matter if you're using multicast, but not having this right will likely cause hard-to-diagnose issues later.
# Ensure multicast works at high packet rates.  ''This does not apply if you are deliberately using UDPU.''
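## Run the precise omping test (omping -c 10000 -i 0.001 -F -q <list of all nodes>) described in the section "Using omping to test multicast".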
## You may want to use a parallel-SSH client of some sort to ensure omping starts up almost simultaneously on every node.  This will cause each host to send a multicast packet once per millisecond. 
## If this causes your ethernet switch to fail, consider upgrading your switch. 
## The final "%loss" number should be < 1%.
# Ensure multicast works for > 5 minutes at a time.  ''This does not apply if you are deliberately using UDPU.''
## Run "omping -c 600 -i 1 -q <list of all nodes>" on every node simultaneously (see above). 
## This test should take ten (10) minutes to run, which is twice as long as the default IGMPv2 leave timer, thus proving that IGMP snooping isn't the source of any problem.


If all of these tests have succeeded, and you are starting with freshly-installed Proxmox VE nodes, you should be able to form a multicast cluster without any issues.  See below for further notes on UDPU.


= Cisco Managed Switches  =
=== Use unicast (UDPU) instead of multicast, if all else fails ===


Some cisco switchs are a feature enabled by default : igmp snooping.
Unicast is a technology for sending messages to a single network destination. In corosync, unicast is implemented as UDP-unicast (UDPU). Because it generates more network traffic than multicast, the number of supported nodes is limited; do not use it with more than 4 cluster nodes.


These feature is used to filter multicast traffic, to avoid to forward it on each ports.
FYI: OVH is a good example of a hosting company where you may need to use UDPU instead of multicast, as your hosts will generally not be able to send or receive multicast traffic to/from each other.  This author wishes he knew what their network engineers were smoking, but since their network works well despite its strangeness, it must have been something '''really''' good.


But this can sometimes do problems with corosync, so it's better to disable it.
* Carefully read the entire corosync.conf(5) and votequorum(5) manpages.
* create the cluster as usual
* if needed, bring the initial node into quorate state with "pvecm e 1"
* if needed, [[Editing_corosync.conf|edit /etc/pve/corosync.conf]] (remember to increase the version number!); one of the PVE services will later copy it automatically to the local /etc/corosync/corosync.conf on each node [[https://forum.proxmox.com/threads/roles-of-the-different-corosync-conf-files-in-a-cluster.26894/]].
* in the totem{} stanza, add "transport: udpu"
* pre-add the nodes to the nodelist{} stanza.
* on each node : systemctl restart corosync  (if this command does not work, use killall -9 corosync )
* then, on each node : /etc/init.d/pve-cluster restart


'''Important Note:''' if the nodes are not in the same subnet, you will probably also have to edit '''bindnetaddr''' in the totem stanza and change it to "0.0.0.0" for the cluster to initialize.  It defaults to the IP of the first cluster member, and any other members in the same subnet will be able to initialize, but '''''members in a different subnet will see corosync unable to initialize''''' because it can't figure out an IP address to bind to.  There may be security implications to allowing corosync to bind to the wildcard address.
Simply commenting out the bindnetaddr line may also work equally well, then corosync will figure it out dynamically on each node.


For cisco 2960G, by example, you can disable it with:
[[Category:Troubleshooting]]
<pre>
# conf t
# no ip igmp snooping
</pre>

Latest revision as of 13:01, 22 July 2019

Note: Proxmox VE 6.x uses corosync with Kronosnet, which currently does not support multicast.


Introduction

Multicast allows a single transmission to be delivered to multiple servers at the same time.

This is the basis for cluster communications in Proxmox VE 2.0 through Proxmox VE 5.4, which use corosync and cman, and it applies to any other solution that uses those clustering tools.

Note: Proxmox VE 6.0 uses corosync 3 which switched out the underlying transport stack with Kronosnet (knet). Kronosnet currently only supports unicast.

If multicast does not work in your network infrastructure, you should fix it so that it does. If all else fails, use unicast instead, but beware of the node count limitations with unicast.

IGMP snooping

IGMP snooping prevents flooding multicast traffic to all ports in the broadcast domain by only allowing traffic destined for ports which have solicited such traffic. IGMP snooping is a feature offered by most major switch manufacturers and is often enabled by default on switches. In order for a switch to properly snoop the IGMP traffic, there must be an IGMP querier on the network. If no querier is present, IGMP snooping will actively prevent ALL IGMP/Multicast traffic from being delivered!

If IGMP snooping is disabled, all multicast traffic will be delivered to all ports which may add unnecessary load, potentially allowing a denial of service attack.

IGMP querier

An IGMP querier is a multicast router that generates IGMP queries. IGMP snooping relies on these queries, which are unconditionally forwarded to all ports: the replies from the destination ports are what build the internal tables that tell the switch which traffic to forward.

An IGMP querier can be enabled on your router, on your switch, or even on Linux bridges.
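
One way to check whether a querier is actually present is to capture IGMP traffic on a node and count the queries. The following is only a minimal sketch: the function name count_igmp_queries is made up for illustration, and it assumes that tcpdump prints such packets with the words "igmp query" in its text output.

```shell
# Count IGMP query packets in tcpdump text output read from stdin.
# Assumption: tcpdump describes these packets with the words "igmp query".
count_igmp_queries() {
    # grep -c prints 0 and returns non-zero when nothing matches,
    # so "|| true" keeps the function's exit status at 0.
    grep -c 'igmp query' || true
}
```

A possible use, assuming the node's bridge is called vmbr0: timeout 300 tcpdump -l -n -i vmbr0 igmp 2>/dev/null | count_igmp_queries. With the usual IGMPv2 query interval of 125 seconds, a result of 0 after five minutes suggests that no querier is active on that network.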

Configuring IGMP/Multicast

Ensuring IGMP Snooping and Querier are enabled on your network (recommended)

Juniper - JunOS

Juniper EX switches, by default, enable IGMP snooping on all vlans as can be seen by this config snippet:

[edit protocols]
user@switch# show igmp-snooping
vlan all;

However, the IGMP querier is not enabled by default. If you are already using RVIs (Routed Virtual Interfaces) on your switch, you can enable IGMP v2 on the interface, which enables the querier. However, most administrators do not use RVIs in all vlans on their switches, in which case the querier should be configured on the router instead. The config setting below is the same on Juniper EX switches using RVIs as it is on Juniper SRX service gateways/routers, and effectively enables the IGMP querier on the specified interface/vlan. Note that you must set this on all vlans which require multicast:

set protocols igmp $iface version 2

Cisco

On Cisco switches, IGMP snooping is enabled by default. You do have to enable an IGMP snooping querier though:

ip igmp snooping querier

This will enable it for all vlans. You can verify that it is enabled:

show ip igmp snooping querier 
Vlan      IP Address               IGMP Version   Port             
-------------------------------------------------------------
1         172.16.34.4              v2            Switch                   
2         172.16.34.4              v2            Switch                   
3         172.16.34.4              v2            Switch                   

HP - ProCurve

HP ProCurve switches have IGMP disabled on all vlans by default, as can be seen with this command:

# show ip igmp

Likewise, the IGMP querier is not enabled by default. When IGMP is enabled on a vlan, the ProCurve negotiates with other devices over which one becomes the querier; according to the RFC, the device with the lowest IP address wins. Note that you must set this on all vlans which require multicast (vlan 30 is used for the demo):

# conf t
(config)# vlan 30
(vlan-30)# ip igmp high-priority-forward

To verify:

# sh ip igmp 30        

 Status and Counters - IP Multicast (IGMP) Status

 VLAN ID : 30
 VLAN Name : Proxmox
 Querier Address : This switch is Querier

  Active Group Addresses Reports Queries Querier Access Port
  ---------------------- ------- ------- -------------------
  239.192.105.237        214020  0                          

Netgear

Using web/gui:

Per VLAN

Enable IGMP snooping.

Enable IGMP snooping on your VLANs under the IGMP VLAN configuration.

Enable multicast router mode on the ports that uplink to the other switches.

Enable IGMP Querier

Leave the global address at 0.0.0.0.

Instead, set a Querier IP address per VLAN under the Querier VLAN configuration (for example VLAN10=1.1.1.10 and VLAN15=1.1.1.15 on the first switch, VLAN10=2.2.2.10 and VLAN15=2.2.2.15 on the next switch, etc.).

Make sure “Querier Election Participation Mode” is enabled for each VLAN.

OR

Global

Enable IGMP snooping.

Enable IGMP snooping on your ports under the IGMP interface configuration.

Enable multicast router mode on the ports that uplink to the other switches.

Enable IGMP Querier

Set a global Querier IP address (1.1.1.1 and next switch 2.2.2.2, etc.)

Brocade

Linux: Enabling Multicast querier on bridges

If your router or switch does not support enabling a multicast querier, and you are using a classic linux bridge (not Open vSwitch), then you can enable the multicast querier on the Linux bridge by adding this statement to your /etc/network/interfaces bridge configuration:

  post-up ( echo 1 > /sys/devices/virtual/net/$IFACE/bridge/multicast_querier )
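
After the bridge has come up, the flags can be read back from sysfs to confirm the setting took effect. This is a small sketch rather than an official tool: the function name bridge_mcast_state and the bridge name vmbr0 are assumptions, and the second argument only exists so the function can be pointed at a different sysfs root for testing.

```shell
# Print the multicast querier and snooping flags of a Linux bridge.
# $1: bridge name (for example vmbr0, an assumed name)
# $2: optional sysfs root, defaults to /sys/class/net
bridge_mcast_state() (
    dir="${2:-/sys/class/net}/$1/bridge"
    if [ ! -d "$dir" ]; then
        echo "$1: not a bridge (no $dir)" >&2
        exit 1
    fi
    echo "querier=$(cat "$dir/multicast_querier") snooping=$(cat "$dir/multicast_snooping")"
)
```

Running bridge_mcast_state vmbr0 on a node should print querier=1 once the post-up line above has been applied.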

Disabling IGMP Snooping (not recommended)

Juniper - JunOS

set protocols igmp-snooping vlan all disable

Cisco Managed Switches

# conf t
# no ip igmp snooping

HP - ProCurve

Disabling IGMP must be done on every vlan where it is enabled.

# conf t
(config)# vlan 30
(vlan-30)# no ip igmp

Linux: Disabling Multicast snooping on bridges

Snooping should be enabled on either the router/switch or on the Linux bridge, but it may not work if enabled on both. If your hosting provider has IGMP snooping enabled on the multicast switch, it may be necessary to disable snooping on the Linux bridge. In that case use:

  post-up ( echo 1 > /sys/devices/virtual/net/$IFACE/bridge/multicast_querier )
  post-up ( echo 0 > /sys/class/net/$IFACE/bridge/multicast_snooping )

Multicast with Infiniband

IP over Infiniband (IPoIB) supports Multicast but Multicast traffic is limited to 2043 Bytes when using connected mode even if you set a larger MTU on the IPoIB interface.

Corosync has a setting, netmtu, that defaults to 1500 making it compatible with connected mode Infiniband.

Using omping to test multicast

Start omping on all nodes with one of the following commands and check the output. The precise version sends 10000 packets at an interval of 1 ms:

omping -c 10000 -i 0.001 -F -q node1 node2 node3

The crude version prints much more detail:

omping node1 node2 node3

To find the multicast address on Proxmox VE 4.x, run:

corosync-cmapctl -g totem.interface.0.mcastaddr

Then pass that multicast address to omping:

 omping -m yourmulticastadress node1 node2 node3
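
The multicast loss can also be checked automatically from the omping summary. The sketch below assumes that the summary lines look like "node2 : multicast, xmt/rcv/%loss = 10000/10000/0%, ..."; verify this against the output of your omping version. The function name check_omping_loss is hypothetical.

```shell
# Read omping summary output on stdin and report the multicast loss per peer.
# $1: maximum acceptable loss in percent (default 1)
# Exit status is non-zero if any peer reaches that loss or if no
# multicast summary line was found at all.
check_omping_loss() {
    awk -v max="${1:-1}" '
        / multicast, xmt\/rcv\/%loss = / {
            split($0, parts, "= ")      # parts[2] starts with "xmt/rcv/loss%"
            split(parts[2], n, "/")
            xmt = n[1] + 0
            rcv = n[2] + 0
            loss = (xmt > 0) ? 100 * (xmt - rcv) / xmt : 100
            printf "%s multicast loss %.2f%%\n", $1, loss
            seen = 1
            if (loss >= max) bad = 1
        }
        END { exit (bad || !seen) ? 1 : 0 }
    '
}
```

For example: omping -c 10000 -i 0.001 -F -q node1 node2 node3 | check_omping_loss 1. The loss is recomputed from the sent and received counters because omping rounds the percentage it prints.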

Troubleshooting

Diagnosis from first principles

These instructions assume you aren't using unicast UDP (transport="udpu"); I've tried to note where that will make a difference.

If you are already experiencing issues, the steps taken to diagnose the problem may make the problem worse in the short term.

If you have poor-quality (or even just misconfigured) ethernet switches, some of these tests may crash your entire network, but at least then you'll know where the source of the problem is...

  1. Ensure all the nodes are in the same subnet.
    1. If you aren't clear on networking, this boils down to: do all your nodes use the same IP address for their default gateway?
    2. If you are deliberately using UDPU transport, this is not a hard requirement, but even in that case, having your hosts in the same subnet will make your task significantly easier.
  2. Ensure all the nodes can (unicast) ping each other without any packet loss at moderately-high packet rates.
    1. Test using "ping -f".
    2. Your network needs to be robust enough to have all your nodes flood-pinging each other simultaneously with < 1% packet loss.
  3. Ensure all the nodes can resolve each other's hostnames.
    1. The previous test should have taken care of this if you used hostnames instead of IP addresses.
    2. Otherwise use nslookup(1) or dig(1) to test DNS, or host(1) or ping(1) to test if you're relying on /etc/hosts.
    3. Theoretically, this shouldn't matter if you're using multicast, but not having this right will likely cause hard-to-diagnose issues later.
  4. Ensure multicast works at high packet rates. This does not apply if you are deliberately using UDPU.
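    1. Run the precise omping test (omping -c 10000 -i 0.001 -F -q <list of all nodes>) described in the section "Using omping to test multicast".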
    2. You may want to use a parallel-SSH client of some sort to ensure omping starts up almost simultaneously on every node. This will cause each host to send a multicast packet once per millisecond.
    3. If this causes your ethernet switch to fail, consider upgrading your switch.
    4. The final "%loss" number should be < 1%.
  5. Ensure multicast works for > 5 minutes at a time. This does not apply if you are deliberately using UDPU.
    1. Run "omping -c 600 -i 1 -q <list of all nodes>" on every node simultaneously (see above).
    2. This test should take ten (10) minutes to run, which is twice as long as the default IGMPv2 leave timer, thus proving that IGMP snooping isn't the source of any problem.

If all of these tests have succeeded, and you are starting with freshly-installed Proxmox VE nodes, you should be able to form a multicast cluster without any issues. See below for further notes on UDPU.
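
The first check in the list (all nodes in the same subnet) can be done more precisely than by comparing default gateways. The sketch below compares two IPv4 addresses under a given prefix length; the function names ip_to_int and same_subnet are made up for illustration, and the addresses are assumed to be plain dotted quads without leading zeros.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
    IFS=. read -r a b c d <<EOF
$1
EOF
    echo $(( ($a << 24) | ($b << 16) | ($c << 8) | $d ))
)

# Succeed if the two addresses share the same network.
# $1, $2: IPv4 addresses; $3: prefix length (for example 24)
same_subnet() (
    mask=$(( (4294967295 << (32 - $3)) & 4294967295 ))
    [ $(( $(ip_to_int "$1") & $mask )) -eq $(( $(ip_to_int "$2") & $mask )) ]
)
```

For example, same_subnet 192.168.8.5 192.168.8.6 24 && echo "same subnet" prints the message, while the same call with 192.168.9.6 as the second address prints nothing.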

Use unicast (UDPU) instead of multicast, if all else fails

Unicast is a technology for sending messages to a single network destination. In corosync, unicast is implemented as UDP-unicast (UDPU). Because it generates more network traffic than multicast, the number of supported nodes is limited; do not use it with more than 4 cluster nodes.

FYI: OVH is a good example of a hosting company where you may need to use UDPU instead of multicast, as your hosts will generally not be able to send or receive multicast traffic to/from each other. This author wishes he knew what their network engineers were smoking, but since their network works well despite its strangeness, it must have been something really good.

  • Carefully read the entire corosync.conf(5) and votequorum(5) manpages.
  • create the cluster as usual
  • if needed, bring the initial node into quorate state with "pvecm e 1"
  • if needed, edit /etc/pve/corosync.conf (remember to increase the version number!); one of the PVE services will later copy it automatically to the local /etc/corosync/corosync.conf on each node (see https://forum.proxmox.com/threads/roles-of-the-different-corosync-conf-files-in-a-cluster.26894/).
  • in the totem{} stanza, add "transport: udpu"
  • pre-add the nodes to the nodelist{} stanza.
  • on each node : systemctl restart corosync (if this command does not work, use killall -9 corosync )
  • then, on each node : /etc/init.d/pve-cluster restart
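
To make the "pre-add the nodes" step less error-prone, the nodelist{} stanza can be generated rather than typed by hand. This is only a sketch under assumptions: the function name make_nodelist and the addresses in the example are hypothetical, and the output uses the ring0_addr, nodeid and quorum_votes keys described in corosync.conf(5). Review the result against that manpage and your existing file before pasting it in.

```shell
# Print a nodelist{} stanza for the given node addresses.
# Node IDs are assigned in argument order, starting at 1.
make_nodelist() (
    if [ "$#" -gt 4 ]; then
        echo "warning: UDPU is not recommended for more than 4 nodes" >&2
    fi
    id=1
    echo "nodelist {"
    for addr in "$@"; do
        printf '  node {\n    ring0_addr: %s\n    nodeid: %d\n    quorum_votes: 1\n  }\n' "$addr" "$id"
        id=$((id + 1))
    done
    echo "}"
)
```

For example, make_nodelist 10.0.0.1 10.0.0.2 10.0.0.3 prints a stanza with three node{} entries. The "transport: udpu" line still has to be added to the totem{} stanza by hand.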

Important Note: if the nodes are not in the same subnet, you will probably also have to edit bindnetaddr in the totem stanza and change it to "0.0.0.0" for the cluster to initialize. It defaults to the IP of the first cluster member, and any other members in the same subnet will be able to initialize, but members in a different subnet will see corosync unable to initialize because it can't figure out an IP address to bind to. There may be security implications to allowing corosync to bind to the wildcard address. Simply commenting out the bindnetaddr line may work equally well; corosync will then figure out the address dynamically on each node.