Multicast notes
Latest revision as of 12:57, 24 July 2024
Proxmox VE 6 and newer uses corosync with kronosnet as communication layer, which only supports unicast. This article is only relevant for PVE 5 and older.
Introduction
Multicast allows a single transmission to be delivered to multiple servers at the same time.
This is the basis for cluster communications in Proxmox VE 2.0 to Proxmox VE 5.4, which use corosync and cman, and it applies to any other solution that uses those clustering tools.
Note: Proxmox VE 6.0 uses corosync 3, which replaced the underlying transport stack with Kronosnet (knet). Kronosnet currently only supports unicast.
If multicast does not work in your network infrastructure, you should fix it so that it does. If all else fails, use unicast instead, but beware of the node count limitations with unicast.
IGMP snooping
IGMP snooping prevents multicast traffic from being flooded to all ports in the broadcast domain by forwarding it only to ports which have solicited such traffic. IGMP snooping is a feature offered by most major switch manufacturers and is often enabled by default on switches. In order for a switch to properly snoop the IGMP traffic, there must be an IGMP querier on the network. If no querier is present, IGMP snooping will actively prevent ALL IGMP/Multicast traffic from being delivered!
If IGMP snooping is disabled, all multicast traffic will be delivered to all ports, which may add unnecessary load and potentially allows a denial of service attack.
IGMP querier
An IGMP querier is a multicast router that generates IGMP queries. IGMP snooping relies on these queries, which are unconditionally forwarded to all ports: the replies from the destination ports are what build the internal tables in the switch, so that it knows which traffic to forward where.
An IGMP querier can be enabled on your router, switch, or even Linux bridges.
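On a Linux bridge, both the snooping and the querier setting are exposed as files under sysfs. The following is a minimal sketch for inspecting them, not an official tool: the function name `bridge_mcast_state`, the bridge name `vmbr0` and the overridable sysfs root are assumptions made for illustration.

```shell
#!/bin/sh
# Print the multicast snooping and querier flags of a Linux bridge.
# $1: bridge name (vmbr0 is only an example name)
# $2: sysfs network root, overridable for testing (default /sys/class/net)
bridge_mcast_state() {
    bridge="$1"
    root="${2:-/sys/class/net}"
    dir="$root/$bridge/bridge"
    if [ ! -d "$dir" ]; then
        echo "$bridge: not a Linux bridge" >&2
        return 1
    fi
    snooping=$(cat "$dir/multicast_snooping")
    querier=$(cat "$dir/multicast_querier")
    echo "$bridge snooping=$snooping querier=$querier"
    # Snooping without any querier on the segment blocks multicast.
    # This only detects the local half of that situation.
    if [ "$snooping" = "1" ] && [ "$querier" = "0" ]; then
        echo "$bridge: snooping on, local querier off; make sure another querier exists" >&2
    fi
}
```

Calling `bridge_mcast_state vmbr0` on a node prints one line such as `vmbr0 snooping=1 querier=0`. A warning on stderr is only a hint, since a querier on the router or switch is sufficient.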
Configuring IGMP/Multicast
Ensuring IGMP Snooping and Querier are enabled on your network (recommended)
Juniper - JunOS
Juniper EX switches, by default, enable IGMP snooping on all vlans as can be seen by this config snippet:
[edit protocols]
user@switch# show igmp-snooping
vlan all;
However, the IGMP querier is not enabled by default. If you are already using RVIs (Routed Virtual Interfaces) on your switch, you can enable IGMP v2 on the interface, which enables the querier. However, most administrators do not use RVIs in all VLANs on their switches, so the querier should instead be configured on the router. The config setting below is the same on Juniper EX switches using RVIs as on Juniper SRX service gateways/routers, and effectively enables the IGMP querier on the specified interface/VLAN. Note that you must set this on all VLANs which require multicast:
set protocols igmp interface $iface version 2
Cisco
On Cisco switches, IGMP snooping is enabled by default. You do have to enable an IGMP snooping querier though:
ip igmp snooping querier
This will enable it for all vlans. You can verify that it is enabled:
show ip igmp snooping querier
Vlan      IP Address     IGMP Version        Port
-------------------------------------------------------------
1         172.16.34.4    v2                  Switch
2         172.16.34.4    v2                  Switch
3         172.16.34.4    v2                  Switch
HP - ProCurve
HP ProCurve switches have IGMP disabled on all VLANs by default, as can be seen with this command:
# show ip igmp
Likewise, the IGMP querier is not enabled by default. When IGMP is enabled on a VLAN, the ProCurve negotiates with other devices over which one becomes querier; according to the RFC, the device with the lowest IP address wins. Note that you must set this on all VLANs which require multicast (VLAN 30 is used for the demo):
# conf t
(config)# vlan 30
(vlan-30)# ip igmp high-priority-forward
To verify:
# sh ip igmp 30

 Status and Counters - IP Multicast (IGMP) Status

 VLAN ID : 30
 VLAN Name : Proxmox
 Querier Address : This switch is Querier

 Active Group Addresses Reports Queries Querier Access Port
 ---------------------- ------- ------- -------------------
 239.192.105.237        214020  0
Netgear
Using web/gui:
Per VLAN
Enable IGMP snooping.
Enable IGMP snooping on your VLANs under the IGMP VLAN configuration.
Enable multicast router mode on the ports that uplink to the other switches.
Enable IGMP Querier.
Leave the global address at 0.0.0.0.
Instead, set a Querier IP address per VLAN under the Querier VLAN configuration (for example VLAN10=1.1.1.10 and VLAN15=1.1.1.15 on the first switch, VLAN10=2.2.2.10 and VLAN15=2.2.2.15 on the next switch, etc.).
Make sure “Querier Election Participation Mode” is enabled for each VLAN.
OR
Global
Enable IGMP snooping.
Enable IGMP snooping on your ports under the IGMP interface configuration.
Enable multicast router mode on the ports that uplink to the other switches.
Enable IGMP Querier.
Set a global Querier IP address (1.1.1.1 on the first switch, 2.2.2.2 on the next switch, etc.).
Brocade
Linux: Enabling Multicast querier on bridges
If your router or switch does not support enabling a multicast querier, and you are using a classic linux bridge (not Open vSwitch), then you can enable the multicast querier on the Linux bridge by adding this statement to your /etc/network/interfaces bridge configuration:
post-up ( echo 1 > /sys/devices/virtual/net/$IFACE/bridge/multicast_querier )
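As a sketch of where that statement goes, a complete bridge stanza in /etc/network/interfaces might look like the following. The bridge name `vmbr0`, the port `eth0` and the addresses are placeholders chosen for illustration, not values from a real setup.

```
# Example bridge stanza; names and addresses are placeholders
auto vmbr0
iface vmbr0 inet static
    address 192.0.2.11
    netmask 255.255.255.0
    gateway 192.0.2.1
    bridge_ports eth0
    bridge_stp off
    bridge_fd 0
    post-up ( echo 1 > /sys/devices/virtual/net/$IFACE/bridge/multicast_querier )
```

The `post-up` line runs each time the bridge is brought up, so the setting persists across reboots.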
Disabling IGMP Snooping (not recommended)
Juniper - JunOS
set protocols igmp-snooping vlan all disable
Cisco Managed Switches
# conf t
# no ip igmp snooping
HP - ProCurve
Disabling IGMP must be done on every vlan where it is enabled.
# conf t
(config)# vlan 30
(vlan-30)# no ip igmp
Linux: Disabling Multicast snooping on bridges
Snooping should be enabled either on the router / switch or on the Linux bridge, but it may not work if enabled on both. If your hosting provider has IGMP snooping enabled on the multicast switch, it may be necessary to disable snooping on the Linux bridge. In that case use:
post-up ( echo 1 > /sys/devices/virtual/net/$IFACE/bridge/multicast_querier )
post-up ( echo 0 > /sys/class/net/$IFACE/bridge/multicast_snooping )
Multicast with Infiniband
IP over Infiniband (IPoIB) supports multicast, but multicast traffic is limited to 2043 bytes when using connected mode, even if you set a larger MTU on the IPoIB interface.
Corosync has a setting, netmtu, that defaults to 1500, which makes it compatible with connected mode Infiniband.
Using omping to test multicast
Start omping on all nodes with the following command and check the output. This is the precise version: it sends 10000 packets at an interval of 1 ms.
omping -c 10000 -i 0.001 -F -q node1 node2 node3
A cruder version with much more detailed output:
omping node1 node2 node3
To find the multicast address on Proxmox VE 4.x, run this:
corosync-cmapctl -g totem.interface.0.mcastaddr
Then pass that multicast address to omping:
omping -m yourmulticastaddress node1 node2 node3
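The last two steps can be chained so that the address does not have to be copied by hand. This is only a sketch: the helper names `cmap_value` and `omping_cluster` are made up here, and the parsing assumes that corosync-cmapctl prints a line of the form `totem.interface.0.mcastaddr (str) = 239.192.105.237`.

```shell
#!/bin/sh
# Extract the value from corosync-cmapctl output read on stdin.
# Assumed line format: "totem.interface.0.mcastaddr (str) = 239.192.105.237"
cmap_value() {
    sed -n 's/^[^=]*= *//p' | head -n 1
}

# Look up the cluster's multicast address and run omping against it.
# Arguments: the node names to test, e.g. node1 node2 node3
omping_cluster() {
    addr=$(corosync-cmapctl -g totem.interface.0.mcastaddr | cmap_value)
    if [ -z "$addr" ]; then
        echo "no multicast address found in cmap" >&2
        return 1
    fi
    omping -m "$addr" "$@"
}
```

On a cluster node this would be invoked as `omping_cluster node1 node2 node3`.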
Troubleshooting
Diagnosis from first principles
These instructions assume you aren't using unicast UDP (transport="udpu"); I've tried to note where that will make a difference.
If you are already experiencing issues, the steps taken to diagnose the problem may make the problem worse in the short term.
If you have poor-quality (or even just misconfigured) ethernet switches, some of these tests may crash your entire network, but at least then you'll know where the source of the problem is...
- Ensure all the nodes are in the same subnet.
  - If you aren't clear on networking, this boils down to: do all your nodes use the same IP address for their default gateway?
  - If you are deliberately using UDPU transport, this is not a hard requirement, but even in that case, having your hosts in the same subnet will make your task significantly easier.
- Ensure all the nodes can (unicast) ping each other without any packet loss at moderately-high packet rates.
  - Test using "ping -f".
  - Your network needs to be robust enough to have all your nodes flood-pinging each other simultaneously with < 1% packet loss.
- Ensure all the nodes can resolve each other's hostnames.
  - The previous test should have taken care of this if you used hostnames instead of IP addresses.
  - Otherwise use nslookup(1) or dig(1) to test DNS, or host(1) or ping(1) to test if you're relying on /etc/hosts.
  - Theoretically, this shouldn't matter if you're using multicast, but not having this right will likely cause hard-to-diagnose issues later.
- Ensure multicast works at high packet rates. This does not apply if you are deliberately using UDPU.
  - Run omping as described in the section "Using omping to test multicast".
  - You may want to use a parallel-SSH client of some sort to ensure omping starts up almost simultaneously on every node. This will cause each host to send a multicast packet once per millisecond.
  - If this causes your ethernet switch to fail, consider upgrading your switch.
  - The final "%loss" number should be < 1%.
- Ensure multicast works for > 5 minutes at a time. This does not apply if you are deliberately using UDPU.
  - Run "omping -c 600 -i 1 -q <list of all nodes>" on every node simultaneously (see above).
  - This test should take ten (10) minutes to run, which is twice as long as the default IGMPv2 leave timer, thus proving that IGMP snooping isn't the source of any problem.
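The "%loss" criterion from the omping tests above can be checked automatically. The sketch below is an illustration, not part of omping: the function name `check_mcast_loss` is hypothetical, and it assumes summary lines of the form `node2 : multicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = ...`.

```shell
#!/bin/sh
# Read omping output on stdin. Succeed only if at least one multicast
# summary line was seen and none of them reports a loss of 1% or more.
check_mcast_loss() {
    awk '
        /multicast, xmt\/rcv\/%loss/ {
            seen = 1
            # The part after the first "= " looks like "10000/10000/0%, ..."
            split($0, a, "= ")
            split(a[2], b, "/")
            loss = b[3]
            sub(/%.*/, "", loss)
            printf "%s multicast loss %s%%\n", $1, loss
            if (loss + 0 >= 1) bad = 1
        }
        END { exit ((seen && !bad) ? 0 : 1) }
    '
}
```

A possible use is `omping -c 600 -i 1 -q node1 node2 node3 | check_mcast_loss`, which prints one line per peer and returns a non-zero exit status if any peer lost 1% or more of the multicast packets.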
If all of these tests have succeeded, and you are starting with freshly-installed Proxmox VE nodes, you should be able to form a multicast cluster without any issues. See below for further notes on UDPU.
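The default-gateway comparison suggested in the first check is only a rule of thumb. A more direct test compares the network part of two addresses for a given prefix length. The following sketch assumes IPv4 addresses without leading zeros; the function names `ip_to_int` and `same_subnet` are invented for this example.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if two IPv4 addresses are in the same network.
# $1, $2: addresses, $3: prefix length (for example 24)
same_subnet() {
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    [ $(( a & mask )) -eq $(( b & mask )) ]
}
```

For example, `same_subnet 192.168.1.10 192.168.1.200 24` succeeds, while `same_subnet 192.168.1.10 192.168.2.10 24` fails.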
Use unicast (UDPU) instead of multicast, if all else fails
Unicast is a technology for sending messages to a single network destination. In corosync, unicast is implemented as UDP-unicast (UDPU). Because it generates more network traffic than multicast, the number of supported nodes is limited: do not use it with more than 4 cluster nodes.
FYI: OVH is a good example of a hosting company where you may need to use UDPU instead of multicast, as your hosts will generally not be able to send or receive multicast traffic to/from each other. This author wishes he knew what their network engineers were smoking, but since their network works well despite its strangeness, it must have been something really good.
- Carefully read the entire corosync.conf(5) and votequorum(5) manpages.
- Create the cluster as usual.
- If needed, bring the initial node into quorate state with "pvecm e 1".
- If needed, edit /etc/pve/corosync.conf (remember to increase the version number!); one of the PVE services will later propagate it to each node, where it is in turn copied into the local /etc/corosync/corosync.conf [1].
- in the totem{} stanza, add "transport: udpu"
- pre-add the nodes to the nodelist{} stanza.
- on each node : systemctl restart corosync (if this command does not work, use killall -9 corosync )
- then, on each node : /etc/init.d/pve-cluster restart
Important Note: if the nodes are not in the same subnet, you will probably also have to edit bindnetaddr in the totem stanza and change it to "0.0.0.0" for the cluster to initialize. It defaults to the IP of the first cluster member; any other members in the same subnet will be able to initialize, but members in a different subnet will see corosync fail to initialize because it can't figure out an IP address to bind to. There may be security implications to allowing corosync to bind to the wildcard address. Simply commenting out the bindnetaddr line may work equally well; corosync will then figure out the address dynamically on each node.
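As a rough illustration of the edits described above, the relevant parts of /etc/pve/corosync.conf could look like the fragment below. It is a sketch in corosync 2.x syntax: the cluster name, node names, address and config_version value are placeholders, and a real file contains further stanzas (quorum, logging) that are left out here.

```
totem {
  version: 2
  # placeholder cluster name
  cluster_name: examplecluster
  # must be higher than the value that was in the file before the edit
  config_version: 4
  transport: udpu
  interface {
    ringnumber: 0
    # for nodes in different subnets, see the note on bindnetaddr above
    bindnetaddr: 192.0.2.11
  }
}

nodelist {
  node {
    name: node1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: node1
  }
  node {
    name: node2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: node2
  }
}
```

With UDPU, every cluster member has to appear in the nodelist, because corosync can no longer discover peers through the multicast group.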