Infiniband
Introduction
Infiniband can be used with DRBD to speed up replication. This article covers setting up IP over Infiniband (IPoIB).
Subnet Manager
Infiniband requires a subnet manager to function. Many Infiniband switches have a built-in subnet manager that can be enabled. When using multiple switches, you can enable a subnet manager on all of them for redundancy.
If your switch does not have a subnet manager, or if you are not using a switch, then you need to run a subnet manager on your node(s). The opensm package in Debian Squeeze and later should be sufficient if you need a subnet manager.
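To verify that a subnet manager is actually active on the fabric, the standard InfiniBand diagnostic tools can be used. A sketch, assuming a Debian-based system where these tools live in the infiniband-diags package:

```shell
# Install opensm and the InfiniBand diagnostic tools
aptitude install opensm infiniband-diags

# With an active subnet manager the port should report
# "State: Active" rather than "Initializing"
ibstat

# Query the master subnet manager on the fabric
sminfo
```

If `ibstat` shows the port stuck in Initializing, no subnet manager has configured the fabric yet.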
Sockets Direct Protocol (SDP)
SDP can be used with a preload library to speed up TCP/IP communications over Infiniband. DRBD supports SDP, which offers some performance gains.
The Linux kernel does not include the SDP module. If you want to use SDP you need to install OFED. Thus far I have been unable to get OFED to compile for Proxmox 2.0.
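If you do get OFED installed, the usual way to run an unmodified TCP program over SDP is via the LD_PRELOAD mechanism. A sketch only; the path to libsdp.so varies by OFED version and distribution, so the path below is an assumption:

```shell
# Run an ordinary TCP program over SDP via the preload library
# (the libsdp.so path is an assumption -- adjust for your OFED install)
LD_PRELOAD=/usr/lib/libsdp.so iperf -s
```

DRBD 8.3 can reportedly also select SDP directly via the address family in the resource definition (e.g. `address sdp 192.168.1.1:7789;`); check the DRBD documentation for your version.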
IPoIB
IP over Infiniband allows sending IP packets over the Infiniband fabric.
Proxmox 1.X Prerequisites
Debian Lenny network scripts do not work well with Infiniband interfaces. This can be corrected by installing the following packages from Debian Squeeze:
ifenslave-2.6_1.1.0-17_amd64.deb
net-tools_1.60-23_amd64.deb
ifupdown_0.6.10_amd64.deb
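Once downloaded, the packages can be installed together with dpkg. A sketch, assuming the .deb files are in the current directory:

```shell
# Install the Squeeze versions of the networking tools on Lenny
dpkg -i ifenslave-2.6_1.1.0-17_amd64.deb \
        net-tools_1.60-23_amd64.deb \
        ifupdown_0.6.10_amd64.deb
```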
Proxmox 2.0
Nothing special is needed with Proxmox 2.0, everything seems to work out of the box.
AFAIK this is needed [ rob f 2013-07-13 ]
aptitude install opensm
Create IPoIB Interface
Bonding
It is not possible to bond Infiniband interfaces to increase throughput. If you want to use bonding for redundancy, create a bonding interface.
/etc/modprobe.d/aliases-bond.conf
alias bond0 bonding
options bond0 mode=1 miimon=100 downdelay=200 updelay=200 max_bonds=2
Infiniband interfaces are named ib0, ib1, etc.
Edit /etc/network/interfaces
auto bond0
iface bond0 inet static
        address 192.168.1.1
        netmask 255.255.255.0
        slaves ib0 ib1
        bond_miimon 100
        bond_mode active-backup
        pre-up modprobe ib_ipoib
        pre-up echo connected > /sys/class/net/ib0/mode
        pre-up echo connected > /sys/class/net/ib1/mode
        pre-up modprobe bond0
        mtu 65520
To bring up the interface:
ifup bond0
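To confirm the bond came up with both InfiniBand slaves and the expected active-backup mode, inspect the bonding state:

```shell
# Shows bonding mode, currently active slave, and link status of ib0/ib1
cat /proc/net/bonding/bond0

# The bond should be up with the configured address and mtu 65520
ip addr show bond0
```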
Without Bonding
Edit /etc/network/interfaces
auto ib0
iface ib0 inet static
        address 192.168.1.1
        netmask 255.255.255.0
        pre-up modprobe ib_ipoib
        pre-up echo connected > /sys/class/net/ib0/mode
        mtu 65520
To bring up the interface:
ifup ib0
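To verify the interface is up in connected mode with the large MTU:

```shell
# Should print "connected" (set by the pre-up line above)
cat /sys/class/net/ib0/mode

# Check the address and that mtu 65520 was applied
ip addr show ib0
```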
TCP/IP Tuning
These settings performed best on my servers; your mileage may vary.
Edit /etc/sysctl.conf
#Infiniband Tuning
net.ipv4.tcp_mem=1280000 1280000 1280000
net.ipv4.tcp_wmem = 32768 131072 1280000
net.ipv4.tcp_rmem = 32768 131072 1280000
net.core.rmem_max=16777216
net.core.wmem_max=16777216
net.core.rmem_default=16777216
net.core.wmem_default=16777216
net.core.optmem_max=1524288
net.ipv4.tcp_sack=0
net.ipv4.tcp_timestamps=0
To apply the changes now:
sysctl -p
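You can confirm the new values took effect by querying them back:

```shell
# Each key should echo the value set in /etc/sysctl.conf
sysctl net.ipv4.tcp_rmem net.core.rmem_max net.ipv4.tcp_sack
```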
iperf speed tests
On both systems to be tested, install iperf:
aptitude install iperf
On one system, run iperf as a server. In this example it is using IP 10.0.99.8:
iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 128 KByte (default)
------------------------------------------------------------
On the other system, run iperf as a client to test:
# iperf -c 10.0.99.8
------------------------------------------------------------
Client connecting to 10.0.99.8, TCP port 5001
TCP window size: 646 KByte (default)
------------------------------------------------------------
[  3] local 10.0.99.30 port 38629 connected with 10.0.99.8 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  8.98 GBytes  7.71 Gbits/sec