Infiniband


Introduction

Infiniband can be used with DRBD to speed up replication. This article covers setting up IP over Infiniband (IPoIB).

Subnet Manager

Infiniband requires a subnet manager to function. Many Infiniband switches have a built-in subnet manager that can be enabled. When using multiple switches, you can enable a subnet manager on all of them for redundancy.

If your switch does not have a subnet manager, or if you are not using a switch, then you need to run a subnet manager on your node(s). The opensm package in Debian Squeeze and later is sufficient for this.
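
To check whether a subnet manager is already active on the fabric, the sminfo tool from the infiniband-diags package can be used. A minimal sketch, assuming the package is available in your Debian release:

aptitude install infiniband-diags
# query the fabric for the currently active subnet manager
sminfo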


Sockets Direct Protocol (SDP)

SDP can be used with a preload library to speed up TCP/IP communications over Infiniband. DRBD supports SDP, which offers some performance gains.

The Linux kernel does not include the SDP module. If you want to use SDP, you need to install OFED. Thus far I have been unable to get OFED to compile for Proxmox 2.0.
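
If you do manage to install OFED, its libsdp preload library can redirect an application's TCP sockets to SDP without recompiling. A sketch, assuming libsdp.so from OFED is present in the default library path:

LD_PRELOAD=libsdp.so iperf -c 10.0.99.8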

IPoIB

IP over Infiniband allows sending IP packets over the Infiniband fabric.
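
Before configuring IPoIB it is worth confirming that the HCA port is up. ibstat from the infiniband-diags package (assumed to be installed) reports the port state and link rate; the port should show State: Active once a subnet manager has configured the fabric.

ibstat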

Proxmox 1.X Prerequisites

The Debian Lenny network scripts do not work well with Infiniband interfaces. This can be corrected by installing the following packages from Debian Squeeze (see the dpkg example after the list):

ifenslave-2.6_1.1.0-17_amd64.deb
net-tools_1.60-23_amd64.deb
ifupdown_0.6.10_amd64.deb
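
A sketch of installing the downloaded packages with dpkg, assuming the .deb files are in the current directory:

dpkg -i ifenslave-2.6_1.1.0-17_amd64.deb net-tools_1.60-23_amd64.deb ifupdown_0.6.10_amd64.deb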

Proxmox 2.0

Nothing special is needed with Proxmox 2.0; everything seems to work out of the box.

As far as I know this is needed [ rob f 2013-07-13 ]:

aptitude install opensm
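
To confirm that opensm is actually running after installation, a quick check:

pgrep -l opensm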

Create IPoIB Interface

Bonding

It is not possible to bond Infiniband interfaces to increase throughput. If you want to use bonding for redundancy, create a bonding interface.

/etc/modprobe.d/aliases-bond.conf

alias bond0 bonding
options bond0 mode=1 miimon=100 downdelay=200 updelay=200 max_bonds=2


Infiniband interfaces are named ib0, ib1, etc. Edit /etc/network/interfaces:

auto bond0
iface bond0 inet static
        address  192.168.1.1
        netmask  255.255.255.0
        slaves ib0 ib1
        bond_miimon 100
        bond_mode active-backup
        pre-up modprobe ib_ipoib
        pre-up echo connected > /sys/class/net/ib0/mode
        pre-up echo connected > /sys/class/net/ib1/mode
        pre-up modprobe bond0
        mtu 65520 

To bring up the interface:

ifup bond0
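
Once the bond is up, the kernel's bonding status file shows the active slave and the link state of each Infiniband interface:

cat /proc/net/bonding/bond0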

Without Bonding

Edit /etc/network/interfaces:

auto ib0
iface ib0 inet static
        address  192.168.1.1
        netmask  255.255.255.0
        pre-up modprobe ib_ipoib
        pre-up echo connected > /sys/class/net/ib0/mode
        mtu 65520 

To bring up the interface:

ifup ib0
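
To verify that the interface came up in connected mode with the expected MTU:

ip addr show ib0
cat /sys/class/net/ib0/mode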

TCP/IP Tuning

These settings performed best on my servers; your mileage may vary.

Edit /etc/sysctl.conf:

# Infiniband Tuning
net.ipv4.tcp_mem = 1280000 1280000 1280000
net.ipv4.tcp_wmem = 32768 131072 1280000
net.ipv4.tcp_rmem = 32768 131072 1280000
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.rmem_default = 16777216
net.core.wmem_default = 16777216
net.core.optmem_max = 1524288
net.ipv4.tcp_sack = 0
net.ipv4.tcp_timestamps = 0

To apply the changes now:

sysctl -p
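
To spot check that the new values are active:

sysctl net.ipv4.tcp_rmem net.core.rmem_max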


iperf Speed Tests

On the systems to be tested, install iperf:

aptitude install iperf

On one system, run iperf as a server. In this example the server uses IP 10.0.99.8:

iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------

On a client, run the test:

# iperf -c 10.0.99.8
------------------------------------------------------------
Client connecting to 10.0.99.8, TCP port 5001
TCP window size:  646 KByte (default)
------------------------------------------------------------
[  3] local 10.0.99.30 port 38629 connected with 10.0.99.8 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  8.98 GBytes  7.71 Gbits/sec