ISCSI Multipath

From Proxmox VE
Note: This article is outdated! Please check out the general article on setting up Multipath.



Latest revision as of 08:22, 17 October 2024

Introduction

The main purpose of multipath connectivity is to provide redundant access to a storage device, i.e., to keep access to the storage device when one or more components in a path fail. Another advantage of multipathing is increased throughput through load balancing. A common use of multipathing is to add redundancy and gain maximum performance from an iSCSI SAN device.

If you use iSCSI, multipath is recommended, as it works without any configuration on the switches. (If you use NFS or CIFS, use bonding instead, e.g. 802.3ad.)

The connection from the Proxmox VE host through the iSCSI SAN is referred to as a path. When multiple paths exist to a storage device (LUN) on a storage subsystem, it is referred to as multipath connectivity. Therefore, you need to ensure that you have at least two dedicated NICs for iSCSI, using separate networks (and separate switches, to protect against switch failures).

This is a generic how-to. Please consult the storage vendor documentation for vendor specific settings.

Update your iSCSI configuration

It is important to start all required iSCSI connections at boot time. You can do this by setting 'node.startup' to 'automatic'.

The default 'node.session.timeo.replacement_timeout' is 120 seconds. We recommend using a much smaller value of 15 seconds instead.

You can set those values in '/etc/iscsi/iscsid.conf' (defaults). If you are already connected to the iSCSI target, you need to modify the target specific defaults in '/etc/iscsi/nodes/<TARGET>/<PORTAL>/default'.

A modified 'iscsid.conf' file contains the following lines:

node.startup = automatic
node.session.timeo.replacement_timeout = 15
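
As a minimal sketch, the two settings above could be applied from a script as shown here. The helper name set_iscsi_option is hypothetical (it is not part of open-iscsi); it replaces an existing uncommented "key = value" line or appends one, and it is best tried on a copy of the file first.

```shell
#!/bin/sh
# set_iscsi_option FILE KEY VALUE  (hypothetical helper, not part of open-iscsi)
# Replaces the uncommented "KEY = ..." line in FILE, or appends it if missing.
# Commented-out lines are left untouched.
set_iscsi_option() {
    file=$1 key=$2 value=$3
    tmp=$(mktemp) || return 1
    awk -v key="$key" -v value="$value" '
        {
            # Split on the first "=" and compare the trimmed left side literally,
            # so keys such as node.conn[0]... need no regex escaping.
            eq = index($0, "=")
            name = (eq > 0) ? substr($0, 1, eq - 1) : ""
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            if (name == key) { print key " = " value; found = 1 }
            else print $0
        }
        END { if (!found) print key " = " value }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example (run as root, after taking a backup of the file):
#   set_iscsi_option /etc/iscsi/iscsid.conf node.startup automatic
#   set_iscsi_option /etc/iscsi/iscsid.conf node.session.timeo.replacement_timeout 15
#
# For a target that is already known, the per-node record can be changed with
# iscsiadm instead of editing the file by hand:
#   iscsiadm -m node -T <TARGET> -p <PORTAL> -o update \
#       -n node.session.timeo.replacement_timeout -v 15
```

The helper writes through a temporary file and copies the result back, so the permissions of the original file are kept.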

Please configure your iSCSI storage in the GUI if you have not done that already ("Datacenter/Storage: Add iSCSI target").

Activate Multipath

Install multipath tools

The default installation does not include the 'multipath-tools' package, so you first need to install it:

# apt-get update
# apt-get install multipath-tools

Configuration

Then you need to create the multipath configuration file '/etc/multipath.conf'. You can find details about each setting in the man page:

# man multipath.conf

We recommend using the 'wwid' (World Wide Identifier) to identify disks. You can use the 'scsi_id' command to get the 'wwid' for a specific device. For example, the following command returns the 'wwid' for device '/dev/sda':

# /lib/udev/scsi_id -g -u -d /dev/sda


We normally blacklist all devices, and only allow specific devices using 'blacklist_exceptions':

blacklist {
        wwid .*
}

blacklist_exceptions {
        wwid "3600144f028f88a0000005037a95d0001"
        wwid "3600144f028f88a0000005037a95d0002"
}

We can also use the 'alias' directive to name the device, but this is optional:

multipaths {
  multipath {
        wwid "3600144f028f88a0000005037a95d0001"
        alias mpath0
  }
  multipath {
        wwid "3600144f028f88a0000005037a95d0002"
        alias mpath1
  }
}


Finally, you need reasonable defaults. We normally use the following multibus configuration (PVE 4.x and higher):

defaults {
        polling_interval        2
        path_selector           "round-robin 0"
        path_grouping_policy    multibus
        uid_attribute           ID_SERIAL
        rr_min_io               100
        failback                immediate
        no_path_retry           queue
        user_friendly_names     yes
}
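
The three fragments above can be combined into one file. The following is only a sketch: gen_multipath_conf is a hypothetical helper that prints a complete configuration for the WWIDs passed as arguments, using the defaults shown above and aliases mpath0, mpath1, and so on. Review the output before installing it as '/etc/multipath.conf'.

```shell
#!/bin/sh
# gen_multipath_conf WWID...  (hypothetical helper)
# Prints a multipath.conf that blacklists everything except the given WWIDs
# and assigns the aliases mpath0, mpath1, ... in argument order.
gen_multipath_conf() {
    cat <<'EOF'
defaults {
        polling_interval        2
        path_selector           "round-robin 0"
        path_grouping_policy    multibus
        uid_attribute           ID_SERIAL
        rr_min_io               100
        failback                immediate
        no_path_retry           queue
        user_friendly_names     yes
}

blacklist {
        wwid .*
}

blacklist_exceptions {
EOF
    for wwid in "$@"; do
        printf '        wwid "%s"\n' "$wwid"
    done
    printf '}\n\nmultipaths {\n'
    i=0
    for wwid in "$@"; do
        printf '  multipath {\n        wwid "%s"\n        alias mpath%d\n  }\n' "$wwid" "$i"
        i=$((i + 1))
    done
    printf '}\n'
}

# Example: write the result to a scratch file and inspect it first.
#   gen_multipath_conf 3600144f028f88a0000005037a95d0001 \
#                      3600144f028f88a0000005037a95d0002 > /tmp/multipath.conf
```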

The wwids must also be added to the file '/etc/multipath/wwids'. To do this, run the following commands with the appropriate wwids:

 # multipath -a 3600144f028f88a0000005037a95d0001
 # multipath -a 3600144f028f88a0000005037a95d0002


You should also check your SAN vendor's documentation for additional information.

To activate these settings, you need to restart the multipath daemon with:

# systemctl restart multipath-tools.service

Prioritizing a Certain TCP/IP Path

If, for example, the path to destination IP address 192.168.99.99 should be prioritized (with the other IP routes used only for failover), the 'defaults' section should look as follows:

defaults {
        polling_interval        2
        path_selector           "round-robin 0"
        path_grouping_policy    failover
        uid_attribute           ID_SERIAL
        rr_min_io               100
        failback                immediate
        prio                    iet
        prio_args               preferredip=192.168.99.99
        no_path_retry           queue
        user_friendly_names     yes
}

Query device status

You can view the status with:

# multipath -ll
mpath0 (3600144f028f88a0000005037a95d0001) dm-3 NEXENTA,NEXENTASTOR
size=64G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=2 status=active
  |- 5:0:0:0 sdb 8:16 active ready running
  `- 6:0:0:0 sdc 8:32 active ready running
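
For monitoring, the output format shown above can be summarized per map. This is a rough sketch that assumes the output layout of the example (map line containing "dm-N", path lines containing an H:C:T:L address); count_active_paths is a hypothetical helper, and other multipath-tools versions may format the output differently.

```shell
#!/bin/sh
# count_active_paths  (hypothetical helper)
# Reads "multipath -ll" output on stdin and prints "<map name> <count>"
# for every map, where <count> is the number of paths in "active ready" state.
count_active_paths() {
    awk '
        # Map header, e.g. "mpath0 (3600...) dm-3 NEXENTA,NEXENTASTOR"
        / dm-[0-9]+ / {
            name = $1
            if (!(name in n)) { n[name] = 0; order[++maps] = name }
            next
        }
        # Path line, e.g. "|- 5:0:0:0 sdb 8:16 active ready running"
        /[0-9]+:[0-9]+:[0-9]+:[0-9]+ / && /active ready/ {
            if (name != "") n[name]++
        }
        END { for (i = 1; i <= maps; i++) print order[i], n[order[i]] }
    '
}

# Example (as root):
#   multipath -ll | count_active_paths
```

With the example output above, the helper would report two active paths for mpath0. A count lower than the number of configured paths indicates that at least one path is down.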


To get more information about the devices in use, run:

# multipath -v3

Performance test with fio

In order to check the performance, you can use fio.

Example read test:

fio --filename=/dev/mapper/mpath0 --direct=1 --rw=read --bs=1m --size=20G --numjobs=200 --runtime=60 --group_reporting --name=file1

Vendor specific settings

Please add vendor specific recommendations here.

Dell

You need to load the Dell-specific module scsi_dh_rdac permanently. To do this, edit /etc/modules and add the module name on a line of its own:

nano /etc/modules

The file should then look like this:

# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.
# Parameters can be specified after the module name.
scsi_dh_rdac
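
Instead of editing the file by hand, the module name can be appended from a script. The sketch below uses a hypothetical helper, add_module, which only appends the name when it is not listed yet, so running it twice does no harm.

```shell
#!/bin/sh
# add_module FILE MODULE  (hypothetical helper)
# Appends MODULE to FILE unless an identical line is already present.
add_module() {
    file=$1 module=$2
    grep -qxF "$module" "$file" 2>/dev/null || printf '%s\n' "$module" >> "$file"
}

# Example (as root):
#   add_module /etc/modules scsi_dh_rdac
#   modprobe scsi_dh_rdac    # load the module now, without waiting for a reboot
```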
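
The following '/etc/multipath.conf' is an example for a Dell MD32xxi array; replace the wwid with that of your own LUN:
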
defaults {
        polling_interval        2
        path_selector           "round-robin 0"
        path_grouping_policy    multibus
        getuid_callout          "/lib/udev/scsi_id -g -u -d /dev/%n"
        rr_min_io               100
        failback                immediate
        no_path_retry           queue
}
blacklist {
        wwid *
}

blacklist_exceptions {
        wwid 3690b22c00008da2c000008a35098b0dc
        
}

devices {
        device {
                vendor                  "DELL"
                product                 "MD32xxi"
                path_grouping_policy    group_by_prio
                prio                    rdac
                polling_interval        5
                path_checker            rdac
                path_selector           "round-robin 0"
                hardware_handler        "1 rdac"
                failback                immediate
                features                "2 pg_init_retries 50"
                no_path_retry           30
                rr_min_io               100
        }
}

multipaths {
        multipath {
                wwid 3690b22c00008da2c000008a35098b0dc
                alias md3200i
        }
        
}

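You also need to configure a suitable filter in /etc/lvm/lvm.conf to avoid error messages. Without a filter, LVM sees the same physical volume on every underlying path device as well as on the multipath device and reports duplicates.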

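See also:

  • Dell Linux DM Installation Details (http://www.dell.com/downloads/global/products/pvaul/en/powervault-md32x0-md32x0i-linux-dm-installation-en.pdf)
  • Dell Array Tuning Best Practices (http://www.dell.com/downloads/global/products/pvaul/en/powervault-md3200i-performance-tuning-white-paper.pdf)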

Reduxio HX550

Configuration

You need to configure the following:

  • Update /etc/iscsi/iscsid.conf
  • Create/update /etc/multipath.conf
  • Create /etc/udev/rules.d/99-reduxio.rules

/etc/iscsi/iscsid.conf

Add or update the following parameters:

node.startup = automatic
# The length of time to wait for session re-establishment before failing SCSI commands back to the upper layer. This can be reduced to a minimum, since multipath detects the failure and immediately fails over to another path. The value is in seconds and the default is typically 120.
node.session.timeo.replacement_timeout = 5

# The time to wait for an iSCSI login to complete. The value is in seconds and the default is 15.
node.conn[0].timeo.login_timeout = 15

# To specify the time to wait for logout to complete, edit the line.
# The value is in seconds and the default is 15 seconds.
node.conn[0].timeo.logout_timeout = 15

# Time interval to wait on a connection before sending a ping.
node.conn[0].timeo.noop_out_interval = 5

# To specify the time to wait for a Nop-out response before failing
# the connection, edit this line. Failing the connection will
# cause IO to be failed back to the SCSI layer. If using dm-multipath
# this will cause the IO to be failed to the multipath layer.
node.conn[0].timeo.noop_out_timeout = 5

#
# This retry count along with node.conn[0].timeo.login_timeout
# determines the maximum amount of time iscsid will try to
# establish the initial login. node.session.initial_login_retry_max is
# multiplied by the node.conn[0].timeo.login_timeout to determine the
# maximum amount.
node.session.initial_login_retry_max = 8
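
As the last comment explains, the maximum time spent on the initial login is the product of the login timeout and the retry count. A small sketch of that arithmetic, with a hypothetical helper name:

```shell
#!/bin/sh
# max_login_seconds LOGIN_TIMEOUT RETRY_MAX  (hypothetical helper)
# Upper bound, in seconds, on how long iscsid keeps trying the initial login.
max_login_seconds() {
    login_timeout=$1 retry_max=$2
    echo $((login_timeout * retry_max))
}

# With the values from this section: 15 s per attempt, 8 attempts.
max_login_seconds 15 8
```

With the values above, the initial login is given up after at most 120 seconds.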

/etc/udev/rules.d/99-reduxio.rules

Create the following file:

# /etc/udev/rules.d/99-reduxio.rules

# Rescan the iSCSI sessions when a Reduxio block device changes
SUBSYSTEM=="block", ACTION=="change", ATTRS{model}=="TCAS", ATTRS{vendor}=="REDUXIO", RUN+="/bin/sh -c '/usr/sbin/iscsiadm -m session -R'"
# Delete the SCSI device if its size is reported as 0
SUBSYSTEM=="block", ACTION=="change", ATTRS{model}=="TCAS", ATTRS{vendor}=="REDUXIO", ATTR{size}=="0", RUN+="/bin/sh -c 'echo 1 > /sys$DEVPATH/../../delete'"
# Reload the multipath daemon (the service name depends on the release)
SUBSYSTEM=="block", ACTION=="change", ATTRS{model}=="TCAS", ATTRS{vendor}=="REDUXIO", RUN+="/bin/sh -c 'service multipathd reload || service multipath-tools reload'"
# Force a reload of the multipath map for the changed device
SUBSYSTEM=="block", ACTION=="change", ATTRS{model}=="TCAS", ATTRS{vendor}=="REDUXIO", RUN+="/bin/sh -c '/usr/sbin/multipath -r $DEVNAME'"

/etc/multipath.conf

Create or update /etc/multipath.conf. This is required for high availability to work correctly:

devices {
        device {
                vendor "REDUXIO"
                product "TCAS"
                revision "2300"
                path_grouping_policy "group_by_prio"
                path_checker "tur"
                hardware_handler "1 alua"
                path_selector "round-robin 0"
                prio "alua"
                failback "immediate"
                features "0"
                rr_weight "uniform"
                no_path_retry "72"
                queue_without_daemon "no"
                rr_min_io_rq 10
                rr_min_io 10
                user_friendly_names "yes"
                fast_io_fail_tmo "10"
        }
}
blacklist {
        # Note: it is highly recommended to blacklist by wwid or vendor instead of device name
        devnode "^sd[a]$"
}

See also

  • Reduxio's website (http://www.reduxio.com)

External links