Fencing

Note: This article is about the previous Proxmox VE 3.x releases.

Introduction

To ensure data integrity, only one node is allowed to run a VM or any other cluster-service at a time. The use of power switches in the hardware configuration enables a node to power-cycle another node before restarting that node's HA services during a fail-over process. This prevents two nodes from simultaneously accessing the same data and corrupting it. Fence devices are used to guarantee data integrity under all failure conditions.

For a good, easy introduction to HA fencing concepts, see http://www.clusterlabs.org/doc/crm_fencing.html and http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Configuration_Example_-_Fence_Devices/index.html

Read the most general, introductory parts of these documents; they refer to other HA software, not Proxmox VE.

Configure nodes to boot immediately and always after power cycle

Check your BIOS settings and test that they work: unplug the power cord and verify that the server boots up again after reconnecting it.

If you use integrated fence devices, you must configure ACPI (Advanced Configuration and Power Interface) to ensure immediate and complete fencing - here are the different options:

  • make sure that you did not install acpid (remove it with: aptitude remove acpid)
  • disable ACPI soft-off in the BIOS
  • disable ACPI by adding acpi=off to the kernel boot command line

In any case, you need to make sure that the node turns off immediately when fenced. If you have delays here, the HA resources cannot be moved.
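
A quick way to double-check the first and third options from a running node (a minimal sketch, assuming a standard Debian-based Proxmox VE installation):

# verify that acpid is not installed (no output means it is absent)
dpkg -l | grep acpid
# verify that the kernel was booted with acpi=off (only relevant if you chose that option)
grep -o acpi=off /proc/cmdline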

Enable fencing on all nodes

To activate fencing, you also need to join each node to the fence domain. Do the following on all your cluster nodes.

  • Enable fencing in /etc/default/redhat-cluster-pve (Just uncomment the last line, see below):
nano /etc/default/redhat-cluster-pve
FENCE_JOIN="yes"
  • Restart the cman service:
/etc/init.d/cman restart
  • Join the fence domain with:
fence_tool join

To check the status, just run (this example shows all 3 nodes already joined):

fence_tool ls
fence domain
member count  3
victim count  0
victim now    0
master nodeid 1
wait state    none
members       1 2 3

Note: If the cluster goes out of sync after you have completed the join and restarted the cman service on all nodes, you must also restart the pve-cluster service on all nodes:

service pve-cluster restart
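
To run this from a single shell, a hypothetical loop over the cluster members could look like the following (node1, node2 and node3 are placeholders for your own host names):

# restart pve-cluster on every node via SSH; adjust the host list to your cluster
for n in node1 node2 node3; do ssh root@$n 'service pve-cluster restart'; done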

General HowTo for editing the cluster.conf

Note: This is no longer valid under PVE 4; read the page Editing_corosync.conf instead.

First, create a copy of the current cluster.conf, make the needed changes, increase the config_version number, check the syntax, and if everything is ready, activate the new config via the GUI.

Here are the steps:

cp /etc/pve/cluster.conf /etc/pve/cluster.conf.new
nano /etc/pve/cluster.conf.new

If you edit this file via the CLI, you ALWAYS need to increase the "config_version" number. This guarantees that all nodes apply the new settings.

You should validate the config with the following command [Note: this is no longer valid under PVE 4, see Editing_corosync.conf]:

ccs_config_validate -v -f /etc/pve/cluster.conf.new

To apply the new config, go to the web interface (Datacenter/HA). You can review the changes, and if the syntax is OK, you can commit them via the GUI to all nodes. By doing this, all nodes get the new config and apply it automatically.
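
Put together, a typical CLI round trip looks roughly like this (a sketch; the version numbers 28 and 29 are only examples and must match your own file):

cp /etc/pve/cluster.conf /etc/pve/cluster.conf.new
nano /etc/pve/cluster.conf.new
# bump the version number, e.g. from 28 to 29 (or simply edit the attribute in nano)
sed -i 's/config_version="28"/config_version="29"/' /etc/pve/cluster.conf.new
# validate before activating via the GUI (not valid under PVE 4)
ccs_config_validate -v -f /etc/pve/cluster.conf.new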

List of supported fence devices

APC Switch Rack PDU

E.g. an AP7921; here is an example used in our test lab.

Create a user on the APC web interface

I just configured a new user via "Outlet User Management":

  • user name: hpapc
  • password: 12345678

Make sure that you enable "Outlet Access" and SSH, and, most importantly, make sure you connected the physical servers to the correct power outlets.
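
Before referencing the PDU in cluster.conf, it can be worth confirming that each node can reach it over SSH at all (a trivial sketch; 192.168.2.30 is the lab address used in the example below):

# check that the PDU answers on the SSH port
nc -zv 192.168.2.30 22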

Example /etc/pve/cluster.conf.new with APC power fencing

This example uses the APC power switch as fencing device (make sure you enabled SSH on your APC). Additionally, a simple "TestIP" is used for HA service and fail-over testing.

cp /etc/pve/cluster.conf /etc/pve/cluster.conf.new
nano /etc/pve/cluster.conf.new
<?xml version="1.0"?>
<cluster name="hpcluster765" config_version="28">

  <cman keyfile="/var/lib/pve-cluster/corosync.authkey">
  </cman>

  <fencedevices>
    <fencedevice agent="fence_apc" ipaddr="192.168.2.30" login="hpapc" name="apc" passwd="12345678" power_wait="10"/>
  </fencedevices>

  <clusternodes>

  <clusternode name="hp4" votes="1" nodeid="1">
    <fence>
      <method name="power">
        <device name="apc" port="4" secure="on"/>
      </method>
    </fence>
  </clusternode>

  <clusternode name="hp1" votes="1" nodeid="2">
    <fence>
      <method name="power">
        <device name="apc" port="1" secure="on"/>
      </method>
    </fence>
  </clusternode>

  <clusternode name="hp3" votes="1" nodeid="3">
    <fence>
      <method name="power">
        <device name="apc" port="3" secure="on"/>
      </method>
    </fence>
  </clusternode>

  <clusternode name="hp2" votes="1" nodeid="4">
    <fence>
      <method name="power">
        <device name="apc" port="2" secure="on"/>
      </method>
    </fence>
  </clusternode>

  </clusternodes>

  <rm>
    <service autostart="1" exclusive="0" name="TestIP" recovery="relocate">
      <ip address="192.168.7.180"/>
    </service>
  </rm>

</cluster>

Note

If you edit this file via the CLI, you ALWAYS need to increase the "config_version" number. This guarantees that all nodes apply the new settings.

You should validate the config with the following command:

ccs_config_validate -v -f /etc/pve/cluster.conf.new

To apply the new config, go to the web interface (Datacenter/HA). You can review the changes, and if the syntax is OK, you can commit them via the GUI to all nodes. By doing this, all nodes get the new config and apply it automatically.

The power_wait option specifies how long to wait between power actions. Without it, the server would be turned off and then back on in quick succession. Setting it ensures that the server stays off for a certain amount of time before being turned back on, resulting in more reliable fencing.

Intel Modular Server HA

Dell servers using iDRAC

Dell iDRAC cards can be used as fencing devices.

Create a user on the Dell iDRAC

  • Create a fencing_user account on each iDRAC with 'Operator' permissions.
  • Although the iDRAC network is usually on a private, secure network, unique passwords for each machine can be entered in the configuration below.
  • Configure your fence user under iDRAC User Authentication and add Operator status for iDRAC, LAN, and Serial Port.
  • Set IPMI User Privileges to Operator and check Enable Serial Over LAN.
  • Your Proxmox hosts need network access, via SSH, to your Dell iDRAC cards.
  • See Testing Dell iDRAC to verify your syntax.

Example /etc/pve/cluster.conf.new with iDRAC

This config was tested with DRAC V7 cards.

See above for editing/activation steps.

nano /etc/pve/cluster.conf.new
<?xml version="1.0"?>
<cluster name="peR620" config_version="28">
  <cman keyfile="/var/lib/pve-cluster/corosync.authkey">
  </cman>
  <fencedevices>
    <fencedevice agent="fence_drac5" cmd_prompt="admin1->" ipaddr="X.X.X.X" login="fencing_user" name="node1-drac" passwd="XXXX" secure="1"/>
    <fencedevice agent="fence_drac5" cmd_prompt="admin1->" ipaddr="X.X.X.X" login="fencing_user" name="node2-drac" passwd="XXXX" secure="1"/>
    <fencedevice agent="fence_drac5" cmd_prompt="admin1->" ipaddr="X.X.X.X" login="fencing_user" name="node3-drac" passwd="XXXX" secure="1"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="node1" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="node1-drac"/>
        </method>
      </fence>
  </clusternode>
  <clusternode name="node2" nodeid="2" votes="1">
    <fence>
      <method name="1">
        <device name="node2-drac"/>
      </method>
    </fence>
  </clusternode>
  <clusternode name="node3" nodeid="3" votes="1">
    <fence>
      <method name="1">
        <device name="node3-drac"/>
      </method>
    </fence>
  </clusternode>
  </clusternodes>
</cluster>

For Dell iDRAC5 cards you can basically use the same config as for DRAC7, but you need to change the fencedevice entries to:

  <fencedevices>
    <fencedevice agent="fence_drac5" ipaddr="X.X.X.X" login="fencing_user" name="node1-drac" passwd="XXXX" secure="1"/>
    <fencedevice agent="fence_drac5" ipaddr="X.X.X.X" login="fencing_user" name="node2-drac" passwd="XXXX" secure="1"/>
    <fencedevice agent="fence_drac5" ipaddr="X.X.X.X" login="fencing_user" name="node3-drac" passwd="XXXX" secure="1"/>
  </fencedevices>

Dell blade servers

PowerEdge M1000e Chassis Management Controller (CMC) acts as a network power switch of sorts. You configure a single IP address on the CMC, and connect to that IP for management. Individual blade slots can be powered up or down as needed.

NOTE: At the time of this writing, there is a bug that prevents the CMC from powering the blade back up after it is fenced. To recover from a fenced outage, manually power the blade on (or connect to the CMC and issue the command racadm serveraction -m server-# powerup). New code available for testing can correct this behavior. See Bug 466788 for beta code and further discussions on this issue.
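
For manual recovery or testing, the CMC can be driven over SSH with racadm (a sketch, assuming blade 1; the available subcommands depend on your CMC firmware):

# check the power state of blade 1 through the CMC
racadm serveraction -m server-1 powerstatus
# power the blade back up after it has been fenced
racadm serveraction -m server-1 powerup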

NOTE: Using the individual iDRAC on each Dell Blade is not supported at this time. Instead use the Dell CMC as described in this section. If desired, you may configure IPMI as your secondary fencing method for individual Dell Blades. For information on support of the Dell iDRAC, see Bug 496748.

To configure your nodes for DRAC CMC fencing:

  1. For CMC IP Address enter the DRAC CMC IP address.
  2. Enter the specific blade for Module Name. For example, enter server-1 for blade 1, and server-4 for blade 4.

Example:

<?xml version="1.0"?>
<cluster name="hpcluster765" config_version="28">
  <cman keyfile="/var/lib/pve-cluster/corosync.authkey">
  </cman>
  <fencedevices>
       <fencedevice agent="fence_drac5" module_name="server-1" ipaddr="CMC IP Address (X.X.X.X)" login="root" secure="1" name="drac-cmc-blade1" passwd="drac_password"/>
       <fencedevice agent="fence_drac5" module_name="server-2" ipaddr="CMC IP Address (X.X.X.X)" login="root" secure="1" name="drac-cmc-blade2" passwd="drac_password"/>
       <fencedevice agent="fence_drac5" module_name="server-3" ipaddr="CMC IP Address (X.X.X.X)" login="root" secure="1" name="drac-cmc-blade3" passwd="drac_password"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="node1" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="drac-cmc-blade1"/>
        </method>
      </fence>
  </clusternode>
  <clusternode name="node2" nodeid="2" votes="1">
    <fence>
      <method name="1">
        <device name="drac-cmc-blade2"/>
      </method>
    </fence>
  </clusternode>
  <clusternode name="node3" nodeid="3" votes="1">
    <fence>
      <method name="1">
        <device name="drac-cmc-blade3"/>
      </method>
    </fence>
  </clusternode>
  </clusternodes>
  <rm>
    <service autostart="1" exclusive="0" name="TestIP" recovery="relocate">
      <ip address="192.168.7.180"/>
    </service>
  </rm>
</cluster>


IPMI (generic)

This is a generic method for IPMI-based fencing.

The following package is needed on all nodes (as of 2013-07-02); see the notes at the end of this section.

aptitude install ipmitool
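
Before adding the devices to cluster.conf, you can verify out-of-band access with ipmitool itself (a sketch; the address and credentials are placeholders for your BMC):

# query the power state of the remote BMC over IPMI-over-LAN
ipmitool -I lanplus -H X.X.X.X -U ipmiusername -P ipmipassword chassis power status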


<?xml version="1.0"?>
<cluster name="clustername" config_version="6">
    <cman keyfile="/var/lib/pve-cluster/corosync.authkey">
    </cman>
    <fencedevices>
        <fencedevice agent="fence_ipmilan" name="ipmi1" lanplus="1" ipaddr="X.X.X.X" login="ipmiusername" passwd="ipmipassword" power_wait="5"/>
        <fencedevice agent="fence_ipmilan" name="ipmi2" lanplus="1" ipaddr="X.X.X.X" login="ipmiusername" passwd="ipmipassword" power_wait="5"/>
        <fencedevice agent="fence_ipmilan" name="ipmi3" lanplus="1" ipaddr="X.X.X.X" login="ipmiusername" passwd="ipmipassword" power_wait="5"/>
    </fencedevices>
    <clusternodes>
    <clusternode name="host1" votes="1" nodeid="1">
        <fence>
            <method name="1">
                 <device name="ipmi1"/>
            </method>
        </fence>
    </clusternode>
    <clusternode name="host2" votes="1" nodeid="2">
        <fence>
            <method name="1">
                 <device name="ipmi2"/>
            </method>
        </fence>
    </clusternode>
    <clusternode name="host3" votes="1" nodeid="3">
        <fence>
            <method name="1">
                 <device name="ipmi3"/>
            </method>
        </fence>
    </clusternode>
</clusternodes>
<rm>
    <service autostart="1" exclusive="0" name="ha_test_ip" recovery="relocate">
        <ip address="192.168.7.180"/>
    </service>
</rm>
</cluster>

IPMI notes

After setting up IPMI in cluster.conf, I tested and got this:

fbc3  ~ # fence_node fbc240 -vv
fence fbc240 dev 0.0 agent fence_ipmilan result: error from agent
agent args: nodename=fbc240 agent=fence_ipmilan lanplus=1 ipaddr=10.1.10.173 login=**** passwd=****** power_wait=5 
fence fbc240 failed

This was solved with

aptitude install ipmitool

then:

fbc3  ~ # fence_node fbc240 -vv
fence fbc240 dev 0.0 agent fence_ipmilan result: success
agent args: nodename=fbc240 agent=fence_ipmilan lanplus=1 ipaddr=10.1.10.173 login=***** passwd=*** power_wait=5 
fence fbc240 success

The above was tested on a SuperMicro system.

APC Master Switch

Some old APC PDUs do not support SSH and therefore do not work with fence_apc. These older units do work with SNMP, which allows the fence agent fence_apc_snmp to be used instead.
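
To confirm that the unit answers SNMP requests with your community string before configuring it, a quick check with the net-snmp tools could look like this (a sketch; 192.168.2.30 and the community 12345678 match the example below):

# read the system description (sysDescr, OID 1.3.6.1.2.1.1.1) via SNMPv1 to verify community access
snmpwalk -v1 -c 12345678 192.168.2.30 1.3.6.1.2.1.1.1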


<?xml version="1.0"?>
<cluster name="hpcluster765" config_version="28">

  <cman keyfile="/var/lib/pve-cluster/corosync.authkey">
  </cman>

  <fencedevices>
    <fencedevice agent="fence_apc_snmp" ipaddr="192.168.2.30" name="apc" community="12345678" power_wait="10"/>

  </fencedevices>

  <clusternodes>

  <clusternode name="hp4" votes="1" nodeid="1">
    <fence>
      <method name="power">
        <device name="apc" port="4" />
      </method>
    </fence>
  </clusternode>

  <clusternode name="hp1" votes="1" nodeid="2">
    <fence>
      <method name="power">
        <device name="apc" port="1" />
      </method>
    </fence>
  </clusternode>

  <clusternode name="hp3" votes="1" nodeid="3">
    <fence>
      <method name="power">
        <device name="apc" port="3" />
      </method>
    </fence>
  </clusternode>

  <clusternode name="hp2" votes="1" nodeid="4">
    <fence>
      <method name="power">
        <device name="apc" port="2" />
      </method>
    </fence>
  </clusternode>

  </clusternodes>

  <rm>
    <service autostart="1" exclusive="0" name="TestIP" recovery="relocate">
      <ip address="192.168.7.180"/>
    </service>
  </rm>

</cluster>

Fencing using a managed switch

Prerequisites:

  1. A managed switch supporting SNMP
  2. Write access to the switch through SNMP


The idea behind this method is to isolate either the entire node or just its path to shared storage. This is done by calling the switch with the proper command to disable one or more ports, which effectively prevents the node from starting a VM or CT on the shared storage, since no route to the shared storage exists from the node anymore. Restoring access to the shared storage requires operator intervention on the switch, or running the fence command with the option to open the port(s) again. If the nodes use bonding, you need to disable the bridge aggregation on the switch, not the individual ports that are members of the bridge aggregation.

The example shown here uses SNMPv2c without a password, but with an ACL configured on the switch that only allows members of the cluster VLAN access to the configured fencing group on the switch. The fence agent accepts either an index number or the name of a port.

See list of known interfaces on the switch: fence_ifmib -o list -c <community> -a <IP> -n switch

Disable a specific interface on the switch: fence_ifmib --action=off -c <community> -a <IP> -n <index|name>

Enable a specific interface on the switch: fence_ifmib --action=on -c <community> -a <IP> -n <index|name>
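
fence_ifmib works by setting IF-MIB::ifAdminStatus via SNMP; you can verify that your community string really has write access with the net-snmp tools (a sketch; interface index 2 is only an example, use the index from the list command above):

# read the admin status of interface index 2 (1 = up, 2 = down)
snmpget -v2c -c <community> <IP> IF-MIB::ifAdminStatus.2
# administratively disable that interface (this is what the fence agent's "off" action does)
snmpset -v2c -c <community> <IP> IF-MIB::ifAdminStatus.2 i 2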

Example:

<?xml version="1.0"?>
<cluster config_version="74" name="proxmox">
 <cman expected_votes="3" keyfile="/var/lib/pve-cluster/corosync.authkey"/>
 <quorumd allow_kill="0" interval="3" label="proxmox1_qdisk" tko="10" votes="1">
   <heuristic interval="3" program="ping $GATEWAY -c1 -w1" score="1" tko="4"/>
   <heuristic interval="3" program="ip addr | grep eth1 | grep -q UP" score="2" tko="3"/>
 </quorumd>
 <totem token="54000"/>
 <fencedevices>
   <fencedevice agent="fence_ifmib" community="fencing" ipaddr="172.16.3.254" name="hp1910" snmp_version="2c"/>
 </fencedevices>
 <clusternodes>
   <clusternode name="esx1" nodeid="1" votes="1">
     <fence>
       <method name="fence">
         <device action="off" name="hp1910" port="Bridge-Aggregation2"/>
       </method>
     </fence>
   </clusternode>
   <clusternode name="esx2" nodeid="2" votes="1">
     <fence>
       <method name="fence">
         <device action="off" name="hp1910" port="Bridge-Aggregation3"/>
       </method>
     </fence>
   </clusternode>
 </clusternodes>
 <rm>
   <failoverdomains>
     <failoverdomain name="webfailover" ordered="0" restricted="1">
       <failoverdomainnode name="esx1"/>
       <failoverdomainnode name="esx2"/>
     </failoverdomain>
   </failoverdomains>
   <resources>
     <ip address="172.16.3.7" monitor_link="5"/>
   </resources>
   <service autostart="1" domain="webfailover" name="web" recovery="relocate">
     <ip ref="172.16.3.7"/>
   </service>
   <pvevm autostart="1" vmid="109"/>
 </rm>
</cluster>


Multiple methods for a node

Note: See also man fenced

In more advanced configurations, multiple fencing methods can be defined for a node. If fencing fails using the first method, fenced will try the next method, and continue to cycle through methods until one succeeds.

       <clusternode name="node1" nodeid="1">
               <fence>
               <method name="1">
               <device name="myswitch" foo="x"/>
               </method>
               <method name="2">
               <device name="another" bar="123"/>
               </method>
               </fence>
       </clusternode>

       <fencedevices>
               <fencedevice name="myswitch" agent="..." something="..."/>
               <fencedevice name="another" agent="..."/>
       </fencedevices>

Dual path, redundant power

Note: See also man fenced

Sometimes fencing a node requires disabling two power ports or two I/O paths. This is done by specifying two or more devices within a method. fenced will run the agent for the device twice, once for each device line, and both must succeed for fencing to be considered successful.

       <clusternode name="node1" nodeid="1">
               <fence>
               <method name="1">
               <device name="sanswitch1" port="11"/>
               <device name="sanswitch2" port="11"/>
               </method>
               </fence>
       </clusternode>

When using power switches to fence nodes with dual power supplies, the agents must be told to turn off both power ports before restoring power to either port. The default off-on behavior of the agent could result in the power never being fully disabled to the node.

       <clusternode name="node1" nodeid="1">
               <fence>
               <method name="1">
               <device name="nps1" port="11" action="off"/>
               <device name="nps2" port="11" action="off"/>
               <device name="nps1" port="11" action="on"/>
               <device name="nps2" port="11" action="on"/>
               </method>
               </fence>
       </clusternode>

Test fencing

Before you use the fencing device, make sure that it works as expected:

Display internal fenced state:

fence_tool ls
  fence domain
  member count  3
  victim count  0
  victim now    0
  master nodeid 3
  wait state    none
  members       2 3 4

Test fencing with fence_node:

fence_node NODENAME -vv

Where NODENAME is from the node definition line:

<clusternode name="NODENAME" nodeid="1" votes="1">

You should get a "success" here and your machine powers off.

Repeat the command and the machine powers back on.

Testing APC Switch Rack PDU

In my example configuration, the AP7921 uses the IP 192.168.2.30.

Query the status of the power outlet:

fence_apc -x -l hpapc -p 12345678 -a 192.168.2.30 -o status -n 1 -v

Reboot the server using fence_apc:

fence_apc -x -l hpapc -p 12345678 -a 192.168.2.30 -o reboot -n 1 -v

Test fencing with fence_node:

fence_node NODENAME -vv

You should get a "success" here.

Testing Dell iDRAC

iDRAC[5,6,7] use the fence_drac5 agent, as indicated in the Dell settings above.

Test on the command line of another server using:

fence_drac5 --ip="10.1.1.2" --username="prox-b-drac" --password="****" --ssh --verbose --debug-file="/tmp/foo" --command-prompt="admin1->" --action="off"

Check the /tmp/foo file for connection logs.

  • Can you ssh into the iDRAC using the given username/password?
  • Is ssh enabled within iDRAC management? (Overview > iDRAC preferences > iDRAC Settings > Network > Services > ssh)
nc -zv [ipOf_iDRAC] 22
  • Are you trying to connect to the iDRAC management port or the Proxmox IP address?