High Availability Cluster: Simple version
|Note: Article about the old stable Proxmox VE 3.x releases. See High Availability Cluster 4.x for the new HA stack.|
- From HA Cluster :
Proxmox VE High Availability Cluster (Proxmox VE HA Cluster) enables the definition of highly available virtual machines. In simple words, if a virtual machine (VM) is configured as HA and the physical host fails, the VM is automatically restarted on one of the remaining Proxmox VE Cluster nodes. The Proxmox VE HA Cluster is based on proven Linux HA technologies, providing a stable and reliable HA service.
To offer a simpler way to build an HA cluster, and a better understanding of its basics, we have created this guide.
For testing this guide, we need:
- at least three machines with virtualization support.
- at least one shared storage
We have N nodes ready, for example this list of hostnames:
- node1 (example IP: 192.168.7.1)
- node2 (example IP: 192.168.7.2)
- node3 (example IP: 192.168.7.3)
- node4 (example IP: 192.168.7.4)
Now we can create the cluster Extensys.
On node1:
root@node1:~$ pvecm create Extensys
*NOTE: we can create the cluster on any node.
And then, we add all the other nodes:
root@node2:~$ pvecm add 192.168.7.1   <-- this is the IP of the node where the cluster was created
*NOTE: follow all the instructions, then wait until it's done.
root@node3:~$ pvecm add 192.168.7.1
root@node4:~$ pvecm add 192.168.7.2   <-- we can add against any node of the cluster
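Once all nodes have joined, membership can be double-checked from any node. A minimal sketch (run as root on a cluster node; the exact output fields vary by version):

```shell
# Confirm that all nodes have joined and the cluster is quorate
pvecm status | grep -Ei 'nodes|quorum'
pvecm nodes
```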
Display the cluster status:
root@node1:~$ pvecm nodes
Node  Sts   Inc   Joined               Name
   1   M    536   2014-05-17 11:33:43  node1
   2   M    536   2014-05-20 08:12:19  node2
   3   M    536   2014-05-17 11:33:43  node3
   4   M    528   2014-05-17 11:33:18  node4
Basic HA: Fencing Between Nodes
The first step to implement High Availability is to activate fencing on every node (for more info on fencing, read here).
Uncomment this line in /etc/default/redhat-cluster-pve:
FENCE_JOIN="yes"
root@node3:~$ service cman reload
*NOTE: we must check that the rgmanager service is up and running after the cman reload:
root@node3:~$ service rgmanager status
rgmanager (pid 2961 2959 2956) is running...
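Joining the fence domain is not enough on its own: for automatic recovery after a real hardware failure, a fence device must also be configured in /etc/pve/cluster.conf. A sketch of an IPMI-based setup, where the device name, IP address and credentials are placeholders (on Proxmox VE 3.x you edit a copy as /etc/pve/cluster.conf.new, increment config_version, and activate it from the GUI's HA tab):

```xml
<!-- fragment of /etc/pve/cluster.conf; remember to increment config_version -->
<fencedevices>
  <fencedevice agent="fence_ipmilan" name="fence-node1" lanplus="1"
               ipaddr="192.168.7.201" login="admin" passwd="secret" power_wait="5"/>
</fencedevices>

<clusternode name="node1" nodeid="1" votes="1">
  <fence>
    <method name="1">
      <device name="fence-node1"/>
    </method>
  </fence>
</clusternode>
```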
If you encounter any problems with rgmanager, try rebooting all nodes with "reboot" (do not use "shutdown -r now").
When all the nodes are in the fencing domain, we can use this command to check that everything is fine:
root@node3:~$ clustat
Cluster Status for extensys @ Mon May 26 11:35:43 2014
Member Status: Quorate

 Member Name                  ID   Status
 ------ ----                  ---- ------
 node1                           1 Online, rgmanager
 node2                           2 Online, rgmanager
 node3                           3 Online, Local, rgmanager
 node4                           4 Online, rgmanager
If you have only two nodes, you must set the expected votes to 1 on every node with this command:
pvecm expected 1
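Note that "pvecm expected 1" only changes the expected votes at runtime. To make a two-node setup permanent, the standard cman two-node flag can be set in /etc/pve/cluster.conf (a sketch; merge the attributes into the existing cman element and increment config_version as usual):

```xml
<!-- fragment of /etc/pve/cluster.conf: allow quorum with a single surviving node -->
<cman two_node="1" expected_votes="1"/>
```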
Activating HA on a VM
Now we have 4 nodes in the cluster, with fencing activated. Next, add the iSCSI/NFS storage in Datacenter -> Storage. Once done, create one or two VMs on it, then shut them down. Now go to Datacenter -> HA, click Add -> HA managed VM/CT, enter the ID of the VM, enable autostart and click Activate. Then go to the VM's options and enable "Start at boot".
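The GUI step above writes a resource entry into the rm (resource manager) section of /etc/pve/cluster.conf; the result looks roughly like this (VM IDs 101 and 102 are examples):

```xml
<!-- fragment of /etc/pve/cluster.conf -->
<rm>
  <pvevm autostart="1" vmid="101"/>
  <pvevm autostart="1" vmid="102"/>
</rm>
```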
At this point, we need to check that everything we have done is right:
root@node3:~$ clustat
Cluster Status for extensys @ Mon May 26 12:01:22 2014
Member Status: Quorate

 Member Name                  ID   Status
 ------ ----                  ---- ------
 node1                           1 Online, rgmanager
 node2                           2 Online, rgmanager
 node3                           3 Online, Local, rgmanager
 node4                           4 Online, rgmanager

 Service Name                 Owner (Last)                 State
 ------- ----                 ----- ------                 -----
 pvevm:101                    node2                        started
 pvevm:102                    node4                        started
If you wish to test HA:
- 1. start a VM on a node (like node1)
- 2. reboot or gently power off that node (*)
(*) A clean reboot or shutdown does not demonstrate recovery from a power loss or hard reset, so this method is only for testing purposes (or production, with some more work on it). For more information:
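Failover can also be exercised from the command line with rgmanager's clusvcadm, e.g. by relocating a managed VM instead of rebooting a whole node (the service name pvevm:101 and target node2 are examples):

```shell
# Relocate HA-managed VM 101 to node2, then watch the service state update
clusvcadm -r pvevm:101 -m node2
watch -n 2 clustat
```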