Proxmox VE Cluster
Note: This article covers the outdated Proxmox VE 1.x releases.
Introduction
Proxmox VE Cluster enables central management of multiple physical servers. A Proxmox VE Cluster consists of one master and several nodes (minimum is a master and one node).
Main features
- Centralized web management
- One login and password for accessing all nodes and guests
- Console view to all Virtual Machines
- Migration of Virtual Machines between physical hosts
- Synchronized Virtual Appliance template store
Create a Proxmox VE Cluster
First, install two Proxmox VE servers (see Installation). Make sure that each Proxmox VE server has a unique host name; by default, all servers have the same host name.
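The unique host name requirement can be checked before the cluster is created. The following is a minimal sketch, not part of Proxmox VE: `find_duplicate_hostnames` is a hypothetical helper name, and it assumes the host names of all servers have been collected one per line.

```shell
# Print every host name that appears more than once on stdin (one name per line).
# Names are lower-cased first, so "PVE1" and "pve1" count as the same name.
find_duplicate_hostnames() {
    tr '[:upper:]' '[:lower:]' | sort | uniq -d
}
```

For example, `printf 'pve1\npve2\npve1\n' | find_duplicate_hostnames` prints the duplicated name; empty output means all names are unique.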
Currently, the cluster has to be created on the console; you can log in to the Proxmox VE server via ssh.
All settings can be made with "pveca", the PVE Cluster Administration Toolkit:
USAGE: pveca -l # show cluster status
pveca -c # create new cluster with localhost as master
pveca -s [-h IP] # sync cluster configuration from master (or IP)
pveca -d ID # delete a node
pveca -a [-h IP] # add new node to cluster
pveca -m # force local node to become master
Define the master
Log in via ssh to the first Proxmox VE server.
Create the master:
pveca -c
To check the state of the cluster:
pveca -l
Add a node to an existing master
Log in via ssh to a second Proxmox VE server. Note that the node should not have any VMs. (If it does, you will get conflicts with identical VMIDs. As a workaround, use vzdump to back up the VMs and restore them to a different VMID after the cluster configuration.)
Join a node to the master:
pveca -a -h IP-ADDRESS-MASTER
To check the state of the cluster:
pveca -l
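A mistyped master address is easy to catch before the join command runs. The sketch below is only an illustration: `is_ipv4`, `join_cluster`, and the `PVECA` override variable are hypothetical names introduced here, and only the `pveca -a -h` call comes from the usage text above.

```shell
# Return success only if $1 looks like a dotted-quad IPv4 address (each part 0-255).
is_ipv4() {
    printf '%s\n' "$1" | awk -F. '
        NF != 4 { exit 1 }
        { for (i = 1; i <= 4; i++) if ($i !~ /^[0-9]+$/ || $i > 255) exit 1 }
    '
}

# Join the local node to the cluster whose master is at $1.
# PVECA is a hypothetical override so the command can be replaced for a dry run.
join_cluster() {
    if ! is_ipv4 "$1"; then
        echo "join_cluster: '$1' is not an IPv4 address" >&2
        return 1
    fi
    "${PVECA:-pveca}" -a -h "$1"
}
```

Setting `PVECA=echo` before calling `join_cluster 192.168.7.104` prints the arguments instead of running pveca.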
Display the state of the cluster:
pveca -l
CID----IPADDRESS----ROLE-STATE--------UPTIME---LOAD----MEM---ROOT---DATA
 1 : 192.168.7.104   M     A    5 days 01:43   0.54    20%     1%     4%
 2 : 192.168.7.103   N     A    2 days 05:02   0.04    26%     5%    29%
 3 : 192.168.7.105   N     A           00:13   1.41    22%     3%    15%
 4 : 192.168.7.106   N     A           00:05   0.54    17%     3%     3%
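The status listing is regular enough to be read by a script. The following sketch assumes the layout shown above (a numeric CID, a lone ":", then IP, ROLE, and STATE columns); `cluster_nodes` and `master_ip` are hypothetical helper names, not pveca features.

```shell
# Print "CID IP ROLE STATE" for every node row of `pveca -l` output read from stdin.
# Node rows are recognised by a numeric first field followed by a lone ":".
cluster_nodes() {
    awk '$1 ~ /^[0-9]+$/ && $2 == ":" { print $1, $3, $4, $5 }'
}

# Print the IP address of the row whose ROLE column is "M" (the master).
master_ip() {
    cluster_nodes | awk '$3 == "M" { print $2 }'
}
```

On a cluster member this could be used as `pveca -l | master_ip`.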
Remove a cluster node
Using the Central Web-based Management, move all virtual machines off the node you want to remove from the cluster. If the node holds local backups you want to keep, save them elsewhere first.
Log in to the cluster master node via ssh. Issue a pveca command to identify the node ID:
pveca -l
CID----IPADDRESS----ROLE-STATE--------UPTIME---LOAD----MEM---DISK
 1 : 172.25.3.40     M     A   58 days 00:02   0.02     7%    75%
 2 : 172.25.3.41     N     A   57 days 22:42   0.02     2%    90%
 3 : 172.25.3.39     N     A   59 days 00:26   0.00     2%    14%
Issue the delete command (here deleting node #3):
pveca -d 3
If the operation succeeds, no output is returned.
The cluster unconfigures the slave node, and shortly afterwards the node is no longer visible in the Central Web-based Management.
Please note that the node will no longer see the shared storage.
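Because `pveca -d` takes the numeric ID rather than an address, looking the ID up by IP reduces the risk of deleting the wrong node. This is a sketch under the assumption that `pveca -l` prints rows in the layout shown above; `cid_for_ip` is a hypothetical helper name.

```shell
# Print the CID of the node with IP address $1, reading `pveca -l` output from stdin.
# Returns non-zero when no row matches, so a typo does not yield an empty ID.
cid_for_ip() {
    awk -v ip="$1" '
        $1 ~ /^[0-9]+$/ && $2 == ":" && $3 == ip { print $1; found = 1 }
        END { exit found ? 0 : 1 }
    '
}
```

A possible use on the master is `cid=$(pveca -l | cid_for_ip 172.25.3.39) && pveca -d "$cid"`, which only runs the delete when the lookup succeeded.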
Upgrade a working cluster
NOTE: If you run a critical production system, test upgrades in your test lab first.
Here are two possible ways to perform cluster upgrades (new PVE versions, etc.), using a simple two-node cluster as an example:
Method A
If you want to keep the disruption to each VM as short as possible:
- Live migrate all VMs to the master node.
- Update the slave node.
- Reboot the slave node.
- One by one, turn off each VM, migrate it offline to the updated node, and turn it on there.
- After all VMs are moved, update the master node and reboot it.
- Then live migrate the VMs that belong on the master node back to it.
This can also be done the other way round, i.e. by updating the master first and then the slave. Users report having done this without issues, but it may be better to update the slaves first and then the master.
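On clusters with more than two nodes, the "slaves first, then the master" order can be derived from the status listing. The sketch below assumes the `pveca -l` row layout shown earlier in this article, with ROLE "N" for nodes and "M" for the master; `upgrade_order` is a hypothetical helper and performs no upgrade itself.

```shell
# Read `pveca -l` output on stdin and print node IPs in upgrade order:
# every row with ROLE "N" first, then the row with ROLE "M" last.
upgrade_order() {
    awk '
        $1 ~ /^[0-9]+$/ && $2 == ":" && $4 == "N" { print $3 }
        $1 ~ /^[0-9]+$/ && $2 == ":" && $4 == "M" { master = $3 }
        END { if (master != "") print master }
    '
}
```

The output is one IP address per line and can serve as a checklist while working through the steps above.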
Method B
Users have also updated like this:
- Turn off all the VMs on a node.
- Install the updates.
- Reboot.
- Let the VMs autostart, or start them manually.
- Repeat the process until all nodes are updated.
See also this forum thread and the Upgrade and Downloads wiki pages.
Working with Proxmox VE Cluster
Now, you can start creating Virtual Machines on Cluster nodes by using the Central Web-based Management on the master.
Troubleshooting
You can manually check the cluster configuration file on each node. Before you edit it, stop the cluster sync and tunnel services via the web interface.
nano /etc/pve/cluster.cfg
Also check whether the following file is still up to date. If it is not, because your ssh host keys were updated, remove or adapt it:
/root/.ssh/known_hosts
Delete and recreate a cluster configuration
Sometimes it is quicker to delete and recreate your cluster configuration than to figure out what went wrong. The process consists of stopping the cluster sync and tunnel services, deleting the existing cluster configuration, and creating or joining the new cluster. The steps are:
- Note: If your ssh host keys have changed you may need to delete them on each host before you begin:
rm /root/.ssh/known_hosts
- Jot down the IP of the new master node.
- Run on new master node:
/etc/init.d/pvemirror stop
/etc/init.d/pvetunnel stop
rm /etc/pve/cluster.cfg
pveca -c
- Verify that the cluster has been created:
pveca -l
- Run on the nodes you wish to add to the new cluster:
- Note: Change IP-ADDRESS-MASTER to the IP of the new master node.
/etc/init.d/pvemirror stop
/etc/init.d/pvetunnel stop
rm /etc/pve/cluster.cfg
pveca -a -h IP-ADDRESS-MASTER
- Verify that the node has been added to the new cluster:
pveca -l
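The steps above can be collected into one function so that the same sequence runs on every host. This is a sketch only: `run`, `recreate_cluster`, and the `DRY_RUN` variable are hypothetical names introduced here, the commands are exactly those listed above, and by default nothing is executed, only printed.

```shell
# Run a command, or only print it when DRY_RUN is 1 (the default in this sketch).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Usage: recreate_cluster master
#        recreate_cluster node IP-ADDRESS-MASTER
# Mirrors the order of the steps in the text; no error handling beyond argument checks.
recreate_cluster() {
    role=$1
    master_addr=${2:-}
    case $role in
        master) ;;
        node)
            if [ -z "$master_addr" ]; then
                echo "recreate_cluster: role 'node' needs the master IP" >&2
                return 1
            fi
            ;;
        *)
            echo "usage: recreate_cluster master | node IP-ADDRESS-MASTER" >&2
            return 1
            ;;
    esac
    run /etc/init.d/pvemirror stop
    run /etc/init.d/pvetunnel stop
    run rm /etc/pve/cluster.cfg
    if [ "$role" = "master" ]; then
        run pveca -c
    else
        run pveca -a -h "$master_addr"
    fi
    run pveca -l
}
```

Review the printed commands first, then set `DRY_RUN=0` to execute them on the host in question.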
Local node already part of cluster
If you get the error "local node already part of cluster" and you have verified that the host name differs from that of every other node, your /etc/network/interfaces file may not be properly indented. Example:
auto vmbr0
iface vmbr0 inet static
address 10.9.0.2
netmask 255.255.255.0
gateway 10.9.0.1
bridge_ports bond0.9
bridge_stp off
bridge_fd 0
Should be:
auto vmbr0
iface vmbr0 inet static
        address 10.9.0.2
        netmask 255.255.255.0
        gateway 10.9.0.1
        bridge_ports bond0.9
        bridge_stp off
        bridge_fd 0
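Missing indentation of this kind can be spotted mechanically. The following sketch assumes, as described above, that option lines after an "iface" line must be indented; `find_unindented_options` is a hypothetical helper and only recognises the common stanza keywords (auto, allow-*, mapping, source, iface).

```shell
# Read an interfaces file on stdin and print "LINE: text" for every option line
# that follows an "iface" line but starts in column 1.
# Stanza keywords, comments, and blank lines may start in column 1.
find_unindented_options() {
    awk '
        /^[ \t]*$/ || /^[ \t]*#/                        { next }
        /^(auto|allow-[^ \t]+|mapping|source)([ \t]|$)/ { in_iface = 0; next }
        /^iface[ \t]/                                   { in_iface = 1; next }
        in_iface && /^[^ \t]/                           { print NR ": " $0 }
    '
}
```

Running `find_unindented_options < /etc/network/interfaces` prints nothing when every option line is indented.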
