DRBD9
Introduction
DRBD® refers to block devices designed as a building block to form high availability (HA) clusters. This is done by mirroring a whole block device via an assigned network. DRBD can be understood as network-based RAID-1. For detailed information please visit Linbit.
Main features of the integration in Proxmox VE:
- DRBD9/drbdmanage: DRBD devices on top of LVM
- All VM disks (LVM volumes on the DRBD device) can be replicated in real time on several Proxmox VE nodes via the network.
- Ability to live migrate running machines without downtime in a few seconds, WITHOUT the need for a SAN (iSCSI, FC, NFS), as the data is already on both nodes.
- LXC containers can use DRBD9 storage
Note:
DRBD9 integration is introduced in Proxmox VE 4.0 as a technology preview.
System requirements
You need 3 identical Proxmox VE servers (V4.0 or higher) with the following extra hardware:
- Extra NIC (dedicated for DRBD traffic)
- Second disk, SSD, flash card or RAID volume (e.g. /dev/sdb) for DRBD
- Use a hardware RAID controller with BBU to eliminate performance issues concerning internal metadata (see Florian's blog).
- A functional Proxmox VE Cluster (V4.0 or higher); a quick check is shown after this list.
- At least 2GB RAM in each node
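To verify the cluster requirement, you can query the cluster state from any node. This is only a minimal sketch using the standard Proxmox VE cluster tools; the node names pve1, pve2 and pve3 are the ones used later in this article:

# should list all three nodes (pve1, pve2, pve3) as cluster members
pvecm status
pvecm nodes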
VM settings when running on top of DRBD
- DRBD supports only the raw disk format at the moment.
- You need to change the VM disk cache mode from the default 'none' to 'writethrough'. Do not use write cache for any virtual drives on top of DRBD, as this can cause out-of-sync blocks. Follow the link for more information: http://forum.proxmox.com/threads/18259-KVM-on-top-of-DRBD-and-out-of-sync-long-term-investigation-results?p=93126 (a CLI sketch for setting the cache mode follows after this list)
- Consider doing integrity checking periodically to be sure DRBD is consistent
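The cache mode can be changed in the GUI under the VM's hardware options, or from the command line. The following is only a sketch: the VM ID 100, the virtio0 bus position and the volume name vm-100-disk-1 are placeholders for your actual VM and disk:

# placeholder VM ID, bus and volume name - adjust to your VM
qm set 100 --virtio0 drbd1:vm-100-disk-1,cache=writethrough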
Network
Configure the NIC dedicated for DRBD traffic (eth1 in the current example) on all nodes with a fixed private IP address via the web interface and reboot each server.
For better understanding, here is an /etc/network/interfaces example from the first node called pve1, after the reboot:
cat /etc/network/interfaces

# network interface settings
auto lo
iface lo inet loopback

iface eth0 inet manual

auto eth1
iface eth1 inet static
        address 10.0.15.81
        netmask 255.255.255.0

auto vmbr0
iface vmbr0 inet static
        address 192.168.15.81
        netmask 255.255.255.0
        gateway 192.168.15.1
        bridge_ports eth0
        bridge_stp off
        bridge_fd 0
And from the second node, called pve2:
# network interface settings
auto lo
iface lo inet loopback

iface eth0 inet manual

auto eth1
iface eth1 inet static
        address 10.0.15.82
        netmask 255.255.255.0

auto vmbr0
iface vmbr0 inet static
        address 192.168.15.82
        netmask 255.255.255.0
        gateway 192.168.15.1
        bridge_ports eth0
        bridge_stp off
        bridge_fd 0
And finally from the third node pve3:
# network interface settings
auto lo
iface lo inet loopback

iface eth0 inet manual

auto eth1
iface eth1 inet static
        address 10.0.15.83
        netmask 255.255.255.0

auto vmbr0
iface vmbr0 inet static
        address 192.168.15.83
        netmask 255.255.255.0
        gateway 192.168.15.1
        bridge_ports eth0
        bridge_stp off
        bridge_fd 0
Disk for DRBD
DRBD will search for the LVM Volume Group drbdpool. So you have to create it on all nodes.
I will use /dev/sdb1 for DRBD. Therefore I need to create this single big partition on /dev/sdb - make sure it exists on all nodes.
To prepare the disk for DRBD, just run:
parted /dev/sdb mktable gpt
parted /dev/sdb mkpart drbd 1 100%
parted /dev/sdb p

Model: ATA Samsung SSD 850 (scsi)
Disk /dev/sdb: 512GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start   End    Size   File system  Name  Flags
 1      1049kB  512GB  512GB               drbd
Then create the volume group and the thin pool logical volume dedicated to DRBD.
NOTE: The logical volumes must all have the same size on each node!
root@proxmox:~# vgcreate drbdpool /dev/sdb1
  Physical volume "/dev/sdb1" successfully created
  Volume group "drbdpool" successfully created
root@proxmox:~# lvcreate -L 511G -n drbdthinpool -T drbdpool
  Logical volume "drbdthinpool" created.
DRBD configuration
Software installation
Install the DRBD user tools on all nodes:
apt-get install drbdmanage -y
Configure DRBD
First make sure that the SSH keys of each node are in the "known_hosts" list of all the other nodes. This can easily be ensured by
ssh 10.0.15.82
etc.
To configure DRBD9 it is only necessary to run the following on node pve1:
drbdmanage init -q 10.0.15.81

  Failed to find logical volume "drbdpool/.drbdctrl_0"
  Failed to find logical volume "drbdpool/.drbdctrl_1"
  Logical volume ".drbdctrl_0" created.
  Logical volume ".drbdctrl_1" created.
initializing activity log
NOT initializing bitmap
Writing meta data...
New drbd meta data block successfully created.
initializing activity log
NOT initializing bitmap
Writing meta data...
New drbd meta data block successfully created.
empty drbdmanage control volume initialized.
empty drbdmanage control volume initialized.
Operation completed successfully
Now add all the other nodes of the cluster to DRBD with the following commands (you should check that login as root to these nodes works):
root@pve1:~# drbdmanage add-node -q pve2 10.0.15.82
Operation completed successfully
Operation completed successfully
Executing join command using ssh.
IMPORTANT: The output you see comes from pve2
IMPORTANT: Your input is executed on pve2
  Failed to find logical volume "drbdpool/.drbdctrl_0"
  Failed to find logical volume "drbdpool/.drbdctrl_1"
  Logical volume ".drbdctrl_0" created.
  Logical volume ".drbdctrl_1" created.
NOT initializing bitmap
initializing activity log
Writing meta data...
New drbd meta data block successfully created.
NOT initializing bitmap
initializing activity log
Writing meta data...
New drbd meta data block successfully created.
Operation completed successfully

root@pve1:~# drbdmanage add-node -q pve3 10.0.15.83
Then add a DRBD entry to /etc/pve/storage.cfg like this:
NOTE1: Redundancy <Number> - this number cannot be higher than the total number of nodes in your cluster.
NOTE2: If the file is missing, add some other storage first (e.g. a local directory) and Proxmox VE will create the file for you.
NOTE3: Each storage entry in that file must be followed by exactly one empty line
drbd: drbd1
        content images,rootdir
        redundancy 3
The node configuration can be verified by
drbdmanage list-nodes
Create the first VM on DRBD for testing and live migration
In the GUI you can see the DRBD storage and you can use it as virtual disk storage.
NOTE: DRBD supports only raw disk format at the moment.
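If you prefer the command line, here is a hypothetical sketch for creating such a test VM; the VM ID 100, the name, the 32 GB disk size and the bridge are placeholders, only the storage name drbd1 comes from the configuration above:

# placeholder VM ID, name and disk size - the virtio0 disk is allocated on the drbd1 storage
qm create 100 --name drbdtest --memory 2048 --net0 virtio,bridge=vmbr0 --virtio0 drbd1:32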
Try to live migrate the VM - as all data is already available on the target node, it will take just a few seconds. The overall process might take a bit longer if the VM is under load and if there is a lot of RAM involved. But in any case, the downtime is minimal and you will see no interruption at all.
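The migration can also be started from the shell; a minimal sketch, assuming the test VM has ID 100 and pve2 is the target node:

# --online keeps the VM running during the migration
qm migrate 100 pve2 --online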
DRBD support
DRBD can be configured in many different ways and there is a lot of space for optimizations and performance tuning. If you run DRBD in a production environment we highly recommend the DRBD commercial support from the DRBD developers. The company behind DRBD is Linbit.
Recovery from communication failure
Integrity checking
- You can enable "data-integrity-alg" for testing purposes and test for at least a week before production use. Don't use it in production, as it can cause split brain in dual-primary configurations and it also decreases performance. A configuration sketch follows after this list.
- It is a good idea to run "drbdadm verify" once a week (or at least once a month) while the servers are under low load.
# /etc/cron.d/drbdadm-verify-weekly
# This will have cron invoke a DRBD resource verification every Monday at 42 minutes past midnight
42 0 * * 1    root    /sbin/drbdadm verify all
- Check man drbd.conf, section "NOTES ON DATA INTEGRITY" for more information.
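For the first point, this is only a sketch of how "data-integrity-alg" could be enabled for testing; the file path and the chosen algorithm are assumptions, adjust them to your setup and remove the option again before going to production:

# /etc/drbd.d/global_common.conf (assumed location of the common section - testing only)
common {
    net {
        data-integrity-alg sha1;   # alternatives: md5, crc32c
    }
}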
Final considerations
Now you have fully redundant storage for your VMs without using expensive SAN equipment, configured in about 10 to 30 minutes - starting from bare metal.
- Three servers for a redundant SAN
- Three servers for redundant virtualization hosts