OpenVZ on ISCSI howto
Note: Article about the old stable Proxmox VE 3.x releases
OpenVZ on iSCSI
This is fairly straightforward to accomplish and also allows (offline) migration between Proxmox cluster nodes.
Offline migration means there is roughly a 5-second outage while the container is relocated to another node in the cluster. The phase 1 sync is done while the container is online; the container is then shut down, the phase 2 sync is completed, and the container is brought up on the other node.
Of course this depends on how long the services take to shut down and how much data there is in the phase 2 sync, so don't quote me on the times :) (The time mentioned was for a DNS/DHCP/file server I have tested personally.)
This assumes you have already set up a cluster and a SAN. This example is for a 3-node cluster, so if you have more nodes (or only one), just adapt the example accordingly.
NOTE: OpenVZ cannot share the same LUN (Logical Unit Number) on different nodes, so you need ONE LUN PER CLUSTER NODE. You can, however, have as many containers per node/LUN as you like.
For example:
- Master node0 connects to - LUN1 - 10 containers
- Cluster node1 connects to - LUN2 - 5 containers
- Cluster node2 connects to - LUN3 - 40 containers
- Etc..........
You can also run KVM instances at the same time; however, they require their OWN LUNs.
Add the iSCSI target to the master server
- Before you do this, let's get a look at our system devices.
- Open an SSH connection to the Proxmox Master Node.
- Run fdisk -l so we have a before and after view of system devices.
You will end up with something similar to the following. Make a note of your local system devices so you know what is there already. This applies whether you have existing SAN connections or not - it's best to know what the system looks like so you can see which devices get added and easily reference them later.
i.e. /dev/sda1, /dev/sda2, etc.
prox:~# fdisk -l

Disk /dev/sda: 8589 MB, 8589934592 bytes
255 heads, 63 sectors/track, 1044 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          66      524288   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sda2              66        1044     7861610   8e  Linux LVM

Disk /dev/dm-0: 1073 MB, 1073741824 bytes
255 heads, 63 sectors/track, 130 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

Disk /dev/dm-0 doesn't contain a valid partition table

Disk /dev/dm-1: 2147 MB, 2147483648 bytes
255 heads, 63 sectors/track, 261 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

Disk /dev/dm-1 doesn't contain a valid partition table

Disk /dev/dm-2: 3758 MB, 3758096384 bytes
255 heads, 63 sectors/track, 456 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

Disk /dev/dm-2 doesn't contain a valid partition table
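If you want to make the before/after comparison easier, one optional trick is to save this listing to a file now and diff it once the iSCSI target has been added (the /tmp file names below are just examples):

prox:~# fdisk -l > /tmp/fdisk-before.txt
# ...later, once the iSCSI target has been added...
prox:~# fdisk -l > /tmp/fdisk-after.txt
prox:~# diff /tmp/fdisk-before.txt /tmp/fdisk-after.txt

The diff output will then show only the newly added SAN devices.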
- Go to the web interface for Proxmox and add the iSCSI target under storage.
- If you have multiple targets, ensure you add each target for all the LUNs you need (a command-line alternative is sketched below).
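If you prefer to do (or troubleshoot) the discovery and login from the command line, something along these lines should work with the standard open-iscsi tools - the portal IP and IQN are just the example values used elsewhere on this page, and adding the storage via the web interface is still what Proxmox expects:

prox1:#iscsiadm -m discovery -t sendtargets -p 10.5.0.6:3260
prox1:#iscsiadm -m node -T iqn.2006-01.com.openfiler:tsn.672802aca9d8 -p 10.5.0.6:3260 --login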
Set iSCSI to automatic connection
- Now edit the iSCSI node config file for each instance of target/LUN you have added. In my case I have a single target, with all LUNs used by OpenVZ available on that single target. If you prefer to use one target per LUN that can also be done, but make sure you edit each node config file as follows:
nano -w /etc/iscsi/nodes/iqn_for_your_node_here/IP_Address_and_port_here/default
You should end up with something like this - you can of course hit Tab to help complete the path as you type it:
nano -w /etc/iscsi/nodes/iqn.2006-01.com.openfiler\:tsn.672802aca9d8/10.5.0.6\,3260\,1/default
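If you are unsure of the exact directory and file names, you can simply list what open-iscsi has recorded first:

prox1:#ls -lR /etc/iscsi/nodes/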
- Near the top of the file change node.startup to automatic
node.startup = automatic
- Near the bottom change node.conn[0] to automatic
node.conn[0].startup = automatic
- Exit and save the file.
- Restart the open-iscsi service with the following command:
prox1:#/etc/init.d/open-iscsi restart
Disconnecting iSCSI targets:Logging out of session [sid: 1, target: iqn.2006-01.com.openfiler:tsn.672802aca9d8, portal:10.5.0.6,3260]
Logout of [sid: 1, target: iqn.2006-01.com.openfiler:tsn.672802aca9d8, portal: 10.5.0.6,3260]: successful
Stopping iSCSI initiator service:.
Starting iSCSI initiator service: iscsid.
Setting up iSCSI targets:
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:tsn.672802aca9d8, portal: 10.5.0.6,3260]
Login to [iface: default, target: iqn.2006-01.com.openfiler:tsn.672802aca9d8, portal: 10.5.0.6,3260]: successful
Mounting network filesystems:.
NOTE: I have noticed that this setting is sometimes changed back to manual automatically. It is important to have it set to automatic if you have any servers set to boot at startup. It is also important to check this file on ALL cluster nodes, because the file gets replicated with the setting as manual. So this needs to be checked on each cluster node!
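If you find it has reverted, you can also check and re-apply the setting with iscsiadm rather than editing the file by hand - a sketch, using the example IQN/portal from above:

# check the current values
prox1:#iscsiadm -m node -T iqn.2006-01.com.openfiler:tsn.672802aca9d8 -p 10.5.0.6:3260 | grep startup
# set them back to automatic
prox1:#iscsiadm -m node -T iqn.2006-01.com.openfiler:tsn.672802aca9d8 -p 10.5.0.6:3260 --op update -n node.startup -v automatic
prox1:#iscsiadm -m node -T iqn.2006-01.com.openfiler:tsn.672802aca9d8 -p 10.5.0.6:3260 --op update -n node.conn[0].startup -v automatic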
- This config file is also where you set up authentication for iSCSI targets if you are using CHAP authentication (not recommended).
Setting up the LUN for OpenVZ
- First confirm that the LUN you want to use for this cluster node is now available to the system with fdisk -l.
Your output should now look something like the following:
prox:# fdisk -l

Disk /dev/sda: 8589 MB, 8589934592 bytes
255 heads, 63 sectors/track, 1044 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          66      524288   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sda2              66        1044     7861610   8e  Linux LVM

Disk /dev/dm-0: 1073 MB, 1073741824 bytes
255 heads, 63 sectors/track, 130 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

Disk /dev/dm-0 doesn't contain a valid partition table

Disk /dev/dm-1: 2147 MB, 2147483648 bytes
255 heads, 63 sectors/track, 261 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

Disk /dev/dm-1 doesn't contain a valid partition table

Disk /dev/dm-2: 3758 MB, 3758096384 bytes
255 heads, 63 sectors/track, 456 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

Disk /dev/dm-2 doesn't contain a valid partition table

Disk /dev/sdb: 176.1 GB, 176160768000 bytes
255 heads, 63 sectors/track, 21416 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xf2cc6ee0

Disk /dev/sdb doesn't contain a valid partition table

Disk /dev/sdc: 176.1 GB, 176160768000 bytes
255 heads, 63 sectors/track, 21416 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xf8fa13a9

Disk /dev/sdc doesn't contain a valid partition table

Disk /dev/sdd: 176.1 GB, 176160768000 bytes
255 heads, 63 sectors/track, 21416 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xd1403fdb

Disk /dev/sdd doesn't contain a valid partition table
As you can see from the above output, /dev/sdb, /dev/sdc and /dev/sdd have been added to my system. These are the 3 LUNs (1 per node) that we will be using.
NOTE: The disk identifiers MAY also appear as 0000000000 until you create the partitions and file system as detailed below.
To make it clear they will be used as follows:
/dev/sdb - prox  (cluster node 1)
/dev/sdc - prox1 (cluster node 2)
/dev/sdd - prox2 (cluster node 3)
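A more reliable way to work out which /dev/sdX device corresponds to which iSCSI LUN (rather than relying on detection order) is to look at the persistent names udev creates; the symlink names encode the portal, target IQN and LUN number:

prox1:#ls -l /dev/disk/by-path/ | grep iscsi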
- Now that we know which device we will be using for node 1, we do the following to create a partition on it (a scripted alternative is sketched after the steps):
prox1:#fdisk /dev/sdb

Type 'n' and press enter to create a new partition
Type 'p' and enter for primary partition
Type '1' and enter for the 1st (and only) partition
Press enter to accept the default start cylinder
Press enter to accept the default end cylinder
Type 't' and enter to set the system type
Type '83' and enter to set it as Linux
Type 'w' and enter to save changes and exit
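If you would rather script this than answer fdisk's prompts, something like the following should produce an equivalent single Linux partition on reasonably recent versions of parted - treat it as a sketch and double-check you are pointing at the correct device first:

prox1:#parted -s /dev/sdb mklabel msdos
prox1:#parted -s /dev/sdb mkpart primary ext3 0% 100%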
- Now create the file system on the partition you just created by running:
prox1:#mkfs.ext3 /dev/sdb1 (obviously change /dev/???1 to whatever the device is on your system)
You should see output similar to the following:
prox1:# mkfs.ext3 /dev/sdb1
mke2fs 1.41.3 (12-Oct-2008)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
10756096 inodes, 43005997 blocks
2150299 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
1313 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872

Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 38 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
- Repeat creating the partition and filesystem on any other LUNs you will be using for OpenVZ on other cluster nodes. This is so we can obtain the disk identifiers and easily reference which disk is being used on which system.
- Running fdisk -l now will look something like this for the example 3 LUNs:
Disk /dev/sdb: 176.1 GB, 176160768000 bytes
255 heads, 63 sectors/track, 21416 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xf2cc6ee0

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1       21416   172023988+  83  Linux

Disk /dev/sdc: 176.1 GB, 176160768000 bytes
255 heads, 63 sectors/track, 21416 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xf8fa13a9

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1       21416   172023988+  83  Linux

Disk /dev/sdd: 176.1 GB, 176160768000 bytes
255 heads, 63 sectors/track, 21416 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xd1403fdb

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1       21416   172023988+  83  Linux
NOTE: Make a note of the disk identifiers, as these will be the same on all 3 nodes, but the /dev/???1 device name can differ from node to node.
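Because the /dev/sdX names can shuffle around, an optional but more robust approach when you get to the fstab step in the next section is to refer to the filesystem by its UUID instead of the device name. Assuming blkid is available, a sketch (the UUID shown is just a placeholder):

prox1:#blkid /dev/sdb1
/dev/sdb1: UUID="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" TYPE="ext3"

Then in /etc/fstab you would use, instead of /dev/sdb1:

UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx /var/lib/vz ext3 defaults,auto,_netdev 0 0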
Mounting the filesystem
- Now you can mount the filesystem you have just created. As we want to use the iSCSI LUN to hold the OpenVZ containers, we have to remount the local storage to another folder.
So first create a folder:
# mkdir /var/lib/vz1 (or whatever name you want to give it)
- Now we open up fstab, edit the local mount and add in our iSCSI mount point:
# nano -w /etc/fstab
- You will see a line similar or the same as this near the top of the file:
/dev/pve/data /var/lib/vz ext3 defaults 0 1
CHANGE that to point to the new folder you just created - for example:
/dev/pve/data /var/lib/vz1 ext3 defaults 0 1
- At the bottom of the file you will need to add in the following line:
/dev/sdb1 /var/lib/vz ext3 defaults,auto,_netdev 0 0
Note: Obviously change /dev/???1 to whatever the device is on your system.
- Now exit and save the file
- At this point ENSURE there are NO VZ containers running (you can check with vzlist)
- Now run umount /var/lib/vz and mount -a
prox1:#umount /var/lib/vz
prox1:#
prox1:#mount -a
prox1:#
Note: You don't want to see any errors or other output at this point; if you do, something has gone wrong.
- If successful, you can now look at the filesystems mounted on your server with:
prox1:#df -h
You will see something similar to this:
prox1:#/etc/qemu-server# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/pve-root   17G  5.3G   11G  34% /
tmpfs                  16G     0   16G   0% /lib/init/rw
udev                   10M  616K  9.4M   7% /dev
tmpfs                  16G     0   16G   0% /dev/shm
/dev/mapper/pve-data   38G  4.7G   34G  13% /var/lib/vz1
/dev/sda1             504M   31M  448M   7% /boot
172.15.241.29:/mnt/luns/nfs/ISO
                       24G  172M   23G   1% /mnt/pve/ISO
/dev/sdb1             162G  188M  154G   1% /var/lib/vz
Notice how the local disk is now mounted at the new location and the iSCSI LUN is mounted at the default OpenVZ folder.
Copying your containers and booting up
- Now that you have the filesystem mounted and set up correctly, you can copy the local OpenVZ files, along with any existing containers, into the /var/lib/vz folder. Complete this step even if you have no existing containers, as the directory structure and system files are still required to exist BEFORE you start adding containers as you normally would via the Proxmox web interface.

prox1:#cp -vpr /var/lib/vz1/. /var/lib/vz/.

This will recursively copy all existing data across to the SAN.
If you want to confirm that this is actually happening, you can simply run the df -h command and you will see the Used and Avail sizes changing for /dev/???1. For example:
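In a second SSH session you could watch the sizes update every few seconds (watch is part of the standard procps package on Debian):

prox1:#watch -n 5 'df -h /var/lib/vz /var/lib/vz1'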
Setting up the cluster nodes
- As we have already (in steps 2 & 3) partitioned and created the filesystem for the other LUNs, you only need to repeat steps 4 & 5 on each of your other cluster nodes. Just ensure that for each node you are setting up, you change the relevant commands and config files to point to the correct device for that node.
NOTE: If you are adding new LUNs, then of course you will need to repeat all of the steps.
Each server MIGHT detect the devices in a different order or with different names, SO ALWAYS CHECK THE DISK IDENTIFIER TO ENSURE YOU ARE FORMATTING THE CORRECT LUN.
In this example I have 3 nodes in total, so I will use the following:

/dev/sdb1 - lun0 - master cluster node 0
/dev/sdc1 - lun1 - cluster node 1
/dev/sdd1 - lun2 - cluster node 2
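On each node, a quick way to list just the disks and their identifiers before formatting anything or touching fstab is something like:

prox2:#fdisk -l 2>/dev/null | grep -E '^Disk /dev/sd|identifier'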
Testing migration out
- Once you have set up all nodes successfully, you can start migrating to your heart's content.
- Only offline migrations work, but as stated there is minimal downtime even for offline migrations.
- Things to consider
1. There is significant network and CPU overhead while migrating containers, because this is NOT a shared filesystem as with KVM. So if you migrate via the Proxmox GUI, ALL data for the container is copied from one LUN to the other LUN via the host nodes you are migrating from and to.
2. vzmigrate is the script used to migrate containers from one node to another; however, by default it removes the source container from the current host after migrating. Instead of going via the GUI, you can run rsync manually with a custom script that takes 'from' and 'to' variables and runs rsync, then vzctl stop, then a final rsync, then vzctl start (a sketch of such a script is shown after the note below). This is effectively what an offline vzmigrate does, but if you do it manually, and keep the previous copy on the target, you are only synchronising changes rather than the entire folder, which significantly reduces network and CPU load in the process.
NOTE: This is only worthwhile for containers that you might frequently want to move from one cluster node to another - for example, less critical services that may spike in load at times and impact more critical services on that node. If, say, you sync the containers that are candidates for frequent migration after hours, then during critical times it would potentially take only a matter of seconds to migrate a container (with little to no impact on other, more critical services), instead of minutes (with severe impact on other servers due to high network and CPU load).
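As a concrete illustration of the manual approach described in point 2, a minimal sketch of such a script might look like the following. This is NOT the official vzmigrate: it assumes passwordless root SSH between the nodes, that the container's config file already exists on the target node, and that both nodes use the default /var/lib/vz layout; adjust IDs and paths to your setup.

#!/bin/bash
# Hypothetical helper - usage: ./ct-move.sh <CTID> <target-node>, e.g. ./ct-move.sh 101 prox2
CTID=$1
TARGET=$2

# Phase 1: bulk sync while the container is still running
rsync -a --delete /var/lib/vz/private/$CTID/ root@$TARGET:/var/lib/vz/private/$CTID/

# Stop the container, do a final quick sync of whatever changed, then start it on the target node
vzctl stop $CTID
rsync -a --delete /var/lib/vz/private/$CTID/ root@$TARGET:/var/lib/vz/private/$CTID/
ssh root@$TARGET "vzctl start $CTID"

Because the first rsync has already copied the bulk of the data, the stop / final sync / start window is usually only a few seconds, which is the reduced downtime described in the note above.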