PVE-zsync
== Introduction ==
 
With the Proxmox VE ZFS replication manager (pve-zsync) you can synchronize your virtual machine (virtual disks and VM configuration) or directory stored on ZFS between two servers. By synchronizing, you have a full copy of your virtual machine on the second host and you can start your virtual machines on the second server (in case of data loss on the first server).
 
  
By default, the tool syncs every 15 minutes, but the synchronization interval is fully configurable via the integrated cron job setup. The configuration of pve-zsync can be done either on the source server or on the target server.
  
 
This is useful for advanced backup strategies.
 
  
'''Note:''' pve-zsync was introduced in Proxmox VE 3.4 as a technology preview. The package can also be installed on plain Debian Wheezy, Jessie or Stretch servers, as long as ZFS is configured.
  
'''Note:''' Our ZFS is configured to auto-mount all subvols, so keep this in mind if you use pve-zsync. Also, zvols will be scanned by LVM.

== Main features ==
 
*Speed limiter
*Syncing interval can be set by cron
*Syncing VMs (disks and config) but also ZFS datasets
*Can keep multiple backups
*Can be used in both directions
*Can send on the local host
*Traffic is encrypted
  
== Limitations ==
 
*recursive sync is not possible
*only SSH is supported for transfer
*email notification is done by cron
*name resolution is not taken into account; you have to use IP addresses (even hostnames from the hosts file will not work)
  
== System requirements ==
 
*Both target and source server must support ZFS (best practice: use Proxmox VE hosts)
*SSH must be installed and configured
*to receive email notifications, a working mail server is required (e.g. postfix)
*cstream
*perl
*scp
*JSON.pm (Debian package libjson-perl; needed e.g. on Wheezy)
== PVE Storage Replication and PVE-zsync ==

PVE Storage Replication and PVE-zsync work completely independently and do not interfere with each other, as long as the following requirements are met:

* The destination pool / dataset is different.
* You do not migrate a guest to another node.

Summary of the differences:

{| class="wikitable"
|-
! Characteristic !! PVE Storage Replication !! PVE-zsync
|-
| Replication || cluster-wide || to every node which meets the requirements
|-
| Operation mode || push || push or pull
|-
| Management || GUI and command line || command line only
|-
| Keep snapshots || no || yes
|-
| Migration || yes || no
|-
| Main goal || redundancy || offsite backup
|}

=== Migrate from pve-zsync to Storage Replication ===

When you have a cluster and would like to switch to Storage Replication, you have to destroy the pve-zsync job, clean up the storage, and create a new Storage Replication job.

 pve-zsync destroy <vmid> [--jobname <test>]

Then you have to clean up the ZFS storage.
  
Start at the source side and remove all snapshots which start with '@rep_':

 zfs destroy <pool>/[<path>/]vm-<VMID>-<type>-<number>@<rep_snapshots>

Then destroy all guest datasets on the destination side:

 zfs destroy -R vm-<vmid>-*-<DiskNO>

Now you can create a new Storage Replication job. For more information, see [[Storage Replication]].

== Configuration and use ==
  
 
Install the package with apt on your Proxmox VE host:

 apt-get install pve-zsync
 
This tool basically needs no configuration. On the first usage, when you create a job with an unknown host, the tool will ask you for the password of the remote server.
  
=== Sync a VM or ZFS dataset one time ===
(N.B. this is also possible if a recurring job for that VM already exists; keep in mind that the name given via --name must be the same.)
  
 
  root@zfs1:~# pve-zsync sync --source 100 --dest 192.168.1.2:tank/backup --verbose --maxsnap 2 --name test1 --limit 512
 
  
This command syncs VM 100, which is located on the server where the tool is called, and sends it to the server 192.168.1.2, onto the zpool tank, into the dataset backup. --maxsnap says to keep 2 backups; if there are more than 2 backups, the third one will be erased (sorted by creation time). --name is only needed if there is already a sync job.

The --limit parameter sets the speed limit used for syncing; here it would be 512 KBytes/s.
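The pruning rule can be sketched as follows. This is only an illustration of the behaviour described above, not pve-zsync's actual code; the snapshot names follow the rep_ pattern used in the examples later in this article:

```shell
# Sketch of --maxsnap 2 pruning: the timestamp embedded in the snapshot
# name sorts lexicographically, so after a reverse sort the newest entries
# come first and everything past the first 2 is pruned.
snaps='rep_test100_2015-06-10_11:03:01
rep_test100_2015-06-11_11:03:01
rep_test100_2015-06-12_11:03:01'
keep=$(printf '%s\n' "$snaps" | sort -r | head -n 2)    # the two newest stay
prune=$(printf '%s\n' "$snaps" | sort -r | tail -n +3)  # the rest go
printf 'would erase: %s\n' "$prune"
```

Here the oldest snapshot (2015-06-10) is the one that would be erased.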
 
=== Create a recurring sync job ===
  
  root@zfs2:~# pve-zsync create --source 192.168.1.1:100 --dest tank/backup --verbose --maxsnap 2 --name test1 --limit 512 --skip
  
The --skip parameter disables the initial sync, which normally would be done immediately but can take a while, depending on the size of the backup. The initial sync will then be done at the first scheduled sync time.
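To get a feel for how long that initial sync can take, here is a back-of-the-envelope estimate (purely illustrative; it assumes a 10 GiB disk and the --limit 512 value from the examples):

```shell
# 10 GiB at 512 KBytes/s: size in KBytes divided by the rate gives seconds.
disk_kbytes=$((10 * 1024 * 1024))   # 10 GiB expressed in KBytes
limit=512                           # --limit value, in KBytes/s
seconds=$((disk_kbytes / limit))
echo "${seconds}s, i.e. about $((seconds / 3600)) hours"
```

So with the speed limit from the examples, even a modest disk needs several hours for its first transfer, which is why deferring it with --skip can be useful.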
  
=== Delete a recurring sync job ===
 
If you delete a job, the former backup data will not be erased; only the job configuration will be removed.
  
  root@zfs2:~# pve-zsync destroy --source 192.168.1.1:100 --name test1
 
   
 
   
 
--name is not necessary if it is the default.
  
=== Pause a sync job ===
 
If you want to pause a job, e.g. for maintenance of the source server, use:
  
  root@zfs2:~# pve-zsync disable --source 192.168.1.1:100 --name test1
  
=== Reactivate a sync job ===
 
To reactivate a job, because it was paused or because it failed, use:
  
  root@zfs2:~# pve-zsync enable --source 192.168.1.1:100 --name test1
  
 
This will reset the error flag in case of failure.
 
  
=== Changing parameters ===
 
You can edit the job configuration in /etc/cron.d/pve-zsync, or destroy the job and create it anew.
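For orientation, a job line in /etc/cron.d/pve-zsync looks roughly like the following. This is an illustrative sketch assembled from the options used in this article; the exact fields pve-zsync writes may differ:

```text
*/15 * * * * root pve-zsync sync --source 192.168.1.1:100 --dest tank/backup --name test1 --maxsnap 2 --limit 512
```

Changing the `*/15` schedule field changes the sync interval; the remaining options mirror what was given to the create command.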
  
=== Information about the jobs ===
 
To get an overview of all jobs, use:
  
 
<pre>
root@zfs94:~# pve-zsync list
SOURCE                   NAME           STATE   LAST SYNC           TYPE
192.168.15.95:100        testing1       ok      2015-05-13_14:44:00 ssh
192.168.15.95:data/test1 testing1       syncing 2015-05-13_14:44:11 ssh

root@zfs94:~# pve-zsync status
SOURCE                   NAME           STATUS
192.168.15.95:100        testing1       ok
192.168.15.95:data/test1 testing1       syncing
</pre>
  
=== Recovering a VM ===
 
You must recover the VM or dataset manually. (In one of the upcoming releases, restore for Proxmox VE VMs will be integrated.)
  
First, stop the sync job for the VM or dataset in question.
  
 
'''NOTE:''' if you don't, you can interfere with the sync job, or your snapshot may be removed before you are able to send it.
 
<pre>
root@zfs2:~# pve-zsync disable --source 192.168.15.1:100 --name test
root@zfs2:~# pve-zsync list
SOURCE                   NAME           STATE   LAST SYNC           TYPE
192.168.15.1:100         test           stopped 2015-06-12_11:03:01 ssh
</pre>
  
Then you can send the VM or dataset to the selected target. SSH is only needed if you send to a remote server.

<pre>
zfs send <pool>/[<path>/]vm-<VMID>-disk-<number>@<last_snapshot> | [ssh root@<destination>] zfs receive <pool>/<path>/vm-<VMID>-disk-<number>
</pre>

If you have a VM, you must also copy the config, and you need to correct the virtual disk storage configuration accordingly.

<pre>
cp /var/lib/pve-zsync/<VMID>.conf.rep_<JOB_NAME><VMID>_<TIMESTAMP> /etc/pve/qemu-server/<VMID>.conf
</pre>
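The snapshot and config names encode the job name, VMID and timestamp. A small sketch of how the parts can be pulled apart with shell parameter expansion (the parsing is our illustration, not part of pve-zsync; the name follows the examples in this article):

```shell
# Take a replication snapshot suffix apart: rep_<JOB_NAME><VMID>_<TIMESTAMP>.
snap='rep_test100_2015-06-12_11:03:01'
body=${snap#rep_}       # strip the rep_ prefix
jobvm=${body%%_*}       # job name plus VMID, up to the first underscore
timestamp=${body#*_}    # everything after the first underscore
echo "$jobvm $timestamp"
```

Knowing which timestamp a snapshot carries helps pick the right one for `zfs send` and the matching saved config.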
  
'''NOTE:''' On PVE 4.2+ the path is: <tt>/var/lib/pve-zsync/data/<VMID>.conf.rep_<JOB_NAME><VMID>_<TIMESTAMP></tt>

Example: restore VM 100 with 2 disks from 192.168.15.2 (pve2) to 192.168.15.1 (pve1) and change the VMID from 100 to 200:
 
<pre>
root@zfs2:~# zfs send rpool/backup/vm-100-disk-1@rep_test100_2015-06-12_11:03:01 | ssh root@192.168.15.1 zfs receive vm/vm-200-disk-1
root@zfs2:~# ssh root@192.168.15.1
root@zfs1:~# cp /var/lib/pve-zsync/100.conf.rep_test100_2015-06-11_14:11:01 /etc/pve/qemu-server/200.conf
root@zfs1:~# nano /etc/pve/qemu-server/200.conf
</pre>
  
Now you have to change the storage path:

<pre>
bootdisk: virtio0
cores: 1
memory: 512
name: Debian8min
net0: virtio=12:5E:F6:59:A9:BB,bridge=vmbr0
numa: 0
ostype: l26
smbios1: uuid=11fa2fba-5670-4610-aabb-534ad7edeffe
sockets: 1
virtio0: zfs:vm-100-disk-1,size=10G
virtio1: zfs:vm-100-disk-2,size=10G
</pre>

to

<pre>
bootdisk: virtio0
cores: 1
memory: 512
name: Debian8min
net0: virtio=12:5E:F6:59:A9:BB,bridge=vmbr0
numa: 0
ostype: l26
smbios1: uuid=11fa2fba-5670-4610-aabb-534ad7edeffe
sockets: 1
virtio0: vm:vm-200-disk-1,size=10G
virtio1: vm:vm-200-disk-2,size=10G
</pre>
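The storage rename in the restored config can also be scripted instead of edited by hand in nano. A minimal sketch, assuming the storage names zfs/vm and VMIDs 100/200 from the example; the temp file stands in for /etc/pve/qemu-server/200.conf:

```shell
# Illustrative only: map every disk reference zfs:vm-100-* to vm:vm-200-*
# in a copy of the restored VM config.
conf=$(mktemp)   # stand-in for /etc/pve/qemu-server/200.conf
printf '%s\n' \
  'virtio0: zfs:vm-100-disk-1,size=10G' \
  'virtio1: zfs:vm-100-disk-2,size=10G' > "$conf"
sed -i 's/zfs:vm-100-/vm:vm-200-/' "$conf"
cat "$conf"
rm -f "$conf"
```

The same sed expression, pointed at the real config file, replaces both disk lines in one go; review the result before starting the VM.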
  
== Troubleshooting ==
Don't forget that the commands only work with IP addresses, not hostnames.

=== Job status is on error and dataset can't be erased on the destination system ===

If you have problems with a sync job, and when you try to erase the destination zvol you get the error "zfs data-set is busy", then LVM could be the problem.

This can occur if you sync zvols with an LVM on them.

In this case, please insert the following line in /etc/lvm/lvm.conf

 filter = [ "r|/dev/zd*|" ]

and reboot the system.
 
 
tbd.
 
== Video Tutorials ==

* [http://www.youtube.com/user/ProxmoxVE Proxmox VE Youtube channel]
== Tips ==

* As of 2016-11, if you migrate a VM that is used by pve-zsync:
:edit /etc/cron.d/pve-zsync and change the IP address for the VM.
:run 'pve-zsync enable --source ____ --name ____' or else there will be a warning when the job runs.

[[Category:HOWTO]] [[Category: Installation]]
