File System level backups with LVM snapshots

From Proxmox VE

Introduction

The general idea is to combine an external tool capable of filesystem-level incremental backups (rsync via BackupPC in this document) with the ability to take snapshots of the LVM-based storage of virtual machines.

Main constraints in designing this solution were:

  • Do not fundamentally change the configuration of a host under BackupPC.
  • Preserve easy interactive restore directly on the host.

When BackupPC requests a backup over an ssh connection, the target host does not execute the rsync command directly. It intercepts the command and runs a script (a "forced command") which:

  1. Prepares the backup operations (for instance, saves the ACLs on a Windows host).
  2. Stops or suspends services that make significant changes to the filesystem or require it to be logically consistent (e.g. databases).
  3. Triggers a snapshot of its own storage on the PVE host it is running on.
  4. Returns the machine to its normal operating state.
  5. Redirects the original rsync command to the PVE host and the snapshot.
    • The redirected rsync runs on PVE: mount the filesystem, optionally save the MBR and PBS, save NTFS metadata for Windows hosts, run rsync.
  6. Triggers the snapshot removal on PVE.

[Figure: Backuppc-snap-schema.png]

During an interactive restore, by contrast, the rsync process runs directly on the host.
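
The dispatch step of such a forced command could look roughly like the sketch below. It is an illustration only, not the actual script: the function name classify_request and the labels it prints are hypothetical. SSH_ORIGINAL_COMMAND is the variable in which sshd exposes the command requested by the client, and the rsync-backup, rsync-restore, acl-backup and acl-restore names are the ones used later in this document.

```shell
# Hypothetical sketch of the dispatch logic of a forced command.
# classify_request and its labels are illustrative names only.
classify_request() {
  # $1: command requested by the client (SSH_ORIGINAL_COMMAND under sshd)
  case "$1" in
    */rsync-backup\ *|rsync-backup\ *)   echo "snapshot-backup" ;;  # snapshot workflow
    */rsync-restore\ *|rsync-restore\ *) echo "direct-restore" ;;   # rsync runs on the host
    acl-backup\ *)                       echo "acl-backup" ;;
    acl-restore\ *)                      echo "acl-restore" ;;
    *)                                   echo "rejected" ;;         # anything else is refused
  esac
}

# With no command requested (a plain login attempt) this prints "rejected".
classify_request "${SSH_ORIGINAL_COMMAND:-}"
```

Keeping the decision in one case statement makes it easy to see that anything not explicitly listed is refused.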

In the following paragraphs you will find detailed configuration steps for a Windows host.

Requirements

  • Local user "backup", member of Administrators and Backup Operators

Tools needed on target Windows host

Procedure

Create "backup" user

  • Add "backup" user to "Administrators" and "Backup Operators" groups.
  • Connect to the host as "backup" user.
  • If disk quotas are enabled on any disk, check that the "backup" entry is set to "no limit" (interactively restored files are initially owned by this user).

Install Cygwin as backup user

  • Create C:\cygwin folder
    • Copy the c:\cygwin\cygwin-data folder from another server (or install from the net if this is the first host configured).
    • Copy the Cygwin installer "Setup.exe" locally and run it as the backup user.
      • Install for all users.
      • Keep the default (c:\cygwin) as the Cygwin root folder.
      • Set a local folder as the repository and use c:\cygwin\cygwin-data as the source.
    • Add the following packages:
      • openssh
      • rsync (NOTE: install 3.0.7; 3.0.8 is problematic.)
      • libiconv
      • libiconv2
      • subversion
      • vim
    • Proceed, accepting the creation of the Desktop and Start menu shortcuts.
    • Open a bash shell using the Desktop icon, wait for the default settings of the "backup" user to be created, then exit the shell.

cyg_server user setup

NOTE: Skip this step for Windows XP hosts; in that case sshd will run with system account privileges.

  • Create a domain-level user with Administrator privileges called cyg_server; this user will run the sshd service. (You can use another existing user instead, supplying it to the sshd service configuration script below when asked.)
  • Reconnect to the host with a Domain Administrator account; enter bash, and run:
mkpasswd -l -d <yourdomain name> | grep cyg_server >> /etc/

This adds an entry for the cyg_server user of domain <yourdomain name> to the Cygwin /etc/passwd file; the ssh daemon will run under this account.

  • Add cyg_server to local Administrators.
  • NOTE: Before proceeding with the following steps, check that cyg_server is listed as Domain Administrator in /etc/passwd and that the same user is a local Administrator.

ssh service setup

  • Reconnect as local "backup" user.
  • Run the "ssh-host-config" script from bash; the transcript below shows the answers to give to the various prompts ("*** Query:" lines).
$ ssh-host-config

*** Query: Overwrite existing /etc/ssh_config file? (yes/no)  yes

*** Info: Creating default /etc/ssh_config file

*** Query: Overwrite existing /etc/sshd_config file? (yes/no) yes

*** Info: Creating default /etc/sshd_config file
*** Info: Privilege separation is set to yes by default since OpenSSH 3.3.
*** Info: However, this requires a non-privileged account called 'sshd'.
*** Info: For more info on privilege separation read /usr/share/doc/openssh/README.privsep.


*** Query: Should privilege separation be used? (yes/no) yes

*** Info: Note that creating a new user requires that the current account have
*** Info: Administrator privileges.  Should this script attempt to create a

*** Query:new local account 'sshd'? yes

*** Info: Updating /etc/sshd_config file


*** Warning: The following functions require administrator privileges!

*** Query: Do you want to install sshd as a service?
*** Query: (Say "no" if it is already installed as a service) (yes/no) yes 

*** Query: Enter the value of CYGWIN for the daemon: [] ntsec

*** Info: On Windows Server 2003, Windows Vista, and above, the
*** Info: SYSTEM account cannot setuid to other users -- a capability
*** Info: sshd requires.  You need to have or to create a privileged
*** Info: account.  This script will help you do so.

*** Info: You appear to be running Windows XP 64bit, Windows 2003 Server,
*** Info: or later.  On these systems, it's not possible to use the LocalSystem
*** Info: account for services that can change the user id without an
*** Info: explicit password (such as passwordless logins [e.g. public key
*** Info: authentication] via sshd).

*** Info: If you want to enable that functionality, it's required to create
*** Info: a new account with special privileges (unless a similar account
*** Info: already exists). This account is then used to run these special
*** Info: servers.

*** Info: Note that creating a new user requires that the current account
*** Info: have Administrator privileges itself.

*** Info: This script plans to use 'cyg_server'.
*** Info: 'cyg_server' will only be used by registered services.


*** Query: Do you want to use a different name? (yes/no) no

*** Info: Please enter a password for new user cyg_server.  Please be sure
*** Info: that this password matches the password rules given on your system.
*** Info: Entering no password will exit the configuration.

*** Query: Please enter the password:
*** Query: Reenter:

*** Info: Also keep in mind that the user 'cyg_server' needs read permissions
*** Info: on all users' relevant files for the services running as 'cyg_server'.

*** Info: In particular, for the sshd server all users' .ssh/authorized_keys
*** Info: files must have appropriate permissions to allow public key
*** Info: authentication. (Re-)running ssh-user-config for each user will set
*** Info: these permissions correctly. [Similar restrictions apply, for
*** Info: instance, for .rhosts files if the rshd server is running, etc].

NOTE: In some cases (probably if you forgot to add cyg_server to the local Administrators), warnings like the following may appear:

*** Warning: cyg_server is in /etc/passwd, but the local
*** Warning: machine's SAM does not know about cyg_server.
*** Warning: Perhaps cyg_server is a pre-existing domain account.
*** Warning: Continuing, but check if this is ok.

In that case, verify cyg_server permissions as shown at the end of this document.

  • Start ssh service
net start sshd

ssh client setup

  • Copy from another backup host the ssh backup key
scp <hostname>:/home/backup/id_rsa_backup /home/backup/

All virtual machines connect to PVE hosts with the same key (generic access with this key is filtered with a forced command on PVE).

  • Connect with this key to every PVE node, accepting the insertion of the PVE ssh public key into the ~/.ssh/known_hosts file. The nodes must already be configured with the forced command; otherwise you will get a normal root shell.

The result should always be:

$ ssh -i id_rsa_backup -l root pve1
Rejected
Connection to pve1 closed.

If you instead send the qm status 101 command (assuming that 101 is the ID of a VM running on pve1):

$ ssh -i id_rsa_backup -l root pve1 'qm status 101'
status: running

This confirms that the ssh forced command on the PVE side works as expected.
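
With several PVE nodes, the "Rejected" check can be repeated in a loop. The following is a minimal sketch under stated assumptions: check_pve_nodes and the SSH_CMD override are hypothetical names introduced here for illustration, and the node names in the usage comment are examples only. It relies only on the ssh options -i, -l and -o BatchMode=yes.

```shell
# Hypothetical helper: confirm each PVE node answers "Rejected" to a plain login.
# SSH_CMD can be overridden for dry runs; by default the real ssh client is used.
check_pve_nodes() {
  local key="$1" node reply rc=0
  shift
  for node in "$@"; do
    # BatchMode avoids password prompts; stdin is closed so that an
    # unfiltered root shell exits at once instead of waiting for input.
    reply="$("${SSH_CMD:-ssh}" -o BatchMode=yes -i "$key" -l root "$node" </dev/null 2>/dev/null | head -n 1)"
    if [ "$reply" = "Rejected" ]; then
      echo "$node: filtered"
    else
      echo "$node: NOT filtered"
      rc=1
    fi
  done
  return "$rc"
}

# Example (node names are illustrative):
# check_pve_nodes /home/backup/id_rsa_backup pve1 pve2 pve3
```

A non-zero exit status means at least one node did not answer "Rejected" and should be checked by hand.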

"Forced command" setup

Edit ~/.ssh/authorized_keys, adding the forced command for backup connections:

command="/home/backup/backup-restore" ssh-rsa AAAAB3Nza...

where AAAAB3Nza... is the remote backuppc user public key.
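
As a small illustration of the line format, the sketch below builds such an entry from a script path and a public key line. The helper name forced_key_line is hypothetical, and the key is the truncated placeholder used above, not a real key.

```shell
# Hypothetical helper: prefix a public key line with a forced command.
forced_key_line() {
  # $1: script to force, $2: public key line (type, key, optional comment)
  printf 'command="%s" %s\n' "$1" "$2"
}

# Example with the truncated placeholder key:
forced_key_line /home/backup/backup-restore "ssh-rsa AAAAB3Nza..."
# prints: command="/home/backup/backup-restore" ssh-rsa AAAAB3Nza...
```

The output can be appended to the authorized_keys file of the "backup" user once the real public key is substituted.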

Host scripts

Download vm side scripts (updated October 30 2012) and extract them inside /home/backup directory.

PVE scripts

  • Download PVE side scripts (updated October 30 2012) and extract them in a directory of your choice (e.g.: /opt/snap)
  • Configure forced command inside PVE "authorized_keys" file:
command="/opt/snap/snap-backup-restore" ssh-rsa AAAAB3NzaC1...

where AAAAB3NzaC1... is the public counterpart of the id_rsa_backup key configured on the virtual machines.

NOTE: Take care that /opt/snap/snap-backup-restore is the correct location of the snapshot master script.

  • On PVE you will also need a recent version of ntfs-3g (the Debian Squeeze package is currently too old, the Wheezy version is fine) and the attr package.

Utilities

  • Copy the SetACL.exe executable to a directory of your choice (the scripts are configured with C:\App\Scripts\bin\ by default).
    • It is the Open Source tool that takes care of backup and interactive restore of Windows ACLs.
  • Copy the sync.exe executable to a directory of your choice (the scripts are configured with C:\App\Scripts\bin\ by default).
    • N.B.: sync.exe is NOT Open Source software, and you need to run it once from the GUI to acknowledge the EULA.

Host scripts configuration

host.conf file

  • Edit the /home/backup/host.conf backup configuration file. (You can rename and edit host.conf.tpl.)

There is one line, uncommented, listing the hostnames of PVE nodes in the cluster, and multiple lines, which are to be left commented, one per disk (partition) subject to backup:

  • Example (for a hypothetical host named host-to-backup):
PVE_LIST="pve1 pve2 pve3"
#VMID DISK PART FSTYPE RELATIVE_MP STORAGE MBR_SAVE SNAP_SIZE(MB) MP_PREFIX
#101 1 1 ntfs /cygdrive/c/ lvmstorage mbryes 30%FREE
#101 1 1 ntfs /cygdrive/d/ lvmstorage mbrno 1000

Where:

  • PVE_LIST is the list of DNS hostnames of the nodes in the PVE cluster that host-to-backup runs on.
  • VMID is the virtual machine ID inside the PVE cluster.
  • DISK is the disk number (inside the PVE VM configuration, 101.conf).
  • PART is the one-based partition number in DISK.
  • FSTYPE is the filesystem type; it is the value passed as the "-t" flag of the "mount" command when mounting the snapshot filesystem on PVE.
  • RELATIVE_MP is the mount point relative to the directory on PVE ( /tmp/<VMID>/ ) reserved for snapshots.
  • STORAGE is the storage name in PVE containing DISK. IMPORTANT: If your STORAGE is an LVM Group, and its name is different, use the Volume Group name here. If you use the predefined "local" storage, a snapshot of the 'data' LV will be made, inside the 'pve' VG.
  • MBR_SAVE tells whether to save the Master Boot Record of the disk (mbryes) or not (mbrno).
  • SNAP_SIZE(MB) is the snapshot size, either as a percentage of the free space in the Volume Group that DISK is part of (e.g. 30%FREE) or as a fixed size in megabytes.
  • MP_PREFIX is an optional mount point prefix, useful on Linux hosts for making mount points independent.

NOTE: trailing slash in RELATIVE_MP is important.
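
How the scripts read these commented lines is not shown here. One plausible way, given purely as a sketch, is to split each line on whitespace and select the one whose RELATIVE_MP matches the requested share. The function name lookup_disk and its output format are hypothetical; the column order is the one of the header line above.

```shell
# Hypothetical sketch: find the disk line of host.conf for a given mount point.
# $1: path to host.conf, $2: RELATIVE_MP to look for (with trailing slash)
lookup_disk() {
  local conf="$1" mp="$2"
  local vmid disk part fstype rel_mp storage mbr snap prefix
  while read -r vmid disk part fstype rel_mp storage mbr snap prefix; do
    case "$vmid" in
      '#'[0-9]*) ;;      # commented disk line such as "#101 ..."
      *) continue ;;     # PVE_LIST, header line, anything else
    esac
    if [ "$rel_mp" = "$mp" ]; then
      # ${vmid#?} drops the leading "#"
      echo "vmid=${vmid#?} disk=$disk part=$part fstype=$fstype storage=$storage mbr=$mbr snap=$snap"
      return 0
    fi
  done < "$conf"
  return 1
}

# Example: lookup_disk /home/backup/host.conf /cygdrive/c/
```

Because the comparison is an exact string match, this sketch also shows why a missing trailing slash in RELATIVE_MP would make the lookup fail.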

Setup of optional commands to run before and after the snapshot

  • Create tasks file; copy and modify from template:
cp tasks.tpl tasks

inserting commands inside tasks_before() and tasks_after() functions.

For example, to stop a Lotus Domino service before the snapshot and start it again afterwards:

#!/bin/bash

tasks_before() {
  echo "Stopping Lotus Domino..."
  net stop "Lotus Domino Server (LotusDominodata)"
}

tasks_after() {
  echo "Starting Lotus Domino..."
  net start "Lotus Domino Server (LotusDominodata)"
}
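
The way these two functions bracket the snapshot can be pictured with the sketch below. It is not the code of snap.sh: run_with_tasks is a hypothetical wrapper, the two functions are placeholders for the ones defined in the tasks file, and treating NO_TASKS=true as "skip the hooks" is an assumption based on the variable name.

```shell
# Placeholders standing in for the functions defined in the tasks file.
tasks_before() { echo "before"; }
tasks_after()  { echo "after"; }

# Hypothetical wrapper: run a command between the two hooks and always
# run tasks_after, even if the command itself fails.
run_with_tasks() {
  local rc=0
  [ "${NO_TASKS:-false}" = "true" ] || tasks_before
  "$@" || rc=$?
  [ "${NO_TASKS:-false}" = "true" ] || tasks_after
  return "$rc"
}

# Prints "before", "snapshot step", "after" on three lines.
run_with_tasks echo "snapshot step"
```

Running tasks_after unconditionally matters here: a failed snapshot should not leave the stopped service down.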

Snapshot testing

  • cd inside backup scripts root:
cd /home/backup
  • create a snapshot for /cygdrive/c/ drive:
./snap.sh create /cygdrive/c/
  • remove it:
./snap.sh remove /cygdrive/c/

NOTE: the optional commands before and after the snapshot are skipped by default; for a complete test, set the NO_TASKS variable to false inside snap.sh.
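
On the PVE side, creating an LVM snapshot ultimately comes down to an lvcreate call. The sketch below only prints a command line and is not taken from the PVE scripts: snap_size_opts and snapshot_cmd are hypothetical names, the snapshot name and origin device path are made-up examples, and the mapping of SNAP_SIZE to lvcreate's -l (extents, which accepts the %FREE suffix) and -L (size) options is an assumption about how the value is used.

```shell
# Hypothetical helper: turn a SNAP_SIZE value into lvcreate size options.
snap_size_opts() {
  case "$1" in
    *%*) echo "-l $1" ;;     # percentage form, e.g. 30%FREE -> extents
    *)   echo "-L ${1}M" ;;  # plain number -> size in megabytes
  esac
}

# Print (do not run) a snapshot command.
# $1: SNAP_SIZE, $2: snapshot name, $3: origin device path
snapshot_cmd() {
  echo "lvcreate -s $(snap_size_opts "$1") -n $2 $3"
}

# Example values are illustrative only.
snapshot_cmd 30%FREE snap-test /dev/lvmstorage/vm-101-disk-1
# prints: lvcreate -s -l 30%FREE -n snap-test /dev/lvmstorage/vm-101-disk-1
```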

Test acl backup from remote BackupPC host

  • Connect to BackupPC host and run:
# su - backuppc

to assume the backuppc user identity.

  • Try to connect to the target host, answering yes when asked to add it to known_hosts. A generic ssh login should then be rejected:
ssh backup@host-to-backup
Rejected
Connection to host-to-backup closed.

This is the expected result.

  • Test the acl backup function:
ssh -q -x -l backup <host name> acl-backup "<share>"

For example, for /cygdrive/c/ on host-to-backup:

ssh -q -x -l backup host-to-backup acl-backup "/cygdrive/c"

Check that the .acl.bak and .acl.bak.err files are correctly created; the content of .acl.bak.err is "0" if the ACLs were saved successfully.
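
This check can be scripted. The sketch below is an assumption-laden illustration: acl_backup_ok is a hypothetical helper, and the directory holding the two files must be passed in explicitly because their location is not stated here.

```shell
# Hypothetical check: the ACL backup succeeded if both files exist in the
# given directory and .acl.bak.err contains just "0".
acl_backup_ok() {
  local dir="$1"
  [ -f "$dir/.acl.bak" ] || return 1
  [ -f "$dir/.acl.bak.err" ] || return 1
  # Strip CR/LF so a Windows-style line ending does not break the comparison.
  [ "$(tr -d '\r\n' < "$dir/.acl.bak.err")" = "0" ]
}

# Example: acl_backup_ok /cygdrive/c && echo "ACL backup OK"
```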

NOTE: In some cases it is necessary to modify the Cygwin /etc/bash.bashrc file on the target host, moving the following lines:

# If not running interactively, don't do anything
[[ "$-" != *i* ]] && return

to the beginning of file, just before:

# Check that we haven't already been sourced.
([[ -z ${CYG_SYS_BASHRC} ]] && CYG_SYS_BASHRC="1") || return

This prevents the shell startup file from polluting standard output when rsync is redirected, which would otherwise cause the rsync operation to hang.

A symptom is the presence of lines like the following in the BackupPC host log, after the backup job has been killed manually:

Executing DumpPreShareCmd: /usr/bin/ssh -q -x -l backup host-to-backup acl-backup "/cygdrive/d"
incr backup started back to 2011-11-15 10:39:20 (backup #0) for directory /cygdrive/d
Running: /usr/bin/ssh -q -x -l backup host-to-backup /usr/bin/rsync-backup --server --sender --numeric-ids --perms --owner --group -D --links   --hard-links --times --block-size=2048 --recursive --one-file-system . /cygdrive/d/
Xfer PIDs are now 19432
Got remote protocol 1668572463
Fatal error (bad version): /etc/bash.bashrc: line 13:  5028 Aborted                 ( [[ -z ${CYG_SYS_BASHRC} ]] && CYG_SYS_BASHRC="1" )
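
The odd protocol number in this log is consistent with that explanation: read as a little-endian 32-bit integer, 1668572463 corresponds to the four bytes "/etc", that is, the start of the bash error message rather than an rsync protocol version. The sketch below decodes it; decode_le32 is a hypothetical helper name used only for this illustration.

```shell
# Hypothetical helper: print the four bytes of a 32-bit integer,
# least significant byte first, as characters.
decode_le32() {
  local n="$1" i byte out=""
  for i in 0 1 2 3; do
    byte=$(( (n >> (8 * i)) & 255 ))
    # Convert the byte value to an octal escape and let printf emit it.
    out="$out$(printf "\\$(printf '%03o' "$byte")")"
  done
  echo "$out"
}

decode_le32 1668572463   # prints: /etc
```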

BackupPC host configuration

  • Connect to BackupPC host.
  • Add the DNS host name of the target to the "Edit Hosts" section, indicating the authorized users.
  • Go to "Hosts Summary"; the new machine will be listed under "Hosts without backups". Select the link of the machine.
  • Select "Edit Config" for the machine. NOTE: Take care not to select by mistake the global "Edit Config" in the "Server" section below.

Xfer

  • Set in "RsyncShareName" the paths to back up ("shares" in BackupPC terminology). For Windows machines use Cygwin syntax: paths always start with /cygdrive/<disk letter>, e.g. /cygdrive/c.
  • Set the exclusions ("BackupFilesExclude"): the share name is the prefix, followed by the relative paths to exclude. For Windows hosts, for example, under /cygdrive/c you may want to exclude pagefile.sys and hiberfil.sys.
  • Set "RsyncClientCmd"
    • Modify default ssh user from root to backup.
    • Add the "-backup" suffix to "$rsyncPath", so that it reads "$rsyncPath-backup". The final configuration will be:
$sshPath -q -x -l backup $host $rsyncPath-backup $argList+
  • Set "RsyncClientRestoreCmd"
    • Modify default ssh user from root to backup.
    • Add the "-restore" suffix to "$rsyncPath", so that it reads "$rsyncPath-restore". The final configuration will be:
$sshPath -q -x -l backup $host $rsyncPath-restore $argList+
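
To see what these templates turn into, the sketch below imitates the variable substitution with sed. It is only an illustration and not BackupPC code: expand_client_cmd is a hypothetical name, and the values used (/usr/bin/ssh, /usr/bin/rsync, host-to-backup and a shortened argument list) are modelled on the "Running:" log line shown earlier.

```shell
# Hypothetical illustration of the variable substitution in the templates.
# Values must not contain "|", "&" or "\" (sed replacement metacharacters).
expand_client_cmd() {
  # $1: template, $2: sshPath, $3: host, $4: rsyncPath, $5: argument list
  printf '%s\n' "$1" | sed \
    -e 's|\$sshPath|'"$2"'|g' \
    -e 's|\$host|'"$3"'|g' \
    -e 's|\$rsyncPath|'"$4"'|g' \
    -e 's|\$argList+|'"$5"'|g'
}

expand_client_cmd '$sshPath -q -x -l backup $host $rsyncPath-backup $argList+' \
  /usr/bin/ssh host-to-backup /usr/bin/rsync '--server --sender . /cygdrive/d/'
# prints: /usr/bin/ssh -q -x -l backup host-to-backup /usr/bin/rsync-backup --server --sender . /cygdrive/d/
```

The expanded command shows that the target host receives rsync-backup or rsync-restore as the requested program, which is what lets its forced command tell a backup from a restore.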

Backup Settings

  • Insert in "DumpPreShareCmd" the following command, which saves the ACLs before every backup (it is the command tested earlier):
$sshPath -q -x -l backup $host acl-backup "$share"
  • Insert in "RestorePostUserCmd" the following command which sets ACL after every restore:
$sshPath -q -x -l backup $host acl-restore "$share" "$pathHdrSrc" "$pathHdrDest" "$fileList"

Do other configurations as usual.



Verify cyg_server permissions

cyg_server user should have the following permissions:

SeTcbPrivilege
SeCreateTokenPrivilege
SeAssignPrimaryTokenPrivilege
SeServiceLogonRight

To verify, run:

$ editrights -u cyg_server -l

In case of missing permissions, add them:

editrights -a SeTcbPrivilege -u cyg_server
editrights -a SeCreateTokenPrivilege -u cyg_server
editrights -a SeAssignPrimaryTokenPrivilege -u cyg_server
editrights -a SeServiceLogonRight -u cyg_server
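
To add only what is actually missing, the listing can be compared with the required set. This is a sketch under the assumption that "editrights -l" prints one right per line; missing_rights and REQUIRED_RIGHTS are hypothetical names introduced for the example.

```shell
REQUIRED_RIGHTS="SeTcbPrivilege SeCreateTokenPrivilege SeAssignPrimaryTokenPrivilege SeServiceLogonRight"

# Read a list of rights (one per line) on stdin and print the required
# ones that are absent from it.
missing_rights() {
  local have right
  have="$(tr -d '\r')"
  for right in $REQUIRED_RIGHTS; do
    printf '%s\n' "$have" | grep -qx "$right" || echo "$right"
  done
}

# Intended use on the Windows host (not run here):
# for r in $(editrights -u cyg_server -l | missing_rights); do
#   editrights -a "$r" -u cyg_server
# done
```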



Backuppc-snap download page

Download the scripts from this page.