1. Introduction

Proxmox VE is a platform to run virtual machines and containers. It is based on Debian Linux, and completely open source. For maximum flexibility, we implemented two virtualization technologies - Kernel-based Virtual Machine (KVM) and container-based virtualization (LXC).

One main design goal was to make administration as easy as possible. You can use Proxmox VE on a single node, or assemble a cluster of many nodes. All management tasks can be done using our web-based management interface, and even a novice user can set up and install Proxmox VE within minutes.

Proxmox Software Stack

1.1. Central Management

While many people start with a single node, Proxmox VE can scale out to a large set of clustered nodes. The cluster stack is fully integrated and ships with the default installation.

Unique Multi-Master Design

The integrated web-based management interface gives you a clean overview of all your KVM guests and Linux containers and even of your whole cluster. You can easily manage your VMs and containers, storage or cluster from the GUI. There is no need to install a separate, complex, and pricey management server.

Proxmox Cluster File System (pmxcfs)

Proxmox VE uses the unique Proxmox Cluster file system (pmxcfs), a database-driven file system for storing configuration files. This enables you to store the configuration of thousands of virtual machines. By using corosync, these files are replicated in real time on all cluster nodes. The file system stores all data inside a persistent database on disk. A copy of the data also resides in RAM, which limits the maximum storage size to 30 MB, more than enough for thousands of VMs.

Proxmox VE is the only virtualization platform using this unique cluster file system.

Web-based Management Interface

Proxmox VE is simple to use. Management tasks can be done via the included web-based management interface; there is no need to install a separate management tool or any additional management node with huge databases. The multi-master tool allows you to manage your whole cluster from any node of your cluster. The central web-based management interface, based on the ExtJS JavaScript framework, lets you control all functionality from the GUI and view the history and syslogs of each node. This includes running backup or restore jobs, live migration, or HA-triggered activities.

Command Line

For advanced users who are used to the comfort of the Unix shell or Windows PowerShell, Proxmox VE provides a command line interface to manage all the components of your virtual environment. This command line interface has intelligent tab completion and full documentation in the form of UNIX man pages.


REST API

Proxmox VE uses a RESTful API. We chose JSON as the primary data format, and the whole API is formally defined using JSON Schema. This enables fast and easy integration for third-party management tools like custom hosting environments.

Role-based Administration

You can define granular access for all objects (like VMs, storages, nodes, etc.) by using the role-based user and permission management. This allows you to define privileges and helps you to control access to objects. This concept is also known as access control lists: each permission specifies a subject (a user or group) and a role (a set of privileges) on a specific path.

Authentication Realms

Proxmox VE supports multiple authentication sources like Microsoft Active Directory, LDAP, Linux PAM standard authentication or the built-in Proxmox VE authentication server.

1.2. Flexible Storage

The Proxmox VE storage model is very flexible. Virtual machine images can be stored either on one or several local storages or on shared storage like NFS or a SAN. There are no limits; you may configure as many storage definitions as you like. You can use all storage technologies available for Debian Linux.

One major benefit of storing VMs on shared storage is the ability to live-migrate running machines without any downtime, as all nodes in the cluster have direct access to VM disk images.

We currently support the following network storage types:

  • LVM Group (network backing with iSCSI targets)

  • iSCSI target

  • NFS Share

  • CIFS Share

  • Ceph RBD

  • Directly use iSCSI LUNs

  • GlusterFS

Local storage types supported are:

  • LVM Group (local backing devices like block devices, FC devices, DRBD, etc.)

  • Directory (storage on existing filesystem)

  • ZFS

1.3. Integrated Backup and Restore

The integrated backup tool (vzdump) creates consistent snapshots of running Containers and KVM guests. It basically creates an archive of the VM or CT data which includes the VM/CT configuration files.

KVM live backup works for all storage types including VM images on NFS, CIFS, iSCSI LUN, and Ceph RBD. The new backup format is optimized for storing VM backups quickly and efficiently (sparse files, out-of-order data, minimized I/O).

1.4. High Availability Cluster

A multi-node Proxmox VE HA Cluster enables the definition of highly available virtual servers. The Proxmox VE HA Cluster is based on proven Linux HA technologies, providing stable and reliable HA services.

1.5. Flexible Networking

Proxmox VE uses a bridged networking model. All VMs can share one bridge as if virtual network cables from each guest were all plugged into the same switch. For connecting VMs to the outside world, bridges are attached to physical network cards and assigned a TCP/IP configuration.

For further flexibility, VLANs (IEEE 802.1q) and network bonding/aggregation are possible. In this way it is possible to build complex, flexible virtual networks for the Proxmox VE hosts, leveraging the full power of the Linux network stack.

1.6. Integrated Firewall

The integrated firewall allows you to filter network packets on any VM or Container interface. Common sets of firewall rules can be grouped into “security groups”.

1.7. Hyper-converged Infrastructure

Proxmox VE is a virtualization platform that tightly integrates compute, storage and networking resources, manages highly available clusters, backup/restore as well as disaster recovery. All components are software-defined and compatible with one another.

Therefore it is possible to administer them like a single system via the centralized web management interface. These capabilities make Proxmox VE an ideal choice to deploy and manage an open source hyper-converged infrastructure.

1.7.1. Benefits of a Hyper-Converged Infrastructure (HCI) with Proxmox VE

A hyper-converged infrastructure (HCI) is especially useful for deployments in which a high infrastructure demand meets a low administration budget, for distributed setups such as remote and branch office environments or for virtual private and public clouds.

HCI provides the following advantages:

  • Scalability: seamless expansion of compute, network and storage devices (i.e. scale up servers and storage quickly and independently from each other).

  • Low cost: Proxmox VE is open source and integrates all components you need such as compute, storage, networking, backup, and management center. It can replace an expensive compute/storage infrastructure.

  • Data protection and efficiency: services such as backup and disaster recovery are integrated.

  • Simplicity: easy configuration and centralized administration.

  • Open Source: No vendor lock-in.

1.7.2. Hyper-Converged Infrastructure: Storage

Proxmox VE has tightly integrated support for deploying a hyper-converged storage infrastructure. You can, for example, deploy and manage the following two storage technologies by using the web interface only:

  • Ceph

  • ZFS

Besides the above, Proxmox VE supports integrating a wide range of additional storage technologies. You can find out about them in the Storage Manager chapter.

1.8. Why Open Source

Proxmox VE uses a Linux kernel and is based on the Debian GNU/Linux Distribution. The source code of Proxmox VE is released under the GNU Affero General Public License, version 3. This means that you are free to inspect the source code at any time or contribute to the project yourself.

At Proxmox we are committed to using open source software whenever possible. Using open source software guarantees full access to all functionality, as well as high security and reliability. We think that everybody should have the right to access the source code of a software to run it, build on it, or submit changes back to the project. Everybody is encouraged to contribute while Proxmox ensures the product always meets professional quality criteria.

Open source software also helps to keep your costs low and makes your core infrastructure independent from a single vendor.

1.9. Your benefits with Proxmox VE

  • Open source software

  • No vendor lock-in

  • Linux kernel

  • Fast installation and easy-to-use

  • Web-based management interface


  • Huge active community

  • Low administration costs and simple deployment

1.10. Getting Help

1.10.1. Proxmox VE Wiki

The primary source of information is the Proxmox VE Wiki. It combines the reference documentation with user contributed content.

1.10.2. Community Support Forum

We always encourage our users to discuss and share their knowledge using the Proxmox VE Community Forum. The forum is moderated by the Proxmox support team. The large user base is spread out all over the world. Needless to say, such a large forum is a great place to get information.

1.10.3. Mailing Lists

This is a fast way to communicate with the Proxmox VE community via email.

Proxmox VE is fully open source and contributions are welcome! The primary communication channel for developers is the Proxmox VE development mailing list.

1.10.4. Commercial Support

Proxmox Server Solutions GmbH also offers enterprise support available as Proxmox VE Subscription Service Plans. All users with a subscription get access to the Proxmox VE Enterprise Repository, and—with a Basic, Standard or Premium subscription—also to the Proxmox Customer Portal. The customer portal provides help and support with guaranteed response times from the Proxmox VE developers.

For volume discounts, or more information in general, please contact office@proxmox.com.

1.10.5. Bug Tracker

Proxmox runs a public bug tracker at https://bugzilla.proxmox.com. If an issue appears, file your report there. An issue can be a bug as well as a request for a new feature or enhancement. The bug tracker helps to keep track of the issue and will send a notification once it has been solved.

1.11. Project History

The project started in 2007, followed by a first stable version in 2008. At the time we used OpenVZ for containers, and KVM for virtual machines. The clustering features were limited, and the user interface was simple (server generated web page).

But we quickly developed new features using the Corosync cluster stack, and the introduction of the new Proxmox cluster file system (pmxcfs) was a big step forward, because it completely hides the cluster complexity from the user. Managing a cluster of 16 nodes is as simple as managing a single node.

We also introduced a new REST API, with a complete declarative specification written in JSON-Schema. This enabled other people to integrate Proxmox VE into their infrastructure, and made it easy to provide additional services.

Also, the new REST API made it possible to replace the original user interface with a modern HTML5 application using JavaScript. We also replaced the old Java based VNC console code with noVNC. So you only need a web browser to manage your VMs.

The support for various storage types is another big task. Notably, Proxmox VE was the first distribution to ship ZFS on Linux by default in 2014. Another milestone was the ability to run and manage Ceph storage on the hypervisor nodes. Such setups are extremely cost effective.

When we started we were among the first companies providing commercial support for KVM. The KVM project itself continuously evolved, and is now a widely used hypervisor. New features arrive with each release. We developed the KVM live backup feature, which makes it possible to create snapshot backups on any storage type.

The most notable change with version 4.0 was the move from OpenVZ to LXC. Containers are now deeply integrated, and they can use the same storage and network features as virtual machines.

1.12. Improving the Proxmox VE Documentation

Contributions and improvements to the Proxmox VE documentation are always welcome. There are several ways to contribute.

If you find errors or other room for improvement in this documentation, please file a bug at the Proxmox bug tracker to propose a correction.

If you want to propose new content, choose one of the following options:

  • The wiki: For specific setups, how-to guides, or tutorials the wiki is the right option to contribute.

  • The reference documentation: For general content that will be helpful to all users please propose your contribution for the reference documentation. This includes all information about how to install, configure, use, and troubleshoot Proxmox VE features. The reference documentation is written in the asciidoc format. To edit the documentation you need to clone the git repository at git://git.proxmox.com/git/pve-docs.git; then follow the README.adoc document.

Note If you are interested in working on the Proxmox VE codebase, the Developer Documentation wiki article will show you where to start.

1.13. Translating Proxmox VE

The Proxmox VE user interface is in English by default. However, thanks to the contributions of the community, translations to other languages are also available. We welcome any support in adding new languages, translating the latest features, and improving incomplete or inconsistent translations.

We use gettext for the management of the translation files. Tools like Poedit offer a nice user interface to edit the translation files, but you can use whatever editor you’re comfortable with. No programming knowledge is required for translating.

1.13.1. Translating with git

The language files are available as a git repository. If you are familiar with git, please contribute according to our Developer Documentation.

You can create a new translation by doing the following (replace <LANG> with the language ID):

# git clone git://git.proxmox.com/git/proxmox-i18n.git
# cd proxmox-i18n
# make init-<LANG>.po

Or you can edit an existing translation, using the editor of your choice:

# poedit <LANG>.po

1.13.2. Translating without git

Even if you are not familiar with git, you can help translate Proxmox VE. To start, you can download the language files here. Find the language you want to improve, then right click on the "raw" link of this language file and select Save Link As…. Make your changes to the file, and then send your final translation directly to office(at)proxmox.com, together with a signed contributor license agreement.

1.13.3. Testing the Translation

In order for the translation to be used in Proxmox VE, you must first translate the .po file into a .js file. You can do this by invoking the following script, which is located in the same repository:

# ./po2js.pl -t pve xx.po >pve-lang-xx.js

The resulting file pve-lang-xx.js can then be copied to the directory /usr/share/pve-i18n on your Proxmox server in order to test it out.

Alternatively, you can build a deb package by running the following command from the root of the repository:

# make deb

Important For either of these methods to work, you need to have the following perl packages installed on your system. For Debian/Ubuntu:

# apt-get install perl liblocale-po-perl libjson-perl

1.13.4. Sending the Translation

You can send the finished translation (.po file) to the Proxmox team at the address office(at)proxmox.com, along with a signed contributor license agreement. Alternatively, if you have some developer experience, you can send it as a patch to the Proxmox VE development mailing list. See Developer Documentation.

2. Installing Proxmox VE

Proxmox VE is based on Debian. This is why the install disk images (ISO files) provided by Proxmox include a complete Debian system (Debian 10 Buster for Proxmox VE version 6.x) as well as all necessary Proxmox VE packages.

The installer will guide you through the setup, allowing you to partition the local disk(s), apply basic system configurations (for example, timezone, language, network) and install all required packages. This process should not take more than a few minutes. Installing with the provided ISO is the recommended method for new and existing users.

Alternatively, Proxmox VE can be installed on top of an existing Debian system. This option is only recommended for advanced users because detailed knowledge about Proxmox VE is required.

2.1. System Requirements

We recommend using high-quality server hardware when running Proxmox VE in production. To further decrease the impact of a failed host, you can run Proxmox VE in a cluster with highly available (HA) virtual machines and containers.

Proxmox VE can use local storage (DAS), SAN, NAS, and distributed storage like Ceph RBD. For details see chapter storage.

2.1.1. Minimum Requirements, for Evaluation

These minimum requirements are for evaluation purposes only and should not be used in production.

  • CPU: 64bit (Intel EM64T or AMD64)

  • Intel VT/AMD-V capable CPU/Mainboard for KVM full virtualization support

  • RAM: 1 GB RAM, plus additional RAM needed for guests

  • Hard drive

  • One network card (NIC)

2.1.2. Recommended System Requirements

  • Intel EM64T or AMD64 with Intel VT/AMD-V CPU flag.

  • Memory: Minimum 2 GB for the OS and Proxmox VE services, plus designated memory for guests. For Ceph and ZFS, additional memory is required; approximately 1 GB of memory for every TB of used storage.

  • Fast and redundant storage, best results are achieved with SSDs.

  • OS storage: Use a hardware RAID with battery protected write cache (“BBU”) or non-RAID with ZFS (optional SSD for ZIL).

  • VM storage:

    • For local storage, use either a hardware RAID with battery backed write cache (BBU) or non-RAID for ZFS and Ceph. Neither ZFS nor Ceph is compatible with a hardware RAID controller.

    • Shared and distributed storage is possible.

  • Redundant (Multi-)Gbit NICs, with additional NICs depending on the preferred storage technology and cluster setup.

  • For PCI(e) passthrough the CPU needs to support the VT-d/AMD-Vi flag.

2.1.3. Simple Performance Overview

To get an overview of the CPU and hard disk performance on an installed Proxmox VE system, run the included pveperf tool.

Note This is just a very quick and general benchmark. More detailed tests are recommended, especially regarding the I/O performance of your system.

2.1.4. Supported Web Browsers for Accessing the Web Interface

To access the web-based user interface, we recommend using one of the following browsers:

  • Firefox, a release from the current year, or the latest Extended Support Release

  • Chrome, a release from the current year

  • Microsoft’s currently supported version of Edge

  • Safari, a release from the current year

When accessed from a mobile device, Proxmox VE will show a lightweight, touch-based interface.

2.2. Prepare Installation Media

Download the installer ISO image from: https://www.proxmox.com/en/downloads/category/iso-images-pve

The Proxmox VE installation media is a hybrid ISO image. It works in two ways:

  • An ISO image file ready to burn to a CD or DVD.

  • A raw sector (IMG) image file ready to copy to a USB flash drive (USB stick).

Using a USB flash drive to install Proxmox VE is the recommended way because it is the faster option.

2.2.1. Prepare a USB Flash Drive as Installation Medium

The flash drive needs to have at least 1 GB of storage available.

Note Do not use UNetbootin. It does not work with the Proxmox VE installation image.
Important Make sure that the USB flash drive is not mounted and does not contain any important data.

2.2.2. Instructions for GNU/Linux

On Unix-like operating systems, use the dd command to copy the ISO image to the USB flash drive. First find the correct device name of the USB flash drive (see below). Then run the dd command.

# dd bs=1M conv=fdatasync if=./proxmox-ve_*.iso of=/dev/XYZ

Note Be sure to replace /dev/XYZ with the correct device name and adapt the input filename (if) path.

Caution Be very careful, and do not overwrite the wrong disk!

Find the Correct USB Device Name

There are two ways to find out the name of the USB flash drive. The first one is to compare the last lines of the dmesg command output before and after plugging in the flash drive. The second way is to compare the output of the lsblk command. Open a terminal and run:

# lsblk

Then plug in your USB flash drive and run the command again:

# lsblk

A new device will appear. This is the one you want to use. To be on the extra safe side check if the reported size matches your USB flash drive.
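
The before/after comparison can also be scripted. The following is a minimal sketch, assuming a POSIX shell with sort, comm and mktemp available. The function name new_block_devices and the file names in the usage comment are made up for illustration and are not part of Proxmox VE.

```shell
# new_block_devices BEFORE AFTER
# Prints the device names listed in file AFTER but not in file BEFORE.
# Both files hold one device name per line, for example the output of:
#   lsblk -dn -o NAME
new_block_devices() (
    before_sorted=$(mktemp) && after_sorted=$(mktemp) || exit 1
    sort "$1" > "$before_sorted"
    sort "$2" > "$after_sorted"
    # comm -13 keeps only the lines that are unique to the second file
    comm -13 "$before_sorted" "$after_sorted"
    rm -f "$before_sorted" "$after_sorted"
)

# Usage (hypothetical file names):
#   lsblk -dn -o NAME > /tmp/before.txt
#   (plug in the USB flash drive)
#   lsblk -dn -o NAME > /tmp/after.txt
#   new_block_devices /tmp/before.txt /tmp/after.txt
```

The name printed is the kernel device name without the /dev/ prefix. Still compare the reported size with your USB flash drive before writing to it.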

2.2.3. Instructions for macOS

Open the terminal (query Terminal in Spotlight).

Convert the .iso file to a disk image, for example by using the convert option of hdiutil.

# hdiutil convert -format UDRW -o proxmox-ve_*.dmg proxmox-ve_*.iso

Tip macOS tends to automatically add .dmg to the output file name.

To get the current list of devices run the command:

# diskutil list

Now insert the USB flash drive and run this command again to determine which device node has been assigned to it (for example, /dev/diskX).

# diskutil list
# diskutil unmountDisk /dev/diskX

Note Replace X with the disk number from the last command.

# sudo dd if=proxmox-ve_*.dmg of=/dev/rdiskX bs=1m

Note Using rdiskX instead of diskX in the last command is intentional. It increases the write speed.

2.2.4. Instructions for Windows

Using Etcher

Etcher works out of the box. Download Etcher from https://etcher.io. It will guide you through the process of selecting the ISO and your USB Drive.

Using Rufus

Rufus is a more lightweight alternative, but you need to use the DD mode to make it work. Download Rufus from https://rufus.ie/. Either install it or use the portable version. Select the destination drive and the Proxmox VE ISO file.

Important Once you click Start, you have to click No on the dialog asking to download a different version of GRUB. In the next dialog, select the DD mode.

2.3. Using the Proxmox VE Installer

The installer ISO image includes the following:

  • Complete operating system (Debian Linux, 64-bit)

  • The Proxmox VE installer, which partitions the local disk(s) with ext4, xfs or ZFS and installs the operating system.

  • Proxmox VE Linux kernel with KVM and LXC support

  • Complete toolset for administering virtual machines, containers, the host system, clusters and all necessary resources

  • Web-based management interface

Note All existing data on the drives selected for installation will be removed during the installation process. The installer does not add boot menu entries for other operating systems.

Please insert the prepared installation media (for example, USB flash drive or CD-ROM) and boot from it.

Tip Make sure that booting from the installation medium (for example, USB) is enabled in your server's firmware settings.

After choosing the correct entry (e.g. Boot from USB) the Proxmox VE menu will be displayed and one of the following options can be selected:

Install Proxmox VE

Starts the normal installation.

Tip It’s possible to use the installation wizard with a keyboard only. Buttons can be clicked by pressing the ALT key combined with the underlined character from the respective button. For example, ALT + N to press a Next button.

Install Proxmox VE (Debug mode)

Starts the installation in debug mode. A console will be opened at several installation steps. This helps to debug the situation if something goes wrong. To exit a debug console, press CTRL-D. This option can be used to boot a live system with all basic tools available. You can use it, for example, to repair a degraded ZFS rpool or fix the bootloader for an existing Proxmox VE setup.

Rescue Boot

With this option you can boot an existing installation. It searches all attached hard disks. If it finds an existing installation, it boots directly into that disk using the Linux kernel from the ISO. This can be useful if there are problems with the boot block (GRUB) or the BIOS is unable to read the boot block from the disk.

Test Memory

Runs memtest86+. This is useful to check if the memory is functional and free of errors.


After selecting Install Proxmox VE and accepting the EULA, the prompt to select the target hard disk(s) will appear. The Options button opens the dialog to select the target file system.

The default file system is ext4. The Logical Volume Manager (LVM) is used when ext4 or xfs is selected. Additional options to restrict LVM space can also be set (see below).

Proxmox VE can be installed on ZFS. As ZFS offers several software RAID levels, this is an option for systems that don’t have a hardware RAID controller. The target disks must be selected in the Options dialog. More ZFS specific settings can be changed under Advanced Options (see below).

Warning ZFS on top of any hardware RAID is not supported and can result in data loss.

The next page asks for basic configuration options like the location, the time zone, and keyboard layout. The location is used to select a download server close by to speed up updates. The installer usually auto-detects these settings. They only need to be changed in the rare case that auto detection fails or a different keyboard layout should be used.


Next, the password of the superuser (root) and an email address need to be specified. The password must consist of at least 5 characters. It’s highly recommended to use a stronger password. Some guidelines are:

  • Use a minimum password length of 12 to 14 characters.

  • Include lowercase and uppercase alphabetic characters, numbers, and symbols.

  • Avoid character repetition, keyboard patterns, common dictionary words, letter or number sequences, usernames, relative or pet names, romantic links (current or past), and biographical information (for example ID numbers, ancestors' names or dates).

The email address is used to send notifications to the system administrator. For example:

  • Information about available package updates.

  • Error messages from periodic CRON jobs.


The last step is the network configuration. Please note that during installation you can either use an IPv4 or IPv6 address, but not both. To configure a dual stack node, add additional IP addresses after the installation.


The next step shows a summary of the previously selected options. Re-check every setting and use the Previous button if a setting needs to be changed. To accept, press Install. The installation starts to format disks and copies packages to the target. Please wait until this step has finished; then remove the installation medium and restart your system.


If the installation fails, check for specific errors on the second TTY (‘CTRL + ALT + F2’) and ensure that the system meets the minimum requirements. If the installation is still not working, look at the how to get help chapter.

Further configuration is done via the Proxmox web interface. Point your browser to the IP address given during installation (https://youripaddress:8006).

Note Default login is "root" (realm PAM) and the root password is defined during the installation process.

2.3.1. Advanced LVM Configuration Options

The installer creates a Volume Group (VG) called pve, and additional Logical Volumes (LVs) called root, data, and swap. To control the size of these volumes use:


hdsize

Defines the total hard disk size to be used. This way you can reserve free space on the hard disk for further partitioning (for example for an additional PV and VG on the same hard disk that can be used for LVM storage).

swapsize

Defines the size of the swap volume. The default is the size of the installed memory, minimum 4 GB and maximum 8 GB. The resulting value cannot be greater than hdsize/8.

Note If set to 0, no swap volume will be created.

maxroot

Defines the maximum size of the root volume, which stores the operating system. The maximum limit of the root volume size is hdsize/4.

maxvz

Defines the maximum size of the data volume. The actual size of the data volume is:

datasize = hdsize - rootsize - swapsize - minfree

Where datasize cannot be bigger than maxvz.

Note In case of LVM thin, the data pool will only be created if datasize is bigger than 4 GB.

Note If set to 0, no data volume will be created and the storage configuration will be adapted accordingly.

minfree

Defines the amount of free space left in the LVM volume group pve. With more than 128 GB storage available the default is 16 GB, else hdsize/8 will be used.

Note LVM requires free space in the VG for snapshot creation (not required for lvmthin snapshots).
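
To see how these values interact, the rules above can be written down as a small calculation. This is only a sketch of the arithmetic as described in this section, not the installer's actual code. It assumes whole GB values and integer division that rounds down, and it reads "more than 128GB storage available" as hdsize greater than 128. The function names are hypothetical, and the root volume size is passed in as an input because its default is not stated here.

```shell
# All sizes are whole GB; shell arithmetic rounds divisions down.

# default_swapsize HDSIZE MEMORY
# Installed memory, clamped to 4..8 GB and to at most hdsize/8.
default_swapsize() (
    hdsize=$1
    swap=$2
    if [ "$swap" -lt 4 ]; then swap=4; fi
    if [ "$swap" -gt 8 ]; then swap=8; fi
    limit=$((hdsize / 8))
    if [ "$swap" -gt "$limit" ]; then swap=$limit; fi
    echo "$swap"
)

# default_minfree HDSIZE
# 16 GB for disks larger than 128 GB, otherwise hdsize/8.
default_minfree() (
    hdsize=$1
    if [ "$hdsize" -gt 128 ]; then
        echo 16
    else
        echo $((hdsize / 8))
    fi
)

# datasize HDSIZE ROOTSIZE SWAPSIZE MINFREE [MAXVZ]
# datasize = hdsize - rootsize - swapsize - minfree, capped at MAXVZ if given.
datasize() (
    size=$(($1 - $2 - $3 - $4))
    if [ -n "${5:-}" ] && [ "$size" -gt "$5" ]; then
        size=$5
    fi
    echo "$size"
)

# Example: 500 GB disk, 16 GB of memory, assumed root volume of 96 GB
#   swap=$(default_swapsize 500 16)
#   free=$(default_minfree 500)
#   datasize 500 96 "$swap" "$free"
```

In this example the swap volume is capped at 8 GB and minfree is 16 GB, so the data volume ends up at 500 - 96 - 8 - 16 = 380 GB unless maxvz is set to a smaller value.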

2.3.2. Advanced ZFS Configuration Options

The installer creates the ZFS pool rpool. No swap space is created, but you can reserve some unpartitioned space on the install disks for swap. You can also create a swap zvol after the installation, although this can lead to problems (see ZFS swap notes).

ashift

Defines the ashift value for the created pool. The ashift needs to be set at least to the sector-size of the underlying disks (2 to the power of ashift is the sector-size), or any disk which might be put in the pool (for example the replacement of a defective disk).

compress

Defines whether compression is enabled for rpool.

checksum

Defines which checksumming algorithm should be used for rpool.

copies

Defines the copies parameter for rpool. Check the zfs(8) manpage for the semantics, and why this does not replace redundancy on disk-level.

hdsize

Defines the total hard disk size to be used. This is useful to save free space on the hard disk(s) for further partitioning (for example to create a swap-partition). hdsize is only honored for bootable disks, that is only the first disk or mirror for RAID0, RAID1 or RAID10, and all disks in RAID-Z[123].
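
Because 2 to the power of ashift is the sector size, the matching ashift can be derived from the sector size in bytes. The following is an illustrative sketch only; the function name ashift_for_sector_size is hypothetical and not a Proxmox VE or ZFS tool.

```shell
# ashift_for_sector_size BYTES
# Prints the smallest ashift for which 2^ashift is at least BYTES.
ashift_for_sector_size() (
    bytes=$1
    ashift=0
    size=1
    while [ "$size" -lt "$bytes" ]; do
        size=$((size * 2))
        ashift=$((ashift + 1))
    done
    echo "$ashift"
)

# Usage: look up the physical sector size of a disk, for example with
#   lsblk -dn -o PHY-SEC /dev/sdX
# and pass the reported number of bytes to the function:
#   ashift_for_sector_size 4096
```

A disk with 512-byte sectors maps to ashift 9 and a disk with 4096-byte sectors to ashift 12. If disks with different sector sizes may end up in the pool, use the largest sector size as input.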

2.3.3. ZFS Performance Tips

ZFS works best with a lot of memory. If you intend to use ZFS, make sure to have enough RAM available for it. A good calculation is 4 GB plus 1 GB of RAM for each TB of raw disk space.
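
As a worked example of this rule of thumb, the sketch below adds up the raw capacity of all disks and applies the "4 GB plus 1 GB per TB" calculation. The function name zfs_recommended_ram_gb is made up for illustration, and disk sizes are assumed to be given in whole TB.

```shell
# zfs_recommended_ram_gb DISK_TB...
# Prints 4 GB plus 1 GB for each TB of raw disk space.
zfs_recommended_ram_gb() (
    total_tb=0
    for disk_tb in "$@"; do
        total_tb=$((total_tb + disk_tb))
    done
    echo $((4 + total_tb))
)

# Usage: four disks of 4 TB each
#   zfs_recommended_ram_gb 4 4 4 4
```

For four 4 TB disks this gives 4 + 16 = 20 GB of RAM for ZFS, in addition to the memory needed for the host and the guests.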

ZFS can use a dedicated drive as write cache, called the ZFS Intent Log (ZIL). Use a fast drive (SSD) for it. It can be added after installation with the following command:

# zpool add <pool-name> log </dev/path_to_fast_ssd>

2.4. Install Proxmox VE on Debian

Proxmox VE ships as a set of Debian packages and can be installed on top of a standard Debian installation. After configuring the repositories you need to run the following commands:

# apt-get update
# apt-get install proxmox-ve

Installing on top of an existing Debian installation looks easy, but it presumes that the base system has been installed correctly and that you know how you want to configure and use the local storage. You also need to configure the network manually.

In general, this is not trivial, especially when LVM or ZFS is used.

A detailed step by step how-to can be found on the wiki.

3. Host System Administration

The following sections will focus on common virtualization tasks and explain the Proxmox VE specifics regarding the administration and management of the host machine.

Proxmox VE is based on Debian GNU/Linux with additional repositories to provide the Proxmox VE related packages. This means that the full range of Debian packages is available including security updates and bug fixes. Proxmox VE provides its own Linux kernel based on the Ubuntu kernel. It has all the necessary virtualization and container features enabled and includes ZFS and several extra hardware drivers.

For other topics not included in the following sections, please refer to the Debian documentation. The Debian Administrator's Handbook is available online, and provides a comprehensive introduction to the Debian operating system (see [Hertzog13]).

3.1. Package Repositories

Proxmox VE uses APT as its package management tool like any other Debian-based system. Repositories are defined in the file /etc/apt/sources.list and in .list files placed in /etc/apt/sources.list.d/.

Each line defines a package repository. The preferred source must come first. Empty lines are ignored. A # character anywhere on a line marks the remainder of that line as a comment. The available packages from a repository are acquired by running apt-get update. Updates can be installed directly using apt-get, or via the GUI.

File /etc/apt/sources.list
deb http://ftp.debian.org/debian buster main contrib
deb http://ftp.debian.org/debian buster-updates main contrib

# security updates
deb http://security.debian.org/debian-security buster/updates main contrib

Proxmox VE additionally provides three different package repositories.

3.1.1. Proxmox VE Enterprise Repository

This is the default, stable, and recommended repository, available for all Proxmox VE subscription users. It contains the most stable packages and is suitable for production use. The pve-enterprise repository is enabled by default:

File /etc/apt/sources.list.d/pve-enterprise.list
deb https://enterprise.proxmox.com/debian/pve buster pve-enterprise

The root@pam user is notified via email about available updates. Click the Changelog button in the GUI to see more details about the selected update.

You need a valid subscription key to access the pve-enterprise repository. Different support levels are available. Further details can be found at https://www.proxmox.com/en/proxmox-ve/pricing.

Note You can disable this repository by commenting out the above line using a # (at the start of the line). This prevents error messages if you do not have a subscription key. Please configure the pve-no-subscription repository in that case.
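Commenting out the repository line can also be scripted. The following is a minimal sketch, assuming GNU sed with in-place editing (-i); the function name disable_apt_repo is hypothetical and not part of Proxmox VE.

```shell
# Comment out every active "deb" line in the given APT source file.
# disable_apt_repo is a hypothetical helper; it relies on sed's in-place option (-i).
# Lines that are already commented out are left untouched.
disable_apt_repo() {
    sed -i -E 's/^[[:space:]]*deb[[:space:]]/# &/' "$1"
}

# Intended use on a Proxmox VE host (run as root):
#   disable_apt_repo /etc/apt/sources.list.d/pve-enterprise.list
```

Running the function twice is harmless, because a line that already starts with # no longer matches the pattern.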

3.1.2. Proxmox VE No-Subscription Repository

This is the recommended repository for testing and non-production use. Its packages are not as heavily tested and validated. You don’t need a subscription key to access the pve-no-subscription repository.

We recommend configuring this repository in /etc/apt/sources.list.

File /etc/apt/sources.list
deb http://ftp.debian.org/debian buster main contrib
deb http://ftp.debian.org/debian buster-updates main contrib

# PVE pve-no-subscription repository provided by proxmox.com,
# NOT recommended for production use
deb http://download.proxmox.com/debian/pve buster pve-no-subscription

# security updates
deb http://security.debian.org/debian-security buster/updates main contrib

3.1.3. Proxmox VE Test Repository

This repository contains the latest packages and is primarily used by developers to test new features. To configure it, add the following line to /etc/apt/sources.list:

sources.list entry for pvetest
deb http://download.proxmox.com/debian/pve buster pvetest

Warning The pvetest repository should (as the name implies) only be used for testing new features or bug fixes.

3.1.4. Ceph Octopus Repository

Note Ceph Octopus (15.2) was declared stable with Proxmox VE 6.3 and is the most recent Ceph release supported. It will continue to get updates for the remaining life time of the 6.x release.

This repository holds the main Proxmox VE Ceph Octopus packages. They are suitable for production. Use this repository if you run the Ceph client or a full Ceph cluster on Proxmox VE.

File /etc/apt/sources.list.d/ceph.list
deb http://download.proxmox.com/debian/ceph-octopus buster main

3.1.5. Ceph Octopus Test Repository

This Ceph repository contains the Ceph packages before they are moved to the main repository. It is used to test new Ceph releases on Proxmox VE.

File /etc/apt/sources.list.d/ceph.list
deb http://download.proxmox.com/debian/ceph-octopus buster test

3.1.6. Ceph Nautilus Repository

Note Ceph Nautilus (14.2) is the older supported Ceph version, introduced with Proxmox VE 6.0. It will continue to get updates until end of Q2 2021, so you will eventually need to upgrade to Ceph Octopus.

This repository holds the main Proxmox VE Ceph Nautilus packages. They are suitable for production. Use this repository if you run the Ceph client or a full Ceph cluster on Proxmox VE.

File /etc/apt/sources.list.d/ceph.list
deb http://download.proxmox.com/debian/ceph-nautilus buster main

3.1.7. Ceph Nautilus Test Repository

This Ceph repository contains the Ceph packages before they are moved to the main repository. It is used to test new Ceph releases on Proxmox VE.

File /etc/apt/sources.list.d/ceph.list
deb http://download.proxmox.com/debian/ceph-nautilus buster test

3.1.8. Proxmox VE Ceph Luminous Repository For Upgrade

If Ceph is deployed, this repository is needed for the upgrade from Proxmox VE 5.x to Proxmox VE 6.0. It provides packages for the older Ceph Luminous release for Proxmox VE 6.0.

The Upgrade 5.x to 6.0 document explains how to use this repository in detail.

File /etc/apt/sources.list.d/ceph.list
deb http://download.proxmox.com/debian/ceph-luminous buster main

3.1.9. SecureApt

The Release files in the repositories are signed with GnuPG. APT uses these signatures to verify that all packages are from a trusted source.

If you install Proxmox VE from an official ISO image, the key for verification is already installed.

If you install Proxmox VE on top of Debian, download and install the key with the following commands:

# wget http://download.proxmox.com/debian/proxmox-ve-release-6.x.gpg -O /etc/apt/trusted.gpg.d/proxmox-ve-release-6.x.gpg

Verify the checksum afterwards with:

# sha512sum /etc/apt/trusted.gpg.d/proxmox-ve-release-6.x.gpg

The output should be:

acca6f416917e8e11490a08a1e2842d500b3a5d9f322c6319db0927b2901c3eae23cfb5cd5df6facf2b57399d3cfa52ad7769ebdd75d9b204549ca147da52626 /etc/apt/trusted.gpg.d/proxmox-ve-release-6.x.gpg


Alternatively, verify the MD5 checksum:

# md5sum /etc/apt/trusted.gpg.d/proxmox-ve-release-6.x.gpg

The output should be:

f3f6c5a3a67baf38ad178e5ff1ee270c /etc/apt/trusted.gpg.d/proxmox-ve-release-6.x.gpg
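The comparison against the expected checksum can be automated instead of being done by eye. The following is a minimal sketch; verify_sha512 is a hypothetical helper name, and the expected value has to be supplied by you from the output listed above.

```shell
# Compare the SHA-512 checksum of a file against an expected value.
# verify_sha512 is a hypothetical helper; it returns 0 on a match and 1 otherwise.
verify_sha512() {
    file=$1
    expected=$2
    actual=$(sha512sum "$file" | cut -d ' ' -f 1)
    [ "$actual" = "$expected" ]
}

# Intended use with the downloaded key:
#   verify_sha512 /etc/apt/trusted.gpg.d/proxmox-ve-release-6.x.gpg "<expected checksum>" \
#       && echo "key OK" || echo "checksum mismatch"
```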

3.2. System Software Updates

Proxmox provides updates on a regular basis for all repositories. To install updates use the web-based GUI or the following CLI commands:

# apt-get update
# apt-get dist-upgrade

Note The APT package management system is very flexible and provides many features. See man apt-get or [Hertzog13] for additional information.

Tip Regular updates are essential to get the latest patches and security related fixes. Major system upgrades are announced in the Proxmox VE Community Forum.

3.3. Network Configuration

Network configuration can be done either via the GUI, or by manually editing the file /etc/network/interfaces, which contains the whole network configuration. The interfaces(5) manual page contains the complete format description. All Proxmox VE tools try hard to keep direct user modifications, but using the GUI is still preferable, because it protects you from errors.

Once the network is configured, you can use the traditional Debian tools ifup and ifdown to bring interfaces up and down.

3.3.1. Apply Network Changes

Proxmox VE does not write changes directly to /etc/network/interfaces. Instead, we write into a temporary file called /etc/network/interfaces.new, so that you can make many related changes at once. This also allows you to ensure your changes are correct before applying them, as a wrong network configuration may render a node inaccessible.

Reboot Node to apply

With the default installed ifupdown network managing package you need to reboot to commit any pending network changes. Most of the time, the basic Proxmox VE network setup is stable and does not change often, so rebooting should not be required often.

Reload Network with ifupdown2

With the optional ifupdown2 network managing package, you can also reload the network configuration live, without requiring a reboot.

Since Proxmox VE 6.1 you can apply pending network changes over the web-interface, using the Apply Configuration button in the Network panel of a node.

To install ifupdown2, first ensure that you have the latest Proxmox VE updates installed.

Warning Installing ifupdown2 will remove ifupdown. The removal scripts of ifupdown before version 0.8.35+pve1 have an issue where the network is fully stopped on removal [Introduced with Debian Buster: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=945877], so you must ensure that you have an up-to-date ifupdown package version.

For the installation itself you can then simply do:

# apt install ifupdown2

With that you’re all set. You can also switch back to the ifupdown variant at any time, if you run into issues.

3.3.2. Naming Conventions

We currently use the following naming conventions for device names:

  • Ethernet devices: en*, systemd network interface names. This naming scheme is used for new Proxmox VE installations since version 5.0.

  • Ethernet devices: eth[N], where 0 ≤ N (eth0, eth1, …) This naming scheme is used for Proxmox VE hosts which were installed before the 5.0 release. When upgrading to 5.0, the names are kept as-is.

  • Bridge names: vmbr[N], where 0 ≤ N ≤ 4094 (vmbr0 - vmbr4094)

  • Bonds: bond[N], where 0 ≤ N (bond0, bond1, …)

  • VLANs: Simply add the VLAN number to the device name, separated by a period (eno1.50, bond1.30)

This makes it easier to debug network problems, because the device name implies the device type.

Systemd Network Interface Names

Systemd uses the two-character prefix en for Ethernet network devices. The next characters depend on the device driver and on which schema matches first.

  • o<index>[n<phys_port_name>|d<dev_port>] — devices on board

  • s<slot>[f<function>][n<phys_port_name>|d<dev_port>] — device by hotplug id

  • [P<domain>]p<bus>s<slot>[f<function>][n<phys_port_name>|d<dev_port>] — devices by bus id

  • x<MAC> — device by MAC address

The most common patterns are:

  • eno1 — is the first on board NIC

  • enp3s0f1 — is the NIC on PCI bus 3, slot 0, using NIC function 1.

For more information see Predictable Network Interface Names.
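Because the device name implies the device type, the type can be guessed with simple pattern matching. The following is a minimal sketch based on the naming conventions above; classify_ifname is a hypothetical helper, and the patterns are deliberately loose rather than a full implementation of the systemd naming scheme.

```shell
# Guess the device type from an interface name, following the naming conventions above.
# classify_ifname is a hypothetical helper; the patterns are deliberately loose.
classify_ifname() {
    case "$1" in
        *.*)         echo "vlan" ;;                    # e.g. eno1.50, bond1.30
        vmbr[0-9]*)  echo "bridge" ;;
        bond[0-9]*)  echo "bond" ;;
        eno*)        echo "ethernet (on board)" ;;
        ens*)        echo "ethernet (hotplug slot)" ;;
        enp*|enP*)   echo "ethernet (bus id)" ;;
        enx*)        echo "ethernet (MAC address)" ;;
        eth[0-9]*)   echo "ethernet (legacy name)" ;;
        *)           echo "unknown" ;;
    esac
}

classify_ifname enp3s0f1
```

The VLAN pattern is checked first, so a name such as vmbr0.5 is reported as a VLAN device rather than as a bridge.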

3.3.3. Choosing a network configuration

Depending on your current network organization and your resources you can choose either a bridged, routed, or masquerading networking setup.

Proxmox VE server in a private LAN, using an external gateway to reach the internet

The Bridged model makes the most sense in this case, and this is also the default mode on new Proxmox VE installations. Each of your guest systems will have a virtual interface attached to the Proxmox VE bridge. This is similar in effect to having the guest network card directly connected to a new switch on your LAN, with the Proxmox VE host playing the role of the switch.

Proxmox VE server at hosting provider, with public IP ranges for Guests

For this setup, you can use either a Bridged or Routed model, depending on what your provider allows.

Proxmox VE server at hosting provider, with a single public IP address

In that case, the only way to get outgoing network access for your guest systems is to use Masquerading. For incoming network access to your guests, you will need to configure Port Forwarding.

For further flexibility, you can configure VLANs (IEEE 802.1q) and network bonding, also known as "link aggregation". That way it is possible to build complex and flexible virtual networks.

3.3.4. Default Configuration using a Bridge


Bridges are like physical network switches implemented in software. All virtual guests can share a single bridge, or you can create multiple bridges to separate network domains. Each host can have up to 4094 bridges.

The installation program creates a single bridge named vmbr0, which is connected to the first Ethernet card. The corresponding configuration in /etc/network/interfaces might look like this:

auto lo
iface lo inet loopback

iface eno1 inet manual

auto vmbr0
iface vmbr0 inet static
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0

Virtual machines behave as if they were directly connected to the physical network. The network, in turn, sees each virtual machine as having its own MAC, even though there is only one network cable connecting all of these VMs to the network.

3.3.5. Routed Configuration

Most hosting providers do not support the above setup. For security reasons, they disable networking as soon as they detect multiple MAC addresses on a single interface.

Tip Some providers allow you to register additional MACs through their management interface. This avoids the problem, but can be clumsy to configure because you need to register a MAC for each of your VMs.

You can avoid the problem by “routing” all traffic via a single interface. This makes sure that all network packets use the same MAC address.


A common scenario is that you have a public IP and an additional IP block for your VMs. We recommend the following setup for such situations:

auto lo
iface lo inet loopback

auto eno1
iface eno1 inet static
        post-up echo 1 > /proc/sys/net/ipv4/ip_forward
        post-up echo 1 > /proc/sys/net/ipv4/conf/eno1/proxy_arp

auto vmbr0
iface vmbr0 inet static
        bridge-ports none
        bridge-stp off
        bridge-fd 0

3.3.6. Masquerading (NAT) with iptables

Masquerading allows guests having only a private IP address to access the network by using the host IP address for outgoing traffic. Each outgoing packet is rewritten by iptables to appear as originating from the host, and responses are rewritten accordingly to be routed to the original sender.

auto lo
iface lo inet loopback

auto eno1
#real IP address
iface eno1 inet static

auto vmbr0
#private sub network
iface vmbr0 inet static
        bridge-ports none
        bridge-stp off
        bridge-fd 0

        post-up   echo 1 > /proc/sys/net/ipv4/ip_forward
        post-up   iptables -t nat -A POSTROUTING -s '' -o eno1 -j MASQUERADE
        post-down iptables -t nat -D POSTROUTING -s '' -o eno1 -j MASQUERADE

Note In some masquerade setups with the firewall enabled, conntrack zones might be needed for outgoing connections. Otherwise, the firewall could block outgoing connections, since they will prefer the POSTROUTING of the VM bridge (and not MASQUERADE).

Adding these lines to /etc/network/interfaces can fix this problem:

post-up   iptables -t raw -I PREROUTING -i fwbr+ -j CT --zone 1
post-down iptables -t raw -D PREROUTING -i fwbr+ -j CT --zone 1


3.3.7. Linux Bond

Bonding (also called NIC teaming or Link Aggregation) is a technique for binding multiple NICs to a single network device. It can be used to achieve different goals, such as making the network fault-tolerant, increasing performance, or both.

High-speed hardware like Fibre Channel and the associated switching hardware can be quite expensive. By doing link aggregation, two NICs can appear as one logical interface, resulting in double speed. This is a native Linux kernel feature that is supported by most switches. If your nodes have multiple Ethernet ports, you can distribute your points of failure by running network cables to different switches, and the bonded connection will fail over to one cable or the other in case of network trouble.

Aggregated links can improve live-migration delays and improve the speed of replication of data between Proxmox VE Cluster nodes.

There are 7 modes for bonding:

  • Round-robin (balance-rr): Transmit network packets in sequential order from the first available network interface (NIC) slave through the last. This mode provides load balancing and fault tolerance.

  • Active-backup (active-backup): Only one NIC slave in the bond is active. A different slave becomes active if, and only if, the active slave fails. The single logical bonded interface’s MAC address is externally visible on only one NIC (port) to avoid distortion in the network switch. This mode provides fault tolerance.

  • XOR (balance-xor): Transmit network packets based on [(source MAC address XOR’d with destination MAC address) modulo NIC slave count]. This selects the same NIC slave for each destination MAC address. This mode provides load balancing and fault tolerance.

  • Broadcast (broadcast): Transmit network packets on all slave network interfaces. This mode provides fault tolerance.

  • IEEE 802.3ad Dynamic link aggregation (802.3ad)(LACP): Creates aggregation groups that share the same speed and duplex settings. Utilizes all slave network interfaces in the active aggregator group according to the 802.3ad specification.

  • Adaptive transmit load balancing (balance-tlb): Linux bonding driver mode that does not require any special network-switch support. The outgoing network packet traffic is distributed according to the current load (computed relative to the speed) on each network interface slave. Incoming traffic is received by one currently designated slave network interface. If this receiving slave fails, another slave takes over the MAC address of the failed receiving slave.

  • Adaptive load balancing (balance-alb): Includes balance-tlb plus receive load balancing (rlb) for IPV4 traffic, and does not require any special network switch support. The receive load balancing is achieved by ARP negotiation. The bonding driver intercepts the ARP Replies sent by the local system on their way out and overwrites the source hardware address with the unique hardware address of one of the NIC slaves in the single logical bonded interface such that different network-peers use different MAC addresses for their network packet traffic.
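The balance-xor formula from the list above can be illustrated with shell arithmetic. This is a simplified sketch that assumes only the last byte of each MAC address enters the XOR; the hash used by the actual bonding driver may include further inputs, depending on the transmit hash policy. The function name xor_slave_index is hypothetical.

```shell
# Pick a slave index as (source MAC XOR destination MAC) modulo slave count.
# Simplification: only the last byte of each MAC address (two hex digits) is used.
# xor_slave_index is a hypothetical helper name.
xor_slave_index() {
    src_byte=$1      # last byte of the source MAC, e.g. 1a
    dst_byte=$2      # last byte of the destination MAC, e.g. 2f
    slave_count=$3
    echo $(( (0x$src_byte ^ 0x$dst_byte) % slave_count ))
}

xor_slave_index 1a 2f 2
```

Because the result depends only on the two addresses and the slave count, traffic between the same pair of hosts always leaves through the same NIC slave, as described for this mode.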

If your switch supports the LACP (IEEE 802.3ad) protocol, then we recommend using the corresponding bonding mode (802.3ad). Otherwise, you should generally use the active-backup mode.

If you intend to run your cluster network on the bonding interfaces, then you have to use active-passive mode (active-backup) on the bonding interfaces. Other modes are unsupported.

The following bond configuration can be used as distributed/shared storage network. The benefit would be that you get more speed and the network will be fault-tolerant.

Example: Use bond with fixed IP address
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

iface eno3 inet manual

auto bond0
iface bond0 inet static
      bond-slaves eno1 eno2
      bond-miimon 100
      bond-mode 802.3ad
      bond-xmit-hash-policy layer2+3

auto vmbr0
iface vmbr0 inet static
        bridge-ports eno3
        bridge-stp off
        bridge-fd 0

Another possibility is to use the bond directly as bridge port. This can be used to make the guest network fault-tolerant.

Example: Use a bond as bridge port
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

auto bond0
iface bond0 inet manual
      bond-slaves eno1 eno2
      bond-miimon 100
      bond-mode 802.3ad
      bond-xmit-hash-policy layer2+3

auto vmbr0
iface vmbr0 inet static
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0

3.3.8. VLAN 802.1Q

A virtual LAN (VLAN) is a broadcast domain that is partitioned and isolated in the network at layer two. So it is possible to have multiple networks (4096) in a physical network, each independent of the other ones.

Each VLAN network is identified by a number, often called a tag. Network packets are then tagged to identify which virtual network they belong to.
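Combined with the naming convention for VLAN devices (device name, a period, then the VLAN number), the tag can be validated before a device name is built. The following is a minimal sketch; vlan_ifname is a hypothetical helper, and the accepted range of 1 to 4094 is an assumption based on the usual IEEE 802.1Q convention that the values 0 and 4095 are reserved.

```shell
# Build the VLAN device name for a parent device and a VLAN tag, e.g. eno1 + 50 -> eno1.50.
# vlan_ifname is a hypothetical helper. It accepts tags 1 to 4094, assuming the
# usual IEEE 802.1Q convention that the values 0 and 4095 are reserved.
vlan_ifname() {
    dev=$1
    tag=$2
    case "$tag" in
        ''|*[!0-9]*) echo "invalid VLAN tag: $tag" >&2; return 1 ;;
    esac
    if [ "$tag" -lt 1 ] || [ "$tag" -gt 4094 ]; then
        echo "VLAN tag out of range: $tag" >&2
        return 1
    fi
    echo "$dev.$tag"
}

vlan_ifname eno1 50
```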

VLAN for Guest Networks

Proxmox VE supports this setup out of the box. You can specify the VLAN tag when you create a VM. The VLAN tag is part of the guest network configuration. The networking layer supports different modes to implement VLANs, depending on the bridge configuration:

  • VLAN awareness on the Linux bridge: In this case, each guest’s virtual network card is assigned to a VLAN tag, which is transparently supported by the Linux bridge. Trunk mode is also possible, but that makes configuration in the guest necessary.

  • "traditional" VLAN on the Linux bridge: In contrast to the VLAN awareness method, this method is not transparent and creates a VLAN device with associated bridge for each VLAN. That is, creating a guest on VLAN 5 for example, would create two interfaces eno1.5 and vmbr0v5, which would remain until a reboot occurs.

  • Open vSwitch VLAN: This mode uses the OVS VLAN feature.

  • Guest configured VLAN: VLANs are assigned inside the guest. In this case, the setup is completely done inside the guest and can not be influenced from the outside. The benefit is that you can use more than one VLAN on a single virtual NIC.

VLAN on the Host

To allow host communication with an isolated network, it is possible to apply VLAN tags to any network device (NIC, Bond, Bridge). In general, you should configure the VLAN on the interface with the least abstraction layers between itself and the physical NIC.

For example, in a default configuration, you might want to place the host management address on a separate VLAN.

Example: Use VLAN 5 for the Proxmox VE management IP with traditional Linux bridge
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno1.5 inet manual

auto vmbr0v5
iface vmbr0v5 inet static
        bridge-ports eno1.5
        bridge-stp off
        bridge-fd 0

auto vmbr0
iface vmbr0 inet manual
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0

Example: Use VLAN 5 for the Proxmox VE management IP with VLAN aware Linux bridge
auto lo
iface lo inet loopback

iface eno1 inet manual

auto vmbr0.5
iface vmbr0.5 inet static

auto vmbr0
iface vmbr0 inet manual
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes

The next example is the same setup but a bond is used to make this network fail-safe.

Example: Use VLAN 5 with bond0 for the Proxmox VE management IP with traditional Linux bridge
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

auto bond0
iface bond0 inet manual
      bond-slaves eno1 eno2
      bond-miimon 100
      bond-mode 802.3ad
      bond-xmit-hash-policy layer2+3

iface bond0.5 inet manual

auto vmbr0v5
iface vmbr0v5 inet static
        bridge-ports bond0.5
        bridge-stp off
        bridge-fd 0

auto vmbr0
iface vmbr0 inet manual
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0

3.3.9. Disabling IPv6 on the Node

Proxmox VE works correctly in all environments, irrespective of whether IPv6 is deployed or not. We recommend leaving all settings at the provided defaults.

Should you still need to disable support for IPv6 on your node, do so by creating an appropriate sysctl.conf(5) snippet file and setting the proper sysctls, for example by adding /etc/sysctl.d/disable-ipv6.conf with the following content:

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1

This method is preferred to disabling the loading of the IPv6 module on the kernel command line.
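Creating the snippet file can be scripted as follows. This is a minimal sketch; write_ipv6_disable_conf is a hypothetical helper name, and the target path is passed as an argument so that the function can be tried on a scratch file first.

```shell
# Write the two sysctl settings that disable IPv6 to the given snippet file.
# write_ipv6_disable_conf is a hypothetical helper name.
write_ipv6_disable_conf() {
    cat > "$1" <<'EOF'
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
EOF
}

# Intended use on the node (run as root), followed by a reload of all sysctl files:
#   write_ipv6_disable_conf /etc/sysctl.d/disable-ipv6.conf
#   sysctl --system
```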

3.4. Time Synchronization

The Proxmox VE cluster stack itself relies heavily on the fact that all the nodes have precisely synchronized time. Some other components, like Ceph, also refuse to work properly if the local time on nodes is not in sync.

Time synchronization between nodes can be achieved with the “Network Time Protocol” (NTP). Proxmox VE uses systemd-timesyncd as NTP client by default, preconfigured to use a set of public servers. This setup works out of the box in most cases.

3.4.1. Using Custom NTP Servers

In some cases, it might be desired to not use the default NTP servers. For example, if your Proxmox VE nodes do not have access to the public internet (e.g., because of restrictive firewall rules), you need to set up local NTP servers and tell systemd-timesyncd to use them:

File /etc/systemd/timesyncd.conf
NTP=ntp1.example.com ntp2.example.com ntp3.example.com ntp4.example.com

After restarting the synchronization service (systemctl restart systemd-timesyncd) you should verify that your newly configured NTP servers are used by checking the journal (journalctl --since -1h -u systemd-timesyncd):

Oct 07 14:58:36 node1 systemd[1]: Stopping Network Time Synchronization...
Oct 07 14:58:36 node1 systemd[1]: Starting Network Time Synchronization...
Oct 07 14:58:36 node1 systemd[1]: Started Network Time Synchronization.
Oct 07 14:58:36 node1 systemd-timesyncd[13514]: Using NTP server (ntp1.example.com).
Oct 07 14:58:36 nora systemd-timesyncd[13514]: interval/delta/delay/jitter/drift 64s/-0.002s/0.020s/0.000s/-31ppm
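The custom server setup described above can be generated from a list of server names. The following is a minimal sketch; render_timesyncd_conf is a hypothetical helper, and it assumes that the NTP= key belongs to the [Time] section of timesyncd.conf. The server names are the placeholders used above.

```shell
# Print a timesyncd.conf body for the given NTP servers.
# render_timesyncd_conf is a hypothetical helper; the NTP= key is written
# into the [Time] section, with the servers separated by spaces.
render_timesyncd_conf() {
    printf '[Time]\nNTP=%s\n' "$*"
}

render_timesyncd_conf ntp1.example.com ntp2.example.com

# Intended use on the node (run as root):
#   render_timesyncd_conf ntp1.example.com ntp2.example.com > /etc/systemd/timesyncd.conf
#   systemctl restart systemd-timesyncd
```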

3.5. External Metric Server


In Proxmox VE, you can define external metric servers, which will periodically receive various stats about your hosts, virtual guests and storages.

Currently supported are:

  • Graphite

  • InfluxDB

The external metric server definitions are saved in /etc/pve/status.cfg, and can be edited through the web interface.

3.5.1. Graphite server configuration


The default port is set to 2003 and the default graphite path is proxmox.

By default, Proxmox VE sends the data over UDP, so the graphite server has to be configured to accept this. Here the maximum transmission unit (MTU) can be configured for environments not using the standard 1500 MTU.

You can also configure the plugin to use TCP. In order not to block the important pvestatd statistic collection daemon, a timeout is required to cope with network problems.

3.5.2. Influxdb plugin configuration


Proxmox VE sends the data over UDP, so the influxdb server has to be configured for this. The MTU can also be configured here, if necessary.

Here is an example configuration for influxdb (on your influxdb server):

   enabled = true
   bind-address = ""
   database = "proxmox"
   batch-size = 1000
   batch-timeout = "1s"

With this configuration, your server listens on all IP addresses on port 8089, and writes the data in the proxmox database.

Alternatively, the plugin can be configured to use the http(s) API of InfluxDB 2.x. InfluxDB 1.8.x does contain a forwards compatible API endpoint for this v2 API.

To use it, set influxdbproto to http or https (depending on your configuration). By default, Proxmox VE uses the organization proxmox and the bucket/db proxmox (they can be set with the configuration options organization and bucket, respectively).

Since InfluxDB’s v2 API is only available with authentication, you have to generate a token that can write into the correct bucket and set it.

In the v2 compatible API of 1.8.x, you can use user:password as token (if required), and can omit the organization since that has no meaning in InfluxDB 1.x.

You can also set the HTTP Timeout (default is 1s) with the timeout setting, as well as the maximum batch size (default 25000000 bytes) with the max-body-size setting (this corresponds to the InfluxDB setting with the same name).

3.6. Disk Health Monitoring

Although a robust and redundant storage is recommended, it can be very helpful to monitor the health of your local disks.

Starting with Proxmox VE 4.3, the package smartmontools [smartmontools homepage https://www.smartmontools.org] is installed and required. This is a set of tools to monitor and control the S.M.A.R.T. system for local hard disks.

You can get the status of a disk by issuing the following command:

# smartctl -a /dev/sdX

where /dev/sdX is the path to one of your local disks.

If the output says:

SMART support is: Disabled

you can enable it with the command:

# smartctl -s on /dev/sdX

For more information on how to use smartctl, please see man smartctl.
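The check for disabled SMART support can be combined with the enable command in a small script. The following is a minimal sketch; smart_is_disabled is a hypothetical helper that reads smartctl output on standard input, and /dev/sdX remains a placeholder for one of your local disks.

```shell
# Read `smartctl -a` output on stdin and report whether SMART support is disabled.
# smart_is_disabled is a hypothetical helper name.
smart_is_disabled() {
    grep -q 'SMART support is:[[:space:]]*Disabled'
}

# Intended use on the node (run as root); /dev/sdX is a placeholder:
#   if smartctl -a /dev/sdX | smart_is_disabled; then
#       smartctl -s on /dev/sdX
#   fi
```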

By default, smartmontools daemon smartd is active and enabled, and scans the disks under /dev/sdX and /dev/hdX every 30 minutes for errors and warnings, and sends an e-mail to root if it detects a problem.

For more information about how to configure smartd, please see man smartd and man smartd.conf.

If you use your hard disks with a hardware raid controller, there are most likely tools to monitor the disks in the raid array and the array itself. For more information about this, please refer to the vendor of your raid controller.

3.7. Logical Volume Manager (LVM)

Most people install Proxmox VE directly on a local disk. The Proxmox VE installation CD offers several options for local disk management, and the current default setup uses LVM. The installer lets you select a single disk for such a setup, and uses that disk as physical volume for the Volume Group (VG) pve. The following output is from a test installation using a small 8GB disk:

# pvs
  PV         VG   Fmt  Attr PSize PFree
  /dev/sda3  pve  lvm2 a--  7.87g 876.00m

# vgs
  VG   #PV #LV #SN Attr   VSize VFree
  pve    1   3   0 wz--n- 7.87g 876.00m

The installer allocates three Logical Volumes (LV) inside this VG:

# lvs
  LV   VG   Attr       LSize   Pool Origin Data%  Meta%
  data pve  twi-a-tz--   4.38g             0.00   0.63
  root pve  -wi-ao----   1.75g
  swap pve  -wi-ao---- 896.00m

root
Formatted as ext4, and contains the operating system.

swap
Swap partition

data
This volume uses LVM-thin, and is used to store VM images. LVM-thin is preferable for this task, because it offers efficient support for snapshots and clones.

For Proxmox VE versions up to 4.1, the installer creates a standard logical volume called “data”, which is mounted at /var/lib/vz.

Starting from version 4.2, the logical volume “data” is an LVM-thin pool, used to store block based guest images, and /var/lib/vz is simply a directory on the root file system.

3.7.1. Hardware

We highly recommend using a hardware RAID controller (with BBU) for such setups. This increases performance, provides redundancy, and makes disk replacements easier (hot-pluggable).

LVM itself does not need any special hardware, and memory requirements are very low.

3.7.2. Bootloader

We install two boot loaders by default. The first partition contains the standard GRUB boot loader. The second partition is an EFI System Partition (ESP), which makes it possible to boot on EFI systems.

3.7.3. Creating a Volume Group

Let’s assume we have an empty disk /dev/sdb, onto which we want to create a volume group named “vmdata”.

Caution Please note that the following commands will destroy all existing data on /dev/sdb.

First create a partition.

# sgdisk -N 1 /dev/sdb

Create a Physical Volume (PV) without confirmation and with a metadata size of 250K.

# pvcreate --metadatasize 250k -y -ff /dev/sdb1

Create a volume group named “vmdata” on /dev/sdb1.

# vgcreate vmdata /dev/sdb1

3.7.4. Creating an extra LV for /var/lib/vz

This can be easily done by creating a new thin LV.

# lvcreate -n <Name> -V <Size[M,G,T]> <VG>/<LVThin_pool>

A real world example:

# lvcreate -n vz -V 10G pve/data

Now a filesystem must be created on the LV.

# mkfs.ext4 /dev/pve/vz

Finally, the LV has to be mounted.

Warning Be sure that /var/lib/vz is empty. On a default installation it’s not.

To make it always accessible, add the following line to /etc/fstab.

# echo '/dev/pve/vz /var/lib/vz ext4 defaults 0 2' >> /etc/fstab
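Since the mount point must be empty, it can be useful to check this before mounting. The following is a minimal sketch; dir_is_empty is a hypothetical helper name, and the mount command in the usage comment assumes the LV and mount point from the example above.

```shell
# Succeed only if the given path is an existing, empty directory.
# Hidden files count as content. dir_is_empty is a hypothetical helper name.
dir_is_empty() {
    [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}

# Intended use before mounting the new LV (run as root):
#   dir_is_empty /var/lib/vz && mount /dev/pve/vz /var/lib/vz
```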

3.7.5. Resizing the thin pool

Resizing the LV and the metadata pool can be achieved with the following command:

# lvresize --size +<size[M,G,T]> --poolmetadatasize +<size[M,G]> <VG>/<LVThin_pool>

Note When extending the data pool, the metadata pool must also be extended.

3.7.6. Create a LVM-thin pool

A thin pool has to be created on top of a volume group. To create a volume group, see Section LVM.

# lvcreate -L 80G -T -n vmstore vmdata

3.8. ZFS on Linux

ZFS is a combined file system and logical volume manager designed by Sun Microsystems. Starting with Proxmox VE 3.4, the native Linux kernel port of the ZFS file system is available as an optional file system and also as an additional selection for the root file system. There is no need to manually compile ZFS modules - all packages are included.

By using ZFS, it’s possible to achieve maximum enterprise features with low budget hardware, but also high performance systems by leveraging SSD caching or even SSD only setups. ZFS can replace cost-intensive hardware RAID cards with moderate CPU and memory load combined with easy management.

General ZFS advantages
  • Easy configuration and management with Proxmox VE GUI and CLI.

  • Reliable

  • Protection against data corruption

  • Data compression on file system level

  • Snapshots

  • Copy-on-write clone

  • Various raid levels: RAID0, RAID1, RAID10, RAIDZ-1, RAIDZ-2 and RAIDZ-3

  • Can use SSD for cache

  • Self healing

  • Continuous integrity checking

  • Designed for high storage capacities

  • Asynchronous replication over network

  • Open Source

  • Encryption

3.8.1. Hardware

ZFS depends heavily on memory, so you need at least 8GB to start. In practice, use as much as you can get for your hardware/budget. To prevent data corruption, we recommend the use of high quality ECC RAM.

If you use a dedicated cache and/or log disk, you should use an enterprise class SSD (e.g. Intel SSD DC S3700 Series). This can increase the overall performance significantly.

Important Do not use ZFS on top of a hardware RAID controller which has its own cache management. ZFS needs to communicate directly with the disks. An HBA adapter or something like an LSI controller flashed in “IT” mode is more appropriate.

If you are experimenting with an installation of Proxmox VE inside a VM (Nested Virtualization), don’t use virtio for disks of that VM, as they are not supported by ZFS. Use IDE or SCSI instead (also works with the virtio SCSI controller type).

3.8.2. Installation as Root File System

When you install using the Proxmox VE installer, you can choose ZFS for the root file system. You need to select the RAID type at installation time:


RAID0
Also called “striping”. The capacity of such a volume is the sum of the capacities of all disks. But RAID0 does not add any redundancy, so the failure of a single drive makes the volume unusable.

RAID1
Also called “mirroring”. Data is written identically to all disks. This mode requires at least 2 disks of the same size. The resulting capacity is that of a single disk.

RAID10
A combination of RAID0 and RAID1. Requires at least 4 disks.

RAIDZ-1
A variation on RAID-5, single parity. Requires at least 3 disks.

RAIDZ-2
A variation on RAID-5, double parity. Requires at least 4 disks.

RAIDZ-3
A variation on RAID-5, triple parity. Requires at least 5 disks.

The installer automatically partitions the disks, creates a ZFS pool called rpool, and installs the root file system on the ZFS subvolume rpool/ROOT/pve-1.

Another subvolume called rpool/data is created to store VM images. In order to use that with the Proxmox VE tools, the installer creates the following configuration entry in /etc/pve/storage.cfg:

zfspool: local-zfs
        pool rpool/data
        content images,rootdir

After installation, you can view your ZFS pool status using the zpool command:

# zpool status
  pool: rpool
 state: ONLINE
  scan: none requested

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sda2    ONLINE       0     0     0
            sdb2    ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            sdc     ONLINE       0     0     0
            sdd     ONLINE       0     0     0

errors: No known data errors

The zfs command is used to configure and manage your ZFS file systems. The following command lists all file systems after installation:

# zfs list
NAME               USED  AVAIL  REFER  MOUNTPOINT
rpool             4.94G  7.68T    96K  /rpool
rpool/ROOT         702M  7.68T    96K  /rpool/ROOT
rpool/ROOT/pve-1   702M  7.68T   702M  /
rpool/data          96K  7.68T    96K  /rpool/data
rpool/swap        4.25G  7.69T    64K  -

3.8.3. ZFS RAID Level Considerations

There are a few factors to take into consideration when choosing the layout of a ZFS pool. The basic building block of a ZFS pool is the virtual device, or vdev. All vdevs in a pool are used equally and the data is striped among them (RAID0). Check the zpool(8) manpage for more details on vdevs.


Performance

Each vdev type has different performance behaviors. The two parameters of interest are the IOPS (Input/Output Operations per Second) and the bandwidth with which data can be written or read.

A mirror vdev (RAID1) will approximately behave like a single disk with regard to both parameters when writing data. When reading data, it will behave like the number of disks in the mirror.

A common situation is to have 4 disks. When setting them up as 2 mirror vdevs (RAID10), the pool will have the write characteristics of two single disks with regard to IOPS and bandwidth. For read operations it will resemble 4 single disks.

A RAIDZ of any redundancy level will approximately behave like a single disk with regard to IOPS, but with a lot of bandwidth. How much bandwidth depends on the size of the RAIDZ vdev and the redundancy level.

For running VMs, IOPS is the more important metric in most situations.

Size, Space usage and Redundancy

While a pool made of mirror vdevs will have the best performance characteristics, the usable space will be 50% of the disks available. Less if a mirror vdev consists of more than 2 disks, for example in a 3-way mirror. At least one healthy disk per mirror is needed for the pool to stay functional.

The usable space of a RAIDZ type vdev of N disks is roughly N-P, with P being the RAIDZ-level. The RAIDZ-level indicates how many arbitrary disks can fail without losing data. A special case is a 4 disk pool with RAIDZ2. In this situation it is usually better to use 2 mirror vdevs for the better performance as the usable space will be the same.
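The capacity rules above can be checked with a little shell arithmetic. This is only a minimal sketch: the function names raidz_usable_disks and mirror_usable_disks are hypothetical, and capacity is counted in whole disks, ignoring metadata and padding.

```shell
# Usable capacity, in disks' worth, of a single RAIDZ vdev:
# N disks minus P parity disks (P = 1, 2 or 3).
# $1 = number of disks N, $2 = RAIDZ level P
raidz_usable_disks() {
    echo $(( $1 - $2 ))
}

# Usable capacity, in disks' worth, of a pool made of mirror vdevs:
# one disk per vdev.
# $1 = total number of disks, $2 = disks per mirror vdev
mirror_usable_disks() {
    echo $(( $1 / $2 ))
}

raidz_usable_disks 4 2     # 4-disk RAIDZ2
mirror_usable_disks 4 2    # 4 disks as two 2-way mirrors
```

Both calls print 2, which is the 4 disk special case mentioned above: the usable space is the same, so the mirror layout is usually preferable for its performance.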

Another important factor when using any RAIDZ level is how ZVOL datasets, which are used for VM disks, behave. For each data block the pool needs parity data which is at least the size of the minimum block size defined by the ashift value of the pool. With an ashift of 12 the block size of the pool is 4k. The default block size for a ZVOL is 8k. Therefore, in a RAIDZ2 each 8k block written will cause two additional 4k parity blocks to be written, 8k + 4k + 4k = 16k. This is of course a simplified approach and the real situation will be slightly different with metadata, compression and such not being accounted for in this example.

This behavior can be observed when checking the following properties of the ZVOL:

  • volsize

  • refreservation (if the pool is not thin provisioned)

  • used (if the pool is thin provisioned and without snapshots present)

# zfs get volsize,refreservation,used <pool>/vm-<vmid>-disk-X

volsize is the size of the disk as it is presented to the VM, while refreservation shows the reserved space on the pool which includes the expected space needed for the parity data. If the pool is thin provisioned, the refreservation will be set to 0. Another way to observe the behavior is to compare the used disk space within the VM and the used property. Be aware that snapshots will skew the value.
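The simplified parity arithmetic described above for ZVOLs on RAIDZ can be reproduced as follows. The function name raidz_min_block_bytes is hypothetical, and the result is only a lower bound under the same simplification as in the text: one sector of 2^ashift bytes per parity level, with padding, metadata and compression ignored.

```shell
# Lower bound for the space one ZVOL block occupies on a RAIDZ vdev.
# $1 = volblocksize in bytes, $2 = ashift, $3 = RAIDZ level (1, 2 or 3)
raidz_min_block_bytes() {
    echo $(( $1 + $3 * (1 << $2) ))
}

raidz_min_block_bytes 8192 12 2    # the example from the text: 8k + 4k + 4k
```

For the values from the text this prints 16384, so half of the space written is parity.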

There are a few options to counter the increased use of space:

  • Increase the volblocksize to improve the data to parity ratio

  • Use mirror vdevs instead of RAIDZ

  • Use ashift=9 (block size of 512 bytes)

The volblocksize property can only be set when creating a ZVOL. The default value can be changed in the storage configuration. When doing this, the guest needs to be tuned accordingly and, depending on the use case, the problem of write amplification is just moved from the ZFS layer up to the guest.

Using ashift=9 when creating the pool can lead to bad performance, depending on the disks underneath, and cannot be changed later on.

Mirror vdevs (RAID1, RAID10) have favorable behavior for VM workloads. Use them, unless your environment has specific needs and characteristics where RAIDZ performance characteristics are acceptable.

3.8.4. Bootloader

Proxmox VE uses proxmox-boot-tool to manage the bootloader configuration. See the chapter on Proxmox VE host bootloaders for details.

3.8.5. ZFS Administration

This section gives you some usage examples for common tasks. ZFS itself is really powerful and provides many options. The main commands to manage ZFS are zfs and zpool. Both commands come with great manual pages, which can be read with:

# man zpool
# man zfs
Create a new zpool

To create a new pool, at least one disk is needed. The sector size implied by ashift (2 to the power of ashift) should be the same as or larger than the sector size of the underlying disk.

# zpool create -f -o ashift=12 <pool> <device>
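The relation between ashift and sector size can be sketched in shell. The helper names ashift_to_bytes and bytes_to_ashift are hypothetical and only illustrate the arithmetic.

```shell
# Sector size in bytes implied by an ashift value (2 to the power of ashift).
ashift_to_bytes() {
    echo $(( 1 << $1 ))
}

# Smallest ashift whose sector size is at least the given size in bytes.
bytes_to_ashift() (
    size=$1
    ashift=0
    while [ $(( 1 << ashift )) -lt "$size" ]; do
        ashift=$(( ashift + 1 ))
    done
    echo "$ashift"
)

ashift_to_bytes 12      # prints 4096
bytes_to_ashift 512     # prints 9
```

On Linux, the physical sector size of a disk can usually be read from /sys/block/<device>/queue/physical_block_size and passed to bytes_to_ashift.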

To activate compression (see section Compression in ZFS):

# zfs set compression=lz4 <pool>
Create a new pool with RAID-0

Minimum 1 disk

# zpool create -f -o ashift=12 <pool> <device1> <device2>
Create a new pool with RAID-1

Minimum 2 disks

# zpool create -f -o ashift=12 <pool> mirror <device1> <device2>
Create a new pool with RAID-10

Minimum 4 disks

# zpool create -f -o ashift=12 <pool> mirror <device1> <device2> mirror <device3> <device4>
Create a new pool with RAIDZ-1

Minimum 3 disks

# zpool create -f -o ashift=12 <pool> raidz1 <device1> <device2> <device3>
Create a new pool with RAIDZ-2

Minimum 4 disks

# zpool create -f -o ashift=12 <pool> raidz2 <device1> <device2> <device3> <device4>
Create a new pool with cache (L2ARC)

It is possible to use a dedicated drive or partition as cache to increase performance (use an SSD).

For <device>, multiple devices can be used, as shown in "Create a new pool with RAID*".

# zpool create -f -o ashift=12 <pool> <device> cache <cache_device>
Create a new pool with log (ZIL)

It is possible to use a dedicated drive or partition as log device to increase performance (use an SSD).

For <device>, multiple devices can be used, as shown in "Create a new pool with RAID*".

# zpool create -f -o ashift=12 <pool> <device> log <log_device>
Add cache and log to an existing pool

If you have a pool without cache and log, first partition the SSD into two partitions with parted or gdisk.

Important Always use GPT partition tables.

The maximum size of a log device should be about half the size of physical memory, so this is usually quite small. The rest of the SSD can be used as cache.

# zpool add -f <pool> log <device-part1> cache <device-part2>
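The sizing rule for the log partition can be turned into a small calculation. This is a sketch only; the function name log_size_mib is hypothetical, and the input is expected in KiB because that is the unit of the MemTotal field in /proc/meminfo.

```shell
# Suggested upper bound for the log partition: about half of physical memory.
# $1 = total memory in KiB; prints the size in MiB.
log_size_mib() {
    echo $(( $1 / 1024 / 2 ))
}

log_size_mib 67108864    # 64 GiB of RAM, prints 32768
```

On a running host, the argument could be taken from the MemTotal line of /proc/meminfo. The remainder of the SSD is then available for the cache partition.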
Changing a failed device
# zpool replace -f <pool> <old device> <new device>
Changing a failed bootable device

Depending on how Proxmox VE was installed, it is either using proxmox-boot-tool (systems installed with Proxmox VE 6.4 or later, EFI systems installed with Proxmox VE 5.4 or later) or plain grub as bootloader (see Host Bootloader). You can check by running:

# proxmox-boot-tool status

The first steps of copying the partition table, reissuing GUIDs and replacing the ZFS partition are the same. To make the system bootable from the new disk, different steps are needed which depend on the bootloader in use.

# sgdisk <healthy bootable device> -R <new device>
# sgdisk -G <new device>
# zpool replace -f <pool> <old zfs partition> <new zfs partition>
Note Use the zpool status -v command to monitor how far the resilvering process of the new disk has progressed.
With proxmox-boot-tool:
# proxmox-boot-tool format <new disk's ESP>
# proxmox-boot-tool init <new disk's ESP>
Note ESP stands for EFI System Partition, which is set up as partition #2 on bootable disks set up by the Proxmox VE installer since version 5.4. For details, see Setting up a new partition for use as synced ESP.
With grub:
# grub-install <new disk>

3.8.6. Activate E-Mail Notification

ZFS comes with an event daemon, which monitors events generated by the ZFS kernel module. The daemon can also send emails on ZFS events like pool errors. Newer ZFS packages ship the daemon in a separate package, and you can install it using apt-get:

# apt-get install zfs-zed

To activate the daemon, it is necessary to edit /etc/zfs/zed.d/zed.rc with your favorite editor and uncomment the ZED_EMAIL_ADDR setting:

ZED_EMAIL_ADDR="root"

Please note Proxmox VE forwards mails to root to the email address configured for the root user.

Important The only setting that is required is ZED_EMAIL_ADDR. All other settings are optional.

3.8.7. Limit ZFS Memory Usage

ZFS uses 50 % of the host memory for the Adaptive Replacement Cache (ARC) by default. Allocating enough memory for the ARC is crucial for IO performance, so reduce it with caution. As a general rule of thumb, allocate at least 2 GiB Base + 1 GiB/TiB-Storage. For example, if you have a pool with 8 TiB of available storage space then you should use 10 GiB of memory for the ARC.
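The rule of thumb can be converted into a byte value suitable for zfs_arc_max. The function name arc_size_bytes is hypothetical and only illustrates the arithmetic.

```shell
# Rule of thumb from the text: 2 GiB base plus 1 GiB per TiB of storage.
# $1 = pool size in TiB; prints the suggested ARC size in bytes.
arc_size_bytes() {
    echo $(( (2 + $1) * 1024 * 1024 * 1024 ))
}

arc_size_bytes 8    # 8 TiB pool, prints 10737418240 (10 GiB)
```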

You can change the ARC usage limit for the current boot (a reboot resets this change again) by writing to the zfs_arc_max module parameter directly:

# echo "$((10 * 1024*1024*1024))" >/sys/module/zfs/parameters/zfs_arc_max

To permanently change the ARC limits, add the following line to /etc/modprobe.d/zfs.conf:

options zfs zfs_arc_max=8589934592

This example setting limits the usage to 8 GiB (8 * 2^30).

Important In case your desired zfs_arc_max value is lower than or equal to zfs_arc_min (which defaults to 1/32 of the system memory), zfs_arc_max will be ignored unless you also set zfs_arc_min to at most zfs_arc_max - 1.
# echo "$((8 * 1024*1024*1024 - 1))" >/sys/module/zfs/parameters/zfs_arc_min
# echo "$((8 * 1024*1024*1024))" >/sys/module/zfs/parameters/zfs_arc_max

This example setting (temporarily) limits the usage to 8 GiB (8 * 2^30) on systems with more than 256 GiB of total memory, where simply setting zfs_arc_max alone would not work.
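Whether zfs_arc_min has to be adjusted as well can be estimated before changing anything. The sketch below assumes the default zfs_arc_min of 1/32 of the system memory stated above; the function name arc_min_needs_change is hypothetical.

```shell
# Succeeds (exit status 0) if the desired zfs_arc_max is lower than or equal
# to the default zfs_arc_min, assumed to be 1/32 of total memory.
# $1 = desired zfs_arc_max in bytes, $2 = total system memory in bytes
arc_min_needs_change() {
    [ "$1" -le $(( $2 / 32 )) ]
}

gib=$(( 1024 * 1024 * 1024 ))

# 8 GiB ARC limit on a host with 512 GiB of memory: default minimum is 16 GiB
if arc_min_needs_change $(( 8 * gib )) $(( 512 * gib )); then
    echo "set zfs_arc_min below zfs_arc_max first"
fi
```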


If your root file system is ZFS, you must update your initramfs every time this value changes:

# update-initramfs -u

You must reboot to activate these changes.

3.8.8. SWAP on ZFS

Swap space created on a zvol may cause problems, such as blocking the server or generating a high IO load, often seen when starting a backup to an external storage.

We strongly recommend using enough memory, so that you normally do not run into low memory situations. Should you need or want to add swap, it is preferred to create a partition on a physical disk and use it as a swap device. You can leave some space free for this purpose in the advanced options of the installer. Additionally, you can lower the “swappiness” value. A good value for servers is 10:

# sysctl -w vm.swappiness=10

To make the swappiness persistent, open /etc/sysctl.conf with an editor of your choice and add the following line:

vm.swappiness = 10
Table 1. Linux kernel swappiness parameter values
Value                 Strategy
vm.swappiness = 0     The kernel will swap only to avoid an out of memory condition.
vm.swappiness = 1     Minimum amount of swapping without disabling it entirely.
vm.swappiness = 10    This value is sometimes recommended to improve performance when sufficient memory exists in a system.
vm.swappiness = 60    The default value.
vm.swappiness = 100   The kernel will swap aggressively.

3.8.9. Encrypted ZFS Datasets

ZFS on Linux version 0.8.0 introduced support for native encryption of datasets. After an upgrade from previous ZFS on Linux versions, the encryption feature can be enabled per pool:

# zpool get feature@encryption tank
NAME  PROPERTY            VALUE            SOURCE
tank  feature@encryption  disabled         local

# zpool set feature@encryption=enabled tank

# zpool get feature@encryption tank
NAME  PROPERTY            VALUE            SOURCE
tank  feature@encryption  enabled         local
Warning There is currently no support for booting from pools with encrypted datasets using Grub, and only limited support for automatically unlocking encrypted datasets on boot. Older versions of ZFS without encryption support will not be able to decrypt stored data.
Note It is recommended to either unlock storage datasets manually after booting, or to write a custom unit to pass the key material needed for unlocking on boot to zfs load-key.
Warning Establish and test a backup procedure before enabling encryption of production data. If the associated key material/passphrase/keyfile has been lost, accessing the encrypted data is no longer possible.

Encryption needs to be set up when creating datasets/zvols, and is inherited by child datasets by default. For example, to create an encrypted dataset tank/encrypted_data and configure it as storage in Proxmox VE, run the following commands:

# zfs create -o encryption=on -o keyformat=passphrase tank/encrypted_data
Enter passphrase:
Re-enter passphrase:

# pvesm add zfspool encrypted_zfs -pool tank/encrypted_data

All guest volumes/disks created on this storage will be encrypted with the shared key material of the parent dataset.

To actually use the storage, the associated key material needs to be loaded and the dataset needs to be mounted. This can be done in one step with:

# zfs mount -l tank/encrypted_data
Enter passphrase for 'tank/encrypted_data':

It is also possible to use a (random) keyfile instead of prompting for a passphrase by setting the keylocation and keyformat properties, either at creation time or with zfs change-key on existing datasets:

# dd if=/dev/urandom of=/path/to/keyfile bs=32 count=1

# zfs change-key -o keyformat=raw -o keylocation=file:///path/to/keyfile tank/encrypted_data
Warning When using a keyfile, special care needs to be taken to secure the keyfile against unauthorized access or accidental loss. Without the keyfile, it is not possible to access the plaintext data!
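With keyformat=raw, ZFS expects the key to be exactly 32 bytes long, which is why the dd call above uses bs=32 count=1. A quick sanity check of a keyfile could look like the following sketch; the function name keyfile_size_ok is hypothetical, and the temporary file only stands in for a real keyfile.

```shell
# Succeeds if the given file is exactly 32 bytes long.
# $1 = path to the keyfile
keyfile_size_ok() {
    [ $(( $(wc -c < "$1") )) -eq 32 ]
}

# Demonstration with a throwaway key in a temporary file
tmp=$(mktemp)
dd if=/dev/urandom of="$tmp" bs=32 count=1 2>/dev/null
if keyfile_size_ok "$tmp"; then
    echo "keyfile has the expected size"
fi
rm -f "$tmp"
```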

A guest volume created underneath an encrypted dataset will have its encryptionroot property set accordingly. The key material only needs to be loaded once per encryptionroot to be available to all encrypted datasets underneath it.

See the encryptionroot, encryption, keylocation, keyformat and keystatus properties, the zfs load-key, zfs unload-key and zfs change-key commands and the Encryption section from man zfs for more details and advanced usage.

3.8.10. Compression in ZFS

When compression is enabled on a dataset, ZFS tries to compress all new blocks before writing them and decompresses them on reading. Already existing data will not be compressed retroactively.

You can enable compression with:

# zfs set compression=<algorithm> <dataset>

We recommend using the lz4 algorithm, because it adds very little CPU overhead. Other algorithms like lzjb and gzip-N, where N is an integer from 1 (fastest) to 9 (best compression ratio), are also available. Depending on the algorithm and how compressible the data is, having compression enabled can even increase I/O performance.

You can disable compression at any time with:

# zfs set compression=off <dataset>

Again, only new blocks will be affected by this change.

3.8.11. ZFS Special Device

Since version 0.8.0 ZFS supports special devices. A special device in a pool is used to store metadata, deduplication tables, and optionally small file blocks.

A special device can improve the speed of a pool consisting of slow spinning hard disks with a lot of metadata changes. For example workloads that involve creating, updating or deleting a large number of files will benefit from the presence of a special device. ZFS datasets can also be configured to store whole small files on the special device which can further improve the performance. Use fast SSDs for the special device.

Important The redundancy of the special device should match the one of the pool, since the special device is a point of failure for the whole pool.
Warning Adding a special device to a pool cannot be undone!
Create a pool with special device and RAID-1:
# zpool create -f -o ashift=12 <pool> mirror <device1> <device2> special mirror <device3> <device4>
Add a special device to an existing pool with RAID-1:
# zpool add <pool> special mirror <device1> <device2>

ZFS datasets expose the special_small_blocks=<size> property. size can be 0 to disable storing small file blocks on the special device, or a power of two in the range from 512B to 128K. After setting the property, new file blocks smaller than size will be allocated on the special device.

Important If the value for special_small_blocks is greater than or equal to the recordsize (default 128K) of the dataset, all data will be written to the special device, so be careful!

Setting the special_small_blocks property on a pool will change the default value of that property for all child ZFS datasets (for example all containers in the pool will opt in for small file blocks).

Opt in for all file blocks smaller than 4K pool-wide:
# zfs set special_small_blocks=4K <pool>
Opt in for small file blocks for a single dataset:
# zfs set special_small_blocks=4K <pool>/<filesystem>
Opt out from small file blocks for a single dataset:
# zfs set special_small_blocks=0 <pool>/<filesystem>
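Before setting the property, a candidate value can be checked against the range described above. This is a minimal sketch; the function name valid_special_small_blocks is hypothetical, values are given in bytes, and the limits are the ones stated in this section.

```shell
# Succeeds if the value is 0, or a power of two from 512 (512B)
# to 131072 (128K).
# $1 = candidate special_small_blocks value in bytes
valid_special_small_blocks() {
    [ "$1" -eq 0 ] && return 0
    [ "$1" -ge 512 ] && [ "$1" -le 131072 ] && [ $(( $1 & ($1 - 1) )) -eq 0 ]
}

for size in 0 4096 5000 262144; do
    if valid_special_small_blocks "$size"; then
        echo "$size: valid"
    else
        echo "$size: invalid"
    fi
done
```

Keep in mind that a value passing this check can still be too large for a given dataset if it reaches the recordsize, as noted above.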

3.8.12. ZFS Pool Features

Changes to the on-disk format in ZFS are only made between major version changes and are specified through features. All features, as well as the general mechanism are well documented in the zpool-features(5) manpage.

Since enabling new features can render a pool not importable by an older version of ZFS, this needs to be done actively by the administrator, by running zpool upgrade on the pool (see the zpool-upgrade(8) manpage).

Unless you need to use one of the new features, there is no upside to enabling them.

In fact, there are some downsides to enabling new features:

  • A system with root on ZFS, that still boots using grub will become unbootable if a new feature is active on the rpool, due to the incompatible implementation of ZFS in grub.

  • The system will not be able to import any upgraded pool when booted with an older kernel, which still ships with the old ZFS modules.

  • Booting an older Proxmox VE ISO to repair a non-booting system will likewise not work.

Important Do not upgrade your rpool if your system is still booted with grub, as this will render your system unbootable. This includes systems installed before Proxmox VE 5.4, and systems booting with legacy BIOS boot (see how to determine the bootloader).
Enable new features for a ZFS pool:
# zpool upgrade <pool>

3.9. Proxmox Node Management

The Proxmox VE node management tool (pvenode) allows you to control node specific settings and resources.

Currently pvenode allows you to set a node’s description and to manage the node’s SSL certificates used for the API and the web GUI through pveproxy.

3.9.1. Wake-on-LAN

Wake-on-LAN (WoL) allows you to switch on a sleeping computer in the network by sending a magic packet. At least one NIC must support this feature, and the respective option needs to be enabled in the computer’s firmware (BIOS/UEFI) configuration. The option name can vary from Enable Wake-on-Lan to Power On By PCIE Device; check your motherboard vendor’s manual if unsure. ethtool can be used to check the WoL configuration of <interface> by running:

ethtool <interface> | grep Wake-on

pvenode allows you to wake sleeping members of a cluster via WoL using the command:

pvenode wakeonlan <node>

This broadcasts the WoL magic packet on UDP port 9, containing the MAC address of <node> obtained from the wakeonlan property. The node specific wakeonlan property can be set by the following command:

pvenode config set -wakeonlan XX:XX:XX:XX:XX:XX
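The magic packet mentioned above has a simple, well-known layout: six 0xFF bytes followed by the target MAC address repeated sixteen times. The sketch below only builds that payload as a hex string for illustration. The function name wol_magic_hex is hypothetical, and this is not a description of how pvenode implements it.

```shell
# Prints the WoL magic packet payload for a MAC address as a hex string:
# 6 bytes of 0xFF, then the 6-byte MAC address repeated 16 times.
# $1 = MAC address, separated by ':' or '-'
wol_magic_hex() (
    mac=$(printf '%s' "$1" | tr -d ':-' | tr 'A-F' 'a-f')
    packet=ffffffffffff
    i=0
    while [ "$i" -lt 16 ]; do
        packet="$packet$mac"
        i=$(( i + 1 ))
    done
    printf '%s\n' "$packet"
)

wol_magic_hex 00:11:22:33:44:55
```

The resulting string is 204 hex digits long, which corresponds to a payload of 102 bytes.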

3.10. Certificate Management

3.10.1. Certificates for Intra-Cluster Communication

Each Proxmox VE cluster creates by default its own (self-signed) Certificate Authority (CA) and generates a certificate for each node which gets signed by the aforementioned CA. These certificates are used for encrypted communication with the cluster’s pveproxy service and the Shell/Console feature if SPICE is used.

The CA certificate and key are stored in the Proxmox Cluster File System (pmxcfs).

3.10.2. Certificates for API and Web GUI

The REST API and web GUI are provided by the pveproxy service, which runs on each node.

You have the following options for the certificate used by pveproxy:

  1. By default the node-specific certificate in /etc/pve/nodes/NODENAME/pve-ssl.pem is used. This certificate is signed by the cluster CA and therefore not automatically trusted by browsers and operating systems.

  2. Use an externally provided certificate (e.g. signed by a commercial CA).

  3. Use ACME (Let’s Encrypt) to get a trusted certificate with automatic renewal. This is also integrated in the Proxmox VE API and web interface.

For options 2 and 3 the file /etc/pve/local/pveproxy-ssl.pem (and /etc/pve/local/pveproxy-ssl.key, which needs to be without password) is used.

Note Keep in mind that /etc/pve/local is a node specific symlink to /etc/pve/nodes/NODENAME.

Certificates are managed with the Proxmox VE Node management command (see the pvenode(1) manpage).

Warning Do not replace or manually modify the automatically generated node certificate files in /etc/pve/local/pve-ssl.pem and /etc/pve/local/pve-ssl.key or the cluster CA files in /etc/pve/pve-root-ca.pem and /etc/pve/priv/pve-root-ca.key.

3.10.3. Upload Custom Certificate

If you already have a certificate which you want to use for a Proxmox VE node, you can simply upload that certificate via the web interface.

Note that the certificate’s key file, if provided, must not be password-protected.

3.10.4. Trusted certificates via Let’s Encrypt (ACME)

Proxmox VE includes an implementation of the Automatic Certificate Management Environment (ACME) protocol, allowing Proxmox VE admins to use an ACME provider like Let’s Encrypt for easy setup of TLS certificates which are accepted and trusted on modern operating systems and web browsers out of the box.

Currently, the two ACME endpoints implemented are the Let’s Encrypt (LE) production and its staging environment. Our ACME client supports validation of http-01 challenges using a built-in web server and validation of dns-01 challenges using a DNS plugin supporting all the DNS API endpoints acme.sh does.

ACME Account

You need to register an ACME account per cluster with the endpoint you want to use. The email address used for that account will serve as contact point for renewal-due or similar notifications from the ACME endpoint.

You can register and deactivate ACME accounts over the web interface Datacenter -> ACME or using the pvenode command line tool.

 pvenode acme account register account-name mail@example.com
Tip Because of rate-limits you should use LE staging for experiments or if you use ACME for the first time.
ACME Plugins

The ACME plugin’s task is to provide automatic verification that you, and thus the Proxmox VE cluster under your operation, are the real owner of a domain. This is the basic building block for automatic certificate management.

The ACME protocol specifies different types of challenges, for example the http-01 challenge, where a web server provides a file with a certain content to prove that it controls a domain. Sometimes this isn’t possible, either because of technical limitations or because the address a record points to is not reachable from the public internet. The dns-01 challenge can be used in these cases. This challenge is fulfilled by creating a certain DNS record in the domain’s zone.


Proxmox VE supports both of those challenge types out of the box. You can configure plugins either over the web interface under Datacenter -> ACME, or using the pvenode acme plugin add command.

ACME Plugin configurations are stored in /etc/pve/priv/acme/plugins.cfg. A plugin is available for all nodes in the cluster.

Node Domains

Each domain is node specific. You can add new or manage existing domain entries under Node -> Certificates, or using the pvenode config command.


After configuring the desired domain(s) for a node and ensuring that the desired ACME account is selected, you can order your new certificate over the web-interface. On success the interface will reload after 10 seconds.

Renewal will happen automatically.

3.10.5. ACME HTTP Challenge Plugin

There is always an implicitly configured standalone plugin for validating http-01 challenges via the built-in webserver spawned on port 80.

Note The name standalone means that it can provide the validation on its own, without any third party service. So, this plugin also works for cluster nodes.

There are a few prerequisites to use it for certificate management with Let’s Encrypt’s ACME.

  • You have to accept the ToS of Let’s Encrypt to register an account.

  • Port 80 of the node needs to be reachable from the internet.

  • There must be no other listener on port 80.

  • The requested (sub)domain needs to resolve to a public IP of the Node.

3.10.6. ACME DNS API Challenge Plugin

On systems where external access for validation via the http-01 method is not possible or desired, it is possible to use the dns-01 validation method. This validation method requires a DNS server that allows provisioning of TXT records via an API.

Configuring ACME DNS APIs for validation

Proxmox VE re-uses the DNS plugins developed for the acme.sh project (https://github.com/acmesh-official/acme.sh). Please refer to its documentation for details on configuration of specific APIs.

The easiest way to configure a new plugin with the DNS API is using the web interface (Datacenter -> ACME).


Choose DNS as challenge type. Then you can select your API provider and enter the credential data to access your account over their API.

Tip See the acme.sh How to use DNS API wiki for more detailed information about getting API credentials for your provider.

As there are many DNS providers and API endpoints Proxmox VE automatically generates the form for the credentials for some providers. For the others you will see a bigger text area, simply copy all the credentials KEY=VALUE pairs in there.

DNS Validation through CNAME Alias

A special alias mode can be used to handle the validation on a different domain/DNS server, in case your primary/real DNS does not support provisioning via an API. Manually set up a permanent CNAME record for _acme-challenge.domain1.example pointing to _acme-challenge.domain2.example and set the alias property in the Proxmox VE node configuration file to domain2.example to allow the DNS server of domain2.example to validate all challenges for domain1.example.

Combination of Plugins

Combining http-01 and dns-01 validation is possible in case your node is reachable via multiple domains with different requirements / DNS provisioning capabilities. Mixing DNS APIs from multiple providers or instances is also possible by specifying different plugin instances per domain.

Tip Accessing the same service over multiple domains increases complexity and should be avoided if possible.

3.10.7. Automatic renewal of ACME certificates

If a node has been successfully configured with an ACME-provided certificate (either via pvenode or via the GUI), the certificate will be automatically renewed by the pve-daily-update.service. Currently, renewal will be attempted if the certificate has expired already, or will expire in the next 30 days.
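The renewal condition described above can be expressed as a simple time comparison. This is only a sketch of the rule as stated, not the code used by pve-daily-update.service; the function name acme_renewal_due is hypothetical, and treating exactly 30 days as due is an assumption.

```shell
# Succeeds if the certificate has already expired or expires within
# the next 30 days.
# $1 = certificate expiry (notAfter) as Unix time, $2 = current Unix time
acme_renewal_due() {
    [ $(( $1 - $2 )) -le $(( 30 * 24 * 60 * 60 )) ]
}

now=1700000000    # fixed sample timestamp for illustration
if acme_renewal_due $(( now + 10 * 86400 )) "$now"; then
    echo "renewal would be attempted"
fi
```

For a real certificate file, openssl x509 -checkend 2592000 -noout -in <certificate> performs a comparable check and exits with a non-zero status if the certificate expires within the given number of seconds.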

3.10.8. ACME Examples with pvenode

Example: Sample pvenode invocation for using Let’s Encrypt certificates
root@proxmox:~# pvenode acme account register default mail@example.invalid
Directory endpoints:
0) Let's Encrypt V2 (https://acme-v02.api.letsencrypt.org/directory)
1) Let's Encrypt V2 Staging (https://acme-staging-v02.api.letsencrypt.org/directory)
2) Custom
Enter selection: 1

Terms of Service: https://letsencrypt.org/documents/LE-SA-v1.2-November-15-2017.pdf
Do you agree to the above terms? [y|N]y
Task OK
root@proxmox:~# pvenode config set --acme domains=example.invalid
root@proxmox:~# pvenode acme cert order
Loading ACME account details
Placing ACME order
Status is 'valid'!

All domains validated!
Downloading certificate
Setting pveproxy certificate and key
Restarting pveproxy
Task OK
Example: Setting up the OVH API for validating a domain
Note the account registration steps are the same no matter which plugins are used, and are not repeated here.
Note OVH_AK and OVH_AS need to be obtained from OVH according to the OVH API documentation

First you need to get all information so you and Proxmox VE can access the API.

root@proxmox:~# cat /path/to/api-token
root@proxmox:~# source /path/to/api-token
root@proxmox:~# curl -XPOST -H"X-Ovh-Application: $OVH_AK" -H "Content-type: application/json" \
https://eu.api.ovh.com/1.0/auth/credential  -d '{
  "accessRules": [
    {"method": "GET","path": "/auth/time"},
    {"method": "GET","path": "/domain"},
    {"method": "GET","path": "/domain/zone/*"},
    {"method": "GET","path": "/domain/zone/*/record"},
    {"method": "POST","path": "/domain/zone/*/record"},
    {"method": "POST","path": "/domain/zone/*/refresh"},
    {"method": "PUT","path": "/domain/zone/*/record/"},
    {"method": "DELETE","path": "/domain/zone/*/record/*"}
  ]
}'

(open validation URL and follow instructions to link Application Key with account/Consumer Key)

root@proxmox:~# echo "OVH_CK=ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ" >> /path/to/api-token
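Before handing the token file to the plugin, it can help to confirm that all three variables are present. The following is a minimal sketch under the assumption that the file consists of plain NAME=value lines, as the echo command above suggests; the function name missing_ovh_vars is hypothetical.

```shell
# Report which of the expected OVH variables are missing from a token file.
# Usage: missing_ovh_vars FILE
# Prints one missing variable name per line; prints nothing if all are set.
missing_ovh_vars() {
    for name in OVH_AK OVH_AS OVH_CK; do
        # Match lines of the form NAME=value with a non-empty value.
        grep -q "^${name}=." "$1" || echo "$name"
    done
}
```

Running missing_ovh_vars /path/to/api-token should print nothing once the Consumer Key has been appended.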

Now you can set up the ACME plugin:

root@proxmox:~# pvenode acme plugin add dns example_plugin --api ovh --data /path/to/api-token
root@proxmox:~# pvenode acme plugin config example_plugin
│ key    │ value                                    │
│ api    │ ovh                                      │
│ data   │ OVH_AK=XXXXXXXXXXXXXXXX                  │
│ digest │ 867fcf556363ca1bea866863093fcab83edf47a1 │
│ plugin │ example_plugin                           │
│ type   │ dns                                      │

Finally, you can configure the domain you want to get certificates for and place the certificate order for it:

root@proxmox:~# pvenode config set -acmedomain0 example.proxmox.com,plugin=example_plugin
root@proxmox:~# pvenode acme cert order
Loading ACME account details
Placing ACME order
Order URL: https://acme-staging-v02.api.letsencrypt.org/acme/order/11111111/22222222

Getting authorization details from 'https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/33333333'
The validation for example.proxmox.com is pending!
[Wed Apr 22 09:25:30 CEST 2020] Using OVH endpoint: ovh-eu
[Wed Apr 22 09:25:30 CEST 2020] Checking authentication
[Wed Apr 22 09:25:30 CEST 2020] Consumer key is ok.
[Wed Apr 22 09:25:31 CEST 2020] Adding record
[Wed Apr 22 09:25:32 CEST 2020] Added, sleep 10 seconds.
Add TXT record: _acme-challenge.example.proxmox.com
Triggering validation
Sleeping for 5 seconds
Status is 'valid'!
[Wed Apr 22 09:25:48 CEST 2020] Using OVH endpoint: ovh-eu
[Wed Apr 22 09:25:48 CEST 2020] Checking authentication
[Wed Apr 22 09:25:48 CEST 2020] Consumer key is ok.
Remove TXT record: _acme-challenge.example.proxmox.com

All domains validated!

Creating CSR
Checking order status
Order is ready, finalizing order

Downloading certificate
Setting pveproxy certificate and key
Restarting pveproxy
Task OK

Example: Switching from the staging to the regular ACME directory

Changing the ACME directory for an account is unsupported, but as Proxmox VE supports more than one account you can just create a new one with the production (trusted) ACME directory as endpoint. You can also deactivate the staging account and recreate it.

Example: Changing the default ACME account from the staging to the production directory using pvenode
root@proxmox:~# pvenode acme account deactivate default
Renaming account file from '/etc/pve/priv/acme/default' to '/etc/pve/priv/acme/_deactivated_default_4'
Task OK

root@proxmox:~# pvenode acme account register default example@proxmox.com
Directory endpoints:
0) Let's Encrypt V2 (https://acme-v02.api.letsencrypt.org/directory)
1) Let's Encrypt V2 Staging (https://acme-staging-v02.api.letsencrypt.org/directory)
2) Custom
Enter selection: 0

Terms of Service: https://letsencrypt.org/documents/LE-SA-v1.2-November-15-2017.pdf
Do you agree to the above terms? [y|N]y
Task OK

3.11. Host Bootloader

Proxmox VE currently uses one of two bootloaders depending on the disk setup selected in the installer.

For EFI systems installed with ZFS as the root filesystem, systemd-boot is used. All other deployments use the standard grub bootloader (this usually also applies to systems which are installed on top of Debian).

3.11.1. Partitioning Scheme Used by the Installer

The Proxmox VE installer creates 3 partitions on all disks selected for installation.

The created partitions are:

  • a 1 MB BIOS Boot Partition (gdisk type EF02)

  • a 512 MB EFI System Partition (ESP, gdisk type EF00)

  • a third partition, used for the chosen storage type, spanning either the size set by the hdsize parameter or the remaining space

Systems using ZFS as root filesystem are booted with a kernel and initrd image stored on the 512 MB EFI System Partition. For legacy BIOS systems, grub is used; for EFI systems, systemd-boot is used. Both are installed and configured to point to the ESPs.

grub in BIOS mode (--target i386-pc) is installed onto the BIOS Boot Partition of all selected disks on all systems booted with grub.
[These are all installs with root on ext4 or xfs, and installs with root on ZFS on non-EFI systems.]

3.11.2. Synchronizing the content of the ESP with proxmox-boot-tool

proxmox-boot-tool is a utility used to keep the contents of the EFI System Partitions properly configured and synchronized. It copies certain kernel versions to all ESPs and configures the respective bootloader to boot from the vfat formatted ESPs. In the context of ZFS as root filesystem, this means that you can use all optional features on your root pool, instead of being limited to the subset that is also present in the ZFS implementation in grub or having to create a separate small boot-pool.
[Booting ZFS on root with grub https://github.com/zfsonlinux/zfs/wiki/Debian-Stretch-Root-on-ZFS]

In setups with redundancy, all disks are partitioned with an ESP by the installer. This ensures the system boots even if the first boot device fails or if the BIOS can only boot from a particular disk.

The ESPs are not kept mounted during regular operation. This helps to prevent filesystem corruption of the vfat formatted ESPs in case of a system crash, and removes the need to manually adapt /etc/fstab in case the primary boot device fails.

proxmox-boot-tool handles the following tasks:

  • formatting and setting up a new partition

  • copying and configuring new kernel images and initrd images to all listed ESPs

  • synchronizing the configuration on kernel upgrades and other maintenance tasks

  • managing the list of kernel versions which are synchronized

You can view the currently configured ESPs and their state by running:

# proxmox-boot-tool status

Setting up a new partition for use as synced ESP

To format and initialize a partition as synced ESP, e.g., after replacing a failed vdev in an rpool, or when converting an existing system that pre-dates the sync mechanism, proxmox-boot-tool from pve-kernel-helpers can be used.

Warning The format command will format the <partition>. Make sure to pass in the right device/partition!

For example, to format an empty partition /dev/sda2 as ESP, run the following:

# proxmox-boot-tool format /dev/sda2

To set up an existing, unmounted ESP located on /dev/sda2 for inclusion in Proxmox VE’s kernel update synchronization mechanism, use the following:

# proxmox-boot-tool init /dev/sda2

Afterwards /etc/kernel/proxmox-boot-uuids should contain a new line with the UUID of the newly added partition. The init command will also automatically trigger a refresh of all configured ESPs.
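To confirm that the init step registered the partition, you can check the UUID list for a matching line. The sketch below assumes the file holds one UUID per line, as the description above suggests; the function name esp_is_registered is hypothetical.

```shell
# Check whether a partition UUID is listed in a proxmox-boot-uuids style file.
# Usage: esp_is_registered UUID [FILE]
# Returns 0 if the UUID appears as a whole line, non-zero otherwise.
esp_is_registered() {
    uuid=$1
    file=${2:-/etc/kernel/proxmox-boot-uuids}
    # -x: match the whole line, -F: fixed string, -q: no output
    grep -qxF "$uuid" "$file"
}
```

The UUID of a partition can usually be obtained with blkid -s UUID -o value /dev/sda2 and then passed to the function.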

Updating the configuration on all ESPs

To copy and configure all bootable kernels and keep all ESPs listed in /etc/kernel/proxmox-boot-uuids in sync, you just need to run:

# proxmox-boot-tool refresh

(This is the equivalent of running update-grub on systems with ext4 or xfs on root.)

This is necessary should you make changes to the kernel commandline, or want to sync all kernels and initrds.

Note Both update-initramfs and apt (when necessary) will automatically trigger a refresh.

Kernel Versions considered by proxmox-boot-tool

The following kernel versions are configured by default:

  • the currently running kernel

  • the version being newly installed on package updates

  • the two latest already installed kernels

  • the latest version of the second-to-last kernel series (e.g. 5.0, 5.3), if applicable

  • any manually selected kernels

Manually keeping a kernel bootable

Should you wish to add a certain kernel and initrd image to the list of bootable kernels, use proxmox-boot-tool kernel add.

For example, run the following to add the kernel with ABI version 5.0.15-1-pve to the list of kernels to keep installed and synced to all ESPs:

# proxmox-boot-tool kernel add 5.0.15-1-pve

proxmox-boot-tool kernel list will list all kernel versions currently selected for booting:

# proxmox-boot-tool kernel list
Manually selected kernels:

Automatically selected kernels:

Run proxmox-boot-tool kernel remove to remove a kernel from the list of manually selected kernels, for example:

# proxmox-boot-tool kernel remove 5.0.15-1-pve

Note You must run proxmox-boot-tool refresh to update all EFI System Partitions (ESPs) after manually adding or removing a kernel as described above.

3.11.3. Determine which Bootloader is Used


The simplest and most reliable way to determine which bootloader is used is to watch the boot process of the Proxmox VE node.

You will either see the blue box of grub or the simple black-on-white systemd-boot menu.


Determining the bootloader from a running system might not be 100% accurate. The safest way is to run the following command:

# efibootmgr -v

If it returns a message that EFI variables are not supported, grub is used in BIOS/Legacy mode.

If the output contains a line that looks similar to the following, grub is used in UEFI mode.

Boot0005* proxmox       [...] File(\EFI\proxmox\grubx64.efi)

If the output contains a line similar to the following, systemd-boot is used.

Boot0006* Linux Boot Manager    [...] File(\EFI\systemd\systemd-bootx64.efi)

By running:

# proxmox-boot-tool status

you can find out if proxmox-boot-tool is configured, which is a good indication of how the system is booted.
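The efibootmgr checks described in this section can be combined into one helper. This is a rough sketch, not a tool shipped with Proxmox VE: the function name classify_bootloader and its result labels are hypothetical, and it only looks for the strings quoted above. It does not consult the boot order, so on a system with both kinds of entries the result is a guess.

```shell
# Classify the bootloader from `efibootmgr -v` output read on stdin.
# Prints one of: grub-bios, grub-uefi, systemd-boot, unknown
classify_bootloader() {
    output=$(cat)
    case $output in
        *"EFI variables are not supported"*) echo "grub-bios" ;;
        *systemd-bootx64.efi*)               echo "systemd-boot" ;;
        *grubx64.efi*)                       echo "grub-uefi" ;;
        *)                                   echo "unknown" ;;
    esac
}
```

A typical invocation would be efibootmgr -v 2>&1 | classify_bootloader, with stderr redirected so that error messages are also seen by the function.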

3.11.4. Grub

grub has been the de-facto standard for booting Linux systems for many years and is quite well documented.
[Grub Manual https://www.gnu.org/software/grub/manual/grub/grub.html]


Changes to the grub configuration are done via the defaults file /etc/default/grub or config snippets in /etc/default/grub.d. To regenerate the configuration file after a change to the configuration, run:

# update-grub

[Systems using proxmox-boot-tool will call proxmox-boot-tool refresh upon update-grub.]

3.11.5. Systemd-boot

systemd-boot is a lightweight EFI bootloader. It reads the kernel and initrd images directly from the EFI System Partition (ESP) where it is installed. The main advantage of directly loading the kernel from the ESP is that it does not need to reimplement the drivers for accessing the storage. In Proxmox VE, proxmox-boot-tool is used to keep the configuration on the ESPs synchronized.


systemd-boot is configured via the file loader/loader.conf in the root directory of an EFI System Partition (ESP). See the loader.conf(5) manpage for details.

Each bootloader entry is placed in a file of its own in the directory loader/entries/.

An example entry.conf looks like this (/ refers to the root of the ESP):

title    Proxmox
version  5.0.15-1-pve
options   root=ZFS=rpool/ROOT/pve-1 boot=zfs
linux    /EFI/proxmox/5.0.15-1-pve/vmlinuz-5.0.15-1-pve
initrd   /EFI/proxmox/5.0.15-1-pve/initrd.img-5.0.15-1-pve
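To inspect such an entry from a script, a single key can be extracted with awk. This is a minimal sketch assuming the whitespace-separated "key value" layout shown above; the function name entry_field is hypothetical, and runs of spaces inside the value are collapsed to single spaces.

```shell
# Print the value of one key from a systemd-boot entry file.
# Usage: entry_field KEY FILE
entry_field() {
    # On the first line whose first word equals KEY, drop that word,
    # strip the leading whitespace that remains, and print the rest.
    awk -v key="$1" '$1 == key { $1 = ""; sub(/^[ \t]+/, ""); print; exit }' "$2"
}
```

For the example entry, entry_field options would print the kernel command line and entry_field version the kernel version.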

3.11.6. Editing the Kernel Commandline

You can modify the kernel commandline in the following places, depending on the bootloader used:


Grub

The kernel commandline needs to be placed in the variable GRUB_CMDLINE_LINUX_DEFAULT in the file /etc/default/grub. Running update-grub appends its content to all linux entries in /boot/grub/grub.cfg.


Systemd-boot

The kernel commandline needs to be placed as one line in /etc/kernel/cmdline. To apply your changes, run proxmox-boot-tool refresh, which sets it as the option line for all config files in loader/entries/proxmox-*.conf.
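Because /etc/kernel/cmdline must stay a single line, appending a parameter by hand is easy to get wrong. The following sketch adds a parameter only if it is not already present. The function name add_cmdline_param is hypothetical, and the parameter is matched as a whole space-separated word, which is an assumption about how you want duplicates detected.

```shell
# Append a kernel parameter to a one-line cmdline file unless already present.
# Usage: add_cmdline_param FILE PARAM
add_cmdline_param() {
    file=$1
    param=$2
    current=$(head -n 1 "$file")
    case " $current " in
        *" $param "*) return 0 ;;   # already present, nothing to do
    esac
    if [ -n "$current" ]; then
        printf '%s %s\n' "$current" "$param" > "$file"
    else
        printf '%s\n' "$param" > "$file"
    fi
}
```

After changing the file this way, proxmox-boot-tool refresh still has to be run for the change to reach the ESPs.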

4. Graphical User Interface

Proxmox VE is simple. There is no need to install a separate management tool, and everything can be done through your web browser (Latest Firefox or Google Chrome is preferred). A built-in HTML5 console is used to access the guest console. As an alternative, SPICE can be used.

Because we use the Proxmox cluster file system (pmxcfs), you can connect to any node to manage the entire cluster. There is no need for a dedicated manager node.

You can use the web-based administration interface with any modern browser. When Proxmox VE detects that you are connecting from a mobile device, you are redirected to a simpler, touch-based user interface.

The web interface can be reached via https://youripaddress:8006 (default login is: root, and the password is specified during the installation process).

4.1. Features

  • Seamless integration and management of Proxmox VE clusters

  • AJAX technologies for dynamic updates of resources

  • Secure access to all Virtual Machines and Containers via SSL encryption (https)

  • Fast search-driven interface, capable of handling hundreds and probably thousands of VMs

  • Secure HTML5 console or SPICE

  • Role based permission management for all objects (VMs, storages, nodes, etc.)

  • Support for multiple authentication sources (e.g. local, MS ADS, LDAP, …)

  • Two-Factor Authentication (OATH, Yubikey)

  • Based on ExtJS 6.x JavaScript framework

4.2. Login


When you connect to the server, you will first see the login window. Proxmox VE supports various authentication backends (Realm), and you can select the language here. The GUI is translated to more than 20 languages.

Note You can save the user name on the client side by selecting the checkbox at the bottom. This saves some typing when you login next time.

4.3. GUI Overview


The Proxmox VE user interface consists of four regions.


Header

On top. Shows status information and contains buttons for the most important actions.

Resource Tree

At the left side. A navigation tree where you can select specific objects.

Content Panel

Center region. Selected objects display configuration options and status here.

Log Panel

At the bottom. Displays log entries for recent tasks. You can double-click on those log entries to get more details, or to abort a running task.

Note You can shrink and expand the size of the resource tree and log panel, or completely hide the log panel. This can be helpful when you work on small displays and want more space to view other content.

4.3.1. Header

On the top left side, the first thing you see is the Proxmox logo. Next to it is the currently running version of Proxmox VE. In the search bar beside it, you can search for specific objects (VMs, containers, nodes, …). This is sometimes faster than selecting an object in the resource tree.

To the right of the search bar, you see the identity (login name). The gear symbol is a button that opens the My Settings dialog. There you can customize some client-side user interface settings (reset the saved login name, reset the saved layout).

The rightmost part of the header contains four buttons:


Documentation

Opens a new browser window showing the reference documentation.

Create VM

Opens the virtual machine creation wizard.

Create CT

Opens the container creation wizard.

Logout

Logs out and shows the login dialog again.

4.3.2. My Settings


The My Settings window allows you to set locally stored settings. These include the Dashboard Storages, which allow you to enable or disable specific storages to be counted towards the total amount visible in the datacenter summary. If no storage is checked, the total is the sum of all storages, the same as enabling every single one.

Below the dashboard settings, you find the stored user name and a button to clear it, as well as a button to reset every layout in the GUI to its default.

On the right side there are xterm.js Settings. These contain the following options:


Font-Family

The font to be used in xterm.js (e.g. Arial).

Font-Size

The preferred font size to be used.

Letter Spacing

Increases or decreases spacing between letters in text.

Line Height

Specify the absolute height of a line.

4.3.3. Resource Tree

This is the main navigation tree. On top of the tree you can select some predefined views, which change the structure of the tree below. The default view is the Server View, and it shows the following object types:


Datacenter

Contains cluster-wide settings (relevant for all nodes).

Node

Represents the hosts inside a cluster, where the guests run.

Guest

VMs, containers and templates.

Storage

Data Storage.

Pool

It is possible to group guests using a pool to simplify management.

The following view types are available:

Server View

Shows all kinds of objects, grouped by nodes.

Folder View

Shows all kinds of objects, grouped by object type.

Storage View

Only shows storage objects, grouped by nodes.

Pool View

Shows VMs and containers, grouped by pool.

4.3.4. Log Panel

The main purpose of the log panel is to show you what is currently going on in your cluster. Actions like creating a new VM are executed in the background, and we call such a background job a task.

Any output from such a task is saved into a separate log file. You can view that log by simply double-clicking a task log entry. It is also possible to abort a running task there.

Please note that we display the most recent tasks from all cluster nodes here, so you can see in real time when somebody else is working on another cluster node.

Note We remove older and finished tasks from the log panel to keep that list short. But you can still find those tasks within the node panel in the Task History.

Some short-running actions simply send logs to all cluster members. You can see those messages in the Cluster log panel.

4.4. Content Panels

When you select an item from the resource tree, the corresponding object displays configuration and status information in the content panel. The following sections provide a brief overview of this functionality. Please refer to the corresponding chapters in the reference documentation to get more detailed information.

4.4.1. Datacenter


On the datacenter level, you can access cluster-wide settings and information.

  • Search: perform a cluster-wide search for nodes, VMs, containers, storage devices, and pools.

  • Summary: gives a brief overview of the cluster’s health and resource usage.

  • Cluster: provides the functionality and information necessary to create or join a cluster.

  • Options: view and manage cluster-wide default settings.

  • Storage: provides an interface for managing cluster storage.

  • Backup: schedule backup jobs. This operates cluster wide, so it doesn’t matter where the VMs/containers are on your cluster when scheduling.

  • Replication: view and manage replication jobs.

  • Permissions: manage user, group, and API token permissions, and LDAP, MS-AD and Two-Factor authentication.

  • HA: manage Proxmox VE High Availability.

  • ACME: set up ACME (Let’s Encrypt) certificates for server nodes.

  • Firewall: configure and make templates for the Proxmox Firewall cluster wide.

  • Metric Server: define external metric servers for Proxmox VE.

  • Support: display information about your support subscription.

4.4.2. Nodes


Nodes in your cluster can be managed individually at this level.

The top header has useful buttons such as Reboot, Shutdown, Shell, Bulk Actions and Help. Shell has the options noVNC, SPICE and xterm.js. Bulk Actions has the options Bulk Start, Bulk Stop and Bulk Migrate.

  • Search: search a node for VMs, containers, storage devices, and pools.

  • Summary: display a brief overview of the node’s resource usage.

  • Notes: write custom notes about a node.

  • Shell: access to a shell interface for the node.

  • System: configure network, DNS and time settings, and access the syslog.

  • Updates: upgrade the system and see the available new packages.

  • Firewall: manage the Proxmox Firewall for a specific node.

  • Disks: get an overview of the attached disks, and manage how they are used.

  • Ceph: only used if you have installed a Ceph server on your host. In that case, you can manage your Ceph cluster and see its status here.

  • Replication: view and manage replication jobs.

  • Task History: see a list of past tasks.

  • Subscription: upload a subscription key, and generate a system report for use in support cases.

4.4.3. Guests


There are two different kinds of guests and both can be converted to a template. One of them is a Kernel-based Virtual Machine (KVM) and the other is a Linux Container (LXC). Navigation for these is mostly the same; only some options are different.

To access the various guest management interfaces, select a VM or container from the menu on the left.

The header contains commands for items such as power management, migration, console access and type, cloning, HA, and help. Some of these buttons contain drop-down menus, for example, Shutdown also contains other power options, and Console contains the different console types: SPICE, noVNC and xterm.js.

The panel on the right contains an interface for whatever item is selected from the menu on the left.

The available interfaces are as follows.

  • Summary: provides a brief overview of the VM’s activity.

  • Console: access to an interactive console for the VM/container.

  • (KVM) Hardware: define the hardware available to the KVM VM.

  • (LXC) Resources: define the system resources available to the LXC.

  • (LXC) Network: configure a container’s network settings.

  • (LXC) DNS: configure a container’s DNS settings.

  • Options: manage guest options.

  • Task History: view all previous tasks related to the selected guest.

  • (KVM) Monitor: an interactive communication interface to the KVM process.

  • Backup: create and restore system backups.

  • Replication: view and manage the replication jobs for the selected guest.

  • Snapshots: create and restore VM snapshots.

  • Firewall: configure the firewall on the VM level.

  • Permissions: manage permissions for the selected guest.

4.4.4. Storage


As with the guest interface, the interface for storage is a split view: a menu on the left lists the storage options, and the panel on the right shows the content of the selected option.

  • Summary: shows important information about the storage, such as the type, usage, and content which it stores.

  • Content: a menu item for each content type which the storage stores, for example, Backups, ISO Images, CT Templates.

  • Permissions: manage permissions for the storage.

4.4.5. Pools


Again, the pools view comprises two partitions: a menu on the left, and the corresponding interfaces for each menu item on the right.

  • Summary: shows a description of the pool.

  • Members: display and manage pool members (guests and storage).

  • Permissions: manage the permissions for the pool.

5. Cluster Manager

The Proxmox VE cluster manager pvecm is a tool to create a group of physical servers. Such a group is called a cluster. We use the Corosync Cluster Engine for reliable group communication. There’s no explicit limit for the number of nodes in a cluster. In practice, the actual possible node count may be limited by the host and network performance. Currently (2021), there are reports of clusters (using high-end enterprise hardware) with over 50 nodes in production.

pvecm can be used to create a new cluster, join nodes to a cluster, leave the cluster, get status information and do various other cluster-related tasks. The Proxmox Cluster File System (“pmxcfs”) is used to transparently distribute the cluster configuration to all cluster nodes.

Grouping nodes into a cluster has the following advantages:

  • Centralized, web based management

  • Multi-master clusters: each node can do all management tasks

  • pmxcfs: database-driven file system for storing configuration files, replicated in real-time on all nodes using corosync.

  • Easy migration of virtual machines and containers between physical hosts

  • Fast deployment

  • Cluster-wide services like firewall and HA

5.1. Requirements

  • All nodes must be able to connect to each other via UDP ports 5404 and 5405 for corosync to work.

  • Date and time have to be synchronized.

  • An SSH tunnel on TCP port 22 is used between nodes.

  • If you are interested in High Availability, you need to have at least three nodes for reliable quorum. All nodes should have the same version.

  • We recommend a dedicated NIC for the cluster traffic, especially if you use shared storage.

  • Root password of a cluster node is required for adding nodes.

Note It is not possible to mix Proxmox VE 3.x and earlier with Proxmox VE 4.x cluster nodes.
Note While it’s possible to mix Proxmox VE 4.4 and Proxmox VE 5.0 nodes, doing so is not supported as a production configuration and should only be used temporarily while upgrading the whole cluster from one major version to another.
Note Running a cluster of Proxmox VE 6.x with earlier versions is not possible. The cluster protocol (corosync) between Proxmox VE 6.x and earlier versions changed fundamentally. The corosync 3 packages for Proxmox VE 5.4 are only intended for the upgrade procedure to Proxmox VE 6.0.

5.2. Preparing Nodes

First, install Proxmox VE on all nodes. Make sure that each node is installed with the final hostname and IP configuration. Changing the hostname and IP is not possible after cluster creation.

While it’s common to reference all node names and their IPs in /etc/hosts (or make their names resolvable through other means), this is not necessary for a cluster to work. It may be useful however, as you can then connect from one node to the other with SSH via the easier to remember node name (see also Link Address Types). Note that we always recommend referencing nodes by their IP addresses in the cluster configuration.

5.3. Create a Cluster

You can either create a cluster on the console (log in via ssh), or through the API using the Proxmox VE web interface (Datacenter → Cluster).

Note Use a unique name for your cluster. This name cannot be changed later. The cluster name follows the same rules as node names.

5.3.1. Create via Web GUI


Under Datacenter → Cluster, click on Create Cluster. Enter the cluster name and select a network connection from the dropdown to serve as the main cluster network (Link 0). It defaults to the IP resolved via the node’s hostname.

To add a second link as fallback, you can select the Advanced checkbox and choose an additional network interface (Link 1, see also Corosync Redundancy).

Note Ensure the network selected for the cluster communication is not used for any high traffic loads like those of (network) storages or live-migration. While the cluster network itself produces small amounts of data, it is very sensitive to latency. Check out full cluster network requirements.

5.3.2. Create via Command Line

Log in via ssh to the first Proxmox VE node and run the following command:

 hp1# pvecm create CLUSTERNAME

To check the state of the new cluster use:

 hp1# pvecm status

5.3.3. Multiple Clusters In Same Network

It is possible to create multiple clusters in the same physical or logical network. Each such cluster must have a unique name to avoid possible clashes in the cluster communication stack. This also helps avoid human confusion by making clusters clearly distinguishable.

While the bandwidth requirement of a corosync cluster is relatively low, the latency of packets and the packets per second (PPS) rate are the limiting factors. Different clusters in the same network can compete with each other for these resources, so it may still make sense to use separate physical network infrastructure for bigger clusters.

5.4. Adding Nodes to the Cluster

Caution A node that is about to be added to the cluster cannot hold any guests. All existing configuration in /etc/pve is overwritten when joining a cluster, since guest IDs could otherwise conflict. As a workaround, create a backup of the guest (vzdump) and restore it under a different ID after the node has been added to the cluster.

5.4.1. Join Node to Cluster via GUI


Log in to the web interface on an existing cluster node. Under Datacenter → Cluster, click the button Join Information at the top. Then, click on the button Copy Information. Alternatively, copy the string from the Information field manually.


Next, log in to the web interface on the node you want to add. Under Datacenter → Cluster, click on Join Cluster. Fill in the Information field with the Join Information text you copied earlier. Most settings required for joining the cluster will be filled out automatically. For security reasons, the cluster password has to be entered manually.

Note To enter all required data manually, you can disable the Assisted Join checkbox.

After clicking the Join button, the cluster join process will start immediately. After the node has joined the cluster, its current node certificate will be replaced by one signed by the cluster certificate authority (CA). This means that the current session will stop working after a few seconds. You might then need to force-reload the web interface and log in again with the cluster credentials.

Now your node should be visible under Datacenter → Cluster.

5.4.2. Join Node to Cluster via Command Line

Log in via ssh to the node you want to join to an existing cluster.

 hp2# pvecm add IP-ADDRESS-CLUSTER

For IP-ADDRESS-CLUSTER use the IP or hostname of an existing cluster node. An IP address is recommended (see Link Address Types).

To check the state of the cluster use:

 # pvecm status

Cluster status after adding 4 nodes
hp2# pvecm status
Quorum information
Date:             Mon Apr 20 12:30:13 2015
Quorum provider:  corosync_votequorum
Nodes:            4
Node ID:          0x00000001
Ring ID:          1/8
Quorate:          Yes

Votequorum information
Expected votes:   4
Highest expected: 4
Total votes:      4
Quorum:           3
Flags:            Quorate

Membership information
    Nodeid      Votes Name
0x00000001          1
0x00000002          1 (local)
0x00000003          1
0x00000004          1
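The Quorum value in this output is consistent with a simple majority of the expected votes. As a quick sketch, assuming one vote per node and no special votequorum options, it can be reproduced with integer arithmetic; the function name quorum_for is hypothetical.

```shell
# Votes needed for quorum, assuming a simple majority of the expected votes.
# Usage: quorum_for EXPECTED_VOTES
quorum_for() {
    echo $(( $1 / 2 + 1 ))
}

quorum_for 4   # prints 3, matching the four-node output above
```

With three expected votes the same formula gives 2, which is why a three-node cluster can tolerate the loss of one node.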

If you only want the list of all nodes use:

 # pvecm nodes

List nodes in a cluster
hp2# pvecm nodes

Membership information
    Nodeid      Votes Name
         1          1 hp1
         2          1 hp2 (local)
         3          1 hp3
         4          1 hp4

5.4.3. Adding Nodes With Separated Cluster Network

When adding a node to a cluster with a separated cluster network, you need to pass the link0 parameter to pvecm add to set the node’s address on that network.


If you want to use the built-in redundancy of the kronosnet transport layer, also use the link1 parameter.

Using the GUI, you can select the correct interface from the corresponding Link 0 and Link 1 fields in the Cluster Join dialog.

5.5. Remove a Cluster Node

Caution Read the procedure carefully before proceeding, as it may not be what you want or need.

Move all virtual machines from the node. Make sure you have no local data or backups you want to keep, or save them accordingly. In the following example we will remove the node hp4 from the cluster.

Log in to a different cluster node (not hp4), and issue a pvecm nodes command to identify the node ID to remove:

hp1# pvecm nodes

Membership information
    Nodeid      Votes Name
         1          1 hp1 (local)
         2          1 hp2
         3          1 hp3
         4          1 hp4
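If you want to look up the node ID in a script rather than by eye, the table can be filtered with awk. This is only a sketch that assumes the three-column layout shown above (node ID, votes, name); the function name node_id_for is hypothetical.

```shell
# Look up a node ID by name in `pvecm nodes` output read on stdin.
# Usage: pvecm nodes | node_id_for NAME
# Prints the ID and returns 0 if found; returns non-zero otherwise.
node_id_for() {
    awk -v name="$1" '
        # Data rows start with a numeric ID; the name is the third column.
        $1 ~ /^[0-9]+$/ && $3 == name { print $1; found = 1; exit }
        END { exit !found }
    '
}
```

For the output above, pvecm nodes | node_id_for hp4 would print 4.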

At this point you must power off hp4 and make sure that it will not power on again (in the network) as it is.

Important As said above, it is critical to power off the node before removal, and make sure that it will never power on again (in the existing cluster network) as it is. If you power on the node as it is, your cluster could end up broken, and it could be difficult to restore a clean cluster state.

After powering off the node hp4, we can safely remove it from the cluster.

 hp1# pvecm delnode hp4
 Killing node 4

Use pvecm nodes or pvecm status to check the node list again. It should look something like:

hp1# pvecm status

Quorum information
Date:             Mon Apr 20 12:44:28 2015
Quorum provider:  corosync_votequorum
Nodes:            3
Node ID:          0x00000001
Ring ID:          1/8
Quorate:          Yes

Votequorum information
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate

Membership information
    Nodeid      Votes Name
0x00000001          1 (local)
0x00000002          1
0x00000003          1

If, for whatever reason, you want this server to join the same cluster again, you have to

  • reinstall Proxmox VE on it from scratch

  • then join it, as explained in the previous section.

Note After removal of the node, its SSH fingerprint will still reside in the known_hosts of the other nodes. If you receive an SSH error after rejoining a node with the same IP or hostname, run pvecm updatecerts once on the re-added node to update its fingerprint cluster wide.

5.5.1. Separate A Node Without Reinstalling

Caution This is not the recommended method, proceed with caution. Use the above mentioned method if you’re unsure.

You can also separate a node from a cluster without reinstalling it from scratch. But after removing the node from the cluster, it will still have access to any shared storage! This must be resolved before you start removing the node from the cluster. A Proxmox VE cluster cannot share the exact same storage with another cluster, as storage locking doesn’t work over the cluster boundary. Furthermore, it may also lead to VMID conflicts.

It is suggested that you create a new storage to which only the node you want to separate has access. This can be a new export on your NFS or a new Ceph pool, to name a few examples. It is just important that the exact same storage does not get accessed by multiple clusters. After setting up this storage, move all data from the node and its VMs to it. Then you are ready to separate the node from the cluster.

Warning Ensure all shared resources are cleanly separated! Otherwise you will run into conflicts and problems.

First, stop the corosync and the pve-cluster services on the node:

systemctl stop pve-cluster
systemctl stop corosync

Start the cluster filesystem again in local mode:

pmxcfs -l

Delete the corosync configuration files:

rm /etc/pve/corosync.conf
rm -r /etc/corosync/*

You can now start the filesystem again as a normal service:

killall pmxcfs
systemctl start pve-cluster

The node is now separated from the cluster. You can delete it from any remaining node of the cluster with:

pvecm delnode oldnode

If the command fails because the remaining node in the cluster lost quorum when the now separated node exited, you may set the expected votes to 1 as a workaround:

pvecm expected 1

And then repeat the pvecm delnode command.

Now switch back to the separated node and delete all the remaining files left over from the old cluster. This ensures that the node can be added to another cluster again without problems.

rm /var/lib/corosync/*

As the configuration files from the other nodes are still in the cluster filesystem, you may want to clean those up too. Simply remove the whole directory /etc/pve/nodes/NODENAME recursively, but check three times that you have the correct one before deleting it.

Caution The node’s SSH keys remain in the authorized_keys file. This means that the nodes can still connect to each other with public key authentication. You should fix this by removing the respective keys from the /etc/pve/priv/authorized_keys file.

5.6. Quorum

Proxmox VE uses a quorum-based technique to provide a consistent state among all cluster nodes.

A quorum is the minimum number of votes that a distributed transaction has to obtain in order to be allowed to perform an operation in a distributed system.

Quorum (distributed computing)
— from Wikipedia

In case of network partitioning, state changes require that a majority of nodes are online. The cluster switches to read-only mode if it loses quorum.

Note Proxmox VE assigns a single vote to each node by default.
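
The majority rule can be sketched as simple arithmetic. This is only an illustration under the default of one vote per node; the helper name quorum_threshold is hypothetical and not part of pvecm.

```shell
# Hypothetical helper: minimum number of votes needed for quorum
# (a strict majority), assuming every node has the default single vote.
quorum_threshold() {
    echo $(( $1 / 2 + 1 ))
}

quorum_threshold 3   # prints 2
quorum_threshold 4   # prints 3
```

A three-node cluster therefore stays quorate with one node down, while a four-node cluster also tolerates only a single failure.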

5.7. Cluster Network

The cluster network is the core of a cluster. All messages sent over it have to be delivered reliably to all nodes in their respective order. In Proxmox VE this part is done by corosync, an implementation of a high performance, low overhead high availability development toolkit. It serves our decentralized configuration file system (pmxcfs).

5.7.1. Network Requirements

The cluster network must be reliable, with latencies under 2 milliseconds (LAN performance), to work properly. The network should not be used heavily by other members; ideally, corosync runs on its own network. Do not use a shared network for corosync and storage (except as a potential low-priority fallback in a redundant configuration).

Before setting up a cluster, it is good practice to check if the network is fit for that purpose. To make sure the nodes can connect to each other on the cluster network, you can test the connectivity between them with the ping tool.
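
A minimal sketch of such a check follows. It assumes the summary line format printed by the Linux (iputils) ping; the helper names avg_rtt and latency_ok and the NODE_ADDR placeholder are hypothetical.

```shell
# Extract the average round-trip time (in ms) from the summary line printed
# by Linux ping, e.g. "rtt min/avg/max/mdev = 0.045/0.052/0.060/0.006 ms".
avg_rtt() {
    awk -F'/' '/^rtt / { print $5 }'
}

# Succeed if the given average latency is below the limit (default: 2 ms).
# An empty value (e.g. ping failed) is treated as a failure.
latency_ok() {
    awk -v avg="$1" -v max="${2:-2}" \
        'BEGIN { exit !(avg != "" && avg + 0 < max + 0) }'
}

# Usage on a real node (NODE_ADDR stands for a peer's cluster address):
#   latency_ok "$(ping -c 10 -q "$NODE_ADDR" | avg_rtt)" && echo "latency OK"
```

Running the check from every node against every other node gives a rough picture of whether the network meets the latency requirement; it does not replace testing under load.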

If the Proxmox VE firewall is enabled, ACCEPT rules for corosync will automatically be generated - no manual action is required.

Note Corosync used Multicast before version 3.0 (introduced in Proxmox VE 6.0). Modern versions rely on Kronosnet for cluster communication, which, for now, only supports regular UDP unicast.

Caution You can still enable Multicast or legacy unicast by setting your transport to udp or udpu in your corosync.conf, but keep in mind that this will disable all cryptography and redundancy support. This is therefore not recommended.

5.7.2. Separate Cluster Network

When creating a cluster without any parameters, the corosync cluster network is generally shared with the Web UI and the VMs and their traffic. Depending on your setup, even storage traffic may get sent over the same network. It is recommended to change that, as corosync is a time-critical, real-time application.

Setting Up A New Network

First, you have to set up a new network interface. It should be on a physically separate network. Ensure that your network fulfills the cluster network requirements.

Separate On Cluster Creation

This is possible via the linkX parameters of the pvecm create command used for creating a new cluster.

If you have set up an additional NIC with a static address and want to send and receive all cluster communication over this interface, you would execute:

pvecm create test --link0

To check if everything is working properly execute:

systemctl status corosync

Afterwards, proceed as described above to add nodes with a separated cluster network.

Separate After Cluster Creation

You can do this if you have already created a cluster and want to switch its communication to another network, without rebuilding the whole cluster. This change may lead to short durations of quorum loss in the cluster, as nodes have to restart corosync and come up one after the other on the new network.

Check how to edit the corosync.conf file first. Then, open it and you should see a file similar to:

logging {
  debug: off
  to_syslog: yes
}

nodelist {

  node {
    name: due
    nodeid: 2
    quorum_votes: 1
    ring0_addr: due
  }

  node {
    name: tre
    nodeid: 3
    quorum_votes: 1
    ring0_addr: tre
  }

  node {
    name: uno
    nodeid: 1
    quorum_votes: 1
    ring0_addr: uno
  }

}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: testcluster
  config_version: 3
  ip_version: ipv4-6
  secauth: on
  version: 2
  interface {
    linknumber: 0
  }

}

Note ringX_addr actually specifies a corosync link address, the name "ring" is a remnant of older corosync versions that is kept for backwards compatibility.

The first thing you want to do is add the name properties in the node entries if you do not see them already. Those must match the node name.

Then replace all addresses from the ring0_addr properties of all nodes with the new addresses. You may use plain IP addresses or hostnames here. If you use hostnames, ensure that they are resolvable from all nodes (see also Link Address Types).

In this example, we want to switch the cluster communication to the new network, so we replace the ring0_addr of each node accordingly.

Note The exact same procedure can be used to change other ringX_addr values as well. However, we recommend changing only one link address at a time, so that it is easier to recover if something goes wrong.

After we increase the config_version property, the new configuration file should look like:

logging {
  debug: off
  to_syslog: yes
}

nodelist {

  node {
    name: due
    nodeid: 2
    quorum_votes: 1
  }

  node {
    name: tre
    nodeid: 3
    quorum_votes: 1
  }

  node {
    name: uno
    nodeid: 1
    quorum_votes: 1
  }

}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: testcluster
  config_version: 4
  ip_version: ipv4-6
  secauth: on
  version: 2
  interface {
    linknumber: 0
  }

}


Then, after a final check that all changed information is correct, we save it and once again follow the edit corosync.conf file section to bring it into effect.

The changes will be applied live, so restarting corosync is not strictly necessary. If you changed other settings as well, or notice corosync complaining, you can optionally trigger a restart.

On a single node execute:

systemctl restart corosync

Now check if everything is fine:

systemctl status corosync

If corosync runs correctly again, restart it on all other nodes as well. They will then join the cluster membership one by one on the new network.

5.7.3. Corosync addresses

A corosync link address (for backwards compatibility denoted by ringX_addr in corosync.conf) can be specified in two ways:

  • IPv4/v6 addresses will be used directly. They are recommended, since they are static and usually not changed carelessly.

  • Hostnames will be resolved using getaddrinfo, which means that by default, IPv6 addresses will be used first, if available (see also man gai.conf). Keep this in mind, especially when upgrading an existing cluster to IPv6.

Caution Hostnames should be used with care, since the address they resolve to can be changed without touching corosync or the node it runs on - which may lead to a situation where an address is changed without thinking about implications for corosync.

A separate, static hostname specifically for corosync is recommended, if hostnames are preferred. Also, make sure that every node in the cluster can resolve all hostnames correctly.

Since Proxmox VE 5.1, hostnames, while still supported, are resolved at the time of entry. Only the resolved IP is then saved to the configuration.

Nodes that joined the cluster on earlier versions likely still use their unresolved hostname in corosync.conf. It might be a good idea to replace them with IPs or a separate hostname, as mentioned above.

5.8. Corosync Redundancy

Corosync supports redundant networking via its integrated kronosnet layer by default (it is not supported on the legacy udp/udpu transports). It can be enabled by specifying more than one link address, either via the --linkX parameters of pvecm, in the GUI as Link 1 (while creating a cluster or adding a new node) or by specifying more than one ringX_addr in corosync.conf.

Note To provide useful failover, every link should be on its own physical network connection.

Links are used according to a priority setting. You can configure this priority by setting knet_link_priority in the corresponding interface section in corosync.conf, or, preferably, using the priority parameter when creating your cluster with pvecm:

 # pvecm create CLUSTERNAME --link0,priority=15 --link1,priority=20

This would cause link1 to be used first, since it has the higher priority.

If no priorities are configured manually (or two links have the same priority), links will be used in order of their number, with the lower number having higher priority.
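
The selection rule described above can be modelled in a few lines. This is a hypothetical illustration of the ordering only, not corosync code; the helper name active_link and the input format are assumptions.

```shell
# Hypothetical model of the selection rule, not corosync code.
# Reads "LINKNUMBER PRIORITY" pairs (one per line, single space) on stdin
# and prints the link that would be preferred: highest priority first,
# lowest link number on ties.
active_link() {
    sort -k2,2nr -k1,1n | head -n 1 | cut -d' ' -f1
}

printf '0 15\n1 20\n' | active_link   # prints 1
printf '0 10\n1 10\n' | active_link   # prints 0
```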

Even if all links are working, only the one with the highest priority will see corosync traffic. Link priorities cannot be mixed, i.e. links with different priorities will not be able to communicate with each other.

Since lower priority links will not see traffic unless all higher priorities have failed, it becomes a useful strategy to specify even networks used for other tasks (VMs, storage, etc…) as low-priority links. If worst comes to worst, a higher-latency or more congested connection might be better than no connection at all.

To add a new link to a running configuration, first check how to edit the corosync.conf file.

Then, add a new ringX_addr to every node in the nodelist section. Make sure that X is the same for every node you add it to, and that the address itself is unique for each node.

Lastly, add a new interface, as shown below, to your totem section, replacing X with your link number chosen above.

Assuming you added a link with number 1, the new configuration file could look like this:

logging {
  debug: off
  to_syslog: yes
}

nodelist {

  node {
    name: due
    nodeid: 2
    quorum_votes: 1
  }

  node {
    name: tre
    nodeid: 3
    quorum_votes: 1
  }

  node {
    name: uno
    nodeid: 1
    quorum_votes: 1
  }

}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: testcluster
  config_version: 4
  ip_version: ipv4-6
  secauth: on
  version: 2
  interface {
    linknumber: 0
  }
  interface {
    linknumber: 1
  }
}

The new link will be enabled as soon as you follow the last steps to edit the corosync.conf file. A restart should not be necessary. You can check that corosync loaded the new link using:

journalctl -b -u corosync

It might be a good idea to test the new link by temporarily disconnecting the old link on one node and making sure that its status remains online while disconnected:

pvecm status

If you see a healthy cluster state, it means that your new link is being used.

5.9. Role of SSH in Proxmox VE Clusters

Proxmox VE utilizes SSH tunnels for various features.

  • Proxying console/shell sessions (node and guests)

    When you use the shell for node B while being connected to node A, you connect to a terminal proxy on node A, which is in turn connected to the login shell on node B via a non-interactive SSH tunnel.

  • VM and CT memory and local-storage migration in secure mode.

    During the migration one or more SSH tunnel(s) are established between the source and target nodes, in order to exchange migration information and transfer memory and disk contents.

  • Storage replication

Pitfalls due to automatic execution of .bashrc and siblings

In case you have a custom .bashrc, or similar files that get executed on login by the configured shell, ssh will automatically run it once the session is established successfully. This can cause unexpected behavior, as those commands may be executed with root permissions during any of the operations described above. This can cause problematic side effects!

In order to avoid such complications, it’s recommended to add a check in /root/.bashrc to make sure the session is interactive, and only then run .bashrc commands.

You can add this snippet at the beginning of your .bashrc file:

# Early exit if not running interactively to avoid side-effects!
case $- in
    *i*) ;;
      *) return;;
esac

5.10. Corosync External Vote Support

This section describes a way to deploy an external voter in a Proxmox VE cluster. When configured, the cluster can sustain more node failures without violating safety properties of the cluster communication.

For this to work there are two services involved:

  • a so called qdevice daemon which runs on each Proxmox VE node

  • an external vote daemon which runs on an independent server.

As a result you can achieve higher availability even in smaller setups (for example 2+1 nodes).

5.10.1. QDevice Technical Overview

The Corosync Quorum Device (QDevice) is a daemon which runs on each cluster node. It provides a configured number of votes to the cluster’s quorum subsystem, based on the decision of an externally running third-party arbitrator. Its primary use is to allow a cluster to sustain more node failures than standard quorum rules allow. This can be done safely, as the external device can see all nodes and thus choose only one set of nodes to give its vote. This will only be done if said set of nodes can have quorum (again) after receiving the third-party vote.

Currently, only QDevice Net is supported as a third-party arbitrator. This is a daemon which provides a vote to a cluster partition if it can reach the partition members over the network. It will only give votes to one partition of a cluster at any time. It’s designed to support multiple clusters and is almost configuration and state free. New clusters are handled dynamically and no configuration file is needed on the host running a QDevice.

The only requirements for the external host are that it needs network access to the cluster and has a corosync-qnetd package available. We provide such a package for Debian based hosts; other Linux distributions should also have a package available through their respective package manager.

Note In contrast to corosync itself, a QDevice connects to the cluster over TCP/IP. The daemon may even run outside of the cluster’s LAN and can have latencies longer than 2 ms.

5.10.2. Supported Setups

We support QDevices for clusters with an even number of nodes and recommend it for 2 node clusters, if they should provide higher availability. For clusters with an odd node count, we currently discourage the use of QDevices. The reason for this is the difference in the votes which the QDevice provides for each cluster type. Even-numbered clusters get a single additional vote, which can only increase availability: if the QDevice itself fails, you are in the same position as with no QDevice at all.

On the other hand, with an odd-numbered cluster size, the QDevice provides (N-1) votes, where N corresponds to the cluster node count. This difference makes sense: if the QDevice had only one additional vote, the cluster could get into a split-brain situation. This algorithm allows all nodes but one (and naturally the QDevice itself) to fail. However, there are two drawbacks to this:

  • If the QNet daemon itself fails, no other node may fail, or the cluster immediately loses quorum. For example, in a cluster with 15 nodes, 7 could fail before the cluster becomes inquorate. But if a QDevice is configured here and it itself fails, not a single one of the 15 nodes may fail. The QDevice acts almost as a single point of failure in this case.

  • The fact that all but one node plus the QDevice may fail sounds promising at first, but this may result in a mass recovery of HA services, which could overload the single remaining node. Furthermore, a Ceph server will stop providing services if only ((N-1)/2) nodes or fewer remain online.

If you understand the drawbacks and implications you can decide yourself if you should use this technology in an odd numbered cluster setup.
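
The vote arithmetic behind these drawbacks can be sketched as follows. The helper names are hypothetical, and the sketch assumes one vote per node plus the QDevice votes stated above (1 for even node counts, N-1 for odd ones).

```shell
# Hypothetical helpers modelling the vote arithmetic described above,
# assuming the default of one vote per node.

# Votes contributed by the QDevice: 1 for even node counts, N-1 for odd ones.
qdevice_votes() {
    if [ $(( $1 % 2 )) -eq 0 ]; then echo 1; else echo $(( $1 - 1 )); fi
}

# Votes required for quorum once the QDevice votes are counted.
quorum_with_qdevice() {
    total=$(( $1 + $(qdevice_votes "$1") ))
    echo $(( total / 2 + 1 ))
}

# Node failures tolerated while the QDevice itself is unreachable.
tolerated_without_qdevice() {
    echo $(( $1 - $(quorum_with_qdevice "$1") ))
}

quorum_with_qdevice 15         # prints 15
tolerated_without_qdevice 15   # prints 0
```

For the 15-node example, the QDevice holds 14 of 29 votes, so a single surviving node plus the QDevice reaches the quorum of 15, but without the QDevice all 15 nodes are needed.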

5.10.3. QDevice-Net Setup

We recommend running any daemon which provides votes to corosync-qdevice as an unprivileged user. Proxmox VE and Debian provide a package which is already configured to do so. The traffic between the daemon and the cluster must be encrypted to ensure a safe and secure integration of the QDevice in Proxmox VE.

First, install the corosync-qnetd package on your external server

external# apt install corosync-qnetd

and the corosync-qdevice package on all cluster nodes

pve# apt install corosync-qdevice

After that, ensure that all your nodes on the cluster are online.

You can now easily set up your QDevice by running the following command on one of the Proxmox VE nodes:

pve# pvecm qdevice setup <QDEVICE-IP>

The SSH key from the cluster will be automatically copied to the QDevice.

Note Make sure that the SSH configuration on your external server allows root login via password, if you are asked for a password during this step.

After you enter the password and all the steps are successfully completed, you will see "Done". You can check the status now:

pve# pvecm status


Votequorum information
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate Qdevice

Membership information
    Nodeid      Votes    Qdevice Name
    0x00000001          1    A,V,NMW (local)
    0x00000002          1    A,V,NMW
    0x00000000          1            Qdevice

which means the QDevice is set up.

5.10.4. Frequently Asked Questions

Tie Breaking

In case of a tie, where two same-sized cluster partitions cannot see each other but can see the QDevice, the QDevice chooses one of those partitions randomly and provides a vote to it.

Possible Negative Implications

For clusters with an even node count, there are no negative implications when using a QDevice. If it fails to work, you are no worse off than without a QDevice at all.

Adding/Deleting Nodes After QDevice Setup

If you want to add a new node or remove an existing one from a cluster with a QDevice setup, you need to remove the QDevice first. After that, you can add or remove nodes normally. Once you have a cluster with an even node count again, you can set up the QDevice again as described above.

Removing the QDevice

If you used the official pvecm tool to add the QDevice, you can remove it trivially by running:

pve# pvecm qdevice remove

5.11. Corosync Configuration

The /etc/pve/corosync.conf file plays a central role in a Proxmox VE cluster. It controls the cluster membership and its network. For further information about it, check the corosync.conf man page:

man corosync.conf

For node membership you should always use the pvecm tool provided by Proxmox VE. You may have to edit the configuration file manually for other changes. Here are a few best practice tips for doing this.

5.11.1. Edit corosync.conf

Editing the corosync.conf file is not always very straightforward. There are two on each cluster node, one in /etc/pve/corosync.conf and the other in /etc/corosync/corosync.conf. Editing the one in our cluster file system will propagate the changes to the local one, but not vice versa.

The configuration will get updated automatically as soon as the file changes. This means that changes which can be integrated in a running corosync will take effect immediately. Thus, you should always make a copy and edit that instead, to avoid triggering unintended changes when saving the file while editing.

cp /etc/pve/corosync.conf /etc/pve/corosync.conf.new

Then, open the config file with your favorite editor, such as nano or vim.tiny, which come preinstalled on every Proxmox VE node.

Note Always increment the config_version number after configuration changes; omitting this can lead to problems.
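
As a sketch, the increment can also be scripted on the working copy. The helper name bump_config_version is hypothetical, and the sketch assumes the config_version line has the simple "config_version: N" form shown in the examples in this chapter.

```shell
# Hypothetical helper: increment config_version in a working copy of
# corosync.conf. Pass the path of the copy, not the live file.
bump_config_version() {
    awk '$1 == "config_version:" { sub(/[0-9]+/, $2 + 1) } { print }' \
        "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Example: bump_config_version /etc/pve/corosync.conf.new
```

Review the resulting file before moving it into place, since the helper does not validate the rest of the configuration.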

After making the necessary changes, create another copy of the current working configuration file. This serves as a backup if the new configuration fails to apply or causes other issues.

cp /etc/pve/corosync.conf /etc/pve/corosync.conf.bak

Then move the new configuration file over the old one:

mv /etc/pve/corosync.conf.new /etc/pve/corosync.conf

You can check whether the change was applied automatically with the commands:

systemctl status corosync
journalctl -b -u corosync

If it was not, you may have to restart the corosync service via:

systemctl restart corosync

On errors check the troubleshooting section below.

5.11.2. Troubleshooting

Issue: quorum.expected_votes must be configured

When corosync starts to fail and you get the following message in the system log:

corosync[1647]:  [QUORUM] Quorum provider: corosync_votequorum failed to initialize.
corosync[1647]:  [SERV  ] Service engine 'corosync_quorum' failed to load for reason
    'configuration error: nodelist or quorum.expected_votes must be configured!'

It means that the hostname you set for corosync ringX_addr in the configuration could not be resolved.

Write Configuration When Not Quorate

If you need to change /etc/pve/corosync.conf on a node with no quorum, and you understand what you are doing, use:

pvecm expected 1

This sets the expected vote count to 1 and makes the cluster quorate. You can now fix your configuration, or revert it back to the last working backup.

This is not enough if corosync cannot start anymore. Here it is best to edit the local copy of the corosync configuration in /etc/corosync/corosync.conf so that corosync can start again. Ensure that on all nodes this configuration has the same content to avoid split brains. If you are not sure what went wrong it’s best to ask the Proxmox Community to help you.

5.11.3. Corosync Configuration Glossary


This names the different link addresses for the kronosnet connections between nodes.

5.12. Cluster Cold Start

It is obvious that a cluster is not quorate when all nodes are offline. This is a common case after a power failure.

Note It is always a good idea to use an uninterruptible power supply (“UPS”, also called “battery backup”) to avoid this state, especially if you want HA.

On node startup, the pve-guests service is started and waits for quorum. Once quorate, it starts all guests which have the onboot flag set.

When you turn on nodes, or when power comes back after a power failure, it is likely that some nodes will boot faster than others. Please keep in mind that guest startup is delayed until you reach quorum.

5.13. Guest Migration

Migrating virtual guests to other nodes is a useful feature in a cluster. There are settings to control the behavior of such migrations. This can be done via the configuration file datacenter.cfg or for a specific migration via API or command line parameters.

It makes a difference if a guest is online or offline, or if it has local resources (like a local disk).

For details about virtual machine migration, see the QEMU/KVM Migration Chapter.

For details about container migration, see the Container Migration Chapter.

5.13.1. Migration Type

The migration type defines if the migration data should be sent over an encrypted (secure) channel or an unencrypted (insecure) one. Setting the migration type to insecure means that the RAM content of a virtual guest is also transferred unencrypted, which can lead to information disclosure of critical data from inside the guest (for example, passwords or encryption keys).

Therefore, we strongly recommend using the secure channel if you do not have full control over the network and cannot guarantee that no one is eavesdropping on it.

Note Storage migration does not follow this setting. Currently, it always sends the storage content over a secure channel.

Encryption requires a lot of computing power, so this setting is often changed to insecure to achieve better performance. The impact on modern systems is lower because they implement AES encryption in hardware. The performance impact is particularly evident in fast networks, where you can transfer 10 Gbps or more.

5.13.2. Migration Network

By default, Proxmox VE uses the network in which cluster communication takes place to send the migration traffic. This is not optimal because sensitive cluster traffic can be disrupted and this network may not have the best bandwidth available on the node.

Setting the migration network parameter allows the use of a dedicated network for the entire migration traffic. In addition to the memory, this also affects the storage traffic for offline migrations.

The migration network is set as a network using CIDR notation. This has the advantage that you don’t have to set individual IP addresses for each node. Proxmox VE can determine the real address on the destination node from the network specified in the CIDR form. To enable this, the network must be specified so that each node has exactly one IP in the respective network.
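
The "exactly one IP per node" condition can be checked with a small sketch like the following. It handles IPv4 only; the helper names and the example addresses are hypothetical and are not how Proxmox VE itself performs the lookup.

```shell
# Convert a dotted IPv4 address to an integer (runs in a subshell so that
# the changed IFS does not leak into the caller).
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed if IPv4 address $1 lies inside network $2 (CIDR notation).
ip_in_cidr() {
    bits=${2#*/}
    mask=$(( (4294967295 << (32 - $bits)) & 4294967295 ))
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "${2%/*}") & mask )) ]
}

# Succeed if exactly one of the addresses $2.. lies inside network $1.
exactly_one_in_network() {
    cidr=$1; shift
    count=0
    for ip in "$@"; do
        if ip_in_cidr "$ip" "$cidr"; then count=$(( count + 1 )); fi
    done
    [ "$count" -eq 1 ]
}

# Example with made-up addresses of one node:
#   exactly_one_in_network 10.1.2.0/24 192.168.0.5 10.0.0.5 10.1.2.5
```

Running such a check with the IPv4 addresses of each node would reveal a node that has no address, or more than one, in the chosen migration network.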


We assume that we have a three-node setup with three separate networks. One for public communication with the Internet, one for cluster communication and a very fast one, which we want to use as a dedicated network for migration.

A network configuration for such a setup might look as follows:

iface eno1 inet manual

# public network
auto vmbr0
iface vmbr0 inet static
    address 192.X.Y.57
    gateway 192.X.Y.1
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0

# cluster network
auto eno2
iface eno2 inet static

# fast network
auto eno3
iface eno3 inet static

Here, we will use the fast network as the migration network. For a single migration, you can do this using the migration_network parameter of the command line tool:

# qm migrate 106 tre --online --migration_network

To configure this as the default network for all migrations in the cluster, set the migration property of the /etc/pve/datacenter.cfg file:

# use dedicated migration network
migration: secure,network=

Note The migration type must always be set when the migration network is set in /etc/pve/datacenter.cfg.

6. Proxmox Cluster File System (pmxcfs)

The Proxmox Cluster file system (“pmxcfs”) is a database-driven file system for storing configuration files, replicated in real time to all cluster nodes using corosync. We use this to store all PVE related configuration files.

Although the file system stores all data inside a persistent database on disk, a copy of the data resides in RAM. This imposes a restriction on the maximum size, which is currently 30MB. This is still enough to store the configuration of several thousand virtual machines.

This system provides the following advantages:

  • seamless replication of all configuration to all nodes in real time

  • provides strong consistency checks to avoid duplicate VM IDs

  • read-only when a node loses quorum

  • automatic updates of the corosync cluster configuration to all nodes

  • includes a distributed locking mechanism

6.1. POSIX Compatibility

The file system is based on FUSE, so the behavior is POSIX like. But some features are simply not implemented, because we do not need them:

  • you can only create regular files and directories, but no symbolic links, …

  • you can’t rename non-empty directories (because this makes it easier to guarantee that VMIDs are unique).

  • you can’t change file permissions (permissions are based on path)

  • O_EXCL creates are not atomic (like old NFS)

  • O_TRUNC creates are not atomic (FUSE restriction)

6.2. File Access Rights

All files and directories are owned by user root and have group www-data. Only root has write permissions, but group www-data can read most files. Files below the following paths:


are only accessible by root.

6.3. Technology

We use the Corosync Cluster Engine for cluster communication, and SQLite for the database file. The file system is implemented in user space using FUSE.

6.4. File System Layout

The file system is mounted at:


6.4.1. Files


Corosync cluster configuration file (previous to Proxmox VE 4.x this file was called cluster.conf)


Proxmox VE storage configuration


Proxmox VE datacenter wide configuration (keyboard layout, proxy, …)


Proxmox VE access control configuration (users/groups/…)


Proxmox VE authentication domains


Proxmox VE external metrics server configuration


Public key used by ticket system


Public certificate of cluster CA


Shadow password file


Private key used by ticket system


Private key of cluster CA


Public SSL certificate for web server (signed by cluster CA)


Private SSL key for pve-ssl.pem


Public SSL certificate (chain) for web server (optional override for pve-ssl.pem)


Private SSL key for pveproxy-ssl.pem (optional)


VM configuration data for KVM VMs


VM configuration data for LXC containers


Firewall configuration applied to all nodes


Firewall configuration for individual nodes


Firewall configuration for VMs and Containers







6.4.3. Special status files for debugging (JSON)


File versions (to detect file modifications)


Info about cluster members


List of all VMs


Cluster log (last 50 entries)


RRD data (most recent entries)

6.4.4. Enable/Disable debugging

You can enable verbose syslog messages with:

echo "1" >/etc/pve/.debug

And disable verbose syslog messages with:

echo "0" >/etc/pve/.debug

6.5. Recovery

If you have major problems with your Proxmox VE host, for example hardware issues, it can be helpful to copy the pmxcfs database file /var/lib/pve-cluster/config.db and move it to a new Proxmox VE host. On the new host (with nothing running), stop the pve-cluster service and replace the config.db file (required permissions: 0600). Then adapt /etc/hostname and /etc/hosts to match the lost Proxmox VE host, reboot, and check. (And don’t forget your VM/CT data.)

6.5.1. Remove Cluster configuration

The recommended way is to reinstall the node after you remove it from your cluster. This makes sure that all secret cluster/ssh keys and any shared configuration data are destroyed.

In some cases, you might prefer to put a node back to local mode without reinstalling it, which is described in Separate A Node Without Reinstalling.

6.5.2. Recovering/Moving Guests from Failed Nodes

For the guest configuration files in nodes/<NAME>/qemu-server/ (VMs) and nodes/<NAME>/lxc/ (containers), Proxmox VE sees the containing node <NAME> as owner of the respective guest. This concept enables the usage of local locks instead of expensive cluster-wide locks for preventing concurrent guest configuration changes.

As a consequence, if the owning node of a guest fails (e.g., because of a power outage, fencing event, ..), a regular migration is not possible (even if all the disks are located on shared storage) because such a local lock on the (dead) owning node is unobtainable. This is not a problem for HA-managed guests, as Proxmox VE’s High Availability stack includes the necessary (cluster-wide) locking and watchdog functionality to ensure correct and automatic recovery of guests from fenced nodes.

If a non-HA-managed guest has only shared disks (and no other local resources which are only available on the failed node are configured), a manual recovery is possible by simply moving the guest configuration file from the failed node’s directory in /etc/pve/ to an alive node’s directory (which changes the logical owner or location of the guest).

For example, recovering the VM with ID 100 from a dead node1 to another node node2 works with the following command executed when logged in as root on any member node of the cluster:

mv /etc/pve/nodes/node1/qemu-server/100.conf /etc/pve/nodes/node2/qemu-server/
Warning Before manually recovering a guest like this, make absolutely sure that the failed source node is really powered off/fenced. Otherwise Proxmox VE’s locking principles are violated by the mv command, which can have unexpected consequences.
Warning Guests with local disks (or other local resources which are only available on the dead node) are not recoverable like this. Either wait for the failed node to rejoin the cluster or restore such guests from backups.
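
The manual recovery step can be rehearsed without touching /etc/pve. The following is a minimal sketch, not part of Proxmox VE: the helper name recovery_cmd is hypothetical, and it only prints the mv command for review. It assumes the guest configuration belongs in the same qemu-server/ or lxc/ subdirectory on the target node, following the nodes/<NAME>/qemu-server/ and nodes/<NAME>/lxc/ layout described above.

```shell
#!/bin/sh
# recovery_cmd KIND VMID FAILED_NODE TARGET_NODE
# Hypothetical helper: prints (does NOT run) the mv command that would
# reassign a guest configuration file to a surviving node.
recovery_cmd() {
    kind=$1 vmid=$2 src=$3 dst=$4
    case $kind in
        qemu) subdir=qemu-server ;;   # VM configuration files
        lxc)  subdir=lxc ;;           # container configuration files
        *)    echo "unknown guest kind: $kind" >&2; return 1 ;;
    esac
    printf 'mv /etc/pve/nodes/%s/%s/%s.conf /etc/pve/nodes/%s/%s/\n' \
        "$src" "$subdir" "$vmid" "$dst" "$subdir"
}

# Review the command before running it by hand as root.
recovery_cmd qemu 100 node1 node2
```

Printing the command instead of executing it leaves room to confirm that the failed node is really powered off or fenced before anything is moved.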

7. Proxmox VE Storage

The Proxmox VE storage model is very flexible. Virtual machine images can either be stored on one or several local storages, or on shared storage like NFS or iSCSI (NAS, SAN). There are no limits, and you may configure as many storage pools as you like. You can use all storage technologies available for Debian Linux.

One major benefit of storing VMs on shared storage is the ability to live-migrate running machines without any downtime, as all nodes in the cluster have direct access to VM disk images. There is no need to copy VM image data, so live migration is very fast in that case.

The storage library (package libpve-storage-perl) uses a flexible plugin system to provide a common interface to all storage types. This can be easily adapted to include further storage types in the future.

7.1. Storage Types

There are basically two different classes of storage types:

File level storage

File level based storage technologies allow access to a fully featured (POSIX) file system. They are in general more flexible than any Block level storage (see below), and allow you to store content of any type. ZFS is probably the most advanced system, and it has full support for snapshots and clones.

Block level storage

Allows you to store large raw images. It is usually not possible to store other files (ISO, backups, …) on such storage types. Most modern block level storage implementations support snapshots and clones. RADOS and GlusterFS are distributed systems, replicating storage data to different nodes.

Table 2. Available storage types
Description PVE type Level Shared Snapshots Stable

ZFS (local)

Proxmox Backup

ZFS over iSCSI

1: On file based storages, snapshots are possible with the qcow2 format.

2: It is possible to use LVM on top of an iSCSI or FC-based storage. That way you get a shared LVM storage.

7.1.1. Thin Provisioning

A number of storages, and the QEMU image format qcow2, support thin provisioning. With thin provisioning activated, only the blocks that the guest system actually uses will be written to the storage.

Say, for instance, you create a VM with a 32 GB hard disk, and after installing the guest system OS, the root file system of the VM contains 3 GB of data. In that case only 3 GB are written to the storage, even if the guest VM sees a 32 GB hard drive. In this way thin provisioning allows you to create disk images which are larger than the currently available storage blocks. You can create large disk images for your VMs, and when the need arises, add more disks to your storage without resizing the VMs' file systems.

All storage types which have the “Snapshots” feature also support thin provisioning.

Caution If a storage runs full, all guests using volumes on that storage receive IO errors. This can cause file system inconsistencies and may corrupt your data. So it is advisable to avoid over-provisioning of your storage resources, or carefully observe free space to avoid such conditions.
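
To make the over-provisioning risk concrete, the following sketch computes how far a thin-provisioned storage is over-committed. The helper name overcommit_percent and the example sizes are assumptions for illustration only. It compares the sum of the provisioned disk sizes with the physical capacity, both given as whole gigabytes.

```shell
#!/bin/sh
# overcommit_percent CAPACITY_GB SIZE_GB...
# Hypothetical helper: provisioned space as a percentage of physical
# capacity. Values above 100 mean the storage can run full if all
# guests eventually use their whole disks.
overcommit_percent() {
    capacity_gb=$1
    shift
    total=0
    for size_gb in "$@"; do
        total=$((total + size_gb))
    done
    echo $((total * 100 / capacity_gb))
}

# Three 32 GB disks on an assumed 64 GB thin pool: 96 GB provisioned.
overcommit_percent 64 32 32 32   # prints 150
```

A result of 150 means the guests could request 50% more space than physically exists, which is the situation where monitoring free space becomes essential.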

7.2. Storage Configuration

All Proxmox VE related storage configuration is stored within a single text file at /etc/pve/storage.cfg. As this file is within /etc/pve/, it gets automatically distributed to all cluster nodes. So all nodes share the same storage configuration.

Sharing storage configuration makes perfect sense for shared storage, because the same “shared” storage is accessible from all nodes. But it is also useful for local storage types. In this case such local storage is available on all nodes, but it is physically different and can have totally different content.

7.2.1. Storage Pools

Each storage pool has a <type>, and is uniquely identified by its <STORAGE_ID>. A pool configuration looks like this:

<type>: <STORAGE_ID>
        <property> <value>
        <property> <value>

The <type>: <STORAGE_ID> line starts the pool definition, which is then followed by a list of properties. Most properties require a value. Some have reasonable defaults, in which case you can omit the value.

To be more specific, take a look at the default storage configuration after installation. It contains one special local storage pool named local, which refers to the directory /var/lib/vz and is always available. The Proxmox VE installer creates additional storage entries depending on the storage type chosen at installation time.

Default storage configuration (/etc/pve/storage.cfg)
dir: local
        path /var/lib/vz
        content iso,vztmpl,backup

# default image store on LVM based installation
lvmthin: local-lvm
        thinpool data
        vgname pve
        content rootdir,images

# default image store on ZFS based installation
zfspool: local-zfs
        pool rpool/data
        content images,rootdir

7.2.2. Common Storage Properties

A few storage properties are common among different storage types.


List of cluster node names where this storage is usable/accessible. One can use this property to restrict storage access to a limited set of nodes.


A storage can support several content types, for example virtual disk images, CD-ROM ISO images, container templates or container root directories. Not all storage types support all content types. One can set this property to select what this storage is used for.


KVM-Qemu VM images.


Allows you to store container data.


Container templates.


Backup files (vzdump).


ISO images


Snippet files, for example guest hook scripts


Mark storage as shared.


You can use this flag to disable the storage completely.


Deprecated, please use prune-backups instead. Maximum number of backup files per VM. Use 0 for unlimited.


Retention options for backups. For details, see Backup Retention.


Default image format (raw|qcow2|vmdk)

Warning It is not advisable to use the same storage pool on different Proxmox VE clusters. Some storage operations need exclusive access to the storage, so proper locking is required. While this is implemented within a cluster, it does not work between different clusters.

7.3. Volumes

We use a special notation to address storage data. When you allocate data from a storage pool, it returns such a volume identifier. A volume is identified by the <STORAGE_ID>, followed by a storage type dependent volume name, separated by a colon. A valid <VOLUME_ID> looks like:

local:230/example-image.raw

To get the file system path for a <VOLUME_ID> use:

pvesm path <VOLUME_ID>
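
Since the <STORAGE_ID> and the volume name are separated by a colon, a script can split a volume identifier with plain shell parameter expansion. The two helper names below are hypothetical. The sketch splits at the first colon, on the assumption that a storage ID itself never contains one.

```shell
#!/bin/sh
# Split "<STORAGE_ID>:<volume name>" at the first colon.
volid_storage() { printf '%s\n' "${1%%:*}"; }   # text before the first ":"
volid_name()    { printf '%s\n' "${1#*:}"; }    # text after the first ":"

volid=local:230/example-image.raw
volid_storage "$volid"   # prints local
volid_name "$volid"      # prints 230/example-image.raw
```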

7.3.1. Volume Ownership

There exists an ownership relation for image type volumes. Each such volume is owned by a VM or Container. For example, the volume local:230/example-image.raw is owned by VM 230. Most storage backends encode this ownership information into the volume name.

When you remove a VM or Container, the system also removes all associated volumes which are owned by that VM or Container.

7.4. Using the Command Line Interface

It is recommended to familiarize yourself with the concept behind storage pools and volume identifiers, but in real life, you are not forced to do any of those low level operations on the command line. Normally, allocation and removal of volumes is done by the VM and Container management tools.

Nevertheless, there is a command line tool called pvesm (“Proxmox VE Storage Manager”), which is able to perform common storage management tasks.

7.4.1. Examples

Add storage pools

pvesm add dir <STORAGE_ID> --path <PATH>
pvesm add nfs <STORAGE_ID> --path <PATH> --server <SERVER> --export <EXPORT>
pvesm add lvm <STORAGE_ID> --vgname <VGNAME>
pvesm add iscsi <STORAGE_ID> --portal <HOST[:PORT]> --target <TARGET>

Disable storage pools

pvesm set <STORAGE_ID> --disable 1

Enable storage pools

pvesm set <STORAGE_ID> --disable 0

Change/set storage options

pvesm set <STORAGE_ID> <OPTIONS>
pvesm set <STORAGE_ID> --shared 1
pvesm set local --format qcow2
pvesm set <STORAGE_ID> --content iso

Remove storage pools. This does not delete any data, and does not disconnect or unmount anything. It just removes the storage configuration.

pvesm remove <STORAGE_ID>

Allocate volumes

pvesm alloc <STORAGE_ID> <VMID> <name> <size> [--format <raw|qcow2>]

Allocate a 4G volume in local storage. The name is auto-generated if you pass an empty string as <name>.

pvesm alloc local <VMID> '' 4G

Free volumes

pvesm free <VOLUME_ID>
Warning This really destroys all volume data.

List storage status

pvesm status

List storage contents

pvesm list <STORAGE_ID> [--vmid <VMID>]

List volumes allocated by VMID

pvesm list <STORAGE_ID> --vmid <VMID>

List iso images

pvesm list <STORAGE_ID> --iso

List container templates

pvesm list <STORAGE_ID> --vztmpl

Show file system path for a volume

pvesm path <VOLUME_ID>

Export the volume local:103/vm-103-disk-0.qcow2 to the file target. This is mostly used internally with pvesm import. The stream format qcow2+size is different from the qcow2 format. Consequently, the exported file cannot simply be attached to a VM. This also holds for the other formats.

pvesm export local:103/vm-103-disk-0.qcow2 qcow2+size target --with-snapshots 1

7.5. Directory Backend

Storage pool type: dir

Proxmox VE can use local directories or locally mounted shares for storage. A directory is a file level storage, so you can store any content type like virtual disk images, containers, templates, ISO images or backup files.

Note You can mount additional storages via the standard Linux /etc/fstab, and then define a directory storage for that mount point. This way you can use any file system supported by Linux.

This backend assumes that the underlying directory is POSIX compatible, but nothing else. This implies that you cannot create snapshots at the storage level. But there exists a workaround for VM images using the qcow2 file format, because that format supports snapshots internally.

Tip Some storage types do not support O_DIRECT, so you can’t use cache mode none with such storages. Simply use cache mode writeback instead.

We use a predefined directory layout to store different content types into different sub-directories. This layout is used by all file level storage backends.

Table 3. Directory layout
Content type Subdir

VM images


ISO images


Container templates


Backup files




7.5.1. Configuration

This backend supports all common storage properties, and adds a property called path to specify the directory. This needs to be an absolute file system path.

Configuration Example (/etc/pve/storage.cfg)
dir: backup
        path /mnt/backup
        content backup
        maxfiles 7

The above configuration defines a storage pool called backup. That pool can be used to store up to 7 backups (maxfiles 7) per VM. The real path for the backup files is /mnt/backup/dump/....

7.5.2. File naming conventions

This backend uses a well defined naming scheme for VM images:


This specifies the owner VM.


This can be an arbitrary name (ASCII) without white space. The backend uses disk-[N] as default, where [N] is replaced by an integer to make the name unique.


Specifies the image format (raw|qcow2|vmdk).

When you create a VM template, all VM images are renamed to indicate that they are now read-only, and can be used as a base image for clones:

Note Such base images are used to generate cloned images. So it is important that those files are read-only, and never get modified. The backend changes the access mode to 0444, and sets the immutable flag (chattr +i) if the storage supports that.
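
As an illustration of the naming scheme, the sketch below splits an image file name into its parts. It assumes the pattern vm-<VMID>-<NAME>.<FORMAT> for normal images and base-<VMID>-<NAME>.<FORMAT> for template images, inferred from the examples in this chapter. The function name parse_image_name is hypothetical, and the check is simpler than whatever validation the backend itself performs.

```shell
#!/bin/sh
# parse_image_name FILE
# Prints "<kind> <VMID> <NAME> <FORMAT>" or returns 1 if FILE does not
# match the assumed vm-/base- naming pattern.
parse_image_name() {
    file=$1
    case $file in
        vm-*-*.*)   kind=vm ;;
        base-*-*.*) kind=base ;;
        *)          return 1 ;;
    esac
    rest=${file#"$kind"-}     # <VMID>-<NAME>.<FORMAT>
    vmid=${rest%%-*}          # up to the first "-"
    rest=${rest#*-}           # <NAME>.<FORMAT>
    name=${rest%.*}           # strip the last ".<FORMAT>"
    format=${rest##*.}        # text after the last "."
    case $vmid in ''|*[!0-9]*) return 1 ;; esac          # VMID must be numeric
    [ -n "$name" ] || return 1
    case $format in raw|qcow2|vmdk) ;; *) return 1 ;; esac
    printf '%s %s %s %s\n' "$kind" "$vmid" "$name" "$format"
}

parse_image_name vm-100-disk10.raw   # prints: vm 100 disk10 raw
```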

7.5.3. Storage Features

As mentioned above, most file systems do not support snapshots out of the box. To work around that problem, this backend is able to use qcow2 internal snapshot capabilities.

The same applies to clones. The backend uses the qcow2 base image feature to create clones.

Table 4. Storage features for backend dir
Content types Image formats Shared Snapshots Clones

images rootdir vztmpl iso backup snippets

raw qcow2 vmdk subvol




7.5.4. Examples

Please use the following command to allocate a 4GB image on storage local:

# pvesm alloc local 100 vm-100-disk10.raw 4G
Formatting '/var/lib/vz/images/100/vm-100-disk10.raw', fmt=raw size=4294967296
successfully created 'local:100/vm-100-disk10.raw'
Note The image name must conform to the above naming conventions.

The real file system path is shown with:

# pvesm path local:100/vm-100-disk10.raw

And you can remove the image with:

# pvesm free local:100/vm-100-disk10.raw

7.6. NFS Backend

Storage pool type: nfs

The NFS backend is based on the directory backend, so it shares most properties. The directory layout and the file naming conventions are the same. The main advantage is that you can directly configure the NFS server properties, so the backend can mount the share automatically. There is no need to modify /etc/fstab. The backend can also test if the server is online, and provides a method to query the server for exported shares.

7.6.1. Configuration

The backend supports all common storage properties, except the shared flag, which is always set. Additionally, the following properties are used to configure the NFS server:


Server IP or DNS name. To avoid DNS lookup delays, it is usually preferable to use an IP address instead of a DNS name - unless you have a very reliable DNS server, or list the server in the local /etc/hosts file.


NFS export path (as listed by pvesm nfsscan).

You can also set NFS mount options:


The local mount point (defaults to /mnt/pve/<STORAGE_ID>/).


NFS mount options (see man nfs).

Configuration Example (/etc/pve/storage.cfg)
nfs: iso-templates
        path /mnt/pve/iso-templates
        export /space/iso-templates
        options vers=3,soft
        content iso,vztmpl
Tip After an NFS request times out, NFS requests are retried indefinitely by default. This can lead to unexpected hangs on the client side. For read-only content, it is worth considering the NFS soft option, which limits the number of retries to three.

7.6.2. Storage Features

NFS does not support snapshots, but the backend uses qcow2 features to implement snapshots and cloning.

Table 5. Storage features for backend nfs
Content types Image formats Shared Snapshots Clones

images rootdir vztmpl iso backup snippets

raw qcow2 vmdk




7.6.3. Examples

You can get a list of exported NFS shares with:

# pvesm nfsscan <server>

7.7. CIFS Backend

Storage pool type: cifs

The CIFS backend extends the directory backend, so that no manual setup of a CIFS mount is needed. Such a storage can be added directly through the Proxmox VE API or the WebUI, with all our backend advantages, like server heartbeat check or comfortable selection of exported shares.

7.7.1. Configuration

The backend supports all common storage properties, except the shared flag, which is always set. Additionally, the following CIFS special properties are available:


Server IP or DNS name. Required.

Tip To avoid DNS lookup delays, it is usually preferable to use an IP address instead of a DNS name - unless you have a very reliable DNS server, or list the server in the local /etc/hosts file.

CIFS share to use (get available ones with pvesm scan cifs <address> or the WebUI). Required.


The username for the CIFS storage. Optional, defaults to ‘guest’.


The user password. Optional. It will be saved in a file only readable by root (/etc/pve/priv/storage/<STORAGE-ID>.pw).


Sets the user domain (workgroup) for this storage. Optional.


SMB protocol version. Optional, default is 3. SMB1 is not supported due to security issues.


The local mount point. Optional, defaults to /mnt/pve/<STORAGE_ID>/.

Configuration Example (/etc/pve/storage.cfg)
cifs: backup
        path /mnt/pve/backup
        share VMData
        content backup
        username anna
        smbversion 3

7.7.2. Storage Features

CIFS does not support snapshots on a storage level. But you may use qcow2 backing files if you still want to have snapshots and cloning features available.

Table 6. Storage features for backend cifs
Content types Image formats Shared Snapshots Clones

images rootdir vztmpl iso backup snippets

raw qcow2 vmdk




7.7.3. Examples

You can get a list of exported CIFS shares with:

# pvesm scan cifs <server> [--username <username>] [--password]

Then you could add this share as a storage to the whole Proxmox VE cluster with:

# pvesm add cifs <storagename> --server <server> --share <share> [--username <username>] [--password]

7.8. Proxmox Backup Server

Storage pool type: pbs

This backend allows direct integration of a Proxmox Backup Server into Proxmox VE like any other storage. A Proxmox Backup storage can be added directly through the Proxmox VE API, CLI or the web interface.

7.8.1. Configuration

The backend supports all common storage properties, except the shared flag, which is always set. Additionally, the following properties specific to Proxmox Backup Server are available:


Server IP or DNS name. Required.


The username for the Proxmox Backup Server storage. Required.

Tip Do not forget to add the realm to the username. For example, root@pam or archiver@pbs.

The user password. The value will be saved in a file under /etc/pve/priv/storage/<STORAGE-ID>.pw with access restricted to the root user. Required.


The ID of the Proxmox Backup Server datastore to use. Required.


The fingerprint of the Proxmox Backup Server API TLS certificate. You can get it in the Servers Dashboard or using the proxmox-backup-manager cert info command. Required for self-signed certificates or any other certificate where the host does not trust the server’s CA.


A key to encrypt the backup data from the client side. Currently, only keys that are not password protected (no key derivation function (kdf)) are supported. The key will be saved in a file under /etc/pve/priv/storage/<STORAGE-ID>.enc with access restricted to the root user. Use the magic value autogen to automatically generate a new one using proxmox-backup-client key create --kdf none <path>. Optional.

Configuration Example (/etc/pve/storage.cfg)
pbs: backup
        datastore main
        server enya.proxmox.com
        content backup
        fingerprint 09:54:ef:..snip..:88:af:47:fe:4c:3b:cf:8b:26:88:0b:4e:3c:b2
        maxfiles 0
        username archiver@pbs

7.8.2. Storage Features

Proxmox Backup Server only supports backups, which can be block-level or file-level based. Proxmox VE uses block-level backups for virtual machines and file-level backups for containers.

Table 7. Storage features for backend pbs
Content types Image formats Shared Snapshots Clones






7.8.3. Encryption


Optionally, you can configure client-side encryption with AES-256 in GCM mode. Encryption can be configured either via the web interface, or on the CLI with the encryption-key option (see above). The key will be saved in the file /etc/pve/priv/storage/<STORAGE-ID>.enc, which is only accessible by the root user.

Warning Without their key, backups will be inaccessible. Thus, you should keep keys ordered and in a place that is separate from the contents being backed up. It can happen, for example, that you back up an entire system, using a key on that system. If the system then becomes inaccessible for any reason and needs to be restored, this will not be possible as the encryption key will be lost along with the broken system.

It is recommended that you keep your key safe, but easily accessible, in order for quick disaster recovery. For this reason, the best place to store it is in your password manager, where it is immediately recoverable. As a backup to this, you should also save the key to a USB drive and store that in a secure place. This way, it is detached from any system, but is still easy to recover from, in case of emergency. Finally, in preparation for the worst case scenario, you should also consider keeping a paper copy of your key locked away in a safe place. The paperkey subcommand can be used to create a QR encoded version of your key. The following command sends the output of the paperkey command to a text file, for easy printing.

# proxmox-backup-client key paperkey /etc/pve/priv/storage/<STORAGE-ID>.enc --output-format text > qrkey.txt

Because the encryption is managed on the client side, you can use the same datastore on the server for unencrypted backups and encrypted backups, even if they are encrypted with different keys. However, deduplication between backups with different keys is not possible, so it is often better to create separate datastores.

Note Do not use encryption if there is no benefit from it, for example, when you are running the server locally in a trusted network. It is always easier to recover from unencrypted backups.

7.8.4. Example: Add Storage over CLI

You can add a Proxmox Backup Server datastore as a storage to the whole Proxmox VE cluster with:

# pvesm add pbs <id> --server <server> --datastore <datastore> --username <username> --fingerprint 00:B4:... --password

7.9. GlusterFS Backend

Storage pool type: glusterfs

GlusterFS is a scalable network file system. The system uses a modular design, runs on commodity hardware, and can provide a highly available enterprise storage at low cost. Such a system is capable of scaling to several petabytes, and can handle thousands of clients.

Note After a node/brick crash, GlusterFS does a full rsync to make sure data is consistent. This can take a very long time with large files, so this backend is not suitable to store large VM images.

7.9.1. Configuration

The backend supports all common storage properties, and adds the following GlusterFS specific options:


GlusterFS volfile server IP or DNS name.


Backup volfile server IP or DNS name.


GlusterFS Volume.


GlusterFS transport: tcp, unix or rdma

Configuration Example (/etc/pve/storage.cfg)
glusterfs: Gluster
        volume glustervol
        content images,iso

7.9.2. File naming conventions

The directory layout and the file naming conventions are inherited from the dir backend.

7.9.3. Storage Features

The storage provides a file level interface, but no native snapshot/clone implementation.

Table 8. Storage features for backend glusterfs
Content types Image formats Shared Snapshots Clones

images vztmpl iso backup snippets

raw qcow2 vmdk




7.10. Local ZFS Pool Backend

Storage pool type: zfspool

This backend allows you to access local ZFS pools (or ZFS file systems inside such pools).

7.10.1. Configuration

The backend supports the common storage properties content, nodes, disable, and the following ZFS specific properties:


Select the ZFS pool/filesystem. All allocations are done within that pool.


Set ZFS blocksize parameter.


Use ZFS thin-provisioning. A sparse volume is a volume whose reservation is not equal to the volume size.


The mount point of the ZFS pool/filesystem. Changing this does not affect the mountpoint property of the dataset seen by zfs. Defaults to /<pool>.

Configuration Example (/etc/pve/storage.cfg)
zfspool: vmdata
        pool tank/vmdata
        content rootdir,images

7.10.2. File naming conventions

The backend uses the following naming scheme for VM images:

vm-<VMID>-<NAME>      // normal VM images
base-<VMID>-<NAME>    // template VM image (read-only)
subvol-<VMID>-<NAME>  // subvolumes (ZFS filesystem for containers)

This specifies the owner VM.


This can be an arbitrary name (ASCII) without white space. The backend uses disk[N] as default, where [N] is replaced by an integer to make the name unique.

7.10.3. Storage Features

ZFS is probably the most advanced storage type regarding snapshot and cloning. The backend uses ZFS datasets for both VM images (format raw) and container data (format subvol). ZFS properties are inherited from the parent dataset, so you can simply set defaults on the parent dataset.

Table 9. Storage features for backend zfs
Content types Image formats Shared Snapshots Clones

images rootdir

raw subvol




7.10.4. Examples

It is recommended to create an extra ZFS file system to store your VM images:

# zfs create tank/vmdata

To enable compression on that newly allocated file system:

# zfs set compression=on tank/vmdata

You can get a list of available ZFS filesystems with:

# pvesm zfsscan

7.11. LVM Backend

Storage pool type: lvm

LVM is a light software layer on top of hard disks and partitions. It can be used to split available disk space into smaller logical volumes. LVM is widely used on Linux and makes managing hard drives easier.

Another use case is to put LVM on top of a big iSCSI LUN. That way you can easily manage space on that iSCSI LUN, which would not be possible otherwise, because the iSCSI specification does not define a management interface for space allocation.

7.11.1. Configuration

The LVM backend supports the common storage properties content, nodes, disable, and the following LVM specific properties:


LVM volume group name. This must point to an existing volume group.


Base volume. This volume is automatically activated before accessing the storage. This is mostly useful when the LVM volume group resides on a remote iSCSI server.


Zero-out data when removing LVs. When removing a volume, this makes sure that all data gets erased.


Wipe throughput (cstream -t parameter value).

Configuration Example (/etc/pve/storage.cfg)
lvm: myspace
        vgname myspace
        content rootdir,images

7.11.2. File naming conventions

The backend uses basically the same naming conventions as the ZFS pool backend.

vm-<VMID>-<NAME>      // normal VM images

7.11.3. Storage Features

LVM is a typical block storage, but this backend does not support snapshots and clones. Unfortunately, normal LVM snapshots are quite inefficient, because they interfere with all writes on the entire volume group during snapshot time.

One big advantage is that you can use it on top of a shared storage, for example, an iSCSI LUN. The backend itself implements proper cluster-wide locking.

Tip The newer LVM-thin backend allows snapshots and clones, but does not support shared storage.
Table 10. Storage features for backend lvm
Content types Image formats Shared Snapshots Clones

images rootdir





7.11.4. Examples

List available volume groups:

# pvesm lvmscan

7.12. LVM thin Backend

Storage pool type: lvmthin

LVM normally allocates blocks when you create a volume. LVM thin pools instead allocate blocks when they are written. This behaviour is called thin-provisioning, because volumes can be much larger than physically available space.

You can use the normal LVM command line tools to manage and create LVM thin pools (see man lvmthin for details). Assuming you already have an LVM volume group called pve, the following commands create a new LVM thin pool (size 100G) called data:

lvcreate -L 100G -n data pve
lvconvert --type thin-pool pve/data

7.12.1. Configuration

The LVM thin backend supports the common storage properties content, nodes, disable, and the following LVM specific properties:


LVM volume group name. This must point to an existing volume group.


The name of the LVM thin pool.

Configuration Example (/etc/pve/storage.cfg)
lvmthin: local-lvm
        thinpool data
        vgname pve
        content rootdir,images

7.12.2. File naming conventions

The backend uses basically the same naming conventions as the ZFS pool backend.

vm-<VMID>-<NAME>      // normal VM images

7.12.3. Storage Features

LVM thin is a block storage, but fully supports snapshots and clones efficiently. New volumes are automatically initialized with zero.

Note that LVM thin pools cannot be shared across multiple nodes, so you can only use them as local storage.

Table 11. Storage features for backend lvmthin
Content types Image formats Shared Snapshots Clones

images rootdir





7.12.4. Examples

List available LVM thin pools on volume group pve:

# pvesm lvmthinscan pve

7.13. Open-iSCSI initiator

Storage pool type: iscsi

iSCSI is a widely employed technology used to connect to storage servers. Almost all storage vendors support iSCSI. There are also open source iSCSI target solutions available, e.g. OpenMediaVault, which is based on Debian.

To use this backend, you need to install the Open-iSCSI (open-iscsi) package. This is a standard Debian package, but it is not installed by default to save resources.

# apt-get install open-iscsi

Low-level iSCSI management tasks can be done using the iscsiadm tool.

7.13.1. Configuration

The backend supports the common storage properties content, nodes, disable, and the following iSCSI specific properties:


iSCSI portal (IP or DNS name with optional port).


iSCSI target.

Configuration Example (/etc/pve/storage.cfg)
iscsi: mynas
     target iqn.2006-01.openfiler.com:tsn.dcb5aaaddd
     content none
Tip If you want to use LVM on top of iSCSI, it makes sense to set content none. That way it is not possible to create VMs using iSCSI LUNs directly.

7.13.2. File naming conventions

The iSCSI protocol does not define an interface to allocate or delete data. Instead, that needs to be done on the target side and is vendor specific. The target simply exports them as numbered LUNs. So Proxmox VE iSCSI volume names just encode some information about the LUN as seen by the Linux kernel.

7.13.3. Storage Features

iSCSI is a block level type storage, and provides no management interface. So it is usually best to export one big LUN, and set up LVM on top of that LUN. You can then use the LVM plugin to manage the storage on that iSCSI LUN.

Table 12. Storage features for backend iscsi
Content types Image formats Shared Snapshots Clones

images none





7.13.4. Examples

Scan a remote iSCSI portal and return a list of possible targets:

pvesm scan iscsi <HOST[:PORT]>
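
The <HOST[:PORT]> argument can be taken apart in a script as sketched below. The helper names are hypothetical. The fallback to port 3260, the standard registered iSCSI port, is an assumption of this sketch rather than documented pvesm behaviour. IPv6 literals are not handled.

```shell
#!/bin/sh
# Split a portal given as HOST or HOST:PORT (IPv4 address or DNS name).
portal_host() { printf '%s\n' "${1%%:*}"; }
portal_port() {
    case $1 in
        *:*) printf '%s\n' "${1##*:}" ;;   # explicit port given
        *)   echo 3260 ;;                  # assumed default: standard iSCSI port
    esac
}

portal_host 10.0.0.5:3261   # prints 10.0.0.5
portal_port 10.0.0.5:3261   # prints 3261
portal_port nas.example     # prints 3260
```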

7.14. User Mode iSCSI Backend

Storage pool type: iscsidirect

This backend provides basically the same functionality as the Open-iSCSI backed, but uses a user-level library (package libiscsi2) to implement it.

Note that no kernel drivers are involved, so this can be viewed as a performance optimization. The drawback is that you cannot use LVM on top of such an iSCSI LUN, so you need to manage all space allocations on the storage server side.

7.14.1. Configuration

The user mode iSCSI backend uses the same configuration options as the Open-iSCSI backend.

Configuration Example (/etc/pve/storage.cfg)
iscsidirect: faststore
     target iqn.2006-01.openfiler.com:tsn.dcb5aaaddd

7.14.2. Storage Features

Note This backend works with VMs only. Containers cannot use this driver.
Table 13. Storage features for backend iscsidirect
Content types Image formats Shared Snapshots Clones






7.15. Ceph RADOS Block Devices (RBD)

Storage pool type: rbd

Ceph is a distributed object store and file system designed to provide excellent performance, reliability and scalability. RADOS block devices implement a feature rich block level storage, and you get the following advantages:

  • thin provisioning

  • resizable volumes

  • distributed and redundant (striped over multiple OSDs)

  • full snapshot and clone capabilities

  • self healing

  • no single point of failure

  • scalable to the exabyte level

  • kernel and user space implementation available

Note For smaller deployments, it is also possible to run Ceph services directly on your Proxmox VE nodes. Recent hardware has plenty of CPU power and RAM, so running storage services and VMs on the same node is possible.

7.15.1. Configuration

This backend supports the common storage properties nodes, disable, content, and the following rbd specific properties:


monhost

List of monitor daemon IPs. Optional, only needed if Ceph is not running on the PVE cluster.

pool

Ceph pool name.

username

RBD user ID. Optional, only needed if Ceph is not running on the PVE cluster. Note that only the user ID should be used. The "client." type prefix must be left out.

krbd

Enforce access to rados block devices through the krbd kernel module. Optional.

Note Containers will use krbd independent of the option value.
Configuration Example for an external Ceph cluster (/etc/pve/storage.cfg)
rbd: ceph-external
        pool ceph-external
        content images
        username admin
Tip You can use the rbd utility to do low-level management tasks.

7.15.2. Authentication

If you use cephx authentication, you need to copy the keyfile from your external Ceph cluster to a Proxmox VE host.

Create the directory /etc/pve/priv/ceph with

mkdir /etc/pve/priv/ceph

Then copy the keyring

scp <cephserver>:/etc/ceph/ceph.client.admin.keyring /etc/pve/priv/ceph/<STORAGE_ID>.keyring

The keyring must be named to match your <STORAGE_ID>. Copying the keyring generally requires root privileges.

If Ceph is installed locally on the PVE cluster, this is done automatically by pveceph or in the GUI.

7.15.3. Storage Features

The rbd backend is a block level storage, and implements full snapshot and clone functionality.

Table 14. Storage features for backend rbd
Content types Image formats Shared Snapshots Clones

images rootdir





7.16. Ceph Filesystem (CephFS)

Storage pool type: cephfs

CephFS implements a POSIX-compliant filesystem, using a Ceph storage cluster to store its data. As CephFS builds upon Ceph, it shares most of its properties. This includes redundancy, scalability, self-healing, and high availability.

Tip Proxmox VE can manage Ceph setups, which makes configuring a CephFS storage easier. As modern hardware offers a lot of processing power and RAM, running storage services and VMs on the same node is possible without a significant performance impact.

To use the CephFS storage plugin, you must replace the stock Debian Ceph client, by adding our Ceph repository. Once added, run apt update, followed by apt dist-upgrade, in order to get the newest packages.

Warning Please ensure that there are no other Ceph repositories configured. Otherwise the installation will fail or there will be mixed package versions on the node, leading to unexpected behavior.

7.16.1. Configuration

This backend supports the common storage properties nodes, disable, content, as well as the following cephfs specific properties:


monhost

List of monitor daemon addresses. Optional, only needed if Ceph is not running on the PVE cluster.

path

The local mount point. Optional, defaults to /mnt/pve/<STORAGE_ID>/.

username

Ceph user id. Optional, only needed if Ceph is not running on the PVE cluster, where it defaults to admin.

subdir

CephFS subdirectory to mount. Optional, defaults to /.

fuse

Access CephFS through FUSE, instead of the kernel client. Optional, defaults to 0.

Configuration example for an external Ceph cluster (/etc/pve/storage.cfg)
cephfs: cephfs-external
        path /mnt/pve/cephfs-external
        content backup
        username admin
Note Don’t forget to set up the client’s secret key file, if cephx was not disabled.

7.16.2. Authentication

If you use cephx authentication, which is enabled by default, you need to copy the secret from your external Ceph cluster to a Proxmox VE host.

Create the directory /etc/pve/priv/ceph with

mkdir /etc/pve/priv/ceph

Then copy the secret

scp cephfs.secret <proxmox>:/etc/pve/priv/ceph/<STORAGE_ID>.secret

The secret must be renamed to match your <STORAGE_ID>. Copying the secret generally requires root privileges. The file must only contain the secret key itself, as opposed to the rbd backend which also contains a [client.userid] section.

A secret can be received from the Ceph cluster (as Ceph admin) by issuing the command below, where userid is the client ID that has been configured to access the cluster. For further information on Ceph user management, see the Ceph docs
[Ceph user management https://docs.ceph.com/en/nautilus/rados/operations/user-management/]

ceph auth get-key client.userid > cephfs.secret

If Ceph is installed locally on the PVE cluster, that is, it was set up using pveceph, this is done automatically.
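The difference between the two file formats (a keyring with a [client.userid] section for rbd, a bare key for cephfs) can be illustrated with a small filter. This is only a sketch for the case where a keyring file is already at hand; ceph auth get-key, as shown above, remains the direct way to obtain the secret. The function name keyring_to_secret is hypothetical, and the sketch assumes the usual "key = <value>" line inside the keyring.

```shell
# keyring_to_secret: read a Ceph keyring on stdin and print only the
# value of its first "key = ..." line, i.e. the bare secret that the
# cephfs backend expects in <STORAGE_ID>.secret.
# Assumption: the keyring contains a line of the form "key = <value>".
keyring_to_secret() {
    sed -n 's/^[[:space:]]*key[[:space:]]*=[[:space:]]*//p' | head -n 1
}

# Usage (paths are placeholders, not run here):
#   keyring_to_secret < ceph.client.userid.keyring > cephfs.secret
```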

7.16.3. Storage Features

The cephfs backend is a POSIX-compliant filesystem, on top of a Ceph cluster.

Table 15. Storage features for backend cephfs
Content types Image formats Shared Snapshots Clones

vztmpl iso backup snippets





[1] While no known bugs exist, snapshots are not yet guaranteed to be stable, as they lack sufficient testing.

8. Deploy Hyper-Converged Ceph Cluster


Proxmox VE unifies your compute and storage systems, that is, you can use the same physical nodes within a cluster for both computing (processing VMs and containers) and replicated storage. The traditional silos of compute and storage resources can be wrapped up into a single hyper-converged appliance. Separate storage networks (SANs) and connections via network attached storage (NAS) disappear. With the integration of Ceph, an open source software-defined storage platform, Proxmox VE has the ability to run and manage Ceph storage directly on the hypervisor nodes.

Ceph is a distributed object store and file system designed to provide excellent performance, reliability and scalability.

Some advantages of Ceph on Proxmox VE are:
  • Easy setup and management via CLI and GUI

  • Thin provisioning

  • Snapshot support

  • Self healing

  • Scalable to the exabyte level

  • Setup pools with different performance and redundancy characteristics

  • Data is replicated, making it fault tolerant

  • Runs on commodity hardware

  • No need for hardware RAID controllers

  • Open source

For small to medium-sized deployments, it is possible to install a Ceph server for RADOS Block Devices (RBD) directly on your Proxmox VE cluster nodes (see Ceph RADOS Block Devices (RBD)). Recent hardware has a lot of CPU power and RAM, so running storage services and VMs on the same node is possible.

To simplify management, we provide pveceph - a tool for installing and managing Ceph services on Proxmox VE nodes.

Ceph consists of multiple daemons. For use as an RBD storage, these are:
  • Ceph Monitor (ceph-mon)

  • Ceph Manager (ceph-mgr)

  • Ceph OSD (ceph-osd; Object Storage Daemon)

Tip We highly recommend getting familiar with Ceph
[Ceph intro https://docs.ceph.com/en/nautilus/start/intro/]
, its architecture
[Ceph architecture https://docs.ceph.com/en/nautilus/architecture/]
and vocabulary
[Ceph glossary https://docs.ceph.com/en/nautilus/glossary]

8.1. Precondition

To build a hyper-converged Proxmox + Ceph Cluster, you must use at least three (preferably) identical servers for the setup.

Check also the recommendations from Ceph’s website.


CPU

A high CPU core frequency reduces latency and should be preferred. As a simple rule of thumb, you should assign a CPU core (or thread) to each Ceph service to provide enough resources for stable and durable Ceph performance.


Memory

Especially in a hyper-converged setup, the memory consumption needs to be carefully monitored. In addition to the predicted memory usage of virtual machines and containers, you must also account for having enough memory available for Ceph to provide excellent and stable performance.

As a rule of thumb, an OSD will use about 1 GiB of memory for roughly 1 TiB of data, especially during recovery, re-balancing or backfilling.

The daemon itself will use additional memory. The Bluestore backend of the daemon requires by default 3-5 GiB of memory (adjustable). In contrast, the legacy Filestore backend uses the OS page cache and the memory consumption is generally related to PGs of an OSD daemon.
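The two rules of thumb above can be combined into a rough per-node estimate. The sketch below is only an illustration: the function name osd_memory_gib is hypothetical, and the default of 4 GiB for the BlueStore base memory is an assumption chosen from the middle of the 3-5 GiB range mentioned above.

```shell
# osd_memory_gib: rough per-node memory budget for Ceph OSDs, in GiB.
#   $1 = number of OSDs on the node
#   $2 = data per OSD in TiB (1 GiB of memory per TiB, rule of thumb)
#   $3 = BlueStore base memory per OSD in GiB (assumed default: 4)
osd_memory_gib() {
    osds=$1
    tib_per_osd=$2
    base_gib=${3:-4}
    echo $(( osds * (tib_per_osd + base_gib) ))
}

# Example: 4 OSDs holding about 4 TiB each
#   osd_memory_gib 4 4
```

The result has to be added to the memory planned for virtual machines and containers on the same node.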


Network

We recommend a network bandwidth of at least 10 GbE, used exclusively for Ceph. A meshed network setup
[Full Mesh Network for Ceph https://pve.proxmox.com/wiki/Full_Mesh_Network_for_Ceph_Server]
is also an option if there are no 10 GbE switches available.

The volume of traffic, especially during recovery, will interfere with other services on the same network and may even break the Proxmox VE cluster stack.

Furthermore, you should estimate your bandwidth needs. While one HDD might not saturate a 1 Gb link, multiple HDD OSDs per node can, and modern NVMe SSDs will even saturate 10 Gbps of bandwidth quickly. Deploying a network capable of even more bandwidth will ensure that this isn’t your bottleneck and won’t be anytime soon. 25, 40 or even 100 Gbps are possible.
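A back-of-the-envelope comparison of disk throughput and link speed can make this estimate concrete. The sketch below is hypothetical (the function name link_headroom_mbps is not a Proxmox VE or Ceph tool), the per-disk throughput is a value you have to measure or assume yourself, and protocol overhead as well as replication traffic are ignored.

```shell
# link_headroom_mbps: link capacity minus aggregate disk throughput, MB/s.
#   $1 = number of OSD disks in the node
#   $2 = sustained throughput per disk in MB/s (assumed or measured)
#   $3 = link speed in Gbit/s
# A negative result means the disks alone can saturate the link.
link_headroom_mbps() {
    disks=$1
    mbps_per_disk=$2
    link_gbit=$3
    echo $(( link_gbit * 1000 / 8 - disks * mbps_per_disk ))
}

# Example with an assumed 100 MB/s per HDD on a 1 Gbit/s link:
#   link_headroom_mbps 4 100 1
```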


Disks

When planning the size of your Ceph cluster, it is important to take the recovery time into consideration. Especially with small clusters, recovery might take long. It is recommended that you use SSDs instead of HDDs in small setups to reduce recovery time, minimizing the likelihood of a subsequent failure event during recovery.

In general, SSDs will provide more IOPS than spinning disks. With this in mind, in addition to the higher cost, it may make sense to implement a class based separation of pools. Another way to speed up OSDs is to use a faster disk as a journal or DB/Write-Ahead-Log device, see creating Ceph OSDs. If a faster disk is used for multiple OSDs, a proper balance between OSD and WAL / DB (or journal) disk must be selected, otherwise the faster disk becomes the bottleneck for all linked OSDs.

Aside from the disk type, Ceph performs best with an evenly sized and evenly distributed number of disks per node. For example, 4 x 500 GB disks within each node is better than a mixed setup with a single 1 TB and three 250 GB disks.

You also need to balance OSD count and single OSD capacity. More capacity allows you to increase storage density, but it also means that a single OSD failure forces Ceph to recover more data at once.

Avoid RAID

As Ceph handles data object redundancy and multiple parallel writes to disks (OSDs) on its own, using a RAID controller normally doesn’t improve performance or availability. On the contrary, Ceph is designed to handle whole disks on its own, without any abstraction in between. RAID controllers are not designed for the Ceph workload and may complicate things and sometimes even reduce performance, as their write and caching algorithms may interfere with the ones from Ceph.

Warning Avoid RAID controllers. Use a host bus adapter (HBA) instead.
Note The above recommendations should be seen as rough guidance for choosing hardware. It is therefore still essential to adapt them to your specific needs. You should test your setup and monitor health and performance continuously.

8.2. Initial Ceph Installation & Configuration

8.2.1. Using the Web-based Wizard


With Proxmox VE you have the benefit of an easy to use installation wizard for Ceph. Click on one of your cluster nodes and navigate to the Ceph section in the menu tree. If Ceph is not already installed, you will see a prompt offering to do so.

The wizard is divided into multiple sections, where each needs to finish successfully, in order to use Ceph.

First you need to choose which Ceph version you want to install. Prefer the one from your other nodes, or the newest if this is the first node on which you install Ceph.

After starting the installation, the wizard will download and install all the required packages from Proxmox VE’s Ceph repository.


After finishing the installation step, you will need to create a configuration. This step is only needed once per cluster, as this configuration is distributed automatically to all remaining cluster members through Proxmox VE’s clustered configuration file system (pmxcfs).

The configuration step includes the following settings:

  • Public Network: You can set up a dedicated network for Ceph. This setting is required. Separating your Ceph traffic is highly recommended. Otherwise, Ceph traffic could interfere with other latency-dependent services such as cluster communication, and that traffic may in turn decrease Ceph’s performance.

  • Cluster Network: As an optional step, you can go even further and separate the OSD replication & heartbeat traffic as well. This will relieve the public network and could lead to significant performance improvements, especially in large clusters.


You have two more options which are considered advanced and therefore should only be changed if you know what you are doing.

  • Number of replicas: Defines how often an object is replicated

  • Minimum replicas: Defines the minimum number of required replicas for I/O to be marked as complete.

Additionally, you need to choose your first monitor node. This step is required.

That’s it. You should now see a success page as the last step, with further instructions on how to proceed. Your system is now ready to start using Ceph. To get started, you will need to create some additional monitors, OSDs and at least one pool.

The rest of this chapter will guide you through getting the most out of your Proxmox VE based Ceph setup. This includes the aforementioned tips and more, such as CephFS, which is a helpful addition to your new Ceph cluster.

8.2.2. CLI Installation of Ceph Packages

As an alternative to the recommended Proxmox VE Ceph installation wizard available in the web-interface, you can use the following CLI command on each node:

pveceph install

This sets up an apt package repository in /etc/apt/sources.list.d/ceph.list and installs the required software.

8.2.3. Initial Ceph configuration via CLI

Use the Proxmox VE Ceph installation wizard (recommended) or run the following command on one node:

pveceph init --network <CIDR>

This creates an initial configuration at /etc/pve/ceph.conf with a dedicated network for Ceph. This file is automatically distributed to all Proxmox VE nodes, using pmxcfs. The command also creates a symbolic link at /etc/ceph/ceph.conf, which points to that file. Thus, you can simply run Ceph commands without the need to specify a configuration file.

8.3. Ceph Monitor


The Ceph Monitor (MON)
[Ceph Monitor https://docs.ceph.com/en/nautilus/start/intro/]
maintains a master copy of the cluster map. For high availability, you need at least 3 monitors. One monitor will already be installed if you used the installation wizard. You won’t need more than 3 monitors, as long as your cluster is small to medium-sized. Only really large clusters will require more than this.
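The recommendation of three monitors follows from the fact that monitors need a majority (more than half of all monitors) to form a quorum. A minimal sketch of that arithmetic, with a hypothetical function name:

```shell
# mon_failures_tolerated: how many monitors may fail while the
# remaining ones still form a majority (more than half of all monitors).
#   $1 = total number of monitors
mon_failures_tolerated() {
    mons=$1
    quorum=$(( mons / 2 + 1 ))
    echo $(( mons - quorum ))
}

# Example:
#   mon_failures_tolerated 3
```

With one or two monitors no failure is tolerated, which is why a third monitor is the first step that adds availability.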

8.3.1. Create Monitors

On each node where you want to place a monitor (three monitors are recommended), create one by using the Ceph → Monitor tab in the GUI or run:

pveceph mon create

8.3.2. Destroy Monitors

To remove a Ceph Monitor via the GUI, first select a node in the tree view and go to the Ceph → Monitor panel. Select the MON and click the Destroy button.

To remove a Ceph Monitor via the CLI, first connect to the node on which the MON is running. Then execute the following command:

pveceph mon destroy
Note At least three Monitors are needed for quorum.

8.4. Ceph Manager

The Manager daemon runs alongside the monitors. It provides an interface to monitor the cluster. Since the release of Ceph luminous, at least one ceph-mgr
[Ceph Manager https://docs.ceph.com/en/nautilus/mgr/]
daemon is required.

8.4.1. Create Manager

Multiple Managers can be installed, but only one Manager is active at any given time.

pveceph mgr create
Note It is recommended to install the Ceph Manager on the monitor nodes. For high availability install more than one manager.

8.4.2. Destroy Manager

To remove a Ceph Manager via the GUI, first select a node in the tree view and go to the Ceph → Monitor panel. Select the Manager and click the Destroy button.

To remove a Ceph Manager via the CLI, first connect to the node on which the Manager is running. Then execute the following command:

pveceph mgr destroy
Note While a manager is not a hard-dependency, it is crucial for a Ceph cluster, as it handles important features like PG-autoscaling, device health monitoring, telemetry and more.

8.5. Ceph OSDs


Ceph Object Storage Daemons store objects for Ceph over the network. It is recommended to use one OSD per physical disk.

8.5.1. Create OSDs

You can create an OSD either via the Proxmox VE web-interface or via the CLI using pveceph. For example:

pveceph osd create /dev/sd[X]
Tip We recommend a Ceph cluster with at least three nodes and at least 12 OSDs, evenly distributed among the nodes.

If the disk was in use before (for example, for ZFS or as an OSD) you first need to zap all traces of that usage. To remove the partition table, boot sector and any other OSD leftover, you can use the following command:

ceph-volume lvm zap /dev/sd[X] --destroy
Warning The above command will destroy all data on the disk!
Ceph Bluestore

Starting with the Ceph Kraken release, a new Ceph OSD storage type was introduced called Bluestore
[Ceph Bluestore https://ceph.com/community/new-luminous-bluestore/].
This is the default when creating OSDs since Ceph Luminous.

pveceph osd create /dev/sd[X]
Block.db and block.wal

If you want to use a separate DB/WAL device for your OSDs, you can specify it through the -db_dev and -wal_dev options. The WAL is placed with the DB, if not specified separately.

pveceph osd create /dev/sd[X] -db_dev /dev/sd[Y] -wal_dev /dev/sd[Z]

You can directly choose the size of those with the -db_size and -wal_size parameters respectively. If they are not given, the following values (in order) will be used:

  • bluestore_block_{db,wal}_size from Ceph configuration…

    • … database, section osd

    • … database, section global

    • … file, section osd

    • … file, section global

  • 10% (DB)/1% (WAL) of OSD size

Note The DB stores BlueStore’s internal metadata, and the WAL is BlueStore’s internal journal or write-ahead log. It is recommended to use a fast SSD or NVRAM for better performance.
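The last fallback in the list above (10% of the OSD size for the DB, 1% for the WAL) is simple arithmetic. The sketch below only illustrates that fallback and does not query any Ceph configuration. The function name default_db_wal_gib is hypothetical.

```shell
# default_db_wal_gib: fallback sizes used when neither -db_size/-wal_size
# nor a bluestore_block_{db,wal}_size setting is present:
# 10% of the OSD size for the DB and 1% for the WAL.
#   $1 = OSD size in GiB
# Prints "<db_gib> <wal_gib>" (integer division, so values round down).
default_db_wal_gib() {
    osd_gib=$1
    echo "$(( osd_gib / 10 )) $(( osd_gib / 100 ))"
}

# Example for a 4000 GiB OSD:
#   default_db_wal_gib 4000
```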
Ceph Filestore

Before Ceph Luminous, Filestore was used as the default storage type for Ceph OSDs. Starting with Ceph Nautilus, Proxmox VE does not support creating such OSDs with pveceph anymore. If you still want to create filestore OSDs, use ceph-volume directly.

ceph-volume lvm create --filestore --data /dev/sd[X] --journal /dev/sd[Y]

8.5.2. Destroy OSDs

To remove an OSD via the GUI, first select a Proxmox VE node in the tree view and go to the Ceph → OSD panel. Then select the OSD to destroy and click the OUT button. Once the OSD status has changed from in to out, click the STOP button. Finally, after the status has changed from up to down, select Destroy from the More drop-down menu.

To remove an OSD via the CLI run the following commands.

ceph osd out <ID>
systemctl stop ceph-osd@<ID>.service
Note The first command instructs Ceph not to include the OSD in the data distribution. The second command stops the OSD service. Up to this point, no data is lost.

The following command destroys the OSD. Specify the -cleanup option to additionally destroy the partition table.

pveceph osd destroy <ID>
Warning The above command will destroy all data on the disk!

8.6. Ceph Pools


A pool is a logical group for storing objects. It holds a collection of objects, known as Placement Groups (PG, pg_num).

8.6.1. Create and Edit Pools

You can create and edit pools from the command line or the web-interface of any Proxmox VE host under Ceph → Pools.

When no options are given, we set a default of 128 PGs, a size of 3 replicas and a min_size of 2 replicas, to ensure no data loss occurs if any OSD fails.

Warning Do not set a min_size of 1. A replicated pool with min_size of 1 allows I/O on an object when it has only 1 replica, which could lead to data loss, incomplete PGs or unfound objects.
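The effect of size and min_size can be sketched with two small calculations. Both functions are hypothetical helpers, and the capacity figure is a simplification that ignores Ceph's fullness thresholds and uneven data distribution.

```shell
# pool_usable_gib: approximate usable capacity of a replicated pool,
# i.e. raw capacity divided by the number of replicas.
#   $1 = raw capacity in GiB, $2 = size (replicas, default 3)
pool_usable_gib() {
    raw_gib=$1
    size=${2:-3}
    echo $(( raw_gib / size ))
}

# pool_io_failures_tolerated: number of replicas a PG can lose before
# Ceph stops accepting I/O for it (size minus min_size).
#   $1 = size (default 3), $2 = min_size (default 2)
pool_io_failures_tolerated() {
    size=${1:-3}
    min_size=${2:-2}
    echo $(( size - min_size ))
}

# Example with the defaults described above (3/2):
#   pool_usable_gib 12000
#   pool_io_failures_tolerated 3 2
```

With the default 3/2 setting, one replica can be missing while I/O continues and two copies of each object still exist.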

It is advised that you either enable the PG-Autoscaler or calculate the PG number based on your setup. You can find the formula and the PG calculator
[PG calculator https://ceph.com/pgcalc/]
online. From Ceph Nautilus onward, you can change the number of PGs
[Placement Groups https://docs.ceph.com/en/nautilus/rados/operations/placement-groups/]
after the setup.
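A commonly cited rule of thumb from the Ceph documentation is (number of OSDs x 100) / replica size, rounded up to a power of two. The sketch below implements that rule under these assumptions: one pool uses all OSDs, the target of 100 PGs per OSD is a default you may want to change, and the online calculator may round differently. The function name suggested_pg_num is hypothetical.

```shell
# suggested_pg_num: (OSDs * target PGs per OSD) / replica size,
# rounded up to the next power of two.
#   $1 = number of OSDs
#   $2 = pool size (replicas, default 3)
#   $3 = target PGs per OSD (default 100, an assumption)
suggested_pg_num() {
    osds=$1
    size=${2:-3}
    per_osd=${3:-100}
    raw=$(( osds * per_osd / size ))
    pg=1
    while [ "$pg" -lt "$raw" ]; do
        pg=$(( pg * 2 ))
    done
    echo "$pg"
}

# Example for 12 OSDs and 3 replicas:
#   suggested_pg_num 12 3
```

If several pools share the same OSDs, the result has to be split between them according to the expected share of data.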

The PG autoscaler
[Automated Scaling https://docs.ceph.com/en/nautilus/rados/operations/placement-groups/#automated-scaling]
can automatically scale the PG count for a pool in the background. Setting the Target Size or Target Ratio advanced parameters helps the PG-Autoscaler to make better decisions.

Example for creating a pool over the CLI
pveceph pool create <name> --add_storages
Tip If you would also like to automatically define a storage for your pool, keep the ‘Add as Storage’ checkbox checked in the web-interface, or use the command line option --add_storages at pool creation.
Pool Options

The following options are available on pool creation, and partially also when editing a pool.


Name

The name of the pool. This must be unique and can’t be changed afterwards.

Size

The number of replicas per object. Ceph always tries to have this many copies of an object. Default: 3.

PG Autoscale Mode

The automatic PG scaling mode of the pool. If set to warn, it produces a warning message when a pool has a non-optimal PG count. Default: warn.

Add as Storage

Configure a VM or container storage using the new pool. Default: true (only visible on creation).

Advanced Options
Min. Size

The minimum number of replicas per object. Ceph will reject I/O on the pool if a PG has less than this many replicas. Default: 2.

Crush Rule

The rule to use for mapping object placement in the cluster. These rules define how data is placed within the cluster. See Ceph CRUSH & device classes for information on device-based rules.

# of PGs

The number of placement groups that the pool should have at the beginning. Default: 128.

Target Ratio

The ratio of data that is expected in the pool. The PG autoscaler uses the ratio relative to other ratio sets. It takes precedence over the target size if both are set.

Target Size

The estimated amount of data expected in the pool. The PG autoscaler uses this size to estimate the optimal PG count.

Min. # of PGs

The minimum number of placement groups. This setting is used to fine-tune the lower bound of the PG count for that pool. The PG autoscaler will not merge PGs below this threshold.

Further information on Ceph pool handling can be found in the Ceph pool operation manual
[Ceph pool operation https://docs.ceph.com/en/nautilus/rados/operations/pools/].

8.6.2. Destroy Pools

To destroy a pool via the GUI, select a node in the tree view and go to the Ceph → Pools panel. Select the pool to destroy and click the Destroy button. To confirm the destruction of the pool, you need to enter the pool name.

Run the following command to destroy a pool. Specify the -remove_storages option to also remove the associated storage.

pveceph pool destroy <name>
Note Pool deletion runs in the background and can take some time. You will notice the data usage in the cluster decreasing throughout this process.

8.6.3. PG Autoscaler

The PG autoscaler allows the cluster to consider the amount of (expected) data stored in each pool and to choose the appropriate pg_num values automatically. It is available since Ceph Nautilus.

You may need to activate the PG autoscaler module before adjustments can take effect.

ceph mgr module enable pg_autoscaler

The autoscaler is configured on a per pool basis and has the following modes:


warn

A health warning is issued if the suggested pg_num value differs too much from the current value.

on

The pg_num is adjusted automatically with no need for any manual interaction.

off

No automatic pg_num adjustments are made, and no warning will be issued if the PG count is not optimal.

The scaling factor can be adjusted to facilitate future data storage with the target_size, target_size_ratio and the pg_num_min options.

Warning By default, the autoscaler considers tuning the PG count of a pool if it is off by a factor of 3. This will lead to a considerable shift in data placement and might introduce a high load on the cluster.
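The factor-of-3 threshold can be expressed as a simple check. This sketch is not the autoscaler's actual implementation. The function name is hypothetical, and treating a difference of exactly factor 3 as "off" is an assumption about the boundary case.

```shell
# autoscaler_would_adjust: succeed (exit status 0) if the suggested PG
# count differs from the current one by the given factor or more.
#   $1 = current pg_num, $2 = suggested pg_num, $3 = factor (default 3)
# Assumption: a difference of exactly the factor counts as "off".
autoscaler_would_adjust() {
    current=$1
    suggested=$2
    factor=${3:-3}
    [ "$suggested" -ge $(( current * factor )) ] ||
        [ $(( suggested * factor )) -le "$current" ]
}

# Example:
#   autoscaler_would_adjust 32 128 && echo "PG count would be tuned"
```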

You can find a more in-depth introduction to the PG autoscaler on Ceph’s Blog - New in Nautilus: PG merging and autotuning.

8.7. Ceph CRUSH & device classes


The CRUSH (Controlled Replication Under Scalable Hashing) algorithm
[CRUSH https://ceph.com/wp-content/uploads/2016/08/weil-crush-sc06.pdf]
is at the foundation of Ceph.

CRUSH calculates where to store and retrieve data from. This has the advantage that no central indexing service is needed. CRUSH works using a map of OSDs, buckets (device locations) and rulesets (data replication) for pools.

Note Further information can be found in the Ceph documentation, under the section CRUSH map
[CRUSH map https://docs.ceph.com/en/nautilus/rados/operations/crush-map/]

This map can be altered to reflect different replication hierarchies. The object replicas can be separated (e.g., failure domains), while maintaining the desired distribution.

A common configuration is to use different classes of disks for different Ceph pools. For this reason, Ceph introduced device classes with luminous, to accommodate the need for easy ruleset generation.

The device classes can be seen in the ceph osd tree output. These classes represent their own root bucket, which can be seen with the below command.

ceph osd crush tree --show-shadow

Example output from the above command:

-16  nvme 2.18307 root default~nvme
-13  nvme 0.72769     host sumi1~nvme
 12  nvme 0.72769         osd.12
-14  nvme 0.72769     host sumi2~nvme
 13  nvme 0.72769         osd.13
-15  nvme 0.72769     host sumi3~nvme
 14  nvme 0.72769         osd.14
 -1       7.70544 root default
 -3       2.56848     host sumi1
 12  nvme 0.72769         osd.12
 -5       2.56848     host sumi2
 13  nvme 0.72769         osd.13
 -7       2.56848     host sumi3
 14  nvme 0.72769         osd.14
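In output like the above, the per-class root buckets are the lines of type root whose name carries a ~<class> suffix. The following sketch filters them out with awk. It assumes the column layout shown in the example (ID, class, weight, type, name), and the function name crush_class_roots is hypothetical.

```shell
# crush_class_roots: read "ceph osd crush tree --show-shadow" output on
# stdin and print "<class> <weight>" for each per-class shadow root.
# Assumption: columns are ID, CLASS, WEIGHT, TYPE, NAME as in the
# example output; rows without a class have one column less and are
# therefore skipped by the TYPE test.
crush_class_roots() {
    awk '$4 == "root" && $5 ~ /~/ { print $2, $3 }'
}

# Usage on a node with Ceph installed (not run here):
#   ceph osd crush tree --show-shadow | crush_class_roots
```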

To instruct a pool to only distribute objects on a specific device class, you first need to create a ruleset for the device class:

ceph osd crush rule create-replicated <rule-name> <root> <failure-domain> <class>


<rule-name>

name of the rule, to connect with a pool (seen in GUI & CLI)

<root>

which crush root it should belong to (default ceph root "default")

<failure-domain>

at which failure-domain the objects should be distributed (usually host)

<class>

what type of OSD backing store to use (e.g., nvme, ssd, hdd)

Once the rule is in the CRUSH map, you can tell a pool to use the ruleset.

ceph osd pool set <pool-name> crush_rule <rule-name>
Tip If the pool already contains objects, these must be moved accordingly. Depending on your setup, this may introduce a big performance impact on your cluster. As an alternative, you can create a new pool and move disks separately.

8.8. Ceph Client


Following the setup from the previous sections, you can configure Proxmox VE to use such pools to store VM and Container images. Simply use the GUI to add a new RBD storage (see section Ceph RADOS Block Devices (RBD)).

For an external Ceph cluster, you also need to copy the keyring to a predefined location. If Ceph is installed on the Proxmox VE nodes themselves, this is done automatically.

Note The filename needs to be <storage_id> + .keyring, where <storage_id> is the expression after rbd: in /etc/pve/storage.cfg. In the following example, my-ceph-storage is the <storage_id>:
mkdir /etc/pve/priv/ceph
cp /etc/ceph/ceph.client.admin.keyring /etc/pve/priv/ceph/my-ceph-storage.keyring

8.9. CephFS

Ceph also provides a filesystem, which runs on top of the same object storage as RADOS block devices do. A Metadata Server (MDS) is used to map the RADOS backed objects to files and directories, allowing Ceph to provide a POSIX-compliant, replicated filesystem. This allows you to easily configure a clustered, highly available, shared filesystem. Ceph’s Metadata Servers guarantee that files are evenly distributed over the entire Ceph cluster. As a result, even cases of high load will not overwhelm a single host, which can be an issue with traditional shared filesystem approaches, for example NFS.


Proxmox VE supports both creating a hyper-converged CephFS and using an existing CephFS as storage to save backups, ISO files, and container templates.

8.9.1. Metadata Server (MDS)

CephFS needs at least one Metadata Server to be configured and running, in order to function. You can create an MDS through the Proxmox VE web GUI’s Node -> CephFS panel or from the command line with:

pveceph mds create

Multiple metadata servers can be created in a cluster, but with the default settings, only one can be active at a time. If an MDS or its node becomes unresponsive (or crashes), another standby MDS will get promoted to active. You can speed up the handover between the active and standby MDS by using the hotstandby parameter option on creation, or if you have already created it you may set/add:

mds standby replay = true

in the respective MDS section of /etc/pve/ceph.conf. With this enabled, the specified MDS will remain in a warm state, polling the active one, so that it can take over faster in case of any issues.

Note This active polling will have an additional performance impact on your system and the active MDS.
Multiple Active MDS

Since Luminous (12.2.x) you can have multiple active metadata servers running at once, but this is normally only useful if you have a high number of clients running in parallel. Otherwise the MDS is rarely the bottleneck in a system. If you want to set this up, please refer to the Ceph documentation.
[Configuring multiple active MDS daemons https://docs.ceph.com/en/nautilus/cephfs/multimds/]

8.9.2. Create CephFS

With Proxmox VE’s integration of CephFS, you can easily create a CephFS using the web interface, CLI or an external API interface.

Prerequisites for a successful CephFS setup:

  • Install Ceph packages

  • Set up monitors

  • Set up OSDs

  • Set up at least one MDS

After this is complete, you can simply create a CephFS through either the Web GUI’s Node -> CephFS panel or the command line tool pveceph, for example:

pveceph fs create --pg_num 128 --add-storage

This creates a CephFS named cephfs, using a pool for its data named cephfs_data with 128 placement groups and a pool for its metadata named cephfs_metadata with one quarter of the data pool’s placement groups (32). Check the Proxmox VE managed Ceph pool chapter or visit the Ceph documentation for more information regarding an appropriate placement group number (pg_num) for your setup. Additionally, the --add-storage parameter will add the CephFS to the Proxmox VE storage configuration after it has been created successfully.

8.9.3. Destroy CephFS

Warning Destroying a CephFS will render all of its data unusable. This cannot be undone!

If you really want to destroy an existing CephFS, you first need to stop or destroy all metadata servers (MDS). You can destroy them either via the web interface or via the command line interface, by issuing

pveceph mds destroy NAME

on each Proxmox VE node hosting an MDS daemon.

Then, you can remove (destroy) the CephFS by issuing

ceph fs rm NAME --yes-i-really-mean-it

on a single node hosting Ceph. After this, you may want to remove the created data and metadata pools. This can be done either over the Web GUI or the CLI with:

pveceph pool destroy NAME

8.10. Ceph maintenance

8.10.1. Replace OSDs

One of the most common maintenance tasks in Ceph is to replace the disk of an OSD. If a disk is already in a failed state, then you can go ahead and run through the steps in Destroy OSDs. Ceph will recreate those copies on the remaining OSDs if possible. This rebalancing will start as soon as an OSD failure is detected or an OSD was actively stopped.

Note With the default size/min_size (3/2) of a pool, recovery only starts when size + 1 nodes are available. The reason for this is that the Ceph object balancer CRUSH defaults to a full node as failure domain.

To replace a functioning disk from the GUI, go through the steps in Destroy OSDs. The only addition is to wait until the cluster shows HEALTH_OK before stopping the OSD to destroy it.

On the command line, use the following commands:

ceph osd out osd.<id>

You can check with the command below if the OSD can be safely removed.

ceph osd safe-to-destroy osd.<id>

Once the above check tells you that it is safe to remove the OSD, you can continue with the following commands:

systemctl stop ceph-osd@<id>.service
pveceph osd destroy <id>

Replace the old disk with the new one and use the same procedure as described in Create OSDs.
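The command line sequence above can also be scripted. The following is a minimal Python sketch, not an official Proxmox VE tool: it marks the OSD out, polls ceph osd safe-to-destroy until the check succeeds, then stops and destroys the OSD. The function name drain_and_destroy_osd, the run and sleep parameters, and the polling interval are hypothetical names chosen for illustration. The sketch assumes that the safety check reports "not yet safe" through a non-zero exit status.

```python
import subprocess
import time


def _run(cmd):
    """Run a command and return its exit status."""
    return subprocess.run(cmd).returncode


def drain_and_destroy_osd(osd_id, run=_run, poll_seconds=10, max_polls=360,
                          sleep=time.sleep):
    """Mark an OSD out, wait until it is safe to destroy, then destroy it.

    Returns the list of commands that were executed, in order.
    `run` is injectable so the flow can be tried without a real cluster.
    """
    executed = []

    def step(cmd):
        executed.append(cmd)
        return run(cmd)

    if step(["ceph", "osd", "out", f"osd.{osd_id}"]) != 0:
        raise RuntimeError("could not mark the OSD out")

    # Assumption: exit status 0 means the OSD can be removed safely.
    for _ in range(max_polls):
        if step(["ceph", "osd", "safe-to-destroy", f"osd.{osd_id}"]) == 0:
            break
        sleep(poll_seconds)
    else:
        raise TimeoutError("OSD did not become safe to destroy in time")

    for cmd in (["systemctl", "stop", f"ceph-osd@{osd_id}.service"],
                ["pveceph", "osd", "destroy", str(osd_id)]):
        if step(cmd) != 0:
            raise RuntimeError("command failed: " + " ".join(cmd))
    return executed
```

Called as drain_and_destroy_osd(7) on a node with Ceph installed, this would run the same four commands as the manual procedure and stop at the first failure.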

8.10.2. Trim/Discard

It is good practice to run fstrim (discard) regularly on VMs and containers. This releases data blocks that the filesystem isn’t using anymore. It reduces data usage and resource load. Most modern operating systems issue such discard commands to their disks regularly. You only need to ensure that the Virtual Machines enable the disk discard option.

8.10.3. Scrub & Deep Scrub

Ceph ensures data integrity by scrubbing placement groups. Ceph checks every object in a PG for its health. There are two forms of scrubbing: daily cheap metadata checks and weekly deep data checks. The weekly deep scrub reads the objects and uses checksums to ensure data integrity. If a running scrub interferes with business (performance) needs, you can adjust the time when scrubs are executed.
[Ceph scrubbing https://docs.ceph.com/en/nautilus/rados/configuration/osd-config-ref/#scrubbing]

8.11. Ceph Monitoring and Troubleshooting

It is important to continuously monitor the health of a Ceph deployment from the beginning, either by using the Ceph tools or by accessing the status through the Proxmox VE API.

The following Ceph commands can be used to see if the cluster is healthy (HEALTH_OK), if there are warnings (HEALTH_WARN), or even errors (HEALTH_ERR). If the cluster is in an unhealthy state, the status commands below will also give you an overview of the current events and actions to take.

# single time output
pve# ceph -s
# continuously output status changes (press CTRL+C to stop)
pve# ceph -w
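For automated monitoring, the health state can be read from machine-readable output instead of the human-readable summary. The following is a minimal Python sketch. It assumes that the status was obtained with ceph -s --format json and that the JSON document carries the health state under health.status, which may differ between Ceph releases. The names health_status and health_severity are hypothetical helpers, not part of any Ceph or Proxmox VE API.

```python
import json

# Map the three health states named above to a numeric severity.
SEVERITY = {"HEALTH_OK": 0, "HEALTH_WARN": 1, "HEALTH_ERR": 2}


def health_status(status_json):
    """Return the health string from JSON status output.

    Assumption: the document has the shape {"health": {"status": "..."}}.
    """
    data = json.loads(status_json)
    return data["health"]["status"]


def health_severity(status_json):
    """Return 0 for OK, 1 for warnings, 2 for errors or unknown states."""
    return SEVERITY.get(health_status(status_json), 2)


# Example with a hand-written sample document (not real cluster output):
sample = '{"health": {"status": "HEALTH_WARN", "checks": {}}}'
print(health_status(sample), health_severity(sample))
```

On a real node the input string could come from subprocess.run(["ceph", "-s", "--format", "json"], capture_output=True, text=True).stdout.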

To get a more detailed view, every Ceph service has a log file under /var/log/ceph/. If more detail is required, the log level can be adjusted.
[Ceph log and debugging https://docs.ceph.com/en/nautilus/rados/troubleshooting/log-and-debug/]

You can find more information about troubleshooting a Ceph cluster on the official website.
[Ceph troubleshooting https://docs.ceph.com/en/nautilus/rados/troubleshooting/]

9. Storage Replication

The pvesr command line tool manages the Proxmox VE storage replication framework. Storage replication brings redundancy for guests using local storage and reduces migration time.

It replicates guest volumes to another node so that all data is available without using shared storage. Replication uses snapshots to minimize traffic sent over the network. Therefore, new data is sent only incrementally after the initial full sync. In the case of a node failure, your guest data is still available on the replicated node.

The replication is done automatically in configurable intervals. The minimum replication interval is one minute, and the maximum interval is once a week. The format used to specify those intervals is a subset of systemd calendar events; see the Schedule Format section.

It is possible to replicate a guest to multiple target nodes, but not twice to the same target node.

Each replication's bandwidth can be limited to avoid overloading a storage or server.

Guests with replication enabled can currently only be migrated offline. Only changes since the last replication (so-called deltas) need to be transferred if the guest is migrated to a node to which it already is replicated. This reduces the time needed significantly. The replication direction automatically switches if you migrate a guest to the replication target node.

For example: VM100 is currently on nodeA and gets replicated to nodeB. You migrate it to nodeB, so now it gets automatically replicated back from nodeB to nodeA.

If you migrate to a node where the guest is not replicated, the whole disk data must be sent over. After the migration, the replication job continues to replicate this guest to the configured nodes.


High-Availability is allowed in combination with storage replication, but it has the following implications:

  • as live-migrations are currently not possible, redistributing services after a more preferred node comes online does not work. Keep that in mind when configuring your HA groups and their priorities for replicated guests.

  • recovery works, but there may be some data loss between the last synced time and the time a node failed.

9.1. Supported Storage Types

Table 16. Storage Types
Description PVE type Snapshots Stable

ZFS (local)




9.2. Schedule Format

Proxmox VE has a very flexible replication scheduler. It is based on the systemd time calendar event format.
[see man 7 systemd.time for more information]
Calendar events may be used to refer to one or more points in time in a single expression.

Such a calendar event uses the following format:

[day(s)] [[start-time(s)][/repetition-time(s)]]

This format allows you to configure a set of days on which the job should run. You can also set one or more start times. It tells the replication scheduler the moments in time when a job should start. With this information, we can create a job which runs every workday at 10 PM: 'mon,tue,wed,thu,fri 22:00', which could be abbreviated to 'mon..fri 22:00'. Most reasonable schedules can be written quite intuitively this way.

Note Hours are formatted in 24-hour format.

To allow a convenient and shorter configuration, one or more repeat times per guest can be set. They indicate that replications are done on the start-time(s) itself and the start-time(s) plus all multiples of the repetition value. If you want to start replication at 8 AM and repeat it every 15 minutes until 9 AM you would use: '8:00/15'

Here you see that if no hour separator (:) is used, the value gets interpreted as minutes. If such a separator is used, the value on the left denotes the hour(s), and the value on the right denotes the minute(s). Further, you can use * to match all possible values.

To get additional ideas look at more Examples below.

9.2.1. Detailed Specification


Days are specified with an abbreviated English version: sun, mon, tue, wed, thu, fri and sat. You may use multiple days as a comma-separated list. A range of days can also be set by specifying the start and end day separated by “..”, for example mon..fri. These formats can be mixed. If omitted '*' is assumed.


A time format consists of hour and minute interval lists. Hours and minutes are separated by ':'. Both hours and minutes can be lists and ranges of values, using the same format as days. Hours come first, then minutes. Hours can be omitted if not needed. In this case '*' is assumed for the value of hours. The valid range for values is 0-23 for hours and 0-59 for minutes.
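To make the time format more concrete, the following Python sketch expands the time portion of a schedule string into the list of (hour, minute) start times it describes. It is an illustration of the rules stated above, not the parser used by Proxmox VE. The function names expand_time_spec and _expand_field are hypothetical. The sketch ignores the day portion, and it rejects a repetition applied to a range, because the text above does not define that case.

```python
def _expand_field(field, upper):
    """Expand one hour or minute field into a sorted list of ints in 0..upper."""
    values = set()
    for item in field.split(","):
        base, _, rep = item.partition("/")
        if rep and ".." in base:
            raise ValueError("repetition on a range is not handled in this sketch")
        if ".." in base:
            first, last = (int(part) for part in base.split(".."))
        elif base == "*":
            first, last = 0, upper
        else:
            first = last = int(base)
        step = 1
        if rep:
            # A repetition yields the start value plus all multiples of the
            # repetition value that still fit into the field.
            step, last = int(rep), upper
        if step < 1 or not (0 <= first <= last <= upper):
            raise ValueError(f"invalid field item: {item!r}")
        values.update(range(first, last + 1, step))
    return sorted(values)


def expand_time_spec(spec):
    """Return all (hour, minute) pairs matched by a time specification."""
    hours, sep, minutes = spec.partition(":")
    if not sep:
        # No ':' present: the value is the minute field, hours default to '*'.
        hours, minutes = "*", spec
    return [(h, m)
            for h in _expand_field(hours, 23)
            for m in _expand_field(minutes, 59)]


print(expand_time_spec("12..13:5/20"))
```

For '12..13:5/20' the sketch yields 12:05, 12:25, 12:45, 13:05, 13:25 and 13:45, and for '8:00/15' it yields 8:00, 8:15, 8:30 and 8:45.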

9.2.2. Examples:

Table 17. Schedule Examples

Schedule String        | Alternative       | Meaning
                       |                   | Every working day at 0:00
                       |                   | Only on weekends at 0:00
                       |                   | Only on Monday, Wednesday and Friday at 0:00
                       |                   | Every day at 12:05 PM
                       |                   | Every five minutes
mon..wed 30/10         | mon,tue,wed 30/10 | Monday, Tuesday, Wednesday 30, 40 and 50 minutes after every full hour
mon..fri 8..17,22:0/15 |                   | Every working day every 15 minutes between 8 AM and 6 PM and between 10 PM and 11 PM
fri 12..13:5/20        | fri 12,13:5/20    | Friday at 12:05, 12:25, 12:45, 13:05, 13:25 and 13:45
                       |                   | Every day starting at 12:05 until 22:05, every 2 hours
                       |                   | Every minute (minimum interval)

9.3. Error Handling

If a replication job encounters problems, it is placed in an error state. In this state, the configured replication intervals get suspended temporarily. The failed replication is retried every 30 minutes. Once this succeeds, the original schedule gets activated again.

9.3.1. Possible issues

Some of the most common issues are in the following list. Depending on your setup there may be another cause.

  • Network is not working.

  • No free space left on the replication target storage.

  • Storage with the same storage ID is not available on the target node.

Note You can always use the replication log to find out what is causing the problem.

9.3.2. Migrating a guest in case of Error

In the case of a grave error, a virtual guest may get stuck on a failed node. You then need to move it manually to a working node again.

9.3.3. Example

Let’s assume that you have two guests (VM 100 and CT 200) running on node A, and that they are replicated to node B. Node A failed and cannot get back online. Now you have to migrate the guests to node B manually.

  • connect to node B over ssh or open its shell via the WebUI

  • check that the cluster is quorate

    # pvecm status

  • If you have no quorum, we strongly advise fixing this first and making the node operable again. Only if this is not possible at the moment, you may use the following command to enforce quorum on the current node:

    # pvecm expected 1

Warning Avoid changes which affect the cluster if expected votes are set (for example adding/removing nodes, storages, virtual guests) at all costs. Only use it to get vital guests up and running again or to resolve the quorum issue itself.

  • move both guest configuration files from the origin node A to node B:

    # mv /etc/pve/nodes/A/qemu-server/100.conf /etc/pve/nodes/B/qemu-server/100.conf
    # mv /etc/pve/nodes/A/lxc/200.conf /etc/pve/nodes/B/lxc/200.conf

  • Now you can start the guests again:

    # qm start 100
    # pct start 200

Remember to replace the VMIDs and node names with your respective values.

9.4. Managing Jobs


You can use the web GUI to create, modify, and remove replication jobs easily. Additionally, the command line interface (CLI) tool pvesr can be used to do this.

You can find the replication panel on all levels (datacenter, node, virtual guest) in the web GUI. They differ in which jobs get shown: all, node- or guest-specific jobs.

When adding a new job, you need to specify the guest (if not already selected) as well as the target node. The replication schedule can be set if the default of every 15 minutes is not desired. You may impose a rate limit on a replication job. The rate limit can help to keep the load on the storage acceptable.

A replication job is identified by a cluster-wide unique ID. This ID is composed of the VMID in addition to a job number. This ID must only be specified manually if the CLI tool is used.
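As an illustration of the ID scheme, the following Python sketch splits a job ID into its two parts and picks an unused job number for a guest. It assumes the VMID-JOBNUM form used in the CLI examples of this chapter (such as 100-0). The helper names parse_job_id and next_job_id are hypothetical, and choosing the lowest free number is an assumption of the sketch, not documented behavior.

```python
def parse_job_id(job_id):
    """Split a job ID such as '100-0' into (vmid, job_number)."""
    vmid, sep, number = job_id.partition("-")
    if not sep or not vmid.isdigit() or not number.isdigit():
        raise ValueError(f"not a valid replication job ID: {job_id!r}")
    return int(vmid), int(number)


def next_job_id(vmid, existing_ids):
    """Return a job ID for `vmid` whose job number is not used yet."""
    used = {num for vm, num in map(parse_job_id, existing_ids) if vm == vmid}
    number = 0
    while number in used:
        number += 1
    return f"{vmid}-{number}"


print(parse_job_id("100-0"))
print(next_job_id(100, ["100-0", "100-1", "200-0"]))
```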

9.5. Command Line Interface Examples

Create a replication job which runs every 5 minutes with a limited bandwidth of 10 MB/s (megabytes per second) for the guest with ID 100.

# pvesr create-local-job 100-0 pve1 --schedule "*/5" --rate 10

Disable an active job with ID 100-0.

# pvesr disable 100-0

Enable a deactivated job with ID 100-0.

# pvesr enable 100-0

Change the schedule interval of the job with ID 100-0 to once per hour.

# pvesr update 100-0 --schedule '*/00'

10. Qemu/KVM Virtual Machines

Qemu (short form for Quick Emulator) is an open source hypervisor that emulates a physical computer. From the perspective of the host system where Qemu is running, Qemu is a user program which has access to a number of local resources like partitions, files, network cards which are then passed to an emulated computer which sees them as if they were real devices.

A guest operating system running in the emulated computer accesses these devices, and runs as if it were running on real hardware. For instance, you can pass an ISO image as a parameter to Qemu, and the OS running in the emulated computer will see a real CD-ROM inserted into a CD drive.

Qemu can emulate a great variety of hardware from ARM to Sparc, but Proxmox VE is only concerned with 32-bit and 64-bit PC clone emulation, since it represents the overwhelming majority of server hardware. The emulation of PC clones is also one of the fastest due to the availability of processor extensions which greatly speed up Qemu when the emulated architecture is the same as the host architecture.

Note You may sometimes encounter the term KVM (Kernel-based Virtual Machine). It means that Qemu is running with the support of the virtualization processor extensions, via the Linux KVM module. In the context of Proxmox VE Qemu and KVM can be used interchangeably, as Qemu in Proxmox VE will always try to load the KVM module.

Qemu inside Proxmox VE runs as a root process, since this is required to access block and PCI devices.

10.1. Emulated devices and paravirtualized devices

The PC hardware emulated by Qemu includes a mainboard, network controllers, SCSI, IDE and SATA controllers, and serial ports (the complete list can be seen in the kvm(1) man page), all of them emulated in software. All these devices are the exact software equivalent of existing hardware devices, and if the OS running in the guest has the proper drivers it will use the devices as if it were running on real hardware. This allows Qemu to run unmodified operating systems.

This however has a performance cost, as running in software what was meant to run in hardware involves a lot of extra work for the host CPU. To mitigate this, Qemu can present to the guest operating system paravirtualized devices, where the guest OS recognizes it is running inside Qemu and cooperates with the hypervisor.

Qemu relies on the virtio virtualization standard, and is thus able to present paravirtualized virtio devices, which include a paravirtualized generic disk controller, a paravirtualized network card, a paravirtualized serial port, a paravirtualized SCSI controller, etc.

It is highly recommended to use the virtio devices whenever you can, as they provide a big performance improvement. Using the virtio generic disk controller versus an emulated IDE controller will double the sequential write throughput, as measured with bonnie++(8). Using the virtio network interface can deliver up to three times the throughput of an emulated Intel E1000 network card, as measured with iperf(1).
[See this benchmark on the KVM wiki https://www.linux-kvm.org/page/Using_VirtIO_NIC]

10.2. Virtual Machines Settings

Generally speaking, Proxmox VE tries to choose sane defaults for virtual machines (VM). Make sure you understand the meaning of the settings you change, as it could incur a performance slowdown, or put your data at risk.

10.2.1. General Settings


General settings of a VM include

  • the Node: the physical server on which the VM will run

  • the VM ID: a unique number in this Proxmox VE installation used to identify your VM

  • Name: a free form text string you can use to describe the VM

  • Resource Pool: a logical group of VMs

10.2.2. OS Settings


When creating a virtual machine (VM), setting the proper Operating System (OS) allows Proxmox VE to optimize some low level parameters. For instance, Windows OSes expect the BIOS clock to use the local time, while Unix-based OSes expect the BIOS clock to have the UTC time.

10.2.3. System Settings

On VM creation you can change some basic system components of the new VM. You can specify which display type you want to use.


Additionally, the SCSI controller can be changed. If you plan to install the QEMU Guest Agent, or if your selected ISO image already ships and installs it automatically, you may want to tick the Qemu Agent box, which lets Proxmox VE know that it can use its features to show some more information, and complete some actions (for example, shutdown or snapshots) more intelligently.

Proxmox VE allows you to boot VMs with different firmware (namely SeaBIOS and OVMF) and machine types. In most cases you want to switch from the default SeaBIOS to OVMF only if you plan to use PCIe pass through. A VM's Machine Type defines the hardware layout of the VM’s virtual motherboard. You can choose between the default Intel 440FX or the Q35 chipset, which also provides a virtual PCIe bus, and thus may be desired if one wants to pass through PCIe hardware.

10.2.4. Hard Disk


Qemu can emulate a number of storage controllers:

  • the IDE controller has a design which goes back to the 1984 PC/AT disk controller. Even if this controller has been superseded by recent designs, each and every OS you can think of has support for it, making it a great choice if you want to run an OS released before 2003. You can connect up to 4 devices on this controller.

  • the SATA (Serial ATA) controller, dating from 2003, has a more modern design, allowing higher throughput and a greater number of devices to be connected. You can connect up to 6 devices on this controller.

  • the SCSI controller, designed in 1985, is commonly found on server grade hardware, and can connect up to 14 storage devices. By default, Proxmox VE emulates an LSI 53C895A controller.

    A SCSI controller of type VirtIO SCSI is the recommended setting if you aim for performance and is automatically selected for newly created Linux VMs since Proxmox VE 4.3. Linux distributions have support for this controller since 2012, and FreeBSD since 2014. For Windows OSes, you need to provide an extra iso containing the drivers during the installation. If you aim at maximum performance, you can select a SCSI controller of type VirtIO SCSI single which will allow you to select the IO Thread option. When selecting VirtIO SCSI single Qemu will create a new controller for each disk, instead of adding all disks to the same controller.

  • The VirtIO Block controller, often just called VirtIO or virtio-blk, is an older type of paravirtualized controller. It has been superseded by the VirtIO SCSI Controller, in terms of features.

Image Format

On each controller you attach a number of emulated hard disks, which are backed by a file or a block device residing in the configured storage. The choice of a storage type will determine the format of the hard disk image. Storages which present block devices (LVM, ZFS, Ceph) will require the raw disk image format, whereas file based storages (Ext4, NFS, CIFS, GlusterFS) will let you choose either the raw disk image format or the QEMU image format.

  • the QEMU image format is a copy on write format which allows snapshots, and thin provisioning of the disk image.

  • the raw disk image is a bit-to-bit image of a hard disk, similar to what you would get when executing the dd command on a block device in Linux. This format does not support thin provisioning or snapshots by itself, requiring cooperation from the storage layer for these tasks. It may, however, be up to 10% faster than the QEMU image format.
    [See this benchmark for details https://events.static.linuxfound.org/sites/events/files/slides/CloudOpen2013_Khoa_Huynh_v3.pdf]

  • the VMware image format only makes sense if you intend to import/export the disk image to other hypervisors.

Cache Mode

Setting the Cache mode of the hard drive will impact how the host system will notify the guest systems of block write completions. The No cache default means that the guest system will be notified that a write is complete when each block reaches the physical storage write queue, ignoring the host page cache. This provides a good balance between safety and speed.

If you want the Proxmox VE backup manager to skip a disk when doing a backup of a VM, you can set the No backup option on that disk.

If you want the Proxmox VE storage replication mechanism to skip a disk when starting a replication job, you can set the Skip replication option on that disk. As of Proxmox VE 5.0, replication requires the disk images to be on a storage of type zfspool, so adding a disk image to other storages when the VM has replication configured requires skipping replication for this disk image.


If your storage supports thin provisioning (see the storage chapter in the Proxmox VE guide), you can activate the Discard option on a drive. With Discard set and a TRIM-enabled guest OS, when the VM’s filesystem marks blocks as unused after deleting files, the controller will relay this information to the storage, which will then shrink the disk image accordingly. For the guest to be able to issue TRIM commands, you must enable the Discard option on the drive. Some guest operating systems may also require the SSD Emulation flag to be set. Note that Discard on VirtIO Block drives is only supported on guests using Linux Kernel 5.0 or higher.
[TRIM, UNMAP, and discard https://en.wikipedia.org/wiki/Trim_%28computing%29]

If you would like a drive to be presented to the guest as a solid-state drive rather than a rotational hard disk, you can set the SSD emulation option on that drive. There is no requirement that the underlying storage actually be backed by SSDs; this feature can be used with physical media of any type. Note that SSD emulation is not supported on VirtIO Block drives.

IO Thread

The option IO Thread can only be used when using a disk with the VirtIO controller, or with the SCSI controller, when the emulated controller type is VirtIO SCSI single. With this enabled, Qemu creates one I/O thread per storage controller, rather than a single thread for all I/O. This can increase performance when multiple disks are used and each disk has its own storage controller.

10.2.5. CPU


A CPU socket is a physical slot on a PC motherboard where you can plug a CPU. This CPU can then contain one or many cores, which are independent processing units. Whether you have a single CPU socket with 4 cores, or two CPU sockets with two cores is mostly irrelevant from a performance point of view. However, some software licenses depend on the number of sockets a machine has. In that case, it makes sense to set the number of sockets to what the license allows.

Increasing the number of virtual CPUs (cores and sockets) will usually provide a performance improvement, though that is heavily dependent on the use of the VM. Multi-threaded applications will of course benefit from a large number of virtual CPUs, as for each virtual CPU you add, Qemu will create a new thread of execution on the host system. If you’re not sure about the workload of your VM, it is usually a safe bet to set the number of Total cores to 2.

Note It is perfectly safe if the overall number of cores of all your VMs is greater than the number of cores on the server (e.g., 4 VMs with 4 cores each on a machine with only 8 cores). In that case the host system will balance the Qemu execution threads between your server cores, just like if you were running a standard multi-threaded application. However, Proxmox VE will prevent you from starting VMs with more virtual CPU cores than physically available, as this will only bring the performance down due to the cost of context switches.

Resource Limits

In addition to the number of virtual cores, you can configure how many resources a VM can get in relation to the host CPU time and also in relation to other VMs. With the cpulimit (“Host CPU Time”) option you can limit how much CPU time the whole VM can use on the host. It is a floating point value representing CPU time in percent, so 1.0 is equal to 100%, 2.5 to 250% and so on. If a single process fully used one single core, it would have 100% CPU time usage. If a VM with four cores utilized all its cores fully, it would theoretically use 400%. In reality the usage may be even a bit higher, as Qemu can have additional threads for VM peripherals besides the vCPU core ones.

This setting can be useful if a VM should have multiple vCPUs, as it runs a few processes in parallel, but the VM as a whole should not be able to run all vCPUs at 100% at the same time. Using a specific example: let's say we have a VM which would profit from having 8 vCPUs, but at no time should all of those 8 cores run at full load, as this would overload the server so much that other VMs and CTs would get too little CPU time. So, we set the cpulimit to 4.0 (=400%). If all cores do the same heavy work, they would each get 50% of a real host core's CPU time. But if only 4 do work, they could still get almost 100% of a real core each.
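The arithmetic behind this example can be written down as a short Python sketch. It is a simplified model for illustration only: it assumes that the limit is shared evenly among the busy vCPUs and it ignores the additional Qemu threads mentioned above, so the result is an upper bound rather than a measured value. The function name per_vcpu_share is hypothetical.

```python
def per_vcpu_share(cpulimit, busy_vcpus):
    """Upper bound on the fraction of one host core each busy vCPU can get.

    cpulimit   -- the cpulimit value, e.g. 4.0 for 400% host CPU time
    busy_vcpus -- number of vCPUs that are fully loaded at the same time
    """
    if cpulimit <= 0 or busy_vcpus <= 0:
        raise ValueError("cpulimit and busy_vcpus must be positive")
    # A single vCPU thread can never use more than one full host core.
    return min(1.0, cpulimit / busy_vcpus)


# The example from the text: 8 vCPUs with cpulimit set to 4.0.
print(per_vcpu_share(4.0, 8))  # all 8 vCPUs busy
print(per_vcpu_share(4.0, 4))  # only 4 vCPUs busy
```

With all 8 vCPUs busy the model gives 0.5 (50% of a host core each), and with 4 busy vCPUs it gives 1.0, matching the "almost 100%" case described above.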