Nested Virtualization

Note: this wiki article is a draft and may be inaccurate or incomplete.

What is it

Nested virtualization means running a hypervisor, such as PVE, inside a virtual machine (which itself runs on another hypervisor) instead of on real hardware. In other words, a host hypervisor hosts a guest hypervisor (as a VM), which can in turn host its own VMs.

This adds overhead to the nested environment, but it can be useful in some cases:

  • it lets you test (or learn) how to manage hypervisors before the actual deployment, or rehearse a dangerous or tricky procedure involving hypervisors before doing it on the real thing.
  • it enables businesses to deploy their own virtualization environment, e.g. on public (cloud) services; see also http://www.ibm.com/developerworks/cloud/library/cl-nestedvirtualization/

Requirements

For the best performance, close to native, a hypervisor needs access to certain (real) hardware features that support virtualization, the so-called 'hardware-assisted virtualization extensions' (see http://en.wikipedia.org/wiki/Hardware-assisted_virtualization).
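
On a Linux system, the CPU flags listed in /proc/cpuinfo indicate whether these extensions are visible: vmx for Intel VT-x, svm for AMD-V. The following is a minimal sketch, assuming a Linux host; the function name virt_flag is made up for this example and is not part of any PVE tool.

```shell
#!/bin/sh
# Print the first hardware virtualization flag (vmx or svm) found in a
# cpuinfo-style file. The optional argument exists only so the function
# can be tried on sample data; it defaults to /proc/cpuinfo.
virt_flag() {
    # -o prints only the matched word, -w matches whole words only.
    grep -owE 'vmx|svm' "${1:-/proc/cpuinfo}" 2>/dev/null | head -n 1
}

case "$(virt_flag)" in
    vmx) echo "Intel VT-x (vmx) is available" ;;
    svm) echo "AMD-V (svm) is available" ;;
    *)   echo "no hardware virtualization flag found" ;;
esac
```

If no flag is reported on real hardware, the feature may simply be disabled in the firmware (BIOS/UEFI) setup.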

In nested virtualization, the guest hypervisor should also have access to the hardware-assisted virtualization extensions, which implies that the host hypervisor must expose those extensions to its virtual machines. In principle it works without those extensions too, but performance is poor, so it is not an option for production environments (though it may be sufficient for some test cases). On Intel CPUs, exposing these extensions requires kernel 3 or higher; that is, it is available in Proxmox VE 4.x/5.x, but not by default in older versions.

You will need to allocate plenty of CPU, RAM and disk to the guest hypervisors. If you intend to migrate machines within a cluster, nested virtualization needs to be enabled on all hosts in the cluster.

PVE as nested Hypervisor

PVE can:

  • host a nested (guest) hypervisor. By default it does not expose hardware-assisted virtualization extensions to its VMs, so you cannot expect optimal performance for virtual machines in the guest hypervisor unless you set the CPU type of the VM (the virtual hypervisor) to "host" and have nested hardware-assisted virtualization enabled on the physical PVE host.
  • be hosted as a nested (guest) hypervisor. If the host hypervisor can expose hardware-assisted virtualization extensions to PVE, PVE can use them and provide better performance to its guests. Otherwise, as in the PVE-inside-PVE case, a VM (KVM) will only work after you turn off KVM hardware virtualization (see the VM options).

Enable Nested Hardware-assisted Virtualization

To be done on the physical PVE host (or any other hypervisor).

To have nested hardware-assisted virtualization, you have to:

  • use an AMD CPU or a very recent Intel one
  • use kernel >= 3.10 (always the case in Proxmox VE 4.x)
  • enable nested support
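
The kernel requirement can be checked from a shell. The following is a minimal sketch that compares the running kernel release against 3.10; the function name kernel_at_least is an assumption made for this example.

```shell
#!/bin/sh
# Succeed if kernel release $1 (e.g. "5.15.30-2-pve") is at least
# version $2.$3 (major.minor). Only the first two numeric fields are used.
kernel_at_least() {
    major="${1%%.*}"            # text before the first dot
    rest="${1#*.}"              # text after the first dot
    minor="${rest%%[!0-9]*}"    # leading digits of the remainder
    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "${minor:-0}" -ge "$3" ]; }
}

if kernel_at_least "$(uname -r)" 3 10; then
    echo "kernel $(uname -r) is recent enough"
else
    echo "kernel $(uname -r) is older than 3.10"
fi
```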

To check whether it is enabled, run (use "kvm_intel" for Intel CPUs, "kvm_amd" for AMD):

 root@proxmox:~# cat /sys/module/kvm_intel/parameters/nested
 N

N means it is not enabled. To enable it ("kvm-intel" for Intel):

 # echo "options kvm-intel nested=Y" > /etc/modprobe.d/kvm-intel.conf

(or "kvm-amd" for AMD, note the 1 instead of Y):

 # echo "options kvm-amd nested=1" > /etc/modprobe.d/kvm-amd.conf

Then reboot, or reload the kernel module (all VMs on the host must be stopped first; use "kvm_amd" on AMD):

 # modprobe -r kvm_intel
 # modprobe kvm_intel

Check again:

 root@proxmox:~# cat /sys/module/kvm_intel/parameters/nested
 Y

(Pay attention to where the dash "-" is used and where the underscore "_" is used instead.)
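
The check can be wrapped so that it covers both vendors and both value styles (Y/N and 1/0). The following is a minimal sketch; the function name nested_status and its second argument are assumptions made for this example, and the second argument exists only so the function can be tried against a test directory.

```shell
#!/bin/sh
# Print "enabled", "disabled" or "unknown" for a KVM module's nested parameter.
#   $1: module name, written with an underscore (kvm_intel or kvm_amd)
#   $2: module directory, defaults to /sys/module
nested_status() {
    param="${2:-/sys/module}/$1/parameters/nested"
    # A missing file usually means the module is not loaded on this host.
    [ -r "$param" ] || { echo "unknown"; return 0; }
    case "$(cat "$param")" in
        Y|y|1) echo "enabled" ;;
        N|n|0) echo "disabled" ;;
        *)     echo "unknown" ;;
    esac
}

for mod in kvm_intel kvm_amd; do
    echo "$mod: $(nested_status "$mod")"
done
```

On a typical host only one of the two modules is loaded, so the other one is reported as "unknown".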

Then create a guest in which you install, for example, Proxmox VE as the nested virtualization environment.

  • set the CPU type to "host"
  • in case of an AMD CPU, also add the following to the VM's configuration file:
 args: -cpu host,+svm
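
The CPU type can also be set from the command line of the PVE host with qm. The following is a minimal sketch, assuming the guest hypervisor VM has ID 100 (a placeholder); the run helper is made up for this example and only prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# VMID is a placeholder for the ID of the VM that will run the guest hypervisor.
VMID="${VMID:-100}"

# Print the command when DRY_RUN is 1 (the default here), run it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Same effect as choosing CPU type "host" in the VM's hardware settings.
run qm set "$VMID" --cpu host
```

Review the printed command first, then run the script again with DRY_RUN=0 on the PVE host to apply it.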

Once the guest OS is installed, if it is GNU/Linux you can log in and verify that hardware virtualization support is enabled by running:

 root@guest1# grep -E '(vmx|svm)' --color=always /proc/cpuinfo

Example: PVE hosts a PVE guest hypervisor

Set up a cluster of self-nested PVE

On the physical Proxmox host you create two VMs and install a new instance of Proxmox in each one, so you can experiment with cluster concepts without needing multiple physical servers.

  • log in (web GUI) to your host PVE (running on real hardware)
 => PVE
  • create two or more guest VMs (KVM) in your host PVE, each with enough RAM/disk, and install PVE from ISO on each guest VM (same network)
 => PVE => VMPVE1 (guest PVE)
 => PVE => VMPVE2 (guest PVE)
 ...
  • log in (SSH/console) to the first guest VM and create the cluster CLUSTERNAME
 => PVE => VMPVE1 (guest PVE) => #pvecm create CLUSTERNAME
  • log in to each other guest VM and join the cluster CLUSTERNAME
 => PVE => VMPVE2 (guest PVE) => #pvecm add <IP address of VMPVE1>
  • log in (web GUI) to any guest VM (guest PVE) and manage the new (guest) cluster
 => PVE => VMPVE1/2 (guest PVE) => #pvecm nodes
  • create VMs or CTs inside the guest PVE (nodes of CLUSTERNAME)
    • if you didn't enable hardware-assisted nested virtualization, you have to turn off KVM hardware virtualization (see the VM options)
    • install only small, CLI-based CTs or VMs in those guests (do not try anything with a GUI, and don't even think of running Windows...)
 => PVE => VMPVE1/2 (guest PVE) => VM/CT
  • install something on, e.g., a VM (e.g. a basic Ubuntu server) from ISO
 => PVE => VMPVE2 (guest PVE) => VM (basic ubuntu server)
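
The cluster commands from the steps above can be collected into a short plan. The following is a minimal sketch that only prints which pvecm command to run on which guest and executes nothing; the function name cluster_plan and the address 192.0.2.11 are placeholders made up for this example.

```shell
#!/bin/sh
# Print the pvecm commands for a nested test cluster, one per line,
# prefixed with the guest PVE they are meant to be run on.
#   $1: cluster name
#   $2: IP address of the first guest PVE (VMPVE1)
#   remaining arguments: names of the guests that join the cluster
cluster_plan() {
    name="$1"
    first_ip="$2"
    shift 2
    echo "VMPVE1: pvecm create $name"
    for node in "$@"; do
        echo "$node: pvecm add $first_ip"
    done
    # List the cluster members once all nodes have joined.
    echo "VMPVE1: pvecm nodes"
}

cluster_plan CLUSTERNAME 192.0.2.11 VMPVE2 VMPVE3
```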

VM/CT performance without hardware-assisted virtualization extensions

If you can't set up hardware-assisted virtualization extensions for the guest, performance is far from optimal. Use this only to practice or test!

  • CTs (LXC) will be faster, of course, and quite usable
  • VMs (KVM) will be really slow, nearly unusable (expect them to be 10x slower or more), since (as said above) they run without KVM hardware virtualization

But at least you can try or test "guest PVE" features or setups:

  • you could create a small test cluster to practice with cluster concepts and operations
  • you could test a new PVE version before upgrading
  • you could test setups conflicting with your production setup
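
Inside a guest PVE, turning off KVM hardware virtualization for a VM corresponds to the kvm option of that VM. The following is a minimal sketch that looks for the vmx/svm flags and prints a matching qm command without running it; VM ID 101 and the function name kvm_value are placeholders made up for this example.

```shell
#!/bin/sh
# Print 1 if the cpuinfo-style file lists vmx or svm, otherwise 0.
# The optional argument defaults to /proc/cpuinfo.
kvm_value() {
    if grep -qwE 'vmx|svm' "${1:-/proc/cpuinfo}" 2>/dev/null; then
        echo 1
    else
        echo 0
    fi
}

# VMID is a placeholder for a VM created inside the guest PVE.
VMID="${VMID:-101}"

# Only print the suggested command; run it by hand after reviewing it.
echo "qm set $VMID --kvm $(kvm_value)"
```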