Nested Virtualization

From Proxmox VE

Latest revision as of 14:02, 3 March 2022

What is

Nested virtualization means running a hypervisor, such as PVE or others, inside a virtual machine (which itself runs on another hypervisor) instead of on real hardware. In other words, you have a host hypervisor hosting a guest hypervisor (as a VM), which can in turn host its own VMs.

This adds overhead to the nested environment, but it can be useful in some cases:

  • it lets you test (or learn) how to manage hypervisors before the actual implementation, or try a dangerous/tricky procedure involving hypervisors before doing it on the real thing.
  • it enables businesses to deploy their own virtualization environment, e.g. on public services (cloud); see also http://www.ibm.com/developerworks/cloud/library/cl-nestedvirtualization/

Requirements

To reach the fastest possible performance, close to native, any hypervisor should have access to some (real) hardware features that are generally useful for virtualization, the so-called 'hardware-assisted virtualization extensions' (see http://en.wikipedia.org/wiki/Hardware-assisted_virtualization).

In nested virtualization, the guest hypervisor should also have access to the hardware-assisted virtualization extensions, which implies that the host hypervisor has to expose those extensions to its virtual machines. In principle nesting works without those extensions too, but with poor performance; that is not an option for production environments, though it may be sufficient for some test cases. On Intel CPUs, exposing those extensions requires kernel 3 or higher, i.e. it is available in Proxmox VE 4.x and later, but not by default in older versions.

You will need to allocate plenty of CPU, RAM and disk space for those guest hypervisors.

Proxmox VE and nesting

Proxmox VE can:

  • host a nested (guest) hypervisor

By default, it does not expose hardware-assisted virtualization extensions to its VMs. Do not expect optimal performance for virtual machines on the guest hypervisor unless you configure the VM's CPU type as "host" and have nested hardware-assisted virtualization enabled on the physical PVE host.

Note: Microsoft Hyper-V as a nested hypervisor on AMD CPUs should work with Proxmox VE 7.

  • be hosted as a nested (guest) hypervisor

The host hypervisor needs to expose the hardware-assisted virtualization extensions. Proxmox VE can then use them to provide better performance to its guests. Otherwise, as in the PVE-inside-PVE case, KVM hardware virtualization has to be turned off for every VM (KVM) on the guest hypervisor (see VM options).

Enable Nested Hardware-assisted Virtualization

Note: VMs with nesting active (vmx/svm flag) cannot be live-migrated!

The following is to be done on the physical PVE host (or any other hypervisor).

To have nested hardware-assisted virtualization, you have to:

  • use an AMD CPU or a very recent Intel one
  • use kernel >= 3.10 (always the case after Proxmox VE 4.x)
  • enable nested support

To check whether it is enabled, run the following ("kvm_intel" for Intel CPUs, "kvm_amd" for AMD):

 root@proxmox:~# cat /sys/module/kvm_intel/parameters/nested   
N


N means it is not enabled. To activate it on Intel ("kvm-intel"):

 # echo "options kvm-intel nested=Y" > /etc/modprobe.d/kvm-intel.conf

or on AMD ("kvm-amd", note the 1 instead of Y):

 # echo "options kvm-amd nested=1" > /etc/modprobe.d/kvm-amd.conf

Then reboot, or reload the kernel module:

 # modprobe -r kvm_intel
 # modprobe kvm_intel

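Check again:

 root@proxmox:~# cat /sys/module/kvm_intel/parameters/nested
 Y

(Pay attention to where the dash "-" is used and where the underscore "_" is used instead: in the commands above, the module name under /sys/module and for modprobe is written with "_", while the options line and the file name in /etc/modprobe.d are written with "-".)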
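The check and the vendor-specific options line described above can be sketched as a small script. This is only an illustration: the function names (nested_enabled, nested_option_line, report_nested) are made up for this example, and the script only reads the parameter files, it does not write anything under /etc/modprobe.d.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of Proxmox VE.

# Y (kvm_intel) or 1 (kvm_amd) means nested support is enabled.
nested_enabled() {
    case "$1" in
        Y|y|1) return 0 ;;
        *)     return 1 ;;
    esac
}

# Map a CPU vendor_id (as shown in /proc/cpuinfo) to the options line
# used in this article. Note the dash in the module name here.
nested_option_line() {
    case "$1" in
        GenuineIntel) echo "options kvm-intel nested=Y" ;;
        AuthenticAMD) echo "options kvm-amd nested=1" ;;
        *) echo "unsupported vendor: $1" >&2; return 1 ;;
    esac
}

# Print the nested setting of every loaded KVM vendor module.
# The optional argument replaces /sys/module, which helps when testing.
report_nested() {
    base="${1:-/sys/module}"
    for mod in kvm_intel kvm_amd; do
        param="$base/$mod/parameters/nested"
        if [ -r "$param" ]; then
            value=$(cat "$param")
            if nested_enabled "$value"; then
                echo "$mod: nested enabled ($value)"
            else
                echo "$mod: nested disabled ($value)"
            fi
        fi
    done
}

report_nested "$@"
```

On a host without a loaded KVM vendor module the script prints nothing.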

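Then create a guest in which you install a hypervisor, e.g. Proxmox VE, as the nested virtualization environment. Set the CPU type of that VM to "host"; this is a setting of the VM, so the command is run on the physical host:

 root@proxmox:~# qm set <vmid> --cpu host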
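To confirm that the CPU type was applied, one option is to read it back from the VM's configuration. The following is a minimal sketch, assuming the configuration is a plain text file containing a "cpu:" line (on a Proxmox VE host this is typically /etc/pve/qemu-server/<vmid>.conf); the function name vm_cpu_type is hypothetical.

```shell
#!/bin/sh
# Sketch only: vm_cpu_type is a made-up helper name.
# Prints the CPU type from the first "cpu:" line of a VM config file,
# or "unset" if the file has no such line. Extra options after a comma
# (for example flags) are ignored.
vm_cpu_type() {
    conf=$1
    cpu=$(sed -n 's/^cpu:[[:space:]]*\([^,]*\).*/\1/p' "$conf" | head -n 1)
    echo "${cpu:-unset}"
}

# Example (path is an assumption, replace <vmid> with a real ID):
#   vm_cpu_type /etc/pve/qemu-server/<vmid>.conf
```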

Once the guest OS is installed, and if it is GNU/Linux, you can log in and verify that hardware virtualization support is enabled by running:

 root@guest1# grep -E '(vmx|svm)' --color=always /proc/cpuinfo
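The same check can be wrapped in a small function that reports which flag, if any, the guest sees. This is a sketch under the assumption that the flag appears as a separate word in /proc/cpuinfo; the function name virt_flag is hypothetical.

```shell
#!/bin/sh
# Sketch only: virt_flag is a made-up helper name.
# Prints "vmx" (Intel), "svm" (AMD) or "none", depending on which
# hardware virtualization flag is found in the given cpuinfo file.
virt_flag() {
    cpuinfo="${1:-/proc/cpuinfo}"
    flag=$(grep -E -o -w -m1 'vmx|svm' "$cpuinfo" | head -n 1)
    echo "${flag:-none}"
}

# Without an argument the function reads /proc/cpuinfo of the current system.
virt_flag "$@"
```

If the function prints "none" inside the guest, the host is most likely not exposing the extensions to this VM.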

Example: PVE hosts a PVE guest hypervisor

Set a cluster of self-nested PVE

On the physical Proxmox VE host you create two VMs and install a new instance of PVE in each one, so you can experiment with cluster concepts without needing multiple physical servers.

  • log into (web GUI) your host PVE (running on real hardware)
 => PVE
  • create two or more VM guests (KVM) in your host PVE, each with enough RAM/disk, and install PVE from ISO on each guest VM (same network)
 => PVE => VMPVE1 (guest PVE)
 => PVE => VMPVE2 (guest PVE)
 ...
  • log into (ssh/console) the first guest VM and create the cluster CLUSTERNAME
 => PVE => VMPVE1 (guest PVE) => #pvecm create CLUSTERNAME
  • log into each other guest VM and join the cluster <CLUSTERNAME>
 => PVE => VMPVE2 (guest PVE) => #pvecm add <IP address of VM1>
  • log into (web GUI) any guest VM (guest PVE) and manage the new (guest) cluster
 => PVE => VMPVE1/2 (guest PVE) => #pvecm n
  • create VMs or CTs inside the guest PVE (nodes of CLUSTERNAME)
    • if you didn't enable hardware-assisted nested virtualization, you have to turn off KVM hardware virtualization (see VM options)
    • install only small, CLI-based CTs or VMs as those guests (do not try anything with a GUI, don't even think of running Windows...)
 => PVE => VMPVE1/2 (guest PVE) => VM/CT
  • install something on a VM (for example a basic Ubuntu server) from ISO
 => PVE => VMPVE2 (guest PVE) => VM (basic Ubuntu server)

VM/CT performance without hardware-assisted virtualization extensions

If you can't set up hardware-assisted virtualization extensions for the guest, performance is far from optimal! Use this only to practice or test!

  • CTs (LXC) will be faster and quite usable
  • VMs (KVM) will be really slow, nearly unusable (you can expect 10x slower or more), since (as said above) they are running without KVM hardware virtualization

But at least you can try or test "guest PVE" features or setups:

  • you could create a small test cluster to practice with cluster concepts and operations
  • you could test a new PVE version before upgrading
  • you could test setups conflicting with your production setup

Troubleshooting

Bluescreen at boot since Windows 10 1803

To fix this issue, set /sys/module/kvm/parameters/ignore_msrs to Y:

  • temporary

 # echo "Y" > /sys/module/kvm/parameters/ignore_msrs

  • persistent

 # echo "options kvm ignore_msrs=1" >> /etc/modprobe.d/kvm.conf
 # modprobe -r kvm
 # modprobe kvm
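Because the persistent variant appends with ">>", running it twice leaves a duplicate line in the file. A minimal sketch of an idempotent append follows; the function name ensure_line is hypothetical, and the call against /etc/modprobe.d/kvm.conf is shown only as a comment so the sketch itself changes nothing.

```shell
#!/bin/sh
# Sketch only: ensure_line is a made-up helper name.
# Appends a line to a file unless exactly that line is already present.
ensure_line() {
    line=$1
    file=$2
    # -q: quiet, -x: match the whole line, -F: treat the pattern as fixed text
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example, to be run as root on the host:
#   ensure_line "options kvm ignore_msrs=1" /etc/modprobe.d/kvm.conf
```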