Nested Virtualization
What is it
Nested virtualization means running a hypervisor, such as PVE, inside a virtual machine (which itself runs on another hypervisor) instead of on real hardware. In other words, a host hypervisor hosts a guest hypervisor (as a VM), which can in turn host its own VMs.
This adds overhead to the nested environment, but it can be useful in some cases:
- it lets you test (or learn) how to manage hypervisors before actual implementation, or test a dangerous/tricky procedure involving hypervisors before doing it on the real thing.
- it could enable businesses to deploy their own virtualization environment, e.g. on public services (cloud), see also http://www.ibm.com/developerworks/cloud/library/cl-nestedvirtualization/
Requirements
To achieve the best possible, close-to-native performance, a hypervisor needs access to some (real) hardware features that are generally useful for virtualization, the so-called 'hardware-assisted virtualization extensions' (see http://en.wikipedia.org/wiki/Hardware-assisted_virtualization).
In nested virtualization, the guest hypervisor also needs access to the hardware-assisted virtualization extensions, which means the host hypervisor must expose those extensions to its virtual machines. In principle nesting works without those extensions too, but performance is poor, so it is not an option for production environments (though it may be sufficient for some test cases). On Intel CPUs, exposing those extensions requires kernel 3 or higher, i.e. it is available in Proxmox VE 4.x and newer, but not by default in older versions.
You will need to allocate plenty of CPU, RAM and disk space for those guest hypervisors.
Proxmox VE and nesting
Proxmox VE can:
- host a nested (guest) hypervisor
By default, it does not expose hardware-assisted virtualization extensions to its VMs. Do not expect optimal performance for virtual machines on the guest hypervisor, unless you configure the VM's CPU as "host" and have nested hardware-assisted virtualization extensions enabled on the physical PVE host.
Note: Microsoft Hyper-V as a nested hypervisor on AMD CPUs should work with Proxmox VE 7.
- be hosted as a nested (guest) hypervisor
The host hypervisor needs to expose the hardware-assisted virtualization extensions. Proxmox VE can use them to provide better performance to its guests. Otherwise, as in the PVE-inside-PVE case, any VM (KVM) needs to turn off the KVM hardware virtualization (see VM options).
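As a rough sketch of what turning off KVM hardware virtualization can look like from the command line: the "KVM hardware virtualization" entry under VM options corresponds to the `kvm` setting of `qm set`. The helper name `disable_kvm_cmd` and the VMID 100 are made up for illustration, and the snippet only prints the command instead of running it.

```shell
#!/bin/sh
# Build (but do not run) the command that turns off KVM hardware
# virtualization for one VM. disable_kvm_cmd is a hypothetical helper name.
disable_kvm_cmd() {
    echo "qm set $1 --kvm 0"
}

# 100 is a placeholder VMID, not taken from the article.
disable_kvm_cmd 100
```

Running the printed command on the hypervisor that owns the VM should apply the change; `--kvm 1` turns hardware virtualization back on.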
Enable Nested Hardware-assisted Virtualization
Note: VMs with nesting active (vmx/svm flag) cannot be live-migrated!
To be done on the physical PVE host (or any other hypervisor).
To have nested hardware-assisted virtualization, you have to:
- use an AMD CPU or a very recent Intel one
- use kernel >= 3.10 (always the case since Proxmox VE 4.x)
- enable nested support
To check whether it is enabled, run ("kvm_intel" for Intel CPUs, "kvm_amd" for AMD):
root@proxmox:~# cat /sys/module/kvm_intel/parameters/nested
N
N means it is not enabled. To activate it ("kvm-intel" for Intel):
# echo "options kvm-intel nested=Y" > /etc/modprobe.d/kvm-intel.conf
(or "kvm-amd" for AMD, note the 1 instead of Y):
# echo "options kvm-amd nested=1" > /etc/modprobe.d/kvm-amd.conf
and reboot or reload the kernel module (reloading only works while no VM is running):
modprobe -r kvm_intel
modprobe kvm_intel
check again:
root@proxmox:~# cat /sys/module/kvm_intel/parameters/nested
Y
(pay attention where the dash "-" is used, and where it's underscore "_" instead)
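The manual check above can be wrapped in a small helper that works for both vendors. This is only a sketch: the function name `nested_status` and its optional directory argument are assumptions added here (the argument exists so the function can be tried against a scratch directory), and it accepts both the Y/N and the 1/0 spelling of the parameter.

```shell
#!/bin/sh
# Report whether nested virtualization is enabled for the loaded KVM module.
# $1 (optional, hypothetical): directory to look in, defaults to /sys/module.
nested_status() {
    root="${1:-/sys/module}"
    for mod in kvm_intel kvm_amd; do
        param="$root/$mod/parameters/nested"
        if [ -r "$param" ]; then
            value=$(cat "$param")
            case "$value" in
                Y|y|1) echo "$mod: nested enabled" ;;
                *)     echo "$mod: nested disabled" ;;
            esac
            return 0
        fi
    done
    echo "no kvm_intel or kvm_amd module loaded"
    return 1
}

# Check the running system; do not abort if no KVM module is loaded.
nested_status || true
```

Note that the module directory under /sys/module always uses underscores, while the modprobe configuration lines use dashes.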
Then create a guest in which you install e.g. Proxmox VE as the nested virtualization environment. Set the CPU type to "host" (run this on the physical host that manages the VM):
root@proxmox:~# qm set <vmid> --cpu host
Once the guest OS is installed, if it is GNU/Linux, you can log in and verify that hardware virtualization support is enabled by running:
root@guest1# grep -E --color=always '(vmx|svm)' /proc/cpuinfo
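For scripted checks inside the guest, the same lookup can return a single word instead of highlighted output. The following is a minimal sketch; the function name `virt_flag` and its optional file argument are assumptions made for illustration, and only the "flags" lines of the cpuinfo file are inspected.

```shell
#!/bin/sh
# Print which hardware virtualization flag the CPU exposes: vmx, svm or none.
# $1 (optional, hypothetical): cpuinfo file to read, defaults to /proc/cpuinfo.
virt_flag() {
    cpuinfo="${1:-/proc/cpuinfo}"
    if grep -E '^flags' "$cpuinfo" | grep -qw vmx; then
        echo vmx
    elif grep -E '^flags' "$cpuinfo" | grep -qw svm; then
        echo svm
    else
        echo none
        return 1
    fi
}

# Check the running system; do not abort if no flag is found.
virt_flag || true
```

An output of "none" inside the guest suggests that either nesting is not enabled on the physical host or the VM's CPU type is not set to "host".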
Example: PVE hosts a PVE guest hypervisor
Set up a cluster of self-nested PVE
On the physical Proxmox host you create two VMs and install a new instance of PVE in each one, so you can experiment with cluster concepts without needing multiple physical servers.
- log into (web gui) your host pve (running on real hardware)
=> PVE
- create two or more VM guests (KVM) in your host PVE, each with enough RAM/disk, and install PVE from ISO on each guest VM (same network)
=> PVE => VMPVE1 (guest PVE)
=> PVE => VMPVE2 (guest PVE)
...
- log into (ssh/console) the first guest vm & create cluster CLUSTERNAME
=> PVE => VMPVE1 (guest PVE) => #pvecm create CLUSTERNAME
- log into each of the other guest VMs & join cluster <CLUSTERNAME>
=> PVE => VMPVE2 (guest PVE) => #pvecm add <IP address of VM1>
- log into (web gui) any guest vm (guest pve) and manage the new (guest) cluster
=> PVE => VMPVE1/2 (guest PVE) => #pvecm nodes
- create vm or ct inside the guest pve (nodes of CLUSTERNAME)
- if you didn't enable hardware-assisted nested virtualization, you have to turn off KVM hardware virtualization (see VM options)
- install only small, CLI-based CTs or VMs in those guests (do not try anything with a GUI, don't even think of running Windows...)
=> PVE => VMPVE1/2 (guest PVE) => VM/CT
- install something on a VM (for example a basic ubuntu server) from ISO
=> PVE => VMPVE2 (guest PVE) => VM (basic ubuntu server)
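The cluster part of the steps above can be summarized as a command plan. The sketch below only prints the `pvecm` commands and the guest they belong on; it does not run anything. The function name `cluster_plan`, the cluster name and the IP address (taken from a documentation range) are placeholders, not values from this article.

```shell
#!/bin/sh
# Print the pvecm commands for a nested test cluster; nothing is executed.
# $1: cluster name, $2: IP of the first guest PVE, $3: number of joining nodes.
cluster_plan() {
    name="$1"
    first_ip="$2"
    joiners="$3"
    echo "VMPVE1# pvecm create $name"
    i=2
    while [ "$i" -le $((joiners + 1)) ]; do
        echo "VMPVE$i# pvecm add $first_ip"
        i=$((i + 1))
    done
    echo "VMPVE1# pvecm nodes"
}

# Placeholder values: one node creates the cluster, one node joins it.
cluster_plan testcluster 192.0.2.11 1
```

Each printed line has to be typed on the guest named in its prompt; `pvecm add` asks for the root password of the first node when it is run.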
VM/CT performance without hardware-assisted virtualization extensions
If you can't set up hardware-assisted virtualization extensions for the guest, performance is far from optimal! Use this only to practice or test!
- CT (LXC) will of course be faster and quite usable
- VM (KVM) will be really slow, nearly unusable (expect 10x slower or more), since (as said above) they run without KVM hardware virtualization
but at least you can try or test "guest pve" features or setups:
- you could create a small test cluster to practice with cluster concepts and operations
- you could test a new pve version before upgrading
- you could test setups conflicting with your production setup
Troubleshooting
Bluescreen at boot since Windows 10 1803
To fix this issue set /sys/module/kvm/parameters/ignore_msrs to Y
- temporary
echo "Y" > /sys/module/kvm/parameters/ignore_msrs
- persistent
echo "options kvm ignore_msrs=1" >> /etc/modprobe.d/kvm.conf
modprobe -r kvm
modprobe kvm
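Because the persistent variant appends with ">>", running it twice leaves a duplicate line in the configuration file. One possible way to avoid that is to append only when the exact line is missing. This is a sketch: the helper name `ensure_line` is made up here, and the demonstration writes to a scratch file rather than to /etc/modprobe.d/kvm.conf.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
# $1: config file, $2: line to ensure. ensure_line is a hypothetical helper.
ensure_line() {
    file="$1"
    line="$2"
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demonstration on a scratch file; on a real host the target would be
# /etc/modprobe.d/kvm.conf.
conf=$(mktemp)
ensure_line "$conf" "options kvm ignore_msrs=1"
ensure_line "$conf" "options kvm ignore_msrs=1"
cat "$conf"    # the option line appears once, despite two calls
rm -f "$conf"
```

The module still has to be reloaded (or the host rebooted) afterwards for the option to take effect.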