PCI Passthrough: Difference between revisions
Revision as of 14:55, 23 April 2020
Introduction
PCI passthrough allows you to use a physical PCI device (graphics card, network card) inside a VM (KVM virtualization only).
If you "PCI passthrough" a device, the device is not available to the host anymore.
Note:
PCI passthrough is an experimental feature in Proxmox VE
Enable the IOMMU
You need to enable the IOMMU, by editing the kernel commandline.
First open your bootloader kernel command line config file, for grub:
nano /etc/default/grub
or, for systemd-boot:
nano /etc/kernel/cmdline
For GRUB, find the line with "GRUB_CMDLINE_LINUX_DEFAULT"; for systemd-boot, create the file (its format is a single line with options).
Intel CPU
For Intel CPUs add "intel_iommu=on", for example:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
Save the changes and update grub:
update-grub
or:
pve-efiboot-tool refresh
Then reboot. Afterwards, run "dmesg | grep -e DMAR -e IOMMU" from the command line. If there is no output, something is wrong.
AMD CPU
For AMD CPUs add "amd_iommu=on", for example:
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on"
Save the changes and update grub:
update-grub
or:
pve-efiboot-tool refresh
Then reboot. Afterwards, run "dmesg | grep -e DMAR -e IOMMU" from the command line. If there is no output, something is wrong.
PT Mode
Both Intel and AMD chips can use the additional parameter "iommu=pt", added in the same way as above.
This enables the IOMMU translation only when necessary, and can thus improve performance for PCIe devices not used in VMs.
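As a rough sketch of the GRUB edit described above, the following shell function appends a parameter to the "GRUB_CMDLINE_LINUX_DEFAULT" line of a given file unless it is already there. The function name add_kernel_param is hypothetical (it is not part of Proxmox VE), and the sketch assumes GNU sed and a double-quoted value on a single line.

```shell
# add_kernel_param FILE PARAM  (hypothetical helper, not a Proxmox tool)
# Appends PARAM to GRUB_CMDLINE_LINUX_DEFAULT="..." in FILE if it is missing.
add_kernel_param() {
    file=$1
    param=$2
    # Extract the current value between the double quotes.
    current=$(sed -n 's/^GRUB_CMDLINE_LINUX_DEFAULT="\(.*\)"$/\1/p' "$file")
    case " $current " in
        *" $param "*) return 0 ;;   # already present, nothing to do
    esac
    # Rewrite the line with PARAM appended (GNU sed in-place edit).
    sed -i "s|^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"\$|GRUB_CMDLINE_LINUX_DEFAULT=\"\1 $param\"|" "$file"
}
```

It could be called as "add_kernel_param /etc/default/grub intel_iommu=on" and again with "iommu=pt", followed by "update-grub" and a reboot.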
Required Modules
Add to /etc/modules:
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
Note that in the 5.4 based kernel (will be used for Proxmox VE 6.2 in Q2/2020) some of those modules are already built into the kernel directly.
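To avoid duplicate entries when repeating this step, the modules can be added only when they are missing. This is a minimal sketch; the function name ensure_modules is hypothetical and the file path is passed in as an argument.

```shell
# ensure_modules FILE MODULE...  (hypothetical helper)
# Appends each MODULE to FILE on its own line unless that exact line exists.
ensure_modules() {
    file=$1
    shift
    for mod in "$@"; do
        # -x matches whole lines only, so "vfio" does not match "vfio_pci".
        grep -qx -- "$mod" "$file" 2>/dev/null || echo "$mod" >> "$file"
    done
}
```

For example: "ensure_modules /etc/modules vfio vfio_iommu_type1 vfio_pci vfio_virqfd".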
IOMMU Interrupt Remapping
It will not be possible to use PCI passthrough without interrupt remapping. Device assignment will fail with 'Failed to assign device "[device name]": Operation not permitted' or 'Interrupt Remapping hardware not found, passing devices to unprivileged domains is insecure.' error.
All systems using an Intel processor and chipset that have support for Intel Virtualization Technology for Directed I/O (VT-d), but do not have support for interrupt remapping will see such an error. Interrupt remapping support is provided in newer processors and chipsets (both AMD and Intel).
To identify if your system has support for interrupt remapping:
dmesg | grep 'remapping'
If you see one of the following lines:
- "AMD-Vi: Interrupt remapping enabled"
- "DMAR-IR: Enabled IRQ remapping in x2apic mode" ('x2apic' can be different on old CPUs, but should still work)
then remapping is supported.
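The check above can be wrapped in a small function that reads kernel log text on standard input and looks for either of the two messages. This is only a sketch; the name check_remapping and the wording of its output are made up for illustration.

```shell
# check_remapping  (hypothetical helper)
# Reads dmesg output on stdin; succeeds if a remapping message is found.
check_remapping() {
    if grep -q -e 'AMD-Vi: Interrupt remapping enabled' \
               -e 'DMAR-IR: Enabled IRQ remapping'; then
        echo "interrupt remapping: supported"
    else
        echo "interrupt remapping: not found"
        return 1
    fi
}
```

On the host it would be used as "dmesg | check_remapping".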
If your system doesn't support interrupt remapping, you can allow unsafe interrupts with:
echo "options vfio_iommu_type1 allow_unsafe_interrupts=1" > /etc/modprobe.d/iommu_unsafe_interrupts.conf
Verify IOMMU Isolation
For working PCI passthrough, you need a dedicated IOMMU group for all PCI devices you want to assign to a VM.
You should have something like:
# find /sys/kernel/iommu_groups/ -type l
/sys/kernel/iommu_groups/0/devices/0000:00:00.0
/sys/kernel/iommu_groups/1/devices/0000:00:01.0
/sys/kernel/iommu_groups/1/devices/0000:01:00.0
/sys/kernel/iommu_groups/1/devices/0000:01:00.1
/sys/kernel/iommu_groups/2/devices/0000:00:02.0
/sys/kernel/iommu_groups/3/devices/0000:00:16.0
/sys/kernel/iommu_groups/4/devices/0000:00:1a.0
/sys/kernel/iommu_groups/5/devices/0000:00:1b.0
/sys/kernel/iommu_groups/6/devices/0000:00:1c.0
/sys/kernel/iommu_groups/7/devices/0000:00:1c.5
/sys/kernel/iommu_groups/8/devices/0000:00:1c.6
/sys/kernel/iommu_groups/9/devices/0000:00:1c.7
/sys/kernel/iommu_groups/9/devices/0000:05:00.0
/sys/kernel/iommu_groups/10/devices/0000:00:1d.0
/sys/kernel/iommu_groups/11/devices/0000:00:1f.0
/sys/kernel/iommu_groups/11/devices/0000:00:1f.2
/sys/kernel/iommu_groups/11/devices/0000:00:1f.3
/sys/kernel/iommu_groups/12/devices/0000:02:00.0
/sys/kernel/iommu_groups/12/devices/0000:02:00.1
/sys/kernel/iommu_groups/13/devices/0000:03:00.0
/sys/kernel/iommu_groups/14/devices/0000:04:00.0
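To see which devices share a group with the card you want to pass through, a listing like the one above can be narrowed to a single device. The sketch below assumes the directory layout shown above; the function name iommu_group_members is hypothetical, and the optional second argument (an alternative root directory) exists only to make the function easy to try out.

```shell
# iommu_group_members ADDRESS [ROOT]  (hypothetical helper)
# Prints every device that shares an IOMMU group with ADDRESS,
# e.g. ADDRESS=0000:01:00.0. ROOT defaults to /sys/kernel/iommu_groups.
iommu_group_members() {
    addr=$1
    root=${2:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/"$addr"; do
        # Skip the unexpanded pattern when the device is in no group.
        [ -e "$dev" ] || [ -L "$dev" ] || continue
        ls "$(dirname "$dev")"
    done
}
```

Running "iommu_group_members 0000:01:00.0" against the example listing would print 0000:00:01.0, 0000:01:00.0 and 0000:01:00.1, since those three entries are in group 1.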
To have separate IOMMU groups, your processor needs to have support for a feature called ACS (Access Control Services).
All Xeon processors (E3, E5) support it, excluding the Xeon E3-1200.
For Intel Core it's different: only some processors support ACS. Anything newer than listed below should support ACS, as long as VT-d is supported. See https://ark.intel.com for more info.
Haswell-E (LGA2011-v3)
i7-5960X (8-core, 3/3.5GHz)
i7-5930K (6-core, 3.2/3.8GHz)
i7-5820K (6-core, 3.3/3.6GHz)
Ivy Bridge-E (LGA2011)
i7-4960X (6-core, 3.6/4GHz)
i7-4930K (6-core, 3.4/3.6GHz)
i7-4820K (4-core, 3.7/3.9GHz)
Sandy Bridge-E (LGA2011)
i7-3960X (6-core, 3.3/3.9GHz)
i7-3970X (6-core, 3.5/4GHz)
i7-3930K (6-core, 3.2/3.8GHz)
i7-3820 (4-core, 3.6/3.8GHz)
AMD chips from Ryzen 1st generation and newer are fine too.
If you don't have dedicated IOMMU groups, you can try:
1) moving the card to another pci slot
2) adding "pcie_acs_override=downstream" to the kernel boot commandline (grub or systemd-boot) options, which can help on some setups with a bad ACS implementation. See the documentation about editing the kernel commandline.
More information:
http://vfio.blogspot.be/2015/10/intel-processors-with-acs-support.html
http://vfio.blogspot.be/2014/08/iommu-groups-inside-and-out.html
Determine your PCI card address, and configure your VM
The easiest way is to use the GUI to add a device of type "Host PCI" in the VM's hardware tab.
Alternatively, you can use the command line:
Locate your card using "lspci". The address should be in the form of: 01:00.0
Edit the <vmid>.conf file. It can be located at: /etc/pve/qemu-server/<vmid>.conf.
Add this line to the end of the file:
hostpci0: 01:00.0
If you have a multi-function device (like a vga card with embedded audio chipset), you can pass all functions manually with:
hostpci0: 01:00.0;01:00.1
or, to pass all functions automatically:
hostpci0: 01:00
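To find the address used in the lines above, the "lspci" output can be filtered by a keyword. The following is a small sketch; the function name find_pci_addr is hypothetical, and it relies on the address being the first space-separated field of each "lspci" line.

```shell
# find_pci_addr PATTERN  (hypothetical helper)
# Reads lspci output on stdin and prints the address (first field)
# of every line matching PATTERN, ignoring case.
find_pci_addr() {
    grep -i -- "$1" | cut -d' ' -f1
}
```

For example, "lspci | find_pci_addr nvidia" would print one address per function of an NVIDIA card, such as 01:00.0 and 01:00.1.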
PCI Express Passthrough
Check the "PCI-E" checkbox in the GUI when adding your device, or manually add the pcie=1 parameter to your VM config:
machine: q35
hostpci0: 01:00.0,pcie=1
PCIe passthrough is only supported on Q35 machines.
Note that this does not mean that devices assigned without this setting will only have PCI speeds. It just sets a flag for the guest to tell it that the device is a PCIe device instead of a "really-fast legacy PCI device". Some guest applications benefit from this.
GPU Passthrough
Note: See http://blog.quindorian.org/2018/03/building-a-2u-amd-ryzen-server-proxmox-gpu-passthrough.html/ if you prefer an article with a HOWTO approach.
- AMD RADEON 5xxx, 6xxx, 7xxx, Navi 5XXX(XT), NVIDIA GEFORCE 7, 8, GTX 4xx, 5xx, 6xx, 7xx, 9xx, 10xx and RTX 16xx/20xx have been reported working.
- You might need to set specific options in grub.cfg or tune other values to get your particular configuration working and stable.
- Here's a good Arch Linux forum thread: https://bbs.archlinux.org/viewtopic.php?id=162768
For a GPU, it's often helpful if the host doesn't try to use the GPU, which avoids issues with the host driver unbinding and re-binding to the device. Sometimes making sure the host BIOS POST messages are displayed on a different GPU is helpful too. This can sometimes be accomplished via BIOS settings, moving the card to a different slot or enabling/disabling legacy boot support.
First, find the device and vendor id of your vga card:
$ lspci -n -s 01:00
01:00.0 0300: 10de:1381 (rev a2)
01:00.1 0403: 10de:0fbc (rev a1)
The Vendor:Device IDs for this GPU and its audio function are therefore 10de:1381 and 10de:0fbc.
Then, create a file:
echo "options vfio-pci ids=10de:1381,10de:0fbc" > /etc/modprobe.d/vfio.conf
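The ID list in the command above can also be derived from the "lspci -n" output instead of being copied by hand. This is a sketch under the assumption that the vendor:device pair is the third whitespace-separated field, as in the example output; the function name vfio_ids is hypothetical.

```shell
# vfio_ids  (hypothetical helper)
# Reads `lspci -n -s <address>` output on stdin and prints the
# vendor:device pairs as one comma-separated list.
vfio_ids() {
    # Field 3 holds vendor:device; paste joins the lines with commas.
    awk '{ print $3 }' | paste -sd, -
}
```

For example: echo "options vfio-pci ids=$(lspci -n -s 01:00 | vfio_ids)" > /etc/modprobe.d/vfio.conf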
Blacklist the drivers:
echo "blacklist radeon" >> /etc/modprobe.d/blacklist.conf
echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf
echo "blacklist nvidia" >> /etc/modprobe.d/blacklist.conf
and reboot your machine.
There are four possible VM configurations:
GPU OVMF PCI Passthrough (recommended)
Select "OVMF" as "BIOS" for your VM instead of the default "SeaBIOS". You need to install your guest OS with UEFI support (for Windows, use Windows 8 or newer).
Using OVMF, you can also add disable_vga=1 to the vfio-pci module options, which tries to opt devices out of VGA arbitration if possible:
echo "options vfio-pci ids=10de:1381,10de:0fbc disable_vga=1" > /etc/modprobe.d/vfio.conf
You also need to make sure your graphics card has a UEFI-bootable ROM:
http://vfio.blogspot.fr/2014/08/does-my-graphics-card-rom-support-efi.html
bios: ovmf
scsihw: virtio-scsi-pci
bootdisk: scsi0
scsi0: .....
hostpci0: 01:00,x-vga=on
GPU OVMF PCI Express Passthrough
Same as above, but set machine type to q35 and enable pcie=1:
bios: ovmf
scsihw: virtio-scsi-pci
bootdisk: scsi0
scsi0: .....
machine: q35
hostpci0: 01:00,pcie=1,x-vga=on
GPU Seabios PCI Passthrough
hostpci0: 01:00,x-vga=on
GPU Seabios PCI Express Passthrough
machine: q35
hostpci0: 01:00,pcie=1,x-vga=on
How to know if a Graphics Card is UEFI (OVMF) compatible
Get and compile the software "rom-parser":
$ git clone https://github.com/awilliam/rom-parser
$ cd rom-parser
$ make
Then dump the ROM of your VGA card:
# cd /sys/bus/pci/devices/0000:01:00.0/
# echo 1 > rom
# cat rom > /tmp/image.rom
# echo 0 > rom
and test it with:
./rom-parser /tmp/image.rom
Valid ROM signature found @0h, PCIR offset 190h
PCIR: type 0, vendor: 10de, device: 1280, class: 030000
PCIR: revision 0, vendor revision: 1
Valid ROM signature found @f400h, PCIR offset 1ch
PCIR: type 3, vendor: 10de, device: 1280, class: 030000
PCIR: revision 3, vendor revision: 0
EFI: Signature Valid
Last image
To be UEFI compatible, you need a "type 3" in the result.
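The "type 3" check can be scripted by searching the rom-parser output for that entry. This sketch assumes the output format shown above; the function name rom_has_efi is hypothetical.

```shell
# rom_has_efi  (hypothetical helper)
# Reads rom-parser output on stdin; succeeds if a "type 3" image is listed.
rom_has_efi() {
    grep -q 'PCIR: type 3'
}
```

For example: ./rom-parser /tmp/image.rom | rom_has_efi && echo "UEFI compatible"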
NVIDIA Tips
Some Windows applications like GeForce Experience, Passmark Performance Test and SiSoftware Sandra can crash the VM. You need to add:
echo "options kvm ignore_msrs=1" > /etc/modprobe.d/kvm.conf
If you see a lot of warning messages in your 'dmesg' system log, add the following instead:
echo "options kvm ignore_msrs=1 report_ignored_msrs=0" > /etc/modprobe.d/kvm.conf
Users have reported that NVIDIA Kepler K80 GPUs need this in vmid.conf:
args: -machine pc,max-ram-below-4g=1G
The 'romfile' Option
http://lime-technology.com/forum/index.php?topic=43644.msg482110#msg482110
Some motherboards can't pass through GPUs in the first PCI(e) slot by default, because the card's vBIOS is shadowed during bootup. You need to capture its vBIOS while it's working "normally" (i.e. installed in a different slot); then you can move the card to slot 1 and start the VM using the dumped vBIOS.
To dump the bios:
cd /sys/bus/pci/devices/0000:01:00.0/
echo 1 > rom
cat rom > /usr/share/kvm/vbios.bin
echo 0 > rom
Then you can pass the vbios file (must be located in /usr/share/kvm/) with:
hostpci0: 01:00,x-vga=on,romfile=vbios.bin
Troubleshooting
BAR 3: can't reserve [mem] error
If you have this error when you try to use the card for a VM:
vfio-pci 0000:04:00.0: BAR 3: can't reserve [mem 0xca000000-0xcbffffff 64bit]
you can try to add the following kernel commandline option:
video=efifb:off
See the documentation about editing the kernel commandline.
SPICE
Spice may give trouble when passing through a GPU as it presents a "virtual" PCI graphic card to the guest and some drivers have problems with that, even when both cards show up. It's always worth a try to disable SPICE and check again if something fails.
HDMI Audio crackling/broken
Some digital audio devices (usually added via GPU functions) may require MSI (Message Signaled Interrupts) to be enabled to function correctly. If you experience any issues, try changing MSI settings in the guest and rebooting the guest.
A Windows tool that simplifies this is available here: https://github.com/CHEF-KOCH/MSI-utility/releases/latest
Linux guests usually enable MSI by themselves. To force use of MSI for GPU audio devices, use the following command and reboot:
echo "options snd-hda-intel enable_msi=1" >> /etc/modprobe.d/snd-hda-intel.conf
Use 'lspci -vv' and check for the following line on your device to see if MSI is enabled:
Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
If it says 'Enable+', MSI is working, 'Enable-' means it is supported but disabled, and if the line is missing, MSI is not supported by the PCIe hardware.
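The three cases above can be told apart with a short function that reads the "lspci -vv" output of one device. It is only a sketch; the function name msi_state and the three words it prints are chosen here for illustration.

```shell
# msi_state  (hypothetical helper)
# Reads `lspci -vv -s <address>` output on stdin and prints
# "enabled", "disabled" or "missing" for the MSI capability.
msi_state() {
    # The pattern does not match "MSI-X:" lines, only plain "MSI:".
    line=$(grep -m1 'MSI: Enable[+-]' || true)
    case $line in
        *'MSI: Enable+'*) echo "enabled" ;;
        *'MSI: Enable-'*) echo "disabled" ;;
        *) echo "missing" ;;
    esac
}
```

For example, in a Linux guest: "lspci -vv -s 01:00.1 | msi_state".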
This can potentially also improve performance for other passthrough devices, including GPUs, but that depends on the hardware being used.
BIOS options
Make sure you are using the most recent BIOS version for your mainboard. Often IOMMU groupings or passthrough support in general are improved in later versions.
Some general BIOS options that might need changing to allow passthrough to work:
- IOMMU or VT-d: Set to 'Enabled' or equivalent, often 'Auto' is not the same
- 'Legacy boot' or CSM: For GPU passthrough it can help to disable this, but keep in mind that PVE has to be installed in UEFI mode, as it will not boot in BIOS mode without this enabled. The reason for disabling this is that it avoids legacy VGA initialization of installed GPUs, making them able to be re-initialized later, as required for passthrough. Most useful when trying to use passthrough in single GPU systems.
Verify Operation
Start the VM and enter the qm monitor on the CLI: "qm monitor vmnumber"
Verify that your card is listed here: "info pci"
Then install drivers on your guest OS.
NOTE: Card support might be limited to 2 or 3 devices.
NOTE: A PCI device can only ever be attached to a single VM.
NOTE: This process will remove the card from the proxmox host OS as long as the VM it's attached to is running.
NOTE: Using PCI passthrough to present drives direct to a ZFS (FreeNAS, Openfiler, OmniOS) virtual machine is OK for testing, but not recommended for production use. Specific FreeNAS warnings can be found here: http://forums.freenas.org/threads/absolutely-must-virtualize-freenas-a-guide-to-not-completely-losing-your-data.12714/
USB Passthrough
If you need to pass through USB devices (keyboard, mouse), please follow this wiki article:
https://pve.proxmox.com/wiki/USB_physical_port_mapping