Difference between revisions of "Pci passthrough"

From Proxmox VE
 
(85 intermediate revisions by 10 users not shown)
Line 1: Line 1:
PCI pass trhough allows you to use a physical PCI device ( graphic card, network card ...)  form your host directly on the VM (KVM only)
+
== Introduction ==
If you "PCI passthrough" a device, the device is not available in the host anymore, only in the VM.
 
  
To enable PCI passthrough, you need to configure:
+
PCI passthrough allows you to use a physical PCI device (graphics card, network card) inside a VM (KVM virtualization only).
  
== INTEL CPU  ==
+
If you "PCI passthrough" a device, the device is not available to the host anymore.
  
<br> edit: <source lang="bash">
+
'''Note:'''
#vi /etc/default/grub
 
</source> change <source lang="bash">
 
GRUB_CMDLINE_LINUX_DEFAULT="quiet"
 
</source> to <source lang="bash">
 
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
 
</source> then <source lang="bash">
 
# update-grub
 
# reboot
 
</source>
 
  
<br>
+
PCI passthrough is an experimental feature in Proxmox VE! '''VMs with passed-through devices cannot be migrated.'''
  
Then run "dmesg | grep -e DMAR -e IOMMU" from the command line. &nbsp;If there is no output, then something is wrong.
+
== Enable the IOMMU ==
  
== AMD CPU==
+
You need to enable the IOMMU by [https://pve.proxmox.com/pve-docs/chapter-sysadmin.html#sysboot_edit_kernel_cmdline editing the kernel command line].
  
Edit:
+
First open your bootloader kernel command line config file.
  
<source lang="bash">
+
For '''GRUB''':
# vi /etc/default/grub
+
nano /etc/default/grub
</source>
 
  
Change: <source lang="bash">
+
Find the line with "GRUB_CMDLINE_LINUX_DEFAULT"
GRUB_CMDLINE_LINUX_DEFAULT="quiet"
 
</source> To: <source lang="bash">
 
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on"
 
</source> Then: <source lang="bash">
 
# update-grub
 
# echo "options kvm allow_unsafe_assigned_interrupts=1" > /etc/modprobe.d/kvm_iommu_map_guest.conf
 
# reboot
 
</source>
 
  
 +
For '''systemd-boot''':
 +
nano /etc/kernel/cmdline
  
 +
Its format is a single line with options. You can create the file for systemd-boot if not present.
  
== Determine your PCI card address, and configure your VM ==
+
=== Intel CPU ===
 +
 
 +
For Intel CPUs add
 +
  intel_iommu=on
 +
 
 +
==== GRUB ====
 +
 
 +
If you are using GRUB:
 +
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
 +
 
 +
Then save the changes and update grub:
 +
update-grub
 +
 
 +
==== systemd-boot ====
 +
If you use systemd-boot, add the following at the end of the first line:
 +
quiet intel_iommu=on
 +
 
 +
Then save the changes and update systemd-boot:
 +
proxmox-boot-tool refresh
 +
 
 +
=== AMD CPU ===
 +
 
 +
For AMD CPUs add
 +
  amd_iommu=on
 +
 
 +
==== GRUB ====
 +
 
 +
If you are using GRUB:
 +
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on"
 +
 
 +
Then save the changes and update grub:
 +
update-grub
 +
 
 +
==== systemd-boot ====
 +
If you are using systemd-boot, add the following at the end of the first line:
 +
quiet amd_iommu=on
 +
 
 +
Then save the changes and update systemd-boot:
 +
proxmox-boot-tool refresh
 +
 
 +
=== Verify IOMMU is enabled ===
 +
 
 +
Reboot, then run:
 +
dmesg | grep -e DMAR -e IOMMU
 +
 
 +
There should be a line that looks like "DMAR: IOMMU enabled". If there is no output, something is wrong.
 +
 
 +
=== PT Mode ===
 +
 
 +
Both Intel and AMD chips can use the additional parameter "iommu=pt", added in the same way as above to the kernel cmdline.
 +
 
 +
  iommu=pt
 +
 
 +
This enables IOMMU translation only when necessary: a device that is not passed through to a VM does not need DMA translation to memory, which can improve performance for PCIe devices used by the '''hypervisor''' itself.
 +
 
 +
== Required Modules ==
 +
Add the following to /etc/modules:
 +
 
 +
<pre>
 +
vfio
 +
vfio_iommu_type1
 +
vfio_pci
 +
vfio_virqfd
 +
</pre>
 +
 
 +
Note that in 5.4-based kernels, some of these modules are already built into the kernel.
 +
 
 +
== IOMMU Interrupt Remapping ==
 +
 
 +
It will not be possible to use PCI passthrough without interrupt remapping. Device assignment will fail with 'Failed to assign device "[device name]": Operation not permitted' or 'Interrupt Remapping hardware not found, passing devices to unprivileged domains is insecure.' error.
 +
 
 +
All systems using an Intel processor and chipset that have support for Intel Virtualization Technology for Directed I/O (VT-d), but do not have support for interrupt remapping will see such an error. Interrupt remapping support is provided in newer processors and chipsets (both AMD and Intel).
 +
 
 +
To identify if your system has support for interrupt remapping:
  
 +
<pre>
 +
dmesg | grep 'remapping'
 +
</pre>
  
 +
If you see one of the following lines:
  
Locate your card using "lspci". &nbsp;The address should be in the form of: 04:00.0
+
* "AMD-Vi: Interrupt remapping enabled"
 +
* "DMAR-IR: Enabled IRQ remapping in x2apic mode" ('x2apic' can be different on old CPUs, but should still work)
  
Manually edit the node.conf file. &nbsp;It can be located at:&nbsp;/etc/pve/nodes/proxmox3/qemu-server/vmnumber.conf.
+
then remapping is supported.
  
Add this line to the end of the file: "hostpci0: 04:00.0"
+
If your system doesn't support interrupt remapping, you can allow unsafe interrupts with:
  
 +
<pre>
 +
echo "options vfio_iommu_type1 allow_unsafe_interrupts=1" > /etc/modprobe.d/iommu_unsafe_interrupts.conf
 +
</pre>
  
 +
== Verify IOMMU Isolation==
  
== Verify Operation ==
+
For working PCI passthrough, you need a dedicated IOMMU group for all PCI devices you want to assign to a VM.
 +
 
 +
You should have something like:
 +
 
 +
<pre>
 +
# find /sys/kernel/iommu_groups/ -type l
 +
/sys/kernel/iommu_groups/0/devices/0000:00:00.0
 +
/sys/kernel/iommu_groups/1/devices/0000:00:01.0
 +
/sys/kernel/iommu_groups/1/devices/0000:01:00.0
 +
/sys/kernel/iommu_groups/1/devices/0000:01:00.1
 +
/sys/kernel/iommu_groups/2/devices/0000:00:02.0
 +
/sys/kernel/iommu_groups/3/devices/0000:00:16.0
 +
/sys/kernel/iommu_groups/4/devices/0000:00:1a.0
 +
/sys/kernel/iommu_groups/5/devices/0000:00:1b.0
 +
/sys/kernel/iommu_groups/6/devices/0000:00:1c.0
 +
/sys/kernel/iommu_groups/7/devices/0000:00:1c.5
 +
/sys/kernel/iommu_groups/8/devices/0000:00:1c.6
 +
/sys/kernel/iommu_groups/9/devices/0000:00:1c.7
 +
/sys/kernel/iommu_groups/9/devices/0000:05:00.0
 +
/sys/kernel/iommu_groups/10/devices/0000:00:1d.0
 +
/sys/kernel/iommu_groups/11/devices/0000:00:1f.0
 +
/sys/kernel/iommu_groups/11/devices/0000:00:1f.2
 +
/sys/kernel/iommu_groups/11/devices/0000:00:1f.3
 +
/sys/kernel/iommu_groups/12/devices/0000:02:00.0
 +
/sys/kernel/iommu_groups/12/devices/0000:02:00.1
 +
/sys/kernel/iommu_groups/13/devices/0000:03:00.0
 +
/sys/kernel/iommu_groups/14/devices/0000:04:00.0
 +
</pre>
  
 +
To have separate IOMMU groups, your processor needs to have support for a feature called ACS (Access Control Services). Make sure you enable the corresponding setting in your BIOS for this.
  
 +
All Xeon processors (E3, E5) support it, excluding the Xeon E3-1200.
  
Start the VM from the UI.
+
For Intel Core it is different: only some processors support ACS. Anything newer than the processors listed below should support ACS, as long as VT-d is supported. See https://ark.intel.com for more info.
  
Enter the qm monitor. &nbsp;"qm monitor vmnumber"
+
<pre>
 +
Haswell-E (LGA2011-v3)
 +
i7-5960X (8-core, 3/3.5GHz)
 +
i7-5930K (6-core, 3.2/3.8GHz)
 +
i7-5820K (6-core, 3.3/3.6GHz)
  
Verify that your card is listed here: "info pci"
+
Ivy Bridge-E (LGA2011)
 +
i7-4960X (6-core, 3.6/4GHz)
 +
i7-4930K (6-core, 3.4/3.6GHz)
 +
i7-4820K (4-core, 3.7/3.9GHz)
  
Then install drivers on your guest OS. &nbsp;
+
Sandy Bridge-E (LGA2011)
 +
i7-3960X (6-core, 3.3/3.9GHz)
 +
i7-3970X (6-core, 3.5/4GHz)
 +
i7-3930K (6-core, 3.2/3.8GHz)
 +
i7-3820 (4-core, 3.6/3.8GHz)
 +
</pre>
  
 +
AMD chips from Ryzen 1st generation and newer are fine too.
  
 +
If you don't have dedicated IOMMU groups, you can try:
  
NOTE: Card support might be limited to 2 or 3 devices.
+
1) moving the card to another PCI slot
  
NOTE: This process will remove the card from the proxmox host OS. &nbsp;
+
2) adding "pcie_acs_override=downstream" to the kernel boot command line (GRUB or systemd-boot) options, which can help on some setups with a bad ACS implementation.
 +
: Check out the documentation [https://pve.proxmox.com/pve-docs/chapter-sysadmin.html#sysboot_edit_kernel_cmdline about editing the kernel command line]
  
Editorial Note: Using PCI passthrough to present drives direct to a ZFS (FreeNAS, Openfiler, OmniOS) virtual machine is dangerous on many levels and is not recommended for production use.  Specific FreeNAS warnings can be found here:  http://forums.freenas.org/threads/absolutely-must-virtualize-freenas-a-guide-to-not-completely-losing-your-data.12714/ 
+
More information:
  
 +
http://vfio.blogspot.be/2015/10/intel-processors-with-acs-support.html 
 +
http://vfio.blogspot.be/2014/08/iommu-groups-inside-and-out.html
  
 +
== Determine your PCI card address, and configure your VM ==
  
== PCI EXPRESS PASSTHROUGH ==
+
The easiest way is to use the GUI to add a device of type "Host PCI" in the VM's hardware tab.
  
Since proxmox 3.3, it's possible to passthrough pci express device (including nvidia/amd graphic card)
+
Alternatively, you can use the command line:
  
 +
Locate your card using "lspci". The address should be in the form of: 01:00.0
 +
Edit the <vmid>.conf file, located at /etc/pve/qemu-server/<vmid>.conf.
  
you need to run pve-kernel 3.10
+
Add this line to the end of the file:
 +
<pre>
 +
hostpci0: 01:00.0
 +
</pre>
  
 +
If you have a multi-function device  (like a vga card with embedded audio chipset), you can pass all functions manually with:
 +
<pre>
 +
hostpci0: 01:00.0;01:00.1
 +
</pre>
  
 +
or, to pass all functions automatically:
 
<pre>
 
<pre>
/etc/pve/qemuserver/<vmid>.cfg
+
hostpci0: 01:00
 
</pre>
 
</pre>
  
simple pci-express passthrough
+
== PCI Express Passthrough ==
 +
 
 +
Check the "PCI-E" checkbox in the GUI when adding your device, or manually add the pcie=1 parameter to your VM config:
 
<pre>
 
<pre>
 
machine: q35
 
machine: q35
hostpci0: 04:00.0,pcie=1,driver=vfio
+
hostpci0: 01:00.0,pcie=1
 
</pre>
 
</pre>
  
vga pci-express passthrough
+
PCIe passthrough is only supported on Q35 machines.
 +
 
 +
Note that this does not mean that devices assigned without this setting will only have PCI speeds, it just sets a flag for the guest to tell it that the device is a PCIe device instead of a "really-fast legacy PCI device". Some guest applications benefit from this.
 +
 
 +
== GPU Passthrough ==
 +
 
 +
{{Note|See http://blog.quindorian.org/2018/03/building-a-2u-amd-ryzen-server-proxmox-gpu-passthrough.html/ if you like an article with a HOWTO approach. (NOTE: you usually do not need the ROM-file dumping mentioned at the end!)}}
 +
 
 +
* AMD RADEON 5xxx, 6xxx, 7xxx, NVIDIA GEFORCE 7, 8, GTX 4xx, 5xx, 6xx, 7xx, 9xx, 10xx and RTX 16xx/20xx have been reported working.
 +
* AMD Navi (5xxx(XT)/6xxx(XT)) suffer from the reset bug (see https://github.com/gnif/vendor-reset), and while dedicated users have managed to get them to run, they require a lot more effort and will probably not be entirely stable
 +
* You might need to set specific options in grub.cfg, or other tuning values, to get your particular configuration working and stable
 +
* Here's a good Arch Linux forum thread: https://bbs.archlinux.org/viewtopic.php?id=162768
 +
 
 +
For starters, it's often helpful if the host doesn't try to use the GPU, which avoids issues with the host driver unbinding and re-binding to the device. Sometimes making sure the host BIOS POST messages are displayed on a different GPU is helpful too. This can sometimes be accomplished via BIOS settings, moving the card to a different slot or enabling/disabling legacy boot support.
 +
 
 +
First, find the device and vendor id of your vga card:
 +
 
 
<pre>
 
<pre>
machine: q35
+
$ lspci -n -s 01:00
hostpci0: 04:00.0,x-vga=on,pcie=1,driver=vfio
+
01:00.0 0300: 10de:1381 (rev a2)
 +
01:00.1 0403: 10de:0fbc (rev a1)
 
</pre>
 
</pre>
  
multi-function pciexpress device. (like a vga card with embedded audio chipset).
+
The Vendor:Device IDs for this GPU and its audio function are therefore 10de:1381 and 10de:0fbc.
  
Remove the .0 in pci address.
+
Then, create a file:
 +
<pre>
 +
echo "options vfio-pci ids=10de:1381,10de:0fbc" > /etc/modprobe.d/vfio.conf
 +
</pre>
  
 +
Blacklist the drivers:
 
<pre>
 
<pre>
machine: q35
+
echo "blacklist radeon" >> /etc/modprobe.d/blacklist.conf
hostpci0: 04:00,x-vga=on,pcie=1,driver=vfio
+
echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf
 +
echo "blacklist nvidia" >> /etc/modprobe.d/blacklist.conf
 
</pre>
 
</pre>
  
== GPU PASSTHROUGH NOTES ==
+
and reboot your machine.
  
MD RADEON 5xxx, 6xxx, 7xxx and NVIDIA GEFORCE 7, 8, 4xx, 5xx, 6xx, 7xx have been reported working.
+
For the VM configuration, there are four possible setups:
  
intel IGD'S WONT WORK currently with proxmox kernel 3.10, try with debian kernel > 3.16.
+
=== GPU OVMF PCI Passthrough  (recommended) ===
  
Maybe you'll need to load some specific options in grub.cfg or other tunning values,
+
Select "OVMF" as "BIOS" for your VM instead of the default "SeaBIOS".
 +
You need to install your guest OS with UEFI support (for Windows, try Windows 8 or newer).
  
here a good forum thread of archlinux:
+
Using OVMF, you can also add disable_vga=1 to the vfio-pci module options, which tries to opt devices out of VGA arbitration if possible:
 +
<pre>
 +
echo "options vfio-pci ids=10de:1381,10de:0fbc disable_vga=1" > /etc/modprobe.d/vfio.conf
 +
</pre>
  
https://bbs.archlinux.org/viewtopic.php?id=162768
+
You also need to make sure your graphics card has a UEFI-bootable ROM:
 +
http://vfio.blogspot.fr/2014/08/does-my-graphics-card-rom-support-efi.html
  
 +
<pre>
 +
bios: ovmf
 +
scsihw: virtio-scsi-pci
 +
bootdisk: scsi0
 +
scsi0: .....
 +
hostpci0: 01:00,x-vga=on
 +
</pre>
  
AMD passthrough error
+
=== GPU OVMF PCI Express Passthrough ===
 +
 
 +
Same as above, but set machine type to q35 and enable pcie=1:
 
<pre>
 
<pre>
kvm: -device vfio-pci,host=01:00.0,id=hostpci0.0,bus=ich9-pcie-port-1,addr=0x0.0,x-vga=on,multifunction=on: vfio: error opening /dev/vfio/1: No such file or directory
+
bios: ovmf
kvm: -device vfio-pci,host=01:00.0,id=hostpci0.0,bus=ich9-pcie-port-1,addr=0x0.0,x-vga=on,multifunction=on: vfio: failed to get group 1
+
scsihw: virtio-scsi-pci
kvm: -device vfio-pci,host=01:00.0,id=hostpci0.0,bus=ich9-pcie-port-1,addr=0x0.0,x-vga=on,multifunction=on: Device initialization failed.
+
bootdisk: scsi0
kvm: -device vfio-pci,host=01:00.0,id=hostpci0.0,bus=ich9-pcie-port-1,addr=0x0.0,x-vga=on,multifunction=on: Device 'vfio-pci' could not be initialized
+
scsi0: .....
 +
machine: q35
 +
hostpci0: 01:00,pcie=1,x-vga=on
 
</pre>
 
</pre>
If you have this error, you need to pass "pcie_acs_override=downstream" to grub options to get iommu group working correctly
 
  
 +
=== GPU Seabios PCI Passthrough ===
 +
<pre>
 +
hostpci0: 01:00,x-vga=on
 +
</pre>
  
you can also try to add this option
+
=== GPU Seabios PCI Express Passthrough ===
 
<pre>
 
<pre>
echo "options vfio_iommu_type1 allow_unsafe_interrupts=1" > /etc/modprobe.d/iommu_unsafe_interrupts.conf
+
machine: q35
 +
hostpci0: 01:00,pcie=1,x-vga=on
 
</pre>
 
</pre>
  
 +
=== How to know if a Graphics Card is UEFI (OVMF) compatible ===
 +
 +
Get and compile the software "rom-parser":
 +
git clone https://github.com/awilliam/rom-parser
 +
cd rom-parser
 +
make
 +
 +
Then dump the ROM of your VGA card:
 +
cd /sys/bus/pci/devices/0000:01:00.0/
 +
echo 1 > rom
 +
cat rom > /tmp/image.rom
 +
echo 0 > rom
 +
 +
and test it with:
 +
./rom-parser /tmp/image.rom
  
 +
Output should look like this:
  
== WORKING NVIDIA SETUP ==
+
Valid ROM signature found @0h, PCIR offset 190h
 +
  PCIR: type 0, vendor: 10de, device: 1280, class: 030000
 +
  PCIR: revision 0, vendor revision: 1
 +
Valid ROM signature found @f400h, PCIR offset 1ch
 +
  PCIR: type 3, vendor: 10de, device: 1280, class: 030000
 +
  PCIR: revision 3, vendor revision: 0
 +
  EFI: Signature Valid
 +
  Last image
  
I've been able to get this working with an NVIDIA GTX 750 Ti card using driver version 344.75 (newer versions inconsistently cause Code 43 Errors) by using the following setup:
+
To be UEFI compatible, you need a "type 3" in the result.
  
Install pve-kernel-3.10.0-5-pve
+
=== NVIDIA Tips ===
  
 +
Some Windows applications, like GeForce Experience, PassMark PerformanceTest and SiSoftware Sandra, can crash the VM.
 +
You need to add:
 
<pre>
 
<pre>
Add to /etc/modules:
+
echo "options kvm ignore_msrs=1" > /etc/modprobe.d/kvm.conf
pci_stub
 
vfio
 
vfio_iommu_type1
 
vfio_pci
 
kvm
 
kvm_intel
 
 
</pre>
 
</pre>
Add the following options to /etc/default/grub on the GRUB_CMDLINE_LINUX_DEFAULT line:
+
 
 +
If you see a lot of warning messages in your 'dmesg' system log, add the following instead:
 
<pre>
 
<pre>
intel_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1 rootdelay=10 scsi_mod.scan=sync
+
echo "options kvm ignore_msrs=1 report_ignored_msrs=0" > /etc/modprobe.d/kvm.conf
 
</pre>
 
</pre>
  
 +
Users have reported that NVIDIA Kepler K80 GPUs need this in vmid.conf:
 
<pre>
 
<pre>
Run: update-grub
+
args: -machine pc,max-ram-below-4g=1G
 
</pre>
 
</pre>
  
 +
=== The 'romfile' Option ===
 +
 +
http://lime-technology.com/forum/index.php?topic=43644.msg482110#msg482110
 +
 +
Some motherboards can't passthrough GPUs on the first PCI(e) slot by default, because its vbios is shadowed during bootup. You need to capture its vBIOS when its working "normally" (i.e. installed in a different slot), then you can move the card to slot 1 and start the vm using the dumped vBIOS.
 +
 +
To dump the bios:
 
<pre>
 
<pre>
Add the following to /etc/initramfs-tools/modules (find the PCI stub IDs for your card by running lspci -nn | grep NVIDIA):
+
cd /sys/bus/pci/devices/0000:01:00.0/
pci_stub ids=10de:0f02,10de:0bea
+
echo 1 > rom
 +
cat rom > /usr/share/kvm/vbios.bin
 +
echo 0 > rom
 
</pre>
 
</pre>
 +
 +
Then you can pass the vbios file (must be located in /usr/share/kvm/) with:
 
<pre>
 
<pre>
Run: update-initramfs -u and then reboot into the new kernel.
+
hostpci0: 01:00,x-vga=on,romfile=vbios.bin
 
</pre>
 
</pre>
  
Boot into the VM using the Proxmox web interface, and install the OS (I had better luck with Windows 8.1 - can't remember specifics though).
+
==  Troubleshooting ==
  
 +
=== BAR 3: can't reserve [mem] error ===
  
Add the following options to /etc/pve/host/qemu-server/vmid.conf (get the PCI address from lspci command, and I added the USB device address for my Avocent KVM DSRIQ USB module, you can do the same for a physical keyboard and mouse):
+
If you have this error when you try to use the card for a VM:
 
<pre>
 
<pre>
hostpci0: 05:00,x-vga=on,pcie=1,driver=vfio
+
vfio-pci 0000:04:00.0: BAR 3: can't reserve [mem 0xca000000-0xcbffffff 64bit]
machine: q35
+
</pre>
usb0: host=0624:0307
+
 
 +
you can try to add the following kernel commandline option:
 +
<pre>
 +
video=efifb:off
 +
</pre>
 +
 
 +
Check out the documentation [https://pve.proxmox.com/pve-docs/chapter-sysadmin.html#sysboot_edit_kernel_cmdline about editing the kernel command line]
 +
 
 +
=== SPICE ===
 +
 
 +
Spice may give trouble when passing through a GPU as it presents a "virtual" PCI graphic card to the guest and some drivers have problems with that, even when both cards show up.
 +
It's always worth a try to disable SPICE and check again if something fails.
 +
 
 +
=== HDMI Audio crackling/broken ===
 +
 
 +
Some digital audio devices (usually added via GPU functions) may require MSI (Message Signaled Interrupts) to be enabled to function correctly. If you experience any issues, try changing MSI settings in the guest and rebooting the guest.
 +
 
 +
A Windows-Tool to simplify this is available here: https://github.com/CHEF-KOCH/MSI-utility/releases/latest
 +
 
 +
Linux guests usually enable MSI by themselves. To force use of MSI for GPU audio devices, use the following command and reboot:
 +
 
 +
<pre>
 +
echo "options snd-hda-intel enable_msi=1" >> /etc/modprobe.d/snd-hda-intel.conf
 +
</pre>
 +
 
 +
Use 'lspci -vv' and check for the following line on your device to see if MSI is enabled:
 +
 
 +
<pre>
 +
Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
 
</pre>
 
</pre>
  
 +
If it says 'Enable+', MSI is working, 'Enable-' means it is supported but disabled, and if the line is missing, MSI is not supported by the PCIe hardware.
 +
 +
This can potentially also improve performance for other passthrough devices, including GPUs, but that depends on the hardware being used.
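
The MSI check described above can be scripted. The following is a minimal sketch, not part of the original article: the function name msi_state is hypothetical, and it assumes the 'lspci -vv' output of a single device is piped in on standard input.

```shell
#!/bin/sh
# msi_state: reads "lspci -vv" output for ONE device on stdin and prints
# "enabled", "disabled" or "unsupported" based on the MSI capability line.
# (Hypothetical helper for illustration; MSI-X lines are deliberately not matched.)
msi_state() {
    msi_line=$(grep 'MSI: Enable' | head -n 1)
    case "$msi_line" in
        *'MSI: Enable+'*) echo "enabled" ;;
        *'MSI: Enable-'*) echo "disabled" ;;
        *)                echo "unsupported" ;;
    esac
}
```

Inside the guest, a call such as "lspci -vv -s 01:00.1 | msi_state" would report the state for that function; the address 01:00.1 is only an example.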
 +
 +
=== BIOS options ===
 +
 +
Make sure you are using the most recent BIOS version for your mainboard. Often IOMMU groupings or passthrough support in general is improved in later versions.
 +
 +
Some general BIOS options that might need changing to allow passthrough to work:
 +
 +
* IOMMU or VT-d: Set to 'Enabled' or equivalent, often 'Auto' is not the same
 +
* 'Legacy boot' or CSM: For GPU passthrough it can help to disable this, but keep in mind that PVE has to be installed in UEFI mode, as it will not boot in BIOS mode without this enabled. The reason for disabling this is that it avoids legacy VGA initialization of installed GPUs, making them able to be re-initialized later, as required for passthrough. Most useful when trying to use passthrough in single GPU systems.
 +
* 'Resizable BAR'/'Smart Access Memory': Some AMD GPUs (Vega and up) experience 'Code 43' in Windows guests if this is enabled on the host. It's not supported in VMs either way (yet), so the recommended setting is 'off'.
 +
 +
== Verify Operation ==
 +
 +
Start the VM and enter the qm monitor on the CLI: "qm monitor vmnumber"
 +
Verify that your card is listed here: "info pci"
 +
Then install drivers on your guest OS.
 +
 +
NOTE: Card support might be limited to 2 or 3 devices.
 +
 +
NOTE: A PCI device can only ever be attached to a single VM.
 +
 +
NOTE: This process will remove the card from the proxmox host OS as long as the VM it's attached to is running.
 +
 +
NOTE: Using PCI passthrough to present drives direct to a ZFS (FreeNAS, Openfiler, OmniOS) virtual machine is OK for testing, but '''not recommended''' for production use.  Specific FreeNAS warnings can be found here:  http://forums.freenas.org/threads/absolutely-must-virtualize-freenas-a-guide-to-not-completely-losing-your-data.12714/
 +
 +
== USB Passthrough ==
 +
If you need to pass through USB devices (keyboard, mouse), please follow this wiki article:
 +
 +
https://pve.proxmox.com/wiki/USB_physical_port_mapping
 
[[Category:HOWTO]]
 
[[Category:HOWTO]]

Latest revision as of 11:20, 1 September 2021

Introduction

PCI passthrough allows you to use a physical PCI device (graphics card, network card) inside a VM (KVM virtualization only).

If you "PCI passthrough" a device, the device is not available to the host anymore.

Note:

PCI passthrough is an experimental feature in Proxmox VE! VMs with passed-through devices cannot be migrated.

Enable the IOMMU

You need to enable the IOMMU by editing the kernel command line.

First open your bootloader kernel command line config file.

For GRUB:

nano /etc/default/grub

Find the line with "GRUB_CMDLINE_LINUX_DEFAULT"

For systemd-boot:

nano /etc/kernel/cmdline

Its format is a single line with options. You can create the file for systemd-boot if not present.

Intel CPU

For Intel CPUs add

 intel_iommu=on

GRUB

If you are using GRUB:

GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"

Then save the changes and update grub:

update-grub

systemd-boot

If you use systemd-boot, add the following at the end of the first line:

quiet intel_iommu=on

Then save the changes and update systemd-boot:

proxmox-boot-tool refresh

AMD CPU

For AMD CPUs add

 amd_iommu=on

GRUB

If you are using GRUB:

GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on" 

Then save the changes and update grub:

update-grub

systemd-boot

If you are using systemd-boot, add the following at the end of the first line:

quiet amd_iommu=on

Then save the changes and update systemd-boot:

proxmox-boot-tool refresh
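
Because the same parameter must not end up on the command line twice, it can help to append it idempotently. The following is a minimal sketch under stated assumptions: the function name add_kernel_param is hypothetical, and it only transforms a string, so writing the result back to /etc/default/grub or /etc/kernel/cmdline and running update-grub or proxmox-boot-tool refresh remain manual steps.

```shell
#!/bin/sh
# add_kernel_param CMDLINE PARAM
# Prints CMDLINE with PARAM appended, unless PARAM is already present
# as a whole word. (Hypothetical helper for illustration.)
add_kernel_param() {
    cmdline=$1
    param=$2
    case " $cmdline " in
        *" $param "*) printf '%s\n' "$cmdline" ;;                      # already there
        *)            printf '%s\n' "${cmdline:+$cmdline }$param" ;;   # append
    esac
}
```

For example, add_kernel_param "quiet" "intel_iommu=on" prints "quiet intel_iommu=on", and calling it again on that result leaves the line unchanged.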

Verify IOMMU is enabled

Reboot, then run:

dmesg | grep -e DMAR -e IOMMU

There should be a line that looks like "DMAR: IOMMU enabled". If there is no output, something is wrong.
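
The check above can be wrapped in a small helper that returns a usable exit status. This is a sketch only: the function name check_iommu_log is hypothetical, and it uses the same two patterns as the command above, reading the kernel log from standard input.

```shell
#!/bin/sh
# check_iommu_log: reads kernel log text on stdin and reports whether any
# DMAR or IOMMU message is present. Returns 1 if none is found.
# (Hypothetical helper for illustration.)
check_iommu_log() {
    if grep -q -e DMAR -e IOMMU; then
        echo "IOMMU messages found"
    else
        echo "no IOMMU messages: check BIOS settings and the kernel command line"
        return 1
    fi
}
```

On the host it would be used as "dmesg | check_iommu_log". Note that a match only shows that such messages exist; the log should still be read to confirm the IOMMU is actually enabled.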

PT Mode

Both Intel and AMD chips can use the additional parameter "iommu=pt", added in the same way as above to the kernel cmdline.

 iommu=pt

This enables IOMMU translation only when necessary: a device that is not passed through to a VM does not need DMA translation to memory, which can improve performance for PCIe devices used by the hypervisor itself.

Required Modules

Add the following to /etc/modules:

vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

Note that in 5.4-based kernels, some of these modules are already built into the kernel.
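
To avoid duplicate entries when this step is repeated, the module names can be appended only if they are missing. The following is a minimal sketch: the function name ensure_modules is hypothetical, and it assumes the target file, if it exists, ends with a newline.

```shell
#!/bin/sh
# ensure_modules FILE MODULE...
# Appends each MODULE to FILE on its own line unless an identical line
# already exists. (Hypothetical helper for illustration.)
ensure_modules() {
    file=$1
    shift
    touch "$file"
    for mod in "$@"; do
        # -x matches the whole line, -F treats the name as a fixed string
        grep -qxF -- "$mod" "$file" || printf '%s\n' "$mod" >> "$file"
    done
}
```

On the host, this would be called as root, for example: ensure_modules /etc/modules vfio vfio_iommu_type1 vfio_pci vfio_virqfd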

IOMMU Interrupt Remapping

It will not be possible to use PCI passthrough without interrupt remapping. Device assignment will fail with 'Failed to assign device "[device name]": Operation not permitted' or 'Interrupt Remapping hardware not found, passing devices to unprivileged domains is insecure.' error.

All systems using an Intel processor and chipset that have support for Intel Virtualization Technology for Directed I/O (VT-d), but do not have support for interrupt remapping will see such an error. Interrupt remapping support is provided in newer processors and chipsets (both AMD and Intel).

To identify if your system has support for interrupt remapping:

dmesg | grep 'remapping'

If you see one of the following lines:

  • "AMD-Vi: Interrupt remapping enabled"
  • "DMAR-IR: Enabled IRQ remapping in x2apic mode" ('x2apic' can be different on old CPUs, but should still work)

then remapping is supported.
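
The same test can be expressed as a helper that looks for either of the two messages listed above. This is a sketch under stated assumptions: the function name check_interrupt_remapping is hypothetical, and the Intel pattern is cut before the mode name because, as noted, 'x2apic' can differ on older CPUs.

```shell
#!/bin/sh
# check_interrupt_remapping: reads kernel log text on stdin and reports
# whether one of the two known "remapping enabled" messages is present.
# Returns 1 if neither is found. (Hypothetical helper for illustration.)
check_interrupt_remapping() {
    if grep -q -e 'AMD-Vi: Interrupt remapping enabled' \
               -e 'DMAR-IR: Enabled IRQ remapping'; then
        echo "interrupt remapping supported"
    else
        echo "interrupt remapping not reported"
        return 1
    fi
}
```

On the host it would be used as "dmesg | check_interrupt_remapping".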

If your system doesn't support interrupt remapping, you can allow unsafe interrupts with:

echo "options vfio_iommu_type1 allow_unsafe_interrupts=1" > /etc/modprobe.d/iommu_unsafe_interrupts.conf

Verify IOMMU Isolation

For working PCI passthrough, you need a dedicated IOMMU group for all PCI devices you want to assign to a VM.

You should have something like:

# find /sys/kernel/iommu_groups/ -type l
/sys/kernel/iommu_groups/0/devices/0000:00:00.0
/sys/kernel/iommu_groups/1/devices/0000:00:01.0
/sys/kernel/iommu_groups/1/devices/0000:01:00.0
/sys/kernel/iommu_groups/1/devices/0000:01:00.1
/sys/kernel/iommu_groups/2/devices/0000:00:02.0
/sys/kernel/iommu_groups/3/devices/0000:00:16.0
/sys/kernel/iommu_groups/4/devices/0000:00:1a.0
/sys/kernel/iommu_groups/5/devices/0000:00:1b.0
/sys/kernel/iommu_groups/6/devices/0000:00:1c.0
/sys/kernel/iommu_groups/7/devices/0000:00:1c.5
/sys/kernel/iommu_groups/8/devices/0000:00:1c.6
/sys/kernel/iommu_groups/9/devices/0000:00:1c.7
/sys/kernel/iommu_groups/9/devices/0000:05:00.0
/sys/kernel/iommu_groups/10/devices/0000:00:1d.0
/sys/kernel/iommu_groups/11/devices/0000:00:1f.0
/sys/kernel/iommu_groups/11/devices/0000:00:1f.2
/sys/kernel/iommu_groups/11/devices/0000:00:1f.3
/sys/kernel/iommu_groups/12/devices/0000:02:00.0
/sys/kernel/iommu_groups/12/devices/0000:02:00.1
/sys/kernel/iommu_groups/13/devices/0000:03:00.0
/sys/kernel/iommu_groups/14/devices/0000:04:00.0
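
With a long listing it is easy to miss which devices share a group with the card you want to assign. The following is a minimal sketch: the function name iommu_group_members is hypothetical, and it expects the output of the find command above on standard input.

```shell
#!/bin/sh
# iommu_group_members ADDRESS
# Reads "find /sys/kernel/iommu_groups/ -type l" output on stdin and prints
# the IOMMU group of ADDRESS (e.g. 0000:01:00.0) with all devices in it.
# (Hypothetical helper for illustration.)
iommu_group_members() {
    awk -F/ -v dev="$1" '
        # Path layout: /sys/kernel/iommu_groups/<group>/devices/<address>
        # so field 5 is the group number and the last field is the address.
        { grp[$NF] = $5; members[$5] = members[$5] " " $NF }
        END {
            if (dev in grp) print "group " grp[dev] ":" members[grp[dev]]
        }'
}
```

For the listing above, "find /sys/kernel/iommu_groups/ -type l | iommu_group_members 0000:01:00.0" would show that the card at 01:00.0 shares group 1 with 00:01.0 and 01:00.1.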

To have separate IOMMU groups, your processor needs to have support for a feature called ACS (Access Control Services). Make sure you enable the corresponding setting in your BIOS for this.

All Xeon processors (E3, E5) support it, excluding the Xeon E3-1200.

For Intel Core it is different: only some processors support ACS. Anything newer than the processors listed below should support ACS, as long as VT-d is supported. See https://ark.intel.com for more info.

Haswell-E (LGA2011-v3)
i7-5960X (8-core, 3/3.5GHz)
i7-5930K (6-core, 3.2/3.8GHz)
i7-5820K (6-core, 3.3/3.6GHz)

Ivy Bridge-E (LGA2011)
i7-4960X (6-core, 3.6/4GHz)
i7-4930K (6-core, 3.4/3.6GHz)
i7-4820K (4-core, 3.7/3.9GHz)

Sandy Bridge-E (LGA2011)
i7-3960X (6-core, 3.3/3.9GHz)
i7-3970X (6-core, 3.5/4GHz)
i7-3930K (6-core, 3.2/3.8GHz)
i7-3820 (4-core, 3.6/3.8GHz)

AMD chips from Ryzen 1st generation and newer are fine too.

If you don't have dedicated IOMMU groups, you can try:

1) moving the card to another PCI slot

2) adding "pcie_acs_override=downstream" to the kernel boot command line (GRUB or systemd-boot) options, which can help on some setups with a bad ACS implementation.

Check out the documentation about editing the kernel command line.

More information:

http://vfio.blogspot.be/2015/10/intel-processors-with-acs-support.html

http://vfio.blogspot.be/2014/08/iommu-groups-inside-and-out.html

Determine your PCI card address, and configure your VM

The easiest way is to use the GUI to add a device of type "Host PCI" in the VM's hardware tab.

Alternatively, you can use the command line:

Locate your card using "lspci". The address should be in the form 01:00.0.

Edit the <vmid>.conf file, located at /etc/pve/qemu-server/<vmid>.conf.

Add this line to the end of the file:

hostpci0: 01:00.0

If you have a multi-function device (like a vga card with embedded audio chipset), you can pass all functions manually with:

hostpci0: 01:00.0;01:00.1

or, to pass all functions automatically:

hostpci0: 01:00
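
The "all functions" form is simply the address with its function number removed. The following is a minimal sketch of that transformation: the function name hostpci_line is hypothetical, and it only prints the config line rather than editing any file.

```shell
#!/bin/sh
# hostpci_line INDEX ADDRESS
# Prints a "hostpciN:" config line that passes all functions of the device,
# by dropping a trailing ".N" function number from ADDRESS if present.
# (Hypothetical helper for illustration.)
hostpci_line() {
    index=$1
    addr=${2%.[0-7]}   # "01:00.0" becomes "01:00"; "01:00" is left as-is
    printf 'hostpci%s: %s\n' "$index" "$addr"
}
```

For example, hostpci_line 0 01:00.0 prints "hostpci0: 01:00". On a Proxmox host the same value can usually be applied without editing the file by hand, with a command along the lines of "qm set <vmid> -hostpci0 01:00"; check the qm manual page for the exact syntax on your version.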

PCI Express Passthrough

Check the "PCI-E" checkbox in the GUI when adding your device, or manually add the pcie=1 parameter to your VM config:

machine: q35
hostpci0: 01:00.0,pcie=1

PCIe passthrough is only supported on Q35 machines.

Note that this does not mean that devices assigned without this setting will only have PCI speeds, it just sets a flag for the guest to tell it that the device is a PCIe device instead of a "really-fast legacy PCI device". Some guest applications benefit from this.

GPU Passthrough

Note: See http://blog.quindorian.org/2018/03/building-a-2u-amd-ryzen-server-proxmox-gpu-passthrough.html/ if you like an article with a HOWTO approach. (NOTE: you usually do not need the ROM-file dumping mentioned at the end!)

  • AMD RADEON 5xxx, 6xxx, 7xxx, NVIDIA GEFORCE 7, 8, GTX 4xx, 5xx, 6xx, 7xx, 9xx, 10xx and RTX 16xx/20xx have been reported working.
  • AMD Navi (5xxx(XT)/6xxx(XT)) suffer from the reset bug (see https://github.com/gnif/vendor-reset), and while dedicated users have managed to get them to run, they require a lot more effort and will probably not be entirely stable
  • You might need to set specific options in grub.cfg, or other tuning values, to get your particular configuration working and stable
  • Here's a good Arch Linux forum thread: https://bbs.archlinux.org/viewtopic.php?id=162768

For starters, it's often helpful if the host doesn't try to use the GPU, which avoids issues with the host driver unbinding and re-binding to the device. Sometimes making sure the host BIOS POST messages are displayed on a different GPU is helpful too. This can sometimes be accomplished via BIOS settings, moving the card to a different slot or enabling/disabling legacy boot support.

First, find the device and vendor id of your vga card:

$ lspci -n -s 01:00
01:00.0 0300: 10de:1381 (rev a2)
01:00.1 0403: 10de:0fbc (rev a1)

The Vendor:Device IDs for this GPU and its audio function are therefore 10de:1381 and 10de:0fbc.

Then, create a file:

echo "options vfio-pci ids=10de:1381,10de:0fbc" > /etc/modprobe.d/vfio.conf
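
The ids= list can be derived from the lspci output instead of being typed by hand. The following is a minimal sketch: the function name vfio_ids_option is hypothetical, and it assumes 'lspci -n' output on standard input, where the third whitespace-separated field is the vendor:device pair.

```shell
#!/bin/sh
# vfio_ids_option: reads "lspci -n -s <address>" output on stdin and prints
# the matching "options vfio-pci ids=..." line.
# (Hypothetical helper for illustration.)
vfio_ids_option() {
    awk '
        # Field 3 holds vendor:device, e.g. "10de:1381"
        { ids = ids (ids == "" ? "" : ",") $3 }
        END { if (ids != "") print "options vfio-pci ids=" ids }'
}
```

For the card above, "lspci -n -s 01:00 | vfio_ids_option" would print the same line that is written to /etc/modprobe.d/vfio.conf; review the output before redirecting it into that file.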

Blacklist the drivers:

echo "blacklist radeon" >> /etc/modprobe.d/blacklist.conf 
echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf 
echo "blacklist nvidia" >> /etc/modprobe.d/blacklist.conf 

Then reboot your machine.

For the VM configuration, there are four possible setups:

GPU OVMF PCI Passthrough (recommended)

Select "OVMF" as "BIOS" for your VM instead of the default "SeaBIOS". You need to install your guest OS with UEFI support (for Windows, try Windows 8 or newer).

Using OVMF, you can also add disable_vga=1 to the vfio-pci module options, which tries to opt devices out of VGA arbitration if possible:

echo "options vfio-pci ids=10de:1381,10de:0fbc disable_vga=1" > /etc/modprobe.d/vfio.conf

You also need to make sure your graphics card has a UEFI-bootable ROM: http://vfio.blogspot.fr/2014/08/does-my-graphics-card-rom-support-efi.html

bios: ovmf
scsihw: virtio-scsi-pci
bootdisk: scsi0
scsi0: .....
hostpci0: 01:00,x-vga=on

GPU OVMF PCI Express Passthrough

Same as above, but set machine type to q35 and enable pcie=1:

bios: ovmf
scsihw: virtio-scsi-pci
bootdisk: scsi0
scsi0: .....
machine: q35
hostpci0: 01:00,pcie=1,x-vga=on

GPU Seabios PCI Passthrough

hostpci0: 01:00,x-vga=on

GPU Seabios PCI Express Passthrough

machine: q35
hostpci0: 01:00,pcie=1,x-vga=on
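The four variants differ only in the bios line, the machine line and the pcie flag. A minimal sketch that prints the passthrough-related lines for each variant (disk lines omitted); vm_passthrough_lines is a hypothetical helper written for illustration, not a Proxmox command:

```shell
# vm_passthrough_lines BIOS PCIE ADDR
# Hypothetical helper: prints the passthrough-related VM config lines.
#   BIOS: ovmf | seabios    PCIE: 0 | 1    ADDR: host PCI address, e.g. 01:00
vm_passthrough_lines() {
    bios=$1 pcie=$2 addr=$3
    opts="x-vga=on"
    if [ "$bios" = "ovmf" ]; then
        echo "bios: ovmf"            # SeaBIOS is the default, so no line is needed
    fi
    if [ "$pcie" = "1" ]; then
        echo "machine: q35"          # pcie=1 requires the q35 machine type
        opts="pcie=1,$opts"
    fi
    echo "hostpci0: $addr,$opts"
}

# Print the OVMF PCI Express variant for the card at 01:00.
vm_passthrough_lines ovmf 1 01:00
```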

How to know if a Graphics Card is UEFI (OVMF) compatible

Get and compile the software "rom-parser":

git clone https://github.com/awilliam/rom-parser
cd rom-parser
make

Then dump the ROM of your VGA card:

cd /sys/bus/pci/devices/0000:01:00.0/
echo 1 > rom
cat rom > /tmp/image.rom
echo 0 > rom

and test it with:

./rom-parser /tmp/image.rom

Output should look like this:

Valid ROM signature found @0h, PCIR offset 190h
 PCIR: type 0, vendor: 10de, device: 1280, class: 030000
 PCIR: revision 0, vendor revision: 1
Valid ROM signature found @f400h, PCIR offset 1ch
 PCIR: type 3, vendor: 10de, device: 1280, class: 030000
 PCIR: revision 3, vendor revision: 0
  EFI: Signature Valid
 Last image

To be UEFI compatible, you need a "type 3" in the result.
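The check for a "type 3" image can be scripted. A minimal sketch, assuming rom-parser prints its "PCIR: type N" lines as in the output above; rom_has_efi is a hypothetical helper name:

```shell
# rom_has_efi: hypothetical helper, reads rom-parser output on stdin and
# succeeds if any image reports "type 3" (the EFI image type).
rom_has_efi() {
    grep 'PCIR: type 3' > /dev/null
}

# On the host you would run: ./rom-parser /tmp/image.rom | rom_has_efi
# Here part of the sample output from above is fed in directly.
if printf '%s\n' \
    'Valid ROM signature found @0h, PCIR offset 190h' \
    ' PCIR: type 0, vendor: 10de, device: 1280, class: 030000' \
    'Valid ROM signature found @f400h, PCIR offset 1ch' \
    ' PCIR: type 3, vendor: 10de, device: 1280, class: 030000' \
    '  EFI: Signature Valid' | rom_has_efi
then
    echo "ROM contains an EFI image"
else
    echo "no EFI image found"
fi
```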

NVIDIA Tips

Some Windows applications like GeForce Experience, Passmark Performance Test and SiSoftware Sandra can crash the VM. To avoid this, add:

echo "options kvm ignore_msrs=1" > /etc/modprobe.d/kvm.conf

If you see a lot of warning messages in your 'dmesg' system log, add the following instead:

echo "options kvm ignore_msrs=1 report_ignored_msrs=0" > /etc/modprobe.d/kvm.conf
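Module options only take effect when the kvm module is loaded, so it can help to confirm the running value afterwards. A minimal sketch that reads the parameter from sysfs, where boolean module parameters show up as Y or N; kvm_param is a hypothetical helper, and its optional second argument exists only so it can be tried against a scratch directory:

```shell
# kvm_param NAME [SYSFS_ROOT]
# Hypothetical helper: prints the current value of a kvm module parameter,
# or "unknown" if the module is not loaded or the parameter does not exist.
kvm_param() {
    root=${2:-/sys}
    file="$root/module/kvm/parameters/$1"
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "unknown"
    fi
}

kvm_param ignore_msrs
```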

Users have reported that NVIDIA Kepler K80 GPUs need this in the VM configuration file (vmid.conf):

args: -machine pc,max-ram-below-4g=1G

The 'romfile' Option

http://lime-technology.com/forum/index.php?topic=43644.msg482110#msg482110

Some motherboards can't pass through GPUs in the first PCI(e) slot by default, because the card's vBIOS is shadowed during bootup. You need to capture its vBIOS while it is working "normally" (i.e. installed in a different slot). Then you can move the card to slot 1 and start the VM using the dumped vBIOS.

To dump the bios:

cd /sys/bus/pci/devices/0000:01:00.0/
echo 1 > rom
cat rom > /usr/share/kvm/vbios.bin
echo 0 > rom

Then you can pass the vbios file (must be located in /usr/share/kvm/) with:

hostpci0: 01:00,x-vga=on,romfile=vbios.bin
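Before relying on a dumped file, a quick sanity check can catch an empty or truncated dump. PCI option ROMs begin with the two signature bytes 0x55 0xAA, so a minimal sketch can test for them; has_rom_signature is a hypothetical helper, and passing this check does not prove that the dump is complete or usable:

```shell
# has_rom_signature FILE
# Hypothetical helper: succeeds if FILE starts with the bytes 0x55 0xAA.
has_rom_signature() {
    sig=$(od -An -tx1 -N2 "$1" | tr -d ' \n')
    [ "$sig" = "55aa" ]
}

# On the host the file would be /usr/share/kvm/vbios.bin; a two-byte scratch
# file stands in for it here.
tmp=$(mktemp)
printf '\125\252' > "$tmp"    # octal escapes for 0x55 0xAA
if has_rom_signature "$tmp"; then
    echo "ROM signature present"
else
    echo "ROM signature missing, the dump may be incomplete"
fi
rm -f "$tmp"
```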

Troubleshooting

BAR 3: can't reserve [mem] error

If you have this error when you try to use the card for a VM:

vfio-pci 0000:04:00.0: BAR 3: can't reserve [mem 0xca000000-0xcbffffff 64bit]

you can try to add the following kernel commandline option:

video=efifb:off

See the documentation about editing the kernel commandline: https://pve.proxmox.com/pve-docs/chapter-sysadmin.html#sysboot_edit_kernel_cmdline
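To confirm that the option is active on the running kernel after a reboot, the command line can be read back from /proc/cmdline. A minimal sketch; cmdline_has is a hypothetical helper, and its optional second argument exists only so it can be tried against a sample file:

```shell
# cmdline_has OPTION [FILE]
# Hypothetical helper: succeeds if OPTION appears as a whole word in the
# kernel command line. FILE defaults to /proc/cmdline.
cmdline_has() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -Fx -- "$1" > /dev/null
}

if cmdline_has video=efifb:off; then
    echo "video=efifb:off is active"
else
    echo "video=efifb:off is not on the running kernel's command line"
fi
```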

SPICE

SPICE may cause trouble when passing through a GPU, as it presents a "virtual" PCI graphics card to the guest and some drivers have problems with that, even when both cards show up. If something fails, it's always worth disabling SPICE and checking again.

HDMI Audio crackling/broken

Some digital audio devices (usually added via GPU functions) may require MSI (Message Signaled Interrupts) to be enabled to function correctly. If you experience any issues, try changing MSI settings in the guest and rebooting the guest.

A Windows tool that simplifies this is available here: https://github.com/CHEF-KOCH/MSI-utility/releases/latest

Linux guests usually enable MSI by themselves. To force use of MSI for GPU audio devices, use the following command and reboot:

echo "options snd-hda-intel enable_msi=1" >> /etc/modprobe.d/snd-hda-intel.conf

Use 'lspci -vv' and check for the following line on your device to see if MSI is enabled:

Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+

If it says 'Enable+', MSI is working, 'Enable-' means it is supported but disabled, and if the line is missing, MSI is not supported by the PCIe hardware.
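The three cases can be told apart with a small filter over the "lspci -vv" output of a single device. A minimal sketch, assuming the "MSI: Enable+/-" line format shown above; msi_state is a hypothetical helper name:

```shell
# msi_state: hypothetical helper, reads `lspci -vv` output for one device on
# stdin and prints enabled, disabled, or unsupported.
msi_state() {
    # "|| true" keeps the helper usable in scripts running under `set -e`
    line=$(grep 'MSI: Enable' || true)
    case $line in
        *'MSI: Enable+'*) echo enabled ;;
        *'MSI: Enable-'*) echo disabled ;;
        *) echo unsupported ;;
    esac
}

# On the host you would run: lspci -vv -s 01:00.1 | msi_state
# Here the sample line from above is fed in directly.
printf '%s\n' 'Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+' | msi_state
```

The pattern deliberately matches "MSI:" only, so a separate "MSI-X:" capability line is not counted.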

This can potentially also improve performance for other passthrough devices, including GPUs, but that depends on the hardware being used.

BIOS options

Make sure you are using the most recent BIOS version for your mainboard. IOMMU groupings, or passthrough support in general, are often improved in later versions.

Some general BIOS options that might need changing to allow passthrough to work:

  • IOMMU or VT-d: Set to 'Enabled' or equivalent; 'Auto' often does not have the same effect
  • 'Legacy boot' or CSM: For GPU passthrough it can help to disable this, but keep in mind that PVE has to be installed in UEFI mode, as it will not boot in BIOS mode without this enabled. The reason for disabling this is that it avoids legacy VGA initialization of installed GPUs, making them able to be re-initialized later, as required for passthrough. Most useful when trying to use passthrough in single GPU systems.
  • 'Resizable BAR'/'Smart Access Memory': Some AMD GPUs (Vega and up) experience 'Code 43' in Windows guests if this is enabled on the host. It's not supported in VMs either way (yet), so the recommended setting is 'off'.
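To see whether a BIOS update or setting changed the IOMMU groupings, the groups can be listed from sysfs before and after. A minimal sketch, assuming the standard /sys/kernel/iommu_groups layout; list_iommu_groups is a hypothetical helper, and its optional argument exists only for dry runs against a scratch directory:

```shell
# list_iommu_groups [SYSFS_ROOT]
# Hypothetical helper: prints "group <n>: <pci address>" for every device.
list_iommu_groups() {
    root=${1:-/sys}
    for dev in "$root"/kernel/iommu_groups/*/devices/*; do
        [ -e "$dev" ] || continue        # no groups: IOMMU is off or unsupported
        group=${dev#"$root"/kernel/iommu_groups/}
        group=${group%%/*}
        echo "group $group: ${dev##*/}"
    done
}

list_iommu_groups
```

If nothing is printed, the IOMMU is not active on the host. Groups are listed in glob order, so group 10 appears before group 2.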

Verify Operation

Start the VM and enter the qm monitor on the CLI: "qm monitor vmnumber". Verify that your card is listed there: "info pci". Then install the drivers in your guest OS.

NOTE: Card support might be limited to 2 or 3 devices.

NOTE: A PCI device can only ever be attached to a single VM.

NOTE: This process will remove the card from the Proxmox host OS as long as the VM it's attached to is running.

NOTE: Using PCI passthrough to present drives directly to a ZFS (FreeNAS, Openfiler, OmniOS) virtual machine is OK for testing, but not recommended for production use. Specific FreeNAS warnings can be found here: http://forums.freenas.org/threads/absolutely-must-virtualize-freenas-a-guide-to-not-completely-losing-your-data.12714/

USB Passthrough

If you need to pass through USB devices (keyboard, mouse), please follow this wiki article:

https://pve.proxmox.com/wiki/USB_physical_port_mapping