[pve-devel] applied: [PATCH] backport fixes for multiple KVM vulnerabilities

Thomas Lamprecht t.lamprecht at proxmox.com
Mon Feb 25 16:15:16 CET 2019


Signed-off-by: Thomas Lamprecht <t.lamprecht at proxmox.com>
---
 ...l_create_device-reference-counting-C.patch | 60 +++++++++++++++++++
 ...tionally-cancel-preemption-timer-in-.patch | 46 ++++++++++++++
 ...und-leak-of-uninitialized-stack-cont.patch | 50 ++++++++++++++++
 3 files changed, 156 insertions(+)
 create mode 100644 patches/kernel/0012-kvm-fix-kvm_ioctl_create_device-reference-counting-C.patch
 create mode 100644 patches/kernel/0013-KVM-nVMX-unconditionally-cancel-preemption-timer-in-.patch
 create mode 100644 patches/kernel/0014-KVM-x86-work-around-leak-of-uninitialized-stack-cont.patch

diff --git a/patches/kernel/0012-kvm-fix-kvm_ioctl_create_device-reference-counting-C.patch b/patches/kernel/0012-kvm-fix-kvm_ioctl_create_device-reference-counting-C.patch
new file mode 100644
index 0000000..0829c98
--- /dev/null
+++ b/patches/kernel/0012-kvm-fix-kvm_ioctl_create_device-reference-counting-C.patch
@@ -0,0 +1,60 @@
+From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
+From: Jann Horn <jannh at google.com>
+Date: Mon, 25 Feb 2019 11:48:05 +0000
+Subject: [PATCH] kvm: fix kvm_ioctl_create_device() reference counting
+ (CVE-2019-6974)
+
+kvm_ioctl_create_device() does the following:
+
+1. creates a device that holds a reference to the VM object (with a borrowed
+   reference, the VM's refcount has not been bumped yet)
+2. initializes the device
+3. transfers the reference to the device to the caller's file descriptor table
+4. calls kvm_get_kvm() to turn the borrowed reference to the VM into a real
+   reference
+
+The ownership transfer in step 3 must not happen before the reference to the VM
+becomes a proper, non-borrowed reference, which only happens in step 4.
+After step 3, an attacker can close the file descriptor and drop the borrowed
+reference, which can cause the refcount of the kvm object to drop to zero.
+
+This means that we need to grab a reference for the device before
+anon_inode_getfd(), otherwise the VM can disappear from under us.
+
+Fixes: 852b6d57dc7f ("kvm: add device control API")
+Cc: stable at kernel.org
+Signed-off-by: Jann Horn <jannh at google.com>
+Signed-off-by: Paolo Bonzini <pbonzini at redhat.com>
+
+CVE-2019-6974
+
+(cherry picked from commit cfa39381173d5f969daf43582c95ad679189cbc9)
+Signed-off-by: Tyler Hicks <tyhicks at canonical.com>
+Signed-off-by: Thomas Lamprecht <t.lamprecht at proxmox.com>
+---
+ virt/kvm/kvm_main.c | 3 ++-
+ 1 file changed, 2 insertions(+), 1 deletion(-)
+
+diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
+index 234d03abcb75..238ddbc127e1 100644
+--- a/virt/kvm/kvm_main.c
++++ b/virt/kvm/kvm_main.c
+@@ -2908,8 +2908,10 @@ static int kvm_ioctl_create_device(struct kvm *kvm,
+ 	if (ops->init)
+ 		ops->init(dev);
+ 
++	kvm_get_kvm(kvm);
+ 	ret = anon_inode_getfd(ops->name, &kvm_device_fops, dev, O_RDWR | O_CLOEXEC);
+ 	if (ret < 0) {
++		kvm_put_kvm(kvm);
+ 		mutex_lock(&kvm->lock);
+ 		list_del(&dev->vm_node);
+ 		mutex_unlock(&kvm->lock);
+@@ -2917,7 +2919,6 @@ static int kvm_ioctl_create_device(struct kvm *kvm,
+ 		return ret;
+ 	}
+ 
+-	kvm_get_kvm(kvm);
+ 	cd->fd = ret;
+ 	return 0;
+ }
diff --git a/patches/kernel/0013-KVM-nVMX-unconditionally-cancel-preemption-timer-in-.patch b/patches/kernel/0013-KVM-nVMX-unconditionally-cancel-preemption-timer-in-.patch
new file mode 100644
index 0000000..6b55182
--- /dev/null
+++ b/patches/kernel/0013-KVM-nVMX-unconditionally-cancel-preemption-timer-in-.patch
@@ -0,0 +1,46 @@
+From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
+From: Peter Shier <pshier at google.com>
+Date: Mon, 25 Feb 2019 11:48:06 +0000
+Subject: [PATCH] KVM: nVMX: unconditionally cancel preemption timer in
+ free_nested (CVE-2019-7221)
+
+Bugzilla: 1671904
+
+There are multiple code paths where an hrtimer may have been started to
+emulate an L1 VMX preemption timer that can result in a call to free_nested
+without an intervening L2 exit where the hrtimer is normally
+cancelled. Unconditionally cancel in free_nested to cover all cases.
+
+Embargoed until Feb 7th 2019.
+
+Signed-off-by: Peter Shier <pshier at google.com>
+Reported-by: Jim Mattson <jmattson at google.com>
+Reviewed-by: Jim Mattson <jmattson at google.com>
+Reported-by: Felix Wilhelm <fwilhelm at google.com>
+Cc: stable at kernel.org
+Message-Id: <20181011184646.154065-1-pshier at google.com>
+Signed-off-by: Paolo Bonzini <pbonzini at redhat.com>
+
+CVE-2019-7221
+
+(backported from commit ecec76885bcfe3294685dc363fd1273df0d5d65f)
+[tyhicks: Backport to 4.18:
+ - free_nested() is in arch/x86/kvm/vmx.c]
+Signed-off-by: Tyler Hicks <tyhicks at canonical.com>
+Signed-off-by: Thomas Lamprecht <t.lamprecht at proxmox.com>
+---
+ arch/x86/kvm/vmx.c | 1 +
+ 1 file changed, 1 insertion(+)
+
+diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
+index 7ade6cb125d3..37b095e7f00a 100644
+--- a/arch/x86/kvm/vmx.c
++++ b/arch/x86/kvm/vmx.c
+@@ -7681,6 +7681,7 @@ static void free_nested(struct vcpu_vmx *vmx)
+ 	if (!vmx->nested.vmxon && !vmx->nested.smm.vmxon)
+ 		return;
+ 
++	hrtimer_cancel(&vmx->nested.preemption_timer);
+ 	vmx->nested.vmxon = false;
+ 	vmx->nested.smm.vmxon = false;
+ 	free_vpid(vmx->nested.vpid02);
diff --git a/patches/kernel/0014-KVM-x86-work-around-leak-of-uninitialized-stack-cont.patch b/patches/kernel/0014-KVM-x86-work-around-leak-of-uninitialized-stack-cont.patch
new file mode 100644
index 0000000..bc12f72
--- /dev/null
+++ b/patches/kernel/0014-KVM-x86-work-around-leak-of-uninitialized-stack-cont.patch
@@ -0,0 +1,50 @@
+From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
+From: Paolo Bonzini <pbonzini at redhat.com>
+Date: Mon, 25 Feb 2019 11:48:07 +0000
+Subject: [PATCH] KVM: x86: work around leak of uninitialized stack contents
+ (CVE-2019-7222)
+
+Bugzilla: 1671930
+
+Emulation of certain instructions (VMXON, VMCLEAR, VMPTRLD, VMWRITE with
+memory operand, INVEPT, INVVPID) can incorrectly inject a page fault
+when passed an operand that points to an MMIO address.  The page fault
+will use uninitialized kernel stack memory as the CR2 and error code.
+
+The right behavior would be to abort the VM with a KVM_EXIT_INTERNAL_ERROR
+exit to userspace; however, it is not an easy fix, so for now just
+ensure that the error code and CR2 are zero.
+
+Embargoed until Feb 7th 2019.
+
+Reported-by: Felix Wilhelm <fwilhelm at google.com>
+Cc: stable at kernel.org
+Signed-off-by: Paolo Bonzini <pbonzini at redhat.com>
+
+CVE-2019-7222
+
+(cherry picked from commit 353c0956a618a07ba4bbe7ad00ff29fe70e8412a)
+Signed-off-by: Tyler Hicks <tyhicks at canonical.com>
+Signed-off-by: Thomas Lamprecht <t.lamprecht at proxmox.com>
+---
+ arch/x86/kvm/x86.c | 7 +++++++
+ 1 file changed, 7 insertions(+)
+
+diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
+index b3df576413cd..13804929adce 100644
+--- a/arch/x86/kvm/x86.c
++++ b/arch/x86/kvm/x86.c
+@@ -4632,6 +4632,13 @@ int kvm_read_guest_virt(struct kvm_vcpu *vcpu,
+ {
+ 	u32 access = (kvm_x86_ops->get_cpl(vcpu) == 3) ? PFERR_USER_MASK : 0;
+ 
++	/*
++	 * FIXME: this should call handle_emulation_failure if X86EMUL_IO_NEEDED
++	 * is returned, but our callers are not ready for that and they blindly
++	 * call kvm_inject_page_fault.  Ensure that they at least do not leak
++	 * uninitialized kernel stack memory into cr2 and error code.
++	 */
++	memset(exception, 0, sizeof(*exception));
+ 	return kvm_read_guest_virt_helper(addr, val, bytes, vcpu, access,
+ 					  exception);
+ }
-- 
2.20.1
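
Note on patch 0012: the ordering problem described in its commit message can
be modelled in plain userspace C. This is only a rough sketch, not kernel
code. Every name in it (toy_vm, toy_get, toy_put, toy_publish_then_close,
toy_create_device) is hypothetical and stands in for struct kvm,
kvm_get_kvm(), kvm_put_kvm() and the anon_inode_getfd() step. It assumes the
worst case, where the device fd is closed immediately after it becomes
visible.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct kvm: only the reference count matters here. */
struct toy_vm {
    int refcount;   /* starts at 1: the reference held by the VM fd */
    bool freed;     /* set once the last reference is dropped */
};

static void toy_get(struct toy_vm *vm)
{
    vm->refcount++;
}

static void toy_put(struct toy_vm *vm)
{
    assert(vm->refcount > 0);
    if (--vm->refcount == 0)
        vm->freed = true;   /* the real kernel would destroy the VM here */
}

/*
 * Models "install the device fd, then another thread closes it at once".
 * Closing the device fd drops the reference the device holds on the VM.
 */
static void toy_publish_then_close(struct toy_vm *vm)
{
    toy_put(vm);
}

/* Returns true if the VM was freed while the creator was still using it. */
static bool toy_create_device(struct toy_vm *vm, bool take_ref_first)
{
    if (take_ref_first)
        toy_get(vm);            /* patched order: own the reference first */

    toy_publish_then_close(vm); /* fd is visible, a close may race in */

    if (vm->freed)
        return true;            /* the window that exists in the old order */

    if (!take_ref_first)
        toy_get(vm);            /* old order: reference taken too late */
    return false;
}
```

With a VM whose refcount starts at 1, toy_create_device(&vm, false) reports
that the VM was freed, while toy_create_device(&vm, true) leaves the refcount
at 1 and the VM alive. The error path of the real patch (kvm_put_kvm() when
anon_inode_getfd() fails) is not modelled here.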




