[pve-devel] [RFC ha-manager] fix #1073: do not count suspended for backup VM as running

Thomas Lamprecht t.lamprecht at proxmox.com
Fri Feb 24 17:43:07 CET 2017


When a stopped VM managed by HA was backed up in suspend mode, the HA
stack continuously tried to shut it down, because the check_running
call only checks whether a PID for the VM exists.
As the VM was locked, the shutdown attempts were blocked, but they
still produced a lot of annoying messages and task spawns.

Querying the VM status through the VM monitor is not cheap, so first
check whether the VM is locked with the backup lock. The config is
cached, so this check is quite cheap. Only if the lock is set, query
the VM's status over QMP and check for the prelaunch state.
This state is only set if KVM was started with the `-S` option and
has not yet continued guest operation.

Some performance results: I repeated each check 1000 times. The first
number is the total time spent in the check alone, the second is the
time per single check:

old check (vm runs):            87.117 ms/total =>  87.117 us/loop
new check (runs, no backup):   107.744 ms/total => 107.744 us/loop
new check (runs, backup):      760.337 ms/total => 760.337 us/loop

If the VM does not run, the time is the same as before (~6 us/loop
on my host). Otherwise we are normally ~30 us per check slower; in
the exceptional case of a suspend backup we are an order of
magnitude slower.

I assume that we quite seldom hit the case of an HA-stopped VM being
backed up in suspend mode, and even then we can handle 1000 VMs per
node quite well.

Signed-off-by: Thomas Lamprecht <t.lamprecht at proxmox.com>
---

Another idea of mine was to add a check in PVE::Tools::check_running for whether
the kvm command was started with the `-S` flag, but then we could not be sure
that the VM is still suspended and in the prelaunch state, as someone could have
triggered a `continue` monitor command.
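
For anyone who wants to take timings of the same shape as the ones quoted in
the commit message, a loop along these lines should do. This is only a minimal
sketch: `time_check` and the stub check are hypothetical names made up for
illustration, and the stub stands in for the real check_running call, which
needs a PVE environment.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Hypothetical helper (not part of PVE): call $check $loops times and
# return the total time in ms and the time per call in us.
sub time_check {
    my ($check, $loops) = @_;

    my $t0 = [gettimeofday];
    $check->() for 1 .. $loops;
    my $elapsed = tv_interval($t0); # seconds since $t0, as a float

    return ($elapsed * 1000, $elapsed * 1_000_000 / $loops);
}

# Stand-in for the real check; swap in the call that should be measured.
my $stub_check = sub { return -d "/proc/$$" ? 1 : 0 };

my ($total_ms, $us_per_loop) = time_check($stub_check, 1000);
printf "stub check: %.3f ms/total => %.3f us/loop\n", $total_ms, $us_per_loop;
```

With 1000 repetitions the total in ms and the per-loop time in us are the same
number, which is why both columns in the results above match.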

 src/PVE/HA/Resources/PVEVM.pm | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/src/PVE/HA/Resources/PVEVM.pm b/src/PVE/HA/Resources/PVEVM.pm
index 529d8fb..e8777a7 100644
--- a/src/PVE/HA/Resources/PVEVM.pm
+++ b/src/PVE/HA/Resources/PVEVM.pm
@@ -120,7 +120,21 @@ sub check_running {
 
     my $nodename = $haenv->nodename();
 
-    return PVE::QemuServer::check_running($vmid, 1, $nodename);
+    if (PVE::QemuServer::check_running($vmid, 1, $nodename)) {
+
+	# avoid shutdown tries if VM is stopped but its process runs for
+	# suspend backup mode. make a fast backup lock check and only then a
+	# slower qmp query-status
+	my $conf = PVE::QemuConfig->load_config($vmid, $nodename);
+	if (defined($conf->{lock}) && $conf->{lock} eq 'backup') {
+	    my $qmpstatus = PVE::QemuServer::vm_qmp_command($vmid, {execute => 'query-status'});
+	    return ($qmpstatus->{status} eq 'prelaunch') ? 0 : 1;
+	}
+
+	return 1;
+    } else {
+	return 0;
+    }
 }
 
 sub remove_locks {
-- 
2.1.4




