[pve-devel] [PATCH qemu-server v2] Catch qmp socket connection errors, so we can output a more specific error message

Emmanuel Kasper e.kasper at proxmox.com
Thu Jul 27 11:25:41 CEST 2017


It can happen that the qmp connection gets lost while mirroring a disk.
In that case the current block job gets cancelled, but the real cause of the failure
is lost, because we die() at a later step with the generic message
die "$job: mirroring has been cancelled\n"

example:
...
drive-scsi0: transferred: 5524946944 bytes remaining: 918355968 bytes total: 6443302912 bytes progression: 85.75 % busy: 1 ready: 0
drive-scsi0: Cancelling block job
drive-scsi0: Done.
2017-07-26 15:39:56 ERROR: online migrate failure - mirroring error: drive-scsi0: mirroring has been cancelled
2017-07-26 15:39:56 aborting phase 2 - cleanup resources
2017-07-26 15:39:56 migrate_cancel
...

after patch applied:
2017-07-27 09:43:37 ERROR: online migrate failure - mirroring error: lost connection to qemu machine protocol: VM 600 not running
2017-07-27 09:43:37 aborting phase 2 - cleanup resources
---
changes since v1:
 * declare and assign my $stats directly. No need to have three lines here
 when one is clear enough
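
For illustration only, the eval-and-rethrow pattern used in this patch can be
sketched as a standalone script. fake_mon_cmd() and query_block_jobs() are
hypothetical stand-ins, not functions from PVE/QemuServer.pm; the stub simply
dies with the message seen in the log above, assuming that is how a lost
connection would surface.

```perl
use strict;
use warnings;

# Hypothetical stand-in for vm_mon_cmd(): it always dies, imitating a
# lost qmp connection with the message taken from the log above.
sub fake_mon_cmd {
    my ($vmid, $cmd) = @_;
    die "VM $vmid not running\n";
}

# Hypothetical wrapper: run the monitor command inside eval and re-throw
# any failure with context, in the same way the patched loop does.
sub query_block_jobs {
    my ($vmid) = @_;
    my $stats = eval { fake_mon_cmd($vmid, "query-block-jobs"); };
    die "lost connection to qemu machine protocol socket: $@\n" if $@;
    return $stats;
}

# Usage: capture the re-thrown error instead of letting it end the script.
my $err = '';
eval { query_block_jobs(600); 1 } or $err = $@;
print $err;
```

The caller now sees the underlying cause (here "VM 600 not running") instead
of only the generic cancellation message.
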
 PVE/QemuServer.pm | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 1f34101..3086375 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -6033,7 +6033,8 @@ sub qemu_drive_mirror_monitor {
 	while (1) {
 	    die "storage migration timed out\n" if $err_complete > 300;
 
-	    my $stats = vm_mon_cmd($vmid, "query-block-jobs");
+	    my $stats = eval { vm_mon_cmd($vmid, "query-block-jobs"); };
+	    die "lost connection to qemu machine protocol socket: $@\n" if $@;
 
 	    my $running_mirror_jobs = {};
 	    foreach my $stat (@$stats) {
-- 
2.11.0