[pve-devel] [PATCH qemu-server v2] Fix #2412: Missing VMs in pools

Thomas Lamprecht t.lamprecht at proxmox.com
Tue Oct 15 11:33:53 CEST 2019


On 10/15/19 11:23 AM, Dominic Jäger wrote:
> On Mon, Oct 14, 2019 at 02:02:26PM +0200, Thomas Lamprecht wrote:
>> On 10/14/19 1:10 PM, Dominic Jäger wrote:
>>> Between calling vm_destroy and removing the ID from user.cfg (remove_vm_access)
>>> creating a new VM with this ID was possible. VMs could go missing from pools as
>>> a consequence.
>>>
>>> Adding a lock solves this for clones from the same node. Additionally,
>>> unlinking must happen at the very end of the deletion process to avoid that
>>> other nodes use the ID in the meanwhile.
>>>
>>> Co-developed-by: Fabian Grünbichler <f.gruenbichler at proxmox.com>
>>> Signed-off-by: Dominic Jäger <d.jaeger at proxmox.com>
>>> ---
>>> v1->v2: Move unlink to the end of the deletion process
>>>
>>>  PVE/API2/Qemu.pm | 11 ++++++++---
>>>  1 file changed, 8 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
>>> index 267a08e..f84aca7 100644
>>> --- a/PVE/API2/Qemu.pm
>>> +++ b/PVE/API2/Qemu.pm
>>> @@ -1492,9 +1492,14 @@ __PACKAGE__->register_method({
>>>  	    my $upid = shift;
>>>  
>>>  	    syslog('info', "destroy VM $vmid: $upid\n");
>>> -	    PVE::QemuServer::vm_destroy($storecfg, $vmid, $skiplock);
>>
>> so skiplock is now a dead parameter? I mean that /could/ be OK, but the
>> commit never mentions anywhere why this is the case. No users, just kept
>> for backward compat? If so I'd at least adapt the params description, so
>> people reading man pages or CLI help output can have an idea that this
>> does nothing.
>> Or is it in use and still required to be handled? If so, address that.
>> Either way, just "silently" removing it is not OK.
>>
> Removing the parameter was an accident. I'll think about it and send a v3.

thanks

> 
>>> -	    PVE::AccessControl::remove_vm_access($vmid);
>>> -            PVE::Firewall::remove_vmfw_conf($vmid);
>>> +	    PVE::QemuConfig->lock_config($vmid, sub {
>>> +		die "VM $vmid is running - destroy failed\n"
>>> +		    if (PVE::QemuServer::check_running($vmid));
>>> +		PVE::QemuServer::destroy_vm($storecfg, $vmid, 1, 0);
>>> +		PVE::AccessControl::remove_vm_access($vmid);
>>> +		PVE::Firewall::remove_vmfw_conf($vmid);
>>> +		unlink PVE::QemuConfig->config_file($vmid);
>>
>> unlink does not errors itself if the removal failed (e.g., pmxcfs went just
>> inquorate, so please do a
>> unlink ... or die "final removal of $vmid config failed: $!\n";
>>
>> to tell an user in such a case.
> 
> Ok. Is there a reason why we don't do this in PVE::QemuServer::destroy_vm or
> should we actually add the error handling there, too?

No there's no reason, it should be there too.

Also destroy_vm should be modified to only do the $keep_empty_config thing after
the "remove unused" volumes (separate patch/series) and maybe cleanup the double
nested eval there too (also separate patch)..





More information about the pve-devel mailing list