[pve-devel] [PATCH qemu-server 1/3] fix #2816: restore: remove timeout when allocating disks

Fiona Ebner f.ebner at proxmox.com
Mon Sep 25 12:30:38 CEST 2023


On 25.09.23 at 10:57, Dominik Csapak wrote:
> On 9/25/23 10:46, Fiona Ebner wrote:
>> On 20.09.23 at 13:23, Dominik Csapak wrote:
>>> On 9/12/23 11:16, Fiona Ebner wrote:
>>>> @@ -7483,14 +7483,11 @@ sub restore_vma_archive {
>>>>            $devinfo->{$devname} = { size => $size, dev_id => $dev_id };
>>>>            } elsif ($line =~ m/^CTIME: /) {
>>>>            # we correctly received the vma config, so we can disable
>>>> -        # the timeout now for disk allocation (set to 10 minutes, so
>>>> -        # that we always timeout if something goes wrong)
>>>> -        alarm(600);
>>>> +        # the timeout now for disk allocation
>>
>> I would interpret this comment about disabling of the timeout to be
>> talking about the short 5 second timeout for reading the config.
> 
> ok, i interpreted it to be disabling *any* timeout to be able
> to allocate the disks properly, and since there is only one global
> timeout here, selectively disabling one seems strange?

With that interpretation the code would be wrong of course.

> i get what you mean, but maybe that would warrant a comment on the
> function?
> or maybe we should be able to clean up half allocated disks in there
> in case the outer timeout triggers?

AFAICS, my patch didn't change the cleanup behavior, and what you suggest
already happens: the allocation is within an eval, and we call
restore_destroy_volumes() if there was an error during allocation (that
also applies to a timeout error).
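
To illustrate the cleanup path described above, here is a minimal sketch of
the eval-then-destroy pattern. It is not the code from
restore_vma_archive(): allocate_disk(), destroy_volumes() and
restore_disks() are hypothetical stand-ins, and the timeout is simulated by
a plain die, which is also what a SIGALRM handler would typically do.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the real allocation and cleanup helpers.
my @allocated;
my @destroyed;

sub allocate_disk {
    my ($name, $error) = @_;
    # Any die in here (a storage error or a timeout raised by an alarm
    # handler) propagates to the surrounding eval.
    die $error if defined($error);
    push @allocated, $name;
}

sub destroy_volumes {
    # Remove everything that was allocated so far.
    push @destroyed, @allocated;
    @allocated = ();
}

sub restore_disks {
    my (@specs) = @_;

    eval {
        allocate_disk(@$_) for @specs;
    };
    if (my $err = $@) {
        # Same path for every kind of error, including a timeout.
        destroy_volumes();
        die $err;
    }
    return scalar(@allocated);
}

# The second allocation fails with a simulated timeout.
eval { restore_disks(['disk-0'], ['disk-1', "got timeout\n"]) };
my $restore_error = $@;
```

After the failed call, $restore_error holds the timeout message and the
already allocated 'disk-0' has been handed to the cleanup helper.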

> 
> in any case, i'd find it good to improve the comment that speaks of
> 'disabling the timeout' that it's meant to only disable the inner 5s one.
> 

It can be seen from the code, but feel free to send a patch to improve it ;)

>> AFAICS, we do similar "delay" of the outer timeout in e.g.
>> run_with_timeout(), where it can also take up to $inner_timeout +
>> $outer_timeout seconds to hit the outer timeout.
> 
> 
> exactly, only our "inner" timeout here is undefined/unlimited because
> disk allocation can take forever?
> 

Why treat the operation as having an unlimited inner timeout? What is the
benefit? I'd expect our caller to have a good reason to set a timeout if
it does, so why not try to honor it?
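
For reference, the "delay" of the outer timeout mentioned above can be
sketched as follows. This is a simplified, hypothetical helper
(with_inner_timeout()), not the actual run_with_timeout() code; it only
assumes that the outer alarm is suspended on entry and re-armed with the
saved value afterwards.

```perl
use strict;
use warnings;

# Hypothetical, simplified helper: not the actual PVE::Tools code.
sub with_inner_timeout {
    my ($inner_timeout, $code) = @_;

    # Suspend the caller's (outer) alarm and remember what was left of it.
    my $outer_remaining = alarm(0);

    my $res = eval {
        local $SIG{ALRM} = sub { die "inner timeout\n" };
        alarm($inner_timeout);
        my $r = $code->();
        alarm(0);
        $r;
    };
    my $err = $@;
    alarm(0);

    # Re-arm the outer alarm with the value saved on entry. The time spent
    # inside $code is not subtracted, so the outer deadline moves back by
    # up to $inner_timeout seconds.
    alarm($outer_remaining);

    die $err if $err;
    return $res;
}

local $SIG{ALRM} = sub { die "outer timeout\n" };
alarm(5);    # outer timeout set by the caller

# The work takes about one second; 4-argument select is used instead of
# sleep() to avoid mixing sleep() and alarm().
with_inner_timeout(3, sub { select(undef, undef, undef, 1); 1 });

my $left_after_inner = alarm(0);    # disarm and read back the remainder
```

In this sketch the outer alarm is still honored, just later than the
caller asked for: $left_after_inner reads back as the full outer budget
even though about one second was spent inside the helper.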




