[pve-devel] [PATCH] increase zfs default timeout to 30sec

Thu Apr 12 14:13:50 CEST 2018

On Thu, Apr 12, 2018 at 02:59:22PM +0300, Lauri Tirkkonen wrote:
> On Thu, Apr 12 2018 12:59:13 +0200, Fabian Grünbichler wrote:
> > On Tue, Apr 10, 2018 at 05:40:50PM +0300, Lauri Tirkkonen wrote:
> > > Hi,
> > > 
> > > On Tue, Mar 13 2018 10:25:47 +0100, Thomas Lamprecht wrote:
> > > > What Fabian meant with:
> > > > > [...] our API has a timeout per request, [...]
> > > > 
> > > > is that our API already has 30 seconds as timeout for response,
> > > > so using 30 seconds here can be problematic.
> > > > 
> > > > As a quick easy improvement we could probably increase it from
> > > > 5 to say 20-25 seconds, still below the API timeout, but nonetheless
> > > > multiple times higher than now.
> > > 
> > > Now that the CLA stuff is out of the way, here's a diff to do that then.
> > 
> > if you bump the timeout to 20, you need to check every non-worker call
> > path that ends up in zfs_request to verify that you don't call
> > zfs_request twice (because then you'd have a total timeout of 40 which
> > is over the 30s API request limit).
> >
> > unless you have done that (and somehow provide a convincing argument
> > that you did), this cannot be applied. if you do that, you will quickly
> > realize that converting API endpoints that can trigger expensive storage
> > operations to workers makes more sense, which is why I originally
> > proposed that approach.
> 
> I have not done that. I submitted this diff because Thomas implied it
> could be done: a quick and easy fix, that fixed a real problem for us.

I already stated in previous replies that we can only bump it after
careful analysis ;)

> Knowing very close to zero about PVE code in general I'm much more
> hesitant to do things like converting things to workers - I imagine that
> will end up being a change so large that I'd likely give up before
> finishing. You don't have to take my diff, of course, but I have no
> plans to write your convert-to-worker diff either.

that's okay. in this case, we already have an async API call for what
you want to do - we'd simply need to switch the GUI to use it.

@Dominik: could we (either in general, or when detecting disks are
involved) use the async POST path instead of PUT for updating VM
configs?