[pve-devel] avoiding VMID reuse

Thu Apr 12 12:32:08 CEST 2018

On Wed, Apr 11, 2018 at 03:12:17PM +0300, Lauri Tirkkonen wrote:
> On Tue, Mar 13 2018 11:15:23 +0200, Lauri Tirkkonen wrote:
> > > Sorry if I misunderstood you but VMIDs are already _guaranteed_ to be
> > > unique cluster wide, so also unique per node?
> > 
> > I'll try to clarify: if I create a VM that gets assigned vmid 100, and
> > use zfs for storage, its first disk image is called
> > <zfsstorage>/vm-100-disk-1. If I then later remove vmid 100, and create
> > another new VM, /nextid will suggest that the new vmid should be 100,
> > and its disk image will also be called vm-100-disk-1. We're backing up
> > our disk images using zfs snapshots and sends (on other ZFS hosts too,
> > not just PVE), so it's quite bad if we reuse a name for a completely
> > different dataset - it'll require manual admin intervention. So, we want
> > to avoid using any vmid that has been used in the past.
> > 
> > > Your approach, allowing different nodes from a cluster to alloc
> > > the same VMID also should not work, our clustered configuration
> > > file system (pmxcfs) does not allow this, i.e. no VMID.conf
> > > file can exist multiple times in the qemu-server and lxc folders
> > > (i.e., the folders holding CT/VM configuration files)
> > 
> > Right. We're not actually running PVE in a cluster configuration, so I
> > might've been a little confused there - if the VMID's are unique in the
> > cluster anyway, then the storage for the counter shouldn't be under
> > "local/", I suppose.
> 
> I took another stab at this, dropping the local/ and making it generally
> less error prone. So to recap, it:
>  - stores next unused vm id in /etc/pve/nextid
>  - returns that stored id in API requests for /cluster/nextid (or
>    highest current existing vmid+1, if nextid is lower and thus out of
>    sync)
>  - PVE::Cluster::alloc_vmid allocates the requested vm id, by storing
>    it into /etc/pve/nextid if it is higher than what there is currently
>    (using lower, non-existing id's is still allowed)
> 
> Thoughts?

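to make sure I read the recap right, here is a rough sketch of the
logic as I understand it. this is plain Python rather than the actual
Perl patch; the function names, the lower bound of 100 and the choice
to store "requested + 1" (rather than the requested ID itself) are my
assumptions, not taken from your code:

```python
LOWEST_VMID = 100  # assumption: smallest ID ever suggested


def suggest_nextid(stored_next, existing_ids):
    """Return the stored counter, or highest existing ID + 1 if the
    stored counter is lower (i.e. out of sync)."""
    highest_plus_one = max(existing_ids, default=LOWEST_VMID - 1) + 1
    return max(stored_next, highest_plus_one)


def alloc_vmid(stored_next, existing_ids, requested):
    """Allocate `requested` and return the new counter value.

    Lower, currently unused IDs are still allowed; they just do not
    move the counter.
    """
    if requested in existing_ids:
        raise ValueError(f"VMID {requested} already exists")
    return max(stored_next, requested + 1)
```

with this reading, the counter only ever moves forward, so an ID
handed out via the counter is not suggested again after the guest is
removed. manually chosen lower IDs can still be re-used.
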
some general dev workflow related comments:

- please send patch series as threads (cover letter and each patch as
  separate mail) and configure the subjectprefix accordingly for each
  repository. this allows inline comments on each patch.
- please version your patches/patch series so that it's easier to keep
  track of changes, and send new versions as new threads (unless you
  only do a minor fixup for a single patch of a big series)

if I understood you correctly, your intended use case is to prevent
accidentally re-using a guest ID formerly used by a no longer existing
guest, because of backups / replicated and/or leftover disks that
reference this ID?

I assume you have some kind of client / script for managing guests,
since we don't call /cluster/nextid in our GUI or anywhere else.
wouldn't the simple solution be to keep track of "already used" guest
IDs in your client? especially if you only have single nodes?
alternatively, you could just not use /cluster/nextid and instead use
your own counter (and increment and retry in case the chosen ID is
already taken - /cluster/nextid does not guarantee it will still be
available when you want to use it anyway..).
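
a minimal sketch of such a client-side counter, in Python. the
create_guest callable and the IdTaken exception are hypothetical
stand-ins for whatever your client uses to create a guest and to
detect "ID already exists"; persisting the counter between runs is
left to the caller:

```python
import itertools


class IdTaken(Exception):
    """Raised by the (hypothetical) create call if the ID is in use."""


def create_with_own_counter(create_guest, start, max_tries=1000):
    """Try IDs start, start+1, ... until one can be created.

    Returns (allocated_id, next_counter_value). The caller should
    store next_counter_value so that IDs are never handed out twice.
    """
    for vmid in itertools.islice(itertools.count(start), max_tries):
        try:
            create_guest(vmid)
        except IdTaken:
            continue  # someone else owns this ID, try the next one
        return vmid, vmid + 1
    raise RuntimeError(f"no free ID found in {max_tries} attempts")
```

since the create call itself is what detects a conflict, there is no
window between "ask for a free ID" and "use it".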

another approach would be to adapt your snapshot/sync scripts to remove
sync targets if the source gets removed, or do a forceful full sync if
an ID gets re-used. the latter is how PVE's builtin ZFS replication
works if it fails to find a snapshot combination that allows incremental
sending.
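
the "incremental if possible, full otherwise" decision could look
roughly like the following in a sync script. this is only an
illustration and not PVE's replication code; plan_send and its
arguments are made up for the example. snapshots are matched by their
ZFS guid property instead of by name, since a re-used ID can produce
identical names for unrelated data:

```python
def plan_send(dataset, source_snaps, target_guids):
    """Build the `zfs send` argument list for one dataset.

    source_snaps: list of (name, guid) tuples, oldest first.
    target_guids: guids of the snapshots present on the target.
    Returns None if the target already has the newest snapshot.
    """
    if not source_snaps:
        raise ValueError("source has no snapshots to send")
    newest_name, newest_guid = source_snaps[-1]
    if newest_guid in target_guids:
        return None  # target is up to date
    known = set(target_guids)
    # newest source snapshot that also exists on the target, if any
    common = next(
        (name for name, guid in reversed(source_snaps) if guid in known),
        None,
    )
    if common is None:
        # no shared history (e.g. re-used ID): fall back to a full send
        return ["zfs", "send", f"{dataset}@{newest_name}"]
    return ["zfs", "send", "-i", f"{dataset}@{common}",
            f"{dataset}@{newest_name}"]
```

in the full-send case the receiving side has to be prepared to replace
the old dataset; how to do that safely depends on your setup.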

I am a bit hesitant to introduce such special case heuristics,
especially since we don't know if anybody relies on the current
semantics of /cluster/nextid