[PVE-User] HA Timing question

Thomas Lamprecht t.lamprecht at proxmox.com
Mon Sep 10 08:14:02 CEST 2018


On 9/7/18 4:28 PM, Klaus Darilion wrote:
> Am 07.09.2018 um 10:35 schrieb Dietmar Maurer:
>>> But what is the timing for starting VM100 on another node? Is it
>>> guaranteed that this only happens after 60 seconds? 
>>
>> yes, that is the idea.
> 
> I miss the point how this is achieved. Is there somewhere a timer of 60s
> before starting a VM on some other node? Where exactly in case I need to
> tune this? E.g if I would like to have such reboots and VM starting only
> after 5 minutes of cluster problems.

Ha, I guess you're the first who wants to increase this delay, most want it
to be in the range of mere seconds.

The problem is that several timers are involved. First, there's the
$fence_delay in the HA::NodeStatus module, which is the delay after which a
node gets marked as dead and to be fenced.
Then there's the node's watchdog, which, even if you increase the delay above,
will still trigger if the node is not quorate for 60 seconds, so this would
need changing too. The locks are per-node and time out 2 minutes after their
last update. As a node (or the current manager) can only do something while
it holds its lock, increasing the timeout here should not be too problematic,
theoretically, but this is not tested at all.
I'm just telling you what is where, not encouraging anything. If you still
want to hack around: great, but I wouldn't recommend starting in production :)
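
To make the interplay of the three timers a bit more concrete, here is a
minimal Python sketch. It is not PVE code and not how the HA stack is
implemented; HaTimers, node_reset_after and check are hypothetical names made
up for illustration. The 60 second default for fence_delay is an assumption
taken from the 60 seconds discussed in this thread, and the second check (the
lock should outlive the watchdog) is my reading of the design, not something
verified against the code.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HaTimers:
    """Hypothetical container for the three timers, all in seconds."""
    fence_delay: int = 60       # assumed default, based on the 60 s in this thread
    watchdog_timeout: int = 60  # node resets after this long without quorum
    lock_timeout: int = 120     # per-node lock expires this long after its last update


def node_reset_after(t: HaTimers) -> int:
    """The watchdog runs on its own timer, so fence_delay does not delay the reset."""
    return t.watchdog_timeout


def check(t: HaTimers) -> List[str]:
    """Return warnings for timer combinations that likely do not do what was intended."""
    warnings = []
    if t.fence_delay > t.watchdog_timeout:
        # Raising only fence_delay is not enough: the watchdog still fires first.
        warnings.append(
            "fence_delay exceeds watchdog_timeout: node still resets after %d s"
            % t.watchdog_timeout
        )
    if t.lock_timeout <= t.watchdog_timeout:
        # Assumption: the lock must outlive the watchdog, otherwise another node
        # could take over before the failed node has been reset.
        warnings.append("lock may expire before the watchdog has reset the node")
    return warnings
```

With the defaults, check(HaTimers()) reports nothing. Raising only the fence
delay, e.g. check(HaTimers(fence_delay=300)), yields the first warning, which
is the "this would need changing too" point from above.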

> 
> Are there some other not yet mentioned relevant timers in Proxmox
> (besides the timers in Corosync)?

Maybe give our HA documentation, especially the "How It Works"[0] and
"Fencing"[1] chapters, a read.

[0]: https://pve.proxmox.com/pve-docs/chapter-ha-manager.html#_how_it_works
[1]: https://pve.proxmox.com/pve-docs/chapter-ha-manager.html#ha_manager_fencing
