[PVE-User] Servers spontaneously Rebooting

Mon Nov 30 08:27:21 CET 2015

On 11/30/2015 07:44 AM, Lindsay Mathieson wrote:
> On 30/11/15 16:14, Thomas Lamprecht wrote:
>> Do you have HA resources configured?
>
> Yes, several VM's - only setup since the 4.0 upgrade
>
>> If yes do you have quorum problems (with duration > 50-60 seconds)?
>
>
> Not that I know of :) Is there a log for this?

Normally corosync also logs to the syslog/journal, but you can 
configure it to log to its own logfile.
That requires the following adapted logging entry in the corosync config:

logging {
   debug: off
   to_syslog: yes
   # this is new:
   to_logfile: yes
   logfile: /var/log/corosync.lo
}
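
Once corosync writes to a file, you can filter it for quorum messages.
A minimal sketch: the helper name quorum_lines is made up for this mail,
and the match is a plain case-insensitive search for "quorum", so expect
some noise. Pass whatever path you set in the logfile entry above.

```shell
#!/bin/sh
# Hypothetical helper (not part of corosync or Proxmox VE):
# print every line of the given log file that mentions "quorum".
quorum_lines() {
    grep -i 'quorum' "$1"
}

# Usage (the path is whatever "logfile:" points to in your config):
#   quorum_lines /path/to/your/corosync/logfile
```

Lines that show the node losing and later regaining quorum, with more
than about a minute in between, are the ones to look for.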

>>
>> Also check the logs for watchdog entries.
>
>
> Sorry - which logs are those?

This should be in the standard syslog.

If you prefer to use journalctl, execute `mkdir /var/log/journal` to make 
those logs persistent; then you can browse the previous boot with
journalctl -b -1
where -1 denotes the previous boot.

If you haven't configured that, look in /var/log/syslog.
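
To narrow either source down to watchdog entries, a simple filter is
enough. Again only a sketch: watchdog_lines is a made-up helper name, and
it just does a case-insensitive search for "watchdog" on whatever you
pipe into it.

```shell
#!/bin/sh
# Hypothetical helper: keep only the lines from stdin that mention
# "watchdog" (case-insensitive).
watchdog_lines() {
    grep -i 'watchdog'
}

# With a persistent journal, check the previous boot:
#   journalctl -b -1 | watchdog_lines
# Without it, check the classic syslog file:
#   watchdog_lines < /var/log/syslog
```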
>
>
>
> Can HA cause a host to reboot? I thought it was just around restarting 
> VM's
>

Yes, it is about (re)starting VMs when there is a failure, but to ensure 
that a VM only runs once in the cluster (to avoid race conditions/ 
multiple access to shared resources) we need fencing.
Proxmox VE 4 uses self-fencing, which restarts the node via the watchdog 
if it has lost quorum for more than 60 seconds AND an HA resource is 
configured on this node, to ensure that all shared resources are free for 
the rest of the cluster.
see
# man ha-manager
and
http://pve.proxmox.com/wiki/High_Availability_Cluster_4.x#Fencing
for additional information.
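
The condition described above can be written down as a small sketch. This
is only an illustration of the rule as stated in this mail, not the
actual ha-manager code; the function name and its two arguments are
made up.

```shell
#!/bin/sh
# Illustration only, not the real implementation.
# $1 = seconds this node has been without quorum
# $2 = number of HA resources configured on this node
# Succeeds (exit 0) when the node would fence itself.
would_self_fence() {
    [ "$1" -gt 60 ] && [ "$2" -gt 0 ]
}

# Example:
#   would_self_fence 75 2 && echo "node would be reset by the watchdog"
```

So a node without any HA resource would not be reset, however long it is
without quorum, and a short quorum loss alone is not enough either.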

Regards,
Thomas