[PVE-User] Reboot on psu failure in redundant setup

Daniel Berteaud daniel at firewall-services.com
Fri Nov 8 16:35:09 CET 2019



----- On 8 Nov 19, at 16:22, Mark Adams mark at openvs.co.uk wrote:

> Hi All,
> 
> This cluster is on 5.4-11.
> 
> This is most probably a hardware issue either with ups or server psus, but
> wanted to check if there is any default watchdog or auto reboot in a
> proxmox HA cluster.
> 
> Explanation of what happened:
> 
> All servers have redundant psu, being fed from separate ups in
> separate racks on separate feeds. One of the UPS went out, and when it did
> all nodes rebooted. They were functioning normally after the reboot, but I
> wasn't expecting the reboot to occur.
> 
> When the UPS went down, it also took down all of the core network because
> the power was not connected up in a redundant fashion. Ceph and "LAN"
> traffic was blocked because of this. Did a watchdog reboot each node
> because it lost contact with its cluster peers? I didn't configure it to do
> this myself, so is this an automatic feature? Everything I have read says
> it should be configured manually.
> 
> Thanks in advance.

Yes, that's expected. When nodes are isolated from each other, any node that loses quorum fences itself (using a software watchdog). This prevents corruption and allows services to be recovered on the quorate part of the cluster. In your case there was no quorate part, because there was no network at all, so every node rebooted.
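
To make the quorum reasoning concrete, here is a minimal Python sketch. It is not Proxmox or corosync code: the function names, the node names and the three-node cluster size are all assumptions for illustration (the cluster size was not given in this thread), and it assumes one vote per node with a strict-majority rule.

```python
from typing import List


def has_quorum(reachable_votes: int, total_votes: int) -> bool:
    """Strict majority: more than half of all votes must be reachable.

    reachable_votes counts the node itself plus every peer it can still see.
    """
    return reachable_votes > total_votes // 2


def quorate_partitions(partitions: List[List[str]], total_votes: int) -> List[List[str]]:
    """Return the partitions that keep quorum, assuming one vote per node.

    Nodes in any other partition would be expected to self-fence.
    """
    return [p for p in partitions if has_quorum(len(p), total_votes)]


# Hypothetical three-node cluster, names invented for the example.
# Total network loss: every node only sees itself, so no partition is quorate.
isolated = quorate_partitions([["node1"], ["node2"], ["node3"]], total_votes=3)

# A 2/1 split: the two-node side stays quorate, the lone node does not.
split = quorate_partitions([["node1", "node2"], ["node3"]], total_votes=3)
```

With the whole network down, `isolated` is empty: no node can see a majority, which matches every node rebooting. In the 2/1 split, only the `node1`/`node2` side would remain to recover services.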

Cheers
Daniel

-- 
[ https://www.firewall-services.com/ ] 	
Daniel Berteaud 
FIREWALL-SERVICES SAS, Network security 
Free Software Services Company 
Tel: +33.5 56 64 15 32 
Matrix: @dani:fws.fr 



