[PVE-User] System hangs / CPU 100% Windows 2008 Server

Massimo Santoro massimo at tuxel.com
Wed Sep 5 13:14:18 CEST 2012


On 05/09/2012 12:49, Martin Schuchmann wrote:
> Hi all,
>
> We have a cluster of 3 Proxmox servers and one serious problem on a
> Win 2008 Std (not R2) guest: approximately every 5-15 days, at varying
> times, the CPU goes up to 100% and the system hangs. The most recent
> failure occurred today at 11:59:57 am. In the past the failure has also
> happened on a Sunday, when no one was working on the machine, so we do
> not think that any software installed on the Windows server itself
> causes the problem. The Windows event logs do not show anything
> either.
>
> The Proxmox syslog says (nodes 301 and 501 are located on Server 1
> (local storage); the hanging Win2008 machine runs as node 402 on
> Server 2, also on local storage):
>
> Sep 5 11:42:12 promo2 rrdcached[1847]: removing old journal 
> /var/lib/rrdcached/journal//rrd.journal.1346830932.227122
> Sep 5 11:59:24 promo2 pmxcfs[1869]: [dcdb] notice: data verification 
> successful
> Sep 5 12:00:01 promo2 /USR/SBIN/CRON[348613]: (root) CMD (vzdump 301 
> --quiet 1 --mode snapshot --compress lzo --maxfiles 18 --dumpdir 
> /backup_sftp/vz/host1/hourly/)
> Sep 5 12:00:01 promo2 /USR/SBIN/CRON[348614]: (root) CMD (vzdump 501 
> --quiet 1 --mode snapshot --compress lzo --maxfiles 12 --dumpdir 
> /backup_sftp/vz/elvis/hourly/)
> Sep 5 12:00:02 promo2 pmxcfs[1869]: [status] notice: received log
> Sep 5 12:00:02 promo2 pmxcfs[1869]: [status] notice: received log
> Sep 5 12:00:38 promo2 pmxcfs[1869]: [status] notice: received log
> Sep 5 12:05:01 promo2 pmxcfs[1869]: [status] notice: received log
>
>
> In the past there also seemed to be a possible connection between
> snapshots starting and node 402 hanging.
> The destination for the backups is an SFTP server in another datacenter.
>
> Has anyone experienced this behaviour?


Yes, we have, many times. Everything was solved (really!) after a BIOS
update. We have many HP and Dell servers with Xeon 3xxx and 5xxx series
CPUs, and all of them suffered from a CPU microcode problem, which was
solved at the end of 2010 / beginning of 2011. Look for a BIOS update.
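
To see which BIOS and microcode revision a host is currently running before
looking for an update, something along these lines may help. It is only a
rough sketch: the function names are made up for illustration, it assumes a
Linux host that exposes DMI data under /sys/class/dmi/id, and the
"microcode" field in /proc/cpuinfo is only printed by newer kernels, so it
may well be missing on your nodes.

```python
#!/usr/bin/env python3
"""Report BIOS and CPU microcode versions on a Linux host (sketch)."""
from pathlib import Path

# Assumed locations: the usual sysfs DMI directory and /proc/cpuinfo on Linux.
DMI_DIR = Path("/sys/class/dmi/id")
CPUINFO = Path("/proc/cpuinfo")


def read_dmi(field, dmi_dir=DMI_DIR):
    """Return one DMI field (e.g. 'bios_version'), or None if unreadable."""
    try:
        return (Path(dmi_dir) / field).read_text().strip()
    except OSError:
        return None


def microcode_revisions(cpuinfo_text):
    """Return the sorted distinct 'microcode' values in /proc/cpuinfo text.

    Kernels that do not print this field yield an empty list.
    """
    revisions = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "microcode":
            revisions.add(value.strip())
    return sorted(revisions)


if __name__ == "__main__":
    for field in ("bios_vendor", "bios_version", "bios_date"):
        print(f"{field}: {read_dmi(field) or 'unavailable'}")
    try:
        text = CPUINFO.read_text()
    except OSError:
        text = ""
    found = ", ".join(microcode_revisions(text))
    print("microcode:", found or "not reported by this kernel")
```

Compare the reported BIOS version and date with the latest release listed by
the server vendor for that model.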

Massimo Santoro


