[PVE-User] System hangs / CPU 100% Windows 2008 Server

Martin Schuchmann ms at city-pc.de
Wed Sep 5 12:49:50 CEST 2012


Hi all,

We have a cluster of 3 Proxmox servers and one serious problem with a Win 
2008 Std (not R2) guest: approximately every 5-15 days, at varying times, 
the CPU goes up to 100% and the system hangs. The most recent failure 
occurred today at 11:59:57 am. In the past the failure has also happened 
on a Sunday, when no one was working on the machine, so we do not think 
that any software installed on the Windows server itself causes the 
problem. The Windows event logs do not show anything either.

The Proxmox syslog shows the following (nodes 301 and 501 are located on 
Server 1 (local storage); the hanging Win2008 machine runs as node 402 on 
Server 2, also on local storage):

Sep 5 11:42:12 promo2 rrdcached[1847]: removing old journal /var/lib/rrdcached/journal//rrd.journal.1346830932.227122
Sep 5 11:59:24 promo2 pmxcfs[1869]: [dcdb] notice: data verification successful
Sep 5 12:00:01 promo2 /USR/SBIN/CRON[348613]: (root) CMD (vzdump 301 --quiet 1 --mode snapshot --compress lzo --maxfiles 18 --dumpdir /backup_sftp/vz/host1/hourly/)
Sep 5 12:00:01 promo2 /USR/SBIN/CRON[348614]: (root) CMD (vzdump 501 --quiet 1 --mode snapshot --compress lzo --maxfiles 12 --dumpdir /backup_sftp/vz/elvis/hourly/)
Sep 5 12:00:02 promo2 pmxcfs[1869]: [status] notice: received log
Sep 5 12:00:02 promo2 pmxcfs[1869]: [status] notice: received log
Sep 5 12:00:38 promo2 pmxcfs[1869]: [status] notice: received log
Sep 5 12:05:01 promo2 pmxcfs[1869]: [status] notice: received log


On earlier occasions, too, there seemed to be a possible connection 
between snapshot backups starting and node 402 hanging.
The destination for the backups is an SFTP server in another datacenter.
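
To check the suspected link between backup starts and the hangs more 
systematically, the syslog could be scanned for cron-launched vzdump jobs 
that begin close to a known hang time. The following is only a rough 
sketch: the function name vzdump_starts_near, the 5-minute window and the 
idea of taking the year from the hang time (syslog lines carry no year) 
are assumptions for illustration, and it expects unwrapped log lines as 
they appear in the log file itself.

```python
import re
from datetime import datetime, timedelta

# Matches cron-launched vzdump commands such as:
# "Sep 5 12:00:01 promo2 /USR/SBIN/CRON[348613]: (root) CMD (vzdump 301 ..."
VZDUMP_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ \S*CRON\[\d+\]: "
    r"\(\w+\) CMD \(vzdump (?P<vmid>\d+)"
)


def vzdump_starts_near(lines, hang_time, window_minutes=5):
    """Return (timestamp, vmid) pairs for vzdump cron starts near hang_time.

    Hypothetical helper: the window size is an arbitrary choice.
    """
    window = timedelta(minutes=window_minutes)
    hits = []
    for line in lines:
        m = VZDUMP_RE.match(line)
        if not m:
            continue
        # Syslog omits the year, so borrow it from hang_time (assumption).
        ts = datetime.strptime(
            f"{hang_time.year} {m.group('ts')}", "%Y %b %d %H:%M:%S"
        )
        if abs(ts - hang_time) <= window:
            hits.append((ts, int(m.group("vmid"))))
    return hits
```

Called with the lines of the syslog and 
datetime(2012, 9, 5, 11, 59, 57) as hang_time, this would list the vzdump 
jobs for 301 and 501 that started at 12:00:01. Repeating it for each 
recorded hang would show whether the coincidence holds every time; it 
cannot show which of the two caused the other.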

Has anyone experienced this behaviour?

Thank you in advance!

Regards, Martin


pveversion:

(We have not yet upgraded to the latest kernel, since problems with RAID 
controller drivers have been reported on the mailing list.)

pve-manager: 2.1-1 (pve-manager/2.1/f9b0f63a)
running kernel: 2.6.32-11-pve
proxmox-ve-2.6.32: 2.0-66
pve-kernel-2.6.32-10-pve: 2.6.32-63
pve-kernel-2.6.32-11-pve: 2.6.32-66
pve-kernel-2.6.32-7-pve: 2.6.32-60
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.3-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.8-3
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.7-2
pve-cluster: 1.0-26
qemu-server: 2.0-39
pve-firmware: 1.0-16
libpve-common-perl: 1.0-27
libpve-access-control: 1.0-21
libpve-storage-perl: 2.0-18
vncterm: 1.0-2
vzctl: 3.0.30-2pve5
vzprocps: 2.0.11-2
vzquota: 3.0.12-3
pve-qemu-kvm: 1.0-9
ksm-control-daemon: 1.1-1



