[PVE-User] HDD errors in VMs

Emmanuel Kasper e.kasper at proxmox.com
Thu Jan 7 14:03:19 CET 2016


On 01/07/2016 01:51 PM, Michael Pöllinger wrote:
> Hi Emmanuel.

Hi Michael

> Due to the total crash of the VM there is no logging for that time period.
> So sadly we can't provide any logs.
> 
> On Sunday we removed one node from our cluster and set it up again with
> Proxmox 4.1.
> Today the whole cluster server crashed with no output on the console. We
> could only reset it via IPMI.
> It is now up and running again. Sadly our customers are starting to get
> angry (understandably).
> 
> I have been sitting here going through all the log files for hours, but I
> can't find anything.
> There is only ONE strange entry in /var/log/pveproxy/access.log:
> 115.231.222.14 - - [07/Jan/2016:04:05:40 +0100] "GET
> http://zc.qq.com/cgi-bin/common/attr?id=260714&r=0.4883031276012825
> HTTP/1.1" 501 -

This is unexpected indeed, but this should not crash the whole system.
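
The request target in that entry is a full external URL rather than a local path, which looks like someone probing for an open proxy, and the 501 status means the request was not served. To check whether the log contains more requests of this kind, a minimal Python sketch could look like the one below. It assumes every entry is a single line laid out like the one quoted above; the function name and the second sample line are made up for illustration and do not come from your system.

```python
import re

# Assumed layout, taken from the entry quoted above:
# ip, two fields, [timestamp], "METHOD target PROTOCOL", status, size
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def find_proxy_probes(lines):
    """Return (ip, time, target, status) for requests whose target is an absolute URL."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip lines that do not fit the assumed layout
        target = m.group("target")
        if target.lower().startswith(("http://", "https://")):
            hits.append(
                (m.group("ip"), m.group("time"), target, int(m.group("status")))
            )
    return hits


if __name__ == "__main__":
    sample = [
        # The entry from the quoted log, joined back into one line.
        '115.231.222.14 - - [07/Jan/2016:04:05:40 +0100] '
        '"GET http://zc.qq.com/cgi-bin/common/attr?id=260714&r=0.4883031276012825 '
        'HTTP/1.1" 501 -',
        # Hypothetical ordinary request, for contrast only.
        '192.0.2.10 - - [07/Jan/2016:04:06:01 +0100] "GET /api2/json/nodes HTTP/1.1" 200 512',
    ]
    for hit in find_proxy_probes(sample):
        print(hit)
```

To run it against the real file, pass the open file object instead of the sample list, e.g. `find_proxy_probes(open("/var/log/pveproxy/access.log"))`.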


> At 4:24, during the backup of one VM, the whole node crashed.
> 
> I could understand the problem if:
> - it always happened at the same times, e.g. during backups or cron jobs
> - it was due to high load or errors on the RAID
> 
> but it always happens at different times and under different load.
> Sometimes VMs crash while simply doing nothing (idle).
> 
> Here is a screenshot from the new Proxmox 4.1 node:
> http://imgur.com/GILhyw9
> 
> Some time ago there was a per-incident ticket option. Has it gone
> completely?
> 
> Problems now:
> - Persistent sporadic crashes of Debian 8.2 VMs
> - New one: complete crash of a Proxmox node.
> 

For this new one I very much suspect some hardware problem.

Yes, we stopped offering the per-incident ticket option, but you can always
get a standard subscription for your PVE host and downgrade it after
the one-year period.

Remote access to your system would indeed allow us to diagnose this much faster.


Emmanuel



