[pve-devel] need help to debug random host freeze on multiple hosts

Michael Rasmussen mir at datanom.net
Mon Dec 29 22:13:26 CET 2014


On Mon, 29 Dec 2014 20:05:35 +0100 (CET)
Alexandre DERUMIER <aderumier at odiso.com> wrote:

> 
> Here are the details of the microcode patch:
> 
> 815 Processor May Read Partially Updated Branch Status Register
> Description
> Under a highly specific and detailed set of internal timing conditions, the processor may read an internal branch
> status register (BSR) while the register is being updated resulting in an incorrect rIP.
> Potential Effect on System
> The incorrect rIP causes unpredictable program or system behavior, usually observed as a page fault.
> Suggested Workaround
> Contact your AMD representative for information on a BIOS update.
> Fix Planned
> No fix planned
> 
> 
> 
> I had another crash this afternoon, and this host had been at around 90% CPU usage since 12h (but the load average was OK).
> So maybe higher CPU usage gives more chance of hitting this case.
> 
> I have patched the BIOS on this host; I'll wait and see whether it improves things or not.
> 
Sounds like a race condition to me. I also think high load from lots
of KVM processes is one way to trigger this behavior.
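
To confirm that the BIOS update really changed the microcode on each host, a rough sketch like the one below might help. It assumes the kernel reports a "microcode" field per logical CPU in /proc/cpuinfo (x86 Linux kernels generally do); the function names are made up for illustration, and it does not know which revision contains the fix for the erratum.

```python
from collections import Counter


def microcode_revisions(cpuinfo_text):
    """Count the microcode revisions found in /proc/cpuinfo-style text.

    Returns a Counter mapping revision string -> number of logical CPUs.
    """
    revisions = Counter()
    for line in cpuinfo_text.splitlines():
        # Lines look like "microcode<TAB>: 0x...", so split on the first colon.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "microcode":
            revisions[value.strip()] += 1
    return revisions


def host_microcode_revisions(path="/proc/cpuinfo"):
    """Read the given cpuinfo file (assumed path) and count its revisions."""
    with open(path) as f:
        return microcode_revisions(f.read())
```

Running host_microcode_revisions() before and after the BIOS flash and comparing the results should show whether the revision changed. More than one revision on the same host would suggest the update was not applied to every core.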

-- 
Hilsen/Regards
Michael Rasmussen

Get my public GnuPG keys:
michael <at> rasmussen <dot> cc
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E
mir <at> datanom <dot> net
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C
mir <at> miras <dot> org
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917
--------------------------------------------------------------
/usr/games/fortune -es says:
"Shelter," what a nice name for a place where you polish your cat.