[PVE-User] OpenVZ Bug?

Marc Aymerich glicerinu at gmail.com
Fri Jan 22 04:02:15 CET 2010


On Thu, Jan 21, 2010 at 6:32 PM, CDR <venefax at gmail.com> wrote:
> The bug is not in the paid version of OpenVZ, Virtuozzo. I just tested it.
> Could you please elaborate on what those stats mean, and why it is important
> to monitor them?
> I also ran the script for the server as a whole, and it is consistent there too.
> Federico
>

Hi Federico, thanks for running the test!

/proc/stat stores counters related to CPU usage, such as the time the
CPU spends in user mode, in system mode, idle, waiting for I/O, etc.
(You can see these values as percentages with the top command, on the
Cpu(s) line of the header.)

Tasks: 185 total, 1 running, 184 sleeping, 0 stopped, 0 zombie
Cpu(s): 7.0%us, 2.7%sy, 0.4%ni, 89.6%id, 0.1%wa, 0.2%hi, 0.0%si, 0.0%st
Mem: 3043516k total, 1955020k used, 1088496k free, 86044k buffers
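
As a rough sketch of how a tool like top can turn these counters into
percentages: take two samples of the aggregate cpu line and divide each
counter's increase by the increase of the total. The function name
cpu_share and the sample numbers are made up for illustration. The column
order (user, nice, system, idle, iowait, irq, softirq, steal) is the one
documented in man proc, and I am assuming that any columns after steal
should be left out of the total.

```shell
#!/bin/sh
# Hypothetical helper: reads two aggregate "cpu" lines on stdin (older
# sample first) and prints the idle and iowait share of the interval.
cpu_share() {
    LC_ALL=C awk '$1 == "cpu" {
        # Sum user..steal (columns 2-9); later columns are ignored here.
        total = 0
        for (i = 2; i <= 9 && i <= NF; i++) total += $i
        if (n++) {
            dt = total - ptotal
            if (dt > 0)
                printf "idle=%.1f%% iowait=%.1f%%\n",
                       100 * ($5 - pidle) / dt, 100 * ($6 - pio) / dt
        }
        ptotal = total; pidle = $5; pio = $6
    }'
}

# Two made-up samples: 1000 ticks elapsed, 800 of them idle, 100 in iowait.
printf '%s\n' \
    "cpu 100 0 100 700 100 0 0 0" \
    "cpu 150 0 150 1500 200 0 0 0" | cpu_share
```

On a live system the two samples could come from something like
{ head -n1 /proc/stat; sleep 1; head -n1 /proc/stat; } | cpu_share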

In particular, the fields I mentioned in my last mail (the ones that
give wrong values) are idle time (the amount of time all CPUs spend
doing nothing) and I/O wait time (the amount of time all CPUs spend
waiting for I/O to complete); see man proc for more information. I am
interested in monitoring I/O wait time on the HN (hardware node)
because we use an iSCSI array to store data, and I/O wait time can be
a useful indicator of a possible iSCSI bottleneck. Inside containers
I'm not really interested in it; I collect it there just for the
record. This bug is far from critical, but reporting wrong values
here is not nice, in my opinion.
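
Since both counters should only ever grow, one way to spot the bad
samples automatically is to flag any line where either value is lower
than in the last good line. This is only a sketch: the function name
flag_regressions is made up, and it assumes input in the "idle iowait"
form printed by the loop quoted further down in this message.

```shell
#!/bin/sh
# Hypothetical filter: reads "idle iowait" pairs on stdin and marks any
# line where either counter is lower than in the last unmarked line.
flag_regressions() {
    awk '
        seen && ($1 + 0 < idle || $2 + 0 < io) {
            print $0 " <- counter went backwards"
            next    # keep comparing against the last good sample
        }
        { print; idle = $1 + 0; io = $2 + 0; seen = 1 }
    '
}

# Three consecutive samples taken from the second container's output:
printf '%s\n' \
    "91706745 58747" \
    "217545 91548449" \
    "91707749 58747" | flag_regressions
```

Only the middle line gets marked here, because its idle value is lower
than the one before it. The same filter can be appended to the sampling
loop with a pipe after the final "done".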


> On Thu, Jan 21, 2010 at 2:49 AM, Marc Aymerich <glicerinu at gmail.com> wrote:
>>
>> Hi all!
>>
>> I'm using Proxmox VE 1.3, and while I was trying to figure out the
>> reason for abnormal CPU results from one of my Nagios checks I found a
>> bug in OpenVZ. Around 5% of the time, reading /proc/stat gives a
>> wrong value for the IDLE and/or IO CPU ticks (top, vmstat and ps
>> use this file). This only happens inside a container.
>>
>> I don't know if it is specific to this Proxmox version or if it's a
>> general OpenVZ bug. Before reporting the bug to the OpenVZ Bugzilla I
>> would like someone running a newer version of Proxmox to try to
>> reproduce it, to make sure it is not specific to Proxmox VE 1.3.
>>
>> If you are interested, you can try running this code inside
>> one of your containers (preferably one with some load):
>>
>> while true; do head -n1 /proc/stat | awk '{print $5 " " $6}'; sleep 1; done
>>
>> It shows the IDLE and IO CPU ticks stored in /proc/stat every second. If
>> you're affected by the bug you'll see something like this:
>>
>> ...
>> 6646470921 23857
>> 6646471423 23857
>> 6646471924 23857
>> 6646470921 23857
>> 6646471423 23857
>> 6646471924 23857
>> 6646470921 23857
>> 6646471423 23857
>> 6646471924 23857
>> 6646470921 23857
>> 6646471423 23857
>> 6646471924 23857
>> 167687 23857 <- wrong value
>> 6646472928 23857
>> 6646473430 23857
>> 6646473932 23857
>> 6646474433 23857
>> 6646474935 23857
>> 6646475437 23857
>> 6646475939 23857
>> 167687 6646332610 <- wrong value
>> 167687 6646333112 <- wrong value
>> 6646477444 23857
>> 6646477946 23857
>> ....
>> On another container:
>> ...
>> 91704735 58747
>> 91705238 58747
>> 91705740 58747
>> 91706243 58747
>> 91706745 58747
>> 217545 91548449 <- wrong value
>> 91707749 58747
>> 91708252 58747
>> 91708754 58747
>> 91709256 58747
>> 91709759 58747
>> 91710261 58747
>> 217545 91551966 <- wrong value
>> 91711266 58747
>> 91711769 58747
>> 217545 91553473 <- wrong value
>> 91712773 58747
>> 91713276 58747
>> 217545 91554981 <- wrong value
>> 91714281 58747
>> 91714784 58747
>> 91715286 58747
>> 91715788 58747
>> 91716291 58747
>> ....
>>
>> It happens randomly, so you may need to wait one or two minutes. Be
>> patient :)
>> And thank you very much!!
>>
>> --
>> Marc
>> _______________________________________________
>> pve-user mailing list
>> pve-user at pve.proxmox.com
>> http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>
>



-- 
Marc


