[PVE-User] Proxmox with ceph storage VM performance strangeness

Wed Mar 18 08:47:55 CET 2020

Hi Rainer,

On 17/3/20 at 16:58, Rainer Krienke wrote:
> thanks for your answer,
Keep in mind that I haven't used iothreads myself; what I told you is
what I have learned here and elsewhere. Alexandre and Alwin are the
experts on this ;)
> if I understand you correctly, then iothreads can only help if the VM
> has more than one disk, hence your proposal to build a raid0 on two rbd
> devices. The disadvantage of this solution would of course be that disk
> usage would be doubled.
Not necessarily, just create more, smaller disks. Create a striped
raid0 and add it as a PV to LVM, then create the LVs you need.

Alwin is right that this will make disk management more complex...
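
As a rough, untested sketch of what I mean: the function below only
prints the commands instead of running them, because mdadm --create
destroys whatever is on the member disks. The device names /dev/sdb to
/dev/sde, the array /dev/md0 and the VG name vg_data are just examples,
not anything from your setup.

```shell
#!/bin/sh
# Dry-run sketch: print the commands that would build a RAID0 array over
# several small virtual disks and put an LVM volume group on top of it.
# All device and volume group names passed in are illustrative assumptions.
plan_striped_lvm() {
    md_dev=$1
    vg_name=$2
    shift 2    # the remaining arguments are the member disks
    echo "mdadm --create $md_dev --level=0 --raid-devices=$# $*"
    echo "pvcreate $md_dev"
    echo "vgcreate $vg_name $md_dev"
}

plan_striped_lvm /dev/md0 vg_data /dev/sdb /dev/sdc /dev/sdd /dev/sde
```

With four 25G disks instead of one 100G disk the total usage stays the
same. LVs are then created on the VG with lvcreate as usual.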
> A fileserver VM I manage (not yet productive) could profit from this. I
> use LVM on it anyway and I could use striped LVs, so those volumes would
> read from more vm pv disks. Should help I guess.
>
> The hosts' CPU is an AMD EPYC 7402 24-Core Processor. Does it make sense
> to select a specific CPU type for the VM? My test machines have a
> default kvm64 processor. The number of processors should then probably
> be at least equal to the number of disks (number of iothreads)?
If all hosts have the same CPU, then use "host" type CPU.
> Do you know if it makes any difference whether I use the VirtIO
> SCSI-driver versus the Virtio-SCSI-single driver?
I haven't tried -single, maybe others can comment on this.
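
For reference, an untested sketch of the settings discussed above, using
the Proxmox qm CLI. As far as I understand the Proxmox documentation,
the iothread option on SCSI disks needs the VirtIO SCSI single
controller, which gives every disk its own controller. The VM ID 100,
the storage name ceph-rbd and the 32 GiB disk size are placeholders, and
the function only prints the commands instead of running them.

```shell
#!/bin/sh
# Dry-run sketch: print qm commands for the host CPU type, the VirtIO SCSI
# single controller, and extra disks with iothread enabled.
# VM ID, storage name and disk size (GiB) are hypothetical placeholders.
plan_vm_iothreads() {
    vmid=$1
    storage=$2
    size_gib=$3
    disk_count=$4
    echo "qm set $vmid --cpu host"
    echo "qm set $vmid --scsihw virtio-scsi-single"
    i=1
    while [ "$i" -le "$disk_count" ]; do
        # scsi0 is assumed to be the existing system disk, so start at scsi1
        echo "qm set $vmid --scsi$i $storage:$size_gib,iothread=1"
        i=$((i + 1))
    done
}

plan_vm_iothreads 100 ceph-rbd 32 4
```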

Cheers
Eneko
>
> Thank you very much
> Rainer
>
> On 17.03.20 at 14:10, Eneko Lacunza wrote:
>> Hi,
>>
>> You can try to enable IO threads and assign multiple Ceph disks to the
>> VM, then build some kind of raid0 to increase performance.
>>
>> Generally speaking, an SSD-based Ceph cluster is considered to perform
>> well when a VM gets about 2000 IOPS, and factors like CPU single-thread
>> performance, network and disks have to be selected with care. Also, the
>> servers' energy saving should be disabled, etc.
>>
>> What CPUs are in those 9 nodes?
>>
>> Ceph is built for parallel access and scaling. You're only using 1
>> thread of your VM for disk IO currently.
>>
>> Cheers
>> Eneko
>>
>> On 17/3/20 at 14:04, Rainer Krienke wrote:
>>> Hello,
>>>
>>> I run a pve 6.1-7 cluster with 5 nodes that is attached (via 10Gb
>>> Network) to a ceph nautilus cluster with 9 ceph nodes and 144 magnetic
>>> disks. The pool with rbd images for disk storage is erasure coded with a
>>> 4+2 profile.
>>>
>>> I ran some performance tests since I noticed that there seems to be a
>>> strange limit to the disk read/write rate on a single VM even if the
>>> physical machine hosting the VM as well as cluster is in total capable
>>> of doing much more.
>>>
>>> So what I did was to run a bonnie++ as well as a dd read/write test
>>> first in parallel on 10 VMs, then on 5 VMs and at last on a single one.
>>>
>>> A value of "75" for "bo++rd" in the first line below means that each of
>>> the 10 bonnie++-processes running on 10 different proxmox VMs in
>>> parallel reported, averaged over all the results, a value of
>>> 75MBytes/sec for "block read". The ceph-values are the peaks measured by
>>> ceph itself during the test run (all rd/wr values in MBytes/sec):
>>>
>>> VM-count:  bo++rd:  bo++wr:  ceph(rd/wr):  dd-rd:  dd-wr:  ceph(rd/wr):
>>> 10           75       42      540/485        55      58     698/711
>>> 5            90       62      310/338        47      80     248/421
>>> 1           108      114      111/120       130     145     337/165
>>>
>>>
>>> What I find a little strange is that running many VMs doing IO in
>>> parallel I reach a write rate of about 485-711 MBytes/sec. However when
>>> running a single VM the maximum is at 120-165 MBytes/sec. Since the
>>> whole networking is based on a 10Gb infrastructure and an iperf test
>>> between a VM and a ceph node reported nearly 10Gb I would expect a
>>> higher rate for the single VM. Even if I run a test with 5 VMs on *one*
>>> physical host (values not shown above), the results are not far behind
>>> the values for 5 VMs on 5 hosts shown above. So the single host seems
>>> not to be the limiting factor, but the VM itself is limiting IO.
>>>
>>> What rates do you find on your proxmox/ceph cluster for single VMs?
>>> Does anyone have any explanation for this rather big difference or
>>> perhaps an idea what to try in order to get higher IO-rates from a
>>> single VM?
>>>
>>> Thank you very much in advance
>>> Rainer
>>>
>>>
>>>
>>> ---------------------------------------------
>>> Here are the more detailed test results for anyone interested:
>>>
>>> Using bonnie++:
>>> 10 VMs (two on each of the 5 hosts) VMs: 4GB RAM, BTRFS, cd /root;
>>> bonnie++ -u root
>>>     Average for each VM:
>>>     block write: ~42MByte/sec, block read: ~75MByte/sec
>>>     ceph: total peak: 485MByte/sec write, 540MByte/sec read
>>>
>>> 5 VMs (one on each of the 5 hosts) 4GB RAM, BTRFS, cd /root; bonnie++ -u
>>> root
>>>     Average for each VM:
>>>     block write: ~62MByte/sec, block read: ~90MByte/sec
>>>     ceph: total peak: 338MByte/sec write, 310MByte/sec read
>>>
>>> 1 VM  4GB RAM, BTRFS, cd /root; bonnie++ -u root
>>>     Average for VM:
>>>     block write: ~114 MByte/sec, block read: ~108MByte/sec
>>>     ceph: total peak: 120 MByte/sec write, 111MByte/sec read
>>>
>>>
>>> Using dd:
>>> 10 VMs (two on each of the 5 hosts) VMs: 4GB RAM, write on a ceph based
>>> vm-disk "sdb" (rbd)
>>>     write: dd if=/dev/zero of=/dev/sdb bs=nnn count=kkk conv=fsync
>>> status=progress
>>>     read:  dd of=/dev/null if=/dev/sdb bs=nnn count=kkk  status=progress
>>>     Average for each VM:
>>>     bs=1024k count=12000: dd write: ~58MByte/sec, dd read: ~48MByte/sec
>>>     bs=4096k count=3000:  dd write: ~59MByte/sec, dd read: ~55MByte/sec
>>>     ceph: total peak: 711MByte/sec write, 698 MByte/sec read
>>>
>>> 5 VMs (one on each of the 5 hosts) VMs: 4GB RAM, write on a ceph based
>>> vm-disk "sdb" (rbd)
>>>     write: dd if=/dev/zero of=/dev/sdb bs=4096k count=3000 conv=fsync
>>> status=progress
>>>     read:  dd of=/dev/null if=/dev/sdb bs=4096k count=3000
>>> status=progress
>>>     Average for each VM:
>>>     bs=4096k count=3000:  dd write: ~80 MByte/sec, dd read: ~47MByte/sec
>>>     ceph: total peak: 421MByte/sec write, 248 MByte/sec read
>>>
>>> 1 VM: 4GB RAM, write on a ceph based vm-disk "sdb" (rbd-device)
>>>     write: dd if=/dev/zero of=/dev/sdb bs=4096k count=3000 conv=fsync
>>> status=progress
>>>     read:  dd of=/dev/null if=/dev/sdb bs=4096k count=3000
>>> status=progress
>>>     Average for each VM:
>>>     bs=4096k count=3000:  dd write: ~145 MByte/sec, dd read: ~130
>>> MByte/sec
>>>     ceph: total peak: 165 MByte/sec write, 337 MByte/sec read
>>

-- 
Technical Director
Binovo IT Human Project, S.L.
Telf. 943569206
Astigarragako bidea 2, 2º izq. oficina 11; 20180 Oiartzun (Gipuzkoa)
www.binovo.es