[pve-devel] ceph firefly in pve repos

Alexandre DERUMIER aderumier at odiso.com
Thu Sep 11 08:03:29 CEST 2014


It seems that dsync is super slow.
(I need to check that; I think Stefan discussed it some months ago.)

# dd if=rand.file of=/dev/sdb bs=4k count=65536 oflag=direct
65536+0 records in
65536+0 records out
268435456 bytes (268 MB) copied, 2.77433 s, 96.8 MB/s


# dd if=rand.file of=/dev/sdb bs=4k count=65536 oflag=dsync,direct
^C17228+0 records in
17228+0 records out
70565888 bytes (71 MB) copied, 70.4098 s, 1.0 MB/s
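
For a rough comparison, the two dd runs above can be reduced to writes per second and average time per 4k write. This is only a sketch in Python: the helper name dd_stats is made up for illustration, and the record counts and durations are copied from the dd output above.

```python
def dd_stats(records, seconds, block_size=4096):
    """Return (writes per second, ms per write, MB/s) for one dd run."""
    iops = records / seconds
    latency_ms = 1000.0 * seconds / records
    mb_per_s = records * block_size / seconds / 1e6  # dd reports decimal MB
    return iops, latency_ms, mb_per_s

# Values copied from the two dd runs (bs=4k).
direct = dd_stats(65536, 2.77433)   # oflag=direct
dsync = dd_stats(17228, 70.4098)    # oflag=dsync,direct (interrupted with ^C)

print("direct:       %.0f writes/s, %.3f ms per write, %.1f MB/s" % direct)
print("dsync,direct: %.0f writes/s, %.3f ms per write, %.1f MB/s" % dsync)
print("slowdown:     about %.0fx" % (direct[0] / dsync[0]))
```

With dsync each 4k write takes about 4 ms on this disk, so a single synchronous writer cannot go much beyond a few hundred writes per second.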



----- Original Message -----

From: "Alexandre DERUMIER" <aderumier at odiso.com>
To: "Dietmar Maurer" <dietmar at proxmox.com>
Cc: pve-devel at pve.proxmox.com
Sent: Thursday, 11 September 2014 07:57:07
Subject: Re: [pve-devel] ceph firefly in pve repos

I forgot the ceph tuning
------------------------
debug_lockdep = 0/0 
debug_context = 0/0 
debug_crush = 0/0 
debug_buffer = 0/0 
debug_timer = 0/0 
debug_filer = 0/0 
debug_objecter = 0/0 
debug_rados = 0/0 
debug_rbd = 0/0 
debug_journaler = 0/0 
debug_objectcacher = 0/0
debug_client = 0/0 
debug_osd = 0/0 
debug_optracker = 0/0 
debug_objclass = 0/0 
debug_filestore = 0/0 
debug_journal = 0/0 
debug_ms = 0/0 
debug_monc = 0/0 
debug_tp = 0/0 
debug_auth = 0/0 
debug_finisher = 0/0 
debug_heartbeatmap = 0/0 
debug_perfcounter = 0/0 
debug_asok = 0/0 
debug_throttle = 0/0 
debug_mon = 0/0 
debug_paxos = 0/0 
debug_rgw = 0/0 
osd_op_threads = 5 
filestore_op_threads = 4 

ms_nocrc = true 
cephx sign messages = false 
cephx require signatures = false 

ms_dispatch_throttle_bytes = 0 

#ceph 0.85 
throttler_perf_counter = false 
filestore_fd_cache_size = 64 
filestore_fd_cache_shards = 32 
osd_op_num_threads_per_shard = 1 
osd_op_num_shards = 25 
osd_enable_op_tracker = true 


ceph 0.85 results 
----------------- 
fio write : 1osd 
----------------- 
bw=11694KB/s, iops=2923 

fio read: 1osd 
--------------- 
bw=38642KB/s, iops=9660 (I clear the buffer cache every second to be sure the reads come from disk)
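
The mail does not say how the buffer cache was cleared. One common way on Linux is to write to /proc/sys/vm/drop_caches on the OSD node; the sketch below assumes that approach, and the function names are made up for illustration. It needs root to touch the real procfs file.

```python
import os
import time

DROP_CACHES = "/proc/sys/vm/drop_caches"  # standard Linux procfs knob

def drop_caches(path=DROP_CACHES, mode="3"):
    """Flush dirty pages, then ask the kernel to drop clean caches.

    mode "1" = page cache, "2" = dentries and inodes, "3" = both.
    """
    os.sync()  # dirty pages are not dropped, so write them back first
    with open(path, "w") as f:
        f.write(mode + "\n")

def drop_caches_every(interval=1.0, rounds=10, path=DROP_CACHES):
    """Call drop_caches() once per interval, `rounds` times."""
    for _ in range(rounds):
        drop_caches(path)
        time.sleep(interval)

# Example (as root, while the fio read test is running):
# drop_caches_every(interval=1.0, rounds=60)
```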



Now setting osd_enable_op_tracker = false
-----------------------------------------


fio read : 1 osd : optracker disabled
-------------------------------------
bw=80606KB/s, iops=20151 (all CPUs at 100%): GREAT!

fio write : 1 osd : optracker disabled
--------------------------------------
bw=11630KB/s, iops=2907 
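
To make the effect of the op tracker easier to see, the iops figures above can be compared directly. A small sketch; the dictionary layout and the speedup helper are only illustrative, and the numbers are copied from the ceph 0.85 results above.

```python
# iops copied from the ceph 0.85 results above (1 OSD).
results = {
    "read": {"tracker_on": 9660, "tracker_off": 20151},
    "write": {"tracker_on": 2923, "tracker_off": 2907},
}

def speedup(on, off):
    """Ratio of iops without the op tracker to iops with it."""
    return off / on

for op, r in results.items():
    ratio = speedup(r["tracker_on"], r["tracker_off"])
    print("%-5s %6d -> %6d iops (x%.2f)"
          % (op, r["tracker_on"], r["tracker_off"], ratio))
```

Reads roughly double, while writes stay where they were, so the write limit is somewhere else.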




Now, I don't understand why writes are so slow.


fio on sdb 
---------- 
bw=257806KB/s, iops=64451 

Device:  rrqm/s  wrqm/s   r/s       w/s  rkB/s      wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm   %util
sdb        0.00    0.00  0.00  63613.00   0.00  254452.00      8.00     31.24   0.49     0.00     0.49   0.02  100.00

fio on osd through librbd 
------------------------- 
bw=11630KB/s, iops=2907 

Device:  rrqm/s  wrqm/s   r/s       w/s  rkB/s      wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm   %util
sdb        0.00  355.00  0.00   5225.00   0.00   29678.00     11.36     57.63  11.03     0.00    11.03   0.19   99.70
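
The two fio/iostat pairs above can be put side by side to see how much the disk writes per client write. This is a sketch under assumptions: the helper name disk_vs_client is made up, each iostat line is a single sample while fio reports an average, and it is assumed that the iostat sample was taken during the corresponding fio run.

```python
def disk_vs_client(client_kb_s, client_iops, disk_wkb_s, disk_w_s):
    """Compare what fio reported with what iostat saw on the disk."""
    return {
        "kb_ratio": disk_wkb_s / client_kb_s,        # KB on disk per KB written by fio
        "io_ratio": disk_w_s / client_iops,          # disk writes per fio write
        "avg_disk_write_kb": disk_wkb_s / disk_w_s,  # average size of one disk write
    }

# fio directly on sdb: bw=257806KB/s, iops=64451; iostat: 63613 w/s, 254452 wkB/s
raw = disk_vs_client(257806, 64451, 254452.00, 63613.00)
# fio through librbd: bw=11630KB/s, iops=2907; iostat: 5225 w/s, 29678 wkB/s
rbd = disk_vs_client(11630, 2907, 29678.00, 5225.00)

for name, r in (("raw sdb", raw), ("librbd", rbd)):
    print("%-8s KB ratio %.2f, IO ratio %.2f, avg disk write %.2f KB"
          % (name, r["kb_ratio"], r["io_ratio"], r["avg_disk_write_kb"]))
```

On the raw device the ratios are close to 1 and the average disk write is 4 KB, which matches avgrq-sz = 8 sectors. Through librbd the disk sees roughly 2.5 times the bytes and 1.8 times the writes that fio issues, and it is already at about 100% util with a much higher await.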



----- Original Message -----

From: "Alexandre DERUMIER" <aderumier at odiso.com>
To: "Dietmar Maurer" <dietmar at proxmox.com>
Cc: pve-devel at pve.proxmox.com
Sent: Thursday, 11 September 2014 06:22:21
Subject: Re: [pve-devel] ceph firefly in pve repos

Hi, here are my first results (firefly, Crucial M550, 1 osd per node, replication x1):

fio config 
----------- 
[global] 
#logging 
#write_iops_log=write_iops_log 
#write_bw_log=write_bw_log 
#write_lat_log=write_lat_log 
ioengine=rbd 
clientname=admin 
pool=test 
rbdname=test 
invalidate=0 # mandatory 
rw=randwrite 
#rw=randread 
bs=4k 
direct=1 
numjobs=4 
group_reporting=1 

[rbd_iodepth32] 
iodepth=32 



fio write - 3 osd - 1 per node
------------------------------
bw=15068KB/s, iops=3767 


fio read - 3 osd - 1 per node
-----------------------------
bw=130792KB/s, iops=32698 


fio write : 1 osd only 
------------------------ 
bw=5009.6KB/s, iops=1252 

fio read (osd) : 1 osd only 
--------------------------- 
bw=57016KB/s, iops=14254 


Reads seem to come from the buffer cache; I don't know how to bypass it.


About writes:

iostat shows around 100% util:

Device:  rrqm/s  wrqm/s   r/s       w/s  rkB/s      wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm   %util
sdb        0.00  355.00  0.00   5225.00   0.00   29678.00     11.36     57.63  11.03     0.00    11.03   0.19   99.70



But fio on the local disk shows a lot more iops:

read 4k : fio --filename=/dev/sdb --direct=1 --rw=randread --bs=4k --iodepth=32 --group_reporting --invalidate=0 --name=abc --ioengine=aio 
bw=294312KB/s, iops=73577 

write 4k : fio --filename=/dev/sdb --direct=1 --rw=randwrite --bs=4k --iodepth=32 --group_reporting --invalidate=0 --name=abc --ioengine=aio 
bw=257806KB/s, iops=64451 




I'm going to try ceph 0.85 to see whether it improves things.

----- Original Message -----

From: "Dietmar Maurer" <dietmar at proxmox.com>
To: "VELARTIS Philipp Dürhammer" <p.duerhammer at velartis.at>, "Alexandre DERUMIER (aderumier at odiso.com)" <aderumier at odiso.com>
Cc: pve-devel at pve.proxmox.com
Sent: Wednesday, 10 September 2014 13:22:46
Subject: RE: [pve-devel] ceph firefly in pve repos

> I would like to know your 4k results. (to see maximal iops performance) With the 
> option -b 4096 (you used the standard 4M with 16 threads) With 4MByte tests I 
> think it should be easy to max out the network... 

OK, here are the results running the fio rbd example benchmark (4K test): 

# ./fio examples/rbd.fio 
rbd_iodepth32: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=rbd, iodepth=32 
fio-2.1.11-20-g9a44 
Starting 1 process 
rbd engine: RBD version: 0.1.8 
Jobs: 1 (f=1): [w(1)] [99.7% done] [0KB/5790KB/0KB /s] [0/1447/0 iops] [eta 00m:01s] 
rbd_iodepth32: (groupid=0, jobs=1): err= 0: pid=494227: Wed Sep 10 13:00:56 2014 
write: io=2048.0MB, bw=5875.2KB/s, iops=1468, runt=356952msec 

So 1468 is not really very high. 
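
As a sanity check on the fio summary line above, bandwidth, iops and runtime can be cross-checked against the 4K block size. A sketch only: parse_fio_summary is a hypothetical helper written for this one line format (fio 2.1.x style), not a general fio parser.

```python
import re

# Summary line copied from the fio output above.
LINE = "write: io=2048.0MB, bw=5875.2KB/s, iops=1468, runt=356952msec"

def parse_fio_summary(line):
    """Extract io (MB), bw (KB/s), iops and runtime (ms) from one summary line."""
    m = re.search(
        r"io=([\d.]+)MB, bw=([\d.]+)KB/s, iops=(\d+), runt=\s*(\d+)msec", line)
    if m is None:
        raise ValueError("unrecognised fio summary line")
    return {
        "io_mb": float(m.group(1)),
        "bw_kb_s": float(m.group(2)),
        "iops": int(m.group(3)),
        "runt_ms": int(m.group(4)),
    }

s = parse_fio_summary(LINE)
kb_per_io = s["bw_kb_s"] / s["iops"]                            # expect about 4 (bs=4K)
bw_from_runtime = s["io_mb"] * 1024 / (s["runt_ms"] / 1000.0)   # KB/s

print("KB per IO:       %.3f" % kb_per_io)
print("bw from runtime: %.1f KB/s" % bw_from_runtime)
```

Both checks agree with the reported figures, so the 1468 iops is consistent with 4K random writes over the whole run of about 357 seconds.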

@Alexandre: what values do you get? 
_______________________________________________ 
pve-devel mailing list 
pve-devel at pve.proxmox.com 
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel 