[pve-devel] ceph firefly in pve repos

Stefan Priebe - Profihost AG s.priebe at profihost.ag
Thu Sep 11 11:38:33 CEST 2014


Hi,
Am 11.09.2014 um 08:03 schrieb Alexandre DERUMIER:
> It seems that dsync is super slow.
> (I need to check that; I think Stefan discussed this some months ago)
> 
> # dd if=rand.file of=/dev/sdb bs=4k count=65536 oflag=direct
> 65536+0 records in
> 65536+0 records out
> 268435456 bytes (268 MB) copied, 2.77433 s, 96.8 MB/s
> 
> 
> # dd if=rand.file of=/dev/sdb bs=4k count=65536 oflag=dsync,direct
> ^C17228+0 records in
> 17228+0 records out
> 70565888 bytes (71 MB) copied, 70.4098 s, 1.0 MB/s

yes, dsync does a flush after each bs-sized write, which is awfully slow.
If you include my kernel patch, you can skip this for SSDs with
capacitors, as they can safely write the data down even in case of a
power failure.

Stefan


> ----- Original Message -----
> 
> From: "Alexandre DERUMIER" <aderumier at odiso.com> 
> To: "Dietmar Maurer" <dietmar at proxmox.com> 
> Cc: pve-devel at pve.proxmox.com 
> Sent: Thursday, 11 September 2014 07:57:07 
> Subject: Re: [pve-devel] ceph firefly in pve repos 
> 
> forget the ceph tuning 
> ---------------------- 
> debug_lockdep = 0/0 
> debug_context = 0/0 
> debug_crush = 0/0 
> debug_buffer = 0/0 
> debug_timer = 0/0 
> debug_filer = 0/0 
> debug_objecter = 0/0 
> debug_rados = 0/0 
> debug_rbd = 0/0 
> debug_journaler = 0/0 
> debug_objectcacher = 0/0 
> debug_client = 0/0 
> debug_osd = 0/0 
> debug_optracker = 0/0 
> debug_objclass = 0/0 
> debug_filestore = 0/0 
> debug_journal = 0/0 
> debug_ms = 0/0 
> debug_monc = 0/0 
> debug_tp = 0/0 
> debug_auth = 0/0 
> debug_finisher = 0/0 
> debug_heartbeatmap = 0/0 
> debug_perfcounter = 0/0 
> debug_asok = 0/0 
> debug_throttle = 0/0 
> debug_mon = 0/0 
> debug_paxos = 0/0 
> debug_rgw = 0/0 
> osd_op_threads = 5 
> filestore_op_threads = 4 
> 
> ms_nocrc = true 
> cephx sign messages = false 
> cephx require signatures = false 
> 
> ms_dispatch_throttle_bytes = 0 
> 
> #ceph 0.85 
> throttler_perf_counter = false 
> filestore_fd_cache_size = 64 
> filestore_fd_cache_shards = 32 
> osd_op_num_threads_per_shard = 1 
> osd_op_num_shards = 25 
> osd_enable_op_tracker = true 
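> 
> (for reference, the debug levels above can be changed on running OSDs
> without a restart via injectargs, a sketch assuming an admin keyring;
> the threading and sharding options generally need an OSD restart:)
> 
> ceph tell osd.* injectargs '--debug-osd 0/0 --debug-ms 0/0'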
> 
> 
> ceph 0.85 results 
> ----------------- 
> fio write: 1 osd 
> ----------------- 
> bw=11694KB/s, iops=2923 
> 
> fio read: 1 osd 
> --------------- 
> bw=38642KB/s, iops=9660 (I clear the buffer cache each second to be sure reads come from disk) 
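> 
> (the per-second cache drop is just a loop like this on the OSD node, a
> minimal sketch, run as root:)
> 
> while true; do echo 3 > /proc/sys/vm/drop_caches; sleep 1; done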
> 
> 
> 
> Now setting: osd_enable_op_tracker = false 
> --------------------------------------------- 
> 
> 
> fio read: 1 osd, optracker disabled 
> ------------------------------------ 
> bw=80606KB/s, iops=20151 (all CPUs at 100%): GREAT! 
> 
> fio write: 1 osd, optracker disabled 
> ------------------------------------ 
> bw=11630KB/s, iops=2907 
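> 
> (the op tracker can also be toggled on a live OSD through the admin
> socket, a sketch with the socket path assumed:)
> 
> ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config set osd_enable_op_tracker false
> ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config get osd_enable_op_tracker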
> 
> 
> 
> 
> Now, I don't understand why writes are so slow. 
> 
> 
> fio on sdb 
> ---------- 
> bw=257806KB/s, iops=64451 
> 
> Device:  rrqm/s  wrqm/s   r/s       w/s  rkB/s      wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm   %util 
> sdb        0.00    0.00  0.00  63613.00   0.00  254452.00      8.00     31.24   0.49     0.00     0.49   0.02  100.00 
> 
> fio on osd through librbd 
> ------------------------- 
> bw=11630KB/s, iops=2907 
> 
> Device:  rrqm/s  wrqm/s   r/s      w/s  rkB/s     wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util 
> sdb        0.00  355.00  0.00  5225.00   0.00  29678.00     11.36     57.63  11.03     0.00    11.03   0.19  99.70 
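> 
> (the device lines above were captured with something like the
> following, run while fio is active:)
> 
> iostat -x 1 sdb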
> 
> 
> 
> ----- Original Message -----
> 
> From: "Alexandre DERUMIER" <aderumier at odiso.com> 
> To: "Dietmar Maurer" <dietmar at proxmox.com> 
> Cc: pve-devel at pve.proxmox.com 
> Sent: Thursday, 11 September 2014 06:22:21 
> Subject: Re: [pve-devel] ceph firefly in pve repos 
> 
> Hi, here are my first results (firefly, Crucial M550, 1 OSD per node, replication x1) 
> 
> fio config 
> ----------- 
> [global] 
> #logging 
> #write_iops_log=write_iops_log 
> #write_bw_log=write_bw_log 
> #write_lat_log=write_lat_log 
> ioengine=rbd 
> clientname=admin 
> pool=test 
> rbdname=test 
> invalidate=0 # mandatory 
> rw=randwrite 
> #rw=randread 
> bs=4k 
> direct=1 
> numjobs=4 
> group_reporting=1 
> 
> [rbd_iodepth32] 
> iodepth=32 
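> 
> (to run it, the pool/image from the job file must exist and fio must
> be built with the rbd engine; a sketch, image size and job-file name
> assumed:)
> 
> rbd -p test create test --size 2048
> fio rbd.fio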
> 
> 
> 
> fio write - 3 osd - 1 per node 
> -------------------------- 
> bw=15068KB/s, iops=3767 
> 
> 
> fio read - 3 osd - 1 per node 
> ----------------------------- 
> bw=130792KB/s, iops=32698 
> 
> 
> fio write : 1 osd only 
> ------------------------ 
> bw=5009.6KB/s, iops=1252 
> 
> fio read (osd) : 1 osd only 
> --------------------------- 
> bw=57016KB/s, iops=14254 
> 
> 
> Reads seem to come from the buffer cache; I don't know how to bypass it. 
> 
> 
> about writes: 
> 
> iostat shows me around 100% util 
> 
> Device:  rrqm/s  wrqm/s   r/s      w/s  rkB/s     wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util 
> sdb        0.00  355.00  0.00  5225.00   0.00  29678.00     11.36     57.63  11.03     0.00    11.03   0.19  99.70 
> 
> 
> 
> But fio on the local disk shows a lot more iops 
> 
> read 4k : fio --filename=/dev/sdb --direct=1 --rw=randread --bs=4k --iodepth=32 --group_reporting --invalidate=0 --name=abc --ioengine=aio 
> bw=294312KB/s, iops=73577 
> 
> write 4k : fio --filename=/dev/sdb --direct=1 --rw=randwrite --bs=4k --iodepth=32 --group_reporting --invalidate=0 --name=abc --ioengine=aio 
> bw=257806KB/s, iops=64451 
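> 
> (the dsync penalty from the dd test shows up with fio as well: the
> same randwrite command with O_SYNC requested, a sketch:)
> 
> fio --filename=/dev/sdb --direct=1 --sync=1 --rw=randwrite --bs=4k --iodepth=32 --group_reporting --invalidate=0 --name=abc --ioengine=aio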
> 
> 
> 
> 
> I'm going to try ceph 0.85 to see if there are improvements 
> 
> ----- Original Message -----
> 
> From: "Dietmar Maurer" <dietmar at proxmox.com> 
> To: "VELARTIS Philipp Dürhammer" <p.duerhammer at velartis.at>, "Alexandre DERUMIER (aderumier at odiso.com)" <aderumier at odiso.com> 
> Cc: pve-devel at pve.proxmox.com 
> Sent: Wednesday, 10 September 2014 13:22:46 
> Subject: RE: [pve-devel] ceph firefly in pve repos 
> 
>> I would like to know your 4k results (to see maximal iops performance), with the 
>> option -b 4096 (you used the standard 4M with 16 threads). With 4 MByte tests I 
>> think it should be easy to max out the network... 
> 
> OK, here are the results running the fio rbd example benchmark (4K test): 
> 
> # ./fio examples/rbd.fio 
> rbd_iodepth32: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=rbd, iodepth=32 
> fio-2.1.11-20-g9a44 
> Starting 1 process 
> rbd engine: RBD version: 0.1.8 
> Jobs: 1 (f=1): [w(1)] [99.7% done] [0KB/5790KB/0KB /s] [0/1447/0 iops] [eta 00m:01s] 
> rbd_iodepth32: (groupid=0, jobs=1): err= 0: pid=494227: Wed Sep 10 13:00:56 2014 
> write: io=2048.0MB, bw=5875.2KB/s, iops=1468, runt=356952msec 
> 
> So 1468 iops is not really very high. 
> 
> @Alexandre: what values do you get? 
> _______________________________________________ 
> pve-devel mailing list 
> pve-devel at pve.proxmox.com 
> http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel 
> 


