[pve-devel] [PATCH] ceph : use jemalloc for build

Alexandre DERUMIER aderumier at odiso.com
Tue May 2 18:07:49 CEST 2017


also check here:

https://www.msi.umn.edu/sites/default/files/MN_RH_BOFSC15.pdf

This is a benchmark from the Ceph developers comparing tcmalloc and jemalloc.
(An interesting result is that tcmalloc performance decreases over time.)


----- Original Message -----
From: "aderumier" <aderumier at odiso.com>
To: "Fabian Grünbichler" <f.gruenbichler at proxmox.com>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Tuesday, May 2, 2017 15:27:27
Subject: Re: [pve-devel] [PATCH] ceph : use jemalloc for build

>>I did some quick fio benchmarks on Friday, and did not see any real 
>>performance improvements (but the memory usage grows by ~50% for the OSD 
>>processes!). 

Yes, the memory increase is expected. (That's why it's not enabled by default; the Ceph developers didn't want to impact servers with a lot of disks.) 


>>could you share your benchmarks/test setup? 

I don't have my test cluster running at the moment, but here is what I used when I ran the tests: 


My test setup was 3 nodes (2x 10-core CPUs @ 3.1GHz each), with 6 OSDs (Intel SSD S3610) per node, so 18 OSDs in total. 

Without jemalloc I was around 400k IOPS 4k randread; with jemalloc, 600k IOPS. 
Without jemalloc I was around 150k IOPS 4k randwrite; with jemalloc, 200k IOPS. 

(Both with 10 fio jobs and queue depth 64.) 
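
For reference, a roughly equivalent fio invocation with the rbd engine would look something like this (the pool, image and client names here are made up, not from my actual run; swap in --rw=randwrite for the write test):

fio --name=bench --ioengine=rbd --clientname=admin \
    --pool=rbdbench --rbdname=testimg \
    --rw=randread --bs=4k --numjobs=10 --iodepth=64 \
    --runtime=60 --time_based --group_reporting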

During the test, CPU usage across the cluster was at 100%. 


Maybe you don't see a difference because you are not CPU-limited? 
But I think you should see a latency difference. (I don't have exact numbers.) 



Here is a benchmark presentation with jemalloc results: 
https://www.flashmemorysummit.com/English/Collaterals/Proceedings/2015/20150813_S303E_Zhang.pdf 



ceph.conf 
--------- 
[global] 
fsid = 49312468-47e2-47f3-87a2-ed8033f515a2 
mon_initial_members = ceph1-1, ceph1-2, ceph1-3 
mon_host = 10.5.0.34,10.5.0.35,10.5.0.36 
auth_cluster_required = none 
auth_service_required = none 
auth_client_required = none 

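# disable debug logging to reduce CPU overhead while benchmarking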
debug paxos = 0/0 
debug journal = 0/0 
debug mds_balancer = 0/0 
debug mds = 0/0 

debug lockdep = 0/0 
debug auth = 0/0 
debug mds_log = 0/0 
debug mon = 0/0 
debug perfcounter = 0/0 
debug monc = 0/0 
debug rbd = 0/0 
debug throttle = 0/0 
debug mds_migrator = 0/0 
debug client = 0/0 
debug rgw = 0/0 
debug finisher = 0/0 
debug journaler = 0/0 
debug ms = 0/0 
debug hadoop = 0/0 
debug mds_locker = 0/0 
debug tp = 0/0 
debug context = 0/0 
debug osd = 0/0 
debug bluestore = 0/0 
debug objclass = 0/0 
debug objecter = 0/0 

osd pool default size = 3 
osd_pool_default_min_size = 1 
osd_pool_default_pg_num = 1024 
osd_pool_default_pgp_num = 1024 


osd_mount_options_xfs = rw,noatime,inode64,logbsize=256k,delaylog 
osd_mkfs_type = xfs 
osd_mkfs_options_xfs = -f -i size=2048 
mon_pg_warn_max_per_osd = 10000 

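# queue, throttle and thread-count tuning for high-IOPS benchmarking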
filestore_queue_max_ops = 5000 
osd_client_message_size_cap = 0 
objecter_inflight_op_bytes = 1048576000 
ms_dispatch_throttle_bytes = 1048576000 

filestore_wbthrottle_enable = true 
filestore_fd_cache_shards = 64 
objecter_inflight_ops = 1024000 
filestore_queue_committing_max_bytes = 1048576000 
osd_op_num_threads_per_shard = 2 
filestore_queue_max_bytes = 10485760000 
osd_op_threads = 20 
osd_op_num_shards = 10 
filestore_max_sync_interval = 10 
filestore_op_threads = 16 
osd_pg_object_context_cache_count = 10240 
journal_queue_max_ops = 3000 
journal_queue_max_bytes = 10485760000 
journal_max_write_entries = 1000 
filestore_queue_committing_max_ops = 5000 
journal_max_write_bytes = 1048576000 
osd_enable_op_tracker = False 
filestore_fd_cache_size = 10240 
osd_client_message_cap = 0 

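With this many tuned options it's worth verifying that a running OSD actually picked them up, since misspelled keys are silently ignored. Something like this (osd.0 is just an example daemon id):

ceph daemon osd.0 config get objecter_inflight_op_bytes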


----- Original Message ----- 
From: "Fabian Grünbichler" <f.gruenbichler at proxmox.com> 
To: "aderumier" <aderumier at odiso.com> 
Cc: "pve-devel" <pve-devel at pve.proxmox.com> 
Sent: Tuesday, May 2, 2017 10:37:19 
Subject: Re: [pve-devel] [PATCH] ceph : use jemalloc for build 

On Fri, Apr 28, 2017 at 02:32:07PM +0200, Alexandre DERUMIER wrote: 
> >>this flag is only for rocksdb, and only for the windows build? 
> 
> I have tested it, and it's working fine. (Using the perf command, I could see jemalloc.) 

I should have been clearer here - it DOES work, but not because of that 
flag (the CMake build output even explicitly states that it will ignore 
it ;)). The CMake build scripts pick tcmalloc or jemalloc based on 
whether the tcmalloc or jemalloc headers are installed (in that order), 
and because of your added Build-Conflicts, the tcmalloc headers cannot be 
installed. 
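
A quick way to check which allocator a given build actually links against (the binary path may differ on your system):

ldd /usr/bin/ceph-osd | grep -E 'tcmalloc|jemalloc'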

ALLOCATOR explicitly sets the desired allocator, bypassing the 
autodetection. 
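
For illustration, the explicit selection could be wired into the packaging roughly like this (a sketch - "extraopts" follows the convention in Ceph's debian/rules, adjust to the actual file):

# debian/rules: force jemalloc instead of relying on header autodetection
extraopts += -DALLOCATOR=jemalloc

or passed directly when configuring a local build:

cmake -DALLOCATOR=jemalloc /path/to/ceph/source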

> 
> >>I propose "-DALLOCATOR=jemalloc" instead - what do you think? 
> 
> I think it should work, I will test it this weekend. 
> 

thanks for confirming that it works as expected. 

I did some quick fio benchmarks on Friday, and did not see any real 
performance improvements (but the memory usage grows by ~50% for the OSD 
processes!). Could you share your benchmarks/test setup? 

_______________________________________________ 
pve-devel mailing list 
pve-devel at pve.proxmox.com 
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel 



