PBS Client CPU Benchmark
Revision as of 12:22, 3 March 2023
Introduction
This article uses the benchmark included in the Proxmox Backup Server (PBS) client, together with a build of that client from source, as a rough (!) comparison of the performance and efficiency of different CPUs.
Note that modern CPUs are really complex, so any benchmark should be read with that in mind. We use the PBS client benchmark because it replicates a real workload almost 1:1, and we use the compilation of the client because running a modern compiler is one of the hardest stress tests that still represents an actual workload.
The comparison produces an efficiency score, which should be read as an "order of magnitude" score, i.e. on a logarithmic scale. A few points more or less may not be relevant and may simply be noise coming from the input and the architecture's composition. A difference of a factor of 2 to 5, or even higher, should however be seen as an actual, meaningful difference between platforms, where the lower-scoring one is clearly less suited for the workload when efficiency is a priority.
For raw performance, the "Sum MiB/s" column can be used as an estimate, again a rough one.
Comparison
PBS Client Benchmark
Vendor | CPU Model | Arch | Release Y/Q | mTDP W | W/Cores | SHA256 MiB/s | zstd l1 compr. MiB/s | zstd l1 decompr. MiB/s | Chunk Verify MiB/s | AES256GCM MiB/s | Sum MiB/s | Efficiency Score |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Intel | Xeon E5-2620 v3 | amd64 | 2014/3 | 85 | 7.08 | 407.06 | 406.54 | 890.03 | 280.57 | 1870.60 | 3447.74 | 486.97 |
Intel | i9-9900K | amd64 | 2018/4 | 95 | 5.94 | 612.84 | 694.92 | 1518.13 | 438.38 | 4099.33 | 7363.6 | 1239.66 |
Intel | i9-10900X | amd64 | 2019/4 | 165 | 16.5 | 570.78 | 661.73 | 953.15 | 357.14 | 2106.38 | 4649.18 | 281.76 |
Intel | i7-12700K | amd64 | 2021/4 | 190 | 9.5 | 2442.16 | 891.44 | 1237.12 | 820.55 | 3235.00 | 8626.27 | 908.03 |
Intel | i3-1220P | amd64 | 2022/1 | 64 | 5.3 | 2047.1 | 728.86 | 1007.77 | 673.44 | 2544.34 | 7001.51 | 1321.04 |
Intel | Celeron J4105 | amd64 | 2017/4 | 10 | 2.5 | 615.88 | 221.56 | 526.61 | 279.93 | 945.39 | 2589.37 | 1035.75 |
Intel | Core 2 Duo E8500 | amd64 | 2008/1 | 65 | 32.5 | 239.43 | 315.91 | 632.83 | 175.24 | 121.06 | 1484.47 | 45.68 |
AMD | Ryzen 9 7900X | amd64 | 2022/3 | 230 | 9.58 | 2807.0 | 1173.26 | 1758.63 | 1081.55 | 3630.68 | 10451.12 | 1090.93 |
AMD | Ryzen 7 5800X | amd64 | 2020/4 | 142 | 8.87 | 2374.35 | 952.37 | 1893.54 | 930.41 | 4640.62 | 10791.29 | 1215.92 |
AMD | Ryzen 7 3700X | amd64 | 2019/3 | 95 | 5.93 | 2078.08 | 683.08 | 1384.11 | 836.79 | 3705.00 | 8687.06 | 1463.08 |
AMD | EPYC 7302P | amd64 | 2019/3 | 155 | 4.84 | 1624.23 | 539.71 | 1106.47 | 656.09 | 2978.71 | 6905.21 | 1426.70 |
BCM | 2711B0 Cortex-A72 | arm64 | 2015/2 | 10 | 2.5 | 142.41 | 123.64 | 300.87 | 95.29 | 23.46 | 685.67 | 274.27 |
The efficiency score is calculated by dividing the sum of all benchmark results (unit MiB/(s*core)) by the W/Cores metric. Since a watt is one joule per second (J/s), the score has the unit MiB/Joule, and thus only makes sense when observed over time.
Note that W/Cores was chosen because comparing by TDP alone leads to conclusions that are wrong in the real world: the benchmark is mostly single-core, so multicore systems would look worse than they are in practice (where multiple backup jobs, verifications, GCs, ... can run at the same time).
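As a rough illustration of the calculation described above, the following Python sketch recomputes the Celeron J4105 row of the table. The function names are made up for this example, and the core count of 4 is an assumption inferred from the table's 10 W mTDP and 2.5 W/Cores, not a value stated in this article.

```python
def watts_per_core(tdp_watts: float, cores: int) -> float:
    """W/Cores metric: the TDP spread evenly over the cores."""
    return tdp_watts / cores


def efficiency_score(sum_mib_per_s: float, w_per_core: float) -> float:
    """Sum of the benchmark results divided by W/Cores (MiB/Joule)."""
    return sum_mib_per_s / w_per_core


# Benchmark results for the Celeron J4105, taken from the table (MiB/s).
celeron_results = {
    "sha256": 615.88,
    "zstd_l1_compress": 221.56,
    "zstd_l1_decompress": 526.61,
    "chunk_verify": 279.93,
    "aes256gcm": 945.39,
}

total = sum(celeron_results.values())
# Assumption: 4 cores, inferred from 10 W / 2.5 W per core.
w_per_core = watts_per_core(10, 4)
score = efficiency_score(total, w_per_core)

print(f"Sum MiB/s:        {total:.2f}")
print(f"W/Cores:          {w_per_core:.2f}")
print(f"Efficiency score: {score:.2f}")
```

For this row the sketch reproduces the table's values: a sum of 2589.37 MiB/s and an efficiency score of 1035.75.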
PBS Client Build
Vendor | CPU Model | Arch | Release Y/Q | Instructions # | Clock Cycles # | Avg. Instr./Cycle | Total Time s |
---|---|---|---|---|---|---|---|
Intel | Xeon E5-2620 v3 | amd64 | 2014/3 | 4,618,619,075,708 | 6,384,841,530,256 | 0.72 | 277.179347490 |
Intel | i9-9900K (in KVM) | amd64 | 2018/4 | 4,448,729,708,491 | 6,838,600,240,070 | 0.65 | 174.346811364 |
AMD | Ryzen 7 5800X | amd64 | 2020/4 | 4,384,063,783,832 | 5,229,554,881,150 | 0.84 | 116.030434890 |
BCM | 2711B0 Cortex-A72 | arm64 | 2015/2 | 5,002,859,357,663 | 10,670,903,671,410 | 0.47 | 1920.615051800 |
The fairly modern Ryzen 7 5800X completes the build roughly 16.5 times as fast as the Raspberry Pi 4's Cortex-A72 CPU (1920.6 s versus 116.0 s, i.e. about 1555 % faster).
For the PBS build, more cores help at first, but the final linking step (which makes up a significant part of the total build time) runs in a single process. So one needs all three to "win" here: a high core count, a high clock rate and a high instructions-per-cycle rate.
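The derived figures in the build table can be recomputed from the raw counters. The sketch below is only an illustration: the helper names are hypothetical, and the numbers are copied from the table above.

```python
# (instructions, clock cycles, total build time in seconds) per CPU,
# copied from the "PBS Client Build" table.
builds = {
    "Xeon E5-2620 v3": (4_618_619_075_708, 6_384_841_530_256, 277.179347490),
    "i9-9900K (in KVM)": (4_448_729_708_491, 6_838_600_240_070, 174.346811364),
    "Ryzen 7 5800X": (4_384_063_783_832, 5_229_554_881_150, 116.030434890),
    "2711B0 Cortex-A72": (5_002_859_357_663, 10_670_903_671_410, 1920.615051800),
}


def instructions_per_cycle(instructions: int, cycles: int) -> float:
    """Average number of instructions retired per clock cycle."""
    return instructions / cycles


def speedup(slow_seconds: float, fast_seconds: float) -> float:
    """How many times faster the quicker build finished."""
    return slow_seconds / fast_seconds


for name, (instructions, cycles, seconds) in builds.items():
    ipc = instructions_per_cycle(instructions, cycles)
    print(f"{name:<20} IPC {ipc:.2f}  build time {seconds:8.1f} s")

factor = speedup(builds["2711B0 Cortex-A72"][2], builds["Ryzen 7 5800X"][2])
print(f"Ryzen 7 5800X vs. Cortex-A72: {factor:.2f}x "
      f"({(factor - 1) * 100:.0f} % faster)")
```

Note that "N times as fast" corresponds to "(N - 1) * 100 % faster", which is why the percentage is lower than the factor multiplied by 100.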
Data
Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
Update date | 2021-03-05 |
PBS Version | 1.0.8 |
Linux Kernel | 5.4 |
Distro Version | Proxmox VE 6.3 |
PBS Client Benchmark
SHA256 speed: 407.06 MB/s
Compression speed: 406.54 MB/s
Decompress speed: 890.03 MB/s
AES256/GCM speed: 1870.60 MB/s
Verify speed: 280.57 MB/s
┌───────────────────────────────────┬────────────────────┐
│ Name                              │ Value              │
╞═══════════════════════════════════╪════════════════════╡
│ TLS (maximal backup upload speed) │ not tested         │
├───────────────────────────────────┼────────────────────┤
│ SHA256 checksum computation speed │ 407.06 MB/s (20%)  │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 compression speed    │ 406.54 MB/s (54%)  │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 decompression speed  │ 890.03 MB/s (74%)  │
├───────────────────────────────────┼────────────────────┤
│ Chunk verification speed          │ 280.57 MB/s (37%)  │
├───────────────────────────────────┼────────────────────┤
│ AES256 GCM encryption speed       │ 1870.60 MB/s (51%) │
└───────────────────────────────────┴────────────────────┘
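To transfer raw output like the above into the comparison table, the plain "... speed:" summary lines can be parsed mechanically. The following is a minimal sketch, assuming the output format shown in this article. The function name and the regular expression are illustrative and not part of the PBS client.

```python
import re

# Matches the plain summary lines such as "SHA256 speed: 407.06 MB/s".
# Rows of the boxed table are skipped, as they have no colon after "speed".
SPEED_LINE = re.compile(r"([\w/]+) speed:\s+([\d.]+) MB/s")


def parse_benchmark(output: str) -> dict[str, float]:
    """Map each benchmark name to its reported speed."""
    return {name: float(value) for name, value in SPEED_LINE.findall(output)}


# Summary lines of the Xeon E5-2620 v3 run from this article.
sample = (
    "SHA256 speed: 407.06 MB/s\n"
    "Compression speed: 406.54 MB/s\n"
    "Decompress speed: 890.03 MB/s\n"
    "AES256/GCM speed: 1870.60 MB/s\n"
    "Verify speed: 280.57 MB/s\n"
)

speeds = parse_benchmark(sample)
for name, value in speeds.items():
    print(f"{name:<12} {value:>8.2f}")
```

Because the pattern does not depend on line breaks, it also works when the output has been pasted as a single line.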
PBS Client Build
Performance counter stats for 'cargo build --release --bin proxmox-backup-client --bin pxar --bin dump-catalog-shell-cli':
     2,445,836.61 msec task-clock       #   8.824 CPUs utilized
          515,060      context-switches #   0.211 K/sec
           11,447      cpu-migrations   #   0.005 K/sec
        6,545,302      page-faults      #   0.003 M/sec
6,384,841,530,256      cycles           #   2.610 GHz
4,618,619,075,708      instructions     #    0.72 insn per cycle
  978,867,153,867      branches         # 400.218 M/sec
   39,652,439,215      branch-misses    #   4.05% of all branches

    277.179347490 seconds time elapsed

   2406.400671000 seconds user
     40.862553000 seconds sys
Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz
Update date | 2021-03-05 |
PBS Version | 1.0.8 |
Linux Kernel | 5.4 |
Distro Version | Proxmox VE 6.3 |
PBS Client Build
Performance counter stats for 'cargo build --release --bin proxmox-backup-client --bin pxar --bin dump-catalog-shell-cli':
     1,695,478.27 msec task-clock       #   9.725 CPUs utilized
          197,471      context-switches #   0.116 K/sec
           18,272      cpu-migrations   #   0.011 K/sec
        6,382,947      page-faults      #   0.004 M/sec
6,838,600,240,070      cycles           #   4.033 GHz
4,448,729,708,491      instructions     #    0.65 insn per cycle
  943,595,955,517      branches         # 556.537 M/sec
   35,189,700,299      branch-misses    #   3.73% of all branches

    174.346811364 seconds time elapsed

   1662.116788000 seconds user
     39.862990000 seconds sys
PBS Client Benchmark
SHA256 speed: 612.84 MB/s
Compression speed: 694.92 MB/s
Decompress speed: 1518.13 MB/s
AES256/GCM speed: 4099.33 MB/s
Verify speed: 438.38 MB/s
┌───────────────────────────────────┬─────────────────────┐
│ Name                              │ Value               │
╞═══════════════════════════════════╪═════════════════════╡
│ TLS (maximal backup upload speed) │ not tested          │
├───────────────────────────────────┼─────────────────────┤
│ SHA256 checksum computation speed │ 612.84 MB/s (30%)   │
├───────────────────────────────────┼─────────────────────┤
│ ZStd level 1 compression speed    │ 694.92 MB/s (92%)   │
├───────────────────────────────────┼─────────────────────┤
│ ZStd level 1 decompression speed  │ 1518.13 MB/s (127%) │
├───────────────────────────────────┼─────────────────────┤
│ Chunk verification speed          │ 438.38 MB/s (58%)   │
├───────────────────────────────────┼─────────────────────┤
│ AES256 GCM encryption speed       │ 4099.33 MB/s (112%) │
└───────────────────────────────────┴─────────────────────┘
Intel(R) Celeron(TM) J4105 CPU @ 1.50GHz
Update date | 2021-03-05 |
PBS Version | 1.0.8 |
Linux Kernel | 5.4 |
Distro Version | Proxmox VE 6.3 |
PBS Client Benchmark
SHA256 speed: 615.88 MB/s
Compression speed: 221.56 MB/s
Decompress speed: 526.61 MB/s
AES256/GCM speed: 945.39 MB/s
Verify speed: 279.93 MB/s
┌───────────────────────────────────┬───────────────────┐
│ Name                              │ Value             │
╞═══════════════════════════════════╪═══════════════════╡
│ TLS (maximal backup upload speed) │ not tested        │
├───────────────────────────────────┼───────────────────┤
│ SHA256 checksum computation speed │ 615.88 MB/s (30%) │
├───────────────────────────────────┼───────────────────┤
│ ZStd level 1 compression speed    │ 221.56 MB/s (29%) │
├───────────────────────────────────┼───────────────────┤
│ ZStd level 1 decompression speed  │ 526.61 MB/s (44%) │
├───────────────────────────────────┼───────────────────┤
│ Chunk verification speed          │ 279.93 MB/s (37%) │
├───────────────────────────────────┼───────────────────┤
│ AES256 GCM encryption speed       │ 945.39 MB/s (26%) │
└───────────────────────────────────┴───────────────────┘
Intel(R) Core(TM)2 Duo CPU E8500 @ 3.16GHz
Update date | 2021-03-23 |
PBS Version | 1.0.11 |
Linux Kernel | 5.4 |
Distro Version | Proxmox VE 6.3 |
PBS Client Benchmark
SHA256 speed: 239.43 MB/s
Compression speed: 315.91 MB/s
Decompress speed: 632.83 MB/s
AES256/GCM speed: 121.06 MB/s
Verify speed: 175.24 MB/s
┌───────────────────────────────────┬───────────────────┐
│ Name                              │ Value             │
╞═══════════════════════════════════╪═══════════════════╡
│ TLS (maximal backup upload speed) │ not tested        │
├───────────────────────────────────┼───────────────────┤
│ SHA256 checksum computation speed │ 239.43 MB/s (12%) │
├───────────────────────────────────┼───────────────────┤
│ ZStd level 1 compression speed    │ 315.91 MB/s (42%) │
├───────────────────────────────────┼───────────────────┤
│ ZStd level 1 decompression speed  │ 632.83 MB/s (53%) │
├───────────────────────────────────┼───────────────────┤
│ Chunk verification speed          │ 175.24 MB/s (23%) │
├───────────────────────────────────┼───────────────────┤
│ AES256 GCM encryption speed       │ 121.06 MB/s (3%)  │
└───────────────────────────────────┴───────────────────┘
AMD Ryzen 7 5800X 8-Core Processor
Update date | 2021-03-05 |
PBS Version | 1.0.8 |
Linux Kernel | 5.10 |
Distro Version | Debian Sid (2020-03) |
PBS Client Build
Performance counter stats for 'cargo build --release --bin proxmox-backup-client --bin pxar --bin dump-catalog-shell-cli':
     1,160,289.16 msec task-clock              #   10.000 CPUs utilized
          166,631      context-switches        #    0.144 K/sec
           10,292      cpu-migrations          #    0.009 K/sec
        6,427,491      page-faults             #    0.006 M/sec
5,229,554,881,150      cycles                  #    4.507 GHz  (2.18%)
   13,378,977,812      stalled-cycles-frontend #    0.26% frontend cycles idle  (2.19%)
  187,718,252,978      stalled-cycles-backend  #    3.59% backend cycles idle  (2.18%)
4,384,063,783,832      instructions            #     0.84 insn per cycle
                                               #     0.04 stalled cycles per insn  (2.18%)
  882,521,754,530      branches                #  760.605 M/sec  (2.18%)
   37,025,573,266      branch-misses           #    4.20% of all branches  (2.16%)
1,720,800,625,889      L1-dcache-loads         # 1483.079 M/sec  (2.17%)
   92,596,728,614      L1-dcache-load-misses   #    5.38% of all L1-dcache accesses  (2.19%)
  <not supported>      LLC-loads
  <not supported>      LLC-load-misses

    116.030434890 seconds time elapsed

   1141.607730000 seconds user
PBS Client Benchmark
SHA256 speed: 2374.35 MB/s
Compression speed: 952.37 MB/s
Decompress speed: 1893.54 MB/s
AES256/GCM speed: 4640.62 MB/s
Verify speed: 930.41 MB/s
AMD Ryzen 7 3700X 8-Core Processor
Update date | 2021-03-05 |
PBS Version | 1.0.8 |
Linux Kernel | 5.4 |
Distro Version | Proxmox VE 6.3 |
PBS Client Benchmark
proxmox-backup-client benchmark
SHA256 speed: 2078.08 MB/s
Compression speed: 683.08 MB/s
Decompress speed: 1384.11 MB/s
AES256/GCM speed: 3705.00 MB/s
Verify speed: 836.79 MB/s
┌───────────────────────────────────┬─────────────────────┐
│ Name                              │ Value               │
╞═══════════════════════════════════╪═════════════════════╡
│ TLS (maximal backup upload speed) │ not tested          │
├───────────────────────────────────┼─────────────────────┤
│ SHA256 checksum computation speed │ 2078.08 MB/s (103%) │
├───────────────────────────────────┼─────────────────────┤
│ ZStd level 1 compression speed    │ 683.08 MB/s (91%)   │
├───────────────────────────────────┼─────────────────────┤
│ ZStd level 1 decompression speed  │ 1384.11 MB/s (116%) │
├───────────────────────────────────┼─────────────────────┤
│ Chunk verification speed          │ 836.79 MB/s (110%)  │
├───────────────────────────────────┼─────────────────────┤
│ AES256 GCM encryption speed       │ 3705.00 MB/s (102%) │
└───────────────────────────────────┴─────────────────────┘
AMD EPYC 7302P 16-Core Processor
Update date | 2021-03-05 |
PBS Version | 1.0.8 |
Linux Kernel | 5.4 |
Distro Version | Proxmox VE 6.3 |
PBS Client Benchmark
SHA256 speed: 1624.23 MB/s
Compression speed: 539.71 MB/s
Decompress speed: 1106.47 MB/s
AES256/GCM speed: 2978.71 MB/s
Verify speed: 656.09 MB/s
┌───────────────────────────────────┬────────────────────┐
│ Name                              │ Value              │
╞═══════════════════════════════════╪════════════════════╡
│ TLS (maximal backup upload speed) │ not tested         │
├───────────────────────────────────┼────────────────────┤
│ SHA256 checksum computation speed │ 1624.23 MB/s (80%) │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 compression speed    │ 539.71 MB/s (72%)  │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 decompression speed  │ 1106.47 MB/s (92%) │
├───────────────────────────────────┼────────────────────┤
│ Chunk verification speed          │ 656.09 MB/s (87%)  │
├───────────────────────────────────┼────────────────────┤
│ AES256 GCM encryption speed       │ 2978.71 MB/s (82%) │
└───────────────────────────────────┴────────────────────┘
ARM64 Cortex-A72 - Raspberry Pi 4
Update date | 2021-03-05 |
PBS Version | 1.0.8 |
Linux Kernel | 5.8 |
Distro Version | Ubuntu Server 20.10 |
PBS Client Build
Performance counter stats for 'cargo build --release --bin proxmox-backup-client --bin pxar --bin dump-catalog-shell-cli':
      7,116,962.15 msec task-clock       # 3.706 CPUs utilized
           349,207      context-switches # 0.049 K/sec
             5,272      cpu-migrations   # 0.001 K/sec
         6,070,961      page-faults      # 0.853 K/sec
10,670,903,671,410      cycles           # 1.499 GHz
 5,002,859,357,663      instructions     #  0.47 insn per cycle
   <not supported>      branches
    56,010,658,667      branch-misses

    1920.615051800 seconds time elapsed

    7000.438300000 seconds user
     112.304292000 seconds sys
PBS Client Benchmark
SHA256 speed: 142.41 MB/s
Compression speed: 123.64 MB/s
Decompress speed: 300.87 MB/s
AES256/GCM speed: 23.46 MB/s
Verify speed: 95.29 MB/s
┌───────────────────────────────────┬───────────────────┐
│ Name                              │ Value             │
╞═══════════════════════════════════╪═══════════════════╡
│ TLS (maximal backup upload speed) │ not tested        │
├───────────────────────────────────┼───────────────────┤
│ SHA256 checksum computation speed │ 142.41 MB/s (7%)  │
├───────────────────────────────────┼───────────────────┤
│ ZStd level 1 compression speed    │ 123.64 MB/s (16%) │
├───────────────────────────────────┼───────────────────┤
│ ZStd level 1 decompression speed  │ 300.87 MB/s (25%) │
├───────────────────────────────────┼───────────────────┤
│ Chunk verification speed          │ 95.29 MB/s (13%)  │
├───────────────────────────────────┼───────────────────┤
│ AES256 GCM encryption speed       │ 23.46 MB/s (1%)   │
└───────────────────────────────────┴───────────────────┘