PBS Client CPU Benchmark

From Proxmox VE
Revision as of 06:52, 26 July 2021 by Thomas Lamprecht

Introduction

This article uses the benchmark included in the Proxmox Backup Server client, together with a build of that client from source, as a rough (!) comparison of the performance and efficiency of different CPUs.

Note that modern CPUs are really complex, so a benchmark should always be seen as just that: a benchmark, not a complete picture. We use the PBS client benchmark because it replicates a real workload close to 1:1, and we use the compilation of the client because running a modern compiler is one of the hardest stress tests that still represents an actual workload.

The comparison produces an efficiency score, which should be read as an "order of magnitude" score, i.e. on a logarithmic scale. A few points more or less may not be relevant and may simply be noise coming from the input and the architecture composition. A difference of a factor of 2 to 5, or even higher, should however be seen as an actual, meaningful difference between the platforms, where the lower-scoring one is clearly less suited for the workload when efficiency is a concern.
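As a minimal sketch of how two scores might be compared in this "order of magnitude" sense, the snippet below looks at the ratio of two efficiency scores instead of their difference. The function name `compare_scores` and the default threshold of 2.0 (the lower bound of the "factor 2 to 5" range mentioned above) are assumptions made for illustration, not part of any Proxmox tooling.

```python
import math


def compare_scores(score_a, score_b, threshold=2.0):
    """Return (ratio, meaningful) for two efficiency scores.

    ratio is always >= 1 (larger score divided by smaller score);
    meaningful is True once the ratio reaches the assumed threshold.
    """
    high, low = max(score_a, score_b), min(score_a, score_b)
    ratio = high / low
    return ratio, ratio >= threshold


# Scores taken from the comparison table in this article.
for name, a, b in [
    ("Ryzen 7 5800X vs. Ryzen 7 3700X", 1644.39, 2139.67),
    ("Ryzen 7 3700X vs. Core 2 Duo E8500", 2139.67, 45.68),
]:
    ratio, meaningful = compare_scores(a, b)
    print(f"{name}: factor {ratio:.2f}, log10 {math.log10(ratio):.2f}, "
          f"meaningful: {meaningful}")
```

With this reading, the two Ryzen CPUs end up in the same efficiency class, while the Core 2 Duo is clearly in a different one.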

For raw performance, the "Sum MiB/s" column can be used as an estimate, although again only a rough one.

Comparison

PBS Client Benchmark

| Vendor | CPU Model | Arch | Release Y/Q | TDP W | W/Cores | SHA256 MiB/s | zstd l1 compr. MiB/s | zstd l1 decompr. MiB/s | Chunk Verify MiB/s | AES256GCM MiB/s | Sum MiB/s | Efficiency Score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Intel | Xeon E5-2620 v3 | amd64 | 2014/3 | 85 | 7.08 | 407.06 | 406.54 | 890.03 | 280.57 | 1870.60 | 3447.74 | 486.97 |
| Intel | i9-9900K | amd64 | 2018/4 | 95 | 5.94 | 612.84 | 694.92 | 1518.13 | 438.38 | 4099.33 | 7363.6 | 1239.66 |
| Intel | Celeron J4105 | amd64 | 2017/4 | 10 | 2.5 | 615.88 | 221.56 | 526.61 | 279.93 | 945.39 | 2589.37 | 1035.75 |
| Intel | Core 2 Duo E8500 | amd64 | 2008/1 | 65 | 32.5 | 239.43 | 315.91 | 632.83 | 175.24 | 121.06 | 1484.47 | 45.68 |
| AMD | Ryzen 7 5800X | amd64 | 2020/4 | 105 | 6.56 | 2374.35 | 952.37 | 1893.54 | 930.41 | 4640.62 | 10791.29 | 1644.39 |
| AMD | Ryzen 7 3700X | amd64 | 2019/3 | 65 | 4.06 | 2078.08 | 683.08 | 1384.11 | 836.79 | 3705.00 | 8687.06 | 2139.67 |
| AMD | EPYC 7302P | amd64 | 2019/3 | 155 | 4.84 | 1624.23 | 539.71 | 1106.47 | 656.09 | 2978.71 | 6905.21 | 1426.70 |
| BCM | 2711B0 Cortex-A72 | arm64 | 2015/2 | 10 | 2.5 | 142.41 | 123.64 | 300.87 | 95.29 | 23.46 | 685.67 | 274.27 |

The efficiency score is calculated by dividing the sum of all benchmark results (unit MiB/(s*core)) by the W/Cores metric. As a Watt is J/s, the score has the unit MiB/Joule and thus only makes sense when observed over time.
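The calculation described above can be sketched in a few lines. This is only an illustration: the function name `efficiency_score` and its parameters are hypothetical, and the input values are the Celeron J4105 row from the table (10 W TDP, 4 cores).

```python
def efficiency_score(results_mib_s, tdp_watt, cores):
    """Return (sum of results in MiB/s, efficiency score in MiB/J).

    The score is the summed throughput divided by the W/Cores metric.
    """
    total = sum(results_mib_s)
    watt_per_core = tdp_watt / cores
    return total, total / watt_per_core


# Celeron J4105 row: SHA256, zstd compr., zstd decompr., verify, AES256GCM
celeron_j4105 = [615.88, 221.56, 526.61, 279.93, 945.39]
total, score = efficiency_score(celeron_j4105, tdp_watt=10, cores=4)
print(f"Sum: {total:.2f} MiB/s, efficiency score: {score:.2f}")
```

For this row, the result matches the "Sum MiB/s" and "Efficiency Score" columns of the table (2589.37 and 1035.75).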

Note that Watt/Cores was chosen because comparing TDP alone leads to conclusions that are wrong in the real world: the benchmark is mostly single-core, so multicore systems would look worse than they are in practice (where multiple backup jobs, verifications, GCs, ... can run at the same time).

PBS Client Build

| Vendor | CPU Model | Arch | Release Y/Q | Instructions # | Clock Cycles # | Avg. Instr./Cycle | Total Time s |
|---|---|---|---|---|---|---|---|
| Intel | Xeon E5-2620 v3 | amd64 | 2014/3 | 4,618,619,075,708 | 6,384,841,530,256 | 0.72 | 277.179347490 |
| Intel | i9-9900K (in KVM) | amd64 | 2018/4 | 4,448,729,708,491 | 6,838,600,240,070 | 0.65 | 174.346811364 |
| AMD | Ryzen 7 5800X | amd64 | 2020/4 | 4,384,063,783,832 | 5,229,554,881,150 | 0.84 | 116.030434890 |
| BCM | 2711B0 Cortex-A72 | arm64 | 2015/2 | 5,002,859,357,663 | 10,670,903,671,410 | 0.47 | 1920.615051800 |

The quite modern Ryzen 7 5800X beats the Raspberry Pi 4's Cortex-A72 CPU by a factor of about 16.55 in total build time (1920.6 s vs. 116.0 s), i.e. it is roughly 1555 % faster.
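The derived numbers in the build comparison can be reproduced from the raw counters. The following is a small sketch using the values from the table above; the helper names `instructions_per_cycle` and `speedup` are chosen here for illustration only.

```python
def instructions_per_cycle(instructions, cycles):
    """Average number of instructions retired per clock cycle."""
    return instructions / cycles


def speedup(slow_seconds, fast_seconds):
    """How many times faster the fast system finished the build."""
    return slow_seconds / fast_seconds


# Ryzen 7 5800X counters and both total build times from the table.
ipc_5800x = instructions_per_cycle(4_384_063_783_832, 5_229_554_881_150)
factor = speedup(1920.615051800, 116.030434890)

print(f"Ryzen 7 5800X: {ipc_5800x:.2f} instructions per cycle")
print(f"Ryzen 7 5800X vs. Cortex-A72: factor {factor:.2f}")
```

Note that a factor of N corresponds to being (N - 1) * 100 % faster, so the factor and the percentage should not be used interchangeably.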

For the PBS build, more cores help at first, but the linking at the end (which makes up a significant part of the total build time) is done in a single process. So one needs all three to "win" here: a high core count, a high clock rate and a high instructions-per-cycle rate.

Data

Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz

Update date 2021-03-05
PBS Version 1.0.8
Linux Kernel 5.4
Distro Version Proxmox VE 6.3

PBS Client Benchmark

SHA256 speed: 407.06 MB/s
Compression speed: 406.54 MB/s
Decompress speed: 890.03 MB/s
AES256/GCM speed: 1870.60 MB/s
Verify speed: 280.57 MB/s
┌───────────────────────────────────┬────────────────────┐
│ Name                              │ Value              │
╞═══════════════════════════════════╪════════════════════╡
│ TLS (maximal backup upload speed) │ not tested         │
├───────────────────────────────────┼────────────────────┤
│ SHA256 checksum computation speed │ 407.06 MB/s (20%)  │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 compression speed    │ 406.54 MB/s (54%)  │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 decompression speed  │ 890.03 MB/s (74%)  │
├───────────────────────────────────┼────────────────────┤
│ Chunk verification speed          │ 280.57 MB/s (37%)  │
├───────────────────────────────────┼────────────────────┤
│ AES256 GCM encryption speed       │ 1870.60 MB/s (51%) │
└───────────────────────────────────┴────────────────────┘
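To collect results like the ones above into a comparison table, the plain "... speed: ... MB/s" lines printed by the benchmark can be parsed with a short script. This is a sketch under the assumption that the output has exactly the line format shown in this article; the function name `parse_benchmark` is hypothetical, and the box-drawing table rows are deliberately ignored.

```python
import re

# Matches lines such as "SHA256 speed: 407.06 MB/s".
SPEED_LINE = re.compile(r"^(?P<name>.+?) speed:\s+(?P<value>[\d.]+) MB/s$")


def parse_benchmark(output):
    """Map each benchmark name to its reported speed as a float."""
    speeds = {}
    for line in output.splitlines():
        match = SPEED_LINE.match(line.strip())
        if match:
            speeds[match.group("name")] = float(match.group("value"))
    return speeds


sample = """\
SHA256 speed: 407.06 MB/s
Compression speed: 406.54 MB/s
Decompress speed: 890.03 MB/s
AES256/GCM speed: 1870.60 MB/s
Verify speed: 280.57 MB/s
│ SHA256 checksum computation speed │ 407.06 MB/s (20%)  │
"""

speeds = parse_benchmark(sample)
for name, value in speeds.items():
    print(f"{name:12s} {value:9.2f}")
```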

PBS Client Build

Performance counter stats for 'cargo build --release --bin proxmox-backup-client --bin pxar --bin dump-catalog-shell-cli':
     2,445,836.61 msec task-clock                #    8.824 CPUs utilized          
          515,060      context-switches          #    0.211 K/sec                  
           11,447      cpu-migrations            #    0.005 K/sec                  
        6,545,302      page-faults               #    0.003 M/sec                  
6,384,841,530,256      cycles                    #    2.610 GHz                    
4,618,619,075,708      instructions              #    0.72  insn per cycle         
  978,867,153,867      branches                  #  400.218 M/sec                  
   39,652,439,215      branch-misses             #    4.05% of all branches        

    277.179347490 seconds time elapsed

   2406.400671000 seconds user
     40.862553000 seconds sys
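The "insn per cycle" and "CPUs utilized" figures reported by perf can be cross-checked from the raw counters. The sketch below parses a shortened copy of the output above; it assumes the column layout shown in this article, and the names `parse_perf_stat`, `COUNTER` and `ELAPSED` are illustrative only.

```python
import re

# "<value> [msec] <event-name>", followed by a "#" comment or end of line.
COUNTER = re.compile(r"^\s*([\d,.]+)\s+(?:msec\s+)?([a-zA-Z][\w-]*)\s*(?:#|$)")
ELAPSED = re.compile(r"^\s*([\d.]+)\s+seconds time elapsed")


def parse_perf_stat(output):
    """Return a dict of counter name -> value, plus 'elapsed' in seconds."""
    stats = {}
    for line in output.splitlines():
        elapsed = ELAPSED.match(line)
        if elapsed:
            stats["elapsed"] = float(elapsed.group(1))
            continue
        counter = COUNTER.match(line)
        if counter:
            stats[counter.group(2)] = float(counter.group(1).replace(",", ""))
    return stats


sample = """\
     2,445,836.61 msec task-clock                #    8.824 CPUs utilized
6,384,841,530,256      cycles                    #    2.610 GHz
4,618,619,075,708      instructions              #    0.72  insn per cycle

    277.179347490 seconds time elapsed

   2406.400671000 seconds user
"""

stats = parse_perf_stat(sample)
ipc = stats["instructions"] / stats["cycles"]
# task-clock is reported in milliseconds, elapsed time in seconds.
cpus_utilized = stats["task-clock"] / 1000 / stats["elapsed"]
print(f"IPC: {ipc:.2f}, CPUs utilized: {cpus_utilized:.3f}")
```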

Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz

Update date 2021-03-05
PBS Version 1.0.8
Linux Kernel 5.4
Distro Version Proxmox VE 6.3

PBS Client Build

Performance counter stats for 'cargo build --release --bin proxmox-backup-client --bin pxar --bin dump-catalog-shell-cli':
     1,695,478.27 msec task-clock                #    9.725 CPUs utilized
          197,471      context-switches          #    0.116 K/sec
           18,272      cpu-migrations            #    0.011 K/sec
        6,382,947      page-faults               #    0.004 M/sec
6,838,600,240,070      cycles                    #    4.033 GHz
4,448,729,708,491      instructions              #    0.65  insn per cycle
  943,595,955,517      branches                  #  556.537 M/sec
   35,189,700,299      branch-misses             #    3.73% of all branches

    174.346811364 seconds time elapsed

   1662.116788000 seconds user
     39.862990000 seconds sys

PBS Client Benchmark

SHA256 speed: 612.84 MB/s
Compression speed: 694.92 MB/s
Decompress speed: 1518.13 MB/s
AES256/GCM speed: 4099.33 MB/s
Verify speed: 438.38 MB/s
┌───────────────────────────────────┬─────────────────────┐
│ Name                              │ Value               │
╞═══════════════════════════════════╪═════════════════════╡
│ TLS (maximal backup upload speed) │ not tested          │
├───────────────────────────────────┼─────────────────────┤
│ SHA256 checksum computation speed │ 612.84 MB/s (30%)   │
├───────────────────────────────────┼─────────────────────┤
│ ZStd level 1 compression speed    │ 694.92 MB/s (92%)   │
├───────────────────────────────────┼─────────────────────┤
│ ZStd level 1 decompression speed  │ 1518.13 MB/s (127%) │
├───────────────────────────────────┼─────────────────────┤
│ Chunk verification speed          │ 438.38 MB/s (58%)   │
├───────────────────────────────────┼─────────────────────┤
│ AES256 GCM encryption speed       │ 4099.33 MB/s (112%) │
└───────────────────────────────────┴─────────────────────┘

Intel(R) Celeron(TM) J4105 CPU @ 1.50GHz

Update date 2021-03-05
PBS Version 1.0.8
Linux Kernel 5.4
Distro Version Proxmox VE 6.3

PBS Client Benchmark

SHA256 speed: 615.88 MB/s
Compression speed: 221.56 MB/s
Decompress speed: 526.61 MB/s
AES256/GCM speed: 945.39 MB/s
Verify speed: 279.93 MB/s
┌───────────────────────────────────┬───────────────────┐
│ Name                              │ Value             │
╞═══════════════════════════════════╪═══════════════════╡
│ TLS (maximal backup upload speed) │ not tested        │
├───────────────────────────────────┼───────────────────┤
│ SHA256 checksum computation speed │ 615.88 MB/s (30%) │
├───────────────────────────────────┼───────────────────┤
│ ZStd level 1 compression speed    │ 221.56 MB/s (29%) │
├───────────────────────────────────┼───────────────────┤
│ ZStd level 1 decompression speed  │ 526.61 MB/s (44%) │
├───────────────────────────────────┼───────────────────┤
│ Chunk verification speed          │ 279.93 MB/s (37%) │
├───────────────────────────────────┼───────────────────┤
│ AES256 GCM encryption speed       │ 945.39 MB/s (26%) │
└───────────────────────────────────┴───────────────────┘

Intel(R) Core(TM)2 Duo CPU E8500 @ 3.16GHz

Update date 2021-03-23
PBS Version 1.0.11
Linux Kernel 5.4
Distro Version Proxmox VE 6.3

PBS Client Benchmark

SHA256 speed: 239.43 MB/s
Compression speed: 315.91 MB/s
Decompress speed: 632.83 MB/s
AES256/GCM speed: 121.06 MB/s
Verify speed: 175.24 MB/s

┌───────────────────────────────────┬───────────────────┐
│ Name                              │ Value             │
╞═══════════════════════════════════╪═══════════════════╡
│ TLS (maximal backup upload speed) │ not tested        │
├───────────────────────────────────┼───────────────────┤
│ SHA256 checksum computation speed │ 239.43 MB/s (12%) │
├───────────────────────────────────┼───────────────────┤
│ ZStd level 1 compression speed    │ 315.91 MB/s (42%) │
├───────────────────────────────────┼───────────────────┤
│ ZStd level 1 decompression speed  │ 632.83 MB/s (53%) │
├───────────────────────────────────┼───────────────────┤
│ Chunk verification speed          │ 175.24 MB/s (23%) │
├───────────────────────────────────┼───────────────────┤
│ AES256 GCM encryption speed       │ 121.06 MB/s (3%)  │
└───────────────────────────────────┴───────────────────┘

AMD Ryzen 7 5800X 8-Core Processor

Update date 2021-03-05
PBS Version 1.0.8
Linux Kernel 5.10
Distro Version Debian Sid (2021-03)

PBS Client Build

Performance counter stats for 'cargo build --release --bin proxmox-backup-client --bin pxar --bin dump-catalog-shell-cli':
     1,160,289.16 msec task-clock                #   10.000 CPUs utilized
          166,631      context-switches          #    0.144 K/sec
           10,292      cpu-migrations            #    0.009 K/sec
        6,427,491      page-faults               #    0.006 M/sec
5,229,554,881,150      cycles                    #    4.507 GHz                      (2.18%)
   13,378,977,812      stalled-cycles-frontend   #    0.26% frontend cycles idle     (2.19%)
  187,718,252,978      stalled-cycles-backend    #    3.59% backend cycles idle      (2.18%)
4,384,063,783,832      instructions              #    0.84  insn per cycle
                                                 #    0.04  stalled cycles per insn  (2.18%)
  882,521,754,530      branches                  #  760.605 M/sec                    (2.18%)
   37,025,573,266      branch-misses             #    4.20% of all branches          (2.16%)
1,720,800,625,889      L1-dcache-loads           # 1483.079 M/sec                    (2.17%)
   92,596,728,614      L1-dcache-load-misses     #    5.38% of all L1-dcache accesses  (2.19%)
  <not supported>      LLC-loads
  <not supported>      LLC-load-misses

   116.030434890 seconds time elapsed

   1141.607730000 seconds user


PBS Client Benchmark

SHA256 speed: 2374.35 MB/s
Compression speed: 952.37 MB/s
Decompress speed: 1893.54 MB/s
AES256/GCM speed: 4640.62 MB/s
Verify speed: 930.41 MB/s

AMD Ryzen 7 3700X 8-Core Processor

Update date 2021-03-05
PBS Version 1.0.8
Linux Kernel 5.4
Distro Version Proxmox VE 6.3

PBS Client Benchmark

proxmox-backup-client benchmark
SHA256 speed: 2078.08 MB/s
Compression speed: 683.08 MB/s
Decompress speed: 1384.11 MB/s
AES256/GCM speed: 3705.00 MB/s
Verify speed: 836.79 MB/s
┌───────────────────────────────────┬─────────────────────┐
│ Name                              │ Value               │
╞═══════════════════════════════════╪═════════════════════╡
│ TLS (maximal backup upload speed) │ not tested          │
├───────────────────────────────────┼─────────────────────┤
│ SHA256 checksum computation speed │ 2078.08 MB/s (103%) │
├───────────────────────────────────┼─────────────────────┤
│ ZStd level 1 compression speed    │ 683.08 MB/s (91%)   │
├───────────────────────────────────┼─────────────────────┤
│ ZStd level 1 decompression speed  │ 1384.11 MB/s (116%) │
├───────────────────────────────────┼─────────────────────┤
│ Chunk verification speed          │ 836.79 MB/s (110%)  │
├───────────────────────────────────┼─────────────────────┤
│ AES256 GCM encryption speed       │ 3705.00 MB/s (102%) │
└───────────────────────────────────┴─────────────────────┘

AMD EPYC 7302P 16-Core Processor

Update date 2021-03-05
PBS Version 1.0.8
Linux Kernel 5.4
Distro Version Proxmox VE 6.3

PBS Client Benchmark

SHA256 speed: 1624.23 MB/s
Compression speed: 539.71 MB/s
Decompress speed: 1106.47 MB/s
AES256/GCM speed: 2978.71 MB/s
Verify speed: 656.09 MB/s
┌───────────────────────────────────┬────────────────────┐
│ Name                              │ Value              │
╞═══════════════════════════════════╪════════════════════╡
│ TLS (maximal backup upload speed) │ not tested         │
├───────────────────────────────────┼────────────────────┤
│ SHA256 checksum computation speed │ 1624.23 MB/s (80%) │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 compression speed    │ 539.71 MB/s (72%)  │
├───────────────────────────────────┼────────────────────┤
│ ZStd level 1 decompression speed  │ 1106.47 MB/s (92%) │
├───────────────────────────────────┼────────────────────┤
│ Chunk verification speed          │ 656.09 MB/s (87%)  │
├───────────────────────────────────┼────────────────────┤
│ AES256 GCM encryption speed       │ 2978.71 MB/s (82%) │
└───────────────────────────────────┴────────────────────┘

ARM64 Cortex-A72 - Raspberry Pi 4

Update date 2021-03-05
PBS Version 1.0.8
Linux Kernel 5.8
Distro Version Ubuntu Server 20.10

PBS Client Build

Performance counter stats for 'cargo build --release --bin proxmox-backup-client --bin pxar --bin dump-catalog-shell-cli':
       7,116,962.15 msec task-clock                #    3.706 CPUs utilized          
            349,207      context-switches          #    0.049 K/sec                  
              5,272      cpu-migrations            #    0.001 K/sec                  
          6,070,961      page-faults               #    0.853 K/sec                  
 10,670,903,671,410      cycles                    #    1.499 GHz                    
  5,002,859,357,663      instructions              #    0.47  insn per cycle         
    <not supported>      branches                                                    
     56,010,658,667      branch-misses                                               

   1920.615051800 seconds time elapsed

   7000.438300000 seconds user
    112.304292000 seconds sys

PBS Client Benchmark

SHA256 speed: 142.41 MB/s
Compression speed: 123.64 MB/s
Decompress speed: 300.87 MB/s
AES256/GCM speed: 23.46 MB/s
Verify speed: 95.29 MB/s
┌───────────────────────────────────┬───────────────────┐
│ Name                              │ Value             │
╞═══════════════════════════════════╪═══════════════════╡
│ TLS (maximal backup upload speed) │ not tested        │
├───────────────────────────────────┼───────────────────┤
│ SHA256 checksum computation speed │ 142.41 MB/s (7%)  │
├───────────────────────────────────┼───────────────────┤
│ ZStd level 1 compression speed    │ 123.64 MB/s (16%) │
├───────────────────────────────────┼───────────────────┤
│ ZStd level 1 decompression speed  │ 300.87 MB/s (25%) │
├───────────────────────────────────┼───────────────────┤
│ Chunk verification speed          │ 95.29 MB/s (13%)  │
├───────────────────────────────────┼───────────────────┤
│ AES256 GCM encryption speed       │ 23.46 MB/s (1%)   │
└───────────────────────────────────┴───────────────────┘