AWS ARM64 vs. X86_64 – Bitcoin Performance Comparison

I’m running some experiments with bitcoin core software on AWS. I’m intrigued by 2 questions:

  • Which performs better, ARM64 vs X86_64?
  • What is performance loss of 32bit ARM vs 64bit ARM?

The Experiment

  • I fired up a t3.medium instance with ubuntu desktop 20.04(x86_64) image
  • I fired a second instance of t4g.medium type with ubuntu desktop 20.04 (arm64) image
  • To run 32bit ARM program, I followed the chroot approach documented in my previous post and set up an armhf Ubuntu 20.04 (focal) chroot environment
  • Then I download all 3 versions of bitcoin core software from its download site : 32bit ARM, 64bit ARM and 64bit Intel.
  • Run bitcoin-qt from scratch (delete ~/.bitcoin directory if any) and count the time duration for the verifying first 200,000 blocks.
    • Each configuration runs 2 times. We then take the average.
    • Other system variables are monitored to ensure similar running environment. Specifically networking or disk don’t seem to be a factor.

The Results

See results listed below. It seems for bitcoin related workload 64bit ARM version performs 20% faster than 32bit ARM version and 64bit Intel instance.

category64bit ARM 32bit ARM64bit Intel
Instance typet4g.mediumt4g.mediumt3.medium
CPU Graviton2Graviton2Intel Xeon Platinum 8000 series
# CPU cores222
RAM (GB)444
run duration #18’44” (524″)10’24” (624″)11’16″(676″)
run duration #27’30” (450″)9’58″(598″)9’04″(544″)
run duration average487″611″610″
relative performance to 64bit ARM100%80%80%
Pricing (us-west-2/Orgon) ($/hr)0.03360.03360.0416

Geekbench Performance on AWS Graviton 2

There is much touting about the new AWS Graviton 2 (ARM64) offering as a game changer. Let us run some benchmark to test it out.

Settings

We pick 3 EC2 instance types to compare:

  • a1 – First generation of ARM64 AWS Graviton CPU
  • m6g – Second generation of ARM64 Graviton 2 CPU
  • m5 – Intel Xeon Platinum 8259CL CPU

We run Geekbench 4 on all xlarge instances of these EC2 types. We mostly focus on 64bit performance, but we will also touch 32bit performance as well.

Overview

Instance typea1.xlargem6g.xlargem5.xlarge
vCPU444
Memory(GB)81616
Hourly price(us-ea-1,Linux)0.1020.1540.192
64bit single-core score189936093647
64bit multi-core score5227111428017

From the above table, several observations are obvious:

  • Graviton 2 has doubled the performance of Graviton 1.
  • For single core performance Graviton 2 is similar to Intel Xeon CPU
  • For multi-core performance, Graviton 2 scales up much better, likely because Intel uses hyper-threading technology, where vCPU count is only 1/2 of true CPU core count. By contrast, vCPU count in Graviton CPU is true CPU core count.

I also listed the pricing. It looks like Graviton 2 is a good deal!

Look into the Details

This link gives detailed scores for each test suite and each instance type. A few highlighted cells indicate interesting contrast between Intel Xeon and Graviton 2:

  • Intel Xeon is 20 times faster than Graviton 2 in AES test! This is *very* likely due to non-optimized implementation for ARM64, i.e., it is not using NEON instructions. Otherwise the performance should be more comparable.
  • Intel Xeon are 2x better in SGEMM and 50% better in SFFT, both heavily relying Intel AVS/SSE instructions while ARM64 using NEON instructions.
  • Graviton 2 shines in memory area, 2x better in Memory Copy and 4x better in Memory bandwidth.

32bit Performance

While 32bit performance is probably not interesting on those servers, it is still interesting to take a look. Below is the overview comparison table and this link gives detail scores.

Instance typea1.xlargem6g.xlargem5.xlarge
32bit single-core score170729753053
32bit multi-core score469290166886

Overall we see similar patterns in 64bit case:

  • Graviton 2 is about 2x faster than Graviton 1
  • Intel Xeon performs relatively same as Graviton 2 in single core and wanes in multi-core performance.
  • Detail scores also reflect similar pattern as 64bit case.