Memcached Performance Comparison: x86_64 vs ARM64

Translator: wangxiyuan
Author: Martin Grigorov
Original article: https://medium.com/@martin.grigorov/compare-memcached-performance-on-x86-64-and-arm64-cpu-architectures-7fe781e34ab8

Another ARM64 vs x86 performance comparison article from Tomcat PMC member Martin Grigorov.

Last week I shared the results of load testing Apache Tomcat on the x86_64 and ARM64 CPU architectures. In this article I will test Memcached.

What is Memcached?

From Wikipedia: Memcached is a general-purpose distributed memory-caching system. It is often used to speed up dynamic database-driven websites by caching data and objects in RAM to reduce the number of times an external data source (such as a database or API) must be read.

In contrast to Apache Tomcat, which is written in Java and is therefore multi-platform, Memcached is written in C and has to be built specifically for your CPU architecture. As stated on its Hardware Wiki page, ARM64 is one of the officially supported architectures, and there is a BuildBot builder testing all code changes! If you face any issue, just report it at the project's issue tracker! Dormando, the project maintainer, is very friendly and responsive!

In my first attempt to find a good load testing tool for Memcached I stumbled upon the RedisLabs Memtier Benchmark tool. I ran it on the same VMs as in the Apache Tomcat article, and the results were:

ASCII protocol on ARM64

=========================================================================
Type          Ops/sec     Hits/sec   Misses/sec      Latency       KB/sec
-------------------------------------------------------------------------
Sets           985.28          ---          ---     20.02700        67.22
Gets          9842.00         0.00      9842.00     20.01900       248.83
Waits            0.00          ---          ---      0.00000          ---
Totals       10827.28         0.00      9842.00     20.02000       316.05

ASCII protocol on x86_64

=========================================================================
Type          Ops/sec     Hits/sec   Misses/sec      Latency       KB/sec
-------------------------------------------------------------------------
Sets           931.04          ---          ---     20.06800        63.52
Gets          9300.21         0.00      9300.21     20.32600       235.13
Waits            0.00          ---          ---      0.00000          ---
Totals       10231.26         0.00      9300.21     20.30200       298.66

The tables above show that the Memcached server running on the ARM64 VM was slightly faster: about 10.8K total operations per second versus about 10.2K on x86_64!
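To put a number on "slightly faster", here is a minimal Python sketch that computes the relative difference from the ASCII protocol figures in the tables above. The helper name `relative_gain_percent` is made up for this illustration; only the ops/sec values come from the test run.

```python
def relative_gain_percent(arm64: float, x86_64: float) -> float:
    """Return how much higher the ARM64 figure is than the x86_64 one, in percent."""
    return (arm64 - x86_64) / x86_64 * 100.0


# Ops/sec copied from the memtier_benchmark ASCII tables: (ARM64, x86_64).
ASCII_OPS_PER_SEC = {
    "Sets": (985.28, 931.04),
    "Gets": (9842.00, 9300.21),
    "Totals": (10827.28, 10231.26),
}

# Relative gain per table row.
gains = {
    row: relative_gain_percent(arm64, x86_64)
    for row, (arm64, x86_64) in ASCII_OPS_PER_SEC.items()
}

for row, gain in gains.items():
    print(f"{row}: ARM64 is {gain:.2f}% higher than x86_64")
```

For all three rows the difference comes out a little under 6%, so the ARM64 advantage in this run is real but small.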

Note: the Memcached server was running with its default settings (a maximum of 1024 connections, 4 threads and 64 MB of memory), i.e. without specifying custom values.

For the binary protocol the numbers on the two architectures are almost the same:

Binary protocol on ARM64

=========================================================================
Type          Ops/sec     Hits/sec   Misses/sec      Latency       KB/sec
-------------------------------------------------------------------------
Sets           829.68          ---          ---     23.46500        63.90
Gets          8287.69         0.00      8287.69     23.56100       314.75
Waits            0.00          ---          ---      0.00000          ---
Totals        9117.37         0.00      8287.69     23.55200       378.65

Binary protocol on x86_64

=========================================================================
Type          Ops/sec     Hits/sec   Misses/sec      Latency       KB/sec
-------------------------------------------------------------------------
Sets           829.32          ---          ---     23.63600        63.87
Gets          8284.10         0.00      8284.10     23.58600       314.61
Waits            0.00          ---          ---      0.00000          ---
Totals        9113.42         0.00      8284.10     23.59100       378.48

After I shared these results with the Memcached community, they recommended that I use the mc-crusher tool instead. And indeed the numbers are much better with it:

[Chart: ASCII protocol GET operations per second]

[Chart: ASCII protocol SET operations per second]

In the first chart you can see that on both x86_64 and ARM64 the server handles around 1.5 million GET operations per second!

In the second chart it handles a little more than 900 thousand SET operations per second on ARM64 and around 840 thousand on x86_64.

Note: Since the mc-crusher tool does not provide any statistics from its execution, I used Memcached's stats command to get the number of executed operations.
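One way to turn the stats command into an operations-per-second figure is to sample the `cmd_get` and `cmd_set` counters twice and divide the difference by the elapsed time. The Python sketch below illustrates that idea; it is not the script used for the charts, and the function names, host, port and sampling interval are assumptions made for the example.

```python
import socket
import time


def parse_stats(raw: str) -> dict:
    """Parse the text reply of the `stats` command ("STAT <name> <value>" lines)."""
    stats = {}
    for line in raw.splitlines():
        parts = line.split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def fetch_stats(host: str = "127.0.0.1", port: int = 11211) -> dict:
    """Send `stats` over the ASCII protocol and read until the END marker."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(b"stats\r\n")
        reply = b""
        while not reply.endswith(b"END\r\n"):
            chunk = conn.recv(4096)
            if not chunk:  # server closed the connection early
                break
            reply += chunk
    return parse_stats(reply.decode("ascii"))


def ops_per_second(before: dict, after: dict, counter: str, seconds: float) -> float:
    """Rate of a monotonically increasing counter between two samples."""
    return (int(after[counter]) - int(before[counter])) / seconds


def measure(host: str = "127.0.0.1", port: int = 11211, interval: float = 10.0) -> dict:
    """Sample the server twice (assumed interval) and return GET/SET rates."""
    start = time.monotonic()
    before = fetch_stats(host, port)
    time.sleep(interval)
    after = fetch_stats(host, port)
    elapsed = time.monotonic() - start
    return {
        "get_ops_per_sec": ops_per_second(before, after, "cmd_get", elapsed),
        "set_ops_per_sec": ops_per_second(before, after, "cmd_set", elapsed),
    }
```

Calling `measure()` while mc-crusher is running against the server would return the two rates for that interval. Sampling from a machine other than the overloaded client keeps the measurement itself from adding to the load.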

Here are the settings used for the load test:

  1. The servers are started with:

    $ memcached -t 16 -c 256 -m 2048

i.e. with 16 threads, a maximum of 256 simultaneous connections and 2 GB of memory.

  2. MC Crusher
    a. GET config

    send=ascii_get,recv=blind_read,conns=100,key_prefix=foobar,pipelines=10
    send=ascii_set,recv=blind_read,conns=10,key_prefix=foobar,pipelines=4,stop_after=200000,usleep=1000,value_size=10

    b. SET config

    send=ascii_set,recv=blind_read,conns=100,key_prefix=foobar,value_size=2,value=hi,pipelines=10
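Each mc-crusher config line is a comma-separated list of key=value pairs, one line per test. As a rough illustration, the Python sketch below splits the GET config into dictionaries and adds up the `conns` values to check that they fit under the server's `-c 256` connection limit. The parser is a hypothetical helper written for this article, not part of mc-crusher, and reading `conns` as the number of client connections per line is an assumption based on the option name.

```python
# The GET config from above, one test definition per line.
GET_CONFIG = """\
send=ascii_get,recv=blind_read,conns=100,key_prefix=foobar,pipelines=10
send=ascii_set,recv=blind_read,conns=10,key_prefix=foobar,pipelines=4,stop_after=200000,usleep=1000,value_size=10
"""

SERVER_MAX_CONNS = 256  # from `memcached -c 256`


def parse_crusher_line(line: str) -> dict:
    """Split one config line into a {key: value} dict (values stay strings)."""
    return dict(item.split("=", 1) for item in line.strip().split(","))


tests = [parse_crusher_line(line) for line in GET_CONFIG.splitlines() if line.strip()]

# Assumption: `conns` is the number of connections each test line opens.
total_conns = sum(int(test["conns"]) for test in tests)

for test in tests:
    print(f"{test['send']}: conns={test['conns']}, pipelines={test['pipelines']}")
print(f"total connections: {total_conns} (server limit: {SERVER_MAX_CONNS})")
```

Under that reading, the GET test opens 110 connections in total, comfortably below the 256 allowed by the server settings.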

Note: In the chart for the GET operation you can see that the number rises on May 13th, 2020 from around 950K operations per second to around 1.6 million ops/s. On that day I upgraded the VM that I use as a client, i.e. where I run the load testing tool (mc-crusher), because I had noticed spikes during the test runs when the client itself was overloaded.

Once again we see that an ARM64 server can be as fast as an x86_64 one!

If you have ideas on how to improve this test or how to measure some other aspect of Memcached, feel free to share them with me in the comments!

Happy hacking and stay safe!
