Hybrid deployment of MariaDB cluster on x86 and arm64

Author: zhaorenhai

This article attempts to deploy MariaDB clusters across x86 and arm64 platforms to see whether such a mixed deployment can succeed.
The deployment is carried out on HUAWEI CLOUD; the OS is openEuler 20.03 and the MariaDB version is 10.3.9.
The official MariaDB documentation describes two kinds of high-availability deployment: the simpler one is replication, and the other is the Galera cluster.
Let's try both of these high-availability setups to see whether they support mixed deployment on x86 and arm64 platforms.

Replication
As the name implies, replication means that one or more replica databases copy data from a primary database in real time, ensuring that the databases hold the same data. This article will only try the one primary, one replica scenario.

Purposes of replication:
• Read requests can be distributed across multiple databases, dispersing the load on the primary; the most common use case is a read-heavy, write-light workload.
• Data analysis. Analyzing the data in a database generally requires complex SQL. Running it on the primary would put heavy pressure on it, because the primary is generally designed as an OLTP database and is not suited to analytical workloads. With replication, analysis can be performed on the replica only, reducing the load on the primary.
• Backup. A backup itself has a considerable performance impact on the database. With replication, backing up on the replica avoids any impact on the primary. For databases with frequent updates and inserts, the replica can be disconnected from the primary during the backup, making the backup faster, and reconnected afterwards.
• Data distribution: replication can be used to distribute data to different regions.

The technical principle of replication: first back up the primary database and restore it on the replica, so that the base data of the two databases is consistent; then configure the replica to apply the primary's binlog in real time, keeping the two databases in sync.

Deployment steps for replication
First, purchase two cloud servers in the same network region of Huawei Cloud, one x86 and one arm64; ensure the intranet allows them to communicate with each other, and select openEuler as the OS.

Then deploy the primary database on x86:

yum install mariadb-server

Create data file directory

mkdir /var/db/mariadbdatadir

Configure the database configuration file:
Modify /etc/my.cnf.d/mariadb-server.cnf, add the following configuration under [mariadb] (specific parameters can be adjusted according to your own environment):

datadir = /var/db/mariadbdatadir
innodb_buffer_pool_size = 6G
innodb_log_file_size = 1G
innodb_flush_method = O_DIRECT
bind-address = 0.0.0.0
log-bin
server_id = 1
log-basename = primary1
binlog-format = mixed

Start the database:

systemctl start mariadb.service

Check whether the database started successfully

systemctl status mariadb.service

Load test data:
For simplicity, we use the tpcc-mysql tool to load some TPC-C data for testing:

yum install mariadb-devel
git clone https://github.com/Percona-Lab/tpcc-mysql
cd tpcc-mysql/src
make
cd ..
mysqladmin create tpcc10
mysql tpcc10 < create_table.sql
./tpcc_load -h127.0.0.1 -d tpcc10 -u root -p "" -w 10

Create a user:

mysql -u root
Create user 'replication_user'@'%' identified by 'bigs3cret';
grant replication slave on *.* to 'replication_user'@'%';

Now a working primary database with some data is configured.

Next we configure a replica database on the arm machine.

yum install mariadb-server

Create data file directory:

mkdir /var/db/mariadbdatadir

Configure the database:
Modify /etc/my.cnf.d/mariadb-server.cnf and add the following configuration under [mariadb] (specific parameters can be adjusted according to your own environment):

datadir = /var/db/mariadbdatadir
innodb_buffer_pool_size = 6G
innodb_log_file_size = 1G
innodb_flush_method = O_DIRECT
server_id = 2

Next, copy the base data from the primary to the replica.
Log in to the primary:

mysql -u root

Execute the following SQL:

flush tables with read lock;
show master status;
+---------------------+----------+--------------+------------------+
| File                | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+---------------------+----------+--------------+------------------+
| primary1-bin.000011 |      687 |              |                  |
+---------------------+----------+--------------+------------------+

Record the File and Position values from the above output.
Note that this session must not be closed until the backup of the primary below is complete.
Create a backup directory on primary:

mkdir /var/db/mariadbbackup

Perform backup:

mariabackup --backup --target-dir=/var/db/mariadbbackup/ --user=root --password=""
mariabackup --prepare --target-dir=/var/db/mariadbbackup/

Now you can unlock the tables; execute the following SQL in the same session:

unlock tables;

Copy the backed-up data to the replica.
On the primary:

cd /var/db 
tar -czvf mariadbbackup.tar.gz mariadbbackup

Operations on replica:

scp root@192.168.0.196:/var/db/mariadbbackup.tar.gz /var/db/
cd /var/db/
tar -zxvf mariadbbackup.tar.gz
mariabackup --copy-back --target-dir=/var/db/mariadbbackup/ --datadir=/var/db/mariadbdatadir/
cd mariadbdatadir
chown -R mysql:mysql ./*
systemctl start mariadb.service

Configure real-time replication:

cd /var/db/mariadbbackup
cat xtrabackup_binlog_info

The output is as follows; note these values, as they are used in the subsequent operations on the replica.

primary1-bin.000011     687     0-1-437
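The three fields of this file can be split out with a short shell sketch, so they do not have to be copied by eye. The sample line below is the output shown above; on the primary it would come from `cat /var/db/mariadbbackup/xtrabackup_binlog_info`:

```shell
# Split xtrabackup_binlog_info into the binlog file, position and GTID
# needed for the CHANGE MASTER TO statement on the replica.
info='primary1-bin.000011     687     0-1-437'
binlog_file=$(printf '%s' "$info" | awk '{print $1}')
binlog_pos=$(printf '%s' "$info" | awk '{print $2}')
gtid_pos=$(printf '%s' "$info" | awk '{print $3}')
# prints: file=primary1-bin.000011 pos=687 gtid=0-1-437
echo "file=$binlog_file pos=$binlog_pos gtid=$gtid_pos"
```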

Operations on replica:

mysql -u root
set global gtid_slave_pos = '0-1-437';
CHANGE MASTER TO
MASTER_HOST='192.168.0.196',
MASTER_USER='replication_user',
MASTER_PASSWORD='bigs3cret',
MASTER_PORT=3306,
MASTER_LOG_FILE='primary1-bin.000011',
MASTER_LOG_POS=687,
MASTER_CONNECT_RETRY=10,
MASTER_USE_GTID = slave_pos;
START SLAVE;

Use the following command to view the replication status:

show slave status\G

If both of the following values are Yes, replication is working normally.

Slave_IO_Running: Yes
Slave_SQL_Running: Yes
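For a scripted health check, the two flags can be counted instead of read by eye. The sketch below inlines the two sample lines so the parsing logic runs on its own; on a live replica the status text would come from `mysql -u root -e 'show slave status\G'`:

```shell
# Count the "... Running: Yes" flags in SHOW SLAVE STATUS output;
# replication is healthy only when both the IO and SQL threads report Yes.
status='Slave_IO_Running: Yes
Slave_SQL_Running: Yes'
yes_count=$(printf '%s\n' "$status" | grep -c 'Running: Yes')
if [ "$yes_count" -eq 2 ]; then
    echo "replication OK"
else
    echo "replication BROKEN"
fi
```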

On the primary, load another database and perform the following test:

cd tpcc-mysql/
mysqladmin create tpcc-test
mysql tpcc-test < create_table.sql
./tpcc_load -h127.0.0.1 -d tpcc-test -u root -p "" -w 10

Query on replica:

mysql -u root
show databases;

The output is as follows:

+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| tpcc-test          |
| tpcc10             |
+--------------------+

You can see that the tpcc-test database has been created on the replica as well, and you can execute the following SQL for a further check:

use `tpcc-test`;
show tables;
select count(*) from customer;

If everything is normal, it means that the data is synchronized normally.

After the above test, it is basically proven that replication supports mixed deployment across x86 and arm64 environments.

Galera cluster
A Galera cluster is a multi-master, multi-active cluster: data is synchronized between nodes so that all database nodes stay consistent. Unlike Oracle RAC, a Galera cluster does not use shared storage; each database has its own independent storage.
This has many advantages. If any single node goes down, the business is not affected, and for read-heavy applications the cluster spreads the load well. For writes, of course, the synchronization overhead makes performance somewhat worse than a stand-alone database.
Data synchronization between the nodes of a Galera cluster is synchronous, not asynchronous, which guarantees consistency across nodes with no data divergence. For performance reasons it is not strictly real-time; the technique used is a form of virtual synchrony. The principle of virtual synchrony is fairly involved; you can refer to the following article:
https://blog.csdn.net/wzy0623/article/details/102522268

Galera cluster deployment steps
We plan to deploy a three-node x86 Galera cluster, with a GLB load balancer deployed in front.

First, purchase four x86 and one arm64 cloud server instances on Huawei Cloud, specification 4 vCPUs, 16 GB memory, CPU frequency 2.6 GHz. The OS is still openEuler 20.03.
Of the four x86 servers, one is used to deploy GLB and the other three to deploy the MariaDB Galera cluster. The arm64 server is used to deploy a MariaDB node that will later join the cluster.

Perform the following operations on the three x86 cluster nodes:

yum install mariadb-server

Create data file directory

mkdir /var/db/mariadbdatadir

Configure the database configuration file:
Modify /etc/my.cnf.d/mariadb-server.cnf and add the following configuration under [mariadb] (specific parameters can be adjusted according to your own environment):

datadir = /var/db/mariadbdatadir
innodb_buffer_pool_size = 6G
innodb_log_file_size = 1G
innodb_flush_method = O_DIRECT
bind-address = 0.0.0.0

Then execute systemctl start mariadb to start the database.

Each node then performs the following operations:

yum install mariadb-server-galera

Modify /etc/my.cnf.d/galera.cnf. The wsrep_cluster_address item should be set to the following value, where the IP addresses are the intranet IPs of the three cluster nodes:

wsrep_cluster_address="gcomm://192.168.0.159,192.168.0.78,192.168.0.23"

On the first node, stop the database:

systemctl stop mariadb

Execute the following command on the first node to initialize the cluster:

galera_new_cluster

Then restart the database on the other two nodes:

systemctl restart mariadb

You can log in to the database and use the following command to view the cluster status:

show status like 'wsrep%';
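Among the many wsrep variables, two are worth checking first: wsrep_cluster_size (which should equal the number of nodes, 3 here) and wsrep_local_state_comment (which should be Synced). A small parsing sketch, with sample name/value output inlined; on a live node the text would come from `mysql -u root -N -e "show status like 'wsrep%'"`:

```shell
# Extract the two key health indicators from "show status like 'wsrep%'"
# output (sample inlined; mysql -N prints bare name/value pairs).
wsrep_status='wsrep_cluster_size 3
wsrep_local_state_comment Synced'
size=$(printf '%s\n' "$wsrep_status" | awk '$1=="wsrep_cluster_size"{print $2}')
state=$(printf '%s\n' "$wsrep_status" | awk '$1=="wsrep_local_state_comment"{print $2}')
# prints: cluster size: 3, state: Synced
echo "cluster size: $size, state: $state"
```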

The cluster is now built successfully.

We are going to build a GLB load balancer next.
GLB is a pure TCP layer load balancer, simple and easy to use.

yum install gcc* libtool
git clone https://github.com/codership/glb
cd glb/
./bootstrap.sh
./configure
make
make install
cp files/glbd.cfg /etc/default/glbd

Edit the /etc/default/glbd configuration as follows:

LISTEN_ADDR="8010"
CONTROL_ADDR="127.0.0.1:8011"
CONTROL_FIFO="/var/run/glbd.fifo"
THREADS="16"
MAX_CONN=256
DEFAULT_TARGETS="192.168.0.159:3306 192.168.0.78:3306 192.168.0.23:3306"

Then start the load balancer:

cd files
./glbd.sh

Then log in to any database node of the cluster and execute the following SQL to create a user:

create user 'tpcc'@'192.168.0.%' identified by '123456';
grant all privileges on *.* to 'tpcc'@'192.168.0.%' with grant option;

Now start to load some test data.
On the node where the load balancer is located:

yum install mariadb
yum install mariadb-devel
git clone https://github.com/Percona-Lab/tpcc-mysql
cd tpcc-mysql/src
make
cd ..

Load test data

mysqladmin -h 127.0.0.1 -P 8010 -u tpcc -p123456 create tpcc10
mysql -h 127.0.0.1 -P 8010 -u tpcc -p123456 tpcc10 < create_table.sql
./tpcc_load -h127.0.0.1 -P8010 -d tpcc10 -u tpcc -p "123456" -w 10
mysql -h 127.0.0.1 -P 8010 -u tpcc -p123456 tpcc10 < add_fkey_idx.sql

Next, stop one of the x86 cluster nodes and prepare to replace it with an arm64 node.
Log in to one of the cluster nodes:
systemctl stop mariadb
Then log in to the other two cluster nodes and, in the wsrep_cluster_address value in /etc/my.cnf.d/galera.cnf, change the IP of the stopped node to the IP of the arm64 node.

Perform the following operations on the arm64 node:

yum install mariadb-server

Create data file directory

mkdir /var/db/mariadbdatadir

Configure the database configuration file:
Modify /etc/my.cnf.d/mariadb-server.cnf, add the following configuration under [mariadb] (specific parameters can be adjusted according to your own environment):

datadir = /var/db/mariadbdatadir
innodb_buffer_pool_size = 6G
innodb_log_file_size = 1G
innodb_flush_method = O_DIRECT
bind-address = 0.0.0.0

Then execute systemctl start mariadb to start the database.

yum install mariadb-server-galera

Modify /etc/my.cnf.d/galera.cnf; wsrep_cluster_address should be set to the following value, consistent with the other two x86 nodes:

wsrep_cluster_address="gcomm://192.168.0.159,192.168.0.78,192.168.0.173"

Then execute systemctl restart mariadb and wait for the restart to complete, which may take a while.
Then log in to the database and enter the following command to view the cluster status:

show status like 'wsrep%';

Then log in to the database on the arm64 node to check whether the data has synchronized normally.
If everything is normal, log in to the load balancer node: we can drop the previous data, reload it, and test large-scale delete and insert scenarios.

cd tpcc-mysql
mysqladmin -h 127.0.0.1 -P 8010 -u tpcc -p123456 drop tpcc10
mysqladmin -h 127.0.0.1 -P 8010 -u tpcc -p123456 create tpcc10
mysql -h 127.0.0.1 -P 8010 -u tpcc -p123456 tpcc10 < create_table.sql
./tpcc_load -h127.0.0.1 -P8010 -d tpcc10 -u tpcc -p "123456" -w 10
mysql -h 127.0.0.1 -P 8010 -u tpcc -p123456 tpcc10 < add_fkey_idx.sql

Then log in to each database and query whether the data is consistent.
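That comparison can be scripted: run the same count on every node and check that exactly one distinct value comes back. The sketch below inlines sample counts so the comparison logic runs on its own; on a live cluster each value would come from something like `mysql -h "$ip" -u tpcc -p123456 -N -e 'select count(*) from tpcc10.customer'` for each node IP (credentials and IPs as above, and the 300000 figures are illustrative only):

```shell
# One row count per node; the nodes are consistent when every node
# reports the same value, i.e. there is exactly one distinct count.
counts='300000
300000
300000'
distinct=$(printf '%s\n' "$counts" | sort -u | wc -l)
if [ "$distinct" -eq 1 ]; then
    echo "nodes consistent"
else
    echo "COUNT MISMATCH between nodes"
fi
```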
After the above test, it is proven that a Galera cluster can also be deployed in a mixed x86 and arm64 environment.
