Canal is an open source product provided by Alibaba Group. Canal can parse incremental log data in MySQL and allows you to subscribe to and consume incremental data. You can use Canal to synchronize incremental data from a MySQL database to an Alibaba Cloud Elasticsearch cluster. This topic describes the procedure in detail. An ApsaraDB RDS for MySQL database is used in this topic.
Background information
Canal is open source extract, transform, load (ETL) software hosted on GitHub. For more information about how Canal works and other details, see Canal.
Procedure
- Make preparations
Create an ApsaraDB RDS for MySQL instance, an Alibaba Cloud Elasticsearch cluster, and an Elastic Compute Service (ECS) instance that reside in the same virtual private cloud (VPC).
- ApsaraDB RDS for MySQL instance: stores source data and incremental data.
- Canal: used to parse database logs, obtain incremental data, and synchronize the incremental data to the Alibaba Cloud Elasticsearch cluster.
- Alibaba Cloud Elasticsearch cluster: receives incremental data.
- ECS instance: used to deploy the Canal server and Canal adapter.
- Step 1: Prepare the MySQL data source
Prepare the data that you want to synchronize in the ApsaraDB RDS for MySQL instance.
- Step 2: Create an index and configure mappings for the index
Create an index and configure mappings for the index in the Elasticsearch cluster. The field names and types that are defined in the mappings must be the same as those of the data that you want to synchronize.
- Step 3: Install the JDK
Install the Java Development Kit (JDK) before you use Canal. The version of the JDK must be 1.8.0 or later.
- Step 4: Install and start the Canal server
Install the Canal server and modify its configuration file to associate the Canal server with the ApsaraDB RDS for MySQL instance. For a MySQL cluster, the Canal server simulates a slave node in the cluster to obtain the binary logs from the master node. Then, the Canal server sends the logs to the Canal adapter.
- Step 5: Install and start the Canal adapter
Install the Canal adapter and modify its configuration file to associate the Canal adapter with the ApsaraDB RDS for MySQL instance and the Elasticsearch cluster. Then, define the mappings between the fields in the ApsaraDB RDS for MySQL instance and those in the Elasticsearch cluster for data synchronization.
- Step 6: Verify the synchronization result of incremental data
Add, modify, or delete data in the ApsaraDB RDS for MySQL instance and view the data synchronization result.
Make preparations
- Create an ApsaraDB RDS for MySQL instance. For more information, see Create an ApsaraDB RDS for MySQL instance. In this example, an ApsaraDB RDS instance that runs MySQL 5.7 is created. The following figure shows the configuration of the instance.
Important: If you use an ApsaraDB RDS instance that runs MySQL 8.0 instead of MySQL 5.7, you must replace the MySQL 5.7 driver with the MySQL 8.0 driver. For more information, see FAQ.
- Create an Alibaba Cloud Elasticsearch cluster.
For more information, see Create an Alibaba Cloud Elasticsearch cluster. An Elasticsearch V6.7 cluster of the Standard Edition is created in this topic.
- Create an Alibaba Cloud ECS instance.
For more information, see Create an instance by using the wizard. The ECS instance runs an image of CentOS 7.6 64-bit in this topic.
Step 1: Prepare the MySQL data source
Log on to the ApsaraDB RDS console, and create an ApsaraDB RDS for MySQL database and a table. For more information, see General workflow to use ApsaraDB RDS for MySQL.
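The table definition itself is not shown in this topic. As a rough sketch only, the following commands write a possible definition to a scratch file for review. The database name elasticsearch and the table name es_test are taken from the statements used later in this topic, and the four columns match the fields of the index created in Step 2. The column types, lengths, and the primary key are assumptions chosen to line up with the text and integer mappings, so adjust them to your own data. The database is assumed to exist already.

```shell
# Write the DDL to a scratch file so that it can be reviewed before it is applied.
SQL_FILE="$(mktemp)"
cat > "$SQL_FILE" <<'EOF'
-- Column types and the primary key are assumptions, not values from this topic.
CREATE TABLE IF NOT EXISTS `elasticsearch`.`es_test` (
  `count` varchar(64)  DEFAULT NULL,  -- mapped to a text field in the index
  `id`    int(11)      NOT NULL,      -- mapped to an integer field in the index
  `name`  varchar(255) DEFAULT NULL,  -- mapped to a text field (ik_smart analyzer)
  `color` varchar(255) DEFAULT NULL,  -- mapped to a text field in the index
  PRIMARY KEY (`id`)
);
EOF
cat "$SQL_FILE"

# To apply the file, pipe it to the mysql client. The values in angle brackets
# are placeholders that you must replace:
# mysql -h <Internal endpoint> -P <Internal port> -u <Username> -p < "$SQL_FILE"
```

You can also paste the statement into the SQL window of the console instead of using the mysql client.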

Step 2: Create an index and configure mappings for the index
- Log on to the Kibana console of your Elasticsearch cluster and go to the homepage of the Kibana console as prompted. For more information about how to log on to the Kibana console, see Log on to the Kibana console.
Note: In this example, an Elasticsearch V6.7.0 cluster is used. Operations on clusters of other versions may differ. The actual operations in the console prevail.
- In the left-side navigation pane of the page that appears, click Dev Tools.
- On the Console tab of the page that appears, run the following command to create an index. In this example, an index named es_test is created. The index contains the following fields: count, id, name, and color.
Important: The field names and types defined in the mappings in the command must be the same as those in the table created in Step 1: Prepare the MySQL data source.
PUT es_test?include_type_name=true
{
  "settings": {
    "index": {
      "number_of_shards": "5",
      "number_of_replicas": "1"
    }
  },
  "mappings": {
    "_doc": {
      "properties": {
        "count": { "type": "text" },
        "id": { "type": "integer" },
        "name": { "type": "text", "analyzer": "ik_smart" },
        "color": { "type": "text" }
      }
    }
  }
}
After the index is created and the mappings are configured, the following result is returned:
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "es_test"
}
Step 3: Install the JDK
- Connect to the ECS instance. For more information, see Connect to a Linux instance by using a password or key.
Note: In this example, a common user is used.
- View available JDK packages.
sudo yum search java | grep -i --color JDK
- Install the JDK of the required version. In this example, java-1.8.0-openjdk-devel.x86_64 is used.
sudo yum install java-1.8.0-openjdk-devel.x86_64
- Configure environment variables.
- Run the following commands to check whether the JDK is installed:
java
javac
java -version
If the JDK is installed, the result shown in the following figure is returned.
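The step that configures environment variables does not list any commands. One possible approach, shown here as a sketch rather than the required procedure, is to derive JAVA_HOME from the location of the javac binary that yum installed. The approach assumes that javac is already on the PATH and that readlink supports the -f option, as it does on CentOS.

```shell
# Resolve the JDK home directory from the javac binary, if javac is on the PATH.
if command -v javac >/dev/null 2>&1; then
  # Follow symbolic links (for example, those created by alternatives).
  JAVAC_PATH="$(readlink -f "$(command -v javac)")"
  # javac is located in <JDK home>/bin, so go up two levels.
  JAVA_HOME="$(dirname "$(dirname "$JAVAC_PATH")")"
  export JAVA_HOME
  export PATH="$JAVA_HOME/bin:$PATH"
  echo "JAVA_HOME=$JAVA_HOME"
else
  echo "javac not found; install the JDK first" >&2
fi
```

To keep the setting across sessions, the two export lines can be appended to a shell startup file such as ~/.bash_profile with the resolved path written out.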
Step 4: Install and start the Canal server
- Download the Canal server package. A Canal 1.1.4 server is used in this topic.
wget https://github.com/alibaba/canal/releases/download/canal-1.1.4/canal.deployer-1.1.4.tar.gz
Note: Canal 1.1.5 supports only Alibaba Cloud Elasticsearch V7.0 clusters. If your Elasticsearch cluster is of V7.0, you must download the Canal 1.1.5 package. For more information, see Canal release note.
- Run the following command to decompress the package:
tar -zxvf canal.deployer-1.1.4.tar.gz
- Run the following command to modify the conf/example/instance.properties file:
vi conf/example/instance.properties

| Parameter | Description |
| --- | --- |
| canal.instance.master.address | Configure this parameter in the format of <Internal endpoint of the ApsaraDB RDS for MySQL instance>:<Internal port>. You can obtain the required information on the Basic Information page of the ApsaraDB RDS for MySQL instance. Example: rm-bp1u1xxxxxxxxx6ph.mysql.rds.aliyuncs.com:3306. |
| canal.instance.dbUsername | The username that is used to log on to the ApsaraDB RDS for MySQL database. You can obtain the username on the Accounts tab of the ApsaraDB RDS for MySQL instance. |
| canal.instance.dbPassword | The password that is used to log on to the ApsaraDB RDS for MySQL database. |

- Press Esc, and then enter :wq to save the file and exit vi.
- Start the Canal server and query the log.
./bin/startup.sh
cat logs/canal/canal.log
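If you prefer not to edit the file interactively, the three parameters described in Step 4 can also be set with sed. The following sketch works on a scratch copy so that it is safe to run anywhere. The initial lines of the copy are stand-ins for the real file, the endpoint is the example value from this topic, and the username and password are hypothetical. In a real deployment, CONF would point to conf/example/instance.properties.

```shell
# Example endpoint from this topic and hypothetical credentials.
RDS_ADDR='rm-bp1u1xxxxxxxxx6ph.mysql.rds.aliyuncs.com:3306'
DB_USER='canal_user'
DB_PASS='canal_password'

# Scratch copy with stand-in values for the three parameters.
CONF="$(mktemp)"
cat > "$CONF" <<'EOF'
canal.instance.master.address=127.0.0.1:3306
canal.instance.dbUsername=canal
canal.instance.dbPassword=canal
EOF

# Rewrite the value of each key. The | delimiter is used because the
# endpoint contains no | character. Values that contain | or & would
# need to be escaped first.
sed \
  -e "s|^canal\.instance\.master\.address=.*|canal.instance.master.address=${RDS_ADDR}|" \
  -e "s|^canal\.instance\.dbUsername=.*|canal.instance.dbUsername=${DB_USER}|" \
  -e "s|^canal\.instance\.dbPassword=.*|canal.instance.dbPassword=${DB_PASS}|" \
  "$CONF" > "$CONF.new" && mv "$CONF.new" "$CONF"

cat "$CONF"
```

Writing to a new file and then renaming it avoids relying on the in-place option of sed, which behaves differently across platforms.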
Step 5: Install and start the Canal adapter
- Download the Canal adapter package. Canal adapter 1.1.4 is used in this topic.
wget https://github.com/alibaba/canal/releases/download/canal-1.1.4/canal.adapter-1.1.4.tar.gz
Note: Canal 1.1.5 supports only Alibaba Cloud Elasticsearch V7.0 clusters. If your Elasticsearch cluster is of V7.0, you must download the Canal 1.1.5 package. For more information, see Canal release note.
- Run the following command to decompress the package:
tar -zxvf canal.adapter-1.1.4.tar.gz
- Run the following command to modify the conf/application.yml file:
vi conf/application.yml

| Parameter | Description |
| --- | --- |
| canal.conf.canalServerHost | The address of the Canal deployer. Retain the default value 127.0.0.1:11111. |
| canal.conf.srcDataSources.defaultDS.url | Configure this parameter in the format of jdbc:mysql://<Internal endpoint of the ApsaraDB RDS for MySQL instance>:<Internal port>/<Database name>?useUnicode=true. You can obtain the required information on the Basic Information page of the ApsaraDB RDS for MySQL instance. Example: jdbc:mysql://rm-bp1xxxxxxxxxnd6ph.mysql.rds.aliyuncs.com:3306/elasticsearch?useUnicode=true. |
| canal.conf.srcDataSources.defaultDS.username | The username that is used to log on to the ApsaraDB RDS for MySQL database. You can obtain the username on the Accounts tab of the ApsaraDB RDS for MySQL instance. |
| canal.conf.srcDataSources.defaultDS.password | The password that is used to log on to the ApsaraDB RDS for MySQL database. |
| canal.conf.canalAdapters.groups.outerAdapters.hosts | Find name:es and replace hosts with <Internal endpoint of the Elasticsearch cluster>:<Internal port>. You can obtain the required information on the Basic Information page of the Elasticsearch cluster. Example: es-cn-v64xxxxxxxxx3medp.elasticsearch.aliyuncs.com:9200. |
| canal.conf.canalAdapters.groups.outerAdapters.mode | Set the value to rest. |
| canal.conf.canalAdapters.groups.outerAdapters.properties.security.auth | Configure this parameter in the format of <Username of the Elasticsearch cluster>:<Password>. Example: elastic:es_password. |
| canal.conf.canalAdapters.groups.outerAdapters.properties.cluster.name | The ID of the Elasticsearch cluster. You can obtain the cluster ID on the Basic Information page of the Elasticsearch cluster. Example: es-cn-v64xxxxxxxxx3medp. |

- Press Esc, and then enter :wq to save the file and exit vi.
- Repeat the preceding steps to modify the conf/es/*.yml file and specify the fields that you want to map from the ApsaraDB RDS for MySQL database to the Elasticsearch cluster.

| Parameter | Description |
| --- | --- |
| esMapping._index | Set the value to the name of the index created in the Elasticsearch cluster in Step 2: Create an index and configure mappings for the index. es_test is used in this topic. |
| esMapping._type | Set the value to the type of the index created in the Elasticsearch cluster in Step 2: Create an index and configure mappings for the index. _doc is used in this topic. |
| esMapping._id | The ID of the document generated for the fields that you want to synchronize to the Elasticsearch cluster. You can customize this parameter. _id is used in this topic. |
| esMapping.sql | The SQL statement that is used to query the fields that you want to synchronize to the Elasticsearch cluster. The select t.id as _id,t.id,t.count,t.name,t.color from es_test t statement is used in this topic. |

- Start the Canal adapter and query logs.
./bin/startup.sh
cat logs/adapter/adapter.log
If the Canal adapter is started, the result shown in the following figure is returned.
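The dotted parameter names in Step 5 correspond to nested YAML keys. The following sketch writes sample fragments of both configuration files to a scratch directory to show how the nesting might look. It is an illustration under stated assumptions, not a copy of the shipped files. The endpoints and the elastic:es_password pair are the example values from this topic, while the database account, the instance name example, the group ID g1, the file name es_test.yml, and the position of mode under properties are assumptions. Keep the layout of the files shipped in your package where it differs.

```shell
# Write both samples to a scratch directory so that nothing in the real
# conf/ directory is overwritten by running this example.
SAMPLE_DIR="$(mktemp -d)"
mkdir -p "$SAMPLE_DIR/es"

# Fragment of conf/application.yml (example values, not working values).
cat > "$SAMPLE_DIR/application.yml" <<'EOF'
canal.conf:
  canalServerHost: 127.0.0.1:11111
  srcDataSources:
    defaultDS:
      url: jdbc:mysql://rm-bp1xxxxxxxxxnd6ph.mysql.rds.aliyuncs.com:3306/elasticsearch?useUnicode=true
      username: canal_user        # hypothetical account name
      password: canal_password    # hypothetical password
  canalAdapters:
  - instance: example             # assumed Canal instance name
    groups:
    - groupId: g1                 # assumed group ID
      outerAdapters:
      - name: es
        hosts: es-cn-v64xxxxxxxxx3medp.elasticsearch.aliyuncs.com:9200
        properties:
          mode: rest              # assumed position; keep the one in your file
          security.auth: elastic:es_password
          cluster.name: es-cn-v64xxxxxxxxx3medp
EOF

# Sample mapping file for conf/es/ (the file name is hypothetical).
cat > "$SAMPLE_DIR/es/es_test.yml" <<'EOF'
dataSourceKey: defaultDS          # assumed to match the key under srcDataSources
destination: example              # assumed to match the Canal instance name
groupId: g1                       # assumed to match the groupId in application.yml
esMapping:
  _index: es_test
  _type: _doc
  _id: _id
  sql: "select t.id as _id,t.id,t.count,t.name,t.color from es_test t"
EOF

ls "$SAMPLE_DIR" "$SAMPLE_DIR/es"
```

The alias _id in the SQL statement is what ties each row to a document ID, which is why the statement selects t.id twice: once as the document ID and once as the id field.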
Step 6: Verify the synchronization result of incremental data
- In the ApsaraDB RDS for MySQL database, add, modify, or remove data in the es_test table.
insert `elasticsearch`.`es_test`(`count`,`id`,`name`,`color`) values('11',2,'canal_test2','red');
- Log on to the Kibana console of your Elasticsearch cluster and go to the homepage of the Kibana console as prompted. For more information about how to log on to the Kibana console, see Log on to the Kibana console.
Note: In this example, an Elasticsearch V6.7.0 cluster is used. Operations on clusters of other versions may differ. The actual operations in the console prevail.
- In the left-side navigation pane of the page that appears, click Dev Tools.
- On the Console tab of the page that appears, run the following command to query the synchronized data:
GET /es_test/_search
If the incremental data is synchronized, the result shown in the following figure is returned.
Important: You can use Canal to synchronize only incremental data.
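Step 6 shows only an insert statement. As a sketch, the following commands prepare an update and a delete for the same row and build a search URL that can be requested from the ECS instance instead of the Kibana console. The endpoint and credentials are the example values from Step 5, and the new color value is hypothetical. The curl call is left commented out because it needs network access to your cluster.

```shell
# Example endpoint and credentials from Step 5; replace them with your own.
ES_HOST='es-cn-v64xxxxxxxxx3medp.elasticsearch.aliyuncs.com:9200'
ES_AUTH='elastic:es_password'

# Follow-up changes to the row inserted in Step 6. Run the statements one at
# a time in the database and query the index after each one.
CHANGES_SQL="$(mktemp)"
cat > "$CHANGES_SQL" <<'EOF'
update `elasticsearch`.`es_test` set `color` = 'blue' where `id` = 2;
delete from `elasticsearch`.`es_test` where `id` = 2;
EOF
cat "$CHANGES_SQL"

# URI search for documents whose id field is 2.
SEARCH_URL="http://${ES_HOST}/es_test/_search?q=id:2&pretty"
echo "$SEARCH_URL"

# Uncomment to send the request from the ECS instance:
# curl -s -u "$ES_AUTH" "$SEARCH_URL"
```

After the update, the document is expected to show the new color value. After the delete, the search is expected to return no hits for that ID.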
FAQ
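To replace the MySQL 5.x driver of the Canal adapter with the MySQL 8.0 driver, perform the following steps in the directory where the Canal adapter is installed:
- Download the package of the MySQL 8.0 driver.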
wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-8.0.29.zip
- Decompress the package.
unzip mysql-connector-java-8.0.29.zip
- Move the obtained file to the lib directory of the Canal adapter.
mv mysql-connector-java-8.0.29/mysql-connector-java-8.0.29.jar lib/
- Add permissions.
chmod 777 lib/mysql-connector-java-8.0.29.jar
chmod +st lib/mysql-connector-java-8.0.29.jar
- Delete the MySQL 5.x driver.
rm -rf lib/mysql-connector-java-5.1.40.jar