Synchronize data to RDS
Prerequisites
1. Prepare an RDS instance and a table schema
Create an RDS instance in the RDS console. You can select either the classic network or a VPC. When you create a DataHub sync task, you must select the correct network type.
When DataHub synchronizes data, it maps the data types of the DataHub fields to the corresponding RDS data types. The following table describes the mappings.
DataHub | RDS |
TINYINT | TINYINT |
SMALLINT | SMALLINT |
INTEGER | INT |
BIGINT | BIGINT |
STRING | VARCHAR |
BOOLEAN | BOOLEAN / TINYINT |
FLOAT | FLOAT |
DOUBLE | DOUBLE |
TIMESTAMP | TIMESTAMP / BIGINT |
DECIMAL | DECIMAL |
The TINYINT, SMALLINT, INTEGER, and FLOAT data types in DataHub are supported starting from Java SDK version 2.16.1-public.
2. RDS whitelist and internal endpoint
When you use the DataHub sync feature, you must configure an IP whitelist in the RDS console to allow the DataHub service to access your RDS instance. For the IP address ranges of the DataHub service, see Overview. Additionally, when you create the DataHub sync task, you must enter the internal endpoint of the RDS instance to ensure network connectivity.
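As a rough illustration of what the whitelist requirement means, the following sketch checks whether a source address falls inside a set of whitelisted CIDR blocks. The blocks used here are placeholder documentation ranges, not real DataHub addresses; take the actual ranges from the DataHub documentation referenced above.

```python
import ipaddress

# Hypothetical placeholder ranges (RFC 5737 documentation blocks).
# These are NOT the real DataHub service addresses.
WHITELIST = ["192.0.2.0/24", "198.51.100.0/24"]


def is_whitelisted(ip, cidr_blocks):
    """Return True if ip falls inside any of the given CIDR blocks."""
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(block) for block in cidr_blocks)


print(is_whitelisted("192.0.2.17", WHITELIST))
print(is_whitelisted("203.0.113.5", WHITELIST))
```

An address outside every whitelisted block is rejected by the database, which is why the DataHub ranges must be added before the sync task is created.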
3. Sync notes
DataHub only supports synchronizing TUPLE data to the RDS service.
By default, TIMESTAMP data in DataHub is converted to the RDS TIMESTAMP type with microsecond precision. Ensure that the data you write uses the matching precision.
To avoid lock contention from concurrent reads and writes on the same primary key, ensure that data with the same primary key is written to the same DataHub shard.
When you use a VPC, ensure that the DataHub topic and the RDS instance are in the same region.
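One way to keep all records with the same primary key on the same shard is to derive the shard from a stable hash of the key on the producer side. The sketch below shows only that idea. The function name shard_for_key and the record layout are hypothetical, and this is not the DataHub SDK: how a producer actually targets a shard depends on the SDK you use.

```python
import hashlib


def shard_for_key(primary_key, shard_count):
    """Map a primary key to a shard index in [0, shard_count).

    Uses MD5 only as a stable, process-independent hash
    (Python's built-in hash() is randomized for strings).
    """
    digest = hashlib.md5(str(primary_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Hypothetical records; "id" plays the role of the primary key.
records = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 1, "v": "c"}]
for record in records:
    print(record, "-> shard index", shard_for_key(record["id"], 4))
```

Both records with id 1 map to the same index, so their writes to the database are serialized through one shard. Note that the mapping changes if the shard count changes.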
Create a sync task
Go to the Project List > Project Details > Topic Details page.
Click the + Sync button in the upper-right corner to create a sync task.
The following section describes some of the configuration parameters for creating a sync task in the console. For more flexible operations, you can use the SDK.
Host: The endpoint of the RDS service. You must enter the internal endpoint to ensure connectivity.
Import fields: You can configure DataHub to synchronize the content of specific columns to the RDS table.
Write mode: This includes the IGNORE and OVERWRITE modes.
IGNORE: Ignores duplicate data. This mode uses an INSERT IGNORE INTO statement.
OVERWRITE: Updates duplicate data. This mode uses a REPLACE INTO statement.
VpcId and Instance ID: If your RDS instance is in a VPC, you must provide the VpcId and RDS instance information.
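To make the difference between the two write modes concrete, the following sketch builds the statement form that each mode corresponds to. The helper build_statement, the table name, and the column names are hypothetical. The exact SQL that DataHub generates is not shown in this document; only the INSERT IGNORE INTO and REPLACE INTO forms come from the description above.

```python
def build_statement(table, columns, mode):
    """Return a parameterized SQL statement for the given write mode."""
    verbs = {"IGNORE": "INSERT IGNORE INTO", "OVERWRITE": "REPLACE INTO"}
    if mode not in verbs:
        raise ValueError(f"unknown write mode: {mode}")
    column_list = ", ".join(f"`{c}`" for c in columns)
    placeholders = ", ".join(["%s"] * len(columns))
    return f"{verbs[mode]} `{table}` ({column_list}) VALUES ({placeholders})"


# Hypothetical table and columns, for illustration only.
print(build_statement("orders", ["id", "name"], "IGNORE"))
print(build_statement("orders", ["id", "name"], "OVERWRITE"))
```

With IGNORE, a row whose primary key already exists is left unchanged. With OVERWRITE, the existing row is replaced by the new one.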
Sync example
Create an RDS instance and a data table in the RDS console, as shown in the following figure:

Create a DataHub topic. In this example, a TUPLE topic is created.
Create a sync task. For this example, set the write mode to IGNORE and import all fields.
Write TUPLE data to the DataHub topic. The following figure shows the four data records.

Confirm the synchronized data. Use a MySQL client to connect to the RDS service and view the data. The following figure shows the result:

Synchronize data to MySQL
Prerequisites
1. Prepare a MySQL instance and a table schema. Create a MySQL instance in the MySQL console.
When DataHub synchronizes data, it maps the data types of the DataHub fields to the corresponding MySQL data types. The following table describes the mappings.
DataHub | MySQL |
TINYINT | TINYINT |
SMALLINT | SMALLINT |
INTEGER | INT |
BIGINT | BIGINT |
STRING | VARCHAR |
BOOLEAN | BOOLEAN / TINYINT |
FLOAT | FLOAT |
DOUBLE | DOUBLE |
TIMESTAMP | TIMESTAMP / BIGINT |
DECIMAL | DECIMAL |
The TINYINT, SMALLINT, INTEGER, and FLOAT data types in DataHub are supported starting from Java SDK version 2.16.1-public.
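The mapping table above can be used to derive a target table definition from a DataHub schema. The following is a minimal sketch under several assumptions that are not part of this document: the VARCHAR length, the DECIMAL precision and scale, the choice of TIMESTAMP(6) for microsecond precision, and the helper name create_table_sql are all illustrative. Where the table offers two target types, the sketch picks the first.

```python
# DataHub type -> MySQL column type, following the mapping table.
# Lengths and precisions are assumptions; adjust them to your data.
TYPE_MAP = {
    "TINYINT": "TINYINT",
    "SMALLINT": "SMALLINT",
    "INTEGER": "INT",
    "BIGINT": "BIGINT",
    "STRING": "VARCHAR(255)",      # assumed length
    "BOOLEAN": "BOOLEAN",          # TINYINT is the alternative
    "FLOAT": "FLOAT",
    "DOUBLE": "DOUBLE",
    "TIMESTAMP": "TIMESTAMP(6)",   # BIGINT is the alternative
    "DECIMAL": "DECIMAL(20, 6)",   # assumed precision and scale
}


def create_table_sql(table, schema, primary_key):
    """Build a CREATE TABLE statement from (field name, DataHub type) pairs."""
    lines = [f"  `{name}` {TYPE_MAP[dtype]}" for name, dtype in schema]
    lines.append(f"  PRIMARY KEY (`{primary_key}`)")
    return f"CREATE TABLE `{table}` (\n" + ",\n".join(lines) + "\n)"


# Hypothetical topic schema.
schema = [("id", "BIGINT"), ("name", "STRING"), ("created", "TIMESTAMP")]
print(create_table_sql("demo", schema, "id"))
```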
2. Sync notes
Note the following when you perform a sync operation:
DataHub only supports synchronizing TUPLE data to the MySQL service.
By default, TIMESTAMP data in DataHub is converted to the MySQL TIMESTAMP type with microsecond precision. Ensure that the data you write uses the matching precision.
To avoid lock contention from concurrent reads and writes on the same primary key, ensure that data with the same primary key is written to the same DataHub shard.
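The precision note matters because a value written in the wrong unit is still a valid integer but lands on the wrong date. The sketch below assumes that a TIMESTAMP value is an integer count of microseconds since the Unix epoch and renders it in UTC; both points are assumptions for illustration, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def micros_to_timestamp(micros):
    """Render microseconds since the epoch as a TIMESTAMP(6)-style string (UTC)."""
    moment = EPOCH + timedelta(microseconds=micros)
    return moment.strftime("%Y-%m-%d %H:%M:%S.%f")


def millis_to_micros(millis):
    """Scale a millisecond value up to microseconds before writing it."""
    return millis * 1000


print(micros_to_timestamp(1_600_000_000_123_456))
# 2020-09-13 12:26:40.123456

# A millisecond value written without scaling ends up in January 1970.
print(micros_to_timestamp(1_600_000_000_123))
# 1970-01-19 12:26:40.000123

print(micros_to_timestamp(millis_to_micros(1_600_000_000_123)))
# 2020-09-13 12:26:40.123000
```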
Create a sync task
Go to the Project List > Project Details > Topic Details page.
Click the + Sync button in the upper-right corner to create a sync task.
The following section describes some of the configuration parameters for creating a sync task in the console. For more flexible operations, you can use the SDK.
Host: The endpoint of the MySQL service. You must enter the internal endpoint to ensure connectivity.
Import fields: You can configure DataHub to synchronize the content of specific columns to the MySQL table.
Write mode: This includes the IGNORE and OVERWRITE modes.
IGNORE: Ignores duplicate data. This mode uses an INSERT IGNORE INTO statement.
OVERWRITE: Updates duplicate data. This mode uses a REPLACE INTO statement.
Sync example
Create a MySQL instance and a data table in the MySQL console.
Create a DataHub topic. In this example, a TUPLE topic is created. The following figure shows its schema.
Create a sync task. For this example, set the write mode to IGNORE and import all fields.
Write TUPLE data to the DataHub topic. The following figure shows the four data records.
Confirm the synchronized data. Use a MySQL client to connect to the MySQL service and view the data. The following figure shows the result.
Synchronize data to AnalyticDB for MySQL 3.0
Prerequisites
1. Prepare an AnalyticDB for MySQL instance and a table schema
Create an AnalyticDB for MySQL instance in the AnalyticDB for MySQL console. You can select either the classic network or a VPC. When you create a DataHub sync task, you must select the correct network type.
When DataHub synchronizes data, it maps the data types of the DataHub fields to the corresponding AnalyticDB for MySQL data types. The following table describes the mappings.
DataHub | AnalyticDB for MySQL |
TINYINT | TINYINT |
SMALLINT | SMALLINT |
INTEGER | INT |
BIGINT | BIGINT |
STRING | VARCHAR |
BOOLEAN | BOOLEAN / TINYINT |
FLOAT | FLOAT |
DOUBLE | DOUBLE |
TIMESTAMP | TIMESTAMP / BIGINT |
DECIMAL | DECIMAL |
The TINYINT, SMALLINT, INTEGER, and FLOAT data types in DataHub are supported starting from Java SDK version 2.16.1-public.
2. AnalyticDB for MySQL whitelist and internal endpoint
When you use the DataHub sync feature, you must configure an IP whitelist in the AnalyticDB for MySQL console to allow the DataHub service to access your AnalyticDB for MySQL instance. For the IP address ranges of the DataHub service, see FAQ. Additionally, when you create the DataHub sync task, you must enter the internal endpoint of the AnalyticDB for MySQL instance to ensure network connectivity.
3. Sync notes
Note the following when you perform a sync operation:
DataHub only supports synchronizing TUPLE data to the AnalyticDB for MySQL service.
By default, TIMESTAMP data in DataHub is converted to the AnalyticDB for MySQL TIMESTAMP type with microsecond precision. Ensure that the data you write uses the matching precision.
To avoid lock contention from concurrent reads and writes on the same primary key, ensure that data with the same primary key is written to the same DataHub shard.
When you use a VPC, ensure that the DataHub topic and the AnalyticDB for MySQL instance are in the same region.
Create a sync task
Go to the Project List > Project Details > Topic Details page.
Click the + Sync button in the upper-right corner to create a sync task.
The following section describes some of the configuration parameters for creating a sync task in the console. For more flexible operations, you can use the SDK.
Host: The endpoint of the AnalyticDB for MySQL service. You must enter the internal endpoint to ensure connectivity.
Import fields: You can configure DataHub to synchronize the content of specific fields to the AnalyticDB for MySQL table.
Write mode: This includes the IGNORE and OVERWRITE modes.
IGNORE: Ignores duplicate data. This mode uses an INSERT IGNORE INTO statement.
OVERWRITE: Updates duplicate data. This mode uses a REPLACE INTO statement.
How to obtain the instance ID
Open the API reference page for DescribeDBClusterAttribute (Query the detailed properties of a cluster).
Click the Debug button. On the debug page, select an endpoint, enter the DBClusterId (the cluster ID of your AnalyticDB for MySQL Data Warehouse Edition instance), and initiate the call.
On the results page, the value of the VPCCloudInstanceId field is the required instance ID.
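If you save the response from the debug page, the field can also be extracted programmatically. The sketch below searches a parsed JSON document for the VPCCloudInstanceId field without relying on its exact position. The response body shown is a hypothetical, abbreviated example with placeholder IDs, and the helper find_field is illustrative rather than part of any SDK.

```python
import json


def find_field(node, field):
    """Depth-first search for the first occurrence of field in parsed JSON."""
    if isinstance(node, dict):
        if field in node:
            return node[field]
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_field(child, field)
        if found is not None:
            return found
    return None


# Hypothetical, abbreviated response body; the IDs are placeholders.
response_text = """
{"Items": {"DBCluster": [
    {"DBClusterId": "am-example", "VPCCloudInstanceId": "am-example-vpc"}
]}}
"""
print(find_field(json.loads(response_text), "VPCCloudInstanceId"))
```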
Sync example
Create an AnalyticDB for MySQL instance and table schema.
Create a DataHub topic. In this example, a TUPLE topic is created.
Create a sync task. For this example, set the write mode to IGNORE and import all fields.
Write TUPLE data to the DataHub topic. The following figure shows the four data records.
Confirm the synchronized data. Use a MySQL client to connect to the AnalyticDB for MySQL service and view the data.