Synchronize data to AnalyticDB for MySQL

Preparations

1. Prepare an AnalyticDB for MySQL instance and define a table schema. Create the AnalyticDB for MySQL cluster in the AnalyticDB for MySQL console. When data is synchronized from DataHub to the cluster, each DataHub field is written to a field of the mapped data type in AnalyticDB for MySQL. The following table describes the data type mappings.

DataHub data type	AnalyticDB for MySQL data type
TINYINT	TINYINT
SMALLINT	SMALLINT
INTEGER	INT
BIGINT	BIGINT
STRING	VARCHAR
BOOLEAN	BOOLEAN / TINYINT
FLOAT	FLOAT
DOUBLE	DOUBLE
TIMESTAMP	TIMESTAMP / BIGINT
DECIMAL	DECIMAL
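
When you define the target table schema, it can help to derive the column types from the topic schema programmatically. The following is a minimal sketch that encodes the table above as a lookup. The function name, the sample field list, and the choice of the first listed type for BOOLEAN and TIMESTAMP are assumptions made for illustration and are not part of DataHub.

```python
# Mapping copied from the table above. Where the table lists two target
# types (BOOLEAN, TIMESTAMP), the first one is used as an illustrative default.
DATAHUB_TO_ADB = {
    "TINYINT": "TINYINT",
    "SMALLINT": "SMALLINT",
    "INTEGER": "INT",
    "BIGINT": "BIGINT",
    "STRING": "VARCHAR",
    "BOOLEAN": "BOOLEAN",
    "FLOAT": "FLOAT",
    "DOUBLE": "DOUBLE",
    "TIMESTAMP": "TIMESTAMP",
    "DECIMAL": "DECIMAL",
}


def target_columns(topic_fields):
    """Return (name, target type) pairs for (name, DataHub type) pairs."""
    columns = []
    for name, datahub_type in topic_fields:
        key = datahub_type.upper()
        if key not in DATAHUB_TO_ADB:
            raise ValueError(f"no mapping listed for DataHub type {datahub_type!r}")
        columns.append((name, DATAHUB_TO_ADB[key]))
    return columns


# Hypothetical topic schema, used only for illustration.
example = target_columns([("id", "BIGINT"), ("name", "STRING"), ("score", "DOUBLE")])
```

A type that is not listed in the table raises an error instead of being mapped silently, so a schema mismatch surfaces before you create the DataConnector.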

The TINYINT, SMALLINT, INTEGER, and FLOAT types in DataHub are supported in DataHub SDK for Java V2.16.1-public and later.

2. Synchronization description:

(1) DataHub allows you to synchronize only data of the TUPLE type to AnalyticDB for MySQL.
(2) ReplaceInto and IgnoreInto modes: In ReplaceInto mode, the REPLACE INTO statement is executed to insert data into the database. In IgnoreInto mode, the INSERT IGNORE INTO statement is executed to insert data into the database. When a primary key conflict occurs, the REPLACE INTO statement overwrites the existing row with the new data, whereas the INSERT IGNORE INTO statement skips the conflicting row and keeps the existing row. Rows that do not conflict are written in both modes.
(3) The normal synchronization latency is a matter of seconds. In other words, data is synchronized to AnalyticDB for MySQL within a few seconds after it is written to DataHub. Data is synchronized at least once, so network exceptions may cause the same data to be written more than once.
(4) The write performance of AnalyticDB for MySQL affects the synchronization performance. If the performance of an AnalyticDB for MySQL instance is poor, the synchronization process may be slow, and data may accumulate. In extreme cases, data may be lost because the synchronization latency exceeds the lifecycle of the data in DataHub.
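
The difference between the two modes, and the effect of at-least-once delivery, can be illustrated with a toy model. The sketch below uses a Python dict keyed by primary key as a stand-in for the target table. It does not connect to DataHub or AnalyticDB for MySQL, and the function name and sample records are hypothetical.

```python
def apply_records(table, records, mode):
    """Apply (primary_key, row) records to a dict that stands in for a table."""
    for key, row in records:
        if mode == "ReplaceInto":
            # Like REPLACE INTO: on a key conflict the new row wins.
            table[key] = row
        elif mode == "IgnoreInto":
            # Like INSERT IGNORE INTO: on a key conflict the existing row wins.
            table.setdefault(key, row)
        else:
            raise ValueError(f"unknown mode: {mode}")
    return table


# Two records share primary key 1, so the batch contains a conflict.
batch = [(1, "a"), (2, "b"), (1, "c")]
# Simulate a retry that delivers the whole batch a second time.
redelivered = batch + batch

replace_once = apply_records({}, batch, "ReplaceInto")
replace_twice = apply_records({}, redelivered, "ReplaceInto")
ignore_once = apply_records({}, batch, "IgnoreInto")
ignore_twice = apply_records({}, redelivered, "IgnoreInto")
```

In this model, redelivering the complete batch in the same order leaves the final table unchanged in both modes. The two modes still end in different states for key 1: ReplaceInto keeps the last value written, and IgnoreInto keeps the first.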

Create a DataConnector

In the left-side navigation pane of the DataHub console, click Project Manager. On the Project List page, find a project and click View in the Actions column. On the details page of the project, find a topic and click View in the Actions column.

The following list describes some of the parameters that are used to create a DataConnector in the DataHub console. For more information about synchronization configurations, see the descriptions of DataHub SDKs.

Host: the endpoint of AnalyticDB for MySQL. You must enter the internal endpoint so that DataHub can connect to AnalyticDB for MySQL.
Import Fields: the fields to be synchronized to AnalyticDB for MySQL. You can synchronize all the fields of the DataHub topic or part of them based on your business requirements.
Write Mode: the write mode. Valid values:
- IGNORE: This mode ignores duplicate data. The INSERT IGNORE INTO statement is executed to write data.
- OVERWRITE: This mode updates duplicate data. The REPLACE INTO statement is executed to write data.
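
To make the relationship between Write Mode, Import Fields, and the executed statement concrete, the following sketch builds the kind of parameterized statement that each mode corresponds to. It is an illustration only: the exact statement text that DataHub generates is not documented here, and the function name, table name, and field names are hypothetical. The `%s` placeholder style is the one used by common Python MySQL drivers.

```python
# Statement keyword for each Write Mode value, as described in the list above.
WRITE_MODE_KEYWORD = {
    "IGNORE": "INSERT IGNORE INTO",
    "OVERWRITE": "REPLACE INTO",
}


def build_statement(write_mode, table, import_fields):
    """Build a parameterized write statement for the selected fields."""
    if write_mode not in WRITE_MODE_KEYWORD:
        raise ValueError(f"unsupported write mode: {write_mode!r}")
    if not import_fields:
        raise ValueError("at least one import field is required")
    columns = ", ".join(f"`{field}`" for field in import_fields)
    placeholders = ", ".join(["%s"] * len(import_fields))
    keyword = WRITE_MODE_KEYWORD[write_mode]
    return f"{keyword} `{table}` ({columns}) VALUES ({placeholders})"


# Hypothetical table and field names.
ignore_sql = build_statement("IGNORE", "orders", ["id", "name"])
overwrite_sql = build_statement("OVERWRITE", "orders", ["id", "name"])
```

Selecting only part of the topic fields under Import Fields corresponds to a shorter column list. Columns that are left out take their default values in the target table.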

Example

1. Create an AnalyticDB for MySQL instance and a table with the defined table schema in the AnalyticDB for MySQL console.

2. Create a topic in the DataHub console. In this example, the created topic is of the TUPLE type. The following figure shows the Schema Details tab of the created topic.

3. Create a DataConnector. Select IGNORE from the Write Mode drop-down list and all fields from the Import Fields drop-down list.

4. Write data of the TUPLE type to the created topic. The following figure shows the four records that are written to the topic.

Synchronize data to AnalyticDB for MySQL V3.0

For more information, see the "Synchronize data to AnalyticDB for MySQL V3.0" section of the Synchronize data to ApsaraDB RDS, ApsaraDB RDS for MySQL, and AnalyticDB for MySQL V3.0 topic.