edit-icon download-icon

Configure AnalyticDB writer

Last Updated: Sep 14, 2018

AnalyticDB Writer provides the ability to write data into Analytic DB (ADS) in two modes:

  • Load Data (batch import): Transfers and loads data from the data source to ADS.

    • Advantage: Imports large volume of data (> 1 KW) at a high speed.

    • Disadvantage: Authorization from the third party is required.

  • Insert Ignore (real‑time insert): Directly writes data to ADS.

    • Advantage: Writes a small volume of data (< 1 KW) at a high speed.

    • Disadvantage: Not suitable for writing a large volume of data because of low speed.

You must configure the data source before configuring the AnalyticDB Writer plug‑in. For more information, see AnalyticDB data source config.

AnalyticDB Writer supports the following data types in AnalyticDB:

Type AnalyticDB data type
Integer int, tinyint, smallint, int, bigint
Floating point float, double, decimal
String varchar
Date date
Boolean bool

Prerequisites

  • Before importing data in Load Data mode with a MaxCompute table as the data source, you must grant Describe and Select permissions to the table to the import account of AnalyticDB in MaxCompute.

    Public cloud accounts are garuda_build@aliyun.com and garuda_data@aliyun.com.(Authorization is required for both accounts.) For import accounts of private clouds, see the configuration documents of relevant private clouds. Generally, the import account of a private cloud is test1000000009@aliyun.com.

    Command for granting permissions:

    1. USE projectname;– The MaxCompute project to which the table belongs.
    2. ADD USER ALIYUN$xxxx@aliyun.com;– Enter a correct cloud account (when adding user for the first time).
    3. GRANT Describe,Select ON TABLE table_name TO USER ALIYUN$xxxx@aliyun.com;– Enter the table for which permissions are granted and a correct cloud account.

    To make sure your data security, only the data from the MaxCompute Project in which operator is Project Owner or MaxCompute table owner can be imported to AnalyticDB. (For most of private clouds, no such limitation is imposed.)

  • ADS Writer only supports being written from MaxCompute data source. If other data sources need to be written to ADS, write to MaxCompute first and then write to ADS.

Parameter description

  • url

    • Description: The ADS URL. Format: ip:port.
    • Required: Yes

    • Default value: None

  • schema

    • Description: ADS schema name.

    • Required: Yes

    • Default value: None
  • username

    • Description: ADS username. It is accessId currently.
    • Required: Yes

    • Default value: None

  • password

    • Description: ADS password. It is accessKey currently.
    • Required: Yes

    • Default value: None

  • datasource

    • Description: Data source name. It must be identical to the data source name added. Adding data source is supported in script mode.

    • Required: Yes

    • Default value: None
  • table

    • Description: Target table name.

    • Required: Yes

    • Default value: None
  • partition

    • Description: Partition name of a target table. If the target table is a partition table, this field is required. If the Reader is MaxCompute, and AnalyticDB Writer imports data in Load mode, partition of MaxCompute only supports the following three configurations (take two-level partitions as an example):

      • “partition” :[“ pt=*, ds=*“] (reads the data from all the partitions under the table)
      • “partition”:[“pt=1,ds=*”] (reads the data from all the secondary partitions under the primary partition pt=1 under the table)
      • “partition”:[“pt=1,ds=hangzhou”] (reads the data from the secondary partition ds=hangzhou under the primary partition pt=1 under the table)
    • Required: No

    • Default value: None
  • writeMode

    • Description: Supports both of Load Data (batch import) and Insert Ignore (real-time insert) modes.

    • Required: Yes

    • Default value: None
  • column

    • Description: The list of fields in the target table. The value can be [“*”] or a list of specific fields, such as [“a”, “b”, “c”].
    • Required: Yes

    • Default value: None

  • overWrite

    • Description: Whether to overwrite the current target table when writing data to ADS. A value of “true” means “Overwrite”, and “false” means “Do not overwrite (append)”. This value takes effect only if the writeMode is Load.

    • Required: Yes

    • Default value: None
  • lifeCycle

    • Description: The lifecycle of an ADS temporary table. This value takes effect only if the writeMode is Load.

    • Required: Yes

    • Default value: None
  • suffix

    • Description: The ADS URL configuration item takes a format of ip:port, which changes to a JDBC database connection string upon access to the ADS database. This parameter is a custom connection string and is optional.(See the JDBC control parameters supported by MySQL.) For example, configure the suffix to autoReconnect=true&failOverReadOnly=false&maxReconnects=10.

    • Required: No

    • Default value: None
  • opIndex

    • Description: Subscript of the Operation Type column of ADS peer storage, which starts from 0. This value takes effect only if the writeMode is stream.

    • Required: It is required if the writeMode is stream.

    • Default value: None

  • batchSize

    • Description: Number of data items of each batch committed to ADS. This value takes effect only if the writeMode is insert.

    • Required: It is required if the writeMode is insert.

    • Default value: None
  • bufferSize

    • Description: Size of DataX data buffer. The data from the source is collected to this buffer for a sorting before being committed to ADS. The sorting is done by ADS partition columns so that data is organized in an order that is more friendly for ADS server to improve performance. The data in the buffer with a size of BufferSize is committed to ADS in batches with a size of batchSize. The bufferSize value must be set to a multiple of batchSize. This value takes effect only if the writeMode is insert.

    • Required: It is required if the writeMode is insert.

    • Default value: This feature is not enabled by default.

Development in wizard mode

TargetRDS

  • Data Sources: datasource in the preceding parameter description. Select the ADS data source.

  • Table: table in the preceding parameter description. Select the target table.

  • Preparations before import statement: before the implementation of data synchronization tasks who took the lead in the execution of sql.

  • When we include prepared statements: the implementation of data synchronization tasks after the implementation of sql.

  • key conflict:

    • Before data importing, all the data in the table or partition must be cleared, which is equivalent to Replace into.

    • No data is cleared before data importing and new data is always appended with each run, which is equivalent to Insert Into.

  • Field Mapping: column in the preceding parameter description.

    fieldmapping

Development in script mode

Import data in Load Data mode:

  1. {
  2. "type": "job",
  3. "version": "1.0",
  4. "configuration": {
  5. "reader": {
  6. },
  7. "writer": {
  8. "plugin": "ads",
  9. "parameter": {
  10. "datasource": "datasourceName",
  11. "writeMode": "load",
  12. "table": "table",
  13. "partition": "",
  14. "overWrite": true
  15. }
  16. }
  17. }
  18. }

Import data in Insert Ignore mode:

  1. {
  2. "type": "job",
  3. "version": "1.0",
  4. "configuration": {
  5. "reader": {
  6. },
  7. "writer": {
  8. "plugin": "ads",
  9. "parameter": {
  10. "datasource": "datasourceName",
  11. "writeMode": "insert",
  12. "table": "table",
  13. "column": ["*"]
  14. }
  15. }
  16. }
  17. }
Thank you! We've received your feedback.