This topic provides answers to frequently asked questions about batch synchronization.

What can I do if an error is returned when I edit a batch sync node?

  • Symptom: When a batch sync node is edited, the data store cannot be accessed and the table schema cannot be obtained. The error message table XXX does not exists with the error code CDP_DATASOURCE_ERROR is returned, even though the table exists and the auto triggered node runs properly.
  • Workaround: In a workspace in standard mode, you must create a separate connection for each of the production and development environments. Create a connection in the development environment on the Data Source page and commit the MaxCompute table to the development environment.

What can I do if a batch sync node is abnormal?

  • Symptom: A node scheduled to run from 00:00 to 23:59 depends on the ancestor node scheduled to run from 08:00 to 23:59. After 08:00, nine node instances are run simultaneously. One instance is abnormal, whereas the other eight instances are normal.
  • Workaround: The duplicate entry error may occur if the instances of a node conflict with each other during running. If a node is frequently scheduled to run, we recommend that you enable the node to depend on its instance in the last cycle.

How do I handle a database connection failure?

  • Symptom: An Oracle connection is configured in Data Integration and the connectivity test is passed. However, when a Data Integration node is configured and run, the following error message is returned, indicating that the database connection failed:
    ErrorMessage:Code:[DBUtilErrorCode-10], Description:[The database connection failed.  Check the username, password, database name, IP address, port number, and network environment or ask the database administrator for help.] 
    • The database connection failed because the Java Database Connectivity (JDBC) URL specified by jdbc:oracle:thin:@X.X.X.X:XXXXj cannot be accessed. Check and modify your configurations.
    • java.lang.Exception: Data Integration cannot connect to the database. Possible causes are as follows:
      • The IP address, port number, database name, or JDBC URL is incorrect.
      • The authorization failed because the username or password is incorrect.

      Confirm with the database administrator that the configurations of the database are correct.

  • Workaround: Synchronize data from Oracle to MySQL and check whether the MySQL data store connects to the Internet. If the MySQL data store connects to the Internet, change the connection type to JDBC Connection Mode. Data Integration selects an Elastic Compute Service (ECS) instance for batch synchronization based on the JDBC URL you set.

    If you set the connection type to ApsaraDB for RDS, an ECS instance that cannot access the Internet is selected to run the sync node, which may lead to the Oracle connection failure.

How do I handle a node failure caused by reserved keywords in columns?

  • Symptom: The relevant log indicates that an SQL statement failed because columns contain reserved keywords.
    2017-05-31 14:15:20.282 [33881049-0-0-reader] ERROR ReaderRunner - Reader runner Received Exceptions:com.alibaba.datax.common.exception.DataXException: Code:[DBUtilErrorCode-07]
  • Analysis: Data cannot be read from the database. Check the column, table, where, and querySql parameters you have configured or ask the database administrator for help.
    SQL statement:
    select index,plaid,plarm,fget,fot,havm,coer,ines,oumes from xxx
    Error message:
    You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near index,plaid,plarm,fget,fot,havm,coer,ines,oumes from xxx
  • Troubleshooting:
    1. Execute the following SQL statement on the local server: select index,plaid,plarm,fget,fot,havm,coer,ines,oumes from xxx. Check the result and the error messages that may be returned.
    2. Enclose each reserved keyword in backticks (` `) or rename the column.
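    In the sync node configuration, the escaping can be done directly in the column list. The following is a minimal sketch based on the column names from the example above; only the backticks around the reserved keyword index matter:

```json
"column": [
    "`index`",
    "plaid",
    "plarm",
    "fget"
]
```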

What can I do if a node fails because a table name contains single quotation marks (' ') enclosed in double quotation marks (" ")?

  • Symptom: The relevant log indicates that an SQL statement failed because a table name contains single quotation marks (' ') enclosed in double quotation marks (" ").
    com.alibaba.datax.common.exception.DataXException: Code:[DBUtilErrorCode-07]
  • Analysis: Data cannot be read from the database. Check the column, table, where, and querySql parameters you have configured or ask the database administrator for help.
    SQL statement:
    select /*+read_consistency(weak) query_timeout(100000000)*/ * from 'ql_ddddd_[0-31]' where 1=2
    Error message:
    You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near "ql_live_speaks[0-31]' where 1=2' at line 1 - com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near "ql_ddddd_[0-31]' where 1=2'
  • Troubleshooting: The error message is returned because the table name contains single quotation marks (' ') enclosed in double quotation marks (" "), for example, "table":["'ql_ddddd_[0-31]'"]. In this case, delete the single quotation marks (' ').
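    With the inner single quotation marks removed, the table setting looks like the following sketch (the table name is the placeholder from the example above):

```json
"table": ["ql_ddddd_[0-31]"]
```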

What can I do if a connection fails the connectivity test with the error message that starts with "Access denied for" returned?

  • Symptom: The database cannot be accessed.

    Database URL: jdbc:mysql://xx.xx.xx.x:3306/t_demo. Username: fn_test. Error message: Access denied for user 'fn_test'@'%' to database 't_demo'.

  • Troubleshooting:
    • An error message that starts with "Access denied for" is returned because the information you specified is incorrect. Check the configurations you specified.
    • Check whether the whitelist is properly configured and whether the specified account has the permission to access the database. You can configure the whitelist and grant required permissions in the ApsaraDB for RDS console.

How do I deal with a routing policy error?

  • Symptom: The routing policy for an OXS cluster is different from that for an ECS cluster.
    2017-08-08 15:58:55 : Start Job[xxxxxxx], traceId running in Pipeline[basecommon_group_xxx_cdp_oxs] ErrorMessage:Code:[DBUtilErrorCode-10]
  • Analysis: The database connection failed. Check the username, password, database name, IP address, port number, and network environment or ask the database administrator for help.
  • Troubleshooting: The database connection failed because the JDBC URL specified by jdbc:oracle:thin:@xxx.xxxxx.x.xx:xxxx:prod cannot be accessed. Check and modify your configurations.

What can I do if Data Integration cannot connect to a database?

  • Symptom: Data Integration cannot connect to the target database. The error java.lang.Exception is reported.
  • Analysis:
    • The IP address, port number, database name, or JDBC URL is incorrect.
    • The authorization failed because the username or password is incorrect. Confirm with the database administrator that the configurations of the database are correct.
  • Troubleshooting:
    Scenario 1:
    • If you need to synchronize data from an Oracle database to an ApsaraDB RDS for PostgreSQL instance, you can only click Run to run manually triggered nodes. Auto triggered nodes are not supported for such synchronization because different clusters are required.
    • When you create a connection for an ApsaraDB RDS for PostgreSQL instance, you can use a normal JDBC URL. Then, auto triggered nodes can be used to synchronize data from Oracle to ApsaraDB RDS for PostgreSQL.
    Scenario 2:
    • You cannot run nodes related to ApsaraDB RDS for PostgreSQL instances that reside on custom resource groups in Virtual Private Clouds (VPCs). This is because ApsaraDB RDS for PostgreSQL instances in VPCs use the reverse proxy feature, which can lead to network issues between the instances and custom resource groups. We recommend that you run such nodes on the default resource group. If you need to run such nodes on a custom resource group, create a connection for an ApsaraDB RDS for PostgreSQL instance by using a JDBC URL and create an ECS instance in the same Classless Inter-Domain Routing (CIDR) block.
    • The URL of an ApsaraDB for RDS instance that resides in a VPC contains an IP address. A sample URL is jdbc:mysql://100.100.70.1:4309/xxx, where 100.100.70.1 is an IP address. In contrast, the URL of an ApsaraDB for RDS instance that does not reside in a VPC contains a domain name.

What can I do if an error occurs when HBase Writer is configured to write data of the DATE type?

  • Symptom: An error occurs when HBase Writer is configured to write data of the DATE type during synchronization from one HBase database to another.
    2017-08-15 11:19:29 : State: 4(FAIL) | Total: 0R 0B | Speed: 0R/s 0B/s | Error: 0R 0B | Stage: 0.0% ErrorMessage:Code:[Hbasewriter-01]
  • Analysis: You specified an invalid data type.

    HBase Writer does not support writing data of the DATE type. Currently, it only supports the following data types: STRING, BOOLEAN, SHORT, INT, LONG, FLOAT, and DOUBLE.

  • Troubleshooting:
    • Do not configure HBase Writer to write data of the DATE type.
    • Change the data type to STRING. This is because HBase does not support typed values but stores all data as byte arrays.
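    As an illustration, a column entry that writes the value as a plain string might look like the following. The field names (index, name, type) follow common DataX HBase Writer conventions and should be treated as an assumption, and the column family and qualifier are hypothetical:

```json
{
    "index": 1,
    "name": "cf1:gmt_create",
    "type": "string"
}
```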

What can I do if the specified configurations are not in the correct JSON format?

  • Symptom: The column configurations are incorrect.
    The intelligent analysis results of Data Integration show that the most possible cause is as follows:
    com.alibaba.datax.common.exception.DataXException: Code:[Framework-02]
  • Analysis: The Data Integration engine encountered an error when running. The following diagnostic information is prompted when Data Integration stops running:
    java.lang.ClassCastException: com.alibaba.fastjson.JSONObject cannot be cast to java.lang.String
  • Troubleshooting: The specified configurations are not in the correct JSON format. Modify the configurations.
    For the writer: 
    "column":[ 
    { 
    "name":"busino", 
    "type":"string" 
    } 
    ] 
    Use the following correct format:
    "column":[
    "busino"
    ]

What can I do if brackets ([ ]) are missing in a JSON list?

  • Symptom: Brackets ([ ]) are missing in the JSON list.
    The intelligent analysis results of Data Integration show that the most possible cause is as follows:
    com.alibaba.datax.common.exception.DataXException: Code:[Framework-02]
  • Analysis: The Data Integration engine encountered an error when running. The following diagnostic information is prompted when Data Integration stops running:
    java.lang.String cannot be cast to java.util.List - java.lang.String cannot be cast to java.util.List  
    at com.alibaba.datax.common.exception.DataXException.asDataXException(DataXException.java:41)
  • Troubleshooting: If brackets ([ ]) are missing, a list will be recognized as another data type. Add brackets ([ ]) if necessary.
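    For example, if the table parameter is written as the plain string "table": "xxx", the framework fails with the cast error above. Adding the brackets turns the value into the expected list (the table name is a placeholder):

```json
"table": ["xxx"]
```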

How do I handle a node failure caused by lack of permissions?

  • Lack of the permission to delete tables
    • Symptom: An error message is returned when data is synchronized from MaxCompute to ApsaraDB RDS for MySQL. The error message is as follows:
      ErrorMessage:Code:[DBUtilErrorCode-07]
    • Analysis: Data cannot be read from the database. Check the column, table, where, and querySql parameters you have configured or ask the database administrator for help.
      SQL statement:
      delete from fact_xxx_d where sy_date=20170903
      Error message:
      DELETE command denied to user 'xxx_odps'@'xx.xxx.xxx.xxx' for table 'fact_xxx_d' - com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: DELETE command denied to user 'xxx_odps'@'xx.xxx.xxx.xxx' for table 'fact_xxx_d'
    • Troubleshooting: The error message that starts with "DELETE command denied to" indicates that you are not authorized to delete the table. Obtain the required permission.
  • Lack of the permission to drop tables
    • Symptom: Data cannot be read from the database.
      Code:[DBUtilErrorCode-07]
    • Analysis: Check the column, table, where, and querySql parameters you have configured or ask the database administrator for help.

      SQL statement: truncate table be_xx_ch

      Error message:
      DROP command denied to user 'xxx'@'xxx.xx.xxx.xxx' for table 'be_xx_ch' - com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: DROP command denied to user 'xxx'@'xxx.xx.xxx.xxx' for table 'be_xx_ch'
    • Troubleshooting: The error message is returned because the TRUNCATE statement is specified in the preSql parameter and you are not authorized to drop the table. Obtain the required permission.
  • Lack of permissions on AnalyticDB for MySQL
    2016-11-04 19:49:11.504 [job-12485292] INFO  OriginalConfPretreatmentUtil - Available jdbcUrl:jdbc:mysql://100.98.249.103:3306/AnalyticDB for MySQL_rdb?yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true.  
    2016-11-04 19:49:11.505 [job-12485292] WARN  OriginalConfPretreatmentUtil
    The column configurations in your configuration file may lead to errors. You have not specified the columns that the reader needs to read from the source table. If the number of columns or the data types in the table change, the node may fail. Check and modify your configurations.
    2016-11-04 19:49:11.528 [job-12485292] INFO Writer$Job
    If you need to synchronize data from MaxCompute to AnalyticDB for MySQL, the following permissions are required:
    • The Alibaba Cloud account for AnalyticDB for MySQL must have at least the DESCRIBE and SELECT permissions on the MaxCompute table. This is because AnalyticDB for MySQL requires the schema and data information of the MaxCompute table.
    • The owner of the AccessKey that you configured for accessing AnalyticDB for MySQL must have the permission to load data to the specified AnalyticDB for MySQL database. You can log on to the AnalyticDB for MySQL console to perform the authorization.
    2016-11-04 19:49:11.528 [job-12485292] INFO Writer$Job
    If you need to synchronize data from RDS or another non-MaxCompute data store to an AnalyticDB for MySQL data store, data is first loaded as a temporary table in an intermediate MaxCompute project and then transferred to AnalyticDB for MySQL. The intermediate MaxCompute project is named example_project and its owner is someone@example.com. The following permissions are required:
    • The Alibaba Cloud account for AnalyticDB for MySQL must have at least the DESCRIBE and SELECT permissions on the MaxCompute table. This is because AnalyticDB for MySQL requires the schema and data information of the MaxCompute table. This authorization is performed by default when you deploy the database.
    • The owner of the intermediate MaxCompute project is someone@example.com. The owner must have the permission to load data to the specified AnalyticDB for MySQL database. You can log on to the AnalyticDB for MySQL console to perform the authorization.

    Troubleshooting: The error occurs because the MaxCompute project owner is not authorized to load data to the specified AnalyticDB for MySQL database.

    The owner of the intermediate MaxCompute project is someone@example.com. The DESCRIBE and SELECT permissions on the MaxCompute table, which AnalyticDB for MySQL requires for the schema and data information, are granted by default when you deploy the database. However, you must log on to the AnalyticDB for MySQL console to grant the owner the permission to load data to the specified AnalyticDB for MySQL database.

How do I handle a node failure related to the whitelist?

  • Symptom: A connection failed the connectivity test because the whitelist is not properly configured.
    error message: Timed out after 5000 ms while waiting for a server that matches ReadPreferenceServerSelector{readPreference=primary}. Client view of cluster state is {type=UNKNOWN, servers=[{address:3717=dds-bp1afbf47fc7e8e41.mongodb.rds.aliyuncs.com, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketReadException: Prematurely reached end of stream}}, {address:3717=dds-bp1afbf47fc7e8e42.mongodb.rds.aliyuncs.com, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketReadException: Prematurely reached end of stream}}]
    Troubleshooting: When you create MongoDB connections that are not in a VPC, the error message that starts with "Timed out after 5000" is returned because the whitelist is not properly configured. In this case, configure the whitelist.
    Note If you use ApsaraDB for MongoDB, the MongoDB database has a root account by default. For security reasons, Data Integration supports access to a MongoDB database only by using a MongoDB database account. When you create a MongoDB connection, do not use the root account.
  • Symptom: The whitelist is incomplete.
    Code:[DBUtilErrorCode-10]

    Analysis: The database connection failed. Check the username, password, database name, IP address, port number, and network environment or ask the database administrator for help.

    Error message:
    java.sql.SQLException: Invalid authorization specification, message from server: "#28000 ip not in whitelist, client ip is xx.xx.xx.xx".
    2017-10-17 11:03:00.673 [job-xxxx] ERROR RetryUtil - Exception when calling callable

    Troubleshooting: The whitelist is incomplete because you have not added your server IP address to the whitelist. Add your server IP address to the whitelist.

How do I handle a node failure caused by incorrect connection information?

  • Symptom: The connection information specified in the code editor is incomplete.
    2017-09-06 12:47:05 [INFO] Success to fetch meta data for table with projectId [43501] (the project ID) and instanceId [mongodb] (the connection name).
    2017-09-06 12:47:05 [INFO] Data transport tunnel is CDP.  
    2017-09-06 12:47:05 [INFO] Begin to fetch alisa account info for 3DES encrypt with parameter account: [zz_683cdbcefba143b7b709067b362d4385].  
    2017-09-06 12:47:05 [INFO] Begin to fetch alisa account info for 3DES encrypt with parameter account: [zz_683cdbcefba143b7b709067b362d4385].  
    [Error] Exception when running task, message: Configuration property [accessId] could not be blank!

    Troubleshooting: If the error message indicates that no AccessKey ID is specified, the sync node is usually configured in the code editor. View the JSON code to check whether you have specified the connection name.

  • Symptom: The connection configurations are incorrect or the connection is not configured.
    2017-10-10 10:30:08 INFO =================================================== 
    File "/home/admin/synccenter/src/Validate.py", line 16, in notNone 
    raise Exception("Configuration property [%s] could not be blank!" % (context)) 
    Exception: Configuration property [username] could not be blank!
    Troubleshooting:
    • Compare with the logs of a normal run:
      [56810] and instanceId(instanceName) [spfee_test_mysql]... 
      2017-10-09 21:09:44 [INFO] Success to fetch meta data for table with projectId [56810] and instanceId [spfee_test_mysql].
    • The logs of ApsaraDB RDS for MySQL show that an error occurs while data is loaded from the data store and that the username parameter is empty. This indicates that the connection is configured in the wrong location or is not configured at all.
  • Symptom: The connection to a Distributed Relational Database Service (DRDS) data store times out.
    When you synchronize data from MaxCompute to DRDS, the following error may occur:
    [2017-09-11 16:17:01.729 [49892464-0-0-writer] WARN CommonRdbmsWriter$Task
    Roll back the synchronization, and enable the writer to write only one row each time.
    com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
    The last packet successfully received from the server was 529 milliseconds ago. The last packet sent successfully to the server was 528 milliseconds ago.

    Troubleshooting: The error occurs because the connection from the Data Integration client to the server times out. When you create a DRDS connection, append the string ?useUnicode=true&characterEncoding=utf-8&socketTimeout=3600000 to the JDBC URL.

    Example:
    jdbc:mysql://10.183.80.46:3307/ae_coupon?useUnicode=true&characterEncoding=utf-8&socketTimeout=3600000
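    If the connection is configured in the code editor, the string is appended to the jdbcUrl field. A minimal sketch, assuming the DataX-style connection block (the host, port, database, and table are placeholders from the example above):

```json
"connection": [
    {
        "jdbcUrl": "jdbc:mysql://10.183.80.46:3307/ae_coupon?useUnicode=true&characterEncoding=utf-8&socketTimeout=3600000",
        "table": ["xxx"]
    }
]
```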
  • Symptom: An internal system error occurs.

    Troubleshooting: Generally, an internal system error occurs because the configurations are not in the correct JSON format in the development environment. If the user interface is displayed as blank, you can directly provide the workspace name and node name for consultation.

How do I handle a node failure caused by dirty data?

  • Symptom: Empty strings (String[""]) cannot be converted into the LONG data type.
    2017-09-21 16:25:46.125 [51659198-0-26-writer] ERROR WriterRunner - Writer Runner Received Exceptions:  
    com.alibaba.datax.common.exception.DataXException: Code:[Common-01]

    Analysis: Dirty data occurs during data synchronization because of infeasible data type conversion. Empty strings cannot be converted into the LONG data type.

    Troubleshooting: The two tables use the same table creation statement, but empty strings cannot be converted into the LONG data type. Configure the data type of the field as STRING.

  • Symptom: Data is out of the valid value range.
    2017-11-07 13:58:33.897 [503-0-0-writer] ERROR StdoutPluginCollector 
    Dirty data: 
    {"exception":"Data truncation: Out of range value for column 'id' at row 1","record":[{"byteSize":2,"index":0,"rawData":-3,"type":"LONG"},{"byteSize":2,"index":1,"rawData":-2,"type":"LONG"},{"byteSize":2,"index":2,"rawData":"Other","type":"STRING"},{"byteSize":2,"index":3,"rawData":"Other","type":"STRING"}],"type":"writer"}

    Troubleshooting: The SMALLINT(5) data type allows negative values whereas the UNSIGNED INT(11) data type does not. For data synchronization between MySQL data stores, dirty data occurs if the source table has a field of the SMALLINT(5) data type and the destination table has a field of the UNSIGNED INT(11) data type.

  • Symptom: Emoticons are synchronized.

    Dirty data occurs during data synchronization that involves a table with emoticons.

    Troubleshooting: Change the encoding format so that emoticons can be synchronized. You can use either of the following methods:
    • Create connections by using the JDBC URL.
      jdbc:mysql://xxx.x.x.x:3306/database?characterEncoding=utf8&com.mysql.jdbc.faultInjection.serverCharsetIndex=45
    • Create connections by using the ID of the instance.

      Append the string ?characterEncoding=utf8&com.mysql.jdbc.faultInjection.serverCharsetIndex=45 to the end of the database name in the JDBC URL.

  • Symptom: Dirty data occurs because of empty columns.
    {"exception":"Column 'xxx_id' cannot be null","record":[{"byteSize":0,"index":0,"type":"LONG"},{"byteSize":8,"index":1,"rawData":-1,"type":"LONG"},{"byteSize":8,"index":2,"rawData":641,"type":"LONG"}
    The intelligent analysis results of Data Integration show that the most possible cause is as follows:
    com.alibaba.datax.common.exception.DataXException: Code:[Framework-14]

    Analysis: Data Integration reports more dirty data records than the limit allows, which usually means the source data contains a large amount of dirty data. For example, you limit the number of dirty data records to one, but seven dirty data records are found.

    In this case, check the dirty data log information or raise the limit.

    Troubleshooting: According to the code Column 'xxx_id' cannot be null, the xxx_id field cannot be left blank. Dirty data occurs if an xxx_id field value is unspecified. Change the value or modify the field.
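    If you decide to raise the limit instead, the threshold is set in the job settings. The following is a sketch following the DataX errorLimit convention; the value 10 is only an example, and the exact field names should be treated as an assumption:

```json
"setting": {
    "errorLimit": {
        "record": 10
    }
}
```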

  • Symptom: The data length exceeds the limit imposed by the field.
    2017-01-02 17:01:19.308 [16963484-0-0-writer] ERROR StdoutPluginCollector 
    Dirty data:  
    {"exception":"Data truncation: Data too long for column 'flash' at row 1","record":[{"byteSize":8,"index":0,"rawData":1,"type":"LONG"},{"byteSize":8,"index":3,"rawData":2,"type":"LONG"},{"byteSize":8,"index":4,"rawData":1,"type":"LONG"},{"byteSize":8,"index":5,"rawData":1,"type":"LONG"},{"byteSize":8,"index":6,"rawData":1,"type":"LONG"}

    Troubleshooting: According to the code Data too long for column 'flash', the flash field imposes a limit on the data length and a field value exceeds the limit. Change the value or modify the field.

  • Symptom: The database is read-only.
    2016-11-02 17:27:38.288 [12354052-0-8-writer] ERROR StdoutPluginCollector 
    Dirty data:  
    {"exception":"The MySQL server is running with the --read-only option so it cannot execute this statement","record":[{"byteSize":3,"index":0,"rawData":201,"type":"LONG"},{"byteSize":8,"index":1,"rawData":1474603200000,"type":"DATE"},{"byteSize":8,"index":2,"rawData":"12:00 on September 23","type":"STRING"},{"byteSize":5,"index":3,"rawData":"12:00","type":"STRING"}

    Troubleshooting: If the database is read-only, all the data to be synchronized is dirty data. Change the read-only mode of the database to read/write.

What can I do if a node fails because the target files cannot be found or the table does not exist?

  • Symptom: The target files cannot be found.
    The intelligent analysis results of Data Integration show that the most possible cause is as follows:
    com.alibaba.datax.common.exception.DataXException: Code:[HdfsReader-08]

    Analysis: The specified directory is empty. The files to be read cannot be found. Check your configurations.

    path: /user/hive/warehouse/tmp_test_map/*  
    at com.alibaba.datax.common.exception.DataXException.asDataXException(DataXException.java:26)

    Troubleshooting: Check the files based on the directory provided. If the files still cannot be found, correct the path configuration.

  • Symptom: The table does not exist.
    The intelligent analysis results of Data Integration show that the most possible cause is as follows:
    com.alibaba.datax.common.exception.DataXException: Code:[MYSQLErrCode-04]

    Analysis: The table does not exist. Check the table name or contact the database administrator to check whether the table exists.

    Table name: xxxx. SQL statement: select * from xxxx where 1=2;.

    Error message:

    Table 'darkseer-test.xxxx' doesn't exist - com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Table 'darkseer-test.xxxx' doesn't exist

    Troubleshooting: Execute the select * from xxxx where 1=2 statement to check whether the table has any errors, and fix the table accordingly.

How do I handle a node failure caused by a partition error?

  • Symptom: An error occurs with the partition.
    The setting of the $[yyyymm] parameter is invalid. The log is provided as follows:
    [2016-09-13 17:00:43]2016-09-13 16:21:35.689 [job-10055875] ERROR Engine
    The intelligent analysis results of Data Integration show that the most possible cause is as follows:
    com.alibaba.datax.common.exception.DataXException: Code:[OdpsWriter-13]
  • Analysis: An error occurred while a MaxCompute SQL statement was executed against the destination MaxCompute table. You can try again; if the error persists, contact the MaxCompute administrator. The SQL statement is as follows:
    alter table db_rich_gift_record add IF NOT EXISTS
          partition(pt='${thismonth}');
  • Troubleshooting: The relative time parameter ${thismonth} becomes invalid because it is enclosed in single quotation marks (' ').
  • Workaround: Delete the single quotation marks (' ').
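    After the quotation marks are removed, the partition setting in the MaxCompute writer configuration looks like the following sketch (the writer's partition field name follows the common DataX convention and is an assumption; the parameter comes from the example above):

```json
"partition": "pt=${thismonth}"
```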

What can I do if a node fails because the column parameter is not organized in a JSON array?

  • Symptom: The column parameter is not organized in a JSON array.
    Run command failed.  
    com.alibaba.cdp.sdk.exception.CDPException: com.alibaba.fastjson.JSONException: syntax error, expect {, actual error, pos 0
    at com.alibaba.cdp.sdk.exception.CDPException.asCDPException(CDPException.java:23)
  • Troubleshooting: The specified configurations are not in the correct JSON format. Example:
    "plugin": "mysql",
    "parameter": {
    "datasource": "xxxxx",
    "column": "uid",
    "where": "",
    "splitPk": "",
    "table": "xxx"
    }
    In this example, "column": "uid" is not organized in an array.

How do I handle a node failure caused by a JDBC URL in an incorrect format?

  • Symptom: The JDBC URL is in an incorrect format.
  • Troubleshooting: The JDBC URL is in an incorrect format. The correct format is jdbc:mysql://ServerIP:Port/Database.

What can I do if a connection fails the connectivity test?

  • Symptom: A connection fails the connectivity test.
  • Troubleshooting:
    • Check whether the firewall limits the IP address and port in use.
    • Check the security group of the port.

What can I do if an error related to uid[xxxxxxxx] is reported in logs?

  • Symptom: An error related to uid[xxxxxxxx] is reported in logs.
    Run command failed.  
    com.alibaba.cdp.sdk.exception.CDPException: RequestId[F9FD049B-xxxx-xxxx-xxx-xxxx] Error: CDP server encounter problems, please contact us, reason: An error occurs while retrieving the network information about an instance. Check the ID of the user who purchases the RDS instance and the RDS instance name,uid[xxxxxxxx],instance[rm-bp1cwz5886rmzio92]ServiceUnavailable : The request has failed due to a temporary failure of the server.  
    RequestId : F9FD049B-xxxx-xxxx-xxx-xxxx
  • Troubleshooting: If the preceding error occurs when you synchronize data from RDS to MaxCompute, provide RequestId : F9FD049B-xxxx-xxxx-xxx-xxxx to RDS engineers.

What can I do if the query parameter is invalid for MongoDB?

  • Symptom: The query parameter is invalid for MongoDB.
    The following error message is returned when you synchronize data from MongoDB to MySQL:
    Exception in thread "taskGroup-0" com.alibaba.datax.common.exception.DataXException: Code:[Framework-13]

    The reason is that the query parameter is not in the correct JSON format.

  • Analysis: The Data Integration engine encountered an error when running. The following diagnostic information is prompted when Data Integration stops running:
    org.bson.json.JsonParseException: Invalid JSON input. Position: 34. Character: '.'.
  • Troubleshooting:
    • Invalid example: "query":"{'update_date':{'$gte':new Date().valueOf()/1000}}". Parameters such as new Date() are not supported.
    • Valid example: "query":"{'operationTime':{'$gte':ISODate('${last_day}T00:00:00.424+0800')}}".

What can I do if the memory is insufficient?

  • Symptom: The memory is insufficient.
    2017-10-11 20:45:46.544 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] taskId[358] attemptCount[1] is started  
    Java HotSpot™ 64-Bit Server VM warning: INFO: os::commit_memory(0x00007f15ceaeb000, 12288, 0) failed; error='Cannot allocate memory' (errno=12)
  • Troubleshooting: The memory is insufficient. If you run a node on a custom resource group, you need to add memory. If you run a node on the resource group provided by Alibaba Cloud, submit a ticket.

What can I do if the max_allowed_packet parameter is set to an improper value?

  • Symptom: The max_allowed_packet parameter is set to an improper value.
    Error message:
    Packet for query is too large (70 > -1). You can change this value on the server by setting the 'max_allowed_packet' variable. - com.mysql.jdbc.PacketTooBigException: Packet for query is too large (70 > -1). You can change this value on the server by setting the 'max_allowed_packet' variable.
  • Troubleshooting:
    • The max_allowed_packet parameter defines the maximum length of the communication buffer. A MySQL data store drops packets whose size is larger than the value of this parameter. Therefore, large inserts and updates will fail.
    • Usually, you can set max_allowed_packet to 10 MB (10 × 1024 × 1024 Bytes). We recommend that you do not set max_allowed_packet to a value that is too large.

What can I do if HTTP status code 500 is logged because logs cannot be retrieved?

  • Symptom: HTTP status code 500 is logged because logs cannot be retrieved.
    Unexpected Error:  
    Response is com.alibaba.cdp.sdk.util.http.Response@382db087[proxy=HTTP/1.1 500 Internal Server Error [Server: Tengine, Date: Fri, 27 Oct 2017 16:43:34 GMT, Content-Type: text/html;charset=utf-8, Transfer-Encoding: chunked, Connection: close,  
    HTTP Status 500 - Read timed out. Type: Exception report. Message: Read timed out. Description: The server encountered an internal error that prevented it from fulfilling this request. Exception:
    java.net.SocketTimeoutException: Read timed out
  • Troubleshooting: If you run a node on the default resource group and HTTP status code 500 is logged because logs cannot be retrieved, submit a ticket for consultation. If you run a node on a custom resource group, rerun the alisa command.
    Note If you refresh the page and the node is still stopped, switch to the admin account and rerun the following alisa command: /home/admin/alisatatasknode/target/alisatatasknode/bin/serverctl restart.

What can I do if the setting of a parameter of HBase Writer is invalid?

  • Symptom: The setting of the hbase.zookeeper.quorum parameter is invalid for HBase Writer.
    2017-11-08 09:29:28.173 [61401062-0-0-writer] INFO  ZooKeeper - Initiating client connection, connectString=xxx-2:2181,xxx-4:2181,xxx-5:2181,xxxx-3:2181,xxx-6:2181 sessionTimeout=90000 watcher=hconnection-0x528825f50x0, quorum=node-2:2181,node-4:2181,node-5:2181,node-3:2181,node-6:2181, baseZNode=/hbase  
    Nov 08, 2017 9:29:28 AM org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper checkZk  
    WARNING: Unable to create ZooKeeper Connection
  • Troubleshooting:
    • Invalid example: "hbase.zookeeper.quorum":"xxx-2,xxx-4,xxx-5,xxxx-3,xxx-6"
    • Valid example: "hbase.zookeeper.quorum":"Your ZooKeeper IP address"