This topic describes common connectivity test failures and provides solutions for these failures based on the network conditions and resource groups used in different scenarios.

Data store connectivity testing

A sync node uses only one resource group. Therefore, you must test the connectivity of every resource group for Data Integration that your sync nodes use to connect to the data store so that the sync nodes can run properly.
Note
  • Connectivity testing is not supported for custom resource groups. Make sure that your custom resource group can connect to the desired data stores over the network where it resides.
  • Sync nodes running on a custom resource group cannot obtain the metadata information of tables. If a custom resource group is used, configure the corresponding sync node in the code editor. For more information, see Create a sync node by using the code editor.

Possible causes of connectivity test failures

A data store may fail the connectivity test due to the following possible causes:
  • The data store is not started. Check whether the data store is started.
  • DataWorks cannot access the network where the data store resides. Make sure that the network where the data store resides is connected to Alibaba Cloud.
  • DataWorks is prohibited from accessing the network where the data store resides by a network firewall. Add the IP addresses or Classless Inter-Domain Routing (CIDR) blocks used by DataWorks to the whitelist configured for the data store. For more information, see Configure a whitelist.
  • The domain name of the data store cannot be resolved. Make sure that the domain name of the data store can be properly resolved.
  • An exclusive resource group is used but the data store is deployed in a virtual private cloud (VPC) or an Internet data center (IDC). Make sure that the exclusive resource group can access the VPC or IDC, or add a route when you bind the exclusive resource group to a VPC in the DataWorks console. For more information, see Use exclusive resource groups for data integration.
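
Several of the causes above (an unresolvable domain name, an unreachable network, a missing whitelist entry) can be narrowed down from a host that you control, for example the host behind a custom resource group. The following Python sketch is only an illustration based on the standard library: the function names `is_whitelisted` and `probe` are hypothetical, and the probe reflects connectivity from the machine where the script runs, not from a DataWorks resource group.

```python
import ipaddress
import socket


def is_whitelisted(source_ip, whitelist_cidrs):
    """Return True if source_ip falls inside any CIDR block of the whitelist."""
    addr = ipaddress.ip_address(source_ip)
    return any(
        addr in ipaddress.ip_network(cidr, strict=False)
        for cidr in whitelist_cidrs
    )


def probe(host, port, timeout=3.0):
    """Classify a TCP probe as 'ok', 'dns_failure', or 'unreachable'."""
    try:
        # Step 1: resolve the domain name of the data store.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns_failure"

    # Step 2: try each resolved address until one TCP connection succeeds.
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return "ok"
        except OSError:
            # Refused, timed out, or blocked (for example by a firewall).
            continue
    return "unreachable"
```

A result of `dns_failure` points to the domain name cause, whereas `unreachable` is consistent with a stopped data store, a missing route, or a firewall or whitelist restriction; the probe alone cannot tell these apart. The addresses passed to `is_whitelisted` are whatever you have configured for the data store; the sketch does not know the IP addresses or CIDR blocks that DataWorks uses.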

Solutions for ensuring connectivity, by data store scenario

Scenario: The data store is accessible over the Internet.
Notice Pay attention to the Internet traffic cost. For more information, see Internet traffic generated by Data Integration.
  • Shared resource group: Shared resource groups can directly connect to the data store. For more information, see Use the default resource group.
  • Exclusive resource group for Data Integration: Exclusive resource groups for Data Integration can directly connect to the data store. For more information, see Use exclusive resource groups for data integration.
  • Custom resource group: Custom resource groups can directly connect to the data store. For more information, see Add a custom resource group.
    Notice You must activate DataWorks Professional Edition to use custom resource groups.

Scenario: The data store is deployed in a VPC. The VPC and the DataWorks workspace are in the same region.
  • Shared resource group: N/A
  • Exclusive resource group for Data Integration: To ensure connectivity between an exclusive resource group for Data Integration and the data store, perform the following steps:
    1. Bind the resource group to the VPC where the data store resides.
    2. If the resource group and VPC are in different zones, add a route between the zones in the DataWorks console.

Scenario: The data store is deployed in a VPC. The VPC and the DataWorks workspace are in different regions.
  • Shared resource group: N/A
  • Exclusive resource group for Data Integration: To ensure connectivity between an exclusive resource group for Data Integration and the data store, perform the following steps:
    1. Create a VPC in the region where the DataWorks workspace resides.
    2. Connect the VPC created in the previous step to the VPC where the data store resides by using Express Connect circuits or VPN gateways.
    3. Bind the exclusive resource group to the VPC where the DataWorks workspace resides.
    4. Add a route in the DataWorks console to connect the VPC where the DataWorks workspace resides to the VPC where the data store resides.

Scenario: The data store is deployed in an IDC.
  • Shared resource group: N/A
  • Exclusive resource group for Data Integration: To ensure connectivity between an exclusive resource group for Data Integration and the data store, perform the following steps:
    1. Create a VPC in the region where the DataWorks workspace resides.
    2. Connect the VPC created in the previous step to the IDC where the data store resides by using Express Connect circuits or VPN gateways.
    3. Bind the exclusive resource group to the VPC where the DataWorks workspace resides.
    4. Add a route in the DataWorks console to connect the VPC where the DataWorks workspace resides to the IDC where the data store resides.

Scenario: The data store is deployed on the classic network.
Note We recommend that you migrate the data store to a VPC.
  • Shared resource group: Shared resource groups can directly connect to the data store.
  • Exclusive resource group for Data Integration: N/A

Custom resource group, for a data store that is deployed in a VPC or an IDC or on the classic network:
To ensure connectivity between a custom resource group and the data store, perform one of the following operations based on specific scenarios:
  • If the data store is on the classic network or in a VPC or an IDC, specify a host that is accessible over the Internet on the classic network or in the VPC or IDC as the custom resource group.
  • If the data store and resource group share the same IP address on the classic network or are in the same VPC or IDC, directly connect the resource group to the data store.
  • If the data store and resource group use different IP addresses on the classic network or are in different VPCs or IDCs, connect the resource group to the data store by using Express Connect circuits or VPN gateways.

Solutions for ensuring connectivity, by resource group type

  • Shared resource groups

    If a data store is accessible over the Internet or deployed on the classic network, a shared resource group can directly access the data store.

  • Exclusive resource groups
    • If a data store is accessible over the Internet, an exclusive resource group can directly access the data store.
    • If a data store is deployed in a VPC that is in the same region as the exclusive resource group:
      To ensure connectivity between the exclusive resource group and the data store, perform the following steps:
      1. Bind the resource group to the VPC where the data store resides.
      2. If the resource group and VPC are in different zones, add a route between the zones in the DataWorks console.
    • If a data store is deployed in a VPC, but the VPC and the exclusive resource group are in different regions:
      To ensure connectivity between the exclusive resource group and the data store, perform the following steps:
      1. Create a VPC in the region where the DataWorks workspace resides.
      2. Connect the VPC created in the previous step to the VPC where the data store resides by using Express Connect circuits or VPN gateways.
      3. Bind the exclusive resource group to the VPC where the DataWorks workspace resides.
      4. Add a route in the DataWorks console to connect the VPC where the DataWorks workspace resides to the VPC where the data store resides.
    • If a data store is deployed in an IDC:
      1. Create a VPC in the region where the DataWorks workspace resides.
      2. Connect the VPC created in the previous step to the IDC where the data store resides by using Express Connect circuits or VPN gateways.
      3. Bind the exclusive resource group to the VPC where the DataWorks workspace resides.
      4. Add a route in the DataWorks console to connect the VPC where the DataWorks workspace resides to the IDC where the data store resides.
  • Custom resource groups
    • If a data store is accessible over the Internet, a custom resource group can directly access the data store.
    • If a data store and a custom resource group are deployed on the same network, the custom resource group can directly access the data store.
    • If the data store and the custom resource group are deployed on different networks, for example, the data store is deployed in a VPC and the custom resource group is deployed in an IDC, the custom resource group can connect to the data store by using Express Connect circuits or VPN gateways.
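
The rules in this topic amount to a lookup from resource group type and data store location to a list of actions. The following Python sketch restates them as such a lookup. It is an illustration only: the function `connectivity_plan`, the location keys, and the `same_network` flag are hypothetical names introduced here, and the step texts are abbreviated from the lists above. A return value of `None` stands for the combinations marked N/A.

```python
DIRECT = ["Connect directly to the data store."]

# Hypothetical location keys for the scenarios described in this topic.
LOCATIONS = {"internet", "vpc_same_region", "vpc_other_region", "idc", "classic"}

# Steps for exclusive resource groups, abbreviated from the lists above.
EXCLUSIVE_STEPS = {
    "internet": DIRECT,
    "vpc_same_region": [
        "Bind the resource group to the VPC where the data store resides.",
        "If the resource group and VPC are in different zones, add a route "
        "in the DataWorks console.",
    ],
    "vpc_other_region": [
        "Create a VPC in the region where the DataWorks workspace resides.",
        "Connect that VPC to the VPC of the data store by using Express "
        "Connect circuits or VPN gateways.",
        "Bind the exclusive resource group to the VPC in the workspace region.",
        "Add a route in the DataWorks console to the VPC of the data store.",
    ],
    "idc": [
        "Create a VPC in the region where the DataWorks workspace resides.",
        "Connect that VPC to the IDC of the data store by using Express "
        "Connect circuits or VPN gateways.",
        "Bind the exclusive resource group to the VPC in the workspace region.",
        "Add a route in the DataWorks console to the IDC of the data store.",
    ],
}


def connectivity_plan(group, location, same_network=False):
    """Return the list of actions, or None if the combination is N/A.

    same_network is only used for custom resource groups: it states whether
    the custom resource group and the data store are on the same network.
    """
    if location not in LOCATIONS:
        raise ValueError(f"unknown location: {location!r}")
    if group == "shared":
        # Shared groups only reach Internet-facing or classic-network stores.
        return list(DIRECT) if location in ("internet", "classic") else None
    if group == "exclusive":
        steps = EXCLUSIVE_STEPS.get(location)
        return list(steps) if steps is not None else None
    if group == "custom":
        if location == "internet" or same_network:
            return list(DIRECT)
        return ["Connect the resource group to the data store by using "
                "Express Connect circuits or VPN gateways."]
    raise ValueError(f"unknown resource group type: {group!r}")
```

For example, `connectivity_plan("exclusive", "idc")` returns the four IDC steps, and `connectivity_plan("shared", "idc")` returns `None`.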

Additional information

  • Services for enabling network access:
    • For more information about how to enable network access by using Enterprise Cloud Network, see Enterprise Cloud Network.
    • For more information about how to enable network access by using Express Connect, see Express Connect.
    • For more information about how to enable network access by using VPN Gateway, see VPN Gateway.
  • Note on the scheduling cluster:
    • Alibaba Cloud has deployed scheduling clusters in the China (Shanghai), China (Shenzhen), China (Hong Kong), and Singapore (Singapore) regions. The connectivity test fails if the scheduling cluster and your data store are in different regions.

      For example, if the scheduling cluster is deployed in the China (Shanghai) region and your MongoDB data store is deployed in the China (Beijing) region, DataWorks determines that the scheduling cluster cannot access the data store due to the region difference.

    • The OXS cluster and the Elastic Compute Service (ECS) cluster cannot communicate with each other over the internal network.

      The scheduling cluster for Relational Database Service (RDS) databases is an OXS cluster. The OXS cluster can communicate with RDS databases in all regions in mainland China over the internal network. An ECS cluster on the classic network serves as the scheduling cluster for other data stores.

      For example, when you synchronize data from an RDS database to a user-created database, the connections to both databases can pass the connectivity test. However, during node scheduling, the RDS database uses the OXS cluster to schedule the sync node, whereas the user-created database uses the ECS cluster. The ECS cluster cannot access the RDS database, so the synchronization fails. We recommend that you add the connection for the RDS database as a MySQL connection in Java Database Connectivity (JDBC) URL mode. In this mode, the ECS cluster can access both databases, and the synchronization succeeds.

  • View the resource group on which a sync node is run:
    • If the logs contain information similar to the following, the sync node is run on the default resource group:
      running in Pipeline[basecommon_ group_xxxxxxxxx]
      - RDS databases use the OXS cluster to schedule sync nodes: running in Pipeline[basecommon_ group_xxx_oxs]
      - Other databases use the ECS cluster to schedule sync nodes: running in Pipeline[basecommon_ group_xxx_ecs]
    • If the logs contain information similar to the following, the sync node is run on an exclusive resource group for Data Integration:
      running in Pipeline[basecommon_S_res_group_xxx]
    • If the logs contain information similar to the following, the sync node is run on a custom resource group:
      running in Pipeline[basecommon_xxxxxxxxx]
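
The log patterns above can be checked programmatically. The following Python sketch classifies a log line by the pipeline name it contains. It is an illustration under stated assumptions: the function name `classify_resource_group` is hypothetical, the rules are inferred only from the sample lines above, and whitespace after `basecommon_` is treated as optional because the samples show a space there. A custom resource group whose name happens to start with `group_` would be misclassified as the default resource group.

```python
import re

# Matches e.g. "running in Pipeline[basecommon_ group_xxx_oxs]".
# Whitespace after "basecommon_" is optional (an assumption based on the samples).
PIPELINE_RE = re.compile(r"running in Pipeline\[basecommon_\s*([^\]]+)\]")


def classify_resource_group(log_line):
    """Return a label for the resource group, or None if no pipeline is found."""
    match = PIPELINE_RE.search(log_line)
    if match is None:
        return None
    name = match.group(1)
    if name.startswith("S_res_group_"):
        return "exclusive"
    if name.startswith("group_"):
        # Default resource group; the suffix tells which cluster schedules it.
        if name.endswith("_oxs"):
            return "default (OXS cluster)"
        if name.endswith("_ecs"):
            return "default (ECS cluster)"
        return "default"
    return "custom"
```

For example, `classify_resource_group("running in Pipeline[basecommon_S_res_group_xxx]")` returns `"exclusive"`.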