When you use DataWorks to synchronize data, data integration nodes can run only on exclusive resource groups for Data Integration. For scheduling, you can select either a public or an exclusive resource group based on your business requirements. This topic describes the resources that are used for data synchronization and how to configure them.

Background information

  • Resource planning and preparation

    When you synchronize data, the data integration node runs by using resources in a resource group for data integration and a resource group for scheduling. You can use only exclusive resource groups for data integration. Before you synchronize data, you must purchase an exclusive resource group for data integration and add the exclusive resource group to your DataWorks workspace.

    For more information about exclusive resource groups for data integration, see Exclusive resources for Data Integration.

  • Network connections

    An exclusive resource group for data integration is essentially a group of resource instances. After you purchase such an exclusive resource group, it is isolated from other services. You must bind the resource group to a virtual private cloud (VPC) to ensure the network connectivity between the resource group and data sources during subsequent data synchronization.

Purchase an exclusive resource group for Data Integration

  1. Log on to the DataWorks console.
  2. Select a region. In the left-side navigation pane, click Resource Groups.
  3. On the Exclusive Resource Groups tab, click Create a dedicated resource group.
  4. In the Create a dedicated resource group panel, click Purchase next to Order Number. The buy page appears.
  5. On the buy page, set the Region, Type, Units, and Duration parameters. Then, click Buy Now.
    Note You must set the Type parameter to Exclusive Resource Groups for Data Integration.
  6. Confirm that the order information is correct, select the check box to agree to the DataWorks Exclusive Resources Agreement of Service, and then click Pay.

Create an exclusive resource group for Data Integration

  1. On the Exclusive Resource Groups tab of the Resource Groups page, click Create a dedicated resource group.
  2. In the Create a dedicated resource group panel, set the parameters as required.
    Parameter | Description
    Resource Group Type | The type of the exclusive resource group. Valid values: Exclusive Resource Groups and Exclusive Resource Groups for Data Integration. The former type is used to schedule general nodes, whereas the latter type is used to schedule sync nodes.
    Resource Group Name | The name of the resource group. The name must be unique within all resource groups of a tenant.
    Note A tenant indicates an Alibaba Cloud account. Each tenant may have multiple RAM users.
    Resource Group Description | The description of the exclusive resource group.
    Order Number | The order number of the exclusive resources that you purchase. If you have not purchased exclusive resources, click Purchase next to Order Number to go to the buy page and purchase exclusive resources.
  3. After you complete the configuration, click OK.
    Note The exclusive resource group is initialized within 20 minutes. Wait until its status changes to Running.
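If you track the initialization from a script instead of watching the console, the wait in step 3 can be sketched as a simple polling loop. This is a minimal sketch under stated assumptions: `get_status` is a hypothetical callable that you supply (it is not a DataWorks API) and that returns the current status text of the resource group. Only the Running status and the 20-minute limit come from this topic.

```python
import time
from typing import Callable


def wait_until_running(
    get_status: Callable[[], str],            # hypothetical: returns the current status text
    timeout_s: float = 20 * 60,               # initialization is expected within 20 minutes
    interval_s: float = 30.0,                 # assumed polling interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll get_status() until it returns "Running" or the timeout elapses.

    Returns True if the Running status was observed, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if get_status() == "Running":
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

The `sleep` and `clock` parameters exist only so that the loop can be exercised without real waiting. In normal use, pass only `get_status`.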

Configure network settings

Exclusive resource groups are deployed in a VPC that is managed by DataWorks and are isolated from other network environments. Before you use an exclusive resource group, you must bind it to a VPC that can connect to your data stores. The exclusive resource group then connects to the data stores over that VPC.

  1. Find the required resource group and click Network Settings in the Actions column. The VPC Binding tab appears.
    Note Before you bind the exclusive resource group to a VPC, authorize DataWorks to access your cloud resources in the Resource Access Management (RAM) console.
  2. Bind the exclusive resource group to a VPC.
    1. Click Add Binding in the upper-left corner of the VPC Binding tab. In the Add VPC Binding panel, set the parameters based on the network environment.
      The following table describes the parameters.
      Parameter | Data store and exclusive resource group in the same VPC | Data store and exclusive resource group in different VPCs
      VPC | If your data store is deployed in an Alibaba Cloud VPC, we recommend that you set this parameter to that VPC. | If your data store is not deployed in an Alibaba Cloud VPC, or the data store and the exclusive resource group must be deployed in different VPCs, click Create VPC to create a VPC for the exclusive resource group. After the VPC is created, set this parameter to the created VPC.
      VSwitch | If you set the VPC parameter to the VPC where the data store resides, we recommend that you select the vSwitch that is associated with the data store. | If you set the VPC parameter to another VPC or no vSwitch can be used, click Create VSwitch to create a vSwitch for the exclusive resource group. After the vSwitch is created, set this parameter to the created vSwitch.
      Note After you bind the exclusive resource group to the VPC where the data store resides and then to a vSwitch that resides in the VPC, a route is automatically added. The destination of this route is the CIDR block of the VPC. This ensures that the exclusive resource group can access all data stores in this VPC.
      Security Groups | Security groups allow or deny access to your exclusive instances from the Internet or an internal network. You can select an existing security group based on your business needs, or click Create Security Group to create a security group for the exclusive resource instances. For more information about how to add a security group rule, see Add security group rules.
    2. Click OK to bind the exclusive resource group to the VPC.
  3. Optional: Add host configurations.
    Some data stores cannot be accessed by IP address and must be accessed by hostname. In this case, add host configurations by performing the following steps. Otherwise, the connectivity test fails when you create a connection that uses the hostname of the data store.
    1. Click the Hostname-to-IP Mapping tab. Click Add in the upper-left corner of the tab. In the Create Hostname-to-IP Mapping dialog box, set the parameters. The following table describes the parameters.
      Parameter | Description
      IP Address | The actual IP address of the data store.
      The hostname | The hostname that is used to access the data store. If you want to specify multiple hostnames, place each hostname on a separate line.
      Note The domain name can contain digits, letters, hyphens (-), and periods (.). It must start with a letter and end with a letter or digit.
    2. If the data store has multiple IP addresses, click Add to add more host configurations.
      Note
      • The IP address or hostnames that are added in a host configuration must be different from those in existing host configurations.
      • You can map one IP address to multiple hostnames in a host configuration. However, one hostname can point to only one IP address.
  4. Optional: Add DNS configurations.
    Some data stores cannot be accessed by IP address. For example, a data store may have to be accessed by using the domain name of a Server Load Balancer (SLB) instance, and an internal Domain Name System (DNS) server resolves the domain name to the IP addresses of the data store. In this case, add DNS configurations by performing the following steps. Otherwise, the connectivity test fails when you create a connection that uses the domain name of the data store.
    Note If a hostname is configured in both a host configuration and a DNS configuration, the host configuration takes precedence.
    1. Click the DNS Configuration tab. Click Add in the lower-left corner of the tab. After you set the parameters for a DNS configuration, click Save. The following table describes the parameters.
      Parameter | Description
      Domain | Optional. If your data stores are accessed by using domain names that share the same parent domain, set this parameter to that parent domain.

      For example, if the domain name that is used to access Data store 1 is domain1.example.com and the domain name that is used to access Data store 2 is domain2.example.com, we recommend that you set this parameter to example.com.

      Note The domain name can contain digits, letters, hyphens (-), and periods (.). It must start with a letter and end with a letter or digit.
      NameServer | Enter the IP address of the DNS server that resolves the domain name of the data store. If you want to specify multiple DNS servers, place the IP address of each DNS server on a separate line.
    2. To modify the existing DNS configuration, click Modify in the lower-left corner.
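The rules for host configurations and their precedence over DNS configurations can be checked before you enter values in the console. The following is a minimal sketch, not the logic that DataWorks itself runs: the function names are illustrative, the sample addresses are made up, and `dns_lookup` is a hypothetical callable that stands in for your internal DNS server.

```python
import ipaddress
import re
from typing import Callable, Dict, List, Tuple

# Naming rule from the notes above: digits, letters, hyphens, and periods only;
# the name must start with a letter and end with a letter or digit.
_NAME_PATTERN = re.compile(r"[A-Za-z](?:[A-Za-z0-9.-]*[A-Za-z0-9])?")


def is_valid_name(name: str) -> bool:
    """Return True if name satisfies the naming rule."""
    return _NAME_PATTERN.fullmatch(name) is not None


def build_host_table(configs: List[Tuple[str, List[str]]]) -> Dict[str, str]:
    """Turn (ip, hostnames) host configurations into a hostname -> IP table.

    One IP address may map to several hostnames, but each IP address may
    appear in only one configuration and each hostname may point to only
    one IP address. Violations raise ValueError.
    """
    table: Dict[str, str] = {}
    seen_ips = set()
    for ip, hostnames in configs:
        ipaddress.ip_address(ip)  # raises ValueError for a malformed address
        if ip in seen_ips:
            raise ValueError(f"IP address {ip} is already used in another host configuration")
        seen_ips.add(ip)
        for name in hostnames:
            if not is_valid_name(name):
                raise ValueError(f"invalid hostname: {name!r}")
            if name in table:
                raise ValueError(f"hostname {name} already points to {table[name]}")
            table[name] = ip
    return table


def resolve(name: str, host_table: Dict[str, str], dns_lookup: Callable[[str], str]) -> str:
    """Look up name; a host configuration wins over the DNS configuration."""
    if name in host_table:
        return host_table[name]
    return dns_lookup(name)  # hypothetical stand-in for the internal DNS server


if __name__ == "__main__":
    # Made-up sample values for illustration only.
    table = build_host_table([
        ("192.168.0.10", ["db1.example.com", "db1"]),
        ("192.168.0.11", ["db2.example.com"]),
    ])
    print(resolve("db1", table, dns_lookup=lambda name: "10.0.0.1"))
```

Hostnames are compared as exact strings here. Whether the console treats hostnames as case-insensitive is not stated in this topic, so the sketch does not normalize case.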
After you complete the network configuration for the exclusive resource group for Data Integration, add the elastic IP address (EIP) of the exclusive resource group and the elastic network interface (ENI) IP address of the VPC to the whitelist of the data store.
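Before you run a connectivity test, you can check offline whether the whitelist that you plan to apply covers both addresses. The sketch below uses only the Python standard library. It assumes that the whitelist of the data store accepts single IP addresses and CIDR blocks, which this topic does not state. The function name and the addresses in the usage note are illustrative.

```python
import ipaddress
from typing import Iterable, List


def missing_from_whitelist(addresses: Iterable[str], whitelist: Iterable[str]) -> List[str]:
    """Return the addresses that no whitelist entry covers.

    Each whitelist entry is a single IP address or a CIDR block (an
    assumption about the whitelist format of the data store).
    """
    # strict=False accepts entries such as 192.168.1.5/24 that have host bits set.
    networks = [ipaddress.ip_network(entry, strict=False) for entry in whitelist]
    missing = []
    for address in addresses:
        ip = ipaddress.ip_address(address)
        if not any(ip in network for network in networks):
            missing.append(address)
    return missing
```

For example, with a made-up EIP of 203.0.113.10, a made-up ENI IP address of 192.168.1.25, and a whitelist of 192.168.1.0/24 and 198.51.100.7, the function returns only the EIP, which indicates that the EIP still has to be added.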

What to do next

After you plan and configure resources, you can configure data sources. You must connect the exclusive resource group for Data Integration to both the source and the destination data sources. You must also create an account that is used to access the data sources and grant the required permissions to the account. After you complete these operations, you can create a data synchronization node. For more information about how to configure data sources, see Configure a source PolarDB data source.