When you use DataWorks to synchronize data, real-time sync nodes can run only on exclusive resource groups for Data Integration. This topic describes the resources and configurations required to run real-time sync nodes.

Background information

  • Resource planning and preparation

    Before you use a data synchronization node to synchronize data, you must purchase an exclusive resource group for Data Integration and add the resource group to DataWorks.

    For more information about exclusive resource groups for Data Integration, see Exclusive resource groups for Data Integration.

  • Network connectivity

    An exclusive resource group for Data Integration is essentially a group of Elastic Compute Service (ECS) instances. After you purchase an exclusive resource group for Data Integration, it is isolated from other services. You must associate the resource group with a virtual private cloud (VPC) to ensure network connectivity between the resource group and data sources during subsequent data synchronization.

Associate the exclusive resource group with a VPC

Exclusive resource groups reside in the VPC in which DataWorks is hosted and are disconnected from other network environments. To use an exclusive resource group, you must associate it with a VPC that can connect to your data sources so that the resource group can access the data sources over that VPC. To associate the exclusive resource group for Data Integration with a VPC, perform the following steps:
Important: An exclusive resource group for Data Integration that uses the specifications of 4 vCPUs and 8 GiB of memory can be associated with a maximum of two VPCs. An exclusive resource group for Data Integration that uses any other specifications can be associated with a maximum of three VPCs.
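The binding limit stated above can be expressed as a small pre-check before you add another VPC. The following is only a sketch: the function names and the way the specification is passed in (vCPU count and memory in GiB) are assumptions made for illustration and are not part of any DataWorks API.

```python
# Hypothetical helpers: the names and arguments are illustrative only
# and do not correspond to any DataWorks API.
def max_vpc_bindings(vcpus: int, memory_gib: int) -> int:
    """Return the maximum number of VPCs for a given specification."""
    # Resource groups with 4 vCPUs and 8 GiB of memory are limited to two
    # VPCs; all other specifications are limited to three.
    if vcpus == 4 and memory_gib == 8:
        return 2
    return 3


def can_add_binding(vcpus: int, memory_gib: int, bound_vpcs: int) -> bool:
    """Check whether one more VPC can still be associated."""
    return bound_vpcs < max_vpc_bindings(vcpus, memory_gib)


print(can_add_binding(4, 8, bound_vpcs=2))   # False
print(can_add_binding(8, 16, bound_vpcs=2))  # True
```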
  1. Log on to the DataWorks console.
  2. In the left-side navigation pane, click Resource Groups. On the Exclusive Resource Groups tab of the Resource Groups page, find the created resource group and click Network Settings in the Actions column. On the page that appears, you can associate the resource group with a VPC.
    Before you associate the exclusive resource group with a VPC, you must log on to the RAM console with your Alibaba Cloud account and authorize DataWorks to access your cloud resources. You can go to the Cloud Resource Access Authorization page to authorize DataWorks to access your cloud resources. You can also authorize DataWorks to access your cloud resources by clicking the related button in the dialog box that is displayed the first time you log on to the DataWorks console with your Alibaba Cloud account.
  3. Associate the exclusive resource group with a VPC.
    1. On the VPC Binding tab, click Add Binding. In the Add VPC Binding panel, configure the parameters.
      Note: If you want to use the resource group to access a data source, such as an Alibaba Cloud data source or a self-managed data source hosted on an ECS instance, you can select a network connectivity solution and configure network settings based on whether the resource group and data source belong to the same Alibaba Cloud account.
      Parameter: VPC
        • Same region and Alibaba Cloud account: If your data source and the exclusive resource group belong to the same Alibaba Cloud account, we recommend that you select the VPC in which your data source resides. If your data source and the exclusive resource group belong to different Alibaba Cloud accounts, configure this parameter based on the description for the scenario where your data source and the exclusive resource group reside in different regions.
        • Different regions or Alibaba Cloud accounts: If your data source and the exclusive resource group belong to different Alibaba Cloud accounts or reside in different regions, you must select a VPC that connects to the data source. For example, if your data source does not reside in a VPC, you can click Create VPC to create a VPC for the exclusive resource group. After the VPC is created, you can select it from the VPC drop-down list. You can also select a VPC that connects to your data source.
          Note: If your data source and the exclusive resource group reside in different regions or belong to different Alibaba Cloud accounts, you must use VPN Gateway or Express Connect to establish a connection between the VPC with which the exclusive resource group is associated and the VPC in which the data source resides. You must also add a route that points to the IP address of the data source for the exclusive resource group. For more information, see Establish a network connection between a resource group and a data source.

      Parameter: Zone
        • Same region and Alibaba Cloud account: Select the zone in which your data source resides.
        • Different regions or Alibaba Cloud accounts: Select a zone from which a network connection to your data source is established.

      Parameter: VSwitch
        • Same region and Alibaba Cloud account: If you set the VPC parameter to the VPC in which your data source resides, we recommend that you select the vSwitch to which the data source is connected.
          Note: After you associate the exclusive resource group with the VPC in which the data source resides and a vSwitch that resides in the VPC, a route that points to the CIDR block of the VPC is automatically added. This ensures that the exclusive resource group can access the data sources in this VPC.
        • Different regions or Alibaba Cloud accounts: Select the vSwitch to which the data source is connected. If no vSwitch is available, you can click Create VSwitch to create a vSwitch for the exclusive resource group. After a vSwitch is created, select the vSwitch.

      Parameter: Security Groups
        • Both scenarios: Security groups allow or deny access to the exclusive resource group over the Internet or an internal network. You can select an existing security group based on your business requirements, or click Create Security Group to create a security group for the resources in the exclusive resource group. For more information about how to create a security group, see Add a security group rule.
    2. Click OK.
    Note: If your data source and the exclusive resource group reside in different regions or belong to different Alibaba Cloud accounts, you must add a route that points to the IP address of your data source after you associate the exclusive resource group with a VPC.
  4. Optional: Add host configurations.
    If your data source can be accessed only by using hostnames and not by using IP addresses, you must perform the following steps to add host configurations. Otherwise, the connectivity test fails when you add the data source by using its hostname.
    1. Click the Hostname-to-IP Mapping tab. Then, click Add. In the Create Hostname-to-IP Mapping dialog box, configure the following parameters:
      • IP Address: The actual IP address of the data source.
      • Hostname: The hostname that is used to access the data source. If you want to specify multiple hostnames, place each hostname on a separate line.
    2. If the data source has multiple IP addresses, click Add to add more host configurations.
      Note
      • The IP addresses or hostnames that are added in a host configuration must be different from the IP addresses or hostnames in existing host configurations.
      • You can map one IP address to multiple hostnames in a host configuration. However, one hostname can point to only one IP address.
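The host configuration rules above can be checked before you enter the mappings in the console. The following sketch assumes that each host configuration is represented as an (IP address, list of hostnames) pair. The function name, the data layout, and the sample hostnames are hypothetical, and the IP address syntax check is an extra sanity check that the rules above do not require.

```python
import ipaddress


def validate_host_mappings(mappings):
    """Validate (ip, [hostnames]) entries against the host configuration rules.

    Returns a list of problem descriptions; an empty list means no rule
    was violated.
    """
    problems = []
    seen_ips = set()
    host_to_ip = {}
    for ip, hostnames in mappings:
        # Extra sanity check (not one of the documented rules): the value
        # must parse as an IPv4 or IPv6 address.
        try:
            ipaddress.ip_address(ip)
        except ValueError:
            problems.append(f"{ip!r} is not a valid IP address")
            continue
        # An IP address must not repeat across host configurations.
        if ip in seen_ips:
            problems.append(f"IP address {ip} appears in more than one entry")
        seen_ips.add(ip)
        # One IP address may carry several hostnames, but each hostname
        # may point to only one IP address.
        for name in hostnames:
            if name in host_to_ip:
                problems.append(
                    f"hostname {name} is already mapped to {host_to_ip[name]}"
                )
            else:
                host_to_ip[name] = ip
    return problems


# Placeholder values for illustration only.
entries = [
    ("192.168.0.10", ["db-primary.internal", "db.internal"]),
    ("192.168.0.11", ["db.internal"]),  # hostname reused for a second IP
]
for problem in validate_host_mappings(entries):
    print(problem)
```

Returning a list of problems instead of raising on the first one lets you correct all conflicting entries in one pass.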

What to do next

After you plan and configure resources, you can configure data sources. To prepare for the creation of a real-time sync node, you must configure network connectivity for the data sources and the permissions that are required to access them. You can synchronize data to an AnalyticDB for MySQL data source only from a PolarDB or MySQL data source. Select a PolarDB or MySQL data source based on your business requirements. For more information about how to configure a PolarDB or MySQL data source, see Configure a source PolarDB data source or Configure data sources for data synchronization from MySQL.