DataWorks provides the default resource group for you to migrate large amounts of data to the cloud for free. However, the default resource group does not work if a high transmission speed is required or your data stores are deployed in complex environments. You can use custom resource groups to run your sync nodes. This guarantees connections to your data stores and enables a higher transmission speed.

A workspace administrator can add or modify custom resource groups on the Custom Resource Group page of Data Integration.

If the default resource group cannot access your data stores deployed in complex network environments, you can add custom resource groups to enable data synchronization between any network environments.

Note
  • Custom resource groups added on the Custom Resource Group page of Data Integration can only run sync nodes in the current workspace. They do not appear in the resource group list. Currently, custom resource groups added on the Custom Resource Group page cannot run sync nodes in a manually triggered workflow.
  • You can add only one custom resource group on an Elastic Compute Service (ECS) instance. You can select only one network type for each custom resource group.
  • When you register an ECS instance to host a custom resource group, you can set the network type to classic network only if the ECS instance is in the China (Shanghai) region. In this case, you must enter the hostname of the ECS instance. For ECS instances in other regions, you can set the network type only to Virtual Private Cloud (VPC). In this case, you must enter the universally unique identifier (UUID) of the ECS instance to be registered. We recommend that you use VPC whenever possible.
  • The admin permission is required to access some files on the ECS instance that hosts a custom resource group. For example, the admin permission is required to call shell or SQL files on the ECS instance when you write a shell script for a node.
  • A resource group for scheduling is mainly used to schedule nodes. It has limited resources and is not suitable for computing nodes. Therefore, we recommend that you do not add custom resource groups on the ECS instances of a resource group for scheduling. MaxCompute can process large amounts of data. We recommend that you use MaxCompute for big data computing.

Limits

  • The difference between the time of the ECS instance where a custom resource group resides and the current Internet time must be within 2 minutes. Otherwise, service requests may time out and nodes may fail to be run on the custom resource group.
  • If the log file of alisatasknode contains the timeout error message "response code is not 200", the custom resource group was inaccessible during that time period. The ECS instance hosting the custom resource group can continue working if the exception persists for no more than 10 minutes. To find the exception details, view the heartbeat.log file in the /home/admin/alisatasknode/logs directory.
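
The 2-minute limit above can be checked with a small shell helper. The following is a minimal sketch, not part of DataWorks or alisatasknode: clock_skew_ok is a hypothetical function name, and the reference timestamp is assumed to come from a time source that you trust.

```shell
#!/bin/sh
# clock_skew_ok LOCAL_EPOCH REFERENCE_EPOCH [LIMIT_SECONDS]
# Hypothetical helper: succeeds when the two epoch timestamps differ
# by at most LIMIT_SECONDS (default 120 seconds, that is, 2 minutes).
clock_skew_ok() {
    local_ts=$1
    ref_ts=$2
    limit=${3:-120}
    diff=$((local_ts - ref_ts))
    # Take the absolute value of the difference.
    if [ "$diff" -lt 0 ]; then
        diff=$((0 - diff))
    fi
    [ "$diff" -le "$limit" ]
}

# Example call (reference_epoch is a placeholder that you must supply):
#   clock_skew_ok "$(date +%s)" "$reference_epoch" || echo "Clock skew exceeds 2 minutes"
```

The helper only compares two numbers. How you obtain the reference time depends on your environment and is outside the scope of this sketch.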

Purchase an ECS instance

Note
  • CentOS V6, CentOS V7, or AliOS is recommended.
  • If the added ECS instance needs to run MaxCompute nodes or sync nodes, verify that the current Python version of the ECS instance is V2.6 or V2.7. (The Python version of CentOS V5 is V2.4 while those of other operating systems are later than V2.6.)
  • Make sure that the ECS instance can access the Internet. For example, run the ping www.alibabacloud.com command on the ECS instance and check whether the host responds.
  • We recommend that you configure the ECS instance with an 8-core CPU and 16 GB memory.
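
The Python version requirement above can be verified from the command line. The following is a minimal sketch: python_version_ok is a hypothetical helper name, and the example call assumes that the python command on the ECS instance points to the interpreter that the nodes will use.

```shell
#!/bin/sh
# python_version_ok VERSION
# Hypothetical helper: succeeds only for Python V2.6 or V2.7
# (for example, "2.6", "2.6.6", or "2.7.5").
python_version_ok() {
    case "$1" in
        2.6|2.6.*|2.7|2.7.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example call. Python 2 prints its version to stderr, hence 2>&1:
#   ver=$(python -V 2>&1 | awk '{print $2}')
#   python_version_ok "$ver" || echo "Unsupported Python version: $ver"
```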

View the hostname and internal IP address of the ECS instance

Log on to the ECS console. In the left-side navigation pane, choose Instances & Images > Instances. On the page that appears, view the hostname and IP address of the purchased ECS instance.

Enable port 8000 for reading logs

Note You do not need to enable port 8000 if your ECS instance is in a VPC. Steps in this section apply to ECS instances on classic networks only.
  1. Add a security group rule.

    Log on to the ECS console. In the left-side navigation pane, choose Network & Security > Security Groups. On the page that appears, find the target security group and click Add Rules in the Actions column.

  2. On the Security Group Rules > Inbound page that appears, click Add Security Group Rule in the upper-right corner.
  3. In the Add Security Group Rule dialog box that appears, set the parameters. Set Authorization Object to the fixed IP address of Data Integration and Port Range to 8000/8000.

Add a custom resource group

  1. In the left-side navigation pane, click Custom Resource Group. The Custom Resource Group page appears.
  2. Click Add Resource Group in the upper-right corner.
    Note By default, the Custom Resource Group page lists only your custom resource groups, but not the default resource group.
  3. In the Add Resource Group dialog box that appears, set Resource Group Name and click Next.
  4. In the Add Server dialog box that appears, set the parameters and click Next.
    Parameter: Description
    Network Type: The network type of the ECS instance. Currently, the classic network type is supported only for ECS instances in the China (Shanghai) region. You can only set this parameter to VPC for ECS instances in other regions.
    Server Name or ECS UUID:
    • The hostname of the ECS instance when Network Type is set to Classic Network. To obtain the hostname, log on to the ECS instance and run the hostname command.
    • The UUID of the ECS instance when Network Type is set to VPC. To obtain the UUID, log on to the ECS instance and run the dmidecode | grep UUID command.
    Server IP Address: The internal IP address of the ECS instance.
    Server CPU (Cores): The number of CPU cores on the ECS instance. We recommend that you configure at least four CPU cores for an ECS instance hosting a custom resource group.
    Server RAM (GB): The memory of the ECS instance. We recommend that you configure at least 8 GB RAM and 80 GB disk space for an ECS instance hosting a custom resource group.
    Note To add an ECS instance that resides in a VPC, enter the UUID of the ECS instance. To obtain the UUID, log on to the ECS instance and run the dmidecode | grep UUID command.

    For example, if the dmidecode | grep UUID command returns UUID: 713F4718-8446-4433-A8EC-6B5B62D7****, the UUID is 713F4718-8446-4433-A8EC-6B5B62D7****.

  5. Install and initialize the Agent.
    If an ECS instance is newly purchased and has never been used, follow these steps:
    1. Log on to the ECS instance through Secure Shell (SSH) as the root user.
    2. Run the following commands:
      chown admin:admin /opt/taobao  # Grant the admin user the permission to access the /opt/taobao directory.
      wget https://alisaproxy.shuju.aliyun.com/install.sh --no-check-certificate
      sh install.sh --user_name=*****19d --password=****h1bm --enable_uuid=false
    3. Wait for a while, click Refresh in the Add Server dialog box, and check whether the service status changes to Available.
    4. Enable port 8000 on the ECS instance.
      Note If an error occurs when you run the install.sh script or you have to run it again, run the rm -rf install.sh command in the same directory as the install.sh script to delete the downloaded file. Then, download and run the install.sh script again. The commands that need to be run during the installation and initialization process differ for each user. Run the relevant commands according to the instructions on the initialization interface.
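
For the Server Name or ECS UUID parameter in step 4, the UUID value can be separated from the rest of the dmidecode output with a small filter. The following is a minimal sketch: extract_uuid is a hypothetical helper name, and the input is assumed to have the "UUID: value" format shown in the example above.

```shell
#!/bin/sh
# extract_uuid
# Hypothetical helper: reads dmidecode output from stdin and prints only
# the value of the first "UUID:" line, without the label or indentation.
extract_uuid() {
    sed -n 's/^[[:space:]]*UUID:[[:space:]]*//p' | head -n 1
}

# Example call (dmidecode typically requires root privileges):
#   dmidecode | extract_uuid
```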
If the service status remains Stopped after the preceding steps, the hostname may not be bound to an IP address, as shown in the following figure.
Figure: Stopped
To bind the hostname to an IP address, follow these steps:
  1. Switch to the admin user.
  2. Run the hostname -i command to view the hostname binding information.
  3. Run the vim /etc/hosts command to open the hosts file, and add the binding of the IP address and hostname.
  4. Refresh the service status and check whether the ECS instance is registered.
Note
  • If the ECS instance is still in the Stopped status after you refresh the page, you can restart alisatasknode.
    Switch to the admin user and run the following command:
    /home/admin/alisatasknode/target/alisatasknode/bin/serverctl restart
  • You need to enter your AccessKey when running this command. Keep the AccessKey information secure.
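
Before you refresh the service status, you can confirm that the hostname binding is present in the hosts file. The following is a minimal sketch: hosts_has_binding is a hypothetical helper name, and the IP address in the example call is a placeholder for the internal IP address of your ECS instance.

```shell
#!/bin/sh
# hosts_has_binding FILE IP NAME
# Hypothetical helper: succeeds if a non-comment line in FILE starts with
# IP and lists NAME as one of its hostnames.
hosts_has_binding() {
    awk -v ip="$2" -v name="$3" '
        $1 ~ /^#/ { next }              # Skip comment lines.
        $1 == ip {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break    # Ignore trailing comments.
                if ($i == name) found = 1
            }
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Example call (192.168.0.10 is a placeholder IP address):
#   hosts_has_binding /etc/hosts 192.168.0.10 "$(hostname)" || echo "Binding missing"
```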

Select the custom resource group for a sync node

On the configuration tab of a sync node, select the custom resource group for Resource Group in the Channel section.
Figure: Configuring a channel