This topic describes how to connect the DataWorks metadata service and your exclusive resource group to the Virtual Private Cloud (VPC) where an AnalyticDB for PostgreSQL instance resides, and how to create a crawler to collect metadata.

Background information

If an AnalyticDB for PostgreSQL instance is deployed in a VPC and you want to use the instance as a data store in the DataWorks console, you must connect DataWorks to the VPC where the instance is deployed.

You can configure a whitelist in the AnalyticDB for PostgreSQL console to allow DataWorks to access the VPC where the AnalyticDB for PostgreSQL instance resides. For more information about how to configure a whitelist, see Configure a whitelist.

You must add the addresses of the following two objects to the whitelist:
  • DataWorks metadata service: This service is used to collect metadata from the AnalyticDB for PostgreSQL instance. You can configure a crawler to collect the metadata of the AnalyticDB for PostgreSQL instance to DataWorks. After the metadata is collected, you can manage tables in the AnalyticDB for PostgreSQL instance on the Data Map and DataStudio pages.
  • Exclusive resource group: An exclusive resource group consists of dedicated Elastic Compute Service (ECS) instances that are used for running nodes of DataWorks. To access the AnalyticDB for PostgreSQL instance, you must connect the ECS instances to the VPC where the AnalyticDB for PostgreSQL instance resides.

Currently, Data Map can collect metadata only from AnalyticDB for PostgreSQL instances in the following regions. Add the Classless Inter-Domain Routing (CIDR) blocks and IP addresses of the DataWorks service for the region of your instance to the whitelist of the AnalyticDB for PostgreSQL instance.

| Region | Whitelist |
| --- | --- |
| China (Shanghai) | 100.104.189.64/26,11.115.110.10,11.115.110.28,11.115.109.9,47.102.181.128/26,47.102.181.192/26,47.102.234.0/26,47.102.234.64/26 |
| China (Hangzhou) | 100.104.135.128/26,11.193.215.233,11.194.73.32,118.31.243.0/26,118.31.243.64/26,118.31.243.128/26,118.31.243.192/26 |
| China (Shenzhen) | 100.104.46.128/26,11.192.91.119,11.192.91.123,120.77.195.128/26,120.77.195.192/26,120.77.195.64/26,47.112.86.0/26 |
| China (Beijing) | 100.104.37.128/26,11.193.82.20,11.197.254.171,39.107.223.0/26,39.107.223.64/26,39.107.223.128/26,39.107.223.192/26 |
| China (Chengdu) | 100.104.88.64/26,11.195.57.28,11.195.57.27,47.108.46.0/26,47.108.46.64/26,47.108.46.128/26,47.108.46.192/26 |
| China (Zhangjiakou) | 100.104.197.0/26,11.193.236.121,11.193.236.120,47.92.185.0/26,47.92.185.64/26,47.92.185.128/26,47.92.185.192/26 |
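
Before you add a region's entries to the whitelist, it can help to check that each entry parses as a valid IP address or CIDR block and to see whether a given source address is covered. The following is a minimal Python sketch that uses only the standard `ipaddress` module. The function names `parse_whitelist` and `is_allowed` are illustrative and are not part of any DataWorks or AnalyticDB for PostgreSQL API. The sample string is the China (Shanghai) entry from the table above.

```python
import ipaddress

# China (Shanghai) whitelist copied from the table above.
SHANGHAI = (
    "100.104.189.64/26,11.115.110.10,11.115.110.28,11.115.109.9,"
    "47.102.181.128/26,47.102.181.192/26,47.102.234.0/26,47.102.234.64/26"
)


def parse_whitelist(entries: str) -> list[ipaddress.IPv4Network]:
    """Parse a comma-separated whitelist into network objects.

    A bare IP address is treated as a /32 network. A ValueError is raised
    for a malformed entry or for a CIDR block that has host bits set.
    """
    return [
        ipaddress.IPv4Network(item.strip())
        for item in entries.split(",")
        if item.strip()
    ]


def is_allowed(address: str, networks: list[ipaddress.IPv4Network]) -> bool:
    """Return True if the address falls inside any whitelisted network."""
    ip = ipaddress.IPv4Address(address)
    return any(ip in net for net in networks)


if __name__ == "__main__":
    nets = parse_whitelist(SHANGHAI)
    print(len(nets), "entries parsed")
    print(is_allowed("47.102.181.130", nets))
    print(is_allowed("47.102.182.1", nets))
```

The check runs entirely on your own machine. It does not confirm what is currently configured on the instance, only that the text you are about to paste is well formed.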

Create a crawler

  1. Go to the Data Discovery page.
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces. The Workspaces page appears.
    3. Find the target workspace and click Data Analytics in the Actions column.
    4. On the DataStudio page, click the icon in the upper-left corner and choose All Products > Data Map. The Data Map page appears.
    5. Click Data Discovery in the top navigation bar.
  2. In the left-side navigation pane, click AnalyticDB for PostgreSQL.
  3. On the AnalyticDB for PostgreSQL Metadata Crawler page that appears, click Create Crawler.
  4. In the Create Crawler dialog box that appears, follow these steps:
    1. In the Basic Information step, set basic parameters.
      Basic Information
      | Parameter | Description |
      | --- | --- |
      | Crawler Name | Required. The name of the crawler. You must specify a unique name. |
      | Crawler Description | The description of the crawler. |
      | Workspace | The workspace where the metadata collected from the specific data store will be used. |
      | Connect To | The type of the data store from which metadata will be collected. The default value is AnalyticDB for PostgreSQL and cannot be changed. |
    2. Click Next.
    3. In the Select object type step, select a connection from the Connection drop-down list.
      If the required connection does not exist, click Go to New to go to the Data Source page in Workspace Management and create the connection. For more information, see Configure an AnalyticDB for PostgreSQL Connection.
    4. Click Test Crawler Connectivity. If the message The test was successful appears, the DataWorks metadata service can access the AnalyticDB for PostgreSQL instance.
    5. Click Next.
    6. In the Configure Execution Plan step, set scheduling parameters.
      The valid values of Execution Plan are On-demand Execution, Monthly, Weekly, Daily, and Hourly.
    7. Click Next.
    8. In the Confirm Information step, verify that the configuration of the crawler is correct and click Confirm.
  5. On the AnalyticDB for PostgreSQL Metadata Crawler page, find the created crawler and click Run in the Actions column.
    After the crawler is run, click the number in the Last run update table or Last run Add table column to view the collected metadata.
    Notice: The Run button appears only in the Actions column of a crawler that must be triggered manually.
    You can also perform the following operations on the page:
    • Click Details in the Actions column of a crawler. In the Crawler Details dialog box that appears, view the detailed information about the crawler.
    • Click Edit in the Actions column of a crawler. In the Edit Crawler dialog box that appears, modify the configuration of the crawler.
    • Click Delete in the Actions column of a crawler. In the Confirm dialog box that appears, click OK to delete the crawler.
    • Click Stop in the Actions column of a running crawler to stop the crawler.

Add the information about your exclusive resource group to the whitelist of the AnalyticDB for PostgreSQL instance

Note: When you purchase an exclusive resource group, select the region where the AnalyticDB for PostgreSQL instance resides.

When you bind the exclusive resource group to a VPC, select the VPC where the AnalyticDB for PostgreSQL instance resides and the VSwitch used by the instance.

To allow your exclusive resource group to access the AnalyticDB for PostgreSQL instance, you must add the following objects to the whitelist of the AnalyticDB for PostgreSQL instance:
  • CIDR block of the exclusive resource group
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Resource Groups.
    3. On the Exclusive Resource Groups tab, find the target resource group and click View Information in the Actions column.
    4. In the dialog box that appears, click the value of the CIDR Blocks parameter. The value is automatically copied. Add the value to the whitelist of the AnalyticDB for PostgreSQL instance.
  • Elastic IP address (EIP) of the exclusive resource group

    On the Exclusive Resource Groups tab, find the target resource group and click View Information in the Actions column. In the dialog box that appears, click the value of the EIPAddress parameter. The value is automatically copied. Add the value to the whitelist of the AnalyticDB for PostgreSQL instance.

  • VSwitch CIDR block of the VPC to which the exclusive resource group is bound

    You need to bind the exclusive resource group to a VPC. For more information, see Manage exclusive resource groups.

    After you bind the exclusive resource group to a VPC, go to the Resource Groups page in the DataWorks console. On the Exclusive Resource Groups tab, find the target resource group and click Add VPC Binding in the Actions column. On the page that appears, obtain the VSwitch CIDR block in the Switch CIDR block column.
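
The three values collected above end up in a single whitelist on the AnalyticDB for PostgreSQL instance. As a minimal sketch, assuming the whitelist accepts a comma-separated list of IP addresses and CIDR blocks (the format used in the region table earlier in this topic), the following Python snippet validates the values and joins them. The function name `build_whitelist` and all addresses are hypothetical placeholders. Replace them with the values that you copy from the DataWorks console.

```python
import ipaddress


def build_whitelist(*entries: str) -> str:
    """Validate entries and join them into a comma-separated whitelist.

    Each entry may be a single IP address or a CIDR block. Duplicates are
    dropped while the original order is kept. A ValueError is raised for
    an entry that is not a valid address or network.
    """
    seen = []
    for entry in entries:
        # strict=True rejects CIDR blocks with host bits set, e.g. 10.0.0.5/24.
        network = ipaddress.ip_network(entry.strip(), strict=True)
        # Render a single host without the prefix suffix, as in the region table.
        if network.num_addresses == 1:
            text = str(network.network_address)
        else:
            text = str(network)
        if text not in seen:
            seen.append(text)
    return ",".join(seen)


if __name__ == "__main__":
    # Hypothetical placeholder values, not real DataWorks addresses.
    resource_group_cidr = "192.0.2.0/26"   # value of the CIDR Blocks parameter
    resource_group_eip = "203.0.113.10"    # value of the EIPAddress parameter
    vswitch_cidr = "172.16.0.0/24"         # value from the Switch CIDR block column
    print(build_whitelist(resource_group_cidr, resource_group_eip, vswitch_cidr))
```

Validating before you paste catches copy errors such as a truncated octet or a prefix that does not match the network address.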

Test the connectivity between the exclusive resource group and the AnalyticDB for PostgreSQL instance

  1. Log on to the DataWorks console. In the left-side navigation pane, click Workspaces. The Workspaces page appears.
  2. Find the target workspace and click Data Analytics in the Actions column.
  3. On the DataStudio page that appears, click the Workspace Management icon in the upper-right corner. The Workspace Management page appears.
  4. In the Compute Engine section, click the AnalyticDB for PostgreSQL tab.
  5. Click Add instance. In the Add an AnalyticDB for PostgreSQL instance dialog box that appears, set the parameters as required.
    If you have added an instance, click Edit to modify the instance configuration.
  6. Verify that the parameter configuration is correct in the Basic information section. Select a resource group from the drop-down list in the Test section and click Test connectivity.
    If the message The connectivity test has passed appears, the exclusive resource group can access the AnalyticDB for PostgreSQL instance.
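
The Test connectivity button checks access from the exclusive resource group itself. If you also want a quick check from another machine in the same VPC, a plain TCP connection attempt is a rough approximation. The sketch below uses only the Python standard library. The function name `can_connect`, the host name, and the port are hypothetical placeholders. A successful TCP connection suggests only that the whitelist and network path allow the connection, not that the database credentials are valid.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical endpoint and port; use the values shown for your instance.
    endpoint = "your-instance.example.com"
    port = 5432
    print("reachable" if can_connect(endpoint, port, timeout=3.0) else "unreachable")
```

Because the source address of this check differs from that of the exclusive resource group, a failure here does not necessarily mean that the resource group is blocked. The address of the machine that runs the check must itself be in the whitelist.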