Create a node to synchronize schemas of MaxCompute tables with a few clicks - DataWorks

DataWorks allows you to create multiple Hologres external tables whose schemas are the same as those of the source MaxCompute tables at a time by creating and configuring a node on the DataStudio page. You can then use the created Hologres external tables to query data of the source MaxCompute tables in an accelerated manner. This topic describes how to create a node to synchronize schemas of MaxCompute tables with a few clicks.

Background information

Hologres is an all-in-one real-time data warehousing service developed by Alibaba Cloud and is seamlessly connected to MaxCompute at the underlying layer. You can create external tables in Hologres to query data of MaxCompute tables in an accelerated manner.

DataWorks allows you to synchronize the schemas of MaxCompute tables with a few clicks based on the IMPORT FOREIGN SCHEMA statement in a visualized manner.

Limits

You can use the created node to accelerate only data queries of MaxCompute internal tables. You cannot use the created node to accelerate data queries of MaxCompute external tables or views.

Go to the configuration tab of the node that you create to synchronize schemas of MaxCompute tables with a few clicks

Go to the DataStudio page.
1. Log on to the DataWorks console.
2. In the left-side navigation pane, click Workspaces.
3. In the top navigation bar, select the region where your workspace resides. Find your workspace and click DataStudio in the Actions column.
Create a workflow.
If you have an existing workflow, skip this step.
1. Move the pointer over the icon and select Create Workflow.
2. In the Create Workflow dialog box, configure the Workflow Name parameter.
3. Click Create.
Create a One-click MaxCompute table structure synchronization node.
1. Move the pointer over the icon and choose Create Node > Hologres > One-click MaxCompute table structure synchronization.
  You can also find the desired workflow, right-click the workflow, and then choose Create Node > Hologres > One-click MaxCompute table structure synchronization.
2. In the Create Node dialog box, configure the Name, Engine Instance, Node Type, and Path parameters.
3. Click Commit. The configuration tab of the node appears.

Configure the One-click MaxCompute table structure synchronization node.

Configure the node information.

On the configuration tab of the node, configure the Hologres connection information for the Hologres external tables, the source information of MaxCompute tables, and the policy that is used to handle conflicts that may occur when you create the Hologres external tables. Configure the node information

Configure the parameters in the Target Info section.

The parameters that you configure in this section determine the Hologres compute engine instance in which you want to store the Hologres external tables.


Parameter	Description
Target connection	The name of the Hologres compute engine instance.
Target Library	The name of the database in the Hologres compute engine instance.
Schema	The name of the schema in the database. Default value: public.

Configure the parameters in the Source MaxCompute Tables section.

The parameters that you configure in this section determine the source MaxCompute tables based on which you create Hologres external tables. DataWorks allows you to create external tables whose schemas are the same as those of the source tables based on the parameters that you configure in this section. You can then use the created external tables to query data of the source tables in an accelerated manner.


Parameter	Description
Type	The type of the source table based on which you create a Hologres external table. Only MaxCompute is supported.
Server List	The server where the source tables reside. You can use the odps_server server that is created at the underlying layer of Hologres. For more information, see postgres_fdw.
Source Project	The name of the project to which the source tables belong.
Select tables	The source tables based on which you create Hologres external tables. Entire Project: All tables in the selected project. Selected Tables: Specific tables in the selected project. If you select this option, you can search for tables by name. Note Fuzzy match is supported. After you enter a keyword, all tables whose names contain the keyword are displayed.

Configure the parameters in the Advanced Options section.

The parameters that you configure in this section determine the policy that is used to handle conflicts that may occur when you create the Hologres external tables.


Parameter	Description
Processing Rule for Table Name Conflicts	The policy that is used to handle the following conflict: The name of a Hologres external table that you want to create is the same as the name of an existing table in Hologres. Valid values: Ignore, continue to create the rest tables Update, modify the same name tables Alert, do not duplicately create
Unsupported data type	The policy that is used to handle the following conflict: The type of data in a Hologres external table that you want to create is not supported in Hologres. Valid values: Alert, import failure: If you select this option, the Hologres external table fails to be created. Ignore, skip unsupported columns: If you select this option, the system skips fields whose data types are not supported and continues to create the Hologres external table.

Save the node configurations and run the node.
1. In the top navigation bar of the configuration tab of the node, click the icon to save the node configurations.
2. In the top navigation bar of the configuration tab of the node, click the icon to create the Hologres external tables.

Note You must select an exclusive resource group for scheduling of DataWorks that is connected to the Hologres compute engine instance to run the node. For more information, see the Initialize the network environment step in the Associate a Hologres compute engine with a workspace section of the Associate a Hologres compute engine with a workspace topic.

What to do next

After the Hologres external tables are created, you can go to the Workspace Tables page in DataStudio to view the created external tables. For more information, see Manage tables. You can also run Hologres commands to query data of the source MaxCompute tables in an accelerated manner. For more information, see Create a foreign table in Hologres to accelerate queries of MaxCompute data.

Note You can use the created node to accelerate only data queries of MaxCompute internal tables. You cannot use the created node to accelerate data queries of MaxCompute external tables or views.