On the Workspace Management page, you can configure the basic properties, scheduling properties, and security settings of a workspace and bind compute engine instances to it. DataWorks supports a variety of compute engines, such as MaxCompute, E-MapReduce (EMR), Realtime Compute for Apache Flink, Hologres, Graph Compute, AnalyticDB for PostgreSQL, and AnalyticDB for MySQL.

Go to the Workspace Management page

  1. Log on to the DataWorks console.
  2. In the left-side navigation pane, click Workspaces.
  3. Go to the Workspace Management page of a workspace.
    You can use one of the following methods to go to the Workspace Management page:
    • On the Workspaces page, find the workspace that you want to configure and click Workspace Settings in the Actions column. In the Workspace Settings panel, click More. The Workspace Management page appears.
    • On the Workspaces page, find the workspace that you want to configure and click Data Analytics in the Actions column. On the DataStudio page, click the Workspace Management icon in the upper-right corner. The Workspace Management page appears.
  4. On the Workspace Management page, configure basic properties, scheduling properties, security settings, and compute engines for the workspace based on your business requirements.

Configure basic properties

Parameter Description
Workspace ID The ID of the workspace.
Workspace name The name of the workspace. The name must start with a letter and can contain only letters and digits. It is not case-sensitive. The name uniquely identifies the workspace and cannot be changed after the workspace is created.
Status The status of the workspace. Valid values: Normal, Deleted, Initializing, Initialization Failed, Manual Disable, Deleting, Deletion Failed, Suspended (Overdue), Updating, and Update Failed.
Note
  • If a workspace fails to be created, it enters the Initialization Failed state. In this case, you can create the workspace again.
  • A workspace in the Normal state can be disabled by the workspace administrator. After a workspace is disabled, all features of the workspace become unavailable. However, the data of the workspace is retained, and nodes that have already been committed continue to run as normal.
  • The workspace administrator can click Enable in the Actions column to restore a disabled workspace to the Normal state.
Display name The display name that is used to identify the workspace. The display name can contain only letters and digits. You can change it based on your requirements.
Creation date The time when the workspace was created. This value cannot be changed.
Mode The mode of the workspace. Valid values: Simple mode and Standard mode.
Note In this example, a workspace in standard mode is used.
The head of (Owner) The owner of the workspace, who has permissions to delete and disable the workspace. The owner cannot be changed.
Description The description of the workspace, which provides comments on the workspace. You can modify the description based on your requirements. The description can be a maximum of 128 characters in length and can contain letters, special characters, and digits.
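
The naming rules in the table above can be checked before you type values into the console. The following is a minimal sketch, not part of DataWorks: the function names are hypothetical, and it assumes that "letters" means ASCII letters, which the table does not state.

```python
import re

# Rules copied from the table above. Restricting "letters" to ASCII is an assumption.
WORKSPACE_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9]*")
DISPLAY_NAME_RE = re.compile(r"[A-Za-z0-9]+")
MAX_DESCRIPTION_LENGTH = 128


def is_valid_workspace_name(name: str) -> bool:
    """A workspace name starts with a letter and contains only letters and digits."""
    return WORKSPACE_NAME_RE.fullmatch(name) is not None


def same_workspace_name(a: str, b: str) -> bool:
    """Workspace names are not case-sensitive, so compare them case-folded."""
    return a.casefold() == b.casefold()


def is_valid_display_name(name: str) -> bool:
    """A display name contains only letters and digits."""
    return DISPLAY_NAME_RE.fullmatch(name) is not None


def is_valid_description(text: str) -> bool:
    """A description is at most 128 characters long."""
    return len(text) <= MAX_DESCRIPTION_LENGTH
```

Because the workspace name cannot be changed after creation, a check like this is most useful before the workspace is created.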

Configure scheduling properties

In the Scheduling properties section, you can enable periodic scheduling for the workspace. You can also specify Default scheduling Resource Group, Default data integration resource group, Default error automatic re-run times, and Default error automatic rerun interval for the workspace.

Nodes can be periodically run in a workspace only after you turn on Enable periodic scheduling for the workspace.
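
To make the two rerun settings concrete, the sketch below shows the general meaning of a rerun count and a rerun interval. It is an illustration only and does not describe how the DataWorks scheduler is implemented; the function and parameter names are hypothetical, and it assumes that the rerun count is the number of extra attempts after the first failure.

```python
import time
from typing import Callable


def run_with_reruns(task: Callable[[], None],
                    rerun_times: int,
                    rerun_interval_seconds: float,
                    sleep: Callable[[float], None] = time.sleep) -> int:
    """Run task once, then rerun it up to rerun_times more times on failure.

    Returns the total number of attempts. If every attempt fails,
    the error from the last attempt is re-raised.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            task()
            return attempts
        except Exception:
            # Stop once the first run plus all allowed reruns have failed.
            if attempts > rerun_times:
                raise
            # Wait for the configured interval before the next rerun.
            sleep(rerun_interval_seconds)
```

The sleep function is passed in as a parameter only so that the waiting behavior can be replaced in tests.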

Configure security settings

Parameter Description
Allow download of select results Specifies whether the query results that are returned by SELECT statements in DataStudio can be downloaded. If you turn off this switch, the query results cannot be downloaded.
Allow sub-accounts to change their own node owners Specifies whether to allow RAM users to change the owners of their nodes.
Sandbox whitelist (configure IP addresses or domain names that Shell tasks can access) The IP addresses or domain names that can be accessed by a Shell node that runs on the default resource group.
Note You must specify public IP addresses or domain names that can be accessed. For internal services in your enterprise, we recommend that you use exclusive resource groups to ensure network accessibility. For more information, see Exclusive resource group mode.

To add an IP address or domain name to the whitelist, perform the following steps:

  1. In the Security Settings section, click Add sandbox whitelist.
  2. In the Add sandbox whitelist dialog box, enter an IP address or domain name in the Address field and a port number in the Port field.
  3. Click Confirm.
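
Because the whitelist accepts only public IP addresses or domain names, it can help to check an entry before you add it. The following is a minimal sketch that uses only the Python standard library; the function names and the domain pattern are assumptions for illustration, and the console may apply different or stricter rules.

```python
import ipaddress
import re

# Conservative domain pattern (an assumption): dot-separated labels of letters,
# digits, and hyphens, ending in an alphabetic top-level label.
_LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
_DOMAIN_RE = re.compile(rf"(?:{_LABEL}\.)+[A-Za-z]{{2,63}}")


def classify_address(address: str) -> str:
    """Return 'public-ip', 'non-public-ip', 'domain', or 'invalid'."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        pass  # Not an IP literal; fall through to the domain check.
    else:
        return "public-ip" if ip.is_global else "non-public-ip"
    if len(address) <= 253 and _DOMAIN_RE.fullmatch(address):
        return "domain"
    return "invalid"


def is_valid_port(port: int) -> bool:
    """TCP port numbers range from 1 to 65535."""
    return 1 <= port <= 65535
```

An address classified as non-public-ip is a candidate for an exclusive resource group rather than the sandbox whitelist.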

Bind a MaxCompute compute engine instance

  1. In the Compute Engine information section, click the MaxCompute tab. On this tab, you can view the information about all available MaxCompute compute engine instances in the workspace.
  2. Click Add instances.
  3. In the Add a MaxCompute instance dialog box, configure the parameters.
    Parameter or section Description
    Instance display name The display name of the compute engine instance. The display name can be a maximum of 27 characters in length. It must start with a letter and can contain letters, underscores (_), and digits.
    Region The region of the workspace.
    Payment mode The billing method of the compute engine instance. Valid values: Pay-as-you-go, Monthly package, and Developer version.
    Note An instance of the developer version cannot be bound to a workspace in standard mode.
    Quota group The quotas of computing resources and disk space for the compute engine instance.
    Production Environment The parameters in this section include Project name and Access identity.
    • Project name: the name of the MaxCompute project that serves as the production environment at the underlying layer of the DataWorks workspace.
    • Access identity: the type of the account used to access the MaxCompute project. Valid values: Alibaba Cloud primary account and Alibaba Cloud sub-account.
    Development Environment The parameters in this section include Project name and Access identity.
    • Project name: the name of the MaxCompute project that serves as the development environment at the underlying layer of the DataWorks workspace.
      Note This MaxCompute project provides computing and storage resources.
    • Access identity: the type of the account used to access the MaxCompute project. The default value of this parameter is Task owner and cannot be changed.
  4. Click Confirm.
    After the compute engine instance is added, you can set it as the default instance.
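
Two of the constraints in the dialog box above lend themselves to a quick pre-check: the display name rule and the restriction on the developer version. The following is a minimal sketch with hypothetical function names; the string values mirror the labels used in this topic and are assumptions about how you might represent them in your own scripts.

```python
import re

# At most 27 characters: one leading letter plus up to 26 letters, digits, or underscores.
INSTANCE_DISPLAY_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,26}")


def is_valid_instance_display_name(name: str) -> bool:
    """Check the display name rule for a MaxCompute compute engine instance."""
    return INSTANCE_DISPLAY_NAME_RE.fullmatch(name) is not None


def can_bind(payment_mode: str, workspace_mode: str) -> bool:
    """A developer version instance cannot be bound to a standard mode workspace."""
    return not (payment_mode == "Developer version"
                and workspace_mode == "Standard mode")
```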

Bind an EMR compute engine instance

  1. In the Compute Engine information section, click the E-MapReduce tab. On this tab, you can view the information about all available EMR compute engine instances in the workspace.
  2. Click Add instances.
  3. In the New EMR cluster dialog box, configure the parameters.
    Parameter Description
    Instance display name The display name of the EMR cluster that you want to bind.
    Region The region of the workspace, which cannot be modified.
    Access Mode The access mode of the EMR cluster. Valid values: Shortcut mode and Security mode.
    Note In this example, an EMR cluster in shortcut mode is bound.
    Scheduling access identity The identity that is used to commit the code of an EMR node in the production environment to the EMR cluster after the node is committed to the scheduling system of DataWorks. Valid values: Alibaba Cloud primary account and Alibaba Cloud sub-account.
    Note If you select Alibaba Cloud sub-account, you must specify a RAM user to which the AliyunEMRDevelopAccess policy is attached.
    Access identity The identity that is used to commit the code of an EMR node in the development environment to the EMR cluster. Default value: Task owner.
    Note This parameter is available only for workspaces in standard mode.
    Cluster ID The ID of the EMR cluster. Select an ID from the drop-down list. The selected EMR cluster is used as the runtime environment of EMR nodes.
    Project ID The ID of the EMR project that you want to bind. Select an ID from the drop-down list. The selected EMR project is used as the runtime environment of EMR nodes.
    Note If Access Mode is set to Security mode, EMR projects cannot be selected.
    YARN resource queue The name of the resource queue in the EMR cluster. Unless otherwise specified, set the parameter to default.
    Endpoint The endpoint of EMR, which cannot be modified.
  4. Click Confirm.
    After the compute engine instance is added, you can set it as the default instance and modify the instance configuration based on your requirements.
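
The notes in the table above describe dependencies between parameters. The sketch below collects them into one check so that a planned binding can be reviewed before it is entered. The EmrBinding class, its field names, and the check function are hypothetical and do not correspond to a DataWorks API.

```python
from dataclasses import dataclass, field
from typing import List, Optional

ACCESS_MODES = ("Shortcut mode", "Security mode")


@dataclass
class EmrBinding:
    """Hypothetical record of the values entered in the New EMR cluster dialog box."""
    access_mode: str
    scheduling_access_identity: str
    cluster_id: str
    project_id: Optional[str] = None
    yarn_resource_queue: str = "default"
    # Policies attached to the RAM user, relevant only for the sub-account identity.
    ram_user_policies: List[str] = field(default_factory=list)


def check_emr_binding(binding: EmrBinding) -> List[str]:
    """Return a list of problems; an empty list means no rule is violated."""
    problems = []
    if binding.access_mode not in ACCESS_MODES:
        problems.append("unknown access mode")
    # EMR projects cannot be selected when Access Mode is Security mode.
    if binding.access_mode == "Security mode" and binding.project_id is not None:
        problems.append("project_id cannot be set in Security mode")
    # A sub-account must have the AliyunEMRDevelopAccess policy attached.
    if (binding.scheduling_access_identity == "Alibaba Cloud sub-account"
            and "AliyunEMRDevelopAccess" not in binding.ram_user_policies):
        problems.append("RAM user needs the AliyunEMRDevelopAccess policy")
    return problems
```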

Bind a Realtime Compute for Apache Flink compute engine instance

  1. In the Compute Engine information section, click the Real-time computing tab. On this tab, you can view the information about all available Realtime Compute for Apache Flink compute engine instances in the workspace.
  2. Click Add instances.
  3. In the Add a real-time computing instance dialog box, configure the parameters.
    Parameter Description
    Instance display name The display name of the Realtime Compute for Apache Flink compute engine instance.
    Region The region of the workspace.
    Select Project The Realtime Compute for Apache Flink project that you want to bind to the workspace. Select a project from the drop-down list. If you need to create a project, click Real-time calculation control platform.
  4. Click Confirm.
    After the compute engine instance is added, you can set it as the default instance and modify the instance configuration based on your requirements.

Bind a Hologres compute engine instance

  1. In the Compute Engine information section, click the Hologres tab. On this tab, you can view the information about all available Hologres compute engine instances in the workspace.
  2. Click Binding HologresDB.
  3. In the Binding HologresDB dialog box, configure the parameters.
    Parameter Description
    Instance display name The display name of the Hologres compute engine instance.
    Access identity The identity used to run the code of committed Hologres nodes. Valid values: Alibaba Cloud primary account and Alibaba Cloud sub-account.
    Hologres instance name The name of the Hologres instance that you want to bind to the workspace.
    Database name The name of the database that was created in SQL Console, such as testdb.
    Server The endpoint of the purchased Hologres instance. The value is automatically generated after you select the Hologres instance.
    Port The port of the purchased Hologres instance. The value is automatically generated after you select the Hologres instance.
  4. Click Test connectivity.
  5. After the connectivity test is passed, click Confirm.
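
If the connectivity test fails, a basic TCP reachability check against the Server and Port values can help narrow down the cause. The following is a minimal sketch that uses only the Python standard library. It is not what the Test connectivity button does: it checks reachability from the machine where the script runs, not from DataWorks, and it does not verify credentials or the database name.

```python
import socket


def can_reach(server: str, port: int, timeout_seconds: float = 3.0) -> bool:
    """Return True if a TCP connection to server:port can be opened."""
    try:
        # create_connection resolves the host name and opens the connection.
        with socket.create_connection((server, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```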

Bind a Graph Compute compute engine instance

  1. In the Compute Engine information section, click the GraphCompute tab.
  2. Click Bind a GraphCompute instance.
    Notice A Graph Compute instance can be bound to only one DataWorks workspace. After a Graph Compute instance is bound to a DataWorks workspace, the instance cannot be used in other DataWorks workspaces.
  3. In the Bind a GraphCompute instance dialog box, configure the parameters.
    Parameter or button Description
    Instance display name The display name of the compute engine instance.
    GraphCompute instance name The name of the Graph Compute instance that you want to bind to the workspace as the compute engine instance.
    Create an instance If you do not have a Graph Compute instance, click Create an instance to purchase a Graph Compute instance.
    Notice By default, each Alibaba Cloud account can purchase only one Graph Compute instance.
  4. Click Binding.

Bind an AnalyticDB for PostgreSQL compute engine instance

Notice
  • You can use the AnalyticDB for PostgreSQL compute engine only in DataWorks Standard Edition or a more advanced edition. Therefore, the AnalyticDB for PostgreSQL tab is available only in DataWorks Standard Edition or a more advanced edition.
  • AnalyticDB for PostgreSQL nodes can run only on exclusive resource groups for scheduling.
  1. In the Compute Engine information section, click the AnalyticDB for PostgreSQL tab.
  2. Click Add instances.
    For a workspace in standard mode, the development environment is isolated from the production environment. If you are using a workspace in standard mode, you must add instances to both the development environment and the production environment.
  3. In the Add an AnalyticDB for PostgreSQL instance dialog box, configure the parameters. In this example, the workspace is in standard mode.
    Parameter Description
    Instance display name The display name of the compute engine instance, which must be unique.
    InstanceName The name of the AnalyticDB for PostgreSQL instance that you want to bind to the workspace as the compute engine instance.
    DatabaseName The name of the database that you want to bind to the workspace in the AnalyticDB for PostgreSQL instance.
    Username The username that you can use to connect to the database. You can obtain the information from the Account Management page in the AnalyticDB for PostgreSQL console. For more information, see Configure an account.
    Password The password that you can use to connect to the database. You can obtain the information from the Account Management page in the AnalyticDB for PostgreSQL console. For more information, see Configure an account.
    Connectivity Test AnalyticDB for PostgreSQL nodes must be run on exclusive resource groups for scheduling. Therefore, you must select an exclusive resource group for scheduling. For more information, see Exclusive resource group mode.

    Click Test connectivity to test the connectivity between the specified exclusive resource group for scheduling and AnalyticDB for PostgreSQL instance. If no exclusive resource group for scheduling exists, click Create a new exclusive Resource Group to create one.

  4. After the connectivity test is passed, click Confirm.
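
Because a workspace in standard mode needs an instance in both the development environment and the production environment, a small check can confirm that neither binding was missed. The following is a minimal sketch; the function name, the environment keys, and the instance names in the test values are hypothetical.

```python
from typing import Dict, List

# Environments that must each have an instance in a standard mode workspace.
STANDARD_MODE_ENVIRONMENTS = ("development", "production")


def missing_environments(bound_instances: Dict[str, str]) -> List[str]:
    """Return the environments that have no bound instance.

    bound_instances maps an environment name to the bound instance name.
    """
    return [env for env in STANDARD_MODE_ENVIRONMENTS
            if not bound_instances.get(env)]
```

The same check applies to any compute engine whose binding steps require separate instances for the two environments.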

Bind an AnalyticDB for MySQL compute engine instance

Notice
  • You can use the AnalyticDB for MySQL compute engine only in DataWorks Standard Edition or a more advanced edition. Therefore, the AnalyticDB for MySQL tab is available only in DataWorks Standard Edition or a more advanced edition.
  • AnalyticDB for MySQL nodes can run only on exclusive resource groups for scheduling.
  1. In the Compute Engine information section, click the AnalyticDB for MySQL tab.
  2. Click Add instances.
    For a workspace in standard mode, the development environment is isolated from the production environment. If you are using a workspace in standard mode, you must add instances to both the development environment and the production environment.
  3. In the Add an AnalyticDB for MySQL instance dialog box, configure the parameters. In this example, the workspace is in standard mode.
    Parameter Description
    Instance display name The display name of the compute engine instance, which must be unique.
    InstanceName The name of the AnalyticDB for MySQL cluster that you want to add as the compute engine instance.
    DatabaseName The name of the database that you want to bind to the DataWorks workspace in the AnalyticDB for MySQL cluster.
    Username The username that you can use to connect to the database. You can obtain the information from the Accounts page in the Cloud Native Data Warehouse Console. For more information, see Database accounts and permissions.
    Password The password that you can use to connect to the database. You can obtain the information from the Accounts page in the Cloud Native Data Warehouse Console. For more information, see Database accounts and permissions.
    Connectivity Test AnalyticDB for MySQL nodes must be run on exclusive resource groups for scheduling. Therefore, you must select an exclusive resource group for scheduling. For more information, see Exclusive resource group mode.

    Click Test connectivity to test the connectivity between the specified exclusive resource group for scheduling and AnalyticDB for MySQL cluster. If no exclusive resource group for scheduling exists, click Create a new exclusive Resource Group to create one.

  4. After the connectivity test is passed, click Confirm.