On the Workspaces page in the DataWorks console, you can view all the workspaces within your account and perform relevant operations. For example, you can create, configure, delete, enable, and disable workspaces, and refresh the workspace list.

Go to the Workspaces page

  1. Log on to the DataWorks console by using your Alibaba Cloud account. The Overview page appears.
  2. In the left-side navigation pane, click Workspaces. On the Workspaces page, you can view all the workspaces within your Alibaba Cloud account. You can also list your workspaces programmatically, as shown in the sketch below.
    • Status: This column displays the status of each workspace. The status of a workspace may be Normal, Initializing, Initialization Failed, Deleting, or Deleted. After you create a workspace, the workspace enters the Initializing state first. Then, it enters the Initialization Failed or Normal state based on the initialization result.

      After you disable a workspace, you can enable it again or delete it. The workspace enters the Normal state after you enable it.

    • Service: You can move the pointer over an icon in this column to view the service that it represents. If a service is available, its icon is blue. If a service has an overdue payment, its icon is red and carries an overdue payment mark. If a service is overdue and has been deleted, its icon is dimmed. In most cases, a service is automatically deleted if you do not renew it within seven days after the payment becomes overdue.
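If you manage many workspaces, you can also retrieve the workspace list programmatically instead of through the console. The following Python sketch uses the Alibaba Cloud SDK core package (aliyun-python-sdk-core). It assumes the dataworks-public endpoint for the China (Shanghai) region, API version 2020-05-18, and the ListProjects action; verify these values against the current DataWorks API reference, and replace the placeholder AccessKey pair with your own credentials.

```python
# A minimal sketch for listing DataWorks workspaces through the OpenAPI.
# Assumptions: the regional endpoint, API version 2020-05-18, and the
# ListProjects action; verify them against the DataWorks API reference.
from aliyunsdkcore.client import AcsClient
from aliyunsdkcore.request import CommonRequest

# Replace the placeholders with your own AccessKey pair and region.
client = AcsClient('<access_key_id>', '<access_key_secret>', 'cn-shanghai')

request = CommonRequest()
request.set_domain('dataworks.cn-shanghai.aliyuncs.com')  # assumed regional endpoint
request.set_version('2020-05-18')
request.set_action_name('ListProjects')
request.set_method('POST')
request.add_query_param('PageNumber', 1)
request.add_query_param('PageSize', 10)

# The response is a JSON string that includes the workspaces and their status.
response = client.do_action_with_exception(request)
print(response.decode('utf-8'))
```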

Create a workspace

  1. On the Workspaces page, move the pointer over the region in the top navigation bar and select a region where you want to create a workspace.
  2. Click Create Workspace. The Basic Settings step of the Create Workspace wizard appears. Configure the parameters and click Next.
    Basic Information
    • Workspace Name: The name of the workspace. The name must be 3 to 23 characters in length. It must start with a letter and can contain only letters, underscores (_), and digits.
    • Display Name: The display name of the workspace. The display name can be a maximum of 23 characters in length. It must start with a letter and can contain only letters, underscores (_), and digits.
    • Mode: The mode of the workspace. Valid values: Basic Mode (Production Environment Only) and Standard Mode (Development and Production Environments).
      • Basic Mode (Production Environment Only): A workspace in basic mode is associated with only one MaxCompute project. Workspaces in basic mode do not isolate the development environment from the production environment. In these workspaces, you can perform only basic data development and cannot strictly control the data development process or the permissions on tables.
      • Standard Mode (Development and Production Environments): A workspace in standard mode is associated with two MaxCompute projects. One serves as the development environment, and the other serves as the production environment. Workspaces in standard mode allow you to develop code in a standard way and strictly control the permissions on tables. These workspaces impose limits on table operations in the production environment for data security.
      For more information, see Basic mode and standard mode.
    • Description: The description of the workspace.

    Advanced Settings
    • Download SELECT Query Result: Specifies whether the query results that are returned by SELECT statements in DataStudio can be downloaded. If you turn off this switch, the query results cannot be downloaded.
  3. In the Select Engines and Services step, select the required compute engines and services, and click Next.
    DataWorks is available as a commercial service. If you have not activated DataWorks in a region, activate it before you create a workspace in the region.
    DataWorks Services
    Note: The services in this section are enabled for the workspace. By default, the check box in this section is selected.
    • Data Integration: Provides a stable, efficient, and scalable data synchronization platform. Data Integration is designed to efficiently transmit and synchronize data between various heterogeneous data sources in complex network environments. For more information, see Data Integration.
    • Data Analytics: Allows you to design a data computing process that consists of multiple mutually dependent nodes based on your business requirements. The nodes are automatically run in Operation Center. For more information, see DataStudio.
    • Operation Center: Allows you to view all your nodes and node instances and perform operations on them. For more information, see Operation Center.
    • Data Quality: Provides an end-to-end data quality solution that relies on DataWorks. This solution allows you to explore data, compare data, monitor data quality, scan SQL statements, and use intelligent alerting. For more information, see Data Quality.

    Compute Engines
    • MaxCompute: Provides a rapid, fully managed data warehouse solution that can process terabytes or petabytes of data. MaxCompute supports fast computing on large amounts of data, effectively reduces costs for enterprises, and ensures data security. For more information, see the MaxCompute documentation.
      Note: After you create workspaces in DataWorks, you must associate them with MaxCompute projects. Otherwise, the error project not found is returned when you run commands in the workspaces.
    • Realtime Compute: Allows you to develop streaming computing nodes in DataWorks.
    • E-MapReduce: Allows you to use E-MapReduce (EMR) to develop big data processing nodes in DataWorks. For more information, see the EMR documentation.
    • Hologres: Allows you to use HoloStudio in DataWorks to manage internal and foreign tables and develop Hologres SQL nodes.
    • Graph Compute: Allows you to use Graph Studio in DataWorks to manage Graph Compute instances.
    • AnalyticDB for PostgreSQL: Allows you to develop AnalyticDB for PostgreSQL nodes in DataWorks. For more information, see Overview.
      Note: You can use the AnalyticDB for PostgreSQL compute engine only in DataWorks Standard Edition or a more advanced edition.
    • AnalyticDB for MySQL: Allows you to develop AnalyticDB for MySQL nodes in DataWorks. For more information about AnalyticDB for MySQL, see Product introduction.
      Note: You can use the AnalyticDB for MySQL compute engine only in DataWorks Standard Edition or a more advanced edition.

    Machine Learning Services
    • PAI Studio: Uses statistical algorithms to learn from large amounts of historical data and generate an empirical model that provides business strategies.
  4. In the Engine Details step, configure the parameters for the selected compute engines.
    MaxCompute
    • Instance Display Name: The display name of the compute engine instance. The display name must start with a letter and can contain only letters, underscores (_), and digits.
    • Resource Group: The quotas of computing resources and disk space for the compute engine instance.
    • MaxCompute Data Type Edition: The MaxCompute data type edition. This configuration takes effect within 5 minutes. For more information, see Data types. If you do not know which edition to select, we recommend that you contact the workspace administrator.
    • Whether to encrypt: Specifies whether to encrypt data. Valid values: No encryption and Encryption.
    • MaxCompute Project Name: The name of the MaxCompute project. By default, the MaxCompute project that serves as the production environment is named after the DataWorks workspace, and the MaxCompute project that serves as the development environment is named in the format of DataWorks workspace name_dev. For an illustration of this naming convention, see the PyODPS sketch after this procedure.
    • Account for Accessing MaxCompute: The identity that is used to access the MaxCompute project. For the development environment, the value is fixed to Node Owner. For the production environment, the valid values are Alibaba Cloud Account and RAM User.

    Realtime Compute
    • Instance Display Name: The display name of the compute engine instance. The display name must start with a letter and can contain only letters, underscores (_), and digits.
    • Realtime Compute Cluster: The Realtime Compute for Apache Flink cluster to which the Realtime Compute for Apache Flink project that you want to bind belongs. If no Realtime Compute for Apache Flink cluster exists, create one in the Realtime Compute for Apache Flink console.
    • Realtime Compute Project: The Realtime Compute for Apache Flink project that you want to bind to the DataWorks workspace. If no Realtime Compute for Apache Flink project exists, create one in the Realtime Compute for Apache Flink console.

    E-MapReduce
    • Instance Display Name: The display name of the compute engine instance. The display name must start with a letter and can contain only letters, underscores (_), and digits.
    • Access ID: The AccessKey ID of the account that is authorized to access the EMR cluster.
    • Access Key: The AccessKey secret of the account that is authorized to access the EMR cluster.
    • Cluster ID: The ID of the EMR cluster. You can obtain the ID from the EMR console.
    • EmrUserID: The ID of the user who created the EMR cluster.
    • Workspace ID: The ID of the project in the EMR cluster.
    • YARN Resource Queue: The name of the resource queue in the EMR cluster. Unless otherwise specified, set this parameter to default.
    • Endpoint: The endpoint of the EMR cluster. You can obtain the endpoint from the EMR console.

    Hologres
    • Instance Display Name: The display name of the compute engine instance. The display name must start with a letter and can contain only letters, underscores (_), and digits.
    • Access identity: The identity that is used to access the Hologres instance. For the development environment, the value is fixed to Task owner. For the production environment, the valid values are Alibaba Cloud primary account and Alibaba Cloud sub-account.
    • Hologres instance name: The name of the Hologres instance.
    • Database name: The name of the database that you want to bind to the DataWorks workspace. After you create a Hologres instance, the system automatically creates a database named postgres for management only. You can create a database based on your business requirements in the Hologres console and associate the database with the DataWorks workspace.
    • Connectivity Test: Click Test Connectivity to test the connectivity of the compute engine instance.

    Graph Compute
    • Instance Display Name: The display name of the compute engine instance. The display name must start with a letter and can contain only letters, underscores (_), and digits.
    • Bind Graph Compute Instance: The Graph Compute instance that you want to bind to the DataWorks workspace.

    AnalyticDB for PostgreSQL
    • Instance Display Name: The display name of the compute engine instance. The display name must start with a letter and can contain only letters, underscores (_), and digits.
    • InstanceName: The name of the AnalyticDB for PostgreSQL instance that you want to add as the compute engine instance.
    • DatabaseName: The name of the database in the AnalyticDB for PostgreSQL instance that you want to bind to the DataWorks workspace.
    • Username: The username that is used to connect to the database.
    • Password: The password that is used to connect to the database.
    • Select Resource Group: AnalyticDB for PostgreSQL nodes must be run on exclusive resource groups. Therefore, you must specify an exclusive resource group. If no exclusive resource group exists, click Create Exclusive Resource Group to create one.
    • Test Connectivity: Click Test Connectivity to test the connectivity between the specified exclusive resource group and the AnalyticDB for PostgreSQL instance. To verify the database credentials yourself first, see the connectivity sketch after this procedure.

    AnalyticDB for MySQL
    • Instance Display Name: The display name of the compute engine instance. The display name must start with a letter and can contain only letters, underscores (_), and digits.
    • InstanceName: The name of the AnalyticDB for MySQL cluster that you want to add as the compute engine instance.
    • DatabaseName: The name of the database in the AnalyticDB for MySQL cluster that you want to bind to the DataWorks workspace.
    • Username: The username that is used to connect to the database.
    • Password: The password that is used to connect to the database.
    • Select Resource Group: AnalyticDB for MySQL nodes must be run on exclusive resource groups. Therefore, you must specify an exclusive resource group. If no exclusive resource group exists, click Create Exclusive Resource Group to create one.
    • Test Connectivity: Click Test Connectivity to test the connectivity between the specified exclusive resource group and the AnalyticDB for MySQL cluster.
  5. Click Create Workspace.
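Both Hologres and AnalyticDB for PostgreSQL speak the PostgreSQL protocol, so you can verify the database name, username, and password that you entered in the Engine Details step before you rely on the Test Connectivity button. The following sketch is only a hedged example: it uses the psycopg2 client, all connection values are placeholders, and it must run on a machine that has network access to the instance. A successful connection confirms the credentials but does not guarantee that the exclusive resource group can also reach the instance.

```python
# A minimal credential check for a PostgreSQL-compatible engine
# (Hologres or AnalyticDB for PostgreSQL). All values are placeholders.
import psycopg2

conn = psycopg2.connect(
    host='<instance endpoint>',   # the VPC or public endpoint of the instance
    port=5432,                    # use the port shown in the instance's console
    dbname='<database name>',     # the database you plan to bind to the workspace
    user='<username>',
    password='<password>',
    connect_timeout=10,
)
with conn.cursor() as cur:
    cur.execute('SELECT version();')  # a trivial query to confirm that the connection works
    print(cur.fetchone()[0])
conn.close()
```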
After the workspace is created, you can view the information about the workspace on the Workspaces page.
Note
  • If you are the owner of a workspace, all data in the workspace belongs to you. Other users can access the workspace only after you grant permissions to them. If you create a workspace as a RAM user of an Alibaba Cloud account, the workspace belongs to both the RAM user and the Alibaba Cloud account.
  • You can add a RAM user to a workspace so that the RAM user can use the workspace. This way, the RAM user does not need to create a workspace.
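To see how the default MaxCompute project naming works in practice, assume that you create a standard-mode workspace and set MaxCompute Project Name to my_workspace. DataWorks then binds a production project named my_workspace and a development project named my_workspace_dev. The following sketch uses the MaxCompute Python SDK (PyODPS) to confirm that both projects are reachable; the workspace name, endpoint, and credentials are placeholders, and the project names assume the default naming convention described in the Engine Details step.

```python
# A minimal sketch that touches the two MaxCompute projects bound to a
# standard-mode workspace. The names follow the default convention
# (production = workspace name, development = workspace name + "_dev");
# adjust them if you changed the project names in the Engine Details step.
import itertools

from odps import ODPS

endpoint = 'http://service.cn-shanghai.maxcompute.aliyun.com/api'  # placeholder endpoint

for project in ('my_workspace', 'my_workspace_dev'):
    o = ODPS('<access_key_id>', '<access_key_secret>', project=project, endpoint=endpoint)
    # Listing tables forces a call against the project. If the workspace is not
    # associated with the MaxCompute project, the call fails with a
    # "project not found" style error, as noted in the Select Engines and Services step.
    tables = [t.name for t in itertools.islice(o.list_tables(), 3)]
    print(project, 'reachable, sample tables:', tables)
```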

Configure a workspace

On the Workspaces page, you can find a workspace and click Workspace Settings in the Actions column. In the Workspace Settings panel, you can configure the basic and advanced settings of the workspace. For example, you can modify the display name and description of the workspace and enable the recurrence feature for the workspace. For more information, see Configure a workspace.

Configure services

You can configure DataWorks services, compute engines, and machine learning services for a workspace. Before you configure services, you must purchase them.

On the Workspaces page, find a workspace and click Modify service configuration in the Actions column. In the Modify service configuration panel, you can configure services.

To use a new service in your workspace, select the service and click Next. In the Engine Details step, configure the parameters for the service and click OK.

Go to the DataStudio, Data Integration, or DataService Studio page

On the Workspaces page, find a workspace and click Data Analytics, Data Integration, or DataService Studio in the Actions column to go to the related page.

Delete or disable a workspace

On the Workspaces page, you can delete or disable workspaces.
  • Delete a workspace
    Find the workspace that you want to delete and choose More > Delete Workspace in the Actions column. In the Delete Workspace panel, enter the verification code YES and click OK.
    Note
    • In the Delete Workspace panel, the verification code is fixed to YES.
    • After you delete a workspace, you cannot recover it. Proceed with caution when you delete a workspace.
  • Disable a workspace

    Find the workspace that you want to disable and choose More > Disable Workspace in the Actions column. In the Disable Workspace panel, click OK.

    After you disable a workspace, instances are no longer generated for auto triggered nodes in the workspace. Instances generated before you disable the workspace run automatically at the specified time. However, you cannot log on to the workspace to view information about these instances.