If you turn on Participate in Public Preview of Data Studio when you create a workspace, you must also associate computing resources with the workspace. You can then develop and schedule tasks for the associated computing resource types in the workspace.
Prerequisites
A workspace is created, and Participate in Public Preview of Data Studio is turned on when you create the workspace. For more information about how to create a workspace, see Create a workspace.
You can find the desired workspace on the Workspaces page in the DataWorks console and perform the following operations to check whether Participate in Public Preview of Data Studio is turned on:
Participate in Public Preview of Data Studio not turned on: Choose Shortcuts > Data Development in the Actions column. The old-version DataStudio page appears, as shown in the following figure. For more information about old-version DataStudio, see Overview.
Participate in Public Preview of Data Studio turned on: Choose Shortcuts > DataStudio (new version) in the Actions column. The new-version Data Studio page appears, as shown in the following figure. For more information about new-version Data Studio, see Data Studio (new version).
Related computing resources are available. Associating a computing resource in DataWorks only links your existing computing resource to a DataWorks workspace. The storage, data, and billing of the computing resource are still managed in the corresponding service.
A pay-as-you-go serverless resource group is automatically purchased and associated with the default workspace when you activate DataWorks. You are not charged for the resource group if you do not use it. If you create a workspace and want to perform the operations that are described in this topic on the new workspace, associate the serverless resource group with the new workspace. For more information about how to associate a resource group with a workspace, see the Step 2: Associate the resource group with a workspace section of the "Create and use a serverless resource group" topic.
The computing resources that you want to associate with the workspace are connected to the serverless resource group. For more information, see Network connectivity solutions.
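The prerequisites above can be read as a checklist. The following Python snippet is a minimal sketch of that checklist, not a DataWorks API: the `Readiness` class, its field names, and the `unmet_prerequisites` function are hypothetical names introduced only for illustration, and the values must be filled in by hand after you check each item in the console.

```python
from dataclasses import dataclass


@dataclass
class Readiness:
    """Hypothetical record of the prerequisites in this topic (filled in manually)."""
    preview_enabled: bool            # Participate in Public Preview of Data Studio is on
    computing_resource_exists: bool  # the computing resource already exists in its own service
    resource_group_associated: bool  # a serverless resource group is associated with the workspace
    network_connected: bool          # the computing resource is connected to the resource group


def unmet_prerequisites(state: Readiness) -> list[str]:
    """Return a description of every prerequisite that is not yet met."""
    checks = [
        (state.preview_enabled,
         "Turn on Participate in Public Preview of Data Studio when you create the workspace."),
        (state.computing_resource_exists,
         "Create the computing resource in its own service first."),
        (state.resource_group_associated,
         "Associate the serverless resource group with the workspace."),
        (state.network_connected,
         "Connect the computing resource to the serverless resource group."),
    ]
    return [message for ok, message in checks if not ok]


# Example: everything is ready except network connectivity.
state = Readiness(True, True, True, False)
for message in unmet_prerequisites(state):
    print(message)
```

An empty list means that all prerequisites described in this section are met.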
Terms
computing resource
A computing resource is a resource instance that a compute engine uses to run data processing and analysis tasks. For example, a MaxCompute project for which a quota group is configured and a Hologres instance are computing resources. When you use Alibaba Cloud MaxCompute in big data processing scenarios, you can configure a quota group to manage the amount of computing resources used by your computing tasks.
You can associate multiple types of computing resources with a workspace. After you associate MaxCompute, Hologres, AnalyticDB for PostgreSQL, AnalyticDB for MySQL V3.0, ClickHouse, E-MapReduce (EMR), Cloudera's Distribution Including Apache Hadoop (CDH), OpenSearch, Serverless Spark, Serverless StarRocks, or Realtime Compute for Apache Flink computing resources with a workspace, you can develop and schedule the tasks of specific computing resource types in the workspace.
data source
Data sources can be connected to different data storage services. A data source contains all the information that is required to connect to a data storage service. The information includes the username, password, and host address. Before data development, you must define information about the data sources that you want to use in DataWorks. This way, when you configure a task, you can select the names of data sources to determine the database from which you want to read data and the database to which you want to write data. You can add multiple types of data sources to a workspace.
data catalog
A data catalog is a structured list or map that displays all data assets within an organization. The data assets include but are not limited to databases, tables, and files. In DataWorks, a data catalog records the metadata information about data assets.
relationship among computing resources, data sources, and data catalogs
Computing resources, data sources, and data catalogs are independent items, but associations exist among them.
When you associate a computing resource with a workspace, the system automatically adds a corresponding data source to the workspace and associates a data catalog with the workspace.
When you add a data source to a workspace, the system automatically associates a corresponding data catalog with the workspace.
When you create a data catalog in a workspace, the system does not automatically add a corresponding data source to the workspace or associate a corresponding computing resource with the workspace.
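The three rules above form a one-way cascade: a computing resource brings a data source and a data catalog with it, a data source brings a data catalog with it, and a data catalog brings nothing else. The following Python snippet is a minimal sketch of that general rule, assuming a purely in-memory model. The `Workspace` class and its method names are hypothetical and are not part of any DataWorks SDK. The sketch also ignores the type-specific exceptions listed in the table in the Associate a computing resource with a workspace when you create the workspace section, such as computing resource types that do not require a data catalog.

```python
from dataclasses import dataclass, field


@dataclass
class Workspace:
    """Hypothetical in-memory model of a workspace; not a DataWorks API."""
    computing_resources: set[str] = field(default_factory=set)
    data_sources: set[str] = field(default_factory=set)
    data_catalogs: set[str] = field(default_factory=set)

    def associate_data_catalog(self, name: str) -> None:
        # A data catalog on its own does not add a data source
        # or associate a computing resource.
        self.data_catalogs.add(name)

    def add_data_source(self, name: str) -> None:
        # Adding a data source also associates a corresponding data catalog.
        self.data_sources.add(name)
        self.associate_data_catalog(name)

    def associate_computing_resource(self, name: str) -> None:
        # Associating a computing resource adds a corresponding data source,
        # which in turn associates a corresponding data catalog.
        self.computing_resources.add(name)
        self.add_data_source(name)


workspace = Workspace()
workspace.associate_computing_resource("my_maxcompute_project")  # hypothetical name
workspace.associate_data_catalog("standalone_catalog")           # hypothetical name
print(sorted(workspace.data_sources))
print(sorted(workspace.data_catalogs))
```

In this model, the cascade never runs upward: `standalone_catalog` appears only in the data catalogs, whereas `my_maxcompute_project` appears in all three sets.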
Associate a computing resource with a workspace
DataWorks allows you to associate a computing resource with a workspace in multiple ways. You can select a method based on your business requirements.
Associate a computing resource with a workspace when you create the workspace
When you create a workspace, the Associate Computing Resource step appears after you configure parameters and click Create Workspace. Then, you can select computing resources based on your business requirements and associate the computing resources with the workspace.
If you turn on Participate in Public Preview of Data Studio when you create a workspace in DataWorks, you can associate different types of computing resources with the workspace in the Associate Computing Resource step. The following table describes specific information about the association.
Category | Computing resource type | Association description | References for parameter configuration |
Offline computing | MaxCompute | DataWorks cannot be directly connected to MaxCompute quota groups. You can associate only MaxCompute projects with a DataWorks workspace. After you associate a MaxCompute computing resource with a DataWorks workspace, the system automatically adds a MaxCompute data source to the workspace and associates a MaxCompute data catalog with the workspace. | |
Offline computing | Serverless Spark | You can associate a Serverless Spark workspace with a DataWorks workspace. For Serverless Spark computing resources, no data catalog is required for the association. | |
Real-time query | Hologres | DataWorks cannot be directly connected to Hologres virtual warehouses. You can associate Hologres databases with a DataWorks workspace. After you associate a Hologres computing resource with a DataWorks workspace, the system automatically adds a Hologres data source to the workspace and associates a Hologres data catalog with the workspace. | |
Real-time query | Serverless StarRocks | DataWorks cannot be directly connected to Serverless StarRocks queues. You can associate Serverless StarRocks instances with a DataWorks workspace. After you associate a Serverless StarRocks computing resource with a DataWorks workspace, the system automatically adds a Serverless StarRocks data source to the workspace and associates a Serverless StarRocks data catalog with the workspace. | |
Fully managed | Realtime Compute for Apache Flink | You can associate a Realtime Compute for Apache Flink namespace with a DataWorks workspace. For Realtime Compute for Apache Flink computing resources, no data catalog is required for the association. | |
Multimodal search | OpenSearch | You can associate an OpenSearch instance with a DataWorks workspace. After you associate an OpenSearch computing resource with a DataWorks workspace, the system automatically adds an OpenSearch data source to the workspace. For OpenSearch computing resources, no data catalog is required for the association. | |
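The association behavior in the preceding table can also be summarized as a lookup structure. The following Python snippet is a minimal sketch that only restates the table. The `AssociationEffect` type, its field names, and the `ASSOCIATION_EFFECTS` dictionary are hypothetical names introduced for illustration. Where the table does not say whether a data source is added, the value is `None` rather than a guess.

```python
from typing import NamedTuple, Optional


class AssociationEffect(NamedTuple):
    """Hypothetical summary of one table row; not a DataWorks API."""
    associated_object: str             # what you associate with the workspace
    adds_data_source: Optional[bool]   # None means the table does not say
    associates_data_catalog: bool      # False means no data catalog is required


ASSOCIATION_EFFECTS = {
    "MaxCompute": AssociationEffect("project", True, True),
    "Serverless Spark": AssociationEffect("workspace", None, False),
    "Hologres": AssociationEffect("database", True, True),
    "Serverless StarRocks": AssociationEffect("instance", True, True),
    "Realtime Compute for Apache Flink": AssociationEffect("namespace", None, False),
    "OpenSearch": AssociationEffect("instance", True, False),
}


def types_with_data_catalog() -> list[str]:
    """Return the resource types for which a data catalog is associated automatically."""
    return sorted(
        name for name, effect in ASSOCIATION_EFFECTS.items()
        if effect.associates_data_catalog
    )


print(types_with_data_catalog())
```

Keeping the unstated cells as `None` makes it explicit which behavior comes from the table and which still needs to be confirmed in the console.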
Associate a computing resource with a DataWorks workspace on the details page of the workspace
If you do not associate a computing resource with a workspace when you create the workspace, you can associate the computing resource with the workspace on the details page of the workspace.
Log on to the DataWorks console. In the top navigation bar, select a desired region. Then, click Workspace in the left-side navigation pane.
On the Workspaces page, find the desired workspace and click Details in the Actions column.
In the left-side navigation pane of the Workspace Details page, click Computing Resource. On the Computing Resource page, click Associate Computing Resource. In the Associate Computing Resource panel, select computing resource types based on your business requirements and configure parameters. For more information about parameter settings, see the Parameter configuration for association of different types of computing resources section in this topic.
After the configuration is complete, click OK.
Associate a computing resource with a DataWorks workspace in Management Center
If you do not associate a computing resource with a DataWorks workspace when you create the workspace, you can associate the computing resource with the workspace in Management Center.
Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose . On the page that appears, select the desired workspace from the drop-down list and click Go to Management Center.
In the left-side navigation pane of the SettingCenter page, click Computing Resource.
On the Computing Resource page, click Associate Computing Resource. In the Associate Computing Resource panel, select computing resource types based on your business requirements and configure parameters. For more information about parameter settings, see the Parameter configuration for association of different types of computing resources section in this topic.
Associate a computing resource with a DataWorks workspace on the Data Studio page
Go to the Workspaces page in the DataWorks console. In the top navigation bar, select a desired region. Find the desired workspace and choose in the Actions column.
In the left-side navigation pane of the Data Studio page, click the icon and select Computing Resources.
On the Computing Resources tab, click Associate Computing Resource. In the Associate Computing Resource panel, select computing resource types based on your business requirements and configure parameters. For more information about parameter settings, see the Parameter configuration for association of different types of computing resources section in this topic.
Parameter configuration for association of different types of computing resources
MaxCompute
Serverless Spark
Hologres
Serverless StarRocks
Realtime Compute for Apache Flink
OpenSearch
AnalyticDB for MySQL V3.0
AnalyticDB for PostgreSQL
AnalyticDB for Spark
CDH
ClickHouse
EMR
What to do next
After you associate the specific types of computing resources described in the Parameter configuration for association of different types of computing resources section of this topic with a workspace, the system automatically associates the corresponding data catalogs with the workspace. You can also separately associate Data Lake Formation (DLF), MaxCompute, Hologres, or StarRocks data catalogs for visualized data query and management in new-version Data Studio.
After you associate a data catalog, you can go to Data Studio to view and manage tables in the data catalog.
After you associate a computing resource with a workspace, you can perform operations such as data development, data analysis, and periodic task scheduling in Operation Center in the current workspace. For more information, see Data Studio (new version), DataAnalysis overview, and Getting started with Operation Center.