After you activate MaxCompute, you must create a project to use MaxCompute. A project in MaxCompute is called a workspace in DataWorks. This topic describes how to create a MaxCompute project in the DataWorks console.
- DataWorks is activated.
- DataWorks and MaxCompute are activated in the same region.
- Log on to the DataWorks console by using your Alibaba Cloud account.
- On the Overview page, click Create Workspace in the Shortcuts section on the right. Alternatively, click Workspaces in the left-side navigation pane. On the Workspaces page, select a region from the region drop-down list at the top and click Create Workspace.
- In the Create Workspace pane, set parameters in the Basic Settings step and click Next. In this example, a workspace in standard mode is created.
| Section | Parameter | Description |
| --- | --- | --- |
| Region | All regions where DataWorks is available | The region where you want to create the workspace. You must select the region where you activated MaxCompute. For more information about regions, see Configure endpoints. Note: This section is visible only when you click Create Workspace in the Shortcuts section on the Overview page. It is not displayed if you click Create Workspace on the Workspaces page. |
| Basic Information | Workspace Name | The name of the workspace. The name must be 3 to 23 characters in length and can contain letters, digits, and underscores (_). It must start with a letter. |
| Basic Information | Display Name | The display name of the workspace. The name can be up to 23 characters in length and can contain letters, digits, and underscores (_). It must start with a letter. |
| Basic Information | Mode | The mode of the workspace. Valid values: Basic Mode (Production Environment Only) and Standard Mode (Development and Production Environments). The differences between the two modes are listed after this table. For more information, see Basic mode and standard mode. |
| Basic Information | Description | The description of the workspace. |
| Advanced Settings | Download SELECT Query Result | Specifies whether workspace members can download the query results returned by SELECT statements in DataStudio. If you turn off the Download SELECT Query Result switch, workspace members cannot download query results. |

The two values of the Mode parameter differ as follows:
- Basic Mode (Production Environment Only): One DataWorks workspace is associated with only one MaxCompute project. A workspace in basic mode does not isolate the development environment from the production environment. In this workspace, you can only perform basic data development but cannot completely control the data development process or table permissions.
- Standard Mode (Development and Production Environments): One DataWorks workspace is associated with two MaxCompute projects. One MaxCompute project serves as the development environment and the other serves as the production environment. In this workspace, you can develop code in a standard manner and strictly control table permissions. Without authorization, developers are prohibited from managing tables in the production environment. This ensures data security.
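The naming rules for Workspace Name and Display Name described above can be checked before you open the console. The following is a minimal sketch, not part of any DataWorks SDK: the function names and patterns are hypothetical, and it assumes that "letters" means ASCII letters. The console may apply additional checks that are not modeled here.

```python
import re

# Hypothetical helpers that mirror the documented naming rules.
# Assumption: "letters" means ASCII letters A-Z and a-z.

# Workspace Name: 3 to 23 characters, starts with a letter,
# then letters, digits, or underscores.
WORKSPACE_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]{2,22}")

# Display Name: up to 23 characters, starts with a letter,
# then letters, digits, or underscores.
DISPLAY_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,22}")


def is_valid_workspace_name(name: str) -> bool:
    """Return True if name satisfies the documented Workspace Name rules."""
    return WORKSPACE_NAME_RE.fullmatch(name) is not None


def is_valid_display_name(name: str) -> bool:
    """Return True if name satisfies the documented Display Name rules."""
    return DISPLAY_NAME_RE.fullmatch(name) is not None
```

For example, under these assumptions `is_valid_workspace_name("sales_dw")` returns True, while a name that starts with a digit or contains a hyphen is rejected.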
- In the Select Engines and Services step, select MaxCompute in the Compute Engines section and click Next. Note: You can also select other compute engines at the same time. For more information, see Create a workspace.
| Section | Parameter | Description |
| --- | --- | --- |
| DataWorks Services (Note: The check box in this section is selected by default.) | Data Integration | Provides a stable, efficient, and scalable data synchronization platform. It is designed to transfer and synchronize data quickly and reliably between heterogeneous data stores in complex networks. |
| DataWorks Services | Data Analytics | Allows you to design a data computing process that consists of multiple mutually dependent nodes based on your business needs. The nodes are then run automatically in Operation Center. |
| DataWorks Services | Operation Center | Allows you to view all your nodes and node instances and manage them as needed. |
| DataWorks Services | Data Quality | Provides a comprehensive data quality solution that relies on DataWorks. For example, you can explore data, compare data, monitor data quality, scan SQL statements, and use intelligent alerting. For more information, see Overview. |
| Compute Engines | MaxCompute | Provides a fast, fully managed data warehouse solution that can process exabytes of data. It supports fast computing on large amounts of data, reduces costs for enterprises, and ensures data security. For more information, see What is MaxCompute?. Note: After you create DataWorks workspaces, you must associate them with MaxCompute projects. Otherwise, the error "project not found" is returned when you run commands in the workspaces. |
- In the Engine Details step, set parameters under MaxCompute. Note:
- The preceding figure shows the Engine Details page for a workspace in standard mode.
- If you selected more compute engines in the preceding step, configure them accordingly. For more information, see Create a workspace.
| Section | Parameter | Description |
| --- | --- | --- |
| MaxCompute | Instance display name | The display name of the compute engine instance. The display name must be 3 to 27 characters in length and can contain letters, digits, and underscores (_). It must start with a letter. |
| MaxCompute | Resource Group | The resource group that provides the computing resource quota and disk space for the compute engine instance. For more information, see MaxCompute Management. |
| MaxCompute | MaxCompute Data Type Edition | The data type edition of MaxCompute. Valid values: MaxCompute V2.0 Data Type Edition (Recommended), MaxCompute V1.0 Data Type Edition (Suitable for Early MaxCompute Projects), and Hive-Compatible Data Type Edition (Suitable for MaxCompute Projects Migrated from Hadoop). For more information, see Data types. |
| MaxCompute | MaxCompute Project Name | The name of the MaxCompute project. If you create a DataWorks workspace in basic mode, the project name is set to the name you specified for the workspace by default. If you create a DataWorks workspace in standard mode, the project name in the development environment is fixed in the format of Workspace name_dev. In the production environment, the project name is set to the name you specified for the workspace by default. |
| MaxCompute | Account for Accessing MaxCompute | The identity that is used to access the MaxCompute project. For the development environment, the value is fixed to Node Owner. For the production environment, the valid values are Alibaba Cloud Account and RAM User. |
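The default MaxCompute project names described above follow from the workspace name and mode. The following is a minimal sketch of that rule. The function name and the mode strings "basic" and "standard" are hypothetical and are not DataWorks identifiers. The sketch covers only the defaults that this topic states.

```python
def default_maxcompute_project_names(workspace_name: str, mode: str) -> dict:
    """Return the default MaxCompute project name per environment.

    mode is a hypothetical label: "basic" or "standard".
    """
    if mode == "basic":
        # Basic mode: one project, named after the workspace by default.
        return {"production": workspace_name}
    if mode == "standard":
        # Standard mode: the development project name is fixed to
        # <workspace name>_dev; the production project defaults to
        # the workspace name.
        return {
            "development": f"{workspace_name}_dev",
            "production": workspace_name,
        }
    raise ValueError(f"unknown mode: {mode!r}")
```

For a standard-mode workspace named `sales`, the sketch yields `sales_dev` for the development environment and `sales` for the production environment.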
- Click Create Workspace. After the workspace is created, you can view information about the workspace on the Workspaces page. Note:
- If you are the owner of a workspace, you have permissions to manage all resources in the workspace. Nobody else can access your workspace without authorization. If you create a workspace as a Resource Access Management (RAM) user under an Alibaba Cloud account, both the RAM user and the Alibaba Cloud account have permissions to manage all resources in the workspace.
- A RAM user who is not the creator of a workspace can use the workspace after the RAM user is added to the workspace.
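The access rules in the preceding note can be summarized as a small model. The following is an illustrative sketch only: the `Workspace` class, its fields, and both functions are hypothetical, and role-based permissions that may be granted to added members are not modeled.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Workspace:
    """Hypothetical model of who created a workspace and who was added."""
    creator: str
    # Alibaba Cloud account that the creator belongs to, if the creator
    # is a RAM user; None if the creator is the account itself.
    parent_account: Optional[str] = None
    # RAM users who were added to the workspace after creation.
    members: Set[str] = field(default_factory=set)


def can_manage_all_resources(ws: Workspace, principal: str) -> bool:
    """The creator, and the account above a RAM-user creator, manage everything."""
    if principal == ws.creator:
        return True
    return ws.parent_account is not None and principal == ws.parent_account


def can_use(ws: Workspace, principal: str) -> bool:
    """Anyone who manages the workspace or has been added to it can use it."""
    return can_manage_all_resources(ws, principal) or principal in ws.members
```

In this model, a RAM user who was never added to the workspace fails both checks, which matches the statement that nobody else can access the workspace without authorization.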