To make sure that you can complete the workshop, you must activate MaxCompute, DataWorks, and Machine Learning Platform for AI (PAI) for your Alibaba Cloud account.

Prerequisites

  • An Alibaba Cloud account is created.
  • Real-name verification for individuals or enterprises is completed.

Background information

The following Alibaba Cloud services are used in this workshop:

  • MaxCompute
  • DataWorks
  • Machine Learning Platform for AI (PAI)

Activate MaxCompute

Note: If you have already activated MaxCompute, skip this step and create a workspace in DataWorks.
  1. Go to the Alibaba Cloud official website, click Log In in the upper-right corner, and then enter your account name and password.
  2. Move the pointer over Products in the top navigation bar and choose Analytics > Data Computing > MaxCompute to go to the product page of MaxCompute.
  3. Click Activate Now.
  4. On the buy page of MaxCompute, select a region, read and agree to the service agreement, and then click Confirm Order and Pay.
    Note
    • By default, DataWorks Basic Edition and the standard pay-as-you-go resource package of MaxCompute are provided on the buy page.
    • The project management, query, and editing features of MaxCompute are integrated into the features of DataWorks. Therefore, you must activate DataWorks at the same time. DataWorks Basic Edition is free of charge. You are charged only if you use Data Integration or run scheduled nodes.
    • When you select a region for MaxCompute, consider the regions in which your other Alibaba Cloud resources reside. For example, consider the region where your Elastic Compute Service (ECS) instance resides and the region where your data is stored.

Create a DataWorks workspace

  1. Log on to the DataWorks console by using your Alibaba Cloud account.
  2. On the Overview page, click Create Workspace in the Shortcuts section on the right.
    You can also click Workspaces in the left-side navigation pane and click Create Workspace on the page that appears.
  3. In the Create Workspace panel, set the parameters in the Basic Settings step and click Next.
    Section: Basic Information
    • Workspace Name: The name of the workspace. The name must be 3 to 23 characters in length. It must start with a letter and can contain only letters, underscores (_), and digits.
    • Display Name: The display name of the workspace. The display name can be a maximum of 23 characters in length. It must start with a letter and can contain only letters, underscores (_), and digits.
    • Mode: The mode of the workspace. Valid values: Basic Mode (Production Environment Only) and Standard Mode (Development and Production Environments). For more information, see Basic mode and standard mode.
      • Basic Mode (Production Environment Only): A workspace in basic mode is associated with only one MaxCompute project. Workspaces in basic mode do not isolate the development environment from the production environment. In these workspaces, you can perform only basic data development and cannot strictly control the data development process and the permissions on tables.
      • Standard Mode (Development and Production Environments): A workspace in standard mode is associated with two MaxCompute projects. One serves as the development environment, and the other serves as the production environment. Workspaces in standard mode allow you to develop code in a standard way and strictly control the permissions on tables. These workspaces impose limits on table operations in the production environment for data security.
    • Description: The description of the workspace.

    Section: Advanced Settings
    • Download SELECT Query Result: Specifies whether the query results that are returned by SELECT statements in DataStudio can be downloaded. If you turn off this switch, the query results cannot be downloaded.
  4. In the Select Engines and Services step, select the required compute engines and services and click Next.
    DataWorks is now available as a commercial service. If you have not activated DataWorks in a region, activate it before you create a workspace in the region. By default, the following services are selected when you create a workspace: Data Integration, Data Analytics, Operation Center, and Data Quality.
    Note: In this workshop, you must select PAI Studio and MaxCompute.
  5. In the Engine Details step, set the parameters for the selected compute engines.
    Engine: MaxCompute
    • Instance Display Name: The display name of the compute engine instance. The display name must start with a letter and can contain only letters, underscores (_), and digits.
    • Resource Group: The quotas of computing resources and disk space for the compute engine instance.
    • MaxCompute Data Type Edition: The edition of the MaxCompute data type. This configuration takes effect within 5 minutes. For more information, see Data types. If you do not know which edition to select, we recommend that you contact the workspace administrator.
    • Whether to encrypt: Specifies whether to encrypt data. Valid values: No encryption and Encryption.
    • MaxCompute Project Name: The name of the MaxCompute project. By default, the MaxCompute project that serves as the production environment is named after the DataWorks workspace. The MaxCompute project that serves as the development environment is named in the format <DataWorks workspace name>_dev.
    • Account for Accessing MaxCompute: The identity that you can use to access the MaxCompute project. For the development environment, the value is fixed to Node Owner. For the production environment, the valid values are Alibaba Cloud Account and RAM User.

  6. Click Create Workspace.
    After the workspace is created, you can view information about the workspace on the Workspaces page.
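
The naming rules in the preceding steps can be checked before you open the console. The following Python sketch is illustrative only and is not part of DataWorks or any Alibaba Cloud SDK. The function names is_valid_workspace_name and default_project_names are hypothetical, and the sketch assumes that "letters" means ASCII letters, which this topic does not state. It encodes the Workspace Name rule (3 to 23 characters, starts with a letter, contains only letters, underscores, and digits) and the default MaxCompute project names described for the MaxCompute Project Name parameter.

```python
import re

# Workspace Name rule from this topic: 3 to 23 characters, starts with a
# letter, contains only letters, underscores (_), and digits.
# Assumption: "letters" is read as ASCII letters only.
_WORKSPACE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]{2,22}")


def is_valid_workspace_name(name: str) -> bool:
    """Return True if name satisfies the documented Workspace Name rule."""
    return _WORKSPACE_NAME.fullmatch(name) is not None


def default_project_names(workspace_name: str, standard_mode: bool) -> dict:
    """Return the default MaxCompute project names for a workspace.

    The production project is named after the workspace. In standard mode,
    a second project named <workspace name>_dev serves as the development
    environment. In basic mode, only one project exists.
    """
    if not is_valid_workspace_name(workspace_name):
        raise ValueError(f"invalid workspace name: {workspace_name!r}")
    names = {"production": workspace_name}
    if standard_mode:
        names["development"] = f"{workspace_name}_dev"
    return names


if __name__ == "__main__":
    # Hypothetical workspace name used only for illustration.
    print(default_project_names("pai_workshop", standard_mode=True))
```

The console performs its own validation when you click Next, so treat this sketch only as a quick local check of a candidate name.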