On the Workspaces page in the DataWorks console, you can view all workspaces under your account and perform relevant operations. For example, you can create, configure, delete, enable, and disable workspaces, and refresh the workspace list.
- Log on to the DataWorks console by using your Alibaba Cloud account. The Overview page appears by default.
- In the left-side navigation pane, click Workspaces. On the Workspaces page, you can view all workspaces under your account.
- Status: This column displays the status of workspaces, which can be Normal, Initializing, Initialization Failed, Deleting, or Deleted. After you create a workspace, the workspace enters the Initializing state first. Then, it enters the Initialization Failed or Normal state based on the initialization result.
After you disable a workspace, you can enable or delete it. The workspace enters the Normal state again after you enable it.
- Service: When you move the pointer over a service icon in this column, the services that you have activated appear. If a service is in the normal state, the service icon is blue. If a service is overdue, the service icon is red with an overdue payment mark. If a service is overdue and has been deleted, the service icon is gray. Generally, a service is automatically deleted if you do not renew it after it is overdue for 7 days.
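The status changes described for the Status column can be summarized as a small transition table. The following Python sketch is illustrative only and is not part of any DataWorks API. The names Status, ALLOWED, and can_transition are hypothetical. The Disabled state is an assumption: the Status column list does not name it, although the text describes disabling and enabling a workspace. That a workspace in the Normal state can be deleted directly is also an assumption.

```python
from enum import Enum


class Status(Enum):
    INITIALIZING = "Initializing"
    NORMAL = "Normal"
    INITIALIZATION_FAILED = "Initialization Failed"
    DISABLED = "Disabled"  # assumption: not in the documented Status list
    DELETING = "Deleting"
    DELETED = "Deleted"


# Allowed transitions, as read from the text (hypothetical model).
ALLOWED = {
    # A new workspace initializes, then either succeeds or fails.
    Status.INITIALIZING: {Status.NORMAL, Status.INITIALIZATION_FAILED},
    # Assumption: a normal workspace can be disabled or deleted.
    Status.NORMAL: {Status.DISABLED, Status.DELETING},
    # A disabled workspace can be enabled again or deleted.
    Status.DISABLED: {Status.NORMAL, Status.DELETING},
    Status.DELETING: {Status.DELETED},
    # The text does not say what follows these states; left empty here.
    Status.INITIALIZATION_FAILED: set(),
    Status.DELETED: set(),
}


def can_transition(current: Status, target: Status) -> bool:
    """Return True if the modeled transition table allows current -> target."""
    return target in ALLOWED[current]
```

For example, can_transition(Status.DISABLED, Status.NORMAL) corresponds to enabling a disabled workspace, while any transition out of Status.DELETED is rejected because a deleted workspace cannot be restored.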
Create a workspace
- Move the pointer over the region in the top navigation bar and select the region in which you want to create a workspace from the drop-down list.
- Click Create Workspace, set parameters in the Basic Settings step, and click Next.
| Section | Parameter | Description |
| --- | --- | --- |
| Basic Information | Workspace Name | The name of the workspace. The name must be 3 to 27 characters in length and start with a letter. It can contain only letters, underscores (_), and digits. |
| | Display Name | The display name of the workspace. The display name can be up to 27 characters in length. It can only start with a letter and can contain only letters, underscores (_), and digits. |
| | Mode | The mode of the workspace. The mode is a new feature of DataWorks and includes the basic and standard modes, which are described after this table. For more information about the differences between the basic and standard modes, see Basic mode and standard mode. |
| | Description | The description of the workspace. |
| Advanced Settings | Download SELECT Query Result | Specifies whether workspace members can download the query results returned by SELECT statements in DataStudio. If you disable this option, workspace members cannot download the query results. |

The Mode parameter has the following values:
- Basic mode: A basic workspace is associated with only one MaxCompute project. Basic workspaces do not isolate the development environment from the production environment. In basic workspaces, you can only perform basic data development and cannot strictly control the data development process and table permissions.
- Standard mode: A standard workspace is associated with two different MaxCompute projects. One of the projects serves as the development environment, and the other serves as the production environment. Standard workspaces guarantee code development in a standard way and allow you to strictly control table permissions. Standard workspaces impose restrictions on table operations in the production environment for data security.
- In the Select Engines and Services step, select required compute engines and services and click Next.
DataWorks is now available as a commercial service. If you have not activated DataWorks in a region, activate it before you create a workspace in that region.
| Section | Service or Engine | Description |
| --- | --- | --- |
| DataWorks Services. Note: The check box in this section is selected by default. | Data Integration | A data synchronization platform that provides stable, efficient, and scalable services. It is designed to transfer and synchronize data fast and stably between various heterogeneous data stores in complex networks. |
| | Data Analytics | Allows you to design a data computing process consisting of multiple mutually dependent nodes based on business needs to automatically run them in Operation Center. |
| | Operation Center | Allows you to view all your nodes and node instances and perform relevant operations on them as needed. |
| | Data Quality | Provides a comprehensive data quality scheme that relies on DataWorks. For example, you can explore data, compare data, monitor data quality, scan SQL statements, and use intelligent alerting. For more information, see Data Quality. |
| Compute Engines | MaxCompute | A rapid and fully-managed data warehouse solution that can process terabytes or petabytes of data. It supports fast computing on a large amount of data, effectively saves costs for enterprises, and guarantees data security. For more information, see What is MaxCompute?. Note: After creating DataWorks workspaces, you must associate them with MaxCompute projects. Otherwise, the error "project not found" is returned when you run commands in the workspaces. |
| | Realtime Compute | Allows you to use Stream Studio in DataWorks to develop streaming computing nodes. |
| | E-MapReduce | Allows you to use E-MapReduce to develop big data processing nodes in DataWorks. For more information, see What is E-MapReduce?. |
| | Interactive Analytics | Allows you to use HoloStudio in DataWorks to manage internal and foreign tables and develop SQL nodes of Interactive Analytics. |
| | Graph Compute | Allows you to use Graph Studio in DataWorks to manage Graph Compute instances. |
| Machine Learning Services | Machine Learning Platform for AI | Uses statistical algorithms to learn large amounts of historical data and generate an empirical model to provide business strategies. |
- In the Engine Details step, set parameters for the selected engines or services.
| Engine or Service | Parameter | Description |
| --- | --- | --- |
| MaxCompute | Instance Display Name | The display name of the compute engine instance. The display name can be up to 27 characters in length. It can only start with a letter and can contain only letters, underscores (_), and digits. |
| | MaxCompute Project Name | The name of the MaxCompute project. By default, the name is the same as that of the DataWorks workspace. |
| | Account for Accessing MaxCompute | The account for accessing the MaxCompute project. The valid values are Private Account and Workspace Owner. We recommend that you set the value to Workspace Owner. |
| | Resource Group | The quotas of computing resources and disk spaces. |
| Realtime Compute | Instance Display Name | The display name of the compute engine instance. |
| | Realtime Compute Project | The Realtime Compute project to be bound to the workspace. If no Realtime Compute project exists, create one in the Realtime Compute console. For more information, see Activate Realtime Compute and create a project. |
| E-MapReduce | Instance Display Name | The display name of the compute engine instance. |
| | Cluster Name | The name of the E-MapReduce cluster. The value must be globally unique. |
| | Access ID and Access Key | The AccessKey of the account authorized to access the E-MapReduce cluster. |
| | Cluster ID | The ID of the E-MapReduce cluster, which is obtained from the E-MapReduce console. |
| | EmrUserID | The ID of the user who created the E-MapReduce cluster, which is obtained from the E-MapReduce console. |
| | Project ID | The ID of the project in the E-MapReduce cluster, which is obtained from the E-MapReduce console. |
| | YARN resource queue | The name of the resource queue in the E-MapReduce cluster, which is obtained from the E-MapReduce console. |
| | Endpoint | The endpoint of the E-MapReduce cluster, which is obtained from the E-MapReduce console. |
| Interactive Analytics | Instance Display Name | The display name of the compute engine instance. |
| | Interactive Analytics Instance Name | The name of the Interactive Analytics instance. |
| | Database Name | The name of the database in Interactive Analytics. |
| | Servers | The endpoint of the Interactive Analytics instance. |
| | Port | The port of the Interactive Analytics instance. |
| Graph Compute | Instance Display Name | The display name of the compute engine instance. |
| | Bind Graph Compute Instance | The Graph Compute instance to be bound to the workspace. |
| PAI | GPU Utilization | This feature is disabled by default. If you need to use this feature, enable it on the Workspace Management page. |
- Click Create Workspace.
- If you are the owner of a workspace, all data in the workspace belongs to you. Other users cannot access your workspace before you grant permissions to them. If you create a workspace as a Resource Access Management (RAM) user under an Alibaba Cloud account, the workspace belongs to both the RAM user and the Alibaba Cloud account.
- You can add a RAM user to a workspace to allow the RAM user to use the MaxCompute project associated with the workspace instead of creating a workspace as the RAM user.
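The naming rules for Workspace Name and Display Name in the Basic Settings step can be checked before you fill in the form. The following Python sketch is one possible reading of those rules, not the validation that the console itself performs. It assumes that "letters" means ASCII letters, and the function names are hypothetical.

```python
import re

# Workspace Name: 3 to 27 characters, starts with a letter,
# contains only letters, underscores, and digits.
# Assumption: "letters" means ASCII letters A-Z and a-z.
WORKSPACE_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]{2,26}")

# Display Name: up to 27 characters, same character rules.
DISPLAY_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,26}")


def is_valid_workspace_name(name: str) -> bool:
    """Check a candidate workspace name against the documented rules."""
    return WORKSPACE_NAME_RE.fullmatch(name) is not None


def is_valid_display_name(name: str) -> bool:
    """Check a candidate display name against the documented rules."""
    return DISPLAY_NAME_RE.fullmatch(name) is not None
```

Under these assumptions, a name such as my_ws_01 passes both checks, while 1abc (starts with a digit), ab (too short for a workspace name), and my-ws (contains a hyphen) are rejected.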
Configure a workspace
You can click Workspace Settings in the Actions column of a workspace to configure the basic and advanced settings of the workspace. For example, you can modify the display name and description of the workspace and enable the recurrence feature for the workspace.
For more information, see Configure a workspace.
You can configure DataWorks services, compute engines, and machine learning services for a workspace. You must purchase services before you can configure them.
Click Service Configuration in the Actions column of a workspace. In the Change Services dialog box that appears, you can configure services as needed.
To use a new service in your workspace, select the service and click Next. In the Engine Details step, set parameters for the service and click OK.
Go to the DataStudio, Data Integration, or DataService Studio page
Click Data Analytics, Data Integration, or DataService Studio in the Actions column of a workspace to go to the corresponding page.
Delete or disable a workspace
In the Actions column of a workspace, choose More > Delete Workspace or More > Disable Workspace to delete or disable the workspace.
- Delete a workspace
Choose More > Delete Workspace. In the Delete Workspace dialog box that appears, enter the verification code YES and click OK.
Note:
- In the Delete Workspace dialog box, the verification code is always YES.
- Delete a workspace with caution because the workspace cannot be restored once deleted.
- Disable a workspace
Choose More > Disable Workspace. In the Disable Workspace dialog box that appears, click OK.
After you disable a workspace, instances are no longer generated for recurring nodes in the workspace. Instances that were generated before you disabled the workspace still run automatically at their scheduled time. However, you cannot log on to the workspace to view information about these instances.
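The effect of disabling a workspace on recurring node instances can be illustrated with a small model. The following Python sketch is a simplification for illustration, not DataWorks code. The function name and the use of plain timestamps for instance generation are assumptions.

```python
from datetime import datetime


def instances_after_disable(planned_generation_times, disabled_at):
    """Split planned instance generation times around the disable time.

    Instances generated before the workspace is disabled still run.
    Instances that would have been generated at or after that time
    are never generated in this simplified model.
    """
    still_run = [t for t in planned_generation_times if t < disabled_at]
    not_generated = [t for t in planned_generation_times if t >= disabled_at]
    return still_run, not_generated


# Hypothetical example: daily generation, workspace disabled at noon on January 2.
planned = [datetime(2024, 1, 1), datetime(2024, 1, 2), datetime(2024, 1, 3)]
disabled_at = datetime(2024, 1, 2, 12, 0)
still_run, not_generated = instances_after_disable(planned, disabled_at)
```

In this example, the instances generated on January 1 and January 2 still run, and the January 3 instance is never generated.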