DataWorks: Personal development environment

Last Updated: Apr 01, 2025

Data Studio provides personal development environments, which are account-level cloud development instances. A personal development environment can be integrated with a Git repository, File Storage NAS (NAS), and the Python and notebook ecosystems. It supports running scripts on on-premises machines, online debugging, and task committing. Custom images and external service extensions let you tailor the environment for data processing, model training, and collaborative development, which helps improve code quality and R&D efficiency. This topic describes how to use a personal development environment.

Background information

Data Studio allows you to create a personal development environment instance at the account level. After network connectivity is established, you can access a Git repository and NAS, clone code from the Git repository, and develop and debug Python and notebook code online. You can also commit code to a workspace directory for scheduling.

Personal development environment instances provide the following features:

  • Support SQL, AI notebook, and Python ecosystems, allow you to run Python and Shell scripts on your on-premises machine, and provide a code debugging feature.

  • Support the integration with a Git repository for code management. This way, you can clone, push, and manage code with ease.

  • Support the integration with NAS for data management and access.

  • Allow you to use custom images and support the connection with various external services to improve flexibility and extensibility.

Data Studio provides an efficient, flexible, and powerful development environment that allows you to perform operations, such as data processing, analysis, and model training, in a more convenient manner to improve development efficiency and code quality.

Billing

When you create a personal development environment instance, you must specify a resource group and a compute unit (CU) quota. You are charged for the resource group based on the number of CUs and the running duration of the resource group. For more information, see Billing of serverless resource groups.

Important
  • If your personal development environment instance is in the running state and a pay-as-you-go resource group is used, you are charged computing fees that are calculated based on the following formula: Resource quota × Instance running duration. If your personal development environment instance is in the running state and a subscription resource group is used, the available quota of the resource group is occupied.

  • If you no longer need to use a personal development environment instance, stop the instance on the instance management page at the earliest opportunity.
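As a rough illustration of the formula above, the following sketch computes the consumption of a running pay-as-you-go instance. The function name is hypothetical, the duration is assumed to be measured in hours, and the unit price is a placeholder rather than an actual DataWorks rate. For real prices, see Billing of serverless resource groups.

```python
def estimate_cu_hours(resource_quota_cu: float, running_hours: float) -> float:
    """Resource quota x Instance running duration, expressed in CU-hours."""
    if resource_quota_cu < 0 or running_hours < 0:
        raise ValueError("quota and duration must be non-negative")
    return resource_quota_cu * running_hours


# Hypothetical example: a 4-CU instance that is left running for 10 hours.
cu_hours = estimate_cu_hours(4, 10)

# ASSUMPTION: placeholder unit price per CU-hour, not a real DataWorks rate.
HYPOTHETICAL_PRICE_PER_CU_HOUR = 0.1
estimated_fee = cu_hours * HYPOTHETICAL_PRICE_PER_CU_HOUR
print(f"{cu_hours} CU-hours, estimated fee: {estimated_fee:.2f}")
```

Because the fee grows linearly with the running duration, an idle instance that is left running costs as much as a busy one.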

Prerequisites

  • A workspace is created, and Participate in Public Preview of Data Studio is turned on. For more information about how to create a workspace, see Create a workspace.

  • A resource group is created and is associated with the created workspace. For more information, see Create and use a serverless resource group.

Precautions

  • You can select and use only a personal development environment instance that is created by using the current Alibaba Cloud account.

  • A maximum of 10 personal development environment instances can be created by each member in a workspace.

  • Files that you delete from the storage space of a personal development environment instance are handled as follows:

    • After you delete files from the NAS file system that is mounted to a personal development environment instance, the files cannot be found in the recycle bin of Data Studio. The default mount target of the NAS file system is /mnt/data. If you enable the recycle bin feature in the NAS console, the deleted files can be found in the recycle bin of the NAS file system.

    • After you delete files from the built-in storage space of a personal development environment instance, the files cannot be found in the recycle bin of Data Studio. The default storage directory of the personal development environment instance is /mnt/workspace.

  • Each personal development environment instance comes with a cloud disk that has a free quota of 30 GiB. The cloud disk is reclaimed 15 days after the instance is stopped. Save your personal code files at the earliest opportunity.
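Because the built-in cloud disk is reclaimed after the instance has been stopped for 15 days, you may want to copy your work to persistent storage first. The following is a minimal sketch that archives the default built-in directory /mnt/workspace into a mounted NAS file system. It assumes that a NAS file system is mounted at the default mount target /mnt/data. The function name and the backups subdirectory are hypothetical.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_workspace(src: str = "/mnt/workspace",
                     dest_dir: str = "/mnt/data/backups") -> str:
    """Archive `src` as a timestamped .tar.gz under `dest_dir`.

    Returns the path of the created archive.
    ASSUMPTION: `dest_dir` is located on a mounted NAS file system and is
    not inside `src`.
    """
    source = Path(src)
    if not source.is_dir():
        raise FileNotFoundError(f"source directory not found: {src}")

    target = Path(dest_dir)
    target.mkdir(parents=True, exist_ok=True)

    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    base_name = target / f"workspace-{stamp}"
    # make_archive appends the .tar.gz suffix and returns the archive path.
    return shutil.make_archive(str(base_name), "gztar", root_dir=str(source))


if __name__ == "__main__" and Path("/mnt/workspace").is_dir():
    print("Archive written to", backup_workspace())
```

Pushing code to a Git repository serves the same purpose for source files. An archive is mainly useful for data and notebooks that are not under version control.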

Go to the page for creating a personal development environment instance

  1. Go to the Workspaces page in the DataWorks console. In the top navigation bar, select a desired region. Find the desired workspace and choose Shortcuts > Data Studio in the Actions column.

  2. In the top navigation bar, click Select Personal development environment to select an existing personal development environment instance or create a personal development environment instance based on your business requirements.

Create a personal development environment instance

  • If no personal development environment instance is available, click go to new to create one.

  • If a personal development environment instance exists, click Management Environment, and click New Instance in the Personal Development Environment Instances panel to create a personal development environment instance.

Parameters that are involved in creating a personal development environment instance:

Required parameters

  • Instance Name: The name of the personal development environment instance. You can specify a custom name.

  • Resource Group: Select a serverless resource group for the personal development environment instance.

  • Resource Type: The resource type. Valid values: CPU and GPU.

  • Resource Quota: Select resource specifications for the personal development environment instance based on your business requirements.

    • If you select CPU-based specifications, the following information is displayed: specifications, CPU, memory, bandwidth, and Resource quota (Deduction CU).

    • If you select GPU-based specifications, GPU and display memory are displayed in addition to the preceding information.

    You can configure the Maximum CUs or Minimum CUs parameter in the Manage Quota dialog box of your serverless resource group for the current personal development environment instance.

    Important

    Pay attention to the Resource Quota parameter. If your personal development environment instance is in the running state and a pay-as-you-go resource group is used, you are charged computing fees that are calculated based on the following formula: Resource quota × Instance running duration. If your personal development environment instance is in the running state and a subscription resource group is used, the available quota of the resource group is occupied.

  • Select Image: Select an image based on your business requirements. Valid values:

    • CPU image: Select dataworks-notebook:py3.11-ubuntu22.04.

    • GPU image: Select tensorflow-pytorch-develop:2.14-pytorch2.1-gpu-py311-cu118-ubuntu22.04.

      Note

      The selected image is suitable for deep learning frameworks, such as TensorFlow and PyTorch.

  • System Disks: Each personal development environment instance comes with a cloud disk that has a free quota of 30 GiB. The cloud disk is reclaimed 15 days after the instance is stopped. Save your personal code files at the earliest opportunity.

    The default path of the built-in storage space of a personal development environment instance is /mnt/workspace.
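After an instance starts, you may want to confirm that the selected image provides what you expect. The following sketch reports the Python version and whether the deep learning frameworks can be imported. It assumes that the py3.11 and py311 tags in the image names indicate Python 3.11 and that the frameworks use their usual import names, tensorflow and torch. The function name is hypothetical.

```python
import importlib.util
import sys


def describe_environment(packages=("tensorflow", "torch")) -> dict:
    """Report the Python version and whether each package is importable."""
    info = {"python": f"{sys.version_info.major}.{sys.version_info.minor}"}
    for name in packages:
        # find_spec locates a top-level package without importing it.
        info[name] = importlib.util.find_spec(name) is not None
    return info


print(describe_environment())
```

On a CPU image, both framework entries may be False. This sketch only checks availability and does not verify that a GPU is visible to the frameworks.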

Optional parameters

  • Add Storage Source

    You can click Add Storage Source to mount a NAS file system to your personal development environment instance. This way, you can develop files that are stored in the NAS file system in Data Studio, and persistently store the scripts and files that are developed in Data Studio in the NAS file system.

    Parameter description

    • Data Storage: The data storage. Valid values: General-purpose NAS and Extreme NAS.

    • File System: Select an existing file system.

    • File system mount point: Select a mount target for the file system. If no mount target is available, click New Mount Point to create one.

    • File System Path: The mount path and the subpath of the file system. You can log on to the NAS console to query the paths of the file system. For more information, see Query the mount status of an ECS instance.

      • In the Default mount path field, enter the value from the Mount Path column (Area 1 in the following figure).

      • In the File System Path field, enter the value from the NAS Directory column (Area 2 in the following figure). If the specified file system path does not exist in the file system, the personal development environment instance cannot be created.

      [Figure: Mount Path column (Area 1) and NAS Directory column (Area 2)]

    Default mount path
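Once the instance is running, a quick check from a notebook or script can confirm that the storage source is usable. The following sketch assumes the default mount target /mnt/data mentioned in the precautions. The function name is hypothetical.

```python
import os
import tempfile


def check_mount(path: str = "/mnt/data") -> dict:
    """Check that `path` exists, is a mount point, and is writable."""
    result = {"exists": os.path.isdir(path), "is_mount": False, "writable": False}
    if not result["exists"]:
        return result

    result["is_mount"] = os.path.ismount(path)
    try:
        # Create and immediately remove a temporary file to confirm write access.
        with tempfile.NamedTemporaryFile(dir=path):
            result["writable"] = True
    except OSError:
        pass
    return result


print(check_mount())
```

If exists is True but is_mount is False, the directory is an ordinary local directory, and files written to it are not stored in the NAS file system.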

  • Networking

    The network settings are optional. You can configure network settings in the following business scenarios:

    • If you want to access a resource in a virtual private cloud (VPC), configure the VPC, Security Group, and vSwitch parameters in the Networking section.

    • If you do not need to access resources in a VPC, you do not need to specify a VPC. By default, a personal development environment supports Internet access.

    • If you want to access resources in a VPC and over the Internet at the same time, you must specify a VPC and configure an Internet NAT gateway for the VPC.

    Parameter description

    • VPC: We recommend that you select the VPC configured for the data source that you want to use for task development. This reduces the additional configuration required for network connectivity.

    • Security Group: Select an existing security group.

    • vSwitch: Optional. If you do not configure this parameter, the system randomly selects a vSwitch based on the selected VPC.
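To see which of the scenarios above applies to a running instance, you can test TCP reachability from inside it. The following is a minimal sketch that uses only the Python standard library. The function name is hypothetical, and the endpoints in the commented usage lines are placeholders that you would replace with your own VPC and Internet endpoints.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Hypothetical usage (placeholder endpoints, replace with your own):
# print("VPC data source:", can_connect("192.168.0.10", 3306))
# print("Internet:", can_connect("example.com", 443))
```

If the VPC endpoint is reachable but the Internet endpoint is not, the VPC that you selected may lack a NAT gateway. A successful TCP connection does not guarantee that authentication or security group rules for other ports are correct.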

  • Advanced Information

    • Instance RAM Role: You can associate a RAM role with a personal development environment instance to enable the instance to access other cloud services based on temporary credentials of Security Token Service (STS). The temporary credentials are periodically updated. This ensures the security of your AccessKey pair and helps you implement fine-grained permission management by using RAM. By default, this parameter is set to DataWorks Default Role.

      For more information about roles, see Associate a RAM role with an individual development environment.

Stop a personal development environment instance

Important
  • If your personal development environment instance is in the running state and a pay-as-you-go resource group is used, you are charged computing fees that are calculated based on the following formula: Resource quota × Instance running duration. If your personal development environment instance is in the running state and a subscription resource group is used, the available quota of the resource group is occupied.

  • If you no longer need to use a personal development environment instance, stop the instance on the instance management page at the earliest opportunity.

  • Each personal development environment instance comes with a cloud disk that has a free quota of 30 GiB. The cloud disk is reclaimed 15 days after the instance is stopped. Save your personal code files at the earliest opportunity.

You can use one of the following methods to stop a personal development environment instance:

Method 1: Configure a workspace-level automatic shutdown policy

You can configure an automatic shutdown policy for all personal development environment instances in the current workspace in Management Center by using an Alibaba Cloud account or as a workspace administrator. A personal development environment instance that is in the running state in the current workspace is automatically shut down if the instance matches the workspace-level automatic shutdown policy.

  1. Go to the Management Center page.

    Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose More > Management Center. On the page that appears, select the desired workspace from the drop-down list and click Go to Management Center.

  2. In the Personal Development Environment section of the General Configurations page, configure an Automatic Shutdown Policy based on your business requirements.

    Parameter description:

    • The Idle Duration parameter is required. Unit: hours.

    • You must configure at least one of the following parameters: GPU Utilization and CPU Utilization.

    Note
    • The defined shutdown policy applies to all personal development environment instances in the running state in the current workspace. The policy immediately takes effect after it is saved.

    • If the automatic shutdown configurations of personal development environment instances are modified, the system re-calculates the idle duration of the personal development environment instances.
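The parameter rules above can be modeled as follows. This is an illustrative sketch, not the DataWorks implementation: the class and field names are hypothetical, and the matching logic assumes that an instance counts as idle when each reported utilization value is below its configured threshold. The source does not spell out that logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ShutdownPolicy:
    idle_duration_hours: float                    # Idle Duration (required, hours)
    cpu_utilization_pct: Optional[float] = None   # CPU Utilization threshold
    gpu_utilization_pct: Optional[float] = None   # GPU Utilization threshold

    def __post_init__(self):
        if self.idle_duration_hours <= 0:
            raise ValueError("Idle Duration is required and must be positive")
        if self.cpu_utilization_pct is None and self.gpu_utilization_pct is None:
            raise ValueError("configure CPU Utilization, GPU Utilization, or both")

    def is_idle(self, cpu_pct: Optional[float] = None,
                gpu_pct: Optional[float] = None) -> bool:
        """ASSUMPTION: idle means every reported metric is below its threshold."""
        checks = []
        if self.cpu_utilization_pct is not None and cpu_pct is not None:
            checks.append(cpu_pct < self.cpu_utilization_pct)
        if self.gpu_utilization_pct is not None and gpu_pct is not None:
            checks.append(gpu_pct < self.gpu_utilization_pct)
        return bool(checks) and all(checks)

    def should_shut_down(self, idle_hours: float) -> bool:
        """True once the instance has been idle for the configured duration."""
        return idle_hours >= self.idle_duration_hours


policy = ShutdownPolicy(idle_duration_hours=2, cpu_utilization_pct=5)
print(policy.is_idle(cpu_pct=1.5), policy.should_shut_down(idle_hours=3))
```

In this model, modifying the policy would reset idle_hours to zero, which mirrors the note that the idle duration is recalculated after the configuration changes.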

Method 2: Configure a scheduled shutdown time

  1. In the top navigation bar of the Data Studio page, click Select Personal development environment. Then, click Management Environment.

  2. In the Personal Development Environment Instances panel, find the desired instance and click Auto-stop Settings in the Actions column. In the Auto-stop Settings dialog box, turn on Auto-stop and configure the Duration parameter. Then, click OK.

Method 3: Manually stop a personal development environment instance

  1. In the top navigation bar of the Data Studio page, click Select Personal development environment. Then, click Management Environment.

  2. In the Personal Development Environment Instances panel, find the desired instance and click Stop in the Actions column.

Delete a personal development environment instance

If you want to release a personal development environment instance, find the instance in the Personal Development Environment Instances panel and click Delete in the Actions column. In the Delete message, click Delete Instance.

View the resource utilization of a personal development environment instance

In the top navigation bar of the Data Studio page, move the pointer over the name of the personal development environment instance whose resource utilization you want to view. Then, in the brief information section, view the details of each metric.
