Platform for AI: Create and manage DSW instances

Last Updated: Mar 06, 2024

Data Science Workshop (DSW) is a cloud integrated development environment (IDE) for interactive programming that is provided by Platform for AI (PAI). Before you use DSW, you must create a DSW instance. This topic describes how to create a DSW instance, view the instance details, and modify the instance settings.

Background information

You can create and manage DSW instances in the PAI console.

Prerequisites

  • The required permissions to use DSW are granted. For more information, see Grant the permissions that are required to use DSW.

  • The general training resources or Lingjun AI Computing Service (Lingjun) resources that you want to use to run DSW instances are prepared.

  • (Optional) An Apsara File Storage NAS (NAS) or Object Storage Service (OSS) dataset is created. For more information, see Create and manage datasets.

    Important

    After you create an OSS or NAS dataset for a DSW instance, you must obtain the permissions to access OSS or NAS. Otherwise, the DSW instance fails to read or write data. For more information, see Grant the permissions that are required to use DSW.

    • If you create a DSW instance by using a public resource group, DSW provides a limited quota of disk storage. You can mount datasets to increase the storage size.

    • If you create a DSW instance by using a dedicated resource group, DSW provides non-persistent local storage. You can mount datasets to persist data.
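After a dataset is mounted, you may want to confirm from inside the instance that the mount directory is readable and writable. The following Python snippet is a minimal sketch of such a check, not an official PAI tool. The function name `can_read_write` is hypothetical, and `/mnt/workspace/` is used only because this topic names it as the default working directory of DSW. Replace it with the directory to which your dataset is mounted. Some mount types may handle temporary files differently, so treat a failure as a hint to review the permissions rather than as a definitive diagnosis.

```python
import os
import tempfile


def can_read_write(mount_dir: str) -> bool:
    """Return True if a small probe file can be written to and read back from mount_dir."""
    if not os.path.isdir(mount_dir):
        return False
    try:
        # The probe file is deleted automatically when the with-block exits.
        with tempfile.NamedTemporaryFile(dir=mount_dir, prefix=".dsw_rw_check_") as probe:
            probe.write(b"ok")
            probe.flush()
            probe.seek(0)
            return probe.read() == b"ok"
    except OSError:
        # Covers permission errors and I/O errors raised by the mount.
        return False


if __name__ == "__main__":
    # Assumption: the dataset is mounted to the default working directory.
    print(can_read_write("/mnt/workspace/"))
```

If the check returns False, verify that the permissions described in Grant the permissions that are required to use DSW are granted.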

Create a DSW instance

  1. Go to the Interactive Modeling (DSW) page.

    1. Log on to the PAI console.

    2. On the Overview page, select a region in the top navigation bar.

    3. In the left-side navigation pane, click Workspaces. On the Workspaces page, click the name of the workspace that you want to manage.

    4. In the left-side navigation pane, choose Model Development and Training > Interactive Modeling (DSW).

  2. Click Create Instance.

  3. In the Configure Instance step, configure the parameters. The following table describes the parameters.

    Parameter

    Description

    Instance Name

    The name of the DSW instance.

    Resource Quota

    Valid value:

    • Public Resource Group (default): The public resource quota. You can select CPU Specifications or GPU Specifications and then select an instance type. For information about the metrics and descriptions of instance types, see Overview of instance families.

    Storage

    • If you set the Resource Quota parameter to Public Resource Group, select one of the following options:

      • System Disk: Use the free disk that is provided for each pay-as-you-go instance to persist data. If the instance remains stopped for more than 15 days, the data on the disk is cleared.

        • Free disk storage for an instance that uses GPU specifications: 100 GB.

        • Free disk storage for an instance that uses CPU specifications: 30 GB.

      • Shared Datasets: Mount a dataset to increase the storage size on top of the free disk storage. You can mount OSS, Apsara File Storage NAS, and Cloud Parallel File Storage (CPFS) datasets. If no datasets are available, click Create Dataset to create a dataset. For information about how to create a dataset, see Create and manage datasets.

    • If you set the Resource Quota parameter to a dedicated resource quota:

      The system disk of the DSW instance is used for temporary storage. Data in the system disk is cleared when the instance is stopped or deleted. If you want to permanently store the data, click Shared Datasets and mount a dataset to the instance. If no datasets are available, click Create Dataset to create a dataset. For information about how to create a dataset, see Create and manage datasets.

    Note
    • You cannot mount multiple datasets to the same directory.

    • If you use a CPFS dataset, specify a virtual private cloud (VPC) for the instance. The VPC must be the same as the VPC of the CPFS dataset. Otherwise, the DSW instance may fail to be created.

    • If you set the Resource Quota parameter to a dedicated resource quota, the first dataset that you mount to the instance must be a NAS dataset. The dataset is mounted to both the directory that you specify and the default working directory of DSW (/mnt/workspace/).

      Select Image

      Valid values:

      • Alibaba Cloud Image: a pre-built official image. PAI provides pre-built images for multiple versions of frameworks, including Python, TensorFlow, and PyTorch.

      • Custom Image: a custom image that you created. For information about how to create a custom image, see Custom images.

      • Image URL: the publicly accessible URL of the image that you want to use. You can use an image that belongs to a Container Registry Personal Edition instance in the current region.

      Networking

      This parameter is available only if you set the Resource Quota parameter to Public Resource Group.

      VPC: You can use a VPC to connect to a DSW instance. If you configure the VPC parameter, you must also configure the vSwitch and Security Group parameters.

      You can select an existing VPC or click Create VPC to the right of VPC to create a VPC.

      You can configure the following options if you set the Storage parameter to a CPFS dataset:

      • Enable All Options: By default, this option is not selected, and the system disables the VPCs that cannot connect to the CPFS dataset.

      • Hide Unavailable Options: If you select this option, the system does not display the VPCs that cannot connect to the CPFS dataset.

      Note
      • In most cases, we recommend that you do not use VPCs.

      • If you set the Storage parameter to a CPFS dataset, configure a VPC that is the same as the VPC of the CPFS dataset.

      Internet Access Gateway: You can select one of the following options:

      • Public Gateway: The instance shares bandwidth with other instances in the cluster. The download rate can be slow in high concurrency scenarios.

      • Private Gateway: The instance uses dedicated bandwidth. You can configure the dedicated bandwidth based on your business requirements. If you select this option, you must create an Internet NAT gateway for the VPC that is associated with the DSW instance, associate an elastic IP address (EIP) with the DSW instance, and configure an SNAT entry. For more information, see Configure a DSW instance to access the Internet by using a private NAT gateway.

  4. In the Confirm step, check the parameter configurations, read and select Machine Learning DSW Terms of Service, and then click Create Instance.
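The storage and networking rules described in the preceding parameters can be collected into a small pre-flight check. The following Python snippet is a sketch for illustration only. It does not call any PAI API, and the class names, field names, and the `validate` function are hypothetical. It encodes four rules from this topic: datasets cannot share a mount directory, the first dataset on a dedicated resource quota must be a NAS dataset, a VPC requires a vSwitch and a security group, and a CPFS dataset requires the instance to use the same VPC as the dataset.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Dataset:
    kind: str                      # "OSS", "NAS", or "CPFS"
    mount_dir: str                 # directory in the instance where the dataset is mounted
    vpc_id: Optional[str] = None   # VPC of the dataset (relevant for CPFS)


@dataclass
class InstanceConfig:
    # Hypothetical model of the console form; not a PAI data structure.
    name: str
    public_resource_group: bool = True
    datasets: List[Dataset] = field(default_factory=list)
    vpc_id: Optional[str] = None
    vswitch_id: Optional[str] = None
    security_group_id: Optional[str] = None


def validate(cfg: InstanceConfig) -> List[str]:
    """Return a list of rule violations; an empty list means no problem was found."""
    problems: List[str] = []

    # Rule: you cannot mount multiple datasets to the same directory.
    dirs = [d.mount_dir.rstrip("/") for d in cfg.datasets]
    if len(dirs) != len(set(dirs)):
        problems.append("Multiple datasets are mounted to the same directory.")

    # Rule: on a dedicated resource quota, the first mounted dataset must be NAS.
    if not cfg.public_resource_group and cfg.datasets and cfg.datasets[0].kind != "NAS":
        problems.append("The first dataset on a dedicated resource quota must be a NAS dataset.")

    # Rule: a VPC also requires a vSwitch and a security group.
    if cfg.vpc_id and not (cfg.vswitch_id and cfg.security_group_id):
        problems.append("A VPC is set, but the vSwitch or security group is missing.")

    # Rule: a CPFS dataset requires the instance to use the VPC of the dataset.
    for d in cfg.datasets:
        if d.kind == "CPFS" and (cfg.vpc_id is None or cfg.vpc_id != d.vpc_id):
            problems.append(f"The CPFS dataset at {d.mount_dir} requires the same VPC as the instance.")

    return problems


if __name__ == "__main__":
    cfg = InstanceConfig(
        name="demo",
        datasets=[Dataset("CPFS", "/mnt/data", vpc_id="vpc-example")],
    )
    for problem in validate(cfg):
        print(problem)
```

The check returns messages instead of raising an exception so that all violations can be reported in one pass before you click Create Instance.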

Manage an instance

    You can manage a DSW instance as shown in the following figure.

    [Figure: image.png]

    Warning

    If a DSW instance is created by using a dedicated resource group and no datasets are mounted to the instance, the system disk of the DSW instance is used for temporary storage. If the DSW instance is stopped or deleted, data cannot be restored. Proceed with caution.

    • Move the pointer over the sections that are labeled with ① to view the details of the instance, the change log of the instance status, and the resource type.

    • Click Auto-stop Settings. In the Auto-stop Settings dialog box, turn on Auto-stop and specify the time when you want the system to stop the DSW instance.

    • Click Save Image and follow the on-screen instructions to save the DSW instance to Container Registry Personal Edition or Container Registry Enterprise Edition. This way, you can use the image for further development.

    • Click the name of a DSW instance. On the details page of the instance, you can view and modify the basic information and settings of the instance.

      • In the Basic Information section, you can change the instance name.

      • On the Instance Settings tab, click Change Settings to modify the resource, image, dataset, or network settings.
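When you stop an instance or configure Auto-stop, it can help to know how long data on the system disk is kept. The following Python snippet is a sketch that restates the retention rules from this topic as code. The function and constant names are hypothetical, and the snippet does not query PAI. It assumes that the system disk of a public resource group instance is kept for 15 days after the instance is stopped, and that the system disk of a dedicated resource group instance is cleared as soon as the instance is stopped.

```python
from datetime import datetime, timedelta

# Values taken from the Storage parameter description in this topic.
RETENTION_AFTER_STOP = timedelta(days=15)
FREE_DISK_GB = {"GPU": 100, "CPU": 30}


def system_disk_cleared_after(stopped_at: datetime, public_resource_group: bool) -> datetime:
    """Return the earliest time at which system-disk data should be assumed lost."""
    if public_resource_group:
        # Data is cleared if the instance remains stopped for more than 15 days.
        return stopped_at + RETENTION_AFTER_STOP
    # Dedicated resource group: the system disk is temporary storage only.
    return stopped_at


if __name__ == "__main__":
    stopped = datetime(2024, 3, 6, 9, 0)
    print(system_disk_cleared_after(stopped, public_resource_group=True))
    print(system_disk_cleared_after(stopped, public_resource_group=False))
```

Data stored in a mounted dataset is not affected by these rules, which is why mounting a dataset is the recommended way to persist data.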

References

    • After you create a DSW instance, you can prepare the data files that are required for development. DSW supports multiple data sources, including OSS, NAS, and MaxCompute. For more information, see Read and write data and file transfer. DSW also supports uploading and downloading small data files. For more information, see Upload and download data files.

    • For information about the features and workflow of DSW and how to get started with DSW, see DSW overview.

    • For information about the use cases of DSW, see DSW use cases.