This topic provides an overview of Data Science Workshop (DSW) 2.0. DSW 2.0 is built on cloud-native technologies, such as Docker and Kubernetes, on Alibaba Cloud. It provides an open, AI-aided development environment in which you can train models with high elasticity.

Features

DSW 2.0 was released on May 7, 2020. Compared with DSW 1.0, it supports the following new features:
  • Allows you to purchase DSW instances in a convenient way and provides more types of computing resources.
  • Allows you to start and stop DSW instances on demand, save images with one click, restore development environments, and access Virtual Private Cloud (VPC) networks.
  • Provides integrated development environments:
    • Provides built-in big data development packages and algorithm packages, and grants you sudo permissions so that you can install third-party libraries.
    • Provides built-in JupyterLab plug-ins, such as Git and TensorBoard, to improve development efficiency.
    • Provides official images that support different versions of mainstream computing frameworks, such as TensorFlow and PyTorch.
    • Provides a built-in WebIDE in which you can install plug-ins as needed.
  • Supports basic features of Machine Learning Platform for AI, including EasyVision, AutoML, TAO, and CommonIO. EasyVision is a computer vision algorithm tool. AutoML can help you automatically tune parameters. TAO is used for compilation optimization. CommonIO allows you to read data from MaxCompute tables.

Instance management

You can perform the following operations to manage instances of DSW 2.0:
  • Start

    You can start a DSW instance that is in the Stopped state, or restart a DSW instance that failed to start. After the DSW instance is started, the system automatically loads the saved image to restore the development environment for the instance. Billing for a pay-as-you-go DSW instance begins immediately after the instance is started.

  • Stop
    You can stop a DSW instance by using one of the following methods:
    • Stop directly: Release the underlying Elastic Compute Service (ECS) instance and stop the DSW instance. If the DSW instance is a pay-as-you-go instance, the system stops billing the instance after the instance is stopped.
    • Save before stop: Save the Docker image, and then stop the ECS and DSW instances. The system stops billing the ECS instance immediately after the ECS instance is stopped, and automatically restores the development environment when you restart the DSW instance. This method takes longer than stopping the instance directly.
  • Delete

    If you delete a DSW instance in the Machine Learning Platform for AI console, the system releases the DSW instance, ECS instance, and system disk. Data cannot be restored after the DSW instance is deleted. The VPC network, vSwitch, and security group that were automatically created when you purchased the DSW instance are retained. To release all resources, log on to the Container Service for Kubernetes console and delete the cluster whose name starts with DSW_.
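
The start, stop, and delete behavior described above can be summarized as a small state model. The following is a minimal sketch for illustration only, not the DSW API: the class `DswInstance`, its fields, and the state names other than Stopped are hypothetical, and the restart of an instance that failed to start is left out for brevity.

```python
from dataclasses import dataclass

# Hypothetical state names; only "Stopped" appears in the text above.
STOPPED, RUNNING, DELETED = "Stopped", "Running", "Deleted"


@dataclass
class DswInstance:
    """Illustrative model of a DSW instance (not a real SDK class)."""
    state: str = STOPPED
    pay_as_you_go: bool = True
    saved_image: bool = False  # True once a Docker image has been saved
    billing: bool = False

    def start(self) -> bool:
        """Start a stopped instance. Returns True if a saved image is restored."""
        if self.state != STOPPED:
            raise ValueError(f"cannot start from state {self.state}")
        self.state = RUNNING
        # Pay-as-you-go billing begins as soon as the instance is started.
        self.billing = self.pay_as_you_go
        return self.saved_image

    def stop(self, save_first: bool = False) -> None:
        """Stop directly, or save the Docker image before stopping."""
        if self.state != RUNNING:
            raise ValueError(f"cannot stop from state {self.state}")
        if save_first:
            self.saved_image = True
        self.state = STOPPED
        self.billing = False

    def delete(self) -> None:
        """Release the instance. Data cannot be restored afterwards."""
        if self.state == DELETED:
            raise ValueError("instance is already deleted")
        self.state = DELETED
        self.billing = False
        self.saved_image = False
```

In this model, calling `stop(save_first=True)` and then `start()` returns True because the saved image is restored, whereas stopping directly on a fresh instance leaves nothing to restore.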

Development environments

DSW 2.0 supports the following development environments:
  • JupyterLab

    JupyterLab provides built-in plug-ins, such as TensorBoard (visualization tool) and Git, which can be used to facilitate debugging, optimization, and management.

  • WebIDE

    WebIDE supports code engineering and allows you to develop, debug, and run code online. It also allows you to install plug-ins to meet your requirements.

  • Terminal

    Terminal is suited to command-line work. DSW 2.0 offers a development experience similar to that of an on-premises machine and grants you sudo permissions, so you can install the plug-ins and libraries that you need.

Preset basic features

DSW 2.0 supports basic features of Machine Learning Platform for AI. You can use EasyVision, a computer vision algorithm tool, to evaluate image classification models and make predictions. You can also use AutoML to tune the hyperparameters of algorithms. DSW 2.0 also provides the CommonIO component, which allows you to read data from MaxCompute tables.

Official images

DSW 2.0 provides the following official images:
Image                                 Description
py27_cpu_tf1.12_ubuntu                Supports TensorFlow 1.12 (CPU).
py27_cuda90_tf1.12_ubuntu             Supports TensorFlow 1.12 (GPU).
py36_cuda101_tf2.1_torch1.4_ubuntu    Supports TensorFlow 2.1 and PyTorch 1.4 (GPU).
py36_cpu_tf2.1_torch1.4_ubuntu        Supports TensorFlow 2.1 and PyTorch 1.4 (CPU).
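
The image names above appear to encode the Python version, the device type, and the framework versions as underscore-separated tokens. The following sketch splits a name into those parts. The naming convention is inferred from the four names listed here and is an assumption, not a documented format, and `parse_image_name` is a hypothetical helper.

```python
import re
from typing import Dict


def parse_image_name(name: str) -> Dict[str, str]:
    """Split an image name such as 'py36_cuda101_tf2.1_torch1.4_ubuntu'
    into its parts, assuming the token convention inferred from the table."""
    info: Dict[str, str] = {}
    for part in name.split("_"):
        py = re.fullmatch(r"py(\d)(\d+)", part)
        cuda = re.fullmatch(r"cuda(\d+)(\d)", part)
        tf = re.fullmatch(r"tf(\d+(?:\.\d+)*)", part)
        torch = re.fullmatch(r"torch(\d+(?:\.\d+)*)", part)
        if py:
            # 'py36' is read as Python 3.6
            info["python"] = f"{py.group(1)}.{py.group(2)}"
        elif part == "cpu":
            info["device"] = "cpu"
        elif cuda:
            # 'cuda101' is read as CUDA 10.1, which implies a GPU image
            info["device"] = "gpu"
            info["cuda"] = f"{cuda.group(1)}.{cuda.group(2)}"
        elif tf:
            info["tensorflow"] = tf.group(1)
        elif torch:
            info["pytorch"] = torch.group(1)
        else:
            # Any remaining token is treated as the base OS (assumption)
            info["os"] = part
    return info


print(parse_image_name("py36_cuda101_tf2.1_torch1.4_ubuntu"))
```

A helper like this can be useful for picking an image programmatically, for example to select only the images whose `device` is `gpu`.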

Service-linked roles

Before you create an instance of DSW 2.0, you must assign the following service-linked roles to DSW. For more information about how to assign service-linked roles to DSW, see Authorization.
Role                             Description
AliyunPAIDSWDefaultRole          DSW assumes this role to access your cloud resources.
AliyunCSDefaultRole              Container Service assumes this role to access your cloud resources while managing clusters.
AliyunCSManagedLogRole           The Log Service component for Container Service for Kubernetes (ACK) clusters assumes this role to access your cloud resources.
AliyunCSManagedCmsRole           The CloudMonitor (CMS) component for Container Service clusters assumes this role to access your cloud resources.
AliyunCSClusterRole              Container Service assumes this role to access your cloud resources during application runtime.
AliyunCSKubernetesAuditRole      The auditing feature of ACK assumes this role to access your cloud resources.
AliyunCSManagedNetworkRole       The Network component for Container Service clusters assumes this role to access your cloud resources.
AliyunCSManagedKubernetesRole    Managed ACK clusters assume this role to access your cloud resources.
AliyunESSDefaultRole             Auto Scaling assumes this role to access your cloud resources.
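
Before you create an instance, it can help to check which of the roles listed above are still unassigned. The following is a minimal sketch of that check. It assumes you already have the names of the roles assigned to your account; how you obtain them is not shown, and `missing_roles` is a hypothetical helper rather than part of any Alibaba Cloud SDK.

```python
from typing import Iterable, List

# Role names copied from the table above.
REQUIRED_ROLES = frozenset({
    "AliyunPAIDSWDefaultRole",
    "AliyunCSDefaultRole",
    "AliyunCSManagedLogRole",
    "AliyunCSManagedCmsRole",
    "AliyunCSClusterRole",
    "AliyunCSKubernetesAuditRole",
    "AliyunCSManagedNetworkRole",
    "AliyunCSManagedKubernetesRole",
    "AliyunESSDefaultRole",
})


def missing_roles(assigned: Iterable[str]) -> List[str]:
    """Return the required roles that are not in `assigned`, sorted by name."""
    return sorted(REQUIRED_ROLES - set(assigned))


# Example: only the DSW role has been assigned so far (hypothetical input).
for role in missing_roles(["AliyunPAIDSWDefaultRole"]):
    print("Not yet assigned:", role)
```

An empty result means that every role in the table is already assigned.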