Data Science Workshop (DSW) provides a cloud-based integrated development environment (IDE) for AI development, with multiple built-in development environments. If you are familiar with Notebook or VSCode, you can quickly start developing models. This topic uses the MNIST handwriting recognition task as an example to demonstrate how to develop a model in DSW.
The MNIST handwriting recognition task is one of the most classic introductory tasks in deep learning. The goal is to build a machine learning model that recognizes the 10 handwritten digits (0 to 9).

Prerequisites
Before you begin, you must activate PAI and create a workspace using your Alibaba Cloud account. To do this, log on to the PAI console, select a region in the upper-left corner, and then grant the required permissions to activate the product.
Billing information
This example uses public resources to create a DSW instance and an Elastic Algorithm Service (EAS) model service. These resources are billed on a pay-as-you-go basis. For more information about billing rules, see DSW billing and EAS billing.
Create a DSW instance
Go to the DSW page.
Log on to the PAI console.
In the upper-left corner of the page, select the destination region.
In the navigation pane on the left, click Workspace, and then click the name of the workspace that you want to manage.
In the navigation pane on the left, open the Data Science Workshop (DSW) page. Then, click Create Instance.

On the Create Instance page, configure the following key parameters and use the default values for the other parameters.
Resource Type: Select Public Resources. The billing method for this resource type is pay-as-you-go.
Instance Type: Select ecs.gn7i-c8g1.2xlarge. If the inventory for this instance type is insufficient, you can select another GPU-accelerated instance type.
Image config: Select Alibaba Cloud Image, and then search for and select the following image: modelscope:1.26.0-pytorch2.6.0-gpu-py311-cu124-ubuntu22.04. To avoid environment issues, select the same image as the one used in this topic.
Storage Path Mounting: To persistently store files from the model development process, this topic uses Object Storage Service (OSS). Click OSS, click the icon, select a bucket, and create a folder, such as pai_test. If you have not activated OSS or do not have an available bucket in the current region, activate OSS and create a bucket first. The complete parameter configuration is as follows:
Uri: oss://**********oss-cn-hangzhou-internal.aliyuncs.com/pai_test/
Mount Path: /mnt/data/
Click OK to create the DSW instance.
If the instance fails to start, see Common Issues with Instance Startup and Release for troubleshooting.
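With the mount configured as described, a file written under /mnt/data/ in the instance is stored under the pai_test/ folder of the bucket. The following is a minimal sketch of that path mapping. The helper name `to_oss_uri` and the bucket name `my-bucket` are assumptions for illustration, because the actual bucket name is masked in this topic.

```python
from pathlib import PurePosixPath

MOUNT_PATH = "/mnt/data/"  # Mount Path from the instance configuration
# Placeholder bucket name (assumption): use the Uri from your own configuration.
OSS_URI = "oss://my-bucket.oss-cn-hangzhou-internal.aliyuncs.com/pai_test/"


def to_oss_uri(local_path: str) -> str:
    """Map a file path under the mount path to the OSS URI that backs it."""
    # relative_to raises ValueError if the path is outside the mount path.
    relative = PurePosixPath(local_path).relative_to(MOUNT_PATH)
    return OSS_URI + relative.as_posix()


print(to_oss_uri("/mnt/data/web.py"))
```

Paths outside the mount path raise a ValueError, which is a reminder that only files under /mnt/data/ are persisted to OSS.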
Develop a model in DSW
Open the DSW instance
Click Open to go to the development environment of the DSW instance that you created.

The PAI-DSW interface is shown in the following figure:

Write the model development code. This topic uses the Notebook development environment as an example and provides the training code for MNIST handwriting recognition. Click mnist.ipynb to download the code. Then, in the upper-left corner of the DSW page, click the icon to upload the code file.
Run the model training code. Open the mnist.ipynb file, find the cell that contains the training code, and then click the button to run the code. The code automatically downloads the MNIST dataset to the dataSet directory and saves the best checkpoint to the output directory after training. The training process takes about 10 minutes.

During training, the accuracy of the model on the validation set is displayed. This value represents the model's generalization ability on unknown data. In this example, the accuracy on the validation set is 98%, which indicates that the model performs well. You can proceed to the next steps.
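Validation accuracy is the fraction of validation samples whose predicted digit matches the true label. The following plain-Python sketch shows the calculation. The function name `accuracy` and the sample values are assumptions for illustration and are not taken from mnist.ipynb.

```python
def accuracy(predictions: list[int], labels: list[int]) -> float:
    """Return the fraction of predictions that match the true labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)


# Illustrative values only, not output from the notebook.
labels = [7, 2, 1, 0, 4]
predictions = [7, 2, 1, 0, 9]
print(f"validation accuracy: {accuracy(predictions, labels):.0%}")
```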
View the loss curve in TensorBoard to understand the training status. Run the following cell and click the TensorBoard URL: http://localhost:6006/.
In TensorBoard, you can view the train_loss curve, which reflects the loss on the training dataset, and the validation_loss curve, which reflects the loss on the validation set.

After you view the graph, click the icon in the cell to stop TensorBoard.
Invoke the trained model to test its performance. Run the cell shown in the figure. The cell displays 20 test images and outputs their true labels and the model's prediction results.

Sample output:

Copy the model file to Object Storage Service (OSS) for persistent storage. The DSW instance in this topic is created using public resources, and its files are stored on a free disk. If the instance remains stopped for more than 15 days, the content on the disk is deleted. Therefore, you must copy the model file to OSS for persistent storage. This also makes it easier to deploy the model using PAI-EAS later.
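Because the OSS folder is mounted at /mnt/data/, copying a file into that directory is enough to persist it. The following sketch shows one way to do this with the Python standard library. The helper name `copy_to_mount` is an assumption, and the notebook's own copy cell may differ.

```python
import shutil
from pathlib import Path


def copy_to_mount(src: str, mount_dir: str = "/mnt/data/") -> Path:
    """Copy a file into the OSS-mounted directory and return the new path."""
    source = Path(src)
    if not source.is_file():
        raise FileNotFoundError(f"model file not found: {source}")
    target_dir = Path(mount_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    # copy2 also preserves file metadata such as the modification time.
    return Path(shutil.copy2(source, target_dir / source.name))
```

For example, `copy_to_mount("output/best_model.pth")` would copy a checkpoint into the mounted folder. The file name here is hypothetical, so use the name of the checkpoint that training actually wrote to the output directory.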

Log on to the OSS console to view the file:

This completes the model development. If you want to invoke the model in other applications in a production environment, see Deploy the model as an online service using EAS.
The DSW instance in this topic is created using public resources and is billed on a pay-as-you-go basis. When you no longer need the DSW instance, stop or delete it to avoid further charges.
Deploy the model as an online service using EAS
Once a model is trained, you can use Elastic Algorithm Service (EAS) to quickly deploy it as an online inference service or an AI web application. EAS supports heterogeneous resources and combines features such as automatic scaling, one-click stress testing, canary release, and real-time monitoring to ensure stable, continuous service in high-concurrency scenarios at a lower cost.
Write a web interface for the model service and copy it to OSS. The web interface code and the copy command are provided. You can run the following cell to perform these operations.

(Optional) Verify that the web interface can be started in DSW. Run the following cell to install the missing third-party packages and start the service.
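To illustrate the request and response contract of the service (a POST to /predict_image with the image bytes as the body, and a JSON reply such as {"prediction": 7}), the following sketch uses Python's standard-library http.server in place of the bottle library that web.py in this topic depends on. It is not the web.py provided with this topic. The names `predict_digit`, `PredictHandler`, and `make_server` are assumptions, and the prediction function is a placeholder that does not load the trained model.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def predict_digit(image_bytes: bytes) -> int:
    """Placeholder for model inference (assumption).

    A real service would decode the image and run the trained checkpoint.
    This stub always returns 7 so that the sketch runs without a model.
    """
    return 7


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict_image":
            self.send_error(404, "unknown route")
            return
        # The request body is the raw binary content of the image.
        length = int(self.headers.get("Content-Length", 0))
        image_bytes = self.rfile.read(length)
        body = json.dumps({"prediction": predict_digit(image_bytes)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # Suppress per-request logging to keep cell output quiet.


def make_server(port: int = 9000) -> ThreadingHTTPServer:
    """Create a server on the given port (9000 matches the port in this topic)."""
    return ThreadingHTTPServer(("0.0.0.0", port), PredictHandler)
```

Calling `make_server().serve_forever()` starts the sketch on port 9000 and blocks until it is interrupted.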

Run the code to test the service interface. At the top of the page, click WebIDE. On the left, click the request_web.py code file. Then, click the icon to run the code and send a request to the service interface.
The following result is returned:
{"prediction": 7}
Note: If you want to directly access the web service that is running in DSW from the internet, you must configure a virtual private cloud (VPC), a NAT Gateway, and an Elastic IP Address (EIP) for DSW. For more information, see Access a service in an instance over the internet.
Configure EAS. In the PAI console, in the navigation pane on the left, click Elastic Algorithm Service (EAS) > Deploy Service > Custom Deployment.

Configure the following key parameters and use the default values for the other parameters:
Deployment Method: Image-based Deployment
Image Configuration: Select Image Address. Copy and paste the URL of the image that is used for the DSW instance.
The environment of this image was verified to correctly run the model service code in this topic when you used DSW. Therefore, use the same image for deployment to avoid unknown runtime environment issues.

Mount storage: The model file and service interface code have been copied to OSS. Therefore, click OSS and select the corresponding OSS path.

Command: The command is the same as the service startup command in DSW. However, because web.py is now mounted to /mnt/data/, you must modify the path of web.py accordingly. The final command is: python /mnt/data/web.py
Port: Configure the port that is used in web.py, which is 9000.
Third-party Library Configuration: During testing in DSW, the selected image was found to be missing the bottle library. Therefore, you must add this library in the third-party library configuration.

Resource Type: Select Public Resources. For Resource Specification, select ecs.gn7i-c8g1.2xlarge.
Configure a system disk: Click Show More and set Extra System Disk to 20 GB.
Because the image that is used is large, the service cannot start due to insufficient space if you do not set an extra system disk.
Click Deploy to create the service. The creation process takes about 5 minutes. When the status changes to Running, the service is deployed.
View the invocation information. On the model service details page, click View Invocation Information to obtain the Public Endpoint and Token.

Invoke the service. Run the following service request code. Replace the Endpoint and Token in the code with the actual information that you obtained in the previous step.
import requests

"""
Test image URLs:
label is 7 http://aliyun-document-review.oss-cn-beijing.aliyuncs.com/dsw_files/mnist_label_7_No_0.jpg
label is 2 http://aliyun-document-review.oss-cn-beijing.aliyuncs.com/dsw_files/mnist_label_2_No_1.jpg
label is 1 http://aliyun-document-review.oss-cn-beijing.aliyuncs.com/dsw_files/mnist_label_1_No_2.jpg
label is 0 http://aliyun-document-review.oss-cn-beijing.aliyuncs.com/dsw_files/mnist_label_0_No_3.jpg
label is 4 http://aliyun-document-review.oss-cn-beijing.aliyuncs.com/dsw_files/mnist_label_4_No_4.jpg
label is 5 http://aliyun-document-review.oss-cn-beijing.aliyuncs.com/dsw_files/mnist_label_9_No_5.jpg
"""
image_url = 'http://aliyun-document-review.oss-cn-beijing.aliyuncs.com/dsw_files/mnist_label_7_No_0.jpg'

# The client downloads the image to get the binary data.
img_response = requests.get(image_url, timeout=10)
# Automatically check if the request is successful based on the status code.
img_response.raise_for_status()
img_bytes = img_response.content

# Header information. Replace YOUR_TOKEN with the actual token.
# In a production environment, we recommend that you set the token as an environment variable to prevent sensitive information leaks.
# For more information about how to configure environment variables, see https://www.alibabacloud.com/help/en/sdk/developer-reference/configure-the-alibaba-cloud-accesskey-environment-variable-on-linux-macos-and-windows-systems
headers = {"Authorization": "YOUR_TOKEN"}

# Send the binary data as the body of a POST request to the model service.
resp = requests.post('YOUR_ENDPOINT/predict_image', data=img_bytes, headers=headers)
print(resp.json())

The following result is returned:
{"prediction": 7}
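The request code recommends keeping the token in an environment variable instead of in the source file. The following is a minimal sketch of that approach. The variable name `EAS_TOKEN` and the helper name `build_headers` are assumptions for illustration, not names defined by EAS.

```python
import os


def build_headers(env_var: str = "EAS_TOKEN") -> dict[str, str]:
    """Build the Authorization header from an environment variable."""
    token = os.environ.get(env_var)
    if not token:
        # Fail early with a clear message instead of sending an empty token.
        raise RuntimeError(f"set the {env_var} environment variable first")
    return {"Authorization": token}
```

The returned dictionary can be passed as the `headers` argument of `requests.post` in place of the hard-coded value.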
The EAS service in this topic is created using public resources and is billed on a pay-as-you-go basis. When you no longer need the service, stop or delete it to avoid further charges.

References
For more information about how to troubleshoot DSW instance startup failures, see Create a DSW instance.
For more information about DSW billable items and billing methods, see DSW billing.
For more information about the core features of DSW, see DSW overview.
For more information about how to access a web service that is running in DSW directly from the internet, see Access a service in an instance over the internet.
For more information about the core features of EAS, see EAS overview.
