Apsara File Storage NAS (NAS) is a file system service intended for computing services, such as Elastic Compute Service (ECS) instances, Elastic High Performance Computing (E-HPC) clusters, and Container Service for Kubernetes (ACK) clusters. NAS provides seamless integration, storage sharing, and security management. NAS is suitable for multi-cluster applications because the ECS instances, E-HPC clusters, or ACK clusters where the application is deployed may need to access the same data source. This topic describes how to configure a shared NAS volume.
Background information
To keep data secure and to share training data, we recommend that you configure shared volumes in the runtime environment where you submit jobs with Arena. Code and data stored on a shared volume live outside the containers, so they are not deleted when the containers are deleted, and the members of your team can work with the same data and code.
When you use Arena to submit jobs, if you have declared a shared volume and the path in the runtime environment to mount it to, you can specify the --data parameter to mount the shared volume to that path. This enables your jobs to reuse the data and code in the path.
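As a rough illustration of the --data parameter, the following sketch only assembles and prints an arena submit command; it does not run it. The PVC name training-data, the mount path /data, the job name, the image, and the training command are all hypothetical placeholders, and the pvc-name:mount-path form of --data is assumed from common Arena usage, so check the help output of your Arena version before relying on it.

```shell
#!/bin/sh
# Build (but do not run) an Arena command that mounts a PVC through --data.
# Every value below is a hypothetical placeholder.
build_arena_cmd() {
  pvc_name="$1"     # name of an existing PVC, for example training-data
  mount_path="$2"   # path inside the job containers, for example /data
  printf 'arena submit tf --name=%s --image=%s --data=%s:%s "%s"\n' \
    "tf-mnist-demo" \
    "registry.example.com/tf-mnist:latest" \
    "$pvc_name" "$mount_path" \
    "python /data/train.py"
}

build_arena_cmd training-data /data
```

Printing the command first makes it easy to review the PVC name and mount path before a job is actually submitted.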
In Kubernetes, volumes are typically declared by using persistent volumes (PVs) and persistent volume claims (PVCs). As the administrator of a Kubernetes cluster, you must create a PVC for each data scientist in your team. For example, you can mount the PVCs of User A and User B to the same NAS or CPFS file system, and mount these PVCs to different subdirectories to isolate User A from User B.
Step 1: Create a NAS file system
For more information about how to create a NAS file system, see Create a General-purpose NAS file system in the NAS console.
Set the parameters based on the following descriptions:
Select General Purpose NAS as the file system type.
Select the region in which your ACK cluster is deployed.
Select the virtual private cloud (VPC) in which your ACK cluster is deployed.
Select NFS as the protocol type.
Step 2: Mount the file system
After the NAS file system is created, you need to mount the NFS file system to an Elastic Compute Service (ECS) instance in the VPC of the ACK cluster and verify the mounting. This example shows how to mount the NFS file system in the console. For more information about how to mount the NFS file system by using other methods, see Usage notes.
Mount a NAS file system
Log on to the NAS console and perform the following operations:
In the left-side navigation pane, choose File System > File System List.
In the top navigation bar, select a region.
Click the icon in front of the file system name to expand the list of mount targets.
Choose in the Actions column of the mount target.
In the Mount on ECS dialog box, configure the following parameters:
- ECS Instances: Select the created ECS instance from the drop-down list. The ECS instance runs CentOS 7.9.
- Mount Path: Enter the local path of the ECS instance on which you want to mount the file system. Example: /mnt.
- Automatic Mount: By default, Automatic Mount at Startup is selected. When you restart the ECS instance, you do not need to re-mount the file system.
- Protocol Type: Select NFSv3.
- NAS Directory: Enter the directory of the NAS file system. You can enter the root directory (/) of the NAS file system.
- Mount Parameters: We recommend that you use the default mount parameters. For more information, see the mount parameters described in Mount an NFS file system on a Linux ECS instance.
Note: The first time you use the mount feature in the NAS console, you must assign the AliyunServiceRoleForNasEcsHandler service-linked role to NAS. Follow the instructions in the dialog box to complete authorization. For more information, see Service-linked roles of NAS.
Click Mount.
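If you prefer the command line, the console settings above roughly correspond to a manual NFSv3 mount. The sketch below only prints such a command. The mount target domain is a placeholder that you must replace with the address of your own mount target, and the option string is an assumption based on commonly recommended NFS options for NAS, not the exact defaults that the console applies.

```shell
#!/bin/sh
# Print (but do not run) a manual NFSv3 mount command for a NAS mount target.
nas_mount_cmd() {
  mount_target="$1"   # mount target domain name (placeholder)
  local_path="$2"     # local directory on the ECS instance, for example /mnt
  # Assumed option set; compare it with the default mount parameters in the console.
  opts="vers=3,nolock,proto=tcp,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport"
  printf 'sudo mount -t nfs -o %s %s:/ %s\n' "$opts" "$mount_target" "$local_path"
}

nas_mount_cmd "<mount-target-domain>" /mnt
```

The part after the colon (/) is the NAS directory, and the last argument is the mount path on the ECS instance.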
Verify the mounting
After you mount the file system to the ECS instance, you can use the file system in the same way as you use a local directory.
Remotely connect to the ECS instance by referring to Connection methods, and run the following commands to access the file system:
mkdir /mnt/dir1
mkdir /mnt/dir2
touch /mnt/file1
echo 'some file content' > /mnt/file2
ls /mnt
If the output of the ls command lists dir1, dir2, file1, and file2, you have accessed the General-purpose Capacity NFS file system.
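The same check can be scripted. The following is a minimal sketch of a read/write probe for a mounted directory; the function name check_rw and the probe file name are hypothetical. It only confirms that the directory is readable and writable, not that it is backed by NAS.

```shell
#!/bin/sh
# Write a probe file under the given directory, read it back, and clean up.
# Returns non-zero if any step fails.
check_rw() {
  dir="$1"
  probe="$dir/.nas_probe_$$"   # hypothetical probe file name
  echo 'some file content' > "$probe" || return 1
  [ "$(cat "$probe")" = 'some file content' ] || return 1
  rm -f "$probe"
  ls "$dir" > /dev/null
}
```

For example, running check_rw /mnt && echo "mount is usable" on the ECS instance reports success only if the write, the read, and the listing all work.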
Before you mount a file system on an ECS instance, create a mount target for the file system. For more information about how to add mount targets and view the addresses of mount targets, see Manage mount targets.
Pay attention to the following items when you add a mount target:
Select VPC as the type of mount target.
Select the same VPC and same vSwitches that are used by the specified ACK cluster.
Step 3: Create a PV and a PVC for the specified ACK cluster
Create a PV
Log on to the ACK console.
In the left-side navigation pane of the ACK console, click Clusters.
On the Clusters page, find the cluster that you want to manage and click the name of the cluster or click Details in the Actions column. The details page of the cluster appears.
In the left-side navigation pane of the cluster details page, choose .
In the upper-right corner of the Persistent Volumes page, click Create.
In the Create PV dialog box, set the following parameters.
- PV Type: You can select Cloud Disk, NAS, or OSS. In this example, NAS is selected.
- Volume Name: The name of the PV that you want to create. The name must be unique in the cluster. In this example, pv-nas is used.
- Volume Plug-in: You can select Flexvolume or CSI. In this example, CSI is selected.
- Capacity: The capacity of the PV. A NAS file system provides unlimited capacity. This parameter does not limit the storage usage of the NAS file system but defines the capacity of the PV.
- Access mode: You can select ReadWriteMany or ReadWriteOnce. Default value: ReadWriteMany.
- Mount Target Domain Name: You can select Select Mount Target to select a mount target or select Custom to enter a mount target.
- Show Advanced Options:
  - Subdirectory: the subdirectory of the NAS file system that you want to mount. The subdirectory must start with a forward slash (/). After you set this parameter, the PV is mounted to the subdirectory.
    - If the specified subdirectory does not exist, the system automatically creates the subdirectory in the NAS file system and mounts the subdirectory to the cluster.
    - If you do not set this parameter, the root directory of the NAS file system is mounted.
    - If you want to mount an Extreme NAS file system, the subdirectory must be under the /share directory.
  - Version: the version of the PV.
- Label: Add labels to the PV.
Click Create.
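The console settings above can also be expressed as a manifest. The following sketch prints a PV manifest for a NAS mount target and subdirectory. The CSI driver name, the volumeAttributes keys, the mount options, the alicloud-pvname label, and the 5Gi capacity are assumptions based on typical ACK CSI NAS examples; verify them against the CSI plug-in version in your cluster before you apply anything.

```shell
#!/bin/sh
# Print a PV manifest for a NAS mount target and subdirectory.
# Field names and values are assumptions; check them against your CSI plug-in.
render_nas_pv() {
  pv_name="$1"   # for example pv-nas
  server="$2"    # mount target domain name (placeholder)
  subdir="$3"    # subdirectory that starts with /, or / for the root directory
  cat <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ${pv_name}
  labels:
    alicloud-pvname: ${pv_name}
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  csi:
    driver: nasplugin.csi.alibabacloud.com
    volumeHandle: ${pv_name}
    volumeAttributes:
      server: "${server}"
      path: "${subdir}"
  mountOptions:
    - nolock,tcp,noresvport
    - vers=3
EOF
}

render_nas_pv pv-nas "<mount-target-domain>" /
```

To isolate users as described in the background information, you could call the function once per user with a different subdirectory, such as /userA and /userB, and pipe each result to kubectl apply -f - after you review it.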
Create a PVC
In the left-side navigation pane of the details page, choose .
In the upper-right corner of the Persistent Volume Claims page, click Create.
In the Create PVC dialog box, set the following parameters.
- PVC Type: You can select Cloud Disk, NAS, or OSS. In this example, NAS is selected.
- Name: The name of the persistent volume claim (PVC). The name must be unique in the cluster.
- Allocation Mode: In this example, Existing Volumes is selected. Note: If no PV is created, you can set Allocation Mode to Create Volume and set the required parameters to create a PV. For more information, see Create a PV.
- Existing Volumes: Click Select PV. Find the PV that you want to use and click Select in the Actions column.
- Capacity: The capacity claimed by the PVC. Note: The capacity claimed by the PVC cannot exceed the capacity of the PV that is bound to the PVC.
Click Create. After the PVC is created, you can view the PVC in the PVCs list. The PVC is bound to the corresponding PV.
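A PVC that binds to an existing PV can likewise be written as a manifest. The sketch below prints one. It assumes that the PV carries a label named alicloud-pvname whose value is the PV name (a hypothetical convention; any label that you added to the PV works), and it uses a hypothetical PVC name and a 5Gi request, which must not exceed the capacity of the PV.

```shell
#!/bin/sh
# Print a PVC manifest that selects an existing PV by label.
render_nas_pvc() {
  pvc_name="$1"   # for example training-data (hypothetical)
  pv_name="$2"    # name of the existing PV, for example pv-nas
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${pvc_name}
spec:
  # Explicitly empty so that a default StorageClass is not applied to this claim.
  storageClassName: ""
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  selector:
    matchLabels:
      alicloud-pvname: ${pv_name}
EOF
}

render_nas_pvc training-data pv-nas
```

After you apply such a manifest, kubectl get pvc should show the claim in the Bound state if the selector, access mode, and capacity match the PV.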
Step 4: Upload data to the NAS file system
The ACK cluster accesses shared data (the NAS file system created in Step 1) by using the PVC. Therefore, you only need to upload data to the NAS file system specified in the PVC.
Use Workbench to log on to an ECS instance of the ACK cluster. For more information, see Log on to a Linux instance. For more information about how to connect to an ECS instance by using other methods, see Overview of connection methods.
Mount the NFS file system in Step 2 to the /mnt directory of the ECS instance. Run the following commands to create the tf_data/ and pytorch_data/ directories in the /mnt directory to store the TF mnist and PyTorch mnist datasets, respectively:
cd /mnt/
mkdir tf_data/
mkdir pytorch_data/
Run the following commands to download the TF mnist dataset:
cd /mnt/tf_data
git clone https://code.aliyun.com/xiaozhou/tensorflow-sample-code.git
mv tensorflow-sample-code/data/* ./ && rm -rf tensorflow-sample-code
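Run the following commands to download the PyTorch mnist dataset. The absolute path is used because the previous step leaves the shell in the tf_data directory:
cd /mnt/pytorch_data
git clone https://code.aliyun.com/370272561/mnist-pytorch.git
mv mnist-pytorch/MNIST ./ && rm -rf mnist-pytorch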
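Before you submit jobs, it can help to confirm that both dataset directories actually contain files. The following is a minimal sketch of such a check; the function name dataset_present is hypothetical, and the check only tests that each directory exists and is not empty, because the exact file names inside the repositories are not listed in this topic.

```shell
#!/bin/sh
# Succeeds only if the directory exists and contains at least one entry.
dataset_present() {
  [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

for d in /mnt/tf_data /mnt/pytorch_data; do
  if dataset_present "$d"; then
    echo "$d: ok"
  else
    echo "$d: missing or empty"
  fi
done
```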