This topic describes how to configure datasets and source code repositories for a training job.
Prerequisites
- A professional managed Kubernetes cluster is created.
- The AI development console and scheduling component of the cloud-native AI component set are installed in the professional Kubernetes cluster. The cluster must run Kubernetes 1.20 or later.
- A Resource Access Management (RAM) user is created in the RAM console by the cluster administrator. A quota group is added and associated with the RAM user. For more information, see Step 1: Add a quota group and associate the quota group with the RAM user.
- Persistent volume claims (PVCs) are created. For more information, see Use a NAS file system as a statically provisioned volume in the ACK console.
Note: In most cases, data used to train models is stored in Object Storage Service (OSS) volumes or Apsara File Storage NAS (NAS) volumes.
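For reference, the PVC prerequisite can be sketched as a manifest. The following Python snippet is a minimal illustration, not the procedure from the linked topic: the names training-data and pv-nas, the default namespace, and the 20Gi size are assumptions, and it presumes that a persistent volume backed by NAS already exists.

```python
import json


def build_pvc_manifest(name, namespace, volume_name, storage="20Gi"):
    """Return a PersistentVolumeClaim manifest bound to an existing PV."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Shared file storage is typically mounted by many pods at once.
            "accessModes": ["ReadWriteMany"],
            # An empty storage class disables dynamic provisioning.
            "storageClassName": "",
            # Bind to a statically provisioned volume (hypothetical name).
            "volumeName": volume_name,
            "resources": {"requests": {"storage": storage}},
        },
    }


# All values below are placeholders chosen for illustration.
manifest = build_pvc_manifest("training-data", "default", "pv-nas")
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with kubectl, which accepts JSON as well as YAML manifests.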
Configure a dataset
- Log on to the AI development console. For more information, see Step 2: Log on to the AI development console.
- In the left-side navigation pane of the AI development console, click Data Config.
- On the Data Config page, click New Data Configuration.
- On the New Data Configuration page, set the Name, Namespace, and Persistent Volume Claim parameters for the dataset, and specify a local directory based on your requirements.
- Click Submit.
- After the dataset is created, you can view its details on the Data tab of the Data Config page.
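The four values entered in the preceding steps can be checked before you open the console. The following sketch is only an illustration: the DatasetConfig class and its field names are hypothetical, the sample values are placeholders, and the console may apply different validation rules. The one rule taken from Kubernetes itself is that a namespace name must be a DNS label of at most 63 characters.

```python
import re
from dataclasses import dataclass

# Kubernetes namespace names must be RFC 1123 DNS labels (max 63 characters).
DNS_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


@dataclass(frozen=True)
class DatasetConfig:
    """Hypothetical mirror of the fields on the New Data Configuration page."""
    name: str
    namespace: str
    pvc_name: str
    local_dir: str

    def problems(self):
        """Return a list of problems found; an empty list means none."""
        issues = []
        for field in ("name", "pvc_name", "local_dir"):
            if not getattr(self, field).strip():
                issues.append(f"{field} must not be empty")
        if len(self.namespace) > 63 or not DNS_LABEL.fullmatch(self.namespace):
            issues.append("namespace must be a valid DNS label")
        return issues


# Placeholder values for illustration only.
good = DatasetConfig("imagenet", "default", "training-data", "/data")
bad = DatasetConfig("", "Team_A", "training-data", "")
print(good.problems())
print(bad.problems())
```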
Configure a source code repository
- In the left-side navigation pane of the AI development console, click Data Config.
- On the Data Config page, click New Git configuration.
- In the New Code Configuration dialog box, set the Name, Git Repository, and Default Branch parameters for the source code repository, and specify a local directory based on your requirements.
- Click Submit.
- After the source code repository is configured, you can view its details on the Code tab of the Data Config page.
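To show what the Git Repository, Default Branch, and local directory values mean in a plain Git workflow, the following sketch builds the equivalent git clone command. This topic does not describe how the console fetches the code, so treat this as an analogy under that assumption. The CodeConfig class, the repository URL, and the directory are hypothetical.

```python
import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeConfig:
    """Hypothetical mirror of the fields in the New Code Configuration dialog box."""
    name: str
    git_repository: str
    default_branch: str
    local_dir: str

    def clone_command(self):
        """Return the git command that checks out the branch into local_dir."""
        return [
            "git", "clone",
            "--branch", self.default_branch,
            self.git_repository,
            self.local_dir,
        ]


# Placeholder values for illustration only.
code = CodeConfig(
    name="train-code",
    git_repository="https://example.com/team/train.git",
    default_branch="main",
    local_dir="/workspace/train",
)
cmd = code.clone_command()
print(shlex.join(cmd))
```

Returning the command as a list rather than a single string lets it be passed to subprocess.run without shell quoting issues.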