Resource Orchestration Service: ALIYUN::PAI::DatasetVersion

Last Updated: Mar 25, 2025

ALIYUN::PAI::DatasetVersion is used to create a dataset version.

Syntax

{
  "Type": "ALIYUN::PAI::DatasetVersion",
  "Properties": {
    "DataSourceType": String,
    "DatasetId": String,
    "Property": String,
    "Uri": String,
    "Description": String,
    "DataSize": Integer,
    "DataCount": Integer,
    "Labels": List,
    "Options": String,
    "SourceType": String,
    "SourceId": String
  }
}

Properties

Each property is listed with the following fields, in order: Type, Required, Editable, Description, Constraint.

DataSourceType
  Type: String
  Required: Yes
  Editable: No
  Description: The storage types of the data source.
  Constraint: Separate multiple storage types with commas (,). Valid values:
    • NAS: File Storage NAS (NAS).
    • OSS: Object Storage Service (OSS).
    • CPFS: Cloud Parallel File Storage (CPFS).
  Note: The DataSourceType value of a dataset version must be the same as the DataSourceType value of the dataset. When you create a dataset version, the system checks whether the values are the same.

DatasetId
  Type: String
  Required: Yes
  Editable: No
  Description: The ID of the dataset.
  Constraint: None.

Property
  Type: String
  Required: Yes
  Editable: No
  Description: The property of the dataset.
  Constraint: Valid values:
    • FILE
    • DIRECTORY

Uri
  Type: String
  Required: Yes
  Editable: No
  Description: The Uniform Resource Identifier (URI) configuration.
  Constraint:
    • Value format when DataSourceType is set to OSS: oss://bucket.endpoint/object.
    • Value formats when DataSourceType is set to NAS:
      • General-purpose NAS file system: nas://<nasfisid>.region/subpath/to/dir/.
      • CPFS 1.0 file system: nas://<cpfs-fsid>.region/subpath/to/dir/.
      • CPFS 2.0 file system: nas://<cpfs-fsid>.region/<protocolserviceid>/.
    You can distinguish CPFS 1.0 and CPFS 2.0 file systems by the format of the file system ID: a CPFS 1.0 ID is in the cpfs-<8 ASCII characters> format, and a CPFS 2.0 ID is in the cpfs-<16 ASCII characters> format.
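
The Uri formats above can be sketched as a few helper functions. This Python sketch is illustrative only: the function names and every sample bucket, endpoint, region, and file system ID are hypothetical, and it assumes that the part after the cpfs- prefix is 8 alphanumeric characters for CPFS 1.0 and 16 for CPFS 2.0.

```python
import re

# Assumption: the ID suffix is alphanumeric; only its length (8 vs. 16) is taken from the text above.
CPFS1_ID = re.compile(r"^cpfs-[0-9A-Za-z]{8}$")
CPFS2_ID = re.compile(r"^cpfs-[0-9A-Za-z]{16}$")


def cpfs_generation(fs_id: str) -> str:
    """Return "1.0" or "2.0" depending on the length of the file system ID suffix."""
    if CPFS1_ID.match(fs_id):
        return "1.0"
    if CPFS2_ID.match(fs_id):
        return "2.0"
    raise ValueError(f"not a recognized CPFS file system ID: {fs_id!r}")


def oss_uri(bucket: str, endpoint: str, obj: str) -> str:
    """Build a URI in the oss://bucket.endpoint/object format."""
    return f"oss://{bucket}.{endpoint}/{obj.lstrip('/')}"


def nas_uri(fs_id: str, region: str, subpath: str) -> str:
    """Build nas://<fsid>.region/subpath/to/dir/ (General-purpose NAS or CPFS 1.0)."""
    cleaned = subpath.strip("/")
    suffix = f"{cleaned}/" if cleaned else ""
    return f"nas://{fs_id}.{region}/{suffix}"


def cpfs2_uri(fs_id: str, region: str, protocol_service_id: str) -> str:
    """Build nas://<cpfs-fsid>.region/<protocolserviceid>/ for CPFS 2.0."""
    if cpfs_generation(fs_id) != "2.0":
        raise ValueError("a CPFS 2.0 file system ID is expected")
    return f"nas://{fs_id}.{region}/{protocol_service_id}/"


# Hypothetical sample values, for illustration only.
print(oss_uri("my-bucket", "oss-cn-hangzhou.aliyuncs.com", "data/train.csv"))
print(nas_uri("0a1b2c3d4e", "cn-hangzhou", "/datasets/v1"))
print(cpfs_generation("cpfs-0123abcd"))
```

Checking the ID format before building the URI helps avoid passing a CPFS 1.0 ID where the CPFS 2.0 format with a protocol service ID is expected.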

Description
  Type: String
  Required: No
  Editable: No
  Description: The description of the dataset version.
  Constraint: You can use descriptions to distinguish dataset versions.

DataSize
  Type: Integer
  Required: No
  Editable: No
  Description: The size of the dataset file.
  Constraint: Unit: bytes.

DataCount
  Type: Integer
  Required: No
  Editable: No
  Description: The number of dataset files.
  Constraint: None.

Labels
  Type: List
  Required: No
  Editable: No
  Description: The labels of the dataset version.
  Constraint: For more information, see Labels properties.

Options
  Type: String
  Required: No
  Editable: Yes
  Description: The extended field.
  Constraint: The value of this property is a JSON string. When you use the dataset in Deep Learning Containers (DLC), you can use the mountPath field to specify the default mount path of the dataset.
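
Because Options is a String that holds JSON, the value has to be serialized before it is placed in a template. A minimal Python sketch, assuming a hypothetical mount path of /mnt/data/ and a hypothetical helper name build_options:

```python
import json


def build_options(mount_path: str) -> str:
    """Serialize the Options value as a JSON string with a mountPath field."""
    return json.dumps({"mountPath": mount_path})


# "/mnt/data/" is a hypothetical mount path used only for illustration.
options = build_options("/mnt/data/")
print(options)  # {"mountPath": "/mnt/data/"}
```

The resulting string, not a nested object, is what would be passed as the Options property.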

SourceType
  Type: String
  Required: No
  Editable: No
  Description: The type of the data source.
  Constraint: Default value: USER. Valid values:
    • PAI-PUBLIC-DATASET: a public dataset of Platform for AI (PAI).
    • ITAG: a dataset generated from a labeling job of iTAG.
    • USER: a dataset registered by a user.

SourceId
  Type: String
  Required: No
  Editable: No
  Description: The ID of the data source.
  Constraint:
    • If SourceType is set to USER, the value of SourceId is a custom string.
    • If SourceType is set to ITAG, the value of SourceId is the ID of the labeling job of iTAG.
    • If SourceType is set to PAI-PUBLIC-DATASET, SourceId is empty by default.

Labels syntax

"Labels": [
  {
    "Value": String,
    "Key": String
  }
]

Labels properties

Each property is listed with the following fields, in order: Type, Required, Editable, Description, Constraint.

Key
  Type: String
  Required: Yes
  Editable: No
  Description: The key of the label.
  Constraint: The key must be 1 to 128 characters in length, and cannot contain http:// or https://. It cannot start with aliyun or acs:.

Value
  Type: String
  Required: No
  Editable: No
  Description: The value of the label.
  Constraint: The value can be up to 128 characters in length, and cannot contain http:// or https://. It cannot start with aliyun or acs:.
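
The label constraints above can be checked before a template is submitted. The following Python sketch is one possible pre-check, not the validation that ROS itself performs. The function name validate_label is hypothetical, and the prefix and substring checks are assumed to be case-sensitive.

```python
def validate_label(key: str, value: str = "") -> dict:
    """Check a label against the documented constraints and return a Labels entry."""
    if not 1 <= len(key) <= 128:
        raise ValueError("Key must be 1 to 128 characters in length")
    if len(value) > 128:
        raise ValueError("Value can be up to 128 characters in length")
    for name, text in (("Key", key), ("Value", value)):
        if "http://" in text or "https://" in text:
            raise ValueError(f"{name} cannot contain http:// or https://")
        # Assumption: the comparison is case-sensitive.
        if text.startswith(("aliyun", "acs:")):
            raise ValueError(f"{name} cannot start with aliyun or acs:")
    return {"Key": key, "Value": value}


# Hypothetical label used only for illustration.
print(validate_label("team", "vision"))
```

A list of such entries matches the shape shown in Labels syntax.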

Return values

Fn::GetAtt

VersionName: the name of the dataset version.

Examples

YAML format

ROSTemplateFormatVersion: '2015-09-01'
Parameters:
  DataSourceType:
    Type: String
    Description:
      en: |-
        The data source type. Valid values:
        - OSS: Object Storage Service (OSS).
        - NAS: File Storage NAS (NAS).
        - CPFS: Cloud Parallel File Storage (CPFS).
    AllowedValues:
      - OSS
      - NAS
      - CPFS
    Required: true
  Uri:
    Type: String
    Description:
      en: |-
        Sample Uri configurations:
        - When the data source type is OSS: 'oss://bucket.endpoint/object'.
        - When the data source type is NAS:
        General-purpose NAS: 'nas://<nasfisid>.region/subpath/to/dir/';
        CPFS 1.0: 'nas://<cpfs-fsid>.region/subpath/to/dir/';
        CPFS 2.0: 'nas://<cpfs-fsid>.region/<protocolserviceid>/'.
        CPFS 1.0 and CPFS 2.0 are distinguished by the format of the file system ID: CPFS 1.0 is cpfs-<8 ASCII characters>; CPFS 2.0 is cpfs-<16 ASCII characters>.
    AllowedPattern: ^(oss://|nas://).*
    Required: true
  Property:
    Type: String
    Description:
      en: |-
        The property of the dataset. Valid values:
        - FILE: a file.
        - DIRECTORY: a folder.
    AllowedValues:
      - FILE
      - DIRECTORY
    Required: true
  DatasetId:
    Type: String
    Description:
      en: The ID of the dataset.
    Required: true
Resources:
  ExtensionResource:
    Type: ALIYUN::PAI::DatasetVersion
    Properties:
      DataSourceType:
        Ref: DataSourceType
      Uri:
        Ref: Uri
      Property:
        Ref: Property
      DatasetId:
        Ref: DatasetId
Outputs:
  VersionName:
    Description: Dataset version name.
    Value:
      Fn::GetAtt:
        - ExtensionResource
        - VersionName

JSON format

{
  "ROSTemplateFormatVersion": "2015-09-01",
  "Parameters": {
    "DataSourceType": {
      "Type": "String",
      "Description": {
        "en": "The data source type. Valid values:\n- OSS: Object Storage Service (OSS).\n- NAS: File Storage NAS (NAS).\n- CPFS: Cloud Parallel File Storage (CPFS)."
      },
      "AllowedValues": [
        "OSS",
        "NAS",
        "CPFS"
      ],
      "Required": true
    },
    "Uri": {
      "Type": "String",
      "Description": {
        "en": "Sample Uri configurations:\n- When the data source type is OSS: 'oss://bucket.endpoint/object'.\n- When the data source type is NAS:\nGeneral-purpose NAS: 'nas://<nasfisid>.region/subpath/to/dir/';\nCPFS 1.0: 'nas://<cpfs-fsid>.region/subpath/to/dir/';\nCPFS 2.0: 'nas://<cpfs-fsid>.region/<protocolserviceid>/'.\nCPFS 1.0 and CPFS 2.0 are distinguished by the format of the file system ID: CPFS 1.0 is cpfs-<8 ASCII characters>; CPFS 2.0 is cpfs-<16 ASCII characters>."
      },
      "AllowedPattern": "^(oss://|nas://).*",
      "Required": true
    },
    "Property": {
      "Type": "String",
      "Description": {
        "en": "The property of the dataset. Valid values:\n- FILE: a file.\n- DIRECTORY: a folder."
      },
      "AllowedValues": [
        "FILE",
        "DIRECTORY"
      ],
      "Required": true
    },
    "DatasetId": {
      "Type": "String",
      "Description": {
        "en": "The ID of the dataset."
      },
      "Required": true
    }
  },
  "Resources": {
    "ExtensionResource": {
      "Type": "ALIYUN::PAI::DatasetVersion",
      "Properties": {
        "DataSourceType": {
          "Ref": "DataSourceType"
        },
        "Uri": {
          "Ref": "Uri"
        },
        "Property": {
          "Ref": "Property"
        },
        "DatasetId": {
          "Ref": "DatasetId"
        }
      }
    }
  },
  "Outputs": {
    "VersionName": {
      "Description": "Dataset version name.",
      "Value": {
        "Fn::GetAtt": [
          "ExtensionResource",
          "VersionName"
        ]
      }
    }
  }
}
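
The templates above set only the required properties. As a rough sketch of how the optional Description, Labels, and Options properties might be added, the following Python script assembles an equivalent template as a dictionary and prints it as JSON. The helper name dataset_version_resource and all sample values (dataset ID, URI, label, mount path) are hypothetical placeholders.

```python
import json


def dataset_version_resource(dataset_id, data_source_type, prop, uri,
                             description=None, labels=None, options=None):
    """Return a resource definition for ALIYUN::PAI::DatasetVersion."""
    properties = {
        "DatasetId": dataset_id,
        "DataSourceType": data_source_type,
        "Property": prop,
        "Uri": uri,
    }
    if description is not None:
        properties["Description"] = description
    if labels:
        # Labels is a list of Key/Value objects.
        properties["Labels"] = [{"Key": k, "Value": v} for k, v in labels.items()]
    if options is not None:
        # Options is a String property, so the dict is serialized to JSON text.
        properties["Options"] = json.dumps(options)
    return {"Type": "ALIYUN::PAI::DatasetVersion", "Properties": properties}


# All values below are hypothetical placeholders.
template = {
    "ROSTemplateFormatVersion": "2015-09-01",
    "Resources": {
        "ExtensionResource": dataset_version_resource(
            dataset_id="d-example123",
            data_source_type="OSS",
            prop="DIRECTORY",
            uri="oss://my-bucket.oss-cn-hangzhou.aliyuncs.com/datasets/v2/",
            description="Second version with relabeled samples",
            labels={"team": "vision"},
            options={"mountPath": "/mnt/data/"},
        )
    },
    "Outputs": {
        "VersionName": {
            "Description": "Dataset version name.",
            "Value": {"Fn::GetAtt": ["ExtensionResource", "VersionName"]},
        }
    },
}

print(json.dumps(template, indent=2))
```

Generating the template from code keeps the Options string and the Labels list consistent with the property types listed in the Properties section.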