If you want to access other Alibaba Cloud services in a Deep Learning Containers (DLC) job, you must configure an AccessKey pair for identity authentication. If you associate a RAM role with the DLC job, you can access other Alibaba Cloud services in the DLC job by using a temporary access credential provided by Security Token Service (STS) without the need to configure the AccessKey pair. This ensures the security of AccessKey pairs. This topic describes how to create a RAM role and associate the RAM role with a DLC job. This topic also describes how to obtain a temporary access credential provided by STS by using the RAM role.
Benefits
You can use a RAM role whose trusted entity is an Alibaba Cloud service. The Alibaba Cloud service can assume the RAM role to implement cross-service access. You can obtain a temporary access credential by using the RAM role to implement identity authentication and access control. This method has the following benefits:
Security and confidentiality: You do not need to manage credentials in a DLC job. You can use a temporary access credential provided by STS instead of an AccessKey pair to reduce the risk of AccessKey pair leaks.
Convenient management: You can modify the policy attached to the RAM role associated with a DLC job to manage the access permissions of each developer on Alibaba Cloud services in the DLC job in a more convenient and fine-grained manner.
Limits
A DLC job can be associated with only one RAM role.
Configuration method
Associate a RAM role with a DLC job when you create the DLC job, and obtain a temporary access credential provided by STS by using the RAM role.
Associate a RAM role with a DLC job
Scenario 1: Associate the default role of PAI with a DLC job
The default role of Platform for AI (PAI) is a RAM role to which the normal service role AliyunPAIDLCDefaultRole is assigned. The default role has access permissions only on MaxCompute and Object Storage Service (OSS) and supports fine-grained access control. When you access MaxCompute tables, a temporary access credential obtained by using the default role of PAI has the same permissions as the owner of the DLC job. When you access OSS, the temporary access credential can be used to access only the default OSS bucket configured for the current workspace.
If you associate the default role with a DLC job, you can obtain a temporary access credential to access basic development resources in the DLC job without the need to create another RAM role.
Use scenarios
After you associate the default role of PAI with a DLC job, you do not need to configure an AccessKey pair in the following scenarios:
Use MaxCompute SDK to submit a job to a MaxCompute project on which the job owner has the execution permissions.
Use OSS SDK to access data in the default OSS bucket configured for the current workspace. For more information about how to configure a default OSS storage path of the current workspace, see Configure the default storage path of a workspace.
Configuration method
On the Create Job page, select Default Roles of PAI for the Instance RAM Role parameter in the Role Information section. For more information, see Submit training jobs.
After you associate the RAM role with the DLC job, you must obtain a temporary access credential by using the RAM role.
Scenario 2: Associate a custom role with a DLC job
If the permissions of a temporary access credential that you obtain by using the default role of PAI cannot meet your requirements, you can create a RAM role and grant permissions to the RAM role to control the range of Alibaba Cloud resources that developers can access in the job. Perform the following steps:
Log on to the RAM console and create a RAM role. For more information, see Create a RAM role for a trusted Alibaba Cloud service.
Take note of the following key parameters:
Select Trusted Entity: Select Alibaba Cloud Service.
Role Type: Select Normal Service Role.
Select Trusted Service: Select Platform for AI.
Grant permissions to the RAM role.
You can attach a system policy or a custom policy to the RAM role. This way, the RAM role can access or manage related resources. For more information, see the "Step 3: Grant permissions to a RAM role" section in the Create a RAM role and attach the required policies to the role topic. For example, you can attach the AliyunOSSReadOnlyAccess policy to the RAM role.
If you use a RAM user, contact the owner of the Alibaba Cloud account to grant the current RAM user the permissions to use the RAM role. For more information, see Grant permissions to a RAM user. Sample policy document:
{
  "Version": "1",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ram:PassRole",
      "Resource": "acs:ram::*:role/${RoleName}"
    }
  ]
}
Replace ${RoleName} in the preceding sample policy document with the name of the RAM role that you want to associate with the DLC job.
Associate the RAM role with the DLC job and submit the DLC job. You need to configure only the following key parameters in the Role Information section. For information about other parameters, see Submit training jobs.
Parameter
Description
Instance RAM Role
Select Custom Roles.
RAM Role
Select the RAM role that you created in Step 1. After you associate the RAM role with the DLC job, you have the permissions of the RAM role to access other Alibaba Cloud services in the DLC job by using a temporary access credential provided by STS.
After you associate the RAM role with the DLC job, you must obtain a temporary access credential by using the RAM role.
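The ram:PassRole policy document used in the steps above can also be generated rather than edited by hand, which avoids typos in the role name. The following is a minimal sketch using only the Python standard library. The function name build_pass_role_policy and the role name my-dlc-role are hypothetical and serve only as an illustration. The output still has to be attached to the RAM user in the RAM console by the owner of the Alibaba Cloud account.

```python
import json


def build_pass_role_policy(role_name: str) -> dict:
    """Return a policy document that allows ram:PassRole on a single RAM role."""
    if not role_name or "/" in role_name:
        raise ValueError("role_name must be a non-empty RAM role name")
    return {
        "Version": "1",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "ram:PassRole",
                # Same resource pattern as the sample policy, with ${RoleName} filled in.
                "Resource": f"acs:ram::*:role/{role_name}",
            }
        ],
    }


if __name__ == "__main__":
    # "my-dlc-role" is a placeholder; use the name of the role created in Step 1.
    print(json.dumps(build_pass_role_policy("my-dlc-role"), indent=2))
```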
Scenario 3: Do not associate a RAM role with a DLC job
If you do not need to use an AccessKey pair to access data, we recommend that you do not associate a RAM role with a DLC job. When you create a DLC job, select Does Not Associate Role for the Instance RAM Role parameter in the Role Information section. For more information, see Submit training jobs.
Obtain a temporary access credential by using the RAM role associated with a DLC job
If you associate the default role of PAI or a custom role with a DLC job when you create the job, you can use one of the following methods to obtain a temporary access credential:
Method 1: Use the Alibaba Cloud Credentials tool
The Alibaba Cloud Credentials tool calls the local service that is automatically injected when you create a DLC job to obtain a temporary access credential provided by STS. This credential is updated on a regular basis.
When you create a DLC job, complete the following key configurations. For more information, see Submit training jobs.
Install the Alibaba Cloud Credentials tool.
On the Create Job page, select Select from List for the Third-party Libraries parameter and enter alibabacloud_credentials in the Third-party Libraries field to install the Alibaba Cloud Credentials tool.
Note: If the third-party library is pre-installed in the image, you can skip this configuration.
Configure a script file.
In this example, a Python script file is used. For more information about sample code of SDKs for other programming languages, see Sample code. You can select Online configuration for the Code Builds parameter, or select Local Upload to upload a script file from your on-premises machine to the DLC environment.
from alibabacloud_credentials.client import Client as CredClient
from alibabacloud_credentials.models import Config as CredConfig

credentialsConfig = CredConfig(
    # This parameter is optional. If you did not configure other access methods
    # for the default credential chain, you do not need to configure this parameter.
    # Credentials SDK obtains a temporary access credential by using the URI.
    type='credentials_uri'
)
# Pass the Config instance, not the Config class, to the client.
credentialsClient = CredClient(credentialsConfig)
Method 2: Access the local service of the DLC job
When you create a DLC job, you can set the Startup Command parameter to the following command. This way, you can access the local service that is automatically injected into the DLC job to obtain a temporary access credential. For more information, see Submit training jobs.
# Obtain a temporary access credential for the RAM role of an instance.
curl $ALIBABA_CLOUD_CREDENTIALS_URI
The following output is returned:
{
"Code": "Success",
"AccessKeyId": "STS.N*********7",
"AccessKeySecret": "3***************d",
"SecurityToken": "DFE32G*******",
"Expiration": "2024-05-21T10:39:29Z"
}
In the output, take note of the following parameters:
SecurityToken: the temporary access credential of the RAM role.
Expiration: the expiration time of the temporary access credential for the RAM role.
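If the job code is written in Python, the same request can be made without curl. The following is a minimal sketch that uses only the standard library. It assumes that the local service returns JSON with the fields shown in the sample output above, including a Code field set to Success and an Expiration value in the format 2024-05-21T10:39:29Z. The function names parse_credentials and fetch_credentials are hypothetical.

```python
import json
import os
import urllib.request
from datetime import datetime, timezone


def parse_credentials(payload: str) -> dict:
    """Parse the JSON text returned by the local credential service."""
    data = json.loads(payload)
    # Assumption: a successful response carries "Code": "Success", as in the sample output.
    if data.get("Code") != "Success":
        raise RuntimeError(f"credential service returned Code={data.get('Code')!r}")
    # Assumption: Expiration is a UTC timestamp such as 2024-05-21T10:39:29Z.
    data["Expiration"] = datetime.strptime(
        data["Expiration"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return data


def fetch_credentials(timeout: float = 5.0) -> dict:
    """Request a temporary access credential from the injected local service."""
    uri = os.environ["ALIBABA_CLOUD_CREDENTIALS_URI"]
    with urllib.request.urlopen(uri, timeout=timeout) as resp:
        return parse_credentials(resp.read().decode("utf-8"))


if __name__ == "__main__" and "ALIBABA_CLOUD_CREDENTIALS_URI" in os.environ:
    creds = fetch_credentials()
    # Print only non-secret fields.
    print(creds["AccessKeyId"], creds["Expiration"].isoformat())
```

Keeping the parsing step separate from the HTTP call makes it possible to check the parsing logic outside a DLC job, where the environment variable is not set.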
Method 3: Access the local file of the DLC job
Access the file in the specified path of the DLC container to obtain the temporary access credential by using the RAM role. The file is automatically injected by PAI and refreshed on a regular basis. The path of the file is /mnt/.alibabacloud/credentials. The following sample code provides an example of the file content:
{
"AccessKeyId": "STS.N*********7",
"AccessKeySecret": "3***************d",
"SecurityToken": "DFE32G*******",
"Expiration": "2024-05-21T10:39:29Z"
}
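Because the file is refreshed on a regular basis, long-running code should reread it instead of caching its content for the lifetime of the job. The following is a minimal sketch of that pattern. It assumes that the file content is valid JSON with the fields shown above and that Expiration is a UTC timestamp in the format 2024-05-21T10:39:29Z. The function names load_credentials and expires_within and the five-minute margin are hypothetical choices.

```python
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path

# Path of the credential file injected by PAI, as stated above.
CREDENTIALS_PATH = Path("/mnt/.alibabacloud/credentials")


def load_credentials(path=CREDENTIALS_PATH) -> dict:
    """Read the injected credential file and return its fields as a dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def expires_within(creds: dict, margin: timedelta, now=None) -> bool:
    """Return True if the credential expires within `margin` of `now` (UTC)."""
    expiration = datetime.strptime(
        creds["Expiration"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return expiration - now <= margin


if __name__ == "__main__" and CREDENTIALS_PATH.exists():
    creds = load_credentials()
    if expires_within(creds, timedelta(minutes=5)):
        # The file may have been refreshed since the last read, so read it again.
        creds = load_credentials()
    print(creds["Expiration"])
```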
Examples
Example 1: Access MaxCompute by using a RAM role associated with a DLC job
When you create a DLC job, complete the following key configurations. For more information, see Submit training jobs.
Install the Alibaba Cloud Credentials tool.
Set the Third-party Libraries parameter to Select from List and enter the following third-party libraries to install Alibaba Cloud Credentials and MaxCompute SDK.
alibabacloud_credentials
odps
Note: If the third-party libraries are pre-installed in the image, you can skip this configuration.
Configure a script file.
In this example, a Python script file is used. You can select Online configuration for the Code Builds parameter, or select Local Upload to upload a script file from your on-premises machine to the DLC environment.
from alibabacloud_credentials import providers
from odps.accounts import CredentialProviderAccount
from odps import ODPS

if __name__ == '__main__':
    account = CredentialProviderAccount(providers.DefaultCredentialsProvider())
    o = ODPS(
        account=account,
        project="{odps_project}",  # Replace {odps_project} with the name of your project.
        endpoint="{odps_endpoint}"  # Replace {odps_endpoint} with the endpoint of the region where your project resides.
    )

    for t in o.list_tables():
        print(t)
Example 2: Access OSS by using a RAM role associated with a DLC job
When you create a DLC job, complete the following key configurations. For more information, see Submit training jobs.
Install the Alibaba Cloud Credentials tool.
Set the Third-party Libraries parameter to Select from List and enter the following third-party libraries to install Alibaba Cloud Credentials and OSS SDK.
alibabacloud_credentials
oss2
Note: If the third-party libraries are pre-installed in the image, you can skip this configuration.
Configure a script file.
In this example, a Python script file is used. You can select Online configuration for the Code Builds parameter, or select Local Upload to upload a script file from your on-premises machine to the DLC environment.
import oss2
from alibabacloud_credentials.client import Client
from alibabacloud_credentials import providers
from itertools import islice

auth = oss2.ProviderAuth(providers.DefaultCredentialsProvider())
bucket = oss2.Bucket(
    auth,
    '{oss_endpoint}',  # Replace {oss_endpoint} with the endpoint of the region where your OSS bucket resides.
    '{oss_bucket}'  # Replace {oss_bucket} with the name of your OSS bucket.
)

for b in islice(oss2.ObjectIterator(bucket), 10):
    print(b.key)
FAQ
What do I do if an error occurs when I associate a custom role with a DLC job during job creation?
The error message is check permission for ram role failed or check permission for sub user failed.
To resolve this issue, log on to the RAM console to check whether the RAM role exists.
If the RAM role does not exist, change the RAM role to an existing role.
If the RAM role exists, contact the owner of the Alibaba Cloud account to grant the current RAM user the permissions to use the RAM role. For more information, see Grant permissions to a RAM user. The following sample code shows the policy document. You must replace ${RoleName} with the name of the RAM role.
{
  "Version": "1",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ram:PassRole",
      "Resource": "acs:ram::*:role/${RoleName}"
    }
  ]
}
The error message is Failed to assume role for user.
In most cases, this error occurs because no trust policy is configured for the RAM role. To configure a trust policy for the RAM role, perform the following steps:
Log on to the RAM console.
In the left-side navigation pane, choose Identities > Roles.
On the Roles page, find the RAM role and click the name of the RAM role.
On the details page of the RAM role, click Trust Policy. On the Trust Policy tab, click Edit Trust Policy. In the trust policy editor, modify the policy document.
The following sample code shows the original policy document of the RAM role:
{
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Effect": "Allow",
      "Principal": {
        "RAM": [
          "acs:ram::aaa:root"
        ],
        "Service": [
          "xxx.aliyuncs.com"
        ]
      }
    }
  ],
  "Version": "1"
}
The following sample code shows the new policy document of the RAM role:
{
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Effect": "Allow",
      "Principal": {
        "RAM": [
          "acs:ram::aaa:root"
        ],
        "Service": [
          "xxx.aliyuncs.com",
          "pai.aliyuncs.com"
        ]
      }
    }
  ],
  "Version": "1"
}
Click Save trust policy document.
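The change made in the trust policy editor amounts to adding pai.aliyuncs.com to the Service list of the statement that allows sts:AssumeRole. The following is a minimal sketch of that transformation on a policy document loaded as a Python dict, which can help when you prepare the new document before you paste it into the editor. The function name add_trusted_service is hypothetical, and the sketch does not call any RAM API.

```python
import copy
import json

PAI_SERVICE = "pai.aliyuncs.com"


def add_trusted_service(policy: dict, service: str = PAI_SERVICE) -> dict:
    """Return a copy of a trust policy with `service` added as a trusted service."""
    updated = copy.deepcopy(policy)
    for statement in updated.get("Statement", []):
        # Only touch statements that allow sts:AssumeRole.
        if statement.get("Action") != "sts:AssumeRole":
            continue
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.setdefault("Principal", {})
        services = principal.get("Service", [])
        if isinstance(services, str):
            # Normalize a single service written as a string to a list.
            services = [services]
        if service not in services:
            services = [*services, service]
        principal["Service"] = services
    return updated


if __name__ == "__main__":
    original = {
        "Statement": [
            {
                "Action": "sts:AssumeRole",
                "Effect": "Allow",
                "Principal": {
                    "RAM": ["acs:ram::aaa:root"],
                    "Service": ["xxx.aliyuncs.com"],
                },
            }
        ],
        "Version": "1",
    }
    print(json.dumps(add_trusted_service(original), indent=2))
```

The function returns a new dict and leaves the input unchanged, so the original and the new policy documents can be compared before the change is saved.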