To ensure that requests to OSS come from legitimate users or applications and that OSS Connector for AI/ML initializes correctly, configure access credentials and connector parameters as described in this topic.
Prerequisites
OSS Connector for AI/ML is installed. For more information, see Install OSS Connector for AI/ML.
Configure access credentials
- Create an access credential configuration file.

      mkdir -p /root/.alibabacloud && touch /root/.alibabacloud/credentials

- Add and save the configuration.

  - Example:

        {
          "AccessKeyId": "<Access-key-id>",
          "AccessKeySecret": "<Access-key-secret>",
          "SecurityToken": "<Security-Token>",
          "Expiration": "2024-08-02T15:04:05Z"
        }

    The following table describes the configuration items.
| Parameter | Required | Example value | Description |
| --- | --- | --- | --- |
| AccessKeyId | Yes | STS.L4aB****************** | The AccessKey ID of your Alibaba Cloud account or a Resource Access Management (RAM) user. If you use temporary access credentials for authentication, set this parameter to the AccessKey ID from the temporary access credentials. |
| AccessKeySecret | Yes | At32************************ | The AccessKey secret of your Alibaba Cloud account or a RAM user. If you use temporary access credentials for authentication, set this parameter to the AccessKey secret from the temporary access credentials. |
| SecurityToken | No | STS.6MC2*************************************** | The security token. This parameter is required when you use temporary access credentials from Security Token Service (STS) to access OSS. If you authenticate with the AccessKey ID and AccessKey secret of an Alibaba Cloud account or a RAM user, leave this parameter empty. |
| Expiration | No | 2024-08-02T15:04:05Z | The time at which the authentication information expires, as a UTC timestamp. After this time, OSS Connector re-reads the authentication information. If you leave this parameter empty, the information never expires. If you use temporary access credentials, set this parameter. If you use the AccessKey ID and AccessKey secret of an Alibaba Cloud account or a RAM user, leave this parameter empty. |
  - Use AccessKey ID and AccessKey secret:

    Replace <Access-key-id> and <Access-key-secret> in the example with the AccessKey ID and AccessKey secret of a RAM user. For more information about how to create an AccessKey ID and AccessKey secret, see Create an AccessKey pair.

        {
          "AccessKeyId": "LTAI************************",
          "AccessKeySecret": "At32************************"
        }

  - Example configuration with temporary access credentials:

    Note: To ensure data security in scenarios in which access credentials are used in the production environment for a long period of time, we recommend that you use temporary access credentials to prevent the AccessKey ID and the AccessKey secret from being leaked. If you want to authorize temporary access, you must obtain temporary access credentials. For more information, see Use temporary credentials provided by STS to access OSS. After you obtain the temporary access credentials, replace <Access-key-id>, <Access-key-secret>, and <Security-Token> with the AccessKey ID, AccessKey secret, and security token.

        {
          "AccessKeyId": "STS.L4aB******************",
          "AccessKeySecret": "wyLTSm*************************",
          "SecurityToken": "************",
          "Expiration": "2024-08-15T15:04:05Z"
        }
- Grant read-only permissions on the credentials file to protect the AccessKey ID and AccessKey secret.

      chmod 400 /root/.alibabacloud/credentials
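The credential steps above can also be scripted. The following Python snippet is a minimal sketch, not part of OSS Connector: `build_credentials` and `write_credentials` are hypothetical helper names, and the timestamp check assumes that `Expiration` always follows the format used in the examples. The demo writes to a temporary directory instead of /root/.alibabacloud so that it is safe to run anywhere.

```python
import json
import os
import stat
import tempfile
from datetime import datetime
from pathlib import Path
from typing import Optional


def build_credentials(access_key_id: str,
                      access_key_secret: str,
                      security_token: Optional[str] = None,
                      expiration: Optional[str] = None) -> dict:
    """Return the credential mapping for long-term or temporary credentials."""
    creds = {"AccessKeyId": access_key_id, "AccessKeySecret": access_key_secret}
    if security_token:
        # Temporary (STS) credentials carry a token and, normally, an expiration.
        creds["SecurityToken"] = security_token
        if expiration:
            # Assumption: timestamps look like the examples, e.g. 2024-08-15T15:04:05Z.
            datetime.strptime(expiration, "%Y-%m-%dT%H:%M:%SZ")
            creds["Expiration"] = expiration
    # Without a token, SecurityToken and Expiration are left out (empty).
    return creds


def write_credentials(path: Path, creds: dict) -> None:
    """Write the credentials as JSON and leave the file read-only for its owner."""
    path.parent.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p`
    if path.exists():
        path.chmod(0o600)  # an earlier run may have left the file read-only
    # Create the file with owner-only permissions from the start.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(creds, handle, indent=2)
        handle.write("\n")
    path.chmod(0o400)  # same effect as `chmod 400`


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / ".alibabacloud" / "credentials"
        write_credentials(
            target, build_credentials("<Access-key-id>", "<Access-key-secret>"))
        print(target.read_text(encoding="utf-8"))
        print(oct(stat.S_IMODE(target.stat().st_mode)))
```

To write the real file, pass `Path("/root/.alibabacloud/credentials")` as the target and supply actual credential values.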
Configure OSS Connector
- Create a configuration file named config.json for OSS Connector.

      mkdir -p /etc/oss-connector/ && touch /etc/oss-connector/config.json

- Configure parameters and save the configuration file.

  In most cases, you can use the default configurations.

      {
        "logLevel": 1,
        "logPath": "/var/log/oss-connector/connector.log",
        "auditPath": "/var/log/oss-connector/audit.log",
        "expireTimeSec": 120,
        "prefetch": {
          "vcpus": 16,
          "workers": 16,
          "maxCacheAdviseGB": -1
        },
        "datasetConfig": {
          "prefetchConcurrency": 24,
          "prefetchWorker": 2
        },
        "checkpointConfig": {
          "prefetchConcurrency": 24,
          "prefetchWorker": 4,
          "uploadConcurrency": 64
        }
      }

  The following table describes the parameters. Read the instructions in the table carefully before you change the configurations.
Parameters inside a nested object are listed as object.parameter. For example, prefetch.vcpus is the vcpus parameter in the prefetch object.

| Parameter | Required | Example | Description |
| --- | --- | --- | --- |
| logLevel | No | 1 | The log level. Valid values: 0 (DEBUG), 1 (INFO), 2 (WARN), and 3 (ERROR). Default value: 1 (INFO). For production environments, we recommend 2 (WARN). |
| logPath | No | /var/log/oss-connector/connector.log | The path to the connector log. Default value: /var/log/oss-connector/connector.log. |
| auditPath | No | /var/log/oss-connector/audit.log | The path to the connector I/O audit log, which records read and write requests with a latency greater than 100 ms. Default value: /var/log/oss-connector/audit.log. |
| expireTimeSec | No | 120 | The interval, in seconds, at which OSS Connector re-reads the access credential file after the authentication information expires. Default value: 120. |
| prefetch.vcpus | No | 16 | The number of vCPUs available for prefetching in inference scenarios. Default value: 16. |
| prefetch.workers | No | 16 | The number of worker threads for prefetching in inference scenarios. Default value: 16. |
| prefetch.maxCacheAdviseGB | No | -1 | The memory cache size available for prefetching, in GB. Default value: -1 (unlimited). When the model file size exceeds the available memory of the node, we recommend that you set this parameter to avoid out-of-memory (OOM) errors. You can also set this parameter by using the CONNECTOR_MAX_CACHE_ADVISE_GB environment variable, which takes higher priority. |
| datasetConfig.prefetchConcurrency | No | 24 | The number of concurrent download tasks for prefetching data from OSS for a dataset. Default value: 24. |
| datasetConfig.prefetchWorker | No | 2 | The number of vCPUs to use for prefetching data from OSS for a dataset. Default value: 2. |
| checkpointConfig.prefetchConcurrency | No | 24 | The number of concurrent tasks for prefetching data from OSS during checkpoint reads. Default value: 24. |
| checkpointConfig.prefetchWorker | No | 4 | The number of vCPUs available for prefetching data from OSS during checkpoint reads. Default value: 4. |
| checkpointConfig.uploadConcurrency | No | 64 | The number of concurrent upload tasks for checkpoint writes. Default value: 64. |
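When only a few parameters need to change, one option is to generate config.json from the documented defaults and a small set of overrides. The following Python snippet is a sketch under stated assumptions: `DEFAULT_CONFIG` copies the example configuration from this topic, and `merge_config` and `validate_config` are hypothetical helpers, not part of OSS Connector. The only check performed is the documented range of `logLevel`.

```python
import json
from copy import deepcopy

# Values copied from the example configuration in this topic.
DEFAULT_CONFIG = {
    "logLevel": 1,
    "logPath": "/var/log/oss-connector/connector.log",
    "auditPath": "/var/log/oss-connector/audit.log",
    "expireTimeSec": 120,
    "prefetch": {"vcpus": 16, "workers": 16, "maxCacheAdviseGB": -1},
    "datasetConfig": {"prefetchConcurrency": 24, "prefetchWorker": 2},
    "checkpointConfig": {
        "prefetchConcurrency": 24,
        "prefetchWorker": 4,
        "uploadConcurrency": 64,
    },
}


def merge_config(defaults: dict, overrides: dict) -> dict:
    """Overlay overrides on a copy of defaults, rejecting unknown parameter names."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if key not in merged:
            # Catches typos such as "loglevel" before they reach the file.
            raise KeyError(f"unknown parameter: {key}")
        if isinstance(merged[key], dict):
            if not isinstance(value, dict):
                raise TypeError(f"{key} must be an object")
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


def validate_config(config: dict) -> None:
    """Check the one constraint that the parameter table states explicitly."""
    if config["logLevel"] not in (0, 1, 2, 3):
        raise ValueError("logLevel must be 0 (DEBUG), 1 (INFO), 2 (WARN), or 3 (ERROR)")


if __name__ == "__main__":
    # Hypothetical production settings: WARN logging and a 32 GB prefetch cache.
    config = merge_config(
        DEFAULT_CONFIG, {"logLevel": 2, "prefetch": {"maxCacheAdviseGB": 32}})
    validate_config(config)
    print(json.dumps(config, indent=2))
```

The printed JSON can be saved as /etc/oss-connector/config.json. Rejecting unknown keys is a design choice of this sketch. How OSS Connector itself treats unrecognized keys is not covered in this topic.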
Configure environment variables
OSS Connector supports partial configuration through environment variables. Environment variables take higher priority than the corresponding settings in the configuration file.
| Environment variable | Description |
| --- | --- |
| OSS_AUTHORIZATION_FILE_PATH | The path to a JSON-formatted access credential file. Both AK/SK and STS temporary credential formats are supported. This environment variable takes higher priority than the |
| CONNECTOR_CONFIG_PATH | Specifies the path to the configuration file. Default value: |
| CONNECTOR_UDS_PATH | The path to the Unix Domain Socket (UDS) file. Default value: |
| CONNECTOR_MAX_CACHE_ADVISE_GB | Specifies the memory cache size available for prefetching, in GB. This has the same function as |
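The precedence rule for the prefetch cache size can be illustrated with a short Python sketch. The function `effective_max_cache_gb` is a hypothetical name used only for illustration. It mirrors the behavior stated in this topic (the CONNECTOR_MAX_CACHE_ADVISE_GB environment variable wins over maxCacheAdviseGB in the configuration file, and -1 is the default) and assumes that the variable holds a plain number. How OSS Connector parses the value internally is not described here.

```python
import os
from typing import Mapping, Optional


def effective_max_cache_gb(config: Mapping,
                           environ: Optional[Mapping[str, str]] = None) -> float:
    """Return the cache size that applies after the environment override."""
    env = os.environ if environ is None else environ
    raw = env.get("CONNECTOR_MAX_CACHE_ADVISE_GB")
    if raw:
        # The environment variable takes higher priority than the file.
        return float(raw)
    # Fall back to the file setting, then to the documented default of -1.
    return float(config.get("prefetch", {}).get("maxCacheAdviseGB", -1))


if __name__ == "__main__":
    file_config = {"prefetch": {"maxCacheAdviseGB": 8}}
    # No override set: the value from the configuration file applies.
    print(effective_max_cache_gb(file_config, {}))
    # Override set: the environment variable applies.
    print(effective_max_cache_gb(
        file_config, {"CONNECTOR_MAX_CACHE_ADVISE_GB": "32"}))
```

Passing the environment as an argument keeps the function easy to test. Calling it without the second argument reads the real process environment.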
References
After you install and configure OSS Connector for AI/ML, you can:
- Use OssMapDataset to build a map dataset suitable for random reading. For more information, see Use OSS data to build an OssMapDataset dataset for random reading.
- Use OssIterableDataset to build an iterable dataset suitable for sequential streaming reading. For more information, see Use data in OSS to build an iterable dataset suitable for sequential streaming reading.
- Use OssCheckpoint to perform read and write operations on checkpoints in OSS. For more information, see Store and access checkpoints in OSS.
- Use OSS Connector to improve model deployment efficiency in inference scenarios. For more information, see Improve model deployment efficiency with OSS Connector for AI/ML.
- Use the model broadcast feature to efficiently distribute model data to multiple inference instances. For more information, see Model Broadcast.
- Enable Connector in Kubernetes for model inference deployment. For more information, see Enable Connector in Kubernetes.