Use the P2P acceleration feature in an ACK cluster to accelerate image pulls and reduce application deployment times. This topic describes how to install the P2P acceleration agent in an ACK cluster.
Prerequisites
-
You have created an ACR Enterprise Edition instance (Standard or Advanced Edition).
-
You have created an ACK managed cluster or ACK dedicated cluster, or an ACK Serverless Pro cluster.
-
You have configured the ACR Enterprise Edition instance to allow access from the VPC of your ACK cluster. For more information, see Configure access control for a VPC.
Image usage limitations
If you use extra-large images, such as those for large language models, you must meet one of the following requirements to ensure efficient P2P image pulling: the nodes in the node pool have a data disk of the AutoPL type, or the nodes have at least 8 GB of free memory for P2P data caching.
Step 1: Obtain instance ID and enable P2P acceleration
Log on to the Container Registry console.
In the top navigation bar, select a region.
In the left-side navigation pane, click Instances.
On the Instances page, click the Enterprise Edition instance that you want to manage.
-
On the Overview page, record the Instance ID. Then, in the Component Settings section, turn on P2P acceleration and click Confirm in the confirmation dialog box.
Warning: Before you disable P2P acceleration, you must stop using the P2P feature and uninstall the agent from all relevant clusters. Re-enabling the feature requires reinstalling the agent.
Step 2: Install the P2P agent and grant permissions
You can grant the P2P acceleration agent permission to access your ACR Enterprise Edition instance by using one of the following three methods.
-
Use a worker RAM role for authorization and installation.
Limitation: The ACR Enterprise Edition instance and the ACK cluster must belong to the same Alibaba Cloud account.
-
Use the AccessKey pair of a RAM user for authorization and installation.
-
Use RAM Roles for Service Accounts (RRSA) for authorization and installation.
Limitation: This method is supported only for ACK managed clusters that run Kubernetes 1.22 or later.
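The limitations above can be read as a small decision rule. The following Python sketch is illustrative only: the ClusterInfo fields, the cluster type labels, and the method names are hypothetical and are not part of any ACK or ACR API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ClusterInfo:
    same_account_as_acr: bool  # cluster and ACR instance belong to one account
    cluster_type: str          # hypothetical labels: "managed", "dedicated", "serverless-pro"
    kubernetes_version: str    # for example "1.24.6"


def _major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '1.22.15'."""
    parts = version.lstrip("v").split(".")
    return int(parts[0]), int(parts[1])


def eligible_methods(info: ClusterInfo) -> List[str]:
    """Return the authorization methods whose stated limitations are met."""
    methods = []
    # Worker RAM role: cluster and instance must share an account.
    if info.same_account_as_acr:
        methods.append("worker-ram-role")
    # RAM user AccessKey pair: no limitation is listed for this method.
    methods.append("ram-user-accesskey")
    # RRSA: ACK managed clusters that run Kubernetes 1.22 or later.
    if info.cluster_type == "managed" and _major_minor(info.kubernetes_version) >= (1, 22):
        methods.append("rrsa")
    return methods
```

For example, a dedicated cluster in a different account than the instance would be left with only the AccessKey pair method under this rule.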
Worker RAM role
Log on to the ACK console. In the left navigation pane, click Clusters.
On the Clusters page, click the name of your cluster. In the left navigation pane, click Cluster Information.
-
On the Cluster Information page, click the Basic Information tab. In the Cluster Resources section, copy the name of the worker RAM role and click the link to open the Resource Access Management (RAM) console to grant permissions to the role.
-
Create the following custom policy. For more information, see Create a custom policy.
{
  "Version": "1",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "cr:GetInstanceVpcEndpoint",
        "cr:ListInstanceEndpoint"
      ],
      "Resource": "*"
    }
  ]
}
-
On the Role page, find the worker RAM role and attach the custom policy that you created. For more information, see Grant permissions to a RAM role.
-
Log on to the ACK console. In the left navigation pane, open the App Catalog page.
-
On the App Catalog page, enter ack-acr-acceleration-p2p in the search box, find the component, and then click its card.
-
On the component details page, click Quick Deployment in the upper-right corner.
-
In the Create panel, select a Cluster and Namespace, specify a release name, and then click Next.
-
On the Parameters panel, select the latest chart version and set the acrInstances parameter to the instance ID that you recorded in Step 1. If you have multiple instances, separate their IDs with a comma (,).
# ID of ACR EE instances, support multi, e.g. "cri-xxx,cri-yyy"
acrInstances: ""
# Region of the ACR EE instance. The default value is the region of the cluster.
# You must set this parameter if the cluster and the instance are in different regions, or if you access the instance from a self-managed cluster in a data center.
region: ""
# The VPC that is connected to the VPC network of the ACR EE instance. The default value is the VPC of the cluster.
# You must set this parameter if the cluster and the instance are in different regions, or if you access the instance from a self-managed cluster in a data center through a VPC network.
vpcId: ""
p2p:
  # Port of the P2P agent in the host network.
  port: 65001
Note
-
By default, the agent uses port 65001 on nodes. If a port conflict occurs, change the port number as needed.
-
If the ACK cluster and the ACR Enterprise Edition instance are in the same region, you can leave the region and vpcId parameters empty. If they are in different regions, you must set the region parameter to the region of the ACR Enterprise Edition instance and the vpcId parameter to the ID of the VPC associated with the instance.
-
For extra-large images, such as those for large language models, you must adjust the P2P data caching mode based on your node configuration:
-
Data disk-based caching mode (default): Ensure that the node's data disk type is AutoPL. Set p2p.v2.cache.mode to disk.
-
Memory-based caching mode: Ensure that the node has at least 8 GB of free memory, and then set p2p.v2.cache.mode to memory.
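The parameter rules in the preceding note can be sketched as a small helper that assembles the values to enter on the Parameters panel. This is a minimal illustration, not the chart's own logic: the function name and arguments are hypothetical, and nesting p2p.v2.cache.mode as v2 > cache > mode under p2p is an assumption based on the dotted parameter name.

```python
from typing import Dict, List, Optional


def build_values(instance_ids: List[str],
                 cluster_region: str,
                 instance_region: str,
                 instance_vpc_id: Optional[str] = None,
                 port: int = 65001,
                 cache_mode: Optional[str] = None) -> Dict[str, object]:
    """Assemble chart values following the rules described in the note."""
    if not instance_ids:
        raise ValueError("at least one ACR instance ID is required")

    p2p: Dict[str, object] = {"port": port}
    values: Dict[str, object] = {
        # Multiple instance IDs are joined with a comma.
        "acrInstances": ",".join(instance_ids),
        # Empty strings keep the defaults (the cluster's region and VPC).
        "region": "",
        "vpcId": "",
        "p2p": p2p,
    }

    # Cross-region access: region and vpcId must both be set explicitly.
    if instance_region != cluster_region:
        if not instance_vpc_id:
            raise ValueError("vpcId is required when the regions differ")
        values["region"] = instance_region
        values["vpcId"] = instance_vpc_id

    # Optional cache mode for extra-large images (assumed nesting, see lead-in).
    if cache_mode is not None:
        if cache_mode not in ("disk", "memory"):
            raise ValueError("cache_mode must be 'disk' or 'memory'")
        p2p["v2"] = {"cache": {"mode": cache_mode}}

    return values
```

For a same-region setup, build_values(["cri-xxx", "cri-yyy"], "cn-hangzhou", "cn-hangzhou") leaves region and vpcId empty and only joins the instance IDs.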
RAM user AccessKey pair
-
Create a RAM user. For more information, see Create a RAM user.
-
Grant the following permissions to the RAM user. Then, create an AccessKey pair and record the AccessKey ID and AccessKey Secret.
{
  "Version": "1",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "cr:GetInstanceVpcEndpoint",
        "cr:ListInstanceEndpoint"
      ],
      "Resource": "*"
    }
  ]
}
-
Log on to the ACK console. In the left navigation pane, open the App Catalog page.
-
On the App Catalog page, enter ack-acr-acceleration-p2p in the search box, find the component, and then click its card.
-
On the component details page, click Quick Deployment in the upper-right corner.
-
In the Create panel, select a Cluster and Namespace, specify a release name, and then click Next.
-
On the Parameters panel, select the latest chart version. Set the acrInstances parameter to the instance ID that you recorded in Step 1. If you have multiple instances, separate their IDs with a comma (,). Then, enter the AccessKey ID and AccessKey Secret that you recorded.
# ID of ACR EE instances, support multi, e.g. "cri-xxx,cri-yyy"
acrInstances: ""
# You must specify the following parameters if your Kubernetes cluster is self-managed in a data center.
accessKey: ""
accessKeySecret: ""
# Region of the ACR EE instance. The default value is the region of the cluster.
# You must set this parameter if the cluster and the instance are in different regions, or if you access the instance from a self-managed cluster in a data center.
region: ""
# The VPC that is connected to the VPC network of the ACR EE instance. The default value is the VPC of the cluster.
# You must set this parameter if the cluster and the instance are in different regions, or if you access the instance from a self-managed cluster in a data center through a VPC network.
vpcId: ""
p2p:
  # Port of the P2P agent in the host network.
  port: 65001
Note
-
By default, the agent uses port 65001 on nodes. If a port conflict occurs, change the port number as needed.
-
If the ACK cluster and the ACR Enterprise Edition instance are in the same region, you can leave the region and vpcId parameters empty. If they are in different regions, you must set the region parameter to the region of the ACR Enterprise Edition instance and the vpcId parameter to the ID of the VPC associated with the instance.
-
For extra-large images, such as those for large language models, you must adjust the P2P data caching mode based on your node configuration:
-
Data disk-based caching mode (default): Ensure that the node's data disk type is AutoPL. Set p2p.v2.cache.mode to disk.
-
Memory-based caching mode: Ensure that the node has at least 8 GB of free memory, and then set p2p.v2.cache.mode to memory.
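The permission policy granted to the RAM user earlier in this procedure contains only two actions. As a rough sketch, the following Python code builds that policy document and checks that a given policy grants nothing beyond those actions. The function names are hypothetical helpers for illustration and are not part of any Alibaba Cloud SDK.

```python
import json

# The two actions listed in the policy in this topic.
REQUIRED_ACTIONS = ("cr:GetInstanceVpcEndpoint", "cr:ListInstanceEndpoint")


def build_policy() -> dict:
    """Return the custom policy document shown in this topic."""
    return {
        "Version": "1",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(REQUIRED_ACTIONS),
                "Resource": "*",
            }
        ],
    }


def grants_only_required_actions(policy: dict) -> bool:
    """Check that every Allow statement lists only the required actions."""
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if any(action not in REQUIRED_ACTIONS for action in actions):
            return False
    return True


if __name__ == "__main__":
    # Print the policy as JSON, ready to paste into a policy editor.
    print(json.dumps(build_policy(), indent=2))
```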
RRSA
The RAM Roles for Service Accounts (RRSA) feature provides fine-grained, Pod-level permission control.
The RRSA feature is supported only in clusters that run Kubernetes 1.22 or later.
-
To use RRSA, you must upgrade the agent to version 0.3.6 or later.
-
Enable RRSA in the correct order: first for the cluster, then for the P2P acceleration agent. If you perform these steps in the wrong order, you must reinstall the agent for RRSA to take effect.
-
Enable RRSA for your cluster. For more information, see Configure a RAM role for a service account by using RRSA to enforce fine-grained permission management for Pods.
-
Configure ACR resource access permissions for the RAM role.
-
Scenario 1: The ACK cluster and the ACR Enterprise Edition instance belong to the same Alibaba Cloud account.
If Account A owns both the ACK cluster and the ACR Enterprise Edition instance, create a RAM role in Account A and attach the following permission policy to it. For more information, see Create a RAM role for a trusted Alibaba Cloud account.
{
  "Version": "1",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "cr:GetInstanceVpcEndpoint",
        "cr:ListInstanceEndpoint"
      ],
      "Resource": "*"
    }
  ]
}
Then, modify the trust policy of the role as follows.
Note
-
Replace <oidc_issuer_url> with the URL of the OIDC provider for your cluster. You can find this URL on the Basic Information tab of the cluster details page in the ACK console.
-
Replace <oidc_provider_arn> with the ARN of the OIDC provider for your cluster. You can find this ARN on the Basic Information tab of the cluster details page in the ACK console.
{
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "oidc:aud": [
            "sts.aliyuncs.com"
          ],
          "oidc:iss": "<oidc_issuer_url>",
          "oidc:sub": [
            "system:serviceaccount:aliyun-acr-acceleration:ack-acr-acceleration-p2p-job-sa",
            "system:serviceaccount:aliyun-acr-acceleration:ack-acr-acceleration-p2p-sa"
          ]
        }
      },
      "Effect": "Allow",
      "Principal": {
        "Federated": [
          "<oidc_provider_arn>"
        ]
      }
    }
  ],
  "Version": "1"
}
-
Scenario 2: The ACK cluster and the ACR Enterprise Edition instance belong to different Alibaba Cloud accounts.
Assume Account A owns the ACK cluster, and Account B owns the ACR Enterprise Edition instance. You must grant the ACK cluster in Account A permission to access the ACR resources in Account B.
In Account A, create a RAM role. For more information, see Create a RAM role for a trusted Alibaba Cloud account. Attach the AliyunSTSAssumeRoleAccess policy to grant the role permission to assume other roles. Then, modify its trust policy.
Note
-
Replace <oidc_issuer_url> with the URL of the OIDC provider for your cluster. You can find this URL on the Basic Information tab of the cluster details page in the ACK console.
-
Replace <oidc_provider_arn> with the ARN of the OIDC provider for your cluster. You can find this ARN on the Basic Information tab of the cluster details page in the ACK console.
{
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "oidc:aud": [
            "sts.aliyuncs.com"
          ],
          "oidc:iss": "<oidc_issuer_url>",
          "oidc:sub": [
            "system:serviceaccount:aliyun-acr-acceleration:ack-acr-acceleration-p2p-job-sa",
            "system:serviceaccount:aliyun-acr-acceleration:ack-acr-acceleration-p2p-sa"
          ]
        }
      },
      "Effect": "Allow",
      "Principal": {
        "Federated": [
          "<oidc_provider_arn>"
        ]
      }
    }
  ],
  "Version": "1"
}
In Account B, create a RAM role with ACR-related permissions. In the trust policy of this role, add the ARN of the role from Account A. Attach the following permission policy to the role in Account B.
{
  "Version": "1",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "cr:GetInstanceVpcEndpoint",
        "cr:ListInstanceEndpoint"
      ],
      "Resource": "*"
    }
  ]
}
Note: You can set the Maximum Session Duration for this RAM role to a value from 3,600 to 43,200 seconds. The expireDuration parameter, which you will configure later, must not exceed this value. For best results, set expireDuration to the same value as the Maximum Session Duration.
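The trust policy used in both scenarios differs only in the two placeholders. The following Python sketch fills them in and rejects values that still look like placeholders. The function name is hypothetical, and the values in the usage note are made-up examples, not real identifiers.

```python
import json

# Service accounts named in the trust policy in this topic.
SERVICE_ACCOUNTS = (
    "system:serviceaccount:aliyun-acr-acceleration:ack-acr-acceleration-p2p-job-sa",
    "system:serviceaccount:aliyun-acr-acceleration:ack-acr-acceleration-p2p-sa",
)


def build_trust_policy(oidc_issuer_url: str, oidc_provider_arn: str) -> dict:
    """Fill the two placeholders of the RRSA trust policy."""
    for name, value in (("oidc_issuer_url", oidc_issuer_url),
                        ("oidc_provider_arn", oidc_provider_arn)):
        # Guard against leaving "<oidc_issuer_url>"-style placeholders in place.
        if not value or value.startswith("<"):
            raise ValueError(f"{name} must be replaced with a real value")
    return {
        "Statement": [
            {
                "Action": "sts:AssumeRole",
                "Condition": {
                    "StringEquals": {
                        "oidc:aud": ["sts.aliyuncs.com"],
                        "oidc:iss": oidc_issuer_url,
                        "oidc:sub": list(SERVICE_ACCOUNTS),
                    }
                },
                "Effect": "Allow",
                "Principal": {"Federated": [oidc_provider_arn]},
            }
        ],
        "Version": "1",
    }


if __name__ == "__main__":
    # Hypothetical example values; copy the real ones from the ACK console.
    policy = build_trust_policy("https://oidc.example.invalid/c123",
                                "acs:ram::123456789012:oidc-provider/example")
    print(json.dumps(policy, indent=2))
```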
-
-
Log on to the ACK console. In the left navigation pane, open the App Catalog page.
-
On the App Catalog page, enter ack-acr-acceleration-p2p in the search box, find the component, and then click its card.
-
On the component details page, click Quick Deployment in the upper-right corner.
-
In the Create panel, select a Cluster and Namespace, specify a release name, and then click Next.
-
On the Parameters panel, select the latest chart version. Set the acrInstances parameter to the instance ID that you recorded in Step 1. If you have multiple instances, separate their IDs with a comma (,). Configure the RRSA parameters as described in the following table.
| Parameter | Description | Value |
| --- | --- | --- |
| rrsa.enable | Specifies whether to enable RRSA. | true |
| rrsa.rrsaRoleARN | The ARN of the RAM role created in Account A. | Example: acs:ram::aaa |
| rrsa.rrsaOIDCProviderRoleARN | The ARN of the OIDC provider for the cluster in Account A. | Example: acs:ram::bbb |
| rrsa.assumeRoleARN | The ARN of the RAM role created in Account B. This parameter is not required for same-account scenarios. | Example: acs:ram::ccc |
| rrsa.expireDuration | The session duration for the role created in Account B. This determines the validity period of the temporary credentials generated by the agent. This parameter is not required for same-account scenarios. Important: The value of expireDuration must not exceed the Maximum Session Duration of the role created in Account B. | The default value is 3600. The value must be between 3,600 and 43,200. Unit: seconds. |
For information about other parameters, see Appendix.
Note
-
By default, the agent uses port 65001 on nodes. If a port conflict occurs, change the port number as needed.
-
If the ACK cluster and the ACR Enterprise Edition instance are in the same region, you can leave the region and vpcId parameters empty. If they are in different regions, you must set the region parameter to the region of the ACR Enterprise Edition instance and the vpcId parameter to the ID of the VPC associated with the instance.
-
For extra-large images, such as those for large language models, you must adjust the P2P data caching mode based on your node configuration:
-
Data disk-based caching mode (default): Ensure that the node's data disk type is AutoPL. Set p2p.v2.cache.mode to disk.
-
Memory-based caching mode: Ensure that the node has at least 8 GB of free memory, and then set p2p.v2.cache.mode to memory.
# ID of ACR EE instances, support multi, e.g. "cri-xxx,cri-yyy"
acrInstances: ""
rrsa:
  enable: true
  rrsaRoleARN: ""
  rrsaOIDCProviderRoleARN: ""
  assumeRoleARN: ""
  expireDuration: 3600
# Region of the ACR EE instance. The default value is the region of the cluster.
# You must set this parameter if the cluster and the instance are in different regions, or if you access the instance from a self-managed cluster in a data center.
region: ""
# The VPC that is connected to the VPC network of the ACR EE instance. The default value is the VPC of the cluster.
# You must set this parameter if the cluster and the instance are in different regions, or if you access the instance from a self-managed cluster in a data center through a VPC network.
vpcId: ""
p2p:
  # Port of the P2P agent in the host network.
  port: 65001
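The constraints on the RRSA parameters, in particular on expireDuration, can be checked before deployment. The following Python sketch assembles the rrsa section and enforces the documented range and the Maximum Session Duration limit. The function name and its arguments are hypothetical, and treating an empty assume_role_arn as the same-account scenario is an assumption made for this illustration.

```python
from typing import Dict

MIN_DURATION = 3600    # seconds, lower bound stated in the parameter table
MAX_DURATION = 43200   # seconds, upper bound stated in the parameter table


def build_rrsa_values(role_arn: str,
                      oidc_provider_arn: str,
                      assume_role_arn: str = "",
                      expire_duration: int = 3600,
                      max_session_duration: int = 3600) -> Dict[str, object]:
    """Assemble the rrsa section and validate expireDuration."""
    rrsa: Dict[str, object] = {
        "enable": True,
        "rrsaRoleARN": role_arn,
        "rrsaOIDCProviderRoleARN": oidc_provider_arn,
    }
    # Same-account scenario: assumeRoleARN and expireDuration are not required.
    if not assume_role_arn:
        return rrsa

    if not MIN_DURATION <= expire_duration <= MAX_DURATION:
        raise ValueError("expireDuration must be between 3600 and 43200 seconds")
    # Must not exceed the Maximum Session Duration of the role in Account B.
    if expire_duration > max_session_duration:
        raise ValueError("expireDuration exceeds the Maximum Session Duration")

    rrsa["assumeRoleARN"] = assume_role_arn
    rrsa["expireDuration"] = expire_duration
    return rrsa
```

Setting expire_duration equal to max_session_duration follows the recommendation in this topic to keep the two values the same.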
Appendix
The following table describes some of the parameters for the ack-acr-acceleration-p2p component.
| File cache parameters | Default (disk mode) | Default (memory mode) | Description |
| --- | --- | --- | --- |
| blocksize | 256 | 256 | The size of a single data chunk requested from the source Object Storage Service (OSS). |
| capacity | 4294967296 | 0 | The size of the disk cache. |
| optionBlockSize | 67108864 | 8589934592 | The size of the memory cache. |
| memoryCacheCapacityGB | 1 | 8 | |
| aio | 0 | 0 | Specifies whether to enable libaio. This parameter has no effect in memory-based caching mode. |

| DeployConfig | Default | Description |
| --- | --- | --- |
| proxyFsParallels | 128 | The number of concurrent requests that the P2P agent can process. |

| AgentConfig | Default | Description |
| --- | --- | --- |
| connectTimeout (s) | 5 | The timeout period for the P2P agent to establish a connection with an upstream peer. |
| transferTimeout (s) | 15 | The timeout period for data transfers. If no data is received within this period, the transfer times out. |
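The large default values in the file cache table are easier to read in binary units. The following Python sketch converts them, assuming the values are byte counts. The table does not state a unit, so this is an assumption, and the helper name is hypothetical.

```python
def human_size(num_bytes: int) -> str:
    """Format a byte count using binary units (1 KiB = 1024 B)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit where the value drops below 1024.
        if value < 1024 or unit == units[-1]:
            return f"{value:g} {unit}"
        value /= 1024
    return f"{value:g} {units[-1]}"  # not reached; keeps type checkers satisfied


if __name__ == "__main__":
    # Defaults from the table, interpreted as bytes (assumption).
    defaults = {
        "capacity (disk mode)": 4294967296,
        "optionBlockSize (disk mode)": 67108864,
        "optionBlockSize (memory mode)": 8589934592,
    }
    for name, value in defaults.items():
        print(f"{name}: {human_size(value)}")
```

Under this assumption, the memory-mode optionBlockSize default of 8589934592 corresponds to 8 GiB, which lines up with the memoryCacheCapacityGB default of 8 in the same column.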