In some pre-installation or high-performance scenarios, you may want to customize an operating system (OS) image to simplify elastic scaling in complex situations. You can use Alicloud Image Builder to build a custom OS image and create node pools based on this image. Alicloud Image Builder can accelerate node provisioning and optimize the performance of node autoscaling.
Prerequisites
An ACK cluster is created. For more information, see Create an ACK managed cluster.
A kubectl client is connected to the ACK cluster. For more information, see Get a cluster kubeconfig and connect to the cluster using kubectl.
Why you need elasticity-optimized custom images
ACK node pools support node autoscaling. The OS images provided for node pools, such as Alibaba Cloud Linux and CentOS, meet the requirements of most scenarios. However, in some pre-installation or high-performance scenarios, the base images may not meet your business needs. Alibaba Cloud provides Alicloud Image Builder to help you build custom OS images and simplify elastic scaling in complex scenarios.
When you use Alicloud Image Builder to create a custom image, you can submit an image building task to the cluster as a Job or CronJob.
Use an ACK Job to quickly build a custom OS image
This topic uses a ConfigMap named build-config and a Job workload named build as an example to show how to use Alicloud Image Builder to quickly build a custom OS image.
1. Configure parameters for building the OS image
You can create a ConfigMap named build-config to configure the parameters for building the OS image.
Create a file named build-config.yaml with the following YAML content.
The following tables describe the parameters in the preceding YAML content.
Table 1. Alicloud Image Builder configuration file parameters
| Parameter | Example | Description |
| --- | --- | --- |
| variables{"<variable1>":"<value>"} | variables{"access_key":"{{env `ALICLOUD_ACCESS_KEY`}}"} | The variables (variables) that are used by Alicloud Image Builder. Note: If you write sensitive information, such as an AccessKey pair (access_key and secret_key), into the configuration file, the information may be leaked. To prevent accidental disclosure, set them as variables whose values are sourced from the runtime. |
| builders{"type":"<value>"} | builders{"type":"alicloud-ecs"} | The image builders (builders). If you set type to alicloud-ecs, a temporary ECS instance is created to build the image. After the build is complete, the ECS instance is automatically destroyed. |
| provisioners{"type":"<value>"} | provisioners{"type":"shell"} | The image provisioners (provisioners), which define the operations to be performed within the temporary instance. If you set type to shell, a shell provisioner is used: a shell command is automatically run after connecting to the Linux instance. For example, run the shell command yum install redis.x86_64 -y to install Redis. For more information about provisioner configurations, see Provisioner configurations. |
Table 2. Image building parameters
| Parameter | Example | Description | Required |
| --- | --- | --- | --- |
| access_key | yourAccessKeyID | Your AccessKey ID. For more information, see Obtain an AccessKey pair. | Required |
| secret_key | yourAccessKeySecret | Your AccessKey secret. | Required |
| region | cn-beijing | The region of the destination custom image. | Required |
| image_name | ack-custom_image | The name of the destination custom image. The name must be unique within the region. | Required |
| source_image | aliyun_2_1903_x64_20G_alibase_20200904.vhd | The ID of the Alibaba Cloud public image that has the same operating system as the custom image. For more information, see OS images supported by Container Service for Kubernetes. | Required |
| instance_type | ecs.c6.xlarge | The instance type of the temporary ECS instance. The instance is created from source_image, runs the specified pre-installation task, and is then used to generate the custom image. If you need a GPU image, specify a GPU-accelerated instance type here. | Required |
| RUNTIME | containerd | The container runtime, which can be Docker or containerd. | Required |
| RUNTIME_VERSION | 1.6.28 | The version of the container runtime. If the container runtime is Docker, the default value is 19.03.15. If the container runtime is containerd, the default value is 1.6.20. | Required |
| SKIP_SECURITY_FIX | true | Specifies whether to skip security updates. | Required |
| KUBE_VERSION | 1.30.1-aliyun.1 | The version number of the cluster. | Required |
| PRESET_GPU | true | Specifies whether to pre-install a GPU driver to accelerate startup. | Optional |
| NVIDIA_DRIVER_VERSION | 460.91.03 | The version of the pre-installed GPU driver. If you do not specify this parameter, the default version 460.91.03 is used. | Optional |
| OS_ARCH | amd64 | The CPU architecture, which can be amd64 or arm64. | Required |
| MOUNT_RUNTIME_DATADISK | true | If your custom image has cached application images and you need to attach a data disk to the ECS instance during runtime, set this parameter to true. | Optional |
Important: Before you configure a custom image for a node pool, make sure that the node pool's configuration, such as the cluster version, cluster region, container runtime, and an instance type that is compatible with the GPU driver version, matches the build configuration of the custom image. Otherwise, nodes cannot be added to the cluster.
When you verify a custom image, use a regular node pool that matches the selected build parameters. After nodes are successfully added to the node pool, verify that your services run as expected.
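The Required column in Table 2 can be turned into a quick check that runs before the build is submitted. The following is a minimal sketch, assuming that each parameter is exported as an environment variable under the same name as in the table. The function name check_required_params is hypothetical and is not part of Alicloud Image Builder.

```shell
# Hypothetical helper: report every required build parameter that is unset or empty.
# Assumption: parameters are exported as environment variables named as in Table 2.
check_required_params() {
  missing=0
  for name in access_key secret_key region image_name source_image instance_type \
              RUNTIME RUNTIME_VERSION SKIP_SECURITY_FIX KUBE_VERSION OS_ARCH; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required parameter: $name" >&2
      missing=1
    fi
  done
  # Exit status 0 means all required parameters are present.
  return "$missing"
}
```

Calling check_required_params before kubectl apply lets a missing value surface locally instead of in the build logs.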
Run the following command to create the build-config ConfigMap in the cluster.
kubectl apply -f build-config.yaml
2. Create a Job to build the custom OS image
Use the following YAML content to grant the required permissions to the AccessKey pair.
Run the following commands to generate Base64-encoded strings for the AccessKey ID and AccessKey secret. Base64 is an encoding, not encryption, so treat the encoded values as sensitive.
echo -n "yourAccessKeyID" | base64
echo -n "yourAccessKeySecret" | base64
Use the following YAML content to create a Secret named my-secret.
apiVersion: v1
kind: Secret
metadata:
  name: my-secret
  namespace: default
type: Opaque
data:
  ALICLOUD_ACCESS_KEY: TFRI**************** # The Base64-encoded string from the previous step.
  ALICLOUD_SECRET_KEY: a0zY****************
Create a file named build.yaml with the following YAML content.
Configure variables to run the Job. During the process, an ECS instance of the specified instance_type is created from the source_image in the account that owns the AccessKey pair. Then, the provisioner configurations are run. After the process is complete, the ECS instance is used to generate an image, which is then pushed as a custom image to the specified region under the same account.
Deploy the Job to the cluster to start building the OS image.
kubectl apply -f build.yaml
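The my-secret manifest used in this step can also be rendered by a small script so that the Base64 values are never pasted by hand. The following is a minimal sketch: the helper names b64 and render_secret are hypothetical, and the manifest fields mirror the Secret shown earlier in this step.

```shell
# Hypothetical helper: Base64-encode a value on a single line.
b64() {
  # printf '%s' emits no trailing newline; tr removes any line wrapping
  # that the base64 tool applies to long input.
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Hypothetical helper: print the Secret manifest.
# $1: AccessKey ID, $2: AccessKey secret.
render_secret() {
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: my-secret
  namespace: default
type: Opaque
data:
  ALICLOUD_ACCESS_KEY: $(b64 "$1")
  ALICLOUD_SECRET_KEY: $(b64 "$2")
EOF
}
```

One possible use is to pipe the output of render_secret "$ALICLOUD_ACCESS_KEY" "$ALICLOUD_SECRET_KEY" into kubectl apply -f - so that the credentials come from the environment and are not written to a file.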
3. (Optional) View the custom image build logs
Operation logs are generated during the image building process. The logs record the steps performed during the build, including parameter validation, temporary resource creation, software pre-installation, target resource creation, and temporary resource release. You can perform the following steps to view the image build logs.
Log on to the ACK console. In the navigation pane on the left, click Clusters.
On the Clusters page, click the name of the target cluster. In the navigation pane on the left, choose .
On the Jobs page, find the Job that you created in the preceding step. In the Actions column, click Details. On the details page, click the Logs tab to view the image building logs.
Provisioner configurations
A provisioner is a component used to install and configure software on a running machine before the machine is converted into a static OS image. The main scenarios where you need to install software into an image include the following:
Install software packages.
Patch the kernel.
Create users.
Download application code.
Create a custom Alibaba Cloud Linux 3 image.
Execute a shell script
"provisioners": [{
  "type": "shell",
  "script": "script.sh"
}]
Use Ansible to execute an orchestration script
"provisioners": [
  {
    "type": "ansible",
    "playbook_file": "./playbook.yml"
  }
]
Install the CPFS client
Cloud Paralleled File System (CPFS) requires many packages to be installed. Some of these packages involve on-site compilation, which makes the installation process time-consuming. When the number of client nodes is large, you can use a custom image to reduce the cost of batch installing the CPFS client.
Customize an Arm architecture image
Customize a GPU node system image to accelerate startup
Custom GPU images and custom CPU images cannot be used interchangeably.
Cache application images in the system image
When an ECS instance with a mounted data disk is added to a node pool, the disk is initialized, and any pre-cached application images are cleared. To mount a data disk when you create an ECS instance from a custom image, you can generate a data disk snapshot during the custom image creation process to ensure that the application images are not cleared.
{
"variables": {
"image_name": "ack-custom_image",
"source_image": "aliyun_3_x64_20G_alibase_20240528.vhd",
"instance_type": "ecs.c6.xlarge",
"access_key": "{{env `ALICLOUD_ACCESS_KEY`}}",
"region": "{{env `ALICLOUD_REGION`}}",
"secret_key": "{{env `ALICLOUD_SECRET_KEY`}}"
},
"builders": [
{
"type": "alicloud-ecs",
"system_disk_mapping": {
"disk_size": 120,
"disk_category": "cloud_essd"
},
"image_disk_mappings": {
"disk_size": 40,
"disk_category": "cloud_auto"
}, # Configure a data disk when you create the custom image. A snapshot of the data disk is automatically generated after the image is created.
"access_key": "{{user `access_key`}}",
"secret_key": "{{user `secret_key`}}",
"region": "{{user `region`}}",
"image_name": "{{user `image_name`}}",
"source_image": "{{user `source_image`}}",
"instance_type": "{{user `instance_type`}}",
"ssh_username": "root",
"skip_image_validation": "true",
"io_optimized": "true"
}
],
"provisioners": [
{
"type": "file",
"source": "scripts/ack-optimized-os-linux3-all.sh",
"destination": "/root/"
},
{
"type": "shell",
"inline": [
"export RUNTIME=containerd",
"export SKIP_SECURITY_FIX=true",
"export KUBE_VERSION=1.30.1-aliyun.1",
"export OS_ARCH=amd64",
"export MOUNT_RUNTIME_DATADISK=true", # Mount the file path of the container runtime to the data disk.
"bash /root/ack-optimized-os-linux3-all.sh",
"ctr -n k8s.io i pull registry-cn-hangzhou-vpc.ack.aliyuncs.com/acs/pause:3.9", # Add the application image to the system image.
"mv /var/lib/containerd /var/lib/container/containerd" # Move the image file to the data disk.
]
}
]
}
When you configure the node pool, you can specify a custom image that includes a data disk snapshot. The system automatically associates the corresponding data disk snapshot.
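The # annotations in the preceding template are explanatory only. JSON has no comment syntax, so a copy that keeps them would not parse as JSON. The following is a minimal sketch of stripping them with sed, assuming that each annotation sits at the end of a line and contains no double quotation mark. The function name strip_template_comments is hypothetical.

```shell
# Hypothetical helper: remove trailing "# ..." annotations from a JSON template.
# Reads the template on standard input and writes the cleaned text to standard output.
strip_template_comments() {
  # Match optional whitespace, a '#', and everything up to the end of the line,
  # but only when no double quote follows the '#'. A '#' inside a quoted string
  # is always followed by the closing quote, so it is left untouched.
  sed 's/[[:space:]]*#[^"]*$//'
}
```

For example, strip_template_comments < annotated.json > template.json writes a cleaned copy; both file names are placeholders.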

Pull an image from a private repository when the runtime is Docker
docker login <Registry Address> -u user -p password
docker pull nginx
Pull an image from a private repository when the runtime is containerd
ctr -n k8s.io i pull --user=username:password nginx
Pull an image from a private repository during a custom image build
On a Linux machine with Docker installed, run the following docker login command to generate a credential file.
docker login --username=zhongwei.***@aliyun-test.com --password xxxxxxxxxx registry.cn-beijing.aliyuncs.com
After you successfully run the docker login command, a credential file named config.json is generated in the /root/.docker directory.
Create a ConfigMap from the generated config.json file.
apiVersion: v1
kind: ConfigMap
metadata:
  name: docker-config
data:
  config.json: |-
    {
        "auths": {
            "registry.cn-beijing.aliyuncs.com": {
                "auth": "xxxxxxxxxxxxxx"
            }
        },
        "HttpHeaders": {
            "User-Agent": "Docker-Client/19.03.15 (linux)"
        }
    }
Modify the Job YAML to mount the ConfigMap to the pod.

Modify the build-config by adding the content shown in the figure.

Run the Job.
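When docker login runs without a credential helper, the auth value that it writes to config.json is the Base64 encoding of username:password. The following is a minimal sketch that reproduces that value, which can help when you check that the ConfigMap carries the expected credentials. The function name docker_auth_value is hypothetical.

```shell
# Hypothetical helper: compute the "auth" value that docker login stores
# in config.json when no credential helper is configured.
# $1: registry username, $2: registry password.
docker_auth_value() {
  # The value is Base64 of "username:password" with no trailing newline.
  printf '%s:%s' "$1" "$2" | base64 | tr -d '\n'
}
```

Because the value is only encoded, not encrypted, anyone who can read the ConfigMap can recover the password.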
Set the image upload and download concurrency
Log on to the ACK console. In the navigation pane on the left, click Clusters.
On the Clusters page, click the name of the target cluster. In the navigation pane on the left, choose .
Click the name of the target node pool. On the Basic Information tab, in the Node Pool Information section, click the link next to Auto Scaling Group.
Click the Instance Configuration Sources tab. Find the scaling configuration that you want to modify, click Modify in the Actions column, and then click OK.
On the Modify Scaling Configuration page, expand Advanced Settings and copy the content of the Instance User Data field.
Decode and modify the user data.
Base64-decode the existing instance user data. After the data is decoded, append the following code to the end of the original script.
The following code installs the jq tool, modifies the Docker configuration file to increase the number of concurrent downloads and uploads, and then restarts the Docker service.
yum install -y jq
echo "$(jq '. += {"max-concurrent-downloads": 20,"max-concurrent-uploads": 20}' /etc/docker/daemon.json)" > /etc/docker/daemon.json
service docker restart
Re-encode and update the user data.
Base64-encode the complete modified script. Replace the original data in the Instance User Data field with the newly generated encoded content, and then click Confirm Modification to save the changes.
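The decode, append, and re-encode steps above can be sketched as a single shell function. This is a minimal sketch, assuming that the decoded user data is a plain shell script and that the local base64 tool accepts -d for decoding (GNU coreutils does). The function name append_to_user_data is hypothetical.

```shell
# Hypothetical helper: append a script snippet to Base64-encoded user data.
# $1: existing Base64-encoded user data, $2: script text to append.
# Prints the new Base64-encoded user data on a single line.
append_to_user_data() {
  # Decode the existing user data; stop if it is not valid Base64.
  decoded=$(printf '%s' "$1" | base64 -d) || return 1
  # Append the snippet on its own line, then re-encode without line wrapping.
  printf '%s\n%s\n' "$decoded" "$2" | base64 | tr -d '\n'
}
```

The printed value is what would replace the original content of the Instance User Data field.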
Create a custom Alibaba Cloud Linux 3 image
Create a custom Red Hat Enterprise Linux 9 image
What to do next
You can create a node pool based on the custom image. For more information, see Create a custom image from an existing ECS instance and use the image to create nodes.
To learn how to scale out nodes based on a custom image, see Enable node autoscaling.