This topic provides release notes for Container Service for Kubernetes (ACK) and links to the relevant references.
Background information
For more information about the Kubernetes versions supported by ACK, see Supported Kubernetes versions.
The following operating systems are supported by ACK: Container OS, Alibaba Cloud Linux 3, Alibaba Cloud Linux 3 Container-optimized, Alibaba Cloud Linux 3 for ARM, Alibaba Cloud Linux 3 (UEFI), Red Hat, Ubuntu, and Windows. For more information, see OS images.
March 2025
Product | Feature | Description | Region | References |
ACK | Auto mode supported in ACK managed Pro clusters | When you create an ACK managed cluster, you can enable auto mode to rapidly create a Kubernetes cluster that adheres to Alibaba Cloud best practices. After the cluster is created, a node pool with auto mode enabled is automatically created. | All regions | |
ACK | Tracing Analysis supported for cluster control plane and data plane components | After Tracing Analysis is enabled for the cluster API server or kubelet, tracing information is automatically reported to Managed Service for OpenTelemetry, which provides monitoring data such as visualized trace details and real-time topology. | All regions | |
ACK | SMS and email notifications for high-risk kubeconfig files | You can notify users by SMS and email of kubeconfig files associated with the current account that have been deleted but still pose risks. | All regions | None |
ACK | Intelligent routing and traffic management based on ACK Gateway with Inference Extension | You can configure the inference service extension by using the ACK Gateway with Inference Extension component to implement intelligent routing and efficient traffic management. | All regions | Implement intelligent routing and traffic management by using ACK Gateway with Inference Extension |
Distributed Cloud Container Platform for Kubernetes (ACK One) | Unified management of components across multi-cluster fleets | ACK One Fleet provides unified and automated component management for cluster O&M engineers. You can define baselines that include multiple components and their versions and deploy them across multiple clusters. ACK One Fleet also supports component configuration, deployment batches, and rollback, which enhances system stability. | All regions | |
ACK One | Dynamic distribution and rescheduling supported | ACK One Fleet can split workloads into replicas across subclusters based on their available resources by using PropagationPolicy. The descheduling feature of ACK One Fleet is enabled by default and runs automatic checks every two minutes. If a pod remains unschedulable for more than 30 seconds, descheduling is triggered for that replica. | All regions | |
Cloud-native AI Suite | Slurm queue priority configuration supported | A new best practice topic describes how to use appropriate queue configuration strategies to achieve optimal performance in a Slurm system when jobs are submitted or job states change, in order to maximize task scheduling and processing efficiency. | All regions | |
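The dynamic distribution entry above is driven by a PropagationPolicy that divides replicas across member clusters. As an illustration only, the following sketch uses the open source Karmada PropagationPolicy API that ACK One's multi-cluster distribution is compatible with; the Deployment name `nginx` and the dynamic weighting by available replicas are assumptions for the example, not details stated in the release note:

```yaml
# Hypothetical example: distribute the replicas of a Deployment named
# "nginx" across member clusters in proportion to each cluster's
# currently available resources (Karmada-compatible API).
apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: nginx-propagation
spec:
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      name: nginx
  placement:
    replicaScheduling:
      replicaSchedulingType: Divided        # split replicas, do not duplicate
      replicaDivisionPreference: Weighted
      weightPreference:
        dynamicWeight: AvailableReplicas    # weight clusters by free capacity
```

With this policy in place, the descheduling check described above can move a replica to another member cluster if it stays unschedulable.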
February 2025
Product | Feature | Description | Region | References |
ACK | Security group and time zone changeable for the control plane | You can change the security group and time zone of the control plane of an ACK cluster. If they no longer meet your business requirements, change them on the Basic Information tab of the cluster details page in the ACK console. | All regions | |
ACK | containerd parameters customizable for node pools | You can customize the containerd parameters of a node pool. For example, you can configure multiple registry mirrors for an image registry or configure the container runtime to skip certificate verification when pulling container images from a registry. | All regions | |
ACK | Node pool scalability level displayed in the ACK console | The scalability levels of node pools are displayed in the ACK console. When instances are out of stock or specific instance types are not supported in the zones where a node pool is deployed, scale-out for the node pool may fail. You can evaluate the configuration availability and instance inventory sufficiency of a node pool based on its scalability level. The ACK console also provides suggestions for node pools based on their scalability levels. | All regions | |
ACK | Batch job orchestration supported | Argo Workflows is a Kubernetes-native workflow engine that lets you use YAML or Python to orchestrate concurrent jobs and simplify the automation and management of containerized applications. It is suitable for CI/CD pipelines, data processing, and machine learning. You can install the Argo Workflows component to enable batch job orchestration, and then use the Argo CLI or console to create and manage workflows. | All regions | |
ACK | GPU diagnostics supported | ack-node-problem-detector is a monitoring component that ACK develops for node exception detection based on the open source node-problem-detector project. The component provides a rich set of check items and enhanced GPU fault detection. When a GPU fault is detected, a Kubernetes event or node condition is generated to record the fault type and related information. | All regions | |
Distributed Cloud Container Platform for Kubernetes (ACK One) | Spark jobs schedulable and distributable across multiple clusters based on idle resources | You can use an ACK One Fleet instance and the ACK Koordinator component to schedule and distribute a Spark job across the clusters associated with the Fleet instance based on their idle resources. This helps you utilize idle resources in multiple clusters. You can configure job priorities and the colocation feature to prevent online services from being affected by the Spark job. | All regions | Use idle resources to schedule and distribute Spark jobs in multiple clusters |
ACK Edge | New pod vSwitches addable | In ACK Edge clusters that use the Terway Edge plug-in, if you run out of vSwitch IP addresses or need to expand your pod CIDR block, you can add new pod vSwitches to provide additional IP addresses for the cluster. | All regions | |
ACK Edge | GPU monitoring supported | ACK Edge clusters allow you to manage GPU-accelerated nodes in data centers and at the edge, and to manage heterogeneous computing power across multiple regions and environments. You can connect an ACK Edge cluster to Managed Service for Prometheus so that GPU-accelerated nodes in data centers and at the edge are monitored in the same way as nodes in the cloud. | All regions | Best practices for monitoring GPU resources in ACK Edge clusters |
Cloud-native AI Suite | Inference services deployable from DeepSeek distilled models in ACK | You can use KServe to deploy an inference service from a DeepSeek-R1-Distill-Qwen-7B model in an ACK cluster in a production environment. | All regions | Deploy an inference service from a DeepSeek distilled model in ACK |
Cloud-native AI Suite | Best practices for deploying the full DeepSeek model across multiple nodes in ACK released | A new topic provides best practices for deploying a distributed inference service from a DeepSeek-R1-671B model across multiple nodes in ACK. The topic describes how to use Arena to efficiently deploy a distributed inference service on two nodes with a hybrid parallelism strategy, and how to integrate a DeepSeek-R1 model deployed in ACK into the Dify platform to build an enterprise-level intelligent Q&A system that supports long text comprehension. | All regions | Practice for deploying the DeepSeek full version across multiple nodes in ACK |
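The customizable containerd parameters mentioned above map to containerd's standard registry configuration. A minimal sketch of the equivalent `/etc/containerd/config.toml` fragment, assuming a hypothetical mirror endpoint `mirror.example.com` and a private registry `registry.example.com` (the ACK console may expose these options as node pool configuration fields rather than raw TOML):

```toml
# Fragment of /etc/containerd/config.toml (containerd 1.x CRI plugin).
[plugins."io.containerd.grpc.v1.cri".registry]
  # Multiple mirrors for docker.io: tried in order when pulling images.
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
    endpoint = ["https://mirror.example.com", "https://registry-1.docker.io"]
  # Skip TLS certificate verification for a private registry
  # (for example, one that uses a self-signed certificate).
  [plugins."io.containerd.grpc.v1.cri".registry.configs."registry.example.com".tls]
    insecure_skip_verify = true
```

Skipping certificate verification weakens transport security, so it should be limited to trusted internal registries.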
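The batch job orchestration entry above is backed by Argo Workflows. A minimal Workflow manifest is sketched below; the image and command are illustrative, and the manifest can be submitted with `argo submit` or `kubectl create` after the Argo Workflows component is installed:

```yaml
# Minimal Argo Workflow: runs a single container step to completion.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-world-    # Argo appends a random suffix per run
spec:
  entrypoint: main
  templates:
    - name: main
      container:
        image: alpine:3.19
        command: [echo, "hello from Argo Workflows"]
```

Real pipelines replace the single step with a DAG or steps template to express fan-out and dependencies between concurrent jobs.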
January 2025
Product | Feature | Description | Region | References |
ACK | On-demand image loading supported to accelerate container startup in node pools | ACK supports on-demand image loading based on the Data Accelerator for Disaggregated Infrastructure (DADI) feature. This allows images to be downloaded on demand and image data to be decompressed online, which greatly reduces container startup time. | All regions | |
ACK | Alibaba Cloud Linux 3 Container-optimized supported | Alibaba Cloud Linux 3.2104 LTS 64-bit Container-optimized images are optimized for container scenarios based on the default standard images of Alibaba Cloud Linux, a cloud-native operating system. These images are developed in-house by Alibaba Cloud based on the extensive practical experience of a large number of ACK customers, and are suitable for container scenarios that require higher business deployment density, faster startup speeds, and stronger security isolation. | All regions | |
ACK | Kubernetes 1.32 supported | You can create ACK clusters that run Kubernetes 1.32 or upgrade ACK clusters from earlier Kubernetes versions to Kubernetes 1.32. | All regions | |
ACK | Resource utilization improvable by using ElasticQuotaTree and ack-kube-queue | To allow different teams and jobs to share computing resources in a cluster while ensuring effective resource allocation and isolation, you can use ack-kube-queue, ElasticQuotaTree, and ack-scheduler. | All regions | None |
ACK | Best practices for fine-grained resource management of ACK clusters by using resource groups released | A new topic provides best practices for using resource groups to manage resources in ACK clusters in a fine-grained manner. Resource groups allow you to sort resources into groups by department, project, and environment, and to use Resource Access Management (RAM) to isolate resources and manage resource permissions in a fine-grained manner within a single Alibaba Cloud account. | All regions | |
ACK One | Computing power of ACS available in ACK One registered clusters | The computing power of Alibaba Cloud Container Compute Service (ACS) is available in ACK One registered clusters. | All regions | |
ACK One | Cross-cluster Service access supported by using domain names | ACK One provides the multi-cluster Services (MCS) feature, which allows you to access Services across Kubernetes clusters by using domain names. This enables cross-cluster Service traffic routing without the need to modify your business code, the dnsConfig field of your business pods, or CoreDNS configurations. | All regions | |
ACK One | Multi-cluster resources accessible by using the SDK for Go | If you want to integrate ACK One Fleet instances into your platform to access the resources of each cluster, you can use the SDK for Go. | All regions | |
ACK Edge | ECS nodes supported for scaling activities in ACK Edge clusters | When the resources provided by on-premises machines in an ACK Edge cluster become insufficient, you can use the node auto scaling feature to automatically add ECS nodes to the cluster to supplement schedulable resource capacity. | All regions | |
ACK Edge | Elastic inference services deployable from LLMs in hybrid cloud environments | You can use the ack-kserve component and the auto scaling feature of ACK Edge clusters to deploy large language models (LLMs) as elastic inference services in hybrid cloud environments. This helps you flexibly schedule resources in the cloud and in data centers and reduce the operational costs of LLM inference services. | All regions | |
ACK Edge | GPU sharing supported | GPU sharing allows you to schedule multiple pods to the same GPU so that they share its computing resources. This improves GPU utilization and reduces costs. | All regions | |
ACK Edge | Centralized management of ECS resources in multiple regions supported | A new topic provides best practices for using an ACK Edge cluster to centrally manage compute resources that reside in multiple regions. This helps you implement full lifecycle management and efficient resource scheduling for cloud-native applications. | All regions | |
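The GPU sharing entry above schedules multiple pods onto one GPU by their GPU memory requests. A sketch of such a pod spec follows, assuming the `aliyun.com/gpu-mem` extended resource name that ACK's shared GPU scheduling documentation uses; the image name is a placeholder, and the exact resource name should be confirmed against your cluster's configuration:

```yaml
# Hypothetical pod that requests a 4 GiB slice of a shared GPU
# instead of claiming a whole device via nvidia.com/gpu.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-share-demo
spec:
  containers:
    - name: inference
      image: inference-image:latest   # placeholder image
      resources:
        limits:
          aliyun.com/gpu-mem: 4       # GiB of GPU memory on a shared GPU
```

Several pods declaring such limits can land on the same physical GPU, which is how utilization improves for small inference workloads.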
References
To view the historical release notes for ACK, see Historical release notes (before 2025).