Container Service for Kubernetes:ACK 2025 release notes

Last Updated: Mar 25, 2026

This article covers the 2025 release notes for Container Service for Kubernetes (ACK).

Important

For the latest release notes for Container Service for Kubernetes (ACK), see Release notes.

December 2025

Product

Feature

Description

Region

Related documentation

Container Service for Kubernetes

Node pools support pay-as-you-go vulnerability remediation from Security Center

You can enable operating system (OS) CVE vulnerability remediation to scan nodes for security vulnerabilities, receive suggestions, and apply quick fixes in the console. Before you use this feature, you must activate Security Center Ultimate or purchase pay-as-you-go vulnerability remediation.

All regions

Remediate OS CVE vulnerabilities in node pools

APIG Ingress support

APIG Ingress is a cloud-native API gateway built on the open-source Higress gateway. It is compatible with NGINX Ingress and ideal for API management and microservice scenarios.

All regions

Manage APIG Ingress

Kagent support

Kagent is a framework for building, deploying, and running AI applications on Kubernetes. Once deployed, Kagent lets you use declarative APIs to create agents and MCP servers, and integrate with multiple large language models.

All regions

Kagent
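As a rough illustration of the declarative style, a Kagent agent is defined as a custom resource along these lines. The API version, field names, and the referenced model configuration below are assumptions based on the upstream Kagent project and may differ in the version shipped with ACK:

```yaml
# Hypothetical sketch of a Kagent Agent resource; verify field names
# against the Kagent version installed in your cluster.
apiVersion: kagent.dev/v1alpha1      # assumed API group/version
kind: Agent
metadata:
  name: ops-assistant
spec:
  description: Answers basic cluster operations questions.
  systemMessage: You are a helpful Kubernetes operations assistant.
  modelConfig: default-model-config  # reference to a ModelConfig resource (assumed)
```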

Manage tags for NAS, OSS, and CPFS with CNFS

CNFS allows you to add tags to NAS, OSS, and CPFS cloud storage resources. This enables fine-grained classification and permission management, improving resource governance efficiency.

All regions

Manage tags for NAS, OSS, and CPFS

Use Kyverno as a policy engine

Kyverno is a Kubernetes-native policy engine that defines and enforces security, compliance, and automation policies by using a Policy-as-Code approach. Compared with OPA Gatekeeper, which is integrated into clusters by default, Kyverno allows you to define policies by using YAML without the need to learn Rego. Kyverno also supports mutating and generating resources at the admission stage. This makes it ideal for scenarios that require highly customized policies, automated O&M, or multi-cluster policy governance.

All regions

Use Kyverno as a policy engine
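As a small example of the YAML-based, Rego-free approach described above, the following Kyverno ClusterPolicy rejects Pods that lack a `team` label. The policy name and label key are illustrative:

```yaml
# Require every Pod to carry a `team` label (illustrative policy).
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-team-label
spec:
  validationFailureAction: Enforce   # reject non-compliant requests at admission
  rules:
    - name: check-team-label
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "The label `team` is required on all Pods."
        pattern:
          metadata:
            labels:
              team: "?*"             # any non-empty value
```

Apply it with `kubectl apply -f`; subsequent Pod creations without the label are denied at the admission stage.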

Secure confidential container environments with remote attestation

PeerPod remote attestation ensures that confidential containers run in a genuine and untampered confidential computing environment, such as Intel TDX. It provides end-to-end security for sensitive workloads by automatically verifying nodes before container deployment and allowing applications to obtain environmental attestations on demand at runtime.

All regions

Use remote attestation to secure confidential container environments

Deploy A2A protocol servers in Knative

Agent2Agent (A2A) is an open standard designed to enable seamless communication and collaboration between AI agents. By deploying an A2A server in Knative, you can leverage features such as auto scaling (including scale to zero) to achieve on-demand resource usage and rapid version iteration.

All regions

Deploy A2A in Knative
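The scale-to-zero behavior mentioned above is configured through standard Knative autoscaling annotations. A minimal sketch, in which the container image is a placeholder for your A2A server:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: a2a-server
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/min-scale: "0"   # allow scale to zero when idle
        autoscaling.knative.dev/max-scale: "10"  # cap replicas under load
    spec:
      containers:
        - image: registry.example.com/a2a-server:latest  # placeholder image
          ports:
            - containerPort: 8080
```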

ack-agent-gateway best practices

  • Agent2Agent (A2A) traffic governance and authentication: To enable AI agent applications to quickly serve external traffic, install the ack-agent-gateway extension based on the Gateway API for fine-grained management of A2A protocol traffic.

  • Build an MCP service gateway: To expose MCP services in an ACK cluster to external LLMs, install the ack-agent-gateway extension based on the Gateway API to quickly and securely route MCP traffic.

All regions

ACK One

FederatedHPA best practice based on vLLM custom metrics

Due to traffic fluctuations, large language model (LLM) online services are commonly deployed in a multi-cluster architecture. The ACK One multi-cluster solution is ideal for this scenario. This tutorial describes how to deploy a vLLM inference service in a cloud environment based on an ACK One fleet and use FederatedHPA for cross-cluster elastic scaling.

All regions

FederatedHPA best practice based on vLLM custom metrics

November 2025

Product

Feature

Description

Region

Related documentation

Container Service for Kubernetes

Support for deploying and running GPU workloads in smart managed mode

After you enable smart managed mode for a cluster, you can use smart managed node pools to dynamically scale GPU resources. This significantly reduces costs for GPU workload scenarios with variable demand, such as online inference.

All

Deploy and run GPU workloads

New servicemesh-operator component

The servicemesh-operator component simplifies the deployment, upgrade, and configuration management of Service Mesh (ASM) in ACK clusters, letting you quickly enable powerful ASM features such as traffic management, security, and observability.

All

servicemesh-operator

New built-in FinOps rule library

You can configure security policies for Pods to validate Pod deployment and update requests. The ACK cluster policy management feature provides multiple built-in rule libraries, including Compliance, Infra, K8s-general, PSP, and FinOps.

All

Container security policy rule libraries

Support for deploying MCP Server on Knative

By hosting MCP Server on Knative, you can leverage the benefits of its Serverless architecture, such as on-demand autoscaling and event-driven capabilities for AI services.

All

Deploy MCP Server on Knative

Best practices for configuring rolling updates and graceful shutdown

To ensure zero-downtime application updates in ACK clusters, you can configure Deployment settings such as readiness probes, readinessGates, preStop hooks, and Server Load Balancer (SLB) graceful shutdown. This enables smooth traffic migration and ensures continuous high availability for your services.

All

Implement zero-downtime rolling deployments
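The settings listed above combine in a Deployment roughly as follows. The probe path, sleep duration, and image are illustrative and should be tuned to your application and the health-check interval of your SLB instance:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0                        # never drop below desired capacity
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: registry.example.com/web:v2   # placeholder image
          readinessProbe:                      # gate traffic until the app is ready
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
          lifecycle:
            preStop:                           # drain in-flight requests before exit
              exec:
                command: ["sleep", "15"]
      terminationGracePeriodSeconds: 30
```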

ACK One (Distributed Cloud Container Platform)

Best practices for multi-cluster priority-based elastic scheduling

The ACK One fleet supports AI inference services. In multi-cluster scenarios that span across regions or hybrid clouds, you can set cluster priorities to prioritize resources from an Internet Data Center (IDC) or a primary region, while using resources on Alibaba Cloud or in a secondary region as backup capacity. Combined with inventory-aware scheduling, this approach ensures business continuity.

All

Multi-cluster priority-based elastic scheduling

October 2025

Product

Feature

Description

Region

Related documentation

Container Service for Kubernetes

Schedule GPUs with DRA

For AI training and inference scenarios that require sharing GPU resources, deploy the NVIDIA DRA driver in your ACK cluster to overcome the limitations of traditional device plugins. This enables dynamic GPU allocation and fine-grained resource control between pods through the Kubernetes DRA API, improving GPU utilization and reducing costs.

All

Schedule GPUs with DRA
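With the NVIDIA DRA driver installed, a Pod requests a GPU through the Kubernetes DRA API roughly as follows. The API version and device class name reflect recent upstream conventions and are assumptions that may vary with your Kubernetes and driver releases:

```yaml
apiVersion: resource.k8s.io/v1beta1          # DRA API version in recent releases (assumed)
kind: ResourceClaimTemplate
metadata:
  name: single-gpu
spec:
  spec:
    devices:
      requests:
        - name: gpu
          deviceClassName: gpu.nvidia.com    # device class published by the NVIDIA DRA driver (assumed)
---
apiVersion: v1
kind: Pod
metadata:
  name: gpu-consumer
spec:
  containers:
    - name: main
      image: registry.example.com/cuda-app:latest  # placeholder image
      resources:
        claims:
          - name: gpu                        # consume the claim declared below
  resourceClaims:
    - name: gpu
      resourceClaimTemplateName: single-gpu
```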

ACK One

ACS GPU-HPN capacity reservation for registered clusters

Register your on-premises Kubernetes clusters with ACK One to use the GPU-HPN capacity reservation feature. This enables unified management and intelligent scheduling of GPU resources across hybrid cloud environments, providing stable, high-performance computing power for critical workloads like AI training and inference.

All

Example: Use ACS GPU HPN computing power in an ACK One registered cluster

Collect control plane metrics with a self-managed Prometheus

For hybrid cloud environments that use a self-managed Prometheus system, you can monitor the control plane health of ACK One registered clusters. Install the Metrics Aggregator component and configure a ServiceMonitor to integrate core component metrics into your existing monitoring system. This enables unified alerting and observability.

All

Collect control plane metrics with a self-managed Prometheus

Cloud-native AI Suite

Submit eRDMA-accelerated PyTorch distributed training jobs with Arena

When network latency bottlenecks multi-node GPU training, use Arena to shorten model training cycles. Submit a PyTorch distributed training job and configure eRDMA network acceleration to achieve low-latency, high-throughput communication between nodes. This improves training efficiency and cluster utilization.

All

Submit eRDMA-accelerated PyTorch distributed training jobs with Arena

September 2025

Product

Feature

Description

Region

Related documentation

Container Service for Kubernetes

Support for Kubernetes 1.34

ACK now supports Kubernetes 1.34. You can create new clusters that run version 1.34 or upgrade existing compatible clusters to this version.

All

Kubernetes 1.34

Support for hybrid cloud node pools

Create a hybrid cloud node pool in an ACK Pro cluster to manage on-premises servers and cloud resources from a single orchestration plane. Add existing hybrid cloud nodes to your cluster to enable elastic scaling and optimize costs by using your current IT assets.

All

Create and manage a hybrid cloud node pool

Support for configuring DNS resolution for hybrid cloud node pools

When a hybrid cloud node pool resolves domain names through CoreDNS in the cloud, frequent access can increase the load on your leased line and may lead to DNS resolution failures if the connection is unstable. To mitigate these issues, you can configure NodeLocal DNSCache to cache DNS queries locally on each node.

All

Configure NodeLocal DNSCache for a hybrid cloud node pool

Support for the Terway-Hybrid network plugin

Connecting a hybrid cloud node pool to an on-premises data center introduces complex network topologies and cross-domain routing requirements that standard container network plugins cannot handle. The Terway-Hybrid network plugin is designed for hybrid cloud node pools and ensures seamless network connectivity for Pods across your data center and the cloud.

All

Use the Terway-Hybrid network plugin

RRSA authentication for ossfs 2.0

For applications that require persistent storage or need to share data between Pods, you can mount an OSS bucket as an ossfs 2.0 volume by using a dynamic PV. We recommend using RRSA for authentication because it enhances security by providing auto-rotating temporary credentials and Pod-level permission isolation. This method is ideal for production, multi-tenant, and other high-security environments.

All

Use an ossfs 2.0 dynamic volume

ACK One (Distributed Cloud Container Platform)

Support for integrating cloud-based GPU compute power

By providing unified scheduling and O&M for heterogeneous computing resources, ACK One registered clusters significantly improve resource utilization.

All

Integrate cloud-based GPU compute power

Migrate single-cluster applications to a fleet for multi-cluster distribution

Use the AMC command-line tool to deploy an application to multiple clusters. This reduces repetitive work, prevents configuration drift, and enables centralized management with automatic synchronization for future updates.

All

Migrate single-cluster applications to a fleet and distribute them to multiple clusters

August 2025

Product

Feature

Description

Region

Related documentation

Container Service for Kubernetes

KV Cache-aware load balancing with intelligent inference routing

Designed for generative AI inference, KV Cache-aware load balancing dynamically routes requests to the optimal compute node, significantly improving the efficiency of large language model (LLM) services.

All regions

Use prefix cache-aware routing in precise mode

Support for custom CNI plugins

While the default Terway and Flannel CNI plugins in ACK meet most container networking needs, some scenarios require other plugins. ACK now supports a Bring Your Own Container Network Interface (BYOCNI) mode, letting you install a custom CNI plugin in your cluster.

All regions

Use a custom CNI plugin in an ACK cluster

Managed policy governance for smart managed mode clusters

Enable the security policy management feature to meet compliance requirements and enhance cluster security. Security policy rules include Infra, Compliance, PSP, and K8s-general.

All regions

Enable security policy management

Knative support for ACS compute resources

You can now configure Knative Services to use compute resources from Alibaba Cloud Container Compute Service (ACS). This configuration lets you leverage diverse compute types and service levels to meet various workload demands and optimize costs.

All regions

Use ACS resources

More flexible configurations for Gateway with Inference Extension

  • Customize inference extension configurations: You can adjust routing policies through annotations or modify or override the extension's deployment configuration by creating a ConfigMap.

  • Customize Gateway configurations: You can modify the EnvoyProxy resource to adjust gateway parameters such as the Service type, replica count, and resource allocation.

All regions
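The second bullet refers to the EnvoyProxy resource from the open-source Envoy Gateway project. A sketch of adjusting the replica count and Service type follows; the field names track the upstream `gateway.envoyproxy.io` API and may differ across component versions:

```yaml
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: custom-proxy-config
spec:
  provider:
    type: Kubernetes
    kubernetes:
      envoyDeployment:
        replicas: 2           # scale the gateway data plane
      envoyService:
        type: ClusterIP       # for example, keep the gateway internal
```

The resource is typically referenced from a GatewayClass through its `parametersRef` field.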

Securely deploy vLLM inference services in ACK heterogeneous confidential computing clusters

Large language model (LLM) inference involves sensitive data and core model assets, placing them at risk of exposure in untrusted environments. The Confidential AI solution for ACK (ACK-CAI) mitigates this risk by integrating hardware-based confidential computing technologies, such as Intel TDX and GPU TEE, to provide end-to-end security for model inference.

All regions

Securely deploy vLLM inference services in ACK heterogeneous confidential computing clusters

Cloud-native AI Suite

Introducing the AI Serving Stack

As large language models (LLMs) become more widespread, efficiently deploying and managing them in production has become a key challenge for enterprises. The AI Serving Stack, built on ACK, is an end-to-end solution for cloud-native AI inference. It manages the entire LLM inference lifecycle with integrated capabilities for deployment, intelligent routing, elastic scaling, and in-depth observability. Whether you are getting started or running large-scale AI services, the AI Serving Stack simplifies cloud-native AI inference.

All regions

AI Serving Stack

July 2025

Product

Feature

Description

Region

Related documentation

Container Service for Kubernetes

Access ECS instance metadata in enforced mode only

You can use the metadata service to retrieve metadata from an ECS instance, such as its instance ID, VPC information, and ENI information. By default, the metadata access mode for nodes in an ACK cluster supports both normal and enforced modes. You can now configure nodes to use enforced mode only (IMDSv2) to enhance the security of the instance metadata service.

All regions

Access ECS instance metadata in enforced mode only

Subscribe to images from international registries

You can now use the artifact subscription feature in an Enterprise Edition instance of Container Registry (ACR) to automatically synchronize images from international registries, such as Docker Hub, GCR, and Quay.

All regions

Obtain images from international registries through artifact subscription

Mount NAS by using the EFC client through CNFS

Extreme File Cache (EFC) improves the performance of Apsara File Storage NAS with features like distributed caching. It supports high concurrency and parallel access to large-scale datasets, making it ideal for data-intensive containerized applications such as big data analytics, and AI training and inference. Compared to mounting NAS with the standard NFS protocol, using EFC accelerates file access and improves read/write performance.

All regions

Mount NAS by using the EFC client through CNFS

ACK One (Distributed Cloud Container Platform)

Console-based management for GitOps

You can now manage the full suite of GitOps capabilities from the console. This includes enabling or disabling the feature, configuring public access and ACLs, accessing the ApplicationSet UI, managing the Argo CD ConfigMap, restarting components, and viewing monitoring and log data.

All regions

GitOps quick start

Argo CD ConfigMap configuration for multi-cluster GitOps

ACK One lets you manage GitOps-related features and permissions by configuring the Argo CD ConfigMap.

All regions

Configure the Argo CD ConfigMap

Inventory-aware elastic scheduling for multi-cluster fleets

ACK One now provides an inventory-aware intelligent scheduler for multi-cluster fleets in multi-region deployments. When a cluster in the fleet lacks sufficient resources, the scheduler automatically deploys applications to another cluster that has available inventory. The target cluster then uses its instant elasticity feature to scale up nodes as needed, improving scheduling success rates and reducing resource costs.

All regions

Cross-region multi-cluster elastic scheduling based on inventory awareness

Container Service for Edge (ACK@Edge)

Configure a private connection for a leased line connection

ACK@Edge clusters can now connect to the cloud over a leased line connection. This enables edge nodes to securely and efficiently access cloud services such as ACK and Container Registry (ACR), resolving common issues like network conflicts and a lack of fixed IP addresses.

All regions

Configure a private connection for a leased line connection

June 2025

Product

Feature

Description

Region

Related documentation

Container Service for Kubernetes

AI Profiling

AI Profiling is a non-invasive performance analysis tool for Kubernetes container environments. Using eBPF and dynamic process injection, it detects container processes that run GPU tasks. You can dynamically start or stop performance data collection on running workloads without modifying application code. This on-demand attachment and detachment allows for detailed, real-time analysis of production services.

All regions

AI Profiling

GPU node auto-healing

Node auto-healing now repairs instances affected by GPU hardware and software failures.

ACK provides Kubernetes-native auto-healing for EGS and Lingjun node failures. This feature automates the entire operational lifecycle, from fault detection and alerting to automatic isolation, node drain, and automated repair. You can also require user authorization before initiating repairs, which enhances automated fault management and reduces cluster O&M costs.

All regions

Enable Node Auto-healing

CPFS for AI static volumes

CPFS for AI delivers ultra-high throughput and IOPS with end-to-end RDMA network acceleration, making it ideal for AI computing scenarios like AIGC and autonomous driving. You can create CPFS for AI static volumes in your cluster and use them in your workloads.

All regions

Use CPFS for AI Static Volumes

ACK VPD CNI component

ACK VPD CNI provides container network management capabilities for Lingjun nodes in ACK Pro clusters. As the container network CNI plugin for Lingjun nodes, it allocates and manages container network resources for nodes that use Lingjun connections.

All regions

ACK VPD CNI

ack-kms-agent-webhook-injector component

The ack-kms-agent-webhook-injector injects the KMS Agent as a sidecar container into your Pods. Applications can then fetch credentials from a KMS instance through a local HTTP interface and cache them in memory. This approach avoids hard-coding sensitive information and enhances data security.

All regions

Import Alibaba Cloud KMS Service Credentials for Applications

Expanded capabilities for Gateway with Inference Extension

Gateway with Inference Extension now supports multiple generative AI inference frameworks, including vLLM and SGLang. It enhances services deployed on these frameworks with features like canary releases, inference load balancing, and model name-based routing. You can also configure rate limiting and circuit breaking policies for your inference services.

All regions

Gateway with Inference Extension Traffic Management and Inference Service Management

CAA solution for confidential containers on confidential virtual machines

For scenarios that require confidential computing, such as financial risk control and healthcare, you can deploy confidential computing workloads in ACK clusters by using the Cloud API Adaptor (CAA) solution. This solution uses Intel® TDX technology to protect sensitive data from external attacks and potential threats from the cloud provider, helping you meet industry compliance requirements.

All regions

Implement CAA Confidential Container Solution Based on Confidential VMs

Cloud-native AI Suite

Schedule Dify workflows with XXL-JOB

In many scenarios, Dify workflows require a scheduler to automate tasks such as risk monitoring, data analysis, content generation, and data synchronization. However, Dify does not include a built-in scheduler. This guide shows how to integrate XXL-JOB, a distributed task scheduler, to schedule and monitor your workflow applications and ensure their stability.

All regions

Schedule Dify Workflow Applications via XXL-JOB

May 2025

Product

Feature

Description

Region

Related documentation

Container Service for Kubernetes

Support for Kubernetes 1.33

ACK supports Kubernetes 1.33. You can create new clusters that run version 1.33 or upgrade existing clusters to this version.

All regions

Kubernetes 1.33

Default installation of ack-ram-authenticator component

Starting with Kubernetes 1.33, new ACK managed clusters automatically install the latest version of the managed ack-ram-authenticator component, without consuming additional cluster node resources.

All regions

Product announcement: ack-ram-authenticator installed by default on ACK managed clusters starting with Kubernetes 1.33

containerd 2.1.1 is available

containerd 2.1.1 introduces new features, such as the Node Resource Interface (NRI), Container Device Interface (CDI), and Sandbox API.

All regions

containerd runtime release notes

Support for ossfs 2.0

ossfs 2.0 is a Filesystem in User Space (FUSE)-based client that lets you mount an OSS bucket as a local file system. This allows application containers to access data in OSS using standard POSIX file operations. Compared to ossfs 1.0, ossfs 2.0 delivers improved sequential read/write performance and higher throughput for concurrent small-file reads. It is ideal for workloads that require high storage access performance, such as AI training and inference, big data processing, and autonomous driving.

All regions

ossfs 2.0

ACK One

Use ApplicationSet to coordinate multi-environment deployments and application dependencies

This new best practice guide demonstrates how to combine the Progressive Syncs feature of Argo CD with the multi-environment orchestration capabilities of ApplicationSet. Learn to build an automated deployment system that manages application dependencies between development and pre-production environments.

All regions

Use ApplicationSet to coordinate multi-environment deployments and application dependencies

April 2025

Product

Feature

Description

Region

Related documentation

Container Service for Kubernetes

Create and manage Lingjun node pools

You can create and manage a Lingjun node pool in an ACK Pro managed cluster.

All regions

Lingjun node pools

Configure a node pool by specifying instance attributes

You can configure a node pool by specifying instance attributes, such as the number of vCPUs and the amount of memory. The node pool then automatically selects suitable instance types that meet your requirements during scale-out, improving the success rate of scaling operations.

All regions

Configure a node pool by specifying instance attributes

Real-time AI Profiling

AI Profiling is a non-intrusive performance analysis tool that uses eBPF and dynamic process injection to perform online diagnostics of container processes running GPU tasks in Kubernetes. You can dynamically attach and detach the profiler to perform real-time analysis of live services without modifying your application code.

All regions

Use AI Profiling from the command line

Enable preemption

When cluster resources are tight, high-priority tasks may fail to run. After you enable preemption, ACK Scheduler simulates scheduling decisions and evicts low-priority Pods to free compute resources, allowing high-priority tasks to start faster.

All regions

Enable preemption
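Preemption is driven by standard Kubernetes Pod priorities. A minimal sketch, with an illustrative priority value and placeholder image:

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000                  # Pods at this priority may preempt lower-priority Pods
globalDefault: false
description: For latency-critical workloads that may preempt others.
---
apiVersion: v1
kind: Pod
metadata:
  name: critical-job
spec:
  priorityClassName: high-priority
  containers:
    - name: main
      image: registry.example.com/job:latest   # placeholder image
```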

Access services through Gateway with Inference Extension

The Gateway with Inference Extension component is built on the Envoy Gateway project. It supports all the basic capabilities of the Gateway API and the extended resources of the open-source Envoy Gateway.

All regions

Access services through Gateway with Inference Extension

Generative AI service enhancements

You can use the Gateway with Inference Extension component to implement features such as intelligent routing, efficient traffic management, canary releases for generative AI inference services, circuit breaking for inference requests, and traffic mirroring.

All regions

Generative AI service enhancements

Back up and restore volumes from PVC to PVC

You can back up and restore cloud disk data within the same ACK cluster, or between ACK clusters in the same or different regions. After a backup completes in the source cluster, you can use the Backup Center to restore the data to a new set of persistent volume claims and their corresponding volumes in the same or a different cluster. The restored volumes can be mounted directly without modifying any workload YAML configurations.

All regions

Backup Center

alibabacloud-privateca-issuer released

The Alibaba Cloud Private CA Issuer is now available. It lets you create and manage Alibaba Cloud PCA certificates in your cluster by using cert-manager. The issuer is available in the ACK App Market.

All regions

None

Deploy a workload and implement load balancing in an ACK managed cluster (smart managed mode)

Learn how to deploy a workload in an ACK managed cluster (smart managed mode) and expose it to the internet by using an ALB Ingress. Once the steps are complete, you can access the application through a specified domain name for efficient management and load balancing of external traffic.

All regions

Deploy a workload and implement load balancing

Datapath V2 best practices

Learn how to optimize the network configuration of your cluster after you enable Datapath V2 in a cluster that uses the Terway network plugin. This includes configuring Conntrack parameters and managing Identity resources to improve cluster performance and stability.

All regions

Datapath V2 best practices

Dify component upgrade guide

Upgrade ack-dify from an earlier version to v1.0.0 or later. The process includes backing up data, installing the plugin migration tool, and enabling the new plugin ecosystem.

All regions

Upgrade the Dify component in an ACK cluster

ACK One

Use PrivateLink to resolve IP conflicts in a data center network

After you connect a Kubernetes cluster in your data center to an ACK One registered cluster over a leased line, IP address conflicts may occur when you use Serverless computing resources if other services in your internal network use the same CIDR block. You can use PrivateLink to resolve these IP address conflicts.

All regions

Use PrivateLink to resolve IP conflicts in a data center network

Schedule ACS Pods across regions

An ACK One registered cluster can seamlessly integrate Serverless computing resources from multiple regions into a Kubernetes cluster. This enables dynamic scheduling and unified management of GPU resources across regions.

All regions

Schedule ACS Pods across regions

Log collection

You can configure log collection by using SLS CRDs or environment variables to automatically collect container logs with Alibaba Cloud Log Service (SLS).

All regions
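With the Logtail component installed, the environment-variable method tags a container for collection roughly as follows. The `aliyun_logs_<logstore>` naming convention is used by Alibaba Cloud's Logtail agent; the logstore name here is illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-logs
spec:
  containers:
    - name: app
      image: registry.example.com/app:latest   # placeholder image
      env:
        - name: aliyun_logs_app-stdout   # target logstore name ("app-stdout" is illustrative)
          value: stdout                  # collect the container's standard output
```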

ACK Edge

Version 1.32 released

ACK Edge now supports Kubernetes 1.32. Among other improvements, this version optimizes requests from CoreDNS, kube-proxy, and kubelet to the kube-apiserver, reducing cloud-to-edge communication traffic.

All regions

ACK Edge Kubernetes 1.32 release notes

Network element configuration in a leased line environment

You can connect server devices from your on-premises data center to ACK for containerized management over the internet or a leased line. When connecting over a leased line, you must first configure the network elements of your infrastructure.

All regions

Configure network elements in a leased line environment

AI Engineering Suite

HistoryServer component support

The native Ray Dashboard is only available while a cluster is running, preventing access to historical logs and monitoring data after the cluster is terminated. The RayCluster HistoryServer solves this by collecting node logs in real time and persisting them to OSS while the cluster is running.

All regions

Install the HistoryServer component in ACK

KubeRay component support

You can deploy the KubeRay Operator component and integrate it with Alibaba Cloud SLS and Prometheus for monitoring. This enhances log management, system observability, and high availability.

All regions

Install the KubeRay component in ACK

March 2025

Product

Feature

Description

Region

Related documentation

Container Service for Kubernetes

ACK Pro managed clusters support smart managed mode

When you create an ACK managed cluster, you can enable smart managed mode to quickly provision a Kubernetes cluster that follows best practices.

After the cluster is created, ACK automatically provisions a smart managed node pool. This node pool dynamically scales based on workload demand. ACK also handles all operational tasks for this node pool, including OS version upgrades, software updates, and security patching.

All regions

Enable tracing for control plane and data plane components

After you enable tracing for the cluster API Server or kubelet, trace data automatically flows to Managed Service for OpenTelemetry. This integration provides detailed trace visualizations, real-time topology maps, and other monitoring data.

All regions

High-risk KubeConfig SMS and email notifications

You can now receive SMS and email alerts for high-risk KubeConfig files that still pose a security risk after deletion.

All regions

None

Intelligent routing and traffic management with ACK Gateway with Inference Extension

Use the ACK Gateway with Inference Extension component to enable intelligent routing and efficient traffic management for your inference services.

All regions

Implement intelligent routing and traffic management with ACK Gateway with Inference Extension

ACK One (Distributed Cloud Container Platform)

Unified component management for multi-cluster fleets

ACK One fleets provide cluster operators a unified, automated way to manage components. You can define baselines that include multiple components and their versions, and then deploy these baselines to multiple clusters. The feature also supports component configuration, batch deployments, and rollbacks to improve system stability.

All regions

Multi-cluster component management

Dynamic distribution and rescheduling

An ACK One fleet can use a PropagationPolicy to distribute workload replicas across member clusters based on their available resources. By default, the fleet's rescheduling feature automatically checks every two minutes for unschedulable Pods. If a Pod remains in this state for more than 30 seconds, the feature triggers a rescheduling of that replica.

All regions

Dynamic distribution and rescheduling
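Distribution policies of this kind follow the Karmada-style PropagationPolicy API. The sketch below divides a Deployment's replicas across two member clusters by available capacity; the API group shown is the open-source Karmada one, and the cluster names are placeholders, so both are assumptions that may differ in an ACK One fleet:

```yaml
apiVersion: policy.karmada.io/v1alpha1    # assumed; ACK One fleets expose a Karmada-compatible API
kind: PropagationPolicy
metadata:
  name: web-propagation
spec:
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      name: web
  placement:
    clusterAffinity:
      clusterNames: [cluster-a, cluster-b]  # placeholder member cluster names
    replicaScheduling:
      replicaSchedulingType: Divided
      replicaDivisionPreference: Weighted
      weightPreference:
        dynamicWeight: AvailableReplicas    # divide replicas by available resources
```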

Cloud-native AI Suite

Set Slurm queue priorities

This new best practice guide explains how to configure queue policies in a Slurm environment for optimal task scheduling and performance when jobs are submitted or their states change.

All regions

Set Slurm queue priorities in an ACK cluster

February 2025

Product

Feature

Description

Region

Related documentation

Container Service for Kubernetes

Support for modifying control plane security groups and time zones

If the security group or time zone selected during cluster creation no longer meets your requirements, you can modify the control plane security group and the cluster time zone on the cluster's Basic Information page.

All regions

View cluster information

Node pools support custom containerd configurations

You can customize containerd parameters for nodes in a node pool. For example, you can configure multiple mirror repositories for a specific image registry or bypass security certificate verification for a specific registry.

All regions

Customize containerd parameter configurations for a node pool

Elasticity strength indicator for node pools

A node pool scale-out may fail due to insufficient instance inventory or an unsupported instance type in the selected availability zone. The elasticity strength indicator helps you assess the availability of your node pool configuration and the health of the instance supply, and provides configuration recommendations.

All regions

View the elasticity strength of a node pool

Support for batch task orchestration

Argo Workflows is a Kubernetes-native workflow engine that orchestrates parallel tasks using YAML or Python. It simplifies the automation and management of containerized applications for use cases such as CI/CD pipelines, data processing, and machine learning. You can enable batch task orchestration by installing the Argo Workflows component and then create and manage workflow tasks with the Alibaba Cloud Argo CLI or the console.

All regions

Enable batch task orchestration
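To illustrate the YAML-based orchestration mentioned above, here is a minimal Argo Workflows task. The workflow name prefix, image, and command are illustrative placeholders; only the Argo `Workflow` API itself is assumed.

```yaml
# Minimal sketch of an Argo Workflows batch task.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: batch-demo-    # Argo appends a random suffix
spec:
  entrypoint: main
  templates:
    - name: main
      container:
        image: alpine:3.19
        command: [sh, -c]
        args: ["echo processing batch item"]  # placeholder task logic
```

Submitting this manifest (for example, with the Argo CLI or `kubectl create`) runs the container to completion as a one-off batch task; multi-step DAGs follow the same template structure.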

GPU fault detection

The ack-node-problem-detector component provided by ACK is an enhanced version of the open-source node-problem-detector project that improves the monitoring of node anomaly events. It offers a comprehensive set of GPU-specific fault detection checks to enhance fault discovery in GPU scenarios. When a fault is detected, it generates a corresponding Kubernetes Event or Kubernetes Node Condition based on the fault type.

All regions

GPU fault detection and automatic isolation

ACK One

Schedule and distribute multi-cluster Spark jobs based on actual remaining resources

This topic describes how to use an ACK One fleet and the ACK Koordinator component to schedule and distribute multi-cluster Spark jobs based on actual remaining resources, rather than requested resources. This approach maximizes idle resource utilization across multiple clusters and uses priority control with hybrid deployment of online and offline workloads to ensure the stability of online services.

All regions

Schedule and distribute multi-cluster Spark jobs based on actual remaining resources

ACK Edge

Support for adding pod vSwitches

In ENS edge scenarios, if an ACK Edge cluster uses the Terway Edge plugin, you can add a new pod vSwitch to increase the number of available IP addresses for the cluster. This is useful when the existing vSwitch runs out of IP addresses or you need to expand the pod CIDR block.

All regions

Add a pod vSwitch

GPU resource monitoring

An ACK Edge cluster can manage GPU nodes in data centers and at the edge, providing unified management of heterogeneous compute power across multiple regions and environments. You can integrate Prometheus monitoring with your ACK Edge cluster to provide your on-premises and edge GPU nodes with the same level of observability as your cloud resources.

All regions

Best practices for monitoring GPU resources of an ACK Edge cluster

Cloud Native AI Suite

Deploy a DeepSeek distilled model inference service on ACK

This topic explains how to use KServe to deploy a production-ready DeepSeek distilled model inference service on ACK, using the DeepSeek-R1-Distill-Qwen-7B model as an example.

All regions

Deploy a DeepSeek distilled model inference service on ACK
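The KServe deployment described above can be sketched with an `InferenceService` that runs a custom serving container. This is an assumption-laden sketch, not the topic's exact manifest: the image, GPU count, and serving arguments are placeholders you would replace with your own.

```yaml
# Hypothetical sketch of a KServe InferenceService for a distilled model.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: deepseek-r1-distill-qwen-7b
spec:
  predictor:
    containers:
      - name: kserve-container
        image: your-registry/your-vllm-image:latest   # placeholder image
        args: ["--model", "/models/DeepSeek-R1-Distill-Qwen-7B"]  # placeholder
        resources:
          limits:
            nvidia.com/gpu: "1"   # one GPU for the 7B distilled model
```

KServe then manages the service's revisions and exposes an inference endpoint for the predictor.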

Tutorial: Deploy a full-parameter DeepSeek inference service on ACK using multi-machine distributed deployment

This tutorial provides a solution for the distributed deployment of the large-scale DeepSeek-R1-671B model on ACK. This solution uses a hybrid parallel strategy and the Arena tool to efficiently deploy the model across two nodes. It also shows you how to integrate the deployed DeepSeek-R1 service with the Dify platform to quickly build an enterprise-grade Q&A system that supports long-context understanding.

All regions

Tutorial: Deploy a DeepSeek full-parameter inference service by using multi-machine distributed deployment on ACK

January 2025

Product

Feature

Description

Region

Related documentation

Container Service for Kubernetes

On-demand image acceleration for node pools

ACK supports on-demand loading of container images, powered by DADI (Data Accelerator for Disaggregated Infrastructure). This technology eliminates full image downloads and decompresses data on the fly, significantly reducing application startup time.

All regions

Accelerate container startup by using on-demand image loading

Support for Alibaba Cloud Linux 3 Container Optimized Edition

Alibaba Cloud Linux 3 Container Optimized Edition (Alibaba Cloud Linux 3.2104 LTS 64-bit Container Optimized Edition) is an image based on the standard Alibaba Cloud Linux image and optimized for containerized environments. Drawing on extensive experience serving ACK customers, Alibaba Cloud developed this cloud-native operating system to meet the demands of container scenarios, such as higher deployment density, faster startup, and stronger security isolation.

All regions

Support for Kubernetes 1.32

ACK supports Kubernetes 1.32. You can create new clusters that run version 1.32 or upgrade existing clusters to this version.

All regions

(End of support) Kubernetes 1.32

Improve resource utilization with ElasticQuotaTree and task queues

ack-kube-queue, ElasticQuotaTree, and ack-scheduler enable fair and isolated resource allocation, allowing different teams and tasks to share compute resources within a cluster.

All regions

None
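The quota sharing described above is configured through an ElasticQuotaTree. The sketch below assumes the CRD shape used by ack-scheduler; the team names, namespaces, and quota values are placeholders.

```yaml
# Hypothetical sketch: a root quota shared by two teams. Each team is
# guaranteed its "min" and can borrow idle capacity up to its "max".
apiVersion: scheduling.sigs.k8s.io/v1beta1
kind: ElasticQuotaTree
metadata:
  name: elasticquotatree
  namespace: kube-system
spec:
  root:
    name: root
    min: {cpu: "40", memory: 160Gi}
    max: {cpu: "40", memory: 160Gi}
    children:
      - name: team-a
        namespaces: [team-a]          # placeholder namespace
        min: {cpu: "10", memory: 40Gi}
        max: {cpu: "30", memory: 120Gi}
      - name: team-b
        namespaces: [team-b]          # placeholder namespace
        min: {cpu: "10", memory: 40Gi}
        max: {cpu: "30", memory: 120Gi}
```

Tasks queued by ack-kube-queue are then admitted against these quotas, so one team's burst can use the other team's idle share without exceeding the root limits.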

Best practice: Fine-grained resource control with resource groups

For more efficient management, organize your ACK resources into resource groups based on dimensions like department, project, or environment. By combining resource groups with Resource Access Management (RAM), you can implement resource isolation and fine-grained permission management within a single Alibaba Cloud account.

All regions

Use resource groups for fine-grained resource control

ACK One

Connect ACK One registered clusters to ACS compute power

An ACK One registered cluster can use serverless container compute power from Alibaba Cloud Container Compute Service (ACS).

All regions

Schedule pods to run on ACS by using virtual nodes

Cross-cluster service access using native service domain names

ACK One uses MultiClusterService to enable cross-cluster access through native service domain names. You can route traffic across clusters by using the native service directly, without modifying your application code, pod DNS configurations, or CoreDNS settings.

All regions

Access services across clusters by using native service domain names
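As a sketch of the cross-cluster access described above, the following manifest assumes the fleet exposes the Karmada-compatible MultiClusterService API. The service and cluster names are placeholders.

```yaml
# Hypothetical sketch: expose the "web" Service in cluster-b to consumers
# in cluster-a under its native domain name (web.default.svc).
apiVersion: networking.karmada.io/v1alpha1
kind: MultiClusterService
metadata:
  name: web            # must match the Service name
  namespace: default
spec:
  types:
    - CrossCluster
  providerClusters:
    - name: cluster-b  # placeholder: cluster that runs the Service
  consumerClusters:
    - name: cluster-a  # placeholder: cluster whose pods call the Service
```

Pods in the consumer cluster keep addressing `web.default.svc` as usual, so no application code, pod DNS settings, or CoreDNS changes are needed.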

Access multi-cluster resources using the Go SDK

Use the Go SDK to integrate an ACK One fleet into your platform and access member cluster resources.

All regions

Access multi-cluster resources by using the Go SDK

ACK Edge

Cloud node scaling

When on-premises node resources are insufficient, the auto-scaling feature scales out cloud-based nodes for your ACK Edge cluster to increase scheduling capacity.

All regions

Cloud ECS node elasticity

Deploy elastic inference services for LLMs in a hybrid cloud

By installing the ack-kserve component and using the cloud elasticity of ACK Edge clusters, you can deploy elastic LLM inference services in a hybrid cloud. This allows you to flexibly schedule on-premises and cloud resources, reducing the operational costs of your LLM inference services.

All regions

GPU sharing and scheduling

GPU sharing allows multiple pods to share the compute resources of a single GPU card. This improves GPU utilization and reduces costs.

  • Cloud node pools in ACK Edge clusters fully support shared GPU scheduling, GPU memory fencing, and computing power fencing.

  • Edge node pools in ACK Edge clusters support only shared GPU scheduling and do not support GPU memory fencing or computing power fencing.

All regions

Use GPU sharing and scheduling
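A pod opts into the GPU sharing described above by requesting GPU memory instead of a whole card, using ACK's `aliyun.com/gpu-mem` extended resource. The image name and memory amount below are placeholders.

```yaml
# Hypothetical sketch: request 4 GiB of GPU memory on a shared GPU card
# instead of claiming the whole card with nvidia.com/gpu.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-share-demo
spec:
  containers:
    - name: inference
      image: your-registry/your-inference-image:latest  # placeholder image
      resources:
        limits:
          aliyun.com/gpu-mem: "4"   # GiB of GPU memory on a shared card
```

Several such pods can land on the same physical GPU; on cloud node pools, memory and computing power fencing additionally isolate them, while edge node pools share the card without fencing.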

Centrally manage ECS resources across regions

This best practice shows how to use an ACK Edge cluster to centrally manage compute resources distributed across different regions. This approach enables full lifecycle management and efficient resource scheduling for cloud-native applications.

All regions

Centrally manage ECS resources across regions