Container Service for Kubernetes:ACK 2025 release notes

Last Updated: Mar 25, 2026

This article covers the 2025 release notes for Container Service for Kubernetes (ACK).

Important

For the latest release notes for Container Service for Kubernetes (ACK), see Release notes.

December 2025

Product

Feature

Description

Region

Related documentation

Container Service for Kubernetes

Node pools support pay-as-you-go vulnerability remediation from Security Center

You can enable operating system (OS) CVE vulnerability remediation to scan nodes for security vulnerabilities, receive suggestions, and apply quick fixes in the console. Before you use this feature, you must activate Security Center Ultimate or purchase pay-as-you-go vulnerability remediation.

All regions

Remediate OS CVE vulnerabilities in node pools

APIG Ingress support

APIG Ingress is a cloud-native API gateway built on the open-source Higress gateway. It is compatible with NGINX Ingress and ideal for API management and microservice scenarios.

All regions

Manage APIG Ingress

Kagent support

Kagent is a framework for building, deploying, and running AI applications on Kubernetes. Once deployed, Kagent lets you use declarative APIs to create agents and MCP servers, and integrate with multiple large language models.

All regions

Kagent
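As a rough illustration of the declarative style, a Kagent agent is defined as a custom resource along these lines. The API version, field names, and the referenced model configuration below are assumptions based on the upstream Kagent project and may differ in the version shipped with ACK:

```yaml
# Hypothetical sketch of a Kagent Agent resource; verify field names
# against the Kagent version installed in your cluster.
apiVersion: kagent.dev/v1alpha1      # assumed API group/version
kind: Agent
metadata:
  name: ops-assistant
spec:
  description: Answers basic cluster operations questions.
  systemMessage: You are a helpful Kubernetes operations assistant.
  modelConfig: default-model-config  # reference to a ModelConfig resource (assumed)
```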

Manage tags for NAS, OSS, and CPFS with CNFS

CNFS allows you to add tags to NAS, OSS, and CPFS cloud storage resources. This enables fine-grained classification and permission management, improving resource governance efficiency.

All regions

Manage tags for NAS, OSS, and CPFS

Use Kyverno as a policy engine

Kyverno is a Kubernetes-native policy engine that defines and enforces security, compliance, and automation policies by using a Policy-as-Code approach. Compared with OPA Gatekeeper, which is integrated into clusters by default, Kyverno allows you to define policies by using YAML without the need to learn Rego. Kyverno also supports mutating and generating resources at the admission stage. This makes it ideal for scenarios that require highly customized policies, automated O&M, or multi-cluster policy governance.

All regions

Use Kyverno as a policy engine
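As a small example of the YAML-based, Rego-free approach described above, the following Kyverno ClusterPolicy rejects Pods that lack a `team` label. The policy name and label key are illustrative:

```yaml
# Require every Pod to carry a `team` label (illustrative policy).
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-team-label
spec:
  validationFailureAction: Enforce   # reject non-compliant requests at admission
  rules:
    - name: check-team-label
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "The label `team` is required on all Pods."
        pattern:
          metadata:
            labels:
              team: "?*"             # any non-empty value
```

Apply it with `kubectl apply -f`; subsequent Pod creations without the label are denied at the admission stage.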

Secure confidential container environments with remote attestation

PeerPod remote attestation ensures that confidential containers run in a genuine and untampered confidential computing environment, such as Intel TDX. It provides end-to-end security for sensitive workloads by automatically verifying nodes before container deployment and allowing applications to obtain environmental attestations on demand at runtime.

All regions

Use remote attestation to secure confidential container environments

Deploy A2A protocol servers in Knative

Agent2Agent (A2A) is an open standard designed to enable seamless communication and collaboration between AI agents. By deploying an A2A server in Knative, you can leverage features such as auto scaling (including scale to zero) to achieve on-demand resource usage and rapid version iteration.

All regions

Deploy A2A in Knative
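The scale-to-zero behavior mentioned above is configured through standard Knative autoscaling annotations. A minimal sketch, in which the container image is a placeholder for your A2A server:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: a2a-server
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/min-scale: "0"   # allow scale to zero when idle
        autoscaling.knative.dev/max-scale: "10"  # cap replicas under load
    spec:
      containers:
        - image: registry.example.com/a2a-server:latest  # placeholder image
          ports:
            - containerPort: 8080
```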

ack-agent-gateway best practices

  • Agent2Agent (A2A) traffic governance and authentication: To enable AI agent applications to quickly serve external traffic, install the ack-agent-gateway extension based on the Gateway API for fine-grained management of A2A protocol traffic.

  • Build an MCP service gateway: To expose MCP services in an ACK cluster to external LLMs, install the ack-agent-gateway extension based on the Gateway API to quickly and securely route MCP traffic.

All regions

ACK One

FederatedHPA best practice based on vLLM custom metrics

Due to traffic fluctuations, large language model (LLM) online services are commonly deployed in a multi-cluster architecture. The ACK One multi-cluster solution is ideal for this scenario. This tutorial describes how to deploy a vLLM inference service in a cloud environment based on an ACK One fleet and use FederatedHPA for cross-cluster elastic scaling.

All regions

FederatedHPA best practice based on vLLM custom metrics

November 2025

Product

Feature

Description

Region

Related documentation

Container Service for Kubernetes

Support for deploying and running GPU workloads in smart managed mode

After you enable smart managed mode for a cluster, you can use smart managed node pools to dynamically scale GPU resources. This significantly reduces costs for GPU workload scenarios with variable demand, such as online inference.

All

Deploy and run GPU workloads

New servicemesh-operator component

The servicemesh-operator component simplifies the deployment, upgrade, and configuration management of Service Mesh (ASM) in ACK clusters, letting you quickly enable powerful ASM features such as traffic management, security, and observability.

All

servicemesh-operator

New built-in FinOps rule library

You can configure security policies for Pods to validate Pod deployment and update requests. The ACK cluster policy management feature provides multiple built-in rule libraries, including Compliance, Infra, K8s-general, PSP, and FinOps.

All

Container security policy rule libraries

Support for deploying MCP Server on Knative

By hosting MCP Server on Knative, you can leverage the benefits of its Serverless architecture, such as on-demand autoscaling and event-driven capabilities for AI services.

All

Deploy MCP Server on Knative

Best practices for configuring rolling updates and graceful shutdown

To ensure zero-downtime application updates in ACK clusters, you can configure Deployment settings such as readiness probes, readinessGates, preStop hooks, and Server Load Balancer (SLB) graceful shutdown. This enables smooth traffic migration and ensures continuous high availability for your services.

All

Implement zero-downtime rolling deployments
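The settings listed above combine in a Deployment roughly as follows. The probe path, sleep duration, and image are illustrative and should be tuned to your application and the health-check interval of your SLB instance:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0                        # never drop below desired capacity
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: registry.example.com/web:v2   # placeholder image
          readinessProbe:                      # gate traffic until the app is ready
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
          lifecycle:
            preStop:                           # drain in-flight requests before exit
              exec:
                command: ["sleep", "15"]
      terminationGracePeriodSeconds: 30
```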

ACK One (Distributed Cloud Container Platform)

Best practices for multi-cluster priority-based elastic scheduling

The ACK One fleet supports AI inference services. In multi-cluster scenarios that span across regions or hybrid clouds, you can set cluster priorities to prioritize resources from an Internet Data Center (IDC) or a primary region, while using resources on Alibaba Cloud or in a secondary region as backup capacity. Combined with inventory-aware scheduling, this approach ensures business continuity.

All

Multi-cluster priority-based elastic scheduling

October 2025

Product

Feature

Description

Region

Related documentation

Container Service for Kubernetes

Schedule GPUs with DRA

For AI training and inference scenarios that require sharing GPU resources, deploy the NVIDIA DRA driver in your ACK cluster to overcome the limitations of traditional device plugins. This enables dynamic GPU allocation and fine-grained resource control between pods through the Kubernetes DRA API, improving GPU utilization and reducing costs.

All

Schedule GPUs with DRA
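With the NVIDIA DRA driver installed, a Pod requests a GPU through the Kubernetes DRA API roughly as follows. The API version and device class name reflect recent upstream conventions and are assumptions that may vary with your Kubernetes and driver releases:

```yaml
apiVersion: resource.k8s.io/v1beta1          # DRA API version in recent releases (assumed)
kind: ResourceClaimTemplate
metadata:
  name: single-gpu
spec:
  spec:
    devices:
      requests:
        - name: gpu
          deviceClassName: gpu.nvidia.com    # device class published by the NVIDIA DRA driver (assumed)
---
apiVersion: v1
kind: Pod
metadata:
  name: gpu-consumer
spec:
  containers:
    - name: main
      image: registry.example.com/cuda-app:latest  # placeholder image
      resources:
        claims:
          - name: gpu                        # consume the claim declared below
  resourceClaims:
    - name: gpu
      resourceClaimTemplateName: single-gpu
```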

ACK One

ACS GPU-HPN capacity reservation for registered clusters

Register your on-premises Kubernetes clusters with ACK One to use the GPU-HPN capacity reservation feature. This enables unified management and intelligent scheduling of GPU resources across hybrid cloud environments, providing stable, high-performance computing power for critical workloads like AI training and inference.

All

Example: Use ACS GPU HPN computing power in an ACK One registered cluster

Collect control plane metrics with a self-managed Prometheus

For hybrid cloud environments that use a self-managed Prometheus system, you can monitor the control plane health of ACK One registered clusters. Install the Metrics Aggregator component and configure a ServiceMonitor to integrate core component metrics into your existing monitoring system. This enables unified alerting and observability.

All

Collect control plane metrics with a self-managed Prometheus

Cloud-native AI Suite

Submit eRDMA-accelerated PyTorch distributed training jobs with Arena

When network latency bottlenecks multi-node GPU training, use Arena to shorten model training cycles. Submit a PyTorch distributed training job and configure eRDMA network acceleration to achieve low-latency, high-throughput communication between nodes. This improves training efficiency and cluster utilization.

All

Submit eRDMA-accelerated PyTorch distributed training jobs with Arena

September 2025

Product

Feature

Description

Region

Related documentation

Container Service for Kubernetes

Support for Kubernetes 1.34

ACK now supports Kubernetes 1.34. You can create new clusters that run version 1.34 or upgrade existing compatible clusters to this version.

All

Kubernetes 1.34

Support for hybrid cloud node pools

Create a hybrid cloud node pool in an ACK Pro cluster to manage on-premises servers and cloud resources from a single orchestration plane. Add existing hybrid cloud nodes to your cluster to enable elastic scaling and optimize costs by using your current IT assets.

All

Create and manage a hybrid cloud node pool

Support for configuring DNS resolution for hybrid cloud node pools

When a hybrid cloud node pool resolves domain names through CoreDNS in the cloud, frequent access can increase the load on your leased line and may lead to DNS resolution failures if the connection is unstable. To mitigate these issues, you can configure NodeLocal DNSCache to cache DNS queries locally on each node.

All

Configure NodeLocal DNSCache for a hybrid cloud node pool

Support for the Terway-Hybrid network plugin

Connecting a hybrid cloud node pool to an on-premises data center introduces complex network topologies and cross-domain routing requirements that standard container network plugins cannot handle. The Terway-Hybrid network plugin is designed for hybrid cloud node pools and ensures seamless network connectivity for Pods across your data center and the cloud.

All

Use the Terway-Hybrid network plugin

RRSA authentication for ossfs 2.0

For applications that require persistent storage or need to share data between Pods, you can mount an OSS bucket as an ossfs 2.0 volume by using a dynamic PV. We recommend using RRSA for authentication because it enhances security by providing auto-rotating temporary credentials and Pod-level permission isolation. This method is ideal for production, multi-tenant, and other high-security environments.

All

Use an ossfs 2.0 dynamic volume

ACK One (Distributed Cloud Container Platform)

Support for integrating cloud-based GPU compute power

By providing unified scheduling and O&M for heterogeneous computing resources, ACK One registered clusters significantly improve resource utilization.

All

Integrate cloud-based GPU compute power

Migrate single-cluster applications to a fleet for multi-cluster distribution

Use the AMC command-line tool to deploy an application to multiple clusters. This reduces repetitive work, prevents configuration drift, and enables centralized management with automatic synchronization for future updates.

All

Migrate single-cluster applications to a fleet and distribute them to multiple clusters

August 2025

Product

Feature

Description

Region

Related documentation

Container Service for Kubernetes

KV Cache-aware load balancing with intelligent inference routing

Designed for generative AI inference, KV Cache-aware load balancing dynamically routes requests to the optimal compute node, significantly improving the efficiency of large language model (LLM) services.

All regions

Use prefix cache-aware routing in precise mode

Support for custom CNI plugins

While the default Terway and Flannel CNI plugins in ACK meet most container networking needs, some scenarios require other plugins. ACK now supports a Bring Your Own Container Network Interface (BYOCNI) mode, letting you install a custom CNI plugin in your cluster.

All regions

Use a custom CNI plugin in an ACK cluster

Managed policy governance for smart managed mode clusters

Enable the security policy management feature to meet compliance requirements and enhance cluster security. Security policy rules include Infra, Compliance, PSP, and K8s-general.

All regions

Enable security policy management

Knative support for ACS compute resources

You can now configure Knative Services to use compute resources from Alibaba Cloud Container Compute Service (ACS). This configuration lets you leverage diverse compute types and service levels to meet various workload demands and optimize costs.

All regions

Use ACS resources

More flexible configurations for Gateway with Inference Extension

  • Customize inference extension configurations: You can adjust routing policies through annotations or modify or override the extension's deployment configuration by creating a ConfigMap.

  • Customize Gateway configurations: You can modify the EnvoyProxy resource to adjust gateway parameters such as the Service type, replica count, and resource allocation.

All regions
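The second bullet refers to the EnvoyProxy resource from the open-source Envoy Gateway project. A sketch of adjusting the replica count and Service type follows; the field names track the upstream `gateway.envoyproxy.io` API and may differ across component versions:

```yaml
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: custom-proxy-config
spec:
  provider:
    type: Kubernetes
    kubernetes:
      envoyDeployment:
        replicas: 2           # scale the gateway data plane
      envoyService:
        type: ClusterIP       # for example, keep the gateway internal
```

The resource is typically referenced from a GatewayClass through its `parametersRef` field.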

Securely deploy vLLM inference services in ACK heterogeneous confidential computing clusters

Large language model (LLM) inference involves sensitive data and core model assets, placing them at risk of exposure in untrusted environments. The Confidential AI solution for ACK (ACK-CAI) mitigates this risk by integrating hardware-based confidential computing technologies, such as Intel TDX and GPU TEE, to provide end-to-end security for model inference.

All regions

Securely deploy vLLM inference services in ACK heterogeneous confidential computing clusters

Cloud-native AI Suite

Introducing the AI Serving Stack

As large language models (LLMs) become more widespread, efficiently deploying and managing them in production has become a key challenge for enterprises. The AI Serving Stack, built on ACK, is an end-to-end solution for cloud-native AI inference. It manages the entire LLM inference lifecycle with integrated capabilities for deployment, intelligent routing, elastic scaling, and in-depth observability. Whether you are getting started or running large-scale AI services, the AI Serving Stack simplifies cloud-native AI inference.

All regions

AI Serving Stack

July 2025

Product

Feature

Description

Region

Related documentation

Container Service for Kubernetes

Access ECS instance metadata in enforced mode only

You can use the metadata service to retrieve metadata from an ECS instance, such as its instance ID, VPC information, and ENI information. By default, the metadata access mode for nodes in an ACK cluster supports both normal and enforced modes. You can now configure nodes to use enforced mode only (IMDSv2) to enhance the security of the instance metadata service.

All regions

Access ECS instance metadata in enforced mode only

Subscribe to images from international registries

You can now use the artifact subscription feature in an Enterprise Edition instance of Container Registry (ACR) to automatically synchronize images from international registries, such as Docker Hub, GCR, and Quay.

All regions

Obtain images from international registries through artifact subscription

Mount NAS by using the EFC client through CNFS

Extreme File Cache (EFC) improves the performance of Apsara File Storage NAS with features like distributed caching. It supports high concurrency and parallel access to large-scale datasets, making it ideal for data-intensive containerized applications such as big data analytics, and AI training and inference. Compared to mounting NAS with the standard NFS protocol, using EFC accelerates file access and improves read/write performance.

All regions

Mount NAS by using the EFC client through CNFS

ACK One (Distributed Cloud Container Platform)

Console-based management for GitOps

You can now manage the full suite of GitOps capabilities from the console. This includes enabling or disabling the feature, configuring public access and ACLs, accessing the ApplicationSet UI, managing the Argo CD ConfigMap, restarting components, and viewing monitoring and log data.

All regions

GitOps quick start

Argo CD ConfigMap configuration for multi-cluster GitOps

ACK One lets you manage GitOps-related features and permissions by configuring the Argo CD ConfigMap.

All regions

Configure the Argo CD ConfigMap

Inventory-aware elastic scheduling for multi-cluster fleets

ACK One now provides an inventory-aware intelligent scheduler for multi-cluster fleets in multi-region deployments. When a cluster in the fleet lacks sufficient resources, the scheduler automatically deploys applications to another cluster that has available inventory. The target cluster then uses its instant elasticity feature to scale up nodes as needed, improving scheduling success rates and reducing resource costs.

All regions

Cross-region multi-cluster elastic scheduling based on inventory awareness

Container Service for Edge (ACK@Edge)

Configure a private connection for a leased line connection

ACK@Edge clusters can now connect to the cloud over a leased line connection. This enables edge nodes to securely and efficiently access cloud services such as ACK and Container Registry (ACR), resolving common issues like network conflicts and a lack of fixed IP addresses.

All regions

Configure a private connection for a leased line connection

June 2025

Product

Feature

Description

Region

Related documentation

Container Service for Kubernetes

AI Profiling

AI Profiling is a non-invasive performance analysis tool for Kubernetes container environments. Using eBPF and dynamic process injection, it detects container processes that run GPU tasks. You can dynamically start or stop performance data collection on running workloads without modifying application code. This on-demand attachment and detachment allows for detailed, real-time analysis of production services.

All regions

AI Profiling

GPU node auto-healing

Node auto-healing now repairs instances affected by GPU hardware and software failures.

ACK provides Kubernetes-native auto-healing for EGS and Lingjun node failures. This feature automates the entire operational lifecycle, from fault detection and alerting to automatic isolation, node drain, and automated repair. You can also require user authorization before initiating repairs, which enhances automated fault management and reduces cluster O&M costs.

All regions

Enable Node Auto-healing

CPFS for AI static volumes

CPFS for AI delivers ultra-high throughput and IOPS with end-to-end RDMA network acceleration, making it ideal for AI computing scenarios like AIGC and autonomous driving. You can create CPFS for AI static volumes in your cluster and use them in your workloads.

All regions

Use CPFS for AI Static Volumes

ACK VPD CNI component

ACK VPD CNI provides container network management capabilities for Lingjun nodes in ACK Pro clusters. As the container network CNI plugin for Lingjun nodes, it allocates and manages container network resources for nodes that use Lingjun connections.

All regions

ACK VPD CNI

ack-kms-agent-webhook-injector component

The ack-kms-agent-webhook-injector injects the KMS Agent as a sidecar container into your Pods. Applications can then fetch credentials from a KMS instance through a local HTTP interface and cache them in memory. This approach avoids hard-coding sensitive information and enhances data security.

All regions

Import Alibaba Cloud KMS Service Credentials for Applications

Expanded capabilities for Gateway with Inference Extension

Gateway with Inference Extension now supports multiple generative AI inference frameworks, including vLLM and SGLang. It enhances services deployed on these frameworks with features like canary releases, inference load balancing, and model name-based routing. You can also configure rate limiting and circuit breaking policies for your inference services.

All regions

Gateway with Inference Extension Traffic Management and Inference Service Management

CAA solution for confidential containers on confidential virtual machines

For scenarios that require confidential computing, such as financial risk control and healthcare, you can deploy confidential computing workloads in ACK clusters by using the Cloud API Adaptor (CAA) solution. This solution uses Intel® TDX technology to protect sensitive data from external attacks and potential threats from the cloud provider, helping you meet industry compliance requirements.

All regions

Implement CAA Confidential Container Solution Based on Confidential VMs

Cloud-native AI Suite

Schedule Dify workflows with XXL-JOB

In many scenarios, Dify workflows require a scheduler to automate tasks such as risk monitoring, data analysis, content generation, and data synchronization. However, Dify does not include a built-in scheduler. This guide shows how to integrate XXL-JOB, a distributed task scheduler, to schedule and monitor your workflow applications and ensure their stability.

All regions

Schedule Dify Workflow Applications via XXL-JOB

May 2025

Product

Feature

Description

Region

Related documentation

Container Service for Kubernetes

Support for Kubernetes 1.33

ACK supports Kubernetes 1.33. You can create new clusters that run version 1.33 or upgrade existing clusters to this version.

All regions

Kubernetes 1.33

Default installation of ack-ram-authenticator component

Starting with Kubernetes 1.33, new ACK managed clusters automatically install the latest version of the managed ack-ram-authenticator component, without consuming additional cluster node resources.

All regions

Product announcement: ack-ram-authenticator installed by default on ACK managed clusters starting with Kubernetes 1.33

containerd 2.1.1 is available

containerd 2.1.1 introduces new features, such as the Node Resource Interface (NRI), Container Device Interface (CDI), and Sandbox API.

All regions

containerd runtime release notes

Support for ossfs 2.0

ossfs 2.0 is a Filesystem in User Space (FUSE)-based client that lets you mount an OSS bucket as a local file system. This allows application containers to access data in OSS using standard POSIX file operations. Compared to ossfs 1.0, ossfs 2.0 delivers improved sequential read/write performance and higher throughput for concurrent small-file reads. It is ideal for workloads that require high storage access performance, such as AI training and inference, big data processing, and autonomous driving.

All regions

ossfs 2.0

ACK One

Use ApplicationSet to coordinate multi-environment deployments and application dependencies

This new best practice guide demonstrates how to combine the Progressive Syncs feature of Argo CD with the multi-environment orchestration capabilities of ApplicationSet. Learn to build an automated deployment system that manages application dependencies between development and pre-production environments.

All regions

Use ApplicationSet to coordinate multi-environment deployments and application dependencies

April 2025

Product

Feature

Description

Region

Related documentation

Container Service for Kubernetes

Create and manage Lingjun node pools

You can create and manage a Lingjun node pool in an ACK Pro managed cluster.

All regions

Lingjun node pools

Configure a node pool by specifying instance attributes

You can configure a node pool by specifying instance attributes, such as the number of vCPUs and the amount of memory. The node pool then automatically selects suitable instance types that meet your requirements during scale-out, improving the success rate of scaling operations.

All regions

Configure a node pool by specifying instance attributes

Real-time AI Profiling

AI Profiling is a non-intrusive performance analysis tool that uses eBPF and dynamic process injection to perform online diagnostics of container processes running GPU tasks in Kubernetes. You can dynamically attach and detach the profiler to perform real-time analysis of live services without modifying your application code.

All regions

Use AI Profiling from the command line

Enable preemption

When cluster resources are tight, high-priority tasks may fail to run. After you enable preemption, ACK Scheduler simulates scheduling decisions and evicts low-priority Pods to free compute resources, allowing high-priority tasks to start faster.

All regions

Enable preemption
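Preemption is driven by standard Kubernetes Pod priorities. A minimal sketch, with an illustrative priority value and placeholder image:

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000                  # Pods at this priority may preempt lower-priority Pods
globalDefault: false
description: For latency-critical workloads that may preempt others.
---
apiVersion: v1
kind: Pod
metadata:
  name: critical-job
spec:
  priorityClassName: high-priority
  containers:
    - name: main
      image: registry.example.com/job:latest   # placeholder image
```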

Access services through Gateway with Inference Extension

The Gateway with Inference Extension component is built on the Envoy Gateway project. It supports all the basic capabilities of the Gateway API and the extended resources of the open-source Envoy Gateway.

All regions

Access services through Gateway with Inference Extension

Generative AI service enhancements

You can use the Gateway with Inference Extension component to implement features such as intelligent routing, efficient traffic management, canary releases for generative AI inference services, circuit breaking for inference requests, and traffic mirroring.

All regions

Generative AI service enhancements

Back up and restore volumes from PVC to PVC

You can back up and restore cloud disk data within the same ACK cluster, or between ACK clusters in the same or different regions. After a backup completes in the source cluster, you can use the Backup Center to restore the data to a new set of persistent volume claims and their corresponding volumes in the same or a different cluster. The restored volumes can be mounted directly without modifying any workload YAML configurations.

All regions

Backup Center

alibabacloud-privateca-issuer released

The Alibaba Cloud Private CA Issuer is now available. It lets you create and manage Alibaba Cloud PCA certificates in your cluster by using cert-manager. The issuer is available in the ACK App Market.

All regions

None

Deploy a workload and implement load balancing in an ACK managed cluster (smart managed mode)

Learn how to deploy a workload in an ACK managed cluster (smart managed mode) and expose it to the internet by using an ALB Ingress. Once the steps are complete, you can access the application through a specified domain name for efficient management and load balancing of external traffic.

All regions

Deploy a workload and implement load balancing

Datapath V2 best practices

Learn how to optimize the network configuration of your cluster after you enable Datapath V2 in a cluster that uses the Terway network plugin. This includes configuring Conntrack parameters and managing Identity resources to improve cluster performance and stability.

All regions

Datapath V2 best practices

Dify component upgrade guide

Upgrade ack-dify from an earlier version to v1.0.0 or later. The process includes backing up data, installing the plugin migration tool, and enabling the new plugin ecosystem.

All regions

Upgrade the Dify component in an ACK cluster

ACK One

Use PrivateLink to resolve IP conflicts in a data center network

After you connect a Kubernetes cluster in your data center to an ACK One registered cluster over a leased line, IP address conflicts may occur when you use Serverless computing resources if other services in your internal network use the same CIDR block. You can use PrivateLink to resolve these IP address conflicts.

All regions

Use PrivateLink to resolve IP conflicts in a data center network

Schedule ACS Pods across regions

An ACK One registered cluster can seamlessly integrate Serverless computing resources from multiple regions into a Kubernetes cluster. This enables dynamic scheduling and unified management of GPU resources across regions.

All regions

Schedule ACS Pods across regions

Log collection

You can configure log collection by using SLS CRDs or environment variables to automatically collect container logs with Alibaba Cloud Log Service (SLS).

All regions
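With the Logtail component installed, the environment-variable method tags a container for collection roughly as follows. The `aliyun_logs_<logstore>` naming convention is used by Alibaba Cloud's Logtail agent; the logstore name here is illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-logs
spec:
  containers:
    - name: app
      image: registry.example.com/app:latest   # placeholder image
      env:
        - name: aliyun_logs_app-stdout   # target logstore name ("app-stdout" is illustrative)
          value: stdout                  # collect the container's standard output
```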

ACK Edge

Version 1.32 released

ACK Edge now supports Kubernetes 1.32. Among other improvements, this version optimizes requests from CoreDNS, kube-proxy, and kubelet to the kube-apiserver, reducing cloud-to-edge communication traffic.

All regions

ACK Edge Kubernetes 1.32 release notes

Network element configuration in a leased line environment

You can connect server devices from your on-premises data center to ACK for containerized management over the internet or a leased line. When connecting over a leased line, you must first configure the network elements of your infrastructure.

All regions

Configure network elements in a leased line environment

AI Engineering Suite

HistoryServer component support

The native Ray Dashboard is only available while a cluster is running, preventing access to historical logs and monitoring data after the cluster is terminated. The RayCluster HistoryServer solves this by collecting node logs in real time and persisting them to OSS while the cluster is running.

All regions

Install the HistoryServer component in ACK

KubeRay component support

You can deploy the KubeRay Operator component and integrate it with Alibaba Cloud SLS and Prometheus for monitoring. This enhances log management, system observability, and high availability.

All regions

Install the KubeRay component in ACK

March 2025

Product

Feature

Description

Region

Related documentation

Container Service for Kubernetes

ACK Pro managed clusters support smart managed mode

When you create an ACK managed cluster, you can enable smart managed mode to quickly provision a Kubernetes cluster that follows best practices.

After the cluster is created, ACK automatically provisions a smart managed node pool. This node pool dynamically scales based on workload demand. ACK also handles all operational tasks for this node pool, including OS version upgrades, software updates, and security patching.

All regions

Enable tracing for control plane and data plane components

After you enable tracing for the cluster API Server or kubelet, trace data automatically flows to Managed Service for OpenTelemetry. This integration provides detailed trace visualizations, real-time topology maps, and other monitoring data.

All regions

High-risk KubeConfig SMS and email notifications

You can now receive SMS and email alerts for high-risk KubeConfig files that still pose a security risk after deletion.

All regions

None

Intelligent routing and traffic management with ACK Gateway with Inference Extension

Use the ACK Gateway with Inference Extension component to enable intelligent routing and efficient traffic management for your inference services.

All regions

Implement intelligent routing and traffic management with ACK Gateway with Inference Extension

ACK One (Distributed Cloud Container Platform)

Unified component management for multi-cluster fleets

ACK One fleets provide cluster operators a unified, automated way to manage components. You can define baselines that include multiple components and their versions, and then deploy these baselines to multiple clusters. The feature also supports component configuration, batch deployments, and rollbacks to improve system stability.

All regions

Multi-cluster component management

Dynamic distribution and rescheduling

An ACK One fleet can use a PropagationPolicy to distribute workload replicas across member clusters based on their available resources. By default, the fleet's rescheduling feature automatically checks every two minutes for unschedulable Pods. If a Pod remains in this state for more than 30 seconds, the feature triggers a rescheduling of that replica.

All regions

Dynamic distribution and rescheduling
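Distribution policies of this kind follow the Karmada-style PropagationPolicy API. The sketch below divides a Deployment's replicas across two member clusters by available capacity; the API group shown is the open-source Karmada one, and the cluster names are placeholders, so both are assumptions that may differ in an ACK One fleet:

```yaml
apiVersion: policy.karmada.io/v1alpha1    # assumed; ACK One fleets expose a Karmada-compatible API
kind: PropagationPolicy
metadata:
  name: web-propagation
spec:
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      name: web
  placement:
    clusterAffinity:
      clusterNames: [cluster-a, cluster-b]  # placeholder member cluster names
    replicaScheduling:
      replicaSchedulingType: Divided
      replicaDivisionPreference: Weighted
      weightPreference:
        dynamicWeight: AvailableReplicas    # divide replicas by available resources
```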

Cloud-native AI Suite

Set Slurm queue priorities

This new best practice guide explains how to configure queue policies in a Slurm environment for optimal task scheduling and performance when jobs are submitted or their states change.

All regions

Set Slurm queue priorities in an ACK cluster

February 2025

Product

Feature

Description

Region

Related documentation

Container Service for Kubernetes

Support for modifying control plane security groups and time zones

If the security group or time zone selected during cluster creation no longer meets your requirements, you can modify the control plane security group and the cluster time zone on the cluster's Basic Information page.

All regions

View cluster information

Node pools support custom containerd configurations

You can customize containerd parameters for nodes in a node pool. For example, you can configure multiple mirror repositories for a specific image registry or bypass security certificate verification for a specific registry.

All regions

Customize containerd parameter configurations for a node pool

Elasticity strength indicator for node pools

A node pool scale-out may fail due to insufficient instance inventory or an unsupported instance type in the selected availability zone. The elasticity strength indicator helps you assess the availability of your node pool configuration and the health of the instance supply, and provides configuration recommendations.

All regions

View the elasticity strength of a node pool

Support for batch task orchestration

Argo Workflows is a Kubernetes-native workflow engine that orchestrates parallel tasks using YAML or Python. It simplifies the automation and management of containerized applications for use cases such as CI/CD pipelines, data processing, and machine learning. You can enable batch task orchestration by installing the Argo Workflows component and then create and manage workflow tasks with the Alibaba Cloud Argo CLI or the console.

All regions

Enable batch task orchestration
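To illustrate the YAML-based orchestration mentioned above, here is a minimal Argo Workflows task. The workflow name prefix, image, and command are illustrative placeholders; only the Argo `Workflow` API itself is assumed.

```yaml
# Minimal sketch of an Argo Workflows batch task.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: batch-demo-    # Argo appends a random suffix
spec:
  entrypoint: main
  templates:
    - name: main
      container:
        image: alpine:3.19
        command: [sh, -c]
        args: ["echo processing batch item"]  # placeholder task logic
```

Submitting this manifest (for example, with the Argo CLI or `kubectl create`) runs the container to completion as a one-off batch task; multi-step DAGs follow the same template structure.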

GPU fault detection

The ack-node-problem-detector component provided by ACK is an enhanced version of the open-source node-problem-detector project that improves the monitoring of node anomaly events. It offers a comprehensive set of GPU-specific fault detection checks to enhance fault discovery in GPU scenarios. When a fault is detected, it generates a corresponding Kubernetes Event or Kubernetes Node Condition based on the fault type.

All regions

GPU fault detection and automatic isolation

ACK One

Schedule and distribute multi-cluster Spark jobs based on actual remaining resources

This topic describes how to use an ACK One fleet and the ACK Koordinator component to schedule and distribute multi-cluster Spark jobs based on actual remaining resources, rather than requested resources. This approach maximizes idle resource utilization across multiple clusters and uses priority control with hybrid deployment of online and offline workloads to ensure the stability of online services.

All regions

Schedule and distribute multi-cluster Spark jobs based on actual remaining resources

ACK Edge

Support for adding pod vSwitches

In ENS edge scenarios, if an ACK Edge cluster uses the Terway Edge plugin, you can add a new pod vSwitch to increase the number of available IP addresses for the cluster. This is useful when the existing vSwitch runs out of IP addresses or you need to expand the pod CIDR block.

All regions

Add a pod vSwitch

GPU resource monitoring

An ACK Edge cluster can manage GPU nodes in data centers and at the edge, providing unified management of heterogeneous compute power across multiple regions and environments. You can integrate Prometheus monitoring with your ACK Edge cluster to provide your on-premises and edge GPU nodes with the same level of observability as your cloud resources.

All regions

Best practices for monitoring GPU resources of an ACK Edge cluster

Cloud Native AI Suite

Deploy a DeepSeek distilled model inference service on ACK

This topic explains how to use KServe to deploy a production-ready DeepSeek distilled model inference service on ACK, using the DeepSeek-R1-Distill-Qwen-7B model as an example.

All regions

Deploy a DeepSeek distilled model inference service on ACK
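The KServe deployment described above can be sketched with an `InferenceService` that runs a custom serving container. This is an assumption-laden sketch, not the topic's exact manifest: the image, GPU count, and serving arguments are placeholders you would replace with your own.

```yaml
# Hypothetical sketch of a KServe InferenceService for a distilled model.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: deepseek-r1-distill-qwen-7b
spec:
  predictor:
    containers:
      - name: kserve-container
        image: your-registry/your-vllm-image:latest   # placeholder image
        args: ["--model", "/models/DeepSeek-R1-Distill-Qwen-7B"]  # placeholder
        resources:
          limits:
            nvidia.com/gpu: "1"   # one GPU for the 7B distilled model
```

KServe then manages the service's revisions and exposes an inference endpoint for the predictor.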

Tutorial: Deploy a full-parameter DeepSeek inference service on ACK using multi-machine distributed deployment

This tutorial provides a solution for the distributed deployment of the large-scale DeepSeek-R1-671B model on ACK. This solution uses a hybrid parallel strategy and the Arena tool to efficiently deploy the model across two nodes. It also shows you how to integrate the deployed DeepSeek-R1 service with the Dify platform to quickly build an enterprise-grade Q&A system that supports long-context understanding.

All regions

Tutorial: Deploy a DeepSeek full-parameter inference service by using multi-machine distributed deployment on ACK

January 2025

Product

Feature

Description

Region

Related documentation

Container Service for Kubernetes

On-demand image acceleration for node pools

ACK supports on-demand loading of container images, powered by DADI (Data Accelerator for Disaggregated Infrastructure). This technology eliminates full image downloads and decompresses data on the fly, significantly reducing application startup time.

All regions

Accelerate container startup by using on-demand image loading

Support for Alibaba Cloud Linux 3 Container Optimized Edition

Alibaba Cloud Linux 3 Container Optimized Edition (Alibaba Cloud Linux 3.2104 LTS 64-bit Container Optimized Edition) is an image based on the standard Alibaba Cloud Linux image and optimized for containerized environments. Drawing on extensive experience serving ACK customers, Alibaba Cloud developed this cloud-native operating system to meet the demands of container scenarios, such as higher deployment density, faster startup, and stronger security isolation.

All regions

Support for Kubernetes 1.32

ACK supports Kubernetes 1.32. You can create new clusters that run version 1.32 or upgrade existing clusters to this version.

All regions

(End of support) Kubernetes 1.32

Improve resource utilization with ElasticQuotaTree and task queues

ack-kube-queue, ElasticQuotaTree, and ack-scheduler enable fair and isolated resource allocation, allowing different teams and tasks to share compute resources within a cluster.

All regions

None
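The quota sharing described above is configured through an ElasticQuotaTree. The sketch below assumes the CRD shape used by ack-scheduler; the team names, namespaces, and quota values are placeholders.

```yaml
# Hypothetical sketch: a root quota shared by two teams. Each team is
# guaranteed its "min" and can borrow idle capacity up to its "max".
apiVersion: scheduling.sigs.k8s.io/v1beta1
kind: ElasticQuotaTree
metadata:
  name: elasticquotatree
  namespace: kube-system
spec:
  root:
    name: root
    min: {cpu: "40", memory: 160Gi}
    max: {cpu: "40", memory: 160Gi}
    children:
      - name: team-a
        namespaces: [team-a]          # placeholder namespace
        min: {cpu: "10", memory: 40Gi}
        max: {cpu: "30", memory: 120Gi}
      - name: team-b
        namespaces: [team-b]          # placeholder namespace
        min: {cpu: "10", memory: 40Gi}
        max: {cpu: "30", memory: 120Gi}
```

Tasks queued by ack-kube-queue are then admitted against these quotas, so one team's burst can use the other team's idle share without exceeding the root limits.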

Best practice: Fine-grained resource control with resource groups

For more efficient management, organize your ACK resources into resource groups based on dimensions like department, project, or environment. By combining resource groups with Resource Access Management (RAM), you can implement resource isolation and fine-grained permission management within a single Alibaba Cloud account.

All regions

Use resource groups for fine-grained resource control

ACK One

Connect ACK One registered clusters to ACS compute power

An ACK One registered cluster can use serverless container compute power from Alibaba Cloud Container Compute Service (ACS).

All regions

Schedule pods to run on ACS by using virtual nodes

Cross-cluster service access using native service domain names

ACK One uses MultiClusterService to enable cross-cluster access through native service domain names. You can route traffic across clusters by using the native service directly, without modifying your application code, pod DNS configurations, or CoreDNS settings.

All regions

Access services across clusters by using native service domain names
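As a sketch of the cross-cluster access described above, the following manifest assumes the fleet exposes the Karmada-compatible MultiClusterService API. The service and cluster names are placeholders.

```yaml
# Hypothetical sketch: expose the "web" Service in cluster-b to consumers
# in cluster-a under its native domain name (web.default.svc).
apiVersion: networking.karmada.io/v1alpha1
kind: MultiClusterService
metadata:
  name: web            # must match the Service name
  namespace: default
spec:
  types:
    - CrossCluster
  providerClusters:
    - name: cluster-b  # placeholder: cluster that runs the Service
  consumerClusters:
    - name: cluster-a  # placeholder: cluster whose pods call the Service
```

Pods in the consumer cluster keep addressing `web.default.svc` as usual, so no application code, pod DNS settings, or CoreDNS changes are needed.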

Access multi-cluster resources using the Go SDK

Use the Go SDK to integrate an ACK One fleet into your platform and access member cluster resources.

All regions

Access multi-cluster resources by using the Go SDK

ACK Edge

Cloud node scaling

When on-premises node resources are insufficient, the auto-scaling feature scales out cloud-based nodes for your ACK Edge cluster to increase scheduling capacity.

All regions

Cloud ECS node elasticity

Deploy elastic inference services for LLMs in a hybrid cloud

By installing the ack-kserve component and using the cloud elasticity of ACK Edge clusters, you can deploy elastic LLM inference services in a hybrid cloud. This allows you to flexibly schedule on-premises and cloud resources, reducing the operational costs of your LLM inference services.

All regions

GPU sharing and scheduling

GPU sharing allows multiple pods to share the compute resources of a single GPU card. This improves GPU utilization and reduces costs.

  • Cloud node pools in ACK Edge clusters fully support shared GPU scheduling, GPU memory fencing, and computing power fencing.

  • Edge node pools in ACK Edge clusters support only shared GPU scheduling and do not support GPU memory fencing or computing power fencing.

All regions

Use GPU sharing and scheduling
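A pod opts into the GPU sharing described above by requesting GPU memory instead of a whole card, using ACK's `aliyun.com/gpu-mem` extended resource. The image name and memory amount below are placeholders.

```yaml
# Hypothetical sketch: request 4 GiB of GPU memory on a shared GPU card
# instead of claiming the whole card with nvidia.com/gpu.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-share-demo
spec:
  containers:
    - name: inference
      image: your-registry/your-inference-image:latest  # placeholder image
      resources:
        limits:
          aliyun.com/gpu-mem: "4"   # GiB of GPU memory on a shared card
```

Several such pods can land on the same physical GPU; on cloud node pools, memory and computing power fencing additionally isolate them, while edge node pools share the card without fencing.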

Centrally manage ECS resources across regions

This best practice shows how to use an ACK Edge cluster to centrally manage compute resources distributed across different regions. This approach enables full lifecycle management and efficient resource scheduling for cloud-native applications.

All regions

Centrally manage ECS resources across regions