This article covers the 2025 release notes for Container Service for Kubernetes (ACK).
For the latest release notes, see Release notes.
December 2025
Product | Feature | Description | Region | Related documentation |
Container Service for Kubernetes | Node pools support pay-as-you-go vulnerability remediation from Security Center | You can enable operating system (OS) CVE vulnerability remediation to scan nodes for security vulnerabilities, receive suggestions, and apply quick fixes in the console. Before you use this feature, you must activate Security Center Ultimate or purchase pay-as-you-go vulnerability remediation. | All regions | |
APIG Ingress support | APIG Ingress is a cloud-native API gateway built on the open-source Higress gateway. It is compatible with NGINX Ingress and ideal for API management and microservice scenarios. | All regions | ||
Kagent support | Kagent is a framework for building, deploying, and running AI applications on Kubernetes. Once deployed, Kagent lets you use declarative APIs to create agents and MCP servers, and integrate with multiple large language models. | All regions | ||
Manage tags for NAS, OSS, and CPFS with CNFS | CNFS allows you to add tags to NAS, OSS, and CPFS cloud storage resources. This enables fine-grained classification and permission management, improving resource governance efficiency. | All regions | ||
Use Kyverno as a policy engine | Kyverno is a Kubernetes-native policy engine that defines and enforces security, compliance, and automation policies by using a Policy-as-Code approach. Compared with OPA Gatekeeper, which is integrated into clusters by default, Kyverno allows you to define policies by using YAML without the need to learn Rego. Kyverno also supports mutating and generating resources at the admission stage. This makes it ideal for scenarios that require highly customized policies, automated O&M, or multi-cluster policy governance. | All regions | ||
Secure confidential container environments with remote attestation | PeerPod remote attestation ensures that confidential containers run in a genuine and untampered confidential computing environment, such as Intel TDX. It provides end-to-end security for sensitive workloads by automatically verifying nodes before container deployment and allowing applications to obtain environmental attestations on demand at runtime. | All regions | Use remote attestation to secure confidential container environments | |
Deploy A2A protocol servers in Knative | Agent2Agent (A2A) is an open standard designed to enable seamless communication and collaboration between AI agents. By deploying an A2A server in Knative, you can leverage features such as auto scaling (including scale to zero) to achieve on-demand resource usage and rapid version iteration. | All regions | ||
ack-agent-gateway best practices | | All regions | |
ACK One | FederatedHPA best practice based on vLLM custom metrics | Due to traffic fluctuations, large language model (LLM) online services are commonly deployed in a multi-cluster architecture. The ACK One multi-cluster solution is ideal for this scenario. This tutorial describes how to deploy a vLLM inference service in a cloud environment based on an ACK One fleet and use FederatedHPA for cross-cluster elastic scaling. | All regions |
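The Kyverno entry above highlights that policies are written in plain YAML rather than Rego. As an illustrative sketch only (the policy name and the `team` label are hypothetical, not taken from ACK documentation), a minimal validating ClusterPolicy looks like this:

```yaml
# Minimal Kyverno ClusterPolicy sketch: reject Pods without a "team" label.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-team-label           # hypothetical policy name
spec:
  validationFailureAction: Enforce   # block non-compliant requests at admission
  rules:
    - name: check-team-label
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "All Pods must carry a 'team' label."
        pattern:
          metadata:
            labels:
              team: "?*"             # any non-empty value
```

Because Kyverno also supports mutate and generate rules, the same YAML-based approach extends to the automated O&M and multi-cluster governance scenarios mentioned above.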
November 2025
Product | Feature | Description | Region | Related documentation |
Container Service for Kubernetes | Support for deploying and running GPU workloads in smart managed mode | After you enable smart managed mode for a cluster, you can use smart managed node pools to dynamically scale GPU resources. This significantly reduces costs for GPU workload scenarios with variable demand, such as online inference. | All | |
New servicemesh-operator component | The servicemesh-operator component simplifies the deployment, upgrade, and configuration management of Service Mesh (ASM) in ACK clusters, letting you quickly enable powerful ASM features such as traffic management, security, and observability. | All | ||
New built-in FinOps rule library | You can configure security policies for Pods to validate Pod deployment and update requests. The ACK cluster policy management feature provides multiple built-in rule libraries, including Compliance, Infra, K8s-general, PSP, and FinOps. | All | ||
Support for deploying MCP Server on Knative | By hosting MCP Server on Knative, you can leverage the benefits of its Serverless architecture, such as on-demand autoscaling and event-driven capabilities for AI services. | All | ||
Best practices for configuring rolling updates and graceful shutdown | To ensure zero-downtime application updates in ACK clusters, you can configure Deployment settings such as readiness probes, readinessGates, preStop hooks, and Server Load Balancer (SLB) graceful shutdown. This enables smooth traffic migration and ensures continuous high availability for your services. | All | ||
ACK One (Distributed Cloud Container Platform) | Best practices for multi-cluster priority-based elastic scheduling | The ACK One fleet supports AI inference services. In multi-cluster scenarios that span regions or hybrid clouds, you can set cluster priorities to prioritize resources from an Internet Data Center (IDC) or a primary region, while using resources on Alibaba Cloud or in a secondary region as backup capacity. Combined with inventory-aware scheduling, this approach ensures business continuity. | All |
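The rolling update and graceful shutdown practices above combine several standard Kubernetes fields in a single Deployment. A sketch of the relevant settings (the image, health-check path, and timings are illustrative only):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0      # keep full serving capacity during the rollout
      maxSurge: 1
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: web
          image: nginx:1.27          # placeholder image
          ports:
            - containerPort: 80
          readinessProbe:            # gate traffic until the Pod is ready
            httpGet:
              path: /healthz
              port: 80
            initialDelaySeconds: 5
            periodSeconds: 5
          lifecycle:
            preStop:                 # give the load balancer time to drain
              exec:
                command: ["sleep", "15"]
```

The readinessGates and SLB graceful shutdown settings mentioned in the entry are configured on the Service and load balancer side rather than in the Deployment.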
October 2025
Product | Feature | Description | Region | Related documentation |
Container Service for Kubernetes | Schedule GPUs with DRA | For AI training and inference scenarios that require sharing GPU resources, deploy the NVIDIA DRA driver in your ACK cluster to overcome the limitations of traditional device plugins. This enables dynamic GPU allocation and fine-grained resource control between pods through the Kubernetes DRA API, improving GPU utilization and reducing costs. | All | |
ACK One | ACS GPU-HPN capacity reservation for registered clusters | Register your on-premises Kubernetes clusters with ACK One to use the GPU-HPN capacity reservation feature. This enables unified management and intelligent scheduling of GPU resources across hybrid cloud environments, providing stable, high-performance computing power for critical workloads like AI training and inference. | All | Example: Use ACS GPU HPN computing power in an ACK One registered cluster |
Collect control plane metrics with a self-managed Prometheus | For hybrid cloud environments that use a self-managed Prometheus system, you can monitor the control plane health of ACK One registered clusters. Install the Metrics Aggregator component and configure a ServiceMonitor to integrate core component metrics into your existing monitoring system. This enables unified alerting and observability. | All | Collect control plane metrics with a self-managed Prometheus | |
Cloud-native AI Suite | Submit eRDMA-accelerated PyTorch distributed training jobs with Arena | When network latency bottlenecks multi-node GPU training, use Arena to shorten model training cycles. Submit a PyTorch distributed training job and configure eRDMA network acceleration to achieve low-latency, high-throughput communication between nodes. This improves training efficiency and cluster utilization. | All | Submit eRDMA-accelerated PyTorch distributed training jobs with Arena |
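The DRA entry above replaces device-plugin resource requests with claim-based allocation. A hedged sketch of the Kubernetes DRA API (the `gpu.nvidia.com` device class name comes from the NVIDIA DRA driver; the exact API version depends on your cluster's Kubernetes version):

```yaml
# Request one GPU through a ResourceClaimTemplate instead of
# the classic nvidia.com/gpu extended resource.
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaimTemplate
metadata:
  name: single-gpu
spec:
  spec:
    devices:
      requests:
        - name: gpu
          deviceClassName: gpu.nvidia.com   # class published by the NVIDIA DRA driver
---
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  resourceClaims:
    - name: gpu
      resourceClaimTemplateName: single-gpu
  containers:
    - name: app
      image: nvidia/cuda:12.4.0-base-ubuntu22.04   # placeholder image
      command: ["nvidia-smi"]
      resources:
        claims:
          - name: gpu              # bind this container to the claim above
```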
September 2025
Product | Feature | Description | Region | Related documentation |
Container Service for Kubernetes | Support for Kubernetes 1.34 | ACK now supports Kubernetes 1.34. You can create new clusters that run version 1.34 or upgrade existing compatible clusters to this version. | All | |
Support for hybrid cloud node pools | Create a hybrid cloud node pool in an ACK Pro cluster to manage on-premises servers and cloud resources from a single orchestration plane. Add existing hybrid cloud nodes to your cluster to enable elastic scaling and optimize costs by using your current IT assets. | All | ||
Support for configuring DNS resolution for hybrid cloud node pools | When a hybrid cloud node pool resolves domain names through CoreDNS in the cloud, frequent access can increase the load on your leased line and may lead to DNS resolution failures if the connection is unstable. To mitigate these issues, you can configure NodeLocal DNSCache to cache DNS queries locally on each node. | All | ||
Support for the Terway-Hybrid network plugin | Connecting a hybrid cloud node pool to an on-premises data center introduces complex network topologies and cross-domain routing requirements that standard container network plugins cannot handle. The Terway-Hybrid network plugin is designed for hybrid cloud node pools and ensures seamless network connectivity for Pods across your data center and the cloud. | All | ||
RRSA authentication for ossfs 2.0 | For applications that require persistent storage or need to share data between Pods, you can mount an OSS bucket as an ossfs 2.0 volume by using a dynamic PV. We recommend using RRSA for authentication because it enhances security by providing auto-rotating temporary credentials and Pod-level permission isolation. This method is ideal for production, multi-tenant, and other high-security environments. | All | ||
ACK One (Distributed Cloud Container Platform) | Support for integrating cloud-based GPU compute power | ACK One registered clusters can integrate cloud-based GPU compute power, providing unified scheduling and O&M for heterogeneous computing resources and significantly improving resource utilization. | All | |
Migrate single-cluster applications to a fleet for multi-cluster distribution | Use the AMC command-line tool to deploy an application to multiple clusters. This reduces repetitive work, prevents configuration drift, and enables centralized management with automatic synchronization for future updates. | All | Migrate single-cluster applications to a fleet and distribute them to multiple clusters |
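The ossfs 2.0 entry above describes dynamic provisioning with RRSA authentication. As a rough sketch only (the parameter names `fuseType`, `authType`, and `roleName` are assumptions based on the CSI OSS driver's conventions, and the bucket, endpoint, and role are placeholders; verify every field against the linked documentation), a StorageClass might look like:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: oss-ossfs2-rrsa
provisioner: ossplugin.csi.alibabacloud.com
parameters:
  bucket: example-bucket                       # hypothetical bucket name
  url: oss-cn-hangzhou-internal.aliyuncs.com   # region-specific internal endpoint
  fuseType: ossfs2                             # select the ossfs 2.0 client (assumed parameter)
  authType: rrsa                               # temporary, auto-rotating credentials
  roleName: demo-oss-role                      # hypothetical RAM role assumed through RRSA
reclaimPolicy: Retain
volumeBindingMode: Immediate
```

A PVC that references this StorageClass then receives a dynamically provisioned OSS volume with Pod-level permission isolation instead of a long-lived AccessKey.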
August 2025
Product | Feature | Description | Region | Related documentation |
Container Service for Kubernetes | KV Cache-aware load balancing with intelligent inference routing | Designed for generative AI inference, KV Cache-aware load balancing dynamically routes requests to the optimal compute node, significantly improving the efficiency of large language model (LLM) services. | All regions | |
Support for custom CNI plugins | While the default Terway and Flannel CNI plugins in ACK meet most container networking needs, some scenarios require other plugins. ACK now supports a Bring Your Own Container Network Interface (BYOCNI) mode, letting you install a custom CNI plugin in your cluster. | All regions | ||
Managed policy governance for smart managed mode clusters | Enable the security policy management feature to meet compliance requirements and enhance cluster security. Security policy rules include Infra, Compliance, PSP, and K8s-general. | All regions | ||
Knative support for ACS compute resources | You can now configure Knative Services to use compute resources from Alibaba Cloud Container Compute Service (ACS). This configuration lets you leverage diverse compute types and service levels to meet various workload demands and optimize costs. | All regions | ||
More flexible configurations for Gateway with Inference Extension | | All regions | |
Securely deploy vLLM inference services in ACK heterogeneous confidential computing clusters | Large language model (LLM) inference involves sensitive data and core model assets, placing them at risk of exposure in untrusted environments. The Confidential AI solution for ACK (ACK-CAI) mitigates this risk by integrating hardware-based confidential computing technologies, such as Intel TDX and GPU TEE, to provide end-to-end security for model inference. | All regions | Securely deploy vLLM inference services in ACK heterogeneous confidential computing clusters | |
Cloud-native AI Suite | Introducing the AI Serving Stack | As large language models (LLMs) become more widespread, efficiently deploying and managing them in production has become a key challenge for enterprises. The AI Serving Stack, built on ACK, is an end-to-end solution for cloud-native AI inference. It manages the entire LLM inference lifecycle with integrated capabilities for deployment, intelligent routing, elastic scaling, and in-depth observability. Whether you are getting started or running large-scale AI services, the AI Serving Stack simplifies cloud-native AI inference. | All regions |
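The Knative-on-ACS entry above maps to a standard Knative Service; the ACS-specific part is selecting a compute class. A sketch (the `alibabacloud.com/compute-class` label and its value are assumptions about the ACS scheduling labels, and the image is a placeholder):

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld
spec:
  template:
    metadata:
      labels:
        alibabacloud.com/compute-class: general-purpose   # assumed ACS compute-class label
      annotations:
        autoscaling.knative.dev/min-scale: "0"    # scale to zero when idle
        autoscaling.knative.dev/max-scale: "10"
    spec:
      containers:
        - image: registry.example.com/helloworld:latest   # placeholder image
          ports:
            - containerPort: 8080
```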
July 2025
Product | Feature | Description | Region | Related documentation |
Container Service for Kubernetes | Access ECS instance metadata in enforced mode only | You can use the metadata service to retrieve metadata from an ECS instance, such as its instance ID, VPC information, and ENI information. By default, the metadata access mode for nodes in an ACK cluster supports both normal and enforced modes. You can now configure nodes to use enforced mode only (IMDSv2) to enhance the security of the instance metadata service. | All regions | |
Subscribe to images from international registries | You can now use the artifact subscription feature in an Enterprise Edition instance of Container Registry (ACR) to automatically synchronize images from international registries, such as Docker Hub, GCR, and Quay. | All regions | Obtain images from international registries through artifact subscription | |
Mount NAS by using the EFC client through CNFS | Extreme File Cache (EFC) improves the performance of Apsara File Storage NAS with features like distributed caching. It supports high concurrency and parallel access to large-scale datasets, making it ideal for data-intensive containerized applications such as big data analytics, and AI training and inference. Compared to mounting NAS with the standard NFS protocol, using EFC accelerates file access and improves read/write performance. | All regions | ||
ACK One (Distributed Cloud Container Platform) | Console-based management for GitOps | You can now manage the full suite of GitOps capabilities from the console. This includes enabling or disabling the feature, configuring public access and ACLs, accessing the ApplicationSet UI, managing the Argo CD ConfigMap, restarting components, and viewing monitoring and log data. | All regions | |
Argo CD ConfigMap configuration for multi-cluster GitOps | ACK One lets you manage GitOps-related features and permissions by configuring the Argo CD ConfigMap. | All regions | ||
Inventory-aware elastic scheduling for multi-cluster fleets | ACK One now provides an inventory-aware intelligent scheduler for multi-cluster fleets in multi-region deployments. When a cluster in the fleet lacks sufficient resources, the scheduler automatically deploys applications to another cluster that has available inventory. The target cluster then uses its instant elasticity feature to scale up nodes as needed, improving scheduling success rates and reducing resource costs. | All regions | Cross-region multi-cluster elastic scheduling based on inventory awareness | |
Container Service for Edge (ACK@Edge) | Configure a private connection for a leased line connection | ACK@Edge clusters can now connect to the cloud over a leased line connection. This enables edge nodes to securely and efficiently access cloud services such as ACK and Container Registry (ACR), resolving common issues like network conflicts and a lack of fixed IP addresses. | All regions |
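The Argo CD ConfigMap mentioned in the GitOps entries above is the standard `argocd-cm` object, which ACK One exposes for editing. A minimal sketch (the keys shown are standard Argo CD settings; the `deployer` account name is hypothetical):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
  labels:
    app.kubernetes.io/part-of: argocd
data:
  timeout.reconciliation: 180s       # how often applications are refreshed from Git
  accounts.deployer: apiKey, login   # hypothetical local account for CI pipelines
```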
June 2025
Product | Feature name | Description | Region | Related documentation |
Container Service for Kubernetes | AI profiling | AI Profiling is a non-invasive performance analysis tool for Kubernetes container environments. Using eBPF and dynamic process injection, it detects container processes that run GPU tasks. You can dynamically start or stop performance data collection on running workloads without modifying application code. This on-demand attachment and detachment allows for detailed, real-time analysis of production services. | All regions | |
GPU node auto-healing | Node auto-healing now repairs instances affected by GPU hardware and software failures. ACK provides Kubernetes-native auto-healing for EGS and Lingjun node failures. This feature automates the entire operational lifecycle, from fault detection and alerting to automatic isolation, node drain, and automated repair. You can also require user authorization before initiating repairs, which enhances automated fault management and reduces cluster O&M costs. | All regions | ||
CPFS for AI static volumes | CPFS for AI delivers ultra-high throughput and IOPS with end-to-end RDMA network acceleration, making it ideal for AI computing scenarios like AIGC and autonomous driving. You can create CPFS for AI static volumes in your cluster and use them in your workloads. | All regions | ||
ACK VPD CNI component | ACK VPD CNI provides container network management capabilities for Lingjun nodes in ACK Pro clusters. As the container network CNI plugin for Lingjun nodes, it allocates and manages container network resources for nodes that use Lingjun connections. | All regions | ||
ack-kms-agent-webhook-injector component | The ack-kms-agent-webhook-injector injects the KMS Agent as a sidecar container into your Pods. Applications can then fetch credentials from a KMS instance through a local HTTP interface and cache them in memory. This approach avoids hard-coding sensitive information and enhances data security. | All regions | Import Alibaba Cloud KMS Service Credentials for Applications | |
Expanded capabilities for Gateway with Inference Extension | Gateway with Inference Extension now supports multiple generative AI inference frameworks, including vLLM and SGLang. It enhances services deployed on these frameworks with features like canary releases, inference load balancing, and model name-based routing. You can also configure rate limiting and circuit breaking policies for your inference services. | All regions | Gateway with Inference Extension Traffic Management and Inference Service Management | |
CAA solution for confidential containers on confidential virtual machines | For scenarios that require confidential computing, such as financial risk control and healthcare, you can deploy confidential computing workloads in ACK clusters by using the Cloud API Adaptor (CAA) solution. This solution uses Intel® TDX technology to protect sensitive data from external attacks and potential threats from the cloud provider, helping you meet industry compliance requirements. | All regions | Implement CAA Confidential Container Solution Based on Confidential VMs | |
Cloud-native AI Suite | Schedule Dify workflows with XXL-JOB | In many scenarios, Dify workflows require a scheduler to automate tasks such as risk monitoring, data analysis, content generation, and data synchronization. However, Dify does not include a built-in scheduler. This guide shows how to integrate XXL-JOB, a distributed task scheduler, to schedule and monitor your workflow applications and ensure their stability. | All regions |
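The Gateway with Inference Extension entry above builds on the open-source Gateway API Inference Extension CRDs. A hedged sketch of model name-based routing (the API version and field names follow the upstream alpha CRDs and may differ in the ACK component; all object names are illustrative):

```yaml
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferencePool
metadata:
  name: vllm-pool
spec:
  targetPortNumber: 8000        # port your vLLM Pods listen on
  selector:
    app: vllm                   # matches the vLLM Deployment's Pod labels
  extensionRef:
    name: vllm-endpoint-picker  # hypothetical endpoint-picker Service
---
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferenceModel
metadata:
  name: qwen-model
spec:
  modelName: qwen-chat          # model name clients send in the request body
  criticality: Critical
  poolRef:
    name: vllm-pool
```

An HTTPRoute can then reference the InferencePool as a backend so the gateway routes each request to the least-loaded replica serving the requested model.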
May 2025
Product | Feature | Description | Region | Related documentation |
Container Service for Kubernetes | Support for Kubernetes 1.33 | ACK supports Kubernetes 1.33. You can create new clusters that run version 1.33 or upgrade existing clusters to this version. | All regions | |
Default installation of ack-ram-authenticator component | Starting with Kubernetes 1.33, new ACK managed clusters automatically install the latest version of the managed ack-ram-authenticator component, without consuming additional cluster node resources. | All regions | ||
containerd 2.1.1 is available | containerd 2.1.1 introduces new features, such as the Node Resource Interface (NRI), Container Device Interface (CDI), and Sandbox API. | All regions | ||
Support for ossfs 2.0 | ossfs 2.0 is a Filesystem in User Space (FUSE)-based client that lets you mount an OSS bucket as a local file system. This allows application containers to access data in OSS using standard POSIX file operations. Compared to ossfs 1.0, ossfs 2.0 delivers improved sequential read/write performance and higher throughput for concurrent small-file reads. It is ideal for workloads that require high storage access performance, such as AI training and inference, big data processing, and autonomous driving. | All regions | ||
ACK One | Use ApplicationSet to coordinate multi-environment deployments and application dependencies | This new best practice guide demonstrates how to combine the Progressive Syncs feature of Argo CD with the multi-environment orchestration capabilities of ApplicationSet. Learn to build an automated deployment system that manages application dependencies between development and pre-production environments. | All regions | Use ApplicationSet to coordinate multi-environment deployments and application dependencies |
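The Progressive Syncs pattern in the ApplicationSet entry above orders environment rollouts by Application labels. A condensed sketch (the repository URL and paths are placeholders):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: demo-envs
spec:
  generators:
    - list:
        elements:
          - env: dev
          - env: staging
  strategy:
    type: RollingSync            # sync groups in order; stop on failure
    rollingSync:
      steps:
        - matchExpressions:
            - key: env
              operator: In
              values: [dev]      # dev must sync successfully first
        - matchExpressions:
            - key: env
              operator: In
              values: [staging]
  template:
    metadata:
      name: 'app-{{env}}'
      labels:
        env: '{{env}}'
    spec:
      project: default
      source:
        repoURL: https://example.com/demo.git   # placeholder repository
        targetRevision: main
        path: 'overlays/{{env}}'
      destination:
        server: https://kubernetes.default.svc
        namespace: 'app-{{env}}'
```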
April 2025
Product | Feature | Description | Region | Related documentation |
Container Service for Kubernetes | Create and manage Lingjun node pools | You can create and manage a Lingjun node pool in an ACK managed Pro cluster. | All regions | |
Configure a node pool by specifying instance attributes | You can configure a node pool by specifying instance attributes, such as the number of vCPUs and the amount of memory. The node pool then automatically selects suitable instance types that meet your requirements during scale-out, improving the success rate of scaling operations. | All regions | ||
Real-time AI Profiling | AI Profiling is a non-intrusive performance analysis tool that uses eBPF and dynamic process injection to perform online diagnostics of container processes running GPU tasks in Kubernetes. You can dynamically attach and detach the profiler to perform real-time analysis of live services without modifying your application code. | All regions | ||
Enable preemption | When cluster resources are tight, high-priority tasks may fail to run. After you enable preemption, ACK Scheduler simulates scheduling decisions and evicts low-priority Pods to free compute resources, allowing high-priority tasks to start faster. | All regions | ||
Access services through Gateway with Inference Extension | The Gateway with Inference Extension component is built on the Envoy Gateway project. It supports all the basic capabilities of the Gateway API and the extended resources of the open-source Envoy Gateway. | All regions | ||
Generative AI service enhancements | You can use the Gateway with Inference Extension component to implement features such as intelligent routing, efficient traffic management, canary releases for generative AI inference services, circuit breaking for inference requests, and traffic mirroring. | All regions | ||
Back up and restore volumes from PVC to PVC | You can back up and restore cloud disk data within the same ACK cluster, or between ACK clusters in the same or different regions. After a backup completes in the source cluster, you can use the Backup Center to restore the data to a new set of persistent volume claims and their corresponding volumes in the same or a different cluster. The restored volumes can be mounted directly without modifying any workload YAML configurations. | All regions | ||
alibabacloud-privateca-issuer released | AlibabaCloud Private CA Issuer is now available. It lets you create and manage Alibaba Cloud PCA certificates in your cluster by using cert-manager. The issuer is available in the ACK App Market. | All regions | None | |
Deploy a workload and implement load balancing in an ACK managed cluster (smart managed mode) | Learn how to deploy a workload in an ACK managed cluster (smart managed mode) and expose it to the internet by using an ALB Ingress. Once the steps are complete, you can access the application through a specified domain name for efficient management and load balancing of external traffic. | All regions | ||
Datapath V2 best practices | Learn how to optimize the network configuration of your cluster after you enable Datapath V2 in a cluster that uses the Terway network plugin. This includes configuring Conntrack parameters and managing Identity resources to improve cluster performance and stability. | All regions | ||
Dify component upgrade guide | Upgrade ack-dify from an earlier version to v1.0.0 or later. The process includes backing up data, installing the plugin migration tool, and enabling the new plugin ecosystem. | All regions | ||
ACK One | Use PrivateLink to resolve IP conflicts in a data center network | After you connect a Kubernetes cluster in your data center to an ACK One registered cluster over a leased line, IP address conflicts may occur when you use Serverless computing resources if other services in your internal network use the same CIDR block. You can use PrivateLink to resolve these IP address conflicts. | All regions | Use PrivateLink to resolve IP conflicts in a data center network |
Schedule ACS Pods across regions | An ACK One registered cluster can seamlessly integrate Serverless computing resources from multiple regions into a Kubernetes cluster. This enables dynamic scheduling and unified management of GPU resources across regions. | All regions | ||
Log collection | You can configure log collection by using SLS CRDs or environment variables to automatically collect container logs with Alibaba Cloud Log Service (SLS). | All regions | ||
ACK Edge | Version 1.32 released | Version 1.32 is now supported. This version optimizes requests from CoreDNS, kube-proxy, and kubelet to the kube-apiserver, reducing cloud-to-edge communication traffic, among other improvements. | All regions | |
Network element configuration in a leased line environment | You can connect server devices from your on-premises data center to ACK for containerized management over the internet or a leased line. When connecting over a leased line, you must first configure the network elements of your infrastructure. | All regions | ||
AI Engineering Suite | HistoryServer component support | The native Ray Dashboard is only available while a cluster is running, preventing access to historical logs and monitoring data after the cluster is terminated. The RayCluster HistoryServer solves this by collecting node logs in real time and persisting them to OSS while the cluster is running. | All regions | |
KubeRay component support | You can deploy the KubeRay Operator component and integrate it with Alibaba Cloud SLS and Prometheus for monitoring. This enhances log management, system observability, and high availability. | All regions |
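Preemption, described in the April entries above, is driven by standard Kubernetes Pod priorities; ACK Scheduler adds the simulated scheduling and eviction step. A minimal sketch of the priority setup (names and values are illustrative):

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000                        # higher value wins during preemption
globalDefault: false
preemptionPolicy: PreemptLowerPriority
description: "Critical jobs that may evict lower-priority Pods when resources are tight."
---
apiVersion: v1
kind: Pod
metadata:
  name: critical-job
spec:
  priorityClassName: high-priority    # reference the class above
  containers:
    - name: job
      image: alpine:3.20              # placeholder image
      command: ["sleep", "3600"]
```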
March 2025
Product | Feature | Description | Region | Related documentation |
Container Service for Kubernetes | ACK managed Pro clusters support smart managed mode | When you create an ACK managed cluster, you can enable smart managed mode to quickly provision a Kubernetes cluster that follows best practices. After the cluster is created, ACK automatically provisions a smart managed node pool. This node pool dynamically scales based on workload demand. ACK also handles all operational tasks for this node pool, including OS version upgrades, software updates, and security patching. | All regions | |
Enable tracing for control plane and data plane components | After you enable tracing for the cluster API Server or kubelet, trace data automatically flows to Managed Service for OpenTelemetry. This integration provides detailed trace visualizations, real-time topology maps, and other monitoring data. | All regions | ||
High-risk KubeConfig SMS and email notifications | You can now receive SMS and email alerts for high-risk KubeConfig files that still pose a security risk after deletion. | All regions | None | |
Intelligent routing and traffic management with ACK Gateway with Inference Extension | Use the ACK Gateway with Inference Extension component to enable intelligent routing and efficient traffic management for your inference services. | All regions | Implement intelligent routing and traffic management with ACK Gateway with Inference Extension | |
ACK One (Distributed Cloud Container Platform) | Unified component management for multi-cluster fleets | ACK One fleets provide cluster operators a unified, automated way to manage components. You can define baselines that include multiple components and their versions, and then deploy these baselines to multiple clusters. The feature also supports component configuration, batch deployments, and rollbacks to improve system stability. | All regions | |
Dynamic distribution and rescheduling | An ACK One fleet can use a PropagationPolicy to distribute workload replicas across member clusters based on their available resources. By default, the fleet's rescheduling feature automatically checks every two minutes for unschedulable Pods. If a Pod remains in this state for more than 30 seconds, the feature triggers a rescheduling of that replica. | All regions | ||
Cloud-native AI Suite | Set Slurm queue priorities | This new best practice guide explains how to configure queue policies in a Slurm environment for optimal task scheduling and performance when jobs are submitted or their states change. | All regions |
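Dynamic distribution in the March ACK One entry above is expressed through a PropagationPolicy. A hedged sketch (the `policy.one.alibabacloud.com/v1alpha1` API group is an assumption based on ACK One's Karmada-compatible API, and the workload name is illustrative; verify against the linked documentation):

```yaml
apiVersion: policy.one.alibabacloud.com/v1alpha1   # assumed ACK One fleet API group
kind: PropagationPolicy
metadata:
  name: nginx-propagation
spec:
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      name: nginx                     # workload to distribute
  placement:
    replicaScheduling:
      replicaSchedulingType: Divided            # split replicas across member clusters
      replicaDivisionPreference: Weighted
      weightPreference:
        dynamicWeight: AvailableReplicas        # weight clusters by free capacity
```

With this policy, the rescheduling feature described above moves replicas that stay unschedulable to member clusters that still have capacity.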
February 2025
Product | Feature | Description | Region | Related documentation |
Container Service for Kubernetes | Support for modifying control plane security groups and time zones | If the security group or time zone selected during cluster creation no longer meets your requirements, you can modify the control plane security group and the cluster time zone on the cluster's Basic Information page. | All | |
Node pools support custom containerd configurations | You can customize containerd parameters for nodes in a node pool. For example, you can configure multiple mirror repositories for a specific image registry or bypass security certificate verification for a specific registry. | All | Customize containerd parameter configurations for a node pool | |
Elasticity strength indicator for node pools | A node pool scale-out may fail due to insufficient instance inventory or an unsupported instance type in the selected availability zone. The elasticity strength indicator helps you assess the availability of your node pool configuration and the health of the instance supply, and provides configuration recommendations. | All | ||
Support for batch task orchestration | Argo Workflows is a Kubernetes-native workflow engine that orchestrates parallel tasks by using directed acyclic graphs (DAGs). | All | |
GPU fault detection | The ack-node-problem-detector component provided by ACK is an enhanced version of the open-source node-problem-detector and supports the detection of GPU faults on cluster nodes. | All | |
ACK One | Schedule and distribute multi-cluster Spark jobs based on actual remaining resources | This topic describes how to use an ACK One fleet and the ACK Koordinator component to schedule and distribute multi-cluster Spark jobs based on actual remaining resources, rather than requested resources. This approach maximizes idle resource utilization across multiple clusters and uses priority control with hybrid deployment of online and offline workloads to ensure the stability of online services. | All | Schedule and distribute multi-cluster Spark jobs based on actual remaining resources |
ACK Edge | Support for adding pod vSwitches | In ENS edge scenarios, if an ACK Edge cluster uses the Terway Edge plugin, you can add a new pod vSwitch to increase the number of available IP addresses for the cluster. This is useful when the existing vSwitch runs out of IP addresses or you need to expand the pod CIDR block. | All | |
GPU resource monitoring | An ACK Edge cluster can manage GPU nodes in data centers and at the edge, providing unified management of heterogeneous compute power across multiple regions and environments. You can integrate Prometheus monitoring with your ACK Edge cluster to provide your on-premises and edge GPU nodes with the same level of observability as your cloud resources. | All | Best practices for monitoring GPU resources of an ACK Edge cluster | |
Cloud Native AI Suite | Deploy a DeepSeek distilled model inference service on ACK | This topic explains how to use KServe to deploy a production-ready DeepSeek distilled model inference service on ACK. | All ||
Tutorial: Deploy a full-parameter DeepSeek inference service on ACK using multi-machine distributed deployment | This tutorial provides a multi-machine distributed deployment solution for serving a large-scale, full-parameter DeepSeek model on ACK. | All ||
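The batch task orchestration entry above relies on Argo Workflows, which represents each workflow as a Kubernetes custom resource and runs every step in its own container. As a rough sketch (the resource names, image, and the two parallel echo steps are illustrative, not taken from the release note), a minimal parallel workflow manifest can be built programmatically and then submitted with kubectl:

```python
import json

def make_parallel_workflow(name: str, image: str, commands: list[str]) -> dict:
    """Build a minimal Argo Workflow manifest whose steps run in parallel.

    Argo Workflows models a workflow as a Kubernetes custom resource
    (group argoproj.io, kind Workflow); each step runs in a container.
    """
    # One container template per command.
    templates = [
        {
            "name": f"task-{i}",
            "container": {"image": image, "command": ["sh", "-c", cmd]},
        }
        for i, cmd in enumerate(commands)
    ]
    entrypoint = {
        "name": "main",
        # A single inner list is one parallel step group: Argo schedules
        # all steps in the group concurrently.
        "steps": [[{"name": f"run-{i}", "template": f"task-{i}"}
                   for i in range(len(commands))]],
    }
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": f"{name}-"},
        "spec": {"entrypoint": "main", "templates": [entrypoint] + templates},
    }

manifest = make_parallel_workflow("batch-demo", "alpine:3.19",
                                  ["echo step-a", "echo step-b"])
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed manifest can be applied directly with `kubectl create -f -` against a cluster where the Argo Workflows CRDs are installed.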
January 2025
Product | Feature | Description | Region | Related documentation |
Container Service for Kubernetes | On-demand image acceleration for node pools | ACK supports on-demand loading of container images, powered by DADI (Data Accelerator for Disaggregated Infrastructure). This technology eliminates full image downloads and decompresses data on the fly, significantly reducing application startup time. | All Regions | Accelerate container startup by using on-demand image loading |
Support for Alibaba Cloud Linux 3 Container Optimized Edition | Alibaba Cloud Linux 3 Container Optimized Edition (Alibaba Cloud Linux 3.2104 LTS 64-bit Container Optimized Edition) is an image based on the standard Alibaba Cloud Linux image and optimized for containerized environments. Drawing on extensive customer experience with ACK, Alibaba Cloud developed this cloud-native operating system to meet the demands of container scenarios, such as higher deployment density, faster startup speeds, and enhanced security isolation. | All Regions | ||
Support for Kubernetes 1.32 | ACK supports Kubernetes 1.32. You can create new clusters that run version 1.32 or upgrade existing clusters to this version. | All Regions | ||
Improve resource utilization with ElasticQuotaTree and task queues | ack-kube-queue, ElasticQuotaTree, and ack-scheduler enable fair and isolated resource allocation, allowing different teams and tasks to share compute resources within a cluster. | All Regions | N/A | |
Best practice: Fine-grained resource control with resource groups | For more efficient management, organize your ACK resources into resource groups based on dimensions like department, project, or environment. By combining resource groups with Resource Access Management (RAM), you can implement resource isolation and fine-grained permission management within a single Alibaba Cloud account. | All Regions | ||
ACK One | Connect ACK One registered clusters to ACS compute power | An ACK One registered cluster can use container compute power from ACS. | All Regions | |
Cross-cluster service access using native service domain names | ACK One uses MultiClusterService to enable cross-cluster access through native service domain names. You can route traffic across clusters by using the native service directly, without modifying your application code, pod DNS configurations, or CoreDNS settings. | All Regions | Access services across clusters by using native service domain names | |
Access multi-cluster resources using the Go SDK | Use the Go SDK to integrate an ACK One fleet into your platform and access member cluster resources. | All Regions | ||
ACK Edge | Cloud node scaling | When on-premises node resources are insufficient, the auto-scaling feature scales out cloud-based nodes for your ACK Edge cluster to increase scheduling capacity. | All Regions | |
Deploy elastic inference services for LLMs in a hybrid cloud | By installing the ack-kserve component and using the cloud elasticity of ACK Edge clusters, you can deploy elastic LLM inference services in a hybrid cloud. This allows you to flexibly schedule on-premises and cloud resources, reducing the operational costs of your LLM inference services. | All Regions | ||
GPU sharing and scheduling | GPU sharing allows multiple pods to share the compute resources of a single GPU card. This improves GPU utilization and reduces costs. | All Regions ||
Centrally manage ECS resources across regions | This best practice shows how to use an ACK Edge cluster to centrally manage compute resources distributed across different regions. This approach enables full lifecycle management and efficient resource scheduling for cloud-native applications. | All Regions |
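The GPU sharing entry above lets the scheduler place several pods on one card by having each pod request a slice of GPU memory as a Kubernetes extended resource rather than a whole device. A minimal sketch, assuming the shared-GPU extended resource name `aliyun.com/gpu-mem` (amounts in GiB) used by ACK's cGPU scheduling, with illustrative pod and image names:

```python
import json

# Extended resource consumed by ACK shared-GPU scheduling (an assumption
# based on ACK's cGPU feature); the value is GiB of GPU memory, not cards.
GPU_MEM_RESOURCE = "aliyun.com/gpu-mem"

def shared_gpu_pod(name: str, image: str, gpu_mem_gib: int) -> dict:
    """Build a pod manifest that requests a slice of one GPU's memory."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": "main",
                "image": image,
                # The scheduler bin-packs pods onto a card until the
                # card's advertised gpu-mem capacity is exhausted.
                "resources": {"limits": {GPU_MEM_RESOURCE: gpu_mem_gib}},
            }],
            "restartPolicy": "Never",
        },
    }

# For example, two 4 GiB pods can share a single 16 GiB card.
pod = shared_gpu_pod("infer-a", "registry.example.com/llm-infer:latest", 4)
print(json.dumps(pod, indent=2))
```

Requesting `aliyun.com/gpu-mem` instead of `nvidia.com/gpu` is what allows more than one pod per physical GPU; the isolation between co-located pods is enforced by the node-side cGPU runtime, not by this manifest.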