New Features

Container Service for Kubernetes (ACK) - Node Auto-recovery in ACK Managed Pro Clusters Now Covers GPU-accelerated Nodes

The node auto-recovery feature for ACK managed Pro clusters now includes automatic GPU fault recovery, reducing the O&M costs of managing GPU-accelerated nodes.
Content

Intended customers: all users. Details: We have enhanced the node auto-recovery feature for managed node pools to provide a fully automated, end-to-end workflow for handling GPU software and hardware anomalies. This new capability includes: automatic fault detection, notifications and alerts, automatic isolation of faulty GPU cards or entire nodes, and automatic remediation. To ensure you have full control, remediation actions can be configured to require user authorization before execution. This authorization step can be integrated with your own business platforms for seamless workflow management. This new feature minimizes the business impact of GPU failures and significantly lowers your O&M costs.

7th Gen ECS Is Now Available

Increase instance computing power by up to 40% and Fully equipped with TPM chips.
Powered by Third-generation Intel® Xeon® Scalable processors (Ice Lake).

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.