Container Service for Kubernetes: Ray on ACK

Last Updated: Feb 28, 2026

Ray is an open-source unified framework for scaling AI and Python applications. It simplifies distributed computing by providing an API that lets developers write parallel processing and distributed workloads without managing complex infrastructure. Ray supports multiple programming paradigms, including parallel processing, the actor model, and distributed object storage, and is widely adopted in the machine learning industry.

The Ray computing framework consists of three layers: Ray AI Libraries, Ray Core, and Ray clusters. For more information about Ray, see Ray.

KubeRay overview

KubeRay is an open-source Kubernetes operator that simplifies the deployment and management of Ray applications on Kubernetes. It provides a declarative Kubernetes API and three custom resources (RayCluster, RayJob, and RayService) for running Ray workloads.

Container Service for Kubernetes (ACK) was among the first services in the world to participate in the Certified Kubernetes Conformance Program. ACK provides high-performance management of containerized applications and supports lifecycle management for enterprise-class containerized applications. You can use KubeRay to create Ray clusters in ACK clusters in the same way you create ACK clusters in the cloud.

Ray on ACK integrates with Alibaba Cloud services to extend your Ray clusters with additional capabilities:

Alibaba Cloud service              Capability
Simple Log Service                 Log management
Managed Service for Prometheus     Observability
Tair (Redis OSS-Compatible)        Improved availability
Ray autoscaler + ACK autoscaler    On-demand resource scaling
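
As an illustration of the Ray autoscaler side of on-demand scaling, the following minimal sketch builds a RayCluster manifest as a Python dictionary. It assumes the open-source KubeRay fields enableInTreeAutoscaling, minReplicas, and maxReplicas; the cluster name, group name, and container image are hypothetical, and the settings that ACK itself recommends may differ.

```python
import json


def autoscaling_raycluster(name, min_workers, max_workers,
                           image="rayproject/ray:2.9.0"):
    """Build a RayCluster manifest with the Ray autoscaler enabled.

    The image tag and the group name "workers" are placeholders.
    """
    if not 0 <= min_workers <= max_workers:
        raise ValueError("expected 0 <= min_workers <= max_workers")

    def template(container_name):
        # Minimal pod template: a single Ray container.
        return {"spec": {"containers": [{"name": container_name,
                                         "image": image}]}}

    return {
        "apiVersion": "ray.io/v1",
        "kind": "RayCluster",
        "metadata": {"name": name},
        "spec": {
            # Asks KubeRay to run the Ray autoscaler alongside the head pod.
            "enableInTreeAutoscaling": True,
            "headGroupSpec": {"rayStartParams": {},
                              "template": template("ray-head")},
            "workerGroupSpecs": [{
                "groupName": "workers",
                # Start at the lower bound; the autoscaler adds workers
                # up to maxReplicas when the workload needs them.
                "replicas": min_workers,
                "minReplicas": min_workers,
                "maxReplicas": max_workers,
                "rayStartParams": {},
                "template": template("ray-worker"),
            }],
        },
    }


scaled = autoscaling_raycluster("demo-autoscaling", min_workers=1, max_workers=5)
print(json.dumps(scaled, indent=2))
```

The Ray autoscaler only changes the number of worker pods. Whether new nodes are added for those pods depends on the node scaling configured in the ACK cluster.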

For more information about how to install the KubeRay operator, see Install Kuberay-Operator.

Kuberay-Operator

To deploy and manage Ray clusters quickly, we recommend that you install KubeRay in your ACK cluster from the Add-ons page of the ACK console. ACK provides the Kuberay-Operator component, which is based on the open-source KubeRay operator. This component enables your Ray clusters to use the following ACK capabilities:

  • Scheduling

  • Elastic quotas

  • Priority-based resource scheduling

  • Integration with Simple Log Service, Managed Service for Prometheus, and Object Storage Service (OSS)

After you install Kuberay-Operator from the Add-ons page of the ACK console, ACK automatically installs and manages Kuberay-Operator. In addition, ACK creates the RayCluster, RayJob, and RayService custom resources on the data plane of the cluster.

Custom resources

RayCluster

A RayCluster creates a Ray cluster on pods in an ACK cluster. Each Ray cluster consists of a head pod and several worker pods. For more information about the RayCluster custom resource, see RayCluster Configuration.
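
To make the head pod and worker pod layout concrete, here is a minimal sketch that assembles a RayCluster manifest as a Python dictionary and prints it as JSON, which kubectl also accepts. The field names follow the open-source KubeRay ray.io/v1 API as an assumption; the resource name, group name, and image are hypothetical.

```python
import json

RAY_IMAGE = "rayproject/ray:2.9.0"  # placeholder: use an image you maintain


def pod_template(container_name, image=RAY_IMAGE):
    """Return a minimal pod template with one Ray container."""
    return {"spec": {"containers": [{"name": container_name, "image": image}]}}


def build_raycluster(name, worker_replicas):
    """Describe one head pod plus a single group of worker pods."""
    return {
        "apiVersion": "ray.io/v1",
        "kind": "RayCluster",
        "metadata": {"name": name},
        "spec": {
            "headGroupSpec": {
                "rayStartParams": {},
                "template": pod_template("ray-head"),
            },
            "workerGroupSpecs": [{
                "groupName": "default-group",
                "replicas": worker_replicas,
                "rayStartParams": {},
                "template": pod_template("ray-worker"),
            }],
        },
    }


def expected_pod_count(manifest):
    """One head pod plus the replicas of every worker group.

    Assumes each worker replica maps to exactly one pod.
    """
    workers = sum(g["replicas"] for g in manifest["spec"]["workerGroupSpecs"])
    return 1 + workers


cluster = build_raycluster("demo-raycluster", worker_replicas=2)
print(json.dumps(cluster, indent=2))
print("pods expected:", expected_pod_count(cluster))
```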

RayJob

A RayJob (in K8sJobMode) manages a RayCluster and a Kubernetes batch job. The RayCluster builds a Ray cluster on Kubernetes pods to provide computing resources, and the Kubernetes batch job runs the ray job submit command to submit a Ray job to that RayCluster. For more information about the RayJob custom resource, see RayJob Configuration.
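
The relationship between a RayJob and its embedded cluster specification can be sketched as follows. This is an illustration that assumes the open-source KubeRay field names submissionMode, entrypoint, shutdownAfterJobFinishes, and rayClusterSpec; the job name, script path, and image are hypothetical.

```python
import json


def minimal_cluster_spec(image="rayproject/ray:2.9.0"):
    """Smallest useful cluster spec: a head pod and one worker pod."""
    def template(container_name):
        return {"spec": {"containers": [{"name": container_name,
                                         "image": image}]}}
    return {
        "headGroupSpec": {"rayStartParams": {},
                          "template": template("ray-head")},
        "workerGroupSpecs": [{"groupName": "workers", "replicas": 1,
                              "rayStartParams": {},
                              "template": template("ray-worker")}],
    }


def build_rayjob(name, entrypoint, ray_cluster_spec):
    """Describe a RayJob that creates its own RayCluster."""
    if not entrypoint.strip():
        raise ValueError("entrypoint must be a non-empty command")
    return {
        "apiVersion": "ray.io/v1",
        "kind": "RayJob",
        "metadata": {"name": name},
        "spec": {
            # A Kubernetes batch job submits the entrypoint to the cluster.
            "submissionMode": "K8sJobMode",
            "entrypoint": entrypoint,
            # Delete the RayCluster once the job has finished.
            "shutdownAfterJobFinishes": True,
            "rayClusterSpec": ray_cluster_spec,
        },
    }


# /home/ray/train.py is a hypothetical script baked into the image.
job = build_rayjob("demo-rayjob", "python /home/ray/train.py",
                   minimal_cluster_spec())
print(json.dumps(job, indent=2))
```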

RayService

A RayService manages a RayCluster and Ray Serve applications. The RayCluster builds a Ray cluster on Kubernetes pods to provide computing resources. The Ray Serve applications are deployed in the Ray cluster for model deployment and inference.
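
A RayService pairs a Ray Serve configuration with a cluster configuration. The sketch below assumes the open-source KubeRay fields serveConfigV2 (a string) and rayClusterConfig; the application name, the import path my_app.model:app, and the image are hypothetical. The Serve configuration is emitted as JSON, which should also parse as YAML.

```python
import json


def build_rayservice(name, serve_apps, ray_cluster_config):
    """Describe a RayService from Serve applications and a cluster config."""
    # serveConfigV2 is a string field, so serialize the Serve config first.
    serve_config = json.dumps({"applications": serve_apps}, indent=2)
    return {
        "apiVersion": "ray.io/v1",
        "kind": "RayService",
        "metadata": {"name": name},
        "spec": {
            "serveConfigV2": serve_config,
            "rayClusterConfig": ray_cluster_config,
        },
    }


image = "rayproject/ray:2.9.0"  # placeholder image
cluster_config = {
    "headGroupSpec": {
        "rayStartParams": {},
        "template": {"spec": {"containers": [
            {"name": "ray-head", "image": image}]}},
    },
    "workerGroupSpecs": [{
        "groupName": "serve-workers",
        "replicas": 1,
        "rayStartParams": {},
        "template": {"spec": {"containers": [
            {"name": "ray-worker", "image": image}]}},
    }],
}

apps = [{
    "name": "demo-app",
    "import_path": "my_app.model:app",  # hypothetical module and Serve app
    "route_prefix": "/",
    "deployments": [{"name": "Model", "num_replicas": 2}],
}]

service = build_rayservice("demo-rayservice", apps, cluster_config)
print(json.dumps(service, indent=2))
```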

Shared responsibilities for Ray on ACK

When you use KubeRay to run Ray workloads in ACK clusters, Alibaba Cloud and you share responsibility for the security and management of the environment.

Alibaba Cloud responsibilities

Kuberay-Operator is managed by ACK. ACK provides security protection for Kuberay-Operator:

  • Ensures that images used by Kuberay-Operator comply with security hardening standards to prevent potential vulnerabilities.

  • Ensures the stability and availability of Kuberay-Operator.

  • Maintains the Kuberay-Operator versions to ensure availability.

  • Enables management of the RayCluster, RayJob, and RayService custom resources for Kuberay-Operator.

Customer responsibilities

When you use the RayCluster, RayJob, and RayService custom resources to deploy and manage Ray clusters and applications in ACK clusters, you are responsible for the security protection and configuration updates of Ray applications.

  • Follow the best practices for Ray cluster protection.

  • Update and maintain the container images used by Ray head pods and worker pods.

  • Update and maintain the Ray versions of Ray head pods and worker pods.

  • Configure resource requirements for Ray clusters, including CPU, GPU, and memory.

  • Monitor the status of Ray applications and ensure their availability.
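
As an example of the resource configuration item in the list above, the following sketch builds the resources block for a Ray container. It assumes requests are set equal to limits, a common choice for Ray pods rather than an ACK requirement, and it assumes GPUs are exposed under the nvidia.com/gpu resource name. The helper name is hypothetical.

```python
def ray_container_resources(cpu, memory, gpus=0):
    """Return a Kubernetes resources block with requests equal to limits.

    cpu:    number of CPU cores, for example 4
    memory: Kubernetes quantity string, for example "16Gi"
    gpus:   whole number of GPUs; omitted from the block when 0
    """
    if cpu <= 0:
        raise ValueError("cpu must be positive")
    if gpus < 0 or int(gpus) != gpus:
        raise ValueError("gpus must be a non-negative whole number")

    amounts = {"cpu": str(cpu), "memory": memory}
    if gpus:
        # Assumes the NVIDIA device plugin resource name.
        amounts["nvidia.com/gpu"] = str(int(gpus))

    # Copy the dict so requests and limits can be edited independently later.
    return {"requests": dict(amounts), "limits": dict(amounts)}


head_resources = ray_container_resources(cpu=2, memory="8Gi")
worker_resources = ray_container_resources(cpu=8, memory="32Gi", gpus=1)
print(head_resources)
print(worker_resources)
```

The returned dictionary can be placed under the resources key of the head or worker container in a RayCluster pod template.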

For more information, see Shared responsibility model.