Spark on Container Service for Kubernetes (ACK) is a solution that runs Spark on Kubernetes and builds on the enterprise-level container application management capabilities of ACK, allowing you to quickly build an efficient, flexible, and scalable Spark big data processing platform.
Introduction to Spark on ACK
Apache Spark is a compute engine designed for large-scale data processing and is widely used in scenarios such as data analysis and machine learning. Since version 2.3, Spark has supported submitting jobs to Kubernetes clusters (Running Spark on Kubernetes).
Spark Operator is an Operator designed for running Spark workloads on Kubernetes clusters. It automates the management of the Spark job lifecycle in a way native to Kubernetes, including configuration, submission, and retry processes.
The Spark on ACK solution customizes and enhances components such as Spark Operator while remaining compatible with the open-source versions. It integrates with the Alibaba Cloud ecosystem to provide capabilities such as log retention, Object Storage Service (OSS) access, and observability, allowing you to quickly build a flexible, efficient, and scalable big data processing platform.
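To illustrate how Spark Operator manages a job in a Kubernetes-native way, the following is a minimal SparkApplication sketch. The namespace, image address, application JAR path, and service account name are example values, not fixed requirements of the solution.

```yaml
# Minimal SparkApplication manifest handled by Spark Operator.
# Namespace, image, JAR path, and service account are example values.
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi
  namespace: spark
spec:
  type: Scala
  mode: cluster
  image: registry.example.com/spark:3.5.0   # your custom or community Spark image
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-3.5.0.jar
  sparkVersion: "3.5.0"
  driver:
    cores: 1
    memory: 2g
    serviceAccount: spark-operator-spark    # example service account with pod-management permissions
  executor:
    instances: 2
    cores: 2
    memory: 4g
  restartPolicy:
    type: Never
```

After you apply such a manifest with kubectl, the operator submits the job, creates the driver and executor pods, and keeps the SparkApplication status updated as the job runs, completes, or is retried.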
Features and advantages
Simplified development and operations
Portability: Enables packaging of Spark applications and their dependencies into container images for easy migration between Kubernetes clusters.
Observability: Allows job status monitoring through the Spark History Server and integrates with Simple Log Service and Managed Service for Prometheus for enhanced job observability.
Workflow orchestration: Use workflow orchestration engines such as Apache Airflow and Argo Workflows to manage Spark jobs, automate data pipeline scheduling, and ensure consistent deployments across environments. This enhances operational efficiency and reduces migration costs.
Multi-version support: Allows multiple versions of Spark jobs to run concurrently in a single ACK cluster.
Job scheduling and resource management
Job queue management: Seamlessly integrated with ack-kube-queue, this feature offers flexible management of job queues and resource quotas, automatically optimizing the allocation of resources for workloads and enhancing the utilization of cluster resources.
Multiple scheduling strategies: Leverage the existing scheduling capabilities of the ACK scheduler to support a variety of batch scheduling strategies, such as Gang Scheduling and Capacity Scheduling.
Multi-architecture scheduling: Supports hybrid use of x86 and Arm architecture Elastic Compute Service (ECS) resources to improve efficiency and reduce costs.
Multi-cluster scheduling: Utilize the ACK One multi-cluster fleet to distribute Spark jobs across various clusters, enhancing resource utilization across multiple clusters.
Elastic computing power supply: Offers customizable resource priority scheduling and the integration of various elastic solutions, such as node autoscaling and instant elasticity. It also allows for the use of Elastic Container Instance and Alibaba Cloud Container Compute Service (ACS) computing resources without maintaining Elastic Compute Service instances, enabling on-demand and flexible scaling options.
Colocation of multiple types of workloads: Seamlessly integrated with ack-koordinator, this feature supports the colocation of various types of workloads, thereby enhancing the utilization of cluster resources.
Performance and stability optimization
Shuffle performance optimization: Configures Spark jobs to use Celeborn as the Remote Shuffle Service, achieving storage-compute separation and enhancing Shuffle performance and stability.
Data access acceleration: Utilizes the data orchestration and access acceleration capabilities provided by Fluid to speed up data access for Spark jobs, thereby improving performance.
Overall architecture
The architecture of Spark on ACK allows for quick job submission through Spark Operator and leverages the observability, scheduling, and resource elasticity features of ACK and Alibaba Cloud products.
Client: Submit Spark jobs to the ACK cluster using command line tools like kubectl and Arena.
Workflow: Orchestrate and submit Spark jobs to the ACK cluster using frameworks such as Apache Airflow and Argo Workflows.
Observability: Enhance the observability of your system by using Spark History Server, Simple Log Service, and Managed Service for Prometheus. This includes monitoring job statuses, along with collecting and analyzing job logs and metrics.
Spark Operator: Automate Spark job lifecycle management, including configuration, submission, and retry.
Remote Shuffle Service (RSS): Use Apache Celeborn as RSS to enhance the performance and stability of Spark jobs.
Cache: Employ Fluid as a distributed cache system for data access and acceleration.
Cloud infrastructure: During job execution, Spark jobs use Alibaba Cloud infrastructure, including computing resources such as ECS instances, elastic container instances, and ACS clusters; storage resources such as disks, File Storage NAS (NAS) file systems, and Object Storage Service (OSS) buckets; and network resources such as elastic network interfaces (ENIs), virtual private clouds (VPCs), and Server Load Balancer (SLB) instances.
Billing overview
The installation of components for running Spark jobs in an ACK cluster is free. However, the costs for the ACK cluster itself, including cluster management fees and associated cloud product fees, are charged as usual. For more information, see Billing overview.
Additional cloud product fees, such as those for collecting logs with Simple Log Service or for reading and writing data in OSS/NAS by Spark jobs, are charged by each respective cloud product. Refer to the operation documents below for further details.
Getting started
Running Spark jobs in an ACK cluster typically involves several groups of tasks, covering basic usage, observability, and performance optimization. You can select and configure them according to your needs.
Basic usage
| Process | Description |
| --- | --- |
| Build a Spark container image | You can use the Spark container image provided by the open-source community directly, or customize an image based on it and push the result to your own image repository. An example Dockerfile is sketched after this table. You can modify the Dockerfile as needed, for example by replacing the Spark base image or adding dependent JAR packages, then build the image and push it to your image repository. |
| Create a dedicated namespace | Create one or more dedicated namespaces in which to run Spark jobs. |
| Use Spark Operator to run Spark jobs | Deploy the ack-spark-operator component, then define and submit SparkApplication resources to run Spark jobs. For more information, see Use Spark Operator to run Spark jobs. |
| Read and write OSS data | Spark jobs can access Alibaba Cloud OSS data in several ways, including the Hadoop Aliyun SDK, the Hadoop AWS SDK, and JindoSDK. Depending on the SDK you choose, include the corresponding dependencies in the Spark container image and configure the Hadoop-related parameters in the Spark job, as sketched after this table. For more information, see Read and write OSS data in Spark jobs. |
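The Dockerfile example referenced in the table above is sketched below, assuming you build on the community Spark image; the base image tag, JAR paths, and user are assumptions that you can adjust.

```dockerfile
# Sketch of a custom Spark image based on the community image.
# The base image tag and the JAR paths are example values.
FROM spark:3.5.0

# Switch to root to add files to the Spark installation directories.
USER root

# Add dependent JAR packages (for example, an OSS/Hadoop connector) to the Spark classpath.
COPY jars/*.jar /opt/spark/jars/

# Add your application JAR.
COPY target/my-spark-app.jar /opt/spark/examples/jars/

# Switch back to the non-root spark user used by the community image.
USER spark
```

For reading and writing OSS data, the exact Hadoop parameters depend on the SDK you choose. The snippet below sketches the hadoopConf section of a SparkApplication for the Hadoop Aliyun SDK; the endpoint and credentials are placeholders, and in practice you would inject credentials through Secrets or RRSA rather than writing them in plain text.

```yaml
# Sketch of the hadoopConf section for OSS access through the Hadoop Aliyun SDK.
# Endpoint and credential values are placeholders.
spec:
  hadoopConf:
    fs.oss.impl: org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem
    fs.oss.endpoint: "oss-cn-hangzhou-internal.aliyuncs.com"
    fs.oss.accessKeyId: "<your-access-key-id>"
    fs.oss.accessKeySecret: "<your-access-key-secret>"
```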
Observability
| Process | Description |
| --- | --- |
| Deploy Spark History Server | Deploy the ack-spark-history-server component and mount a NAS file system to store Spark event logs. Next, mount the same NAS file system when submitting Spark jobs and configure Spark to write event logs to the same path. You can then view the jobs in Spark History Server. An example job is sketched after this table. For more information, see Use Spark History Server to view information about Spark jobs. |
| Configure Simple Log Service to collect Spark logs | When you run many Spark jobs in the cluster, we recommend using Simple Log Service to collect the stdout and stderr logs of all Spark containers for unified querying and analysis. For more information, see Use Simple Log Service to collect the logs of Spark jobs. |
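The example job mentioned in the table above is sketched here: the relevant parts of a SparkApplication that writes event logs to a path on a NAS-backed volume so that Spark History Server can read them. The PVC name and mount path are assumed values.

```yaml
# Relevant parts of a SparkApplication that writes event logs to a NAS-backed volume.
# The PVC name (spark-history-pvc) and the mount path are example values.
spec:
  sparkConf:
    spark.eventLog.enabled: "true"
    spark.eventLog.dir: "file:///mnt/nas/spark-events"
  volumes:
    - name: spark-events
      persistentVolumeClaim:
        claimName: spark-history-pvc
  driver:
    volumeMounts:
      - name: spark-events
        mountPath: /mnt/nas/spark-events
```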
Performance optimization
| Process | Description |
| --- | --- |
| Improve Shuffle performance through RSS | Shuffle is an important operation in distributed computing and is often accompanied by heavy disk I/O, data serialization, and network I/O, which can easily lead to out-of-memory (OOM) errors and shuffle fetch failures. To improve Shuffle performance and stability and the quality of computing services, you can configure Spark jobs to use Apache Celeborn as the Remote Shuffle Service (RSS), as sketched after this table. For more information, see Use Celeborn as RSS in Spark jobs. |
| Define elastic resource scheduling priority | By running pods on Elastic Container Instance and configuring appropriate scheduling policies, you can create pods on demand and pay based on actual resource usage, which reduces the cost caused by idle cluster resources. In scenarios where ECS instances and elastic container instances are mixed, you can also specify scheduling priorities. You do not need to modify scheduling-related configurations in the SparkApplication; the ACK scheduler automatically schedules pods based on the configured elastic policy. You can flexibly customize the mixed use of various elastic resources (such as ECS instances and elastic container instances) as needed. For more information, see Use elastic container instances to run Spark jobs. |
| Configure dynamic resource allocation | Dynamic resource allocation (DRA) adjusts the computing resources used by a job based on the size of its workload. You can enable dynamic resource allocation for Spark jobs to avoid long execution times caused by insufficient resources and resource waste caused by excess resources, as sketched after this table. For more information, see Configure dynamic resource allocation for Spark jobs. |
| Use Fluid to accelerate data access | If your data resides in a data center or data access becomes a performance bottleneck, you can use the data orchestration and distributed cache capabilities provided by Fluid to accelerate data access for Spark jobs. For more information, see Use Fluid to accelerate data access for Spark applications. |
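Both the Celeborn and dynamic resource allocation settings referenced in this table are ordinary Spark properties set under sparkConf in a SparkApplication. The sketch below shows the two independent sets of options together for brevity; the Celeborn master endpoint and executor bounds are placeholders, and it assumes a Celeborn cluster is already deployed and reachable in the cluster.

```yaml
# Sketch of sparkConf settings for Celeborn as the Remote Shuffle Service and for
# dynamic resource allocation (DRA). Endpoint and executor bounds are example values.
spec:
  sparkConf:
    # Use Celeborn as the shuffle manager (Celeborn 0.3 and later).
    spark.shuffle.manager: org.apache.spark.shuffle.celeborn.SparkShuffleManager
    spark.celeborn.master.endpoints: "celeborn-master-0.celeborn-master-svc.celeborn.svc.cluster.local:9097"
    spark.shuffle.service.enabled: "false"
    # Enable dynamic resource allocation; shuffle tracking is required on Kubernetes
    # when no external shuffle service is available.
    spark.dynamicAllocation.enabled: "true"
    spark.dynamicAllocation.shuffleTracking.enabled: "true"
    spark.dynamicAllocation.minExecutors: "1"
    spark.dynamicAllocation.maxExecutors: "10"
```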
References
You can monitor Spark jobs by using Managed Service for Prometheus on the Application Monitoring > Cluster Pod Monitoring tab.
For Spark job queue management and resource allocation optimization, refer to Use ack-kube-queue to manage AI and machine learning workloads.
Implement fine-grained resource quota management with ElasticQuotaTree to enhance resource utilization. For details, see Improve resource utilization by using ElasticQuotaTree and ack-kube-queue.
For multi-cluster Spark job scheduling and distribution to fully utilize idle resources across multiple ACK clusters, see Use idle resources to schedule and distribute Spark jobs in multiple clusters.