This topic compares the performance of Spark SQL queries that run on Container Service for Kubernetes (ACK) against 1 TB of data, with and without the Alluxio distributed caching service.
The following table lists the ACK cluster configurations.
|Cluster type||Standard dedicated cluster|
|Number of worker nodes||20|
- Apache Spark: 2.4.5
- Alluxio: 2.3.0
- Spark configurations
|Parameter||Value|
|spark.driver.cores||5|
|spark.driver.memory (MB)||20480|
|spark.executor.cores||7|
|spark.executor.memory (MB)||20480|
|spark.executor.instances||20|
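The Spark configurations listed above could be passed to a job as `--conf` arguments of `spark-submit`. The following is a minimal sketch under that assumption; the source does not state how the settings were applied in the test. The names `SPARK_CONF` and `to_submit_args` are hypothetical, and the `m` suffix is an assumed way of expressing the memory values, which the table gives in MB.

```python
# Values copied from the Spark configuration table above.
# The "m" suffix (MiB) is an assumption: the table lists memory in MB.
SPARK_CONF = {
    "spark.driver.cores": "5",
    "spark.driver.memory": "20480m",
    "spark.executor.cores": "7",
    "spark.executor.memory": "20480m",
    "spark.executor.instances": "20",
}


def to_submit_args(conf):
    """Flatten a config dict into a list of spark-submit --conf arguments."""
    args = []
    for key, value in conf.items():
        args.extend(["--conf", f"{key}={value}"])
    return args


if __name__ == "__main__":
    # Print the arguments as they would appear on a spark-submit command line.
    print(" ".join(to_submit_args(SPARK_CONF)))
```

With 20 executors of 7 cores each, this configuration requests 140 executor cores in total, which corresponds to one executor per worker node in the 20-node cluster if the executors are spread evenly.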
The following table lists the total time consumed by each benchmark. In each benchmark, the queries are run on 1 TB of data one after another.
|Benchmark||Total time consumed by 104 queries (Unit: minutes)|
|Spark with OSS||180|
|Spark with Alluxio Cold||145|
|Spark with Alluxio Warm||137|
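To make the comparison easier to read, the totals in the table can be converted into a relative time reduction against the OSS baseline. The following sketch only does arithmetic on the numbers reported above; the names `TOTAL_MINUTES` and `reduction_percent` are illustrative and not part of the benchmark.

```python
# Total time for 104 queries, in minutes, copied from the results table.
TOTAL_MINUTES = {
    "Spark with OSS": 180,
    "Spark with Alluxio Cold": 145,
    "Spark with Alluxio Warm": 137,
}


def reduction_percent(baseline, measured):
    """Return how much shorter `measured` is than `baseline`, in percent."""
    return (baseline - measured) / baseline * 100


if __name__ == "__main__":
    baseline = TOTAL_MINUTES["Spark with OSS"]
    for name, minutes in TOTAL_MINUTES.items():
        pct = reduction_percent(baseline, minutes)
        print(f"{name}: {minutes} min, {pct:.1f}% less time than OSS")
```

Based on these figures, the cold Alluxio run takes about 19.4% less time than reading directly from OSS, and the warm run about 23.9% less.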