You can run Spark jobs on Alibaba Cloud Container Service for Kubernetes (ACK) and
accelerate data access by using the Alluxio distributed cache. ACK provides Spark
Operator to simplify the procedure of submitting Spark jobs, and Spark History Server
to record the historical data of Spark jobs, which simplifies troubleshooting. This
topic describes how to set up a Spark runtime environment.
Background information
Perform the following steps to set up a Spark runtime environment:
1. Create an ACK cluster.
2. Create an OSS bucket.
3. Install the ack-spark-operator add-on.
4. Install the ack-spark-history-server add-on.
5. Install Alluxio.
Create an ACK cluster
The cluster configurations are subject to the following considerations:
When you set the instance type of worker nodes, select Big Data Network Performance Enhanced, and then select ecs.d1ne.6xlarge. Set the number of worker nodes to 20.
Each worker node of the ecs.d1ne.6xlarge instance type has twelve 5 TB HDDs. Before you mount the HDDs, partition and format
them. For more information, see Partition and format a data disk larger than 2 TiB.
After you partition and format the HDDs, mount them on the worker nodes of the ACK
cluster. Then, run the df -h command to query the information about the HDDs. Figure 1 shows an example of the command output.
The 12 file paths under the /mnt directory are used in the configuration file of Alluxio. ACK provides a simplified
method to mount data disks. This method reduces your workload if the cluster has a
large number of nodes. For more information, see Use LVM to manage local storage.
Figure 1. Mounted data disks
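Because each HDD is larger than 2 TiB, it must use a GPT partition table. The following is a minimal sketch of partitioning, formatting, and mounting a single HDD. The device name /dev/vdb and the mount point /mnt/disk1 are examples; repeat the procedure for each of the 12 disks, using /mnt/disk1 through /mnt/disk12.
# Create a GPT partition table. GPT is required for disks larger than 2 TiB.
parted -s /dev/vdb mklabel gpt
# Create a single partition that spans the whole disk.
parted -s /dev/vdb mkpart primary 0% 100%
# Format the partition with ext4.
mkfs.ext4 /dev/vdb1
# Mount the partition and verify the result.
mkdir -p /mnt/disk1
mount /dev/vdb1 /mnt/disk1
df -h /mnt/disk1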
Create an OSS bucket
You must create an Object Storage Service (OSS) bucket to store data, including the
test data generated by TPC-DS, test results, and test logs. The following examples
are based on a bucket named cloudnativeai. For more information about how to create an OSS bucket, see Create buckets.
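If you prefer the command line, you can also create the bucket with the ossutil tool. A minimal sketch, assuming ossutil is installed and already configured with your AccessKey pair and endpoint:
# Create the bucket that stores TPC-DS test data, results, and logs.
ossutil mb oss://cloudnativeai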
Install the ack-spark-operator add-on
The ack-spark-operator add-on simplifies the procedure of submitting Spark jobs to a cluster.
1. Log on to the ACK console. In the left-side navigation pane, choose Marketplace > App Catalog.
2. On the App Catalog page, click ack-spark-operator.
3. On the App Catalog - ack-spark-operator page, find the Deploy section on the right, and click Create.
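After the add-on is installed, you submit a Spark job by creating a SparkApplication resource instead of running spark-submit. The following is a minimal sketch of such a manifest. The image, main application file, Spark version, and service account are placeholders that you must replace with values that are valid in your cluster:
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi                 # example job name
  namespace: default
spec:
  type: Scala
  mode: cluster
  image: YOUR-SPARK-IMAGE        # placeholder: a Spark image that your cluster can pull
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples.jar  # placeholder path inside the image
  sparkVersion: "3.1.1"          # placeholder: match the Spark version in your image
  driver:
    cores: 1
    memory: 2g
    serviceAccount: spark        # placeholder: a service account that can create executor pods
  executor:
    instances: 2
    cores: 1
    memory: 2g
Submit the job by running the kubectl apply -f spark-pi.yaml command, and query its status by running the kubectl get sparkapplication spark-pi command.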
Install the ack-spark-history-server add-on
ACK Spark History Server collects the event logs that Spark jobs generate during
execution and provides a user interface that simplifies troubleshooting.
When you install the ack-spark-history-server add-on, you must specify parameters related to the OSS bucket on the Parameters tab. The OSS bucket is used to store the historical data of Spark jobs. The installation
procedure is the same as that described in Install the ack-spark-operator add-on.
The following code block shows the parameters related to the OSS bucket:
oss:
  # Set enableOSS to true to store the event logs of Spark jobs in OSS.
  enableOSS: false
  # Please input your accessKeyId
  alibabaCloudAccessKeyId: ""
  # Please input your accessKeySecret
  alibabaCloudAccessKeySecret: ""
  # oss bucket endpoint such as oss-cn-beijing.aliyuncs.com
  alibabaCloudOSSEndpoint: ""
  # oss file path such as oss://bucket-name/path
  eventsDir: "oss://cloudnativeai/spark/spark-events"
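For job history to appear in the user interface, the Spark jobs themselves must write event logs to the same OSS path. The following is a minimal sketch of the corresponding sparkConf entries in a SparkApplication manifest, assuming the Spark image contains an OSS-compatible Hadoop client that can write to oss:// paths:
sparkConf:
  spark.eventLog.enabled: "true"
  spark.eventLog.dir: "oss://cloudnativeai/spark/spark-events"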
Run the following command to check whether ack-spark-history-server is installed:
kubectl get service ack-spark-history-server -n {YOUR-NAMESPACE}
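If the service exists, you can open the web UI from your machine with a port forward. A minimal sketch, assuming the service exposes port 18080, the default port of Spark History Server:
kubectl port-forward service/ack-spark-history-server 18080:18080 -n {YOUR-NAMESPACE}
# Then open http://localhost:18080 in a browser.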
Install Alluxio
You must use Helm to install Alluxio on ACK.
Run the following command to download the installation file of Alluxio:
wget http://kubeflow.oss-cn-beijing.aliyuncs.com/alluxio-0.6.8.tgz
tar -xvf alluxio-0.6.8.tgz
Create and configure the config.yaml file in the directory where the installation file of Alluxio is saved.
For more information about how to configure the file, see config.yaml.
The following list describes the key configurations:
Modify the following parameters based on the information of the OSS bucket: AccessKey
ID, AccessKey secret, OSS endpoint, and the under file system (UFS) mount path.
# Site properties for all the components
properties:
  fs.oss.accessKeyId: YOUR-ACCESS-KEY-ID
  fs.oss.accessKeySecret: YOUR-ACCESS-KEY-SECRET
  fs.oss.endpoint: oss-cn-beijing-internal.aliyuncs.com
  alluxio.master.mount.table.root.ufs: oss://cloudnativeai/
  alluxio.master.persistence.blacklist: .staging,_temporary
  alluxio.security.stale.channel.purge.interval: 365d
  alluxio.user.metrics.collection.enabled: 'true'
  alluxio.user.short.circuit.enabled: 'true'
  alluxio.user.file.write.tier.default: 1
  alluxio.user.block.size.bytes.default: 64MB # default 64MB
  alluxio.user.file.writetype.default: CACHE_THROUGH
  alluxio.user.file.metadata.load.type: ONCE
  alluxio.user.file.readtype.default: CACHE
  #alluxio.worker.allocator.class: alluxio.worker.block.allocator.MaxFreeAllocator
  alluxio.worker.allocator.class: alluxio.worker.block.allocator.RoundRobinAllocator
  alluxio.worker.file.buffer.size: 128MB
  alluxio.worker.evictor.class: alluxio.worker.block.evictor.LRUEvictor
  alluxio.job.master.client.threads: 5000
  alluxio.job.worker.threadpool.size: 300
In the tieredstore section, the values of mediumtype are the IDs of the data disks on a worker node, and the values of path are the paths where the data disks are mounted, as shown in the sketch below.
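The following is a minimal sketch of a tieredstore section, assuming a single HDD tier backed by the 12 disks mounted at /mnt/disk1 through /mnt/disk12. The medium IDs and per-disk quotas are illustrative and must match your node layout:
tieredstore:
  levels:
  - level: 0
    alias: HDD
    mediumtype: HDD-0,HDD-1,HDD-2,HDD-3,HDD-4,HDD-5,HDD-6,HDD-7,HDD-8,HDD-9,HDD-10,HDD-11
    path: /mnt/disk1,/mnt/disk2,/mnt/disk3,/mnt/disk4,/mnt/disk5,/mnt/disk6,/mnt/disk7,/mnt/disk8,/mnt/disk9,/mnt/disk10,/mnt/disk11,/mnt/disk12
    type: hostPath
    quota: 1024G,1024G,1024G,1024G,1024G,1024G,1024G,1024G,1024G,1024G,1024G,1024G
    high: 0.95
    low: 0.7
After the config.yaml file is ready, install Alluxio by running the following Helm command in the directory where you extracted the installation file. The release name alluxio and the chart directory ./alluxio are assumptions based on the downloaded package:
helm install alluxio -f config.yaml ./alluxio
Run the kubectl get pods -o wide | grep alluxio command to verify that the Alluxio master and worker pods are running.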