When a Kubernetes Job reads training data or batch input directly from Object Storage Service (OSS), each file fetch crosses the network on every run. Fluid (deployed as the ack-fluid Helm chart) places a JindoFS-backed caching layer between your ACK Serverless pods and OSS. The first time a pod reads a file, JindoFS fetches it from OSS and writes it to local cache. Every subsequent read comes from cache — cutting access time from tens of seconds to under one second for repeated workloads.
This topic walks you through deploying Fluid, configuring a Dataset and JindoRuntime backed by an OSS bucket, and running a Kubernetes Job that reads from cache.
Prerequisites
Before you begin, ensure that you have:
- An ACK Serverless cluster running Kubernetes 1.18 or later with CoreDNS installed. See Create a cluster.
- kubectl configured to connect to the cluster. See Obtain the KubeConfig of a cluster and use kubectl to connect to the cluster.
- An activated OSS instance with a bucket ready. See Activate OSS, then create a bucket in the console.
Limitations
Fluid's data access acceleration conflicts with the virtual node scheduling feature of ACK Serverless clusters. You cannot use both at the same time. See Enable the virtual node scheduling policy for a cluster.
To avoid this conflict, all JindoRuntime cache worker pods and application pods must include the alibabacloud.com/burst-resource: eci_only annotation, which disables virtual node scheduling on those pods. This annotation appears in the YAML examples later in this topic.
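For reference, this is roughly where the annotation sits in a pod template; a minimal fragment (surrounding fields omitted) matching the full manifests shown later:

```yaml
metadata:
  annotations:
    # Disable virtual node scheduling for this pod
    alibabacloud.com/burst-resource: eci_only
```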
Deploy the Fluid control plane
If you have already installed open-source Fluid, uninstall it before deploying the ack-fluid component.
- In the ACK console, click Clusters in the left navigation pane.
- Click the name of your cluster. In the left navigation pane, click Applications > Helm.
- On the Helm page, click Deploy.
- In the Basic Information step, set the following parameters, then click Next.

  | Parameter | Value |
  | --- | --- |
  | Source | Marketplace |
  | Chart | Search for and select ack-fluid |

  The default release name is ack-fluid and the default namespace is fluid-system. If you specify different values, a Confirm dialog appears. Click Yes to revert to the defaults.
- In the Parameters step, click OK.
- Verify that the Fluid control plane is running:

  ```bash
  kubectl get pod -n fluid-system
  ```

  Expected output:

  ```
  NAME                                 READY   STATUS    RESTARTS   AGE
  dataset-controller-d99998f79-dgkmh   1/1     Running   0          2m48s
  fluid-webhook-55c6d9d497-dmrzb       1/1     Running   0          2m49s
  ```

  The two components serve distinct roles:

  - Dataset Controller: manages the full lifecycle of Dataset Custom Resources (CRs) introduced by Fluid.
  - Fluid Webhook: injects a sidecar container into application pods that need data access, enabling transparent caching in serverless scenarios.

  The Fluid control plane also includes controller components for JindoFS, JuiceFS, and Alluxio. These controllers are not created during initial deployment; the controller pod for each caching system is created on demand only when you configure that system.
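You can observe this on-demand behavior yourself: after you create the JindoRuntime in Step 2, list the deployments in fluid-system again. In open-source Fluid the new controller deployment is named jindoruntime-controller; the exact name under ack-fluid is an assumption here.

```bash
# Before a JindoRuntime exists: only dataset-controller and fluid-webhook.
# After Step 2: an additional runtime controller (e.g., jindoruntime-controller) appears.
kubectl get deployments -n fluid-system
```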
Accelerate data access
Step 1: Upload test data to OSS
- Create a 2 GB test file. This topic uses a file named test as an example.
- Upload the file to your OSS bucket using ossutil. See Install ossutil.
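For example, a minimal sketch of this step on a Linux machine. The file contents do not matter; <bucket_name> and <bucket_path> are placeholders for your bucket, and ossutil must already be configured with credentials:

```bash
# Create a 2 GB test file named "test"
dd if=/dev/zero of=test bs=1M count=2048

# Upload it to the OSS path that the Dataset will mount in Step 2
ossutil cp test oss://<bucket_name>/<bucket_path>/test
```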
Step 2: Create the Dataset and JindoRuntime resources
Fluid represents your data source as two Custom Resources (CRs):
- Dataset: declares the URL of the data in the external storage system.
- JindoRuntime: declares the caching system and its configuration.
Fluid uses lazy loading: on first access, it fetches data from OSS and writes it to local cache. Jobs that access the same data repeatedly benefit most from this approach — the first run warms the cache; subsequent runs read entirely from cache. To eliminate first-run latency, pre-warm the cache before submitting your Job.
- Create a Secret to store the OSS credentials:

  ```bash
  kubectl create secret generic oss-access-key \
    --from-literal=fs.oss.accessKeyId=<access_key_id> \
    --from-literal=fs.oss.accessKeySecret=<access_key_secret>
  ```
- Create a file named dataset.yaml with the following content:

  ```yaml
  apiVersion: data.fluid.io/v1alpha1
  kind: Dataset
  metadata:
    name: demo-dataset
  spec:
    mounts:
      - mountPoint: oss://<bucket_name>/<bucket_path>
        name: demo
        path: /
        options:
          fs.oss.endpoint: oss-<region>.aliyuncs.com
        encryptOptions:
          - name: fs.oss.accessKeyId
            valueFrom:
              secretKeyRef:
                name: oss-access-key
                key: fs.oss.accessKeyId
          - name: fs.oss.accessKeySecret
            valueFrom:
              secretKeyRef:
                name: oss-access-key
                key: fs.oss.accessKeySecret
  ---
  apiVersion: data.fluid.io/v1alpha1
  kind: JindoRuntime
  metadata:
    name: demo-dataset
  spec:
    # Number of cache worker nodes
    replicas: 2
    worker:
      podMetadata:
        annotations:
          # Required: disable virtual node scheduling (conflicts with Fluid; see Limitations)
          alibabacloud.com/burst-resource: eci_only
          # ECI instance spec for the JindoFS cache worker pod
          k8s.aliyun.com/eci-use-specs: <eci_instance_spec>
          # Enable instance image cache to speed up pod startup
          k8s.aliyun.com/eci-image-cache: "true"
    tieredstore:
      levels:
        # 10 GiB of memory cache per worker node
        - mediumtype: MEM
          volumeType: emptyDir
          path: /dev/shm
          quota: 10Gi
          high: "0.99"
          low: "0.99"
  ```

  Key parameters:

  | Parameter | Description |
  | --- | --- |
  | mountPoint | OSS path to mount, in the format oss://<bucket_name>/<bucket_path>. Set path to / for a single mount point. |
  | fs.oss.endpoint | OSS bucket endpoint. Use an internal endpoint (e.g., oss-cn-hangzhou-internal.aliyuncs.com) for better performance when the cluster and bucket are in the same region. Use a public endpoint (e.g., oss-cn-hangzhou.aliyuncs.com) otherwise. |
  | replicas | Number of JindoRuntime cache worker pods. Controls the total cache capacity available to the distributed caching system. |
  | alibabacloud.com/burst-resource: eci_only | Disables virtual node scheduling on the cache worker pods. Required because Fluid conflicts with the virtual node scheduling feature (see Limitations). |
  | k8s.aliyun.com/eci-use-specs | Elastic Container Instance (ECI) instance spec for each cache worker pod. |
  | k8s.aliyun.com/eci-image-cache | Enables instance image cache to speed up pod startup. |
  | tieredstore.levels.mediumtype | Cache medium. Supported values: MEM (memory), SSD (solid-state drive), HDD (hard disk drive). See Strategy 2: Select a cache medium. |
  | tieredstore.levels.volumeType | Volume type for the cache medium. Use emptyDir for memory or system disk (prevents residual cache from affecting node availability). Use hostPath for data disks, and set path to the disk's mount point on the host. Default: hostPath. A disk-cache variant is sketched at the end of this step. |
  | tieredstore.levels.path | Path for the cache medium. Supports only a single path. |
  | tieredstore.levels.quota | Maximum cache capacity per worker, for example 10Gi. |
  | tieredstore.levels.high/low | High and low watermarks for cache eviction. |
- Apply the manifest:

  ```bash
  kubectl create -f dataset.yaml
  ```
- Wait about one to two minutes for the caching system to deploy, then verify the Dataset status:

  ```bash
  kubectl get dataset demo-dataset
  ```

  If you run this command immediately after applying the manifest, PHASE may show NotBound while the caching system is still initializing. Wait one to two minutes and run the command again.

  Expected output:

  ```
  NAME           UFS TOTAL SIZE   CACHED   CACHE CAPACITY   CACHED PERCENTAGE   PHASE   AGE
  demo-dataset   1.16GiB          0.00B    20.00GiB         0.0%                Bound   2m58s
  ```

  PHASE: Bound confirms the Dataset deployed successfully. The other columns show how much data is in OSS, how much is already cached, and the total cache capacity across all worker nodes (here 20.00GiB: 2 replicas with a 10Gi quota each).
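You can also inspect the caching system directly through the JindoRuntime CR; a quick status check (the exact output columns vary by Fluid version):

```bash
kubectl get jindoruntime demo-dataset
```

As referenced in the Key parameters table, a disk-cache variant of the tieredstore section might look like the sketch below. The path and sizes are illustrative assumptions; point path at the host mount point of a real data disk:

```yaml
tieredstore:
  levels:
    # Cache on a local data disk instead of memory
    - mediumtype: SSD
      volumeType: hostPath
      path: /mnt/disk1   # host mount point of the data disk (placeholder)
      quota: 50Gi        # per-worker cache capacity (illustrative)
      high: "0.95"
      low: "0.7"
```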
Step 3 (Optional): Pre-warm the cache
Because Fluid uses lazy loading, the first Job run fetches data from OSS — which can take tens of seconds for large datasets. If your application is latency-sensitive on first access, or if you know exactly which files will be needed, pre-warming pulls data into cache before any Job runs so that even the first run reads from cache.
- Create a file named dataload.yaml:

  ```yaml
  apiVersion: data.fluid.io/v1alpha1
  kind: DataLoad
  metadata:
    name: data-warmup
  spec:
    dataset:
      name: demo-dataset
      namespace: default
    loadMetadata: true
  ```
- Start the pre-warm job:

  ```bash
  kubectl create -f dataload.yaml
  ```

  Monitor the DataLoad until its PHASE shows Complete. Expected output:

  ```
  NAME          DATASET        PHASE      AGE   DURATION
  data-warmup   demo-dataset   Complete   99s   58s
  ```

  The output shows that the data cache warm-up took about 58s.
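The status above comes from listing the DataLoad resource. Because DataLoad is a standard namespaced CR, plain kubectl can poll it; a minimal sketch:

```bash
# Watch until PHASE becomes Complete, then press Ctrl+C
kubectl get dataload data-warmup -w
```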
Step 4: Create a Job application
All pods that mount the demo-dataset PersistentVolumeClaim (PVC) read from the JindoFS cache automatically — no application code changes needed. The alibabacloud.com/fluid-sidecar-target: eci label tells Fluid Webhook to inject the caching sidecar into the pod.
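Before writing the Job, you can confirm that Fluid has exposed the Dataset as a PVC; binding the Dataset should create a PVC with the Dataset's name (a quick check, assuming everything runs in the default namespace):

```bash
kubectl get pvc demo-dataset
```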
- Create a file named job.yaml:

  ```yaml
  apiVersion: batch/v1
  kind: Job
  metadata:
    name: demo-app
  spec:
    template:
      metadata:
        labels:
          alibabacloud.com/fluid-sidecar-target: eci
        annotations:
          # Required: disable virtual node scheduling (conflicts with Fluid; see Limitations)
          alibabacloud.com/burst-resource: eci_only
          # ECI instance spec for the application pod
          k8s.aliyun.com/eci-use-specs: ecs.g7.4xlarge
      spec:
        containers:
          - name: demo
            image: anolis-registry.cn-zhangjiakou.cr.aliyuncs.com/openanolis/nginx:1.14.1-8.6
            command:
              - /bin/bash
            args:
              - -c
              - du -sh /data && time cp -r /data/ /tmp
            volumeMounts:
              - mountPath: /data
                name: demo
        restartPolicy: Never
        volumes:
          - name: demo
            persistentVolumeClaim:
              claimName: demo-dataset
    backoffLimit: 4
  ```
- Submit the Job:

  ```bash
  kubectl create -f job.yaml
  ```
- Check the Job logs after it completes. Replace demo-app-jwktf with your pod's name, for example from kubectl get pods -l job-name=demo-app:

  ```bash
  kubectl logs demo-app-jwktf -c demo
  ```

  Expected output:

  ```
  1.2G    /data

  real    0m0.992s
  user    0m0.004s
  sys     0m0.674s
  ```

  The output shows that the real time for copying the file is only 0m0.992s.
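To see the acceleration end to end, one option is to resubmit the same Job and compare the real times across runs; a sketch (if you skipped the optional pre-warm in Step 3, the first run pays the OSS fetch and later runs read from cache):

```bash
# Delete the completed Job, then run it again with a warm cache
kubectl delete job demo-app
kubectl create -f job.yaml
```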
Step 5: Clean up
Clean up resources to avoid incurring unnecessary charges.
- Delete the Job:

  ```bash
  kubectl delete job demo-app
  ```

- Delete the Dataset. This also removes the associated caching system components:

  ```bash
  kubectl delete dataset demo-dataset
  ```

  Important: Cleanup takes about one minute. Wait until all caching system pods are fully deleted before proceeding.
- Scale down the Fluid control plane:

  ```bash
  kubectl get deployments.apps -n fluid-system | awk 'NR>1 {print $1}' | xargs kubectl scale deployments -n fluid-system --replicas=0
  ```

  To use the data access feature again, scale the control plane back up before creating new Dataset and JindoRuntime resources:

  ```bash
  kubectl scale -n fluid-system deployment dataset-controller --replicas=1
  kubectl scale -n fluid-system deployment fluid-webhook --replicas=1
  ```
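The OSS credentials Secret from Step 2 is not removed by any of the steps above; if you no longer need it, delete it as well (assuming the default namespace):

```bash
# Remove the OSS credentials Secret created in Step 2
kubectl delete secret oss-access-key
```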