
Container Service for Kubernetes:Accelerate Jobs

Last Updated:Mar 26, 2026

Fluid uses JindoRuntime to accelerate access to data stored in Object Storage Service (OSS) in serverless cloud computing scenarios. This topic walks you through cache mode acceleration for Kubernetes Jobs, from uploading your dataset to cleaning up resources.

Cache mode vs. no cache mode: Cache mode stores OSS data locally in JindoFS, so pods read from memory or disk instead of the network. In the example at the end of this topic, cache mode takes 1.76 seconds to replicate the test file, compared with 23.644 seconds in no cache mode: more than 13 times faster.

Mode Time to replicate test file Relative speed
Cache mode (this topic) 1.76 s ~13x faster
No cache mode 23.644 s baseline
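The speedup figure follows directly from the two measured times; a quick check:

```python
# Measured file-replication times from the example in this topic.
cache_mode_s = 1.76
no_cache_mode_s = 23.644

speedup = no_cache_mode_s / cache_mode_s
print(f"cache mode is {speedup:.1f}x faster")  # cache mode is 13.4x faster
```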

Prerequisites

Before you begin, make sure you have:

  • An ACK Pro cluster running a non-ContainerOS OS with Kubernetes 1.18 or later. For more information, see Create an ACK Pro cluster.

    Important

    ack-fluid does not support ContainerOS. If your cluster uses ContainerOS, the component will fail to deploy.

  • The cloud-native AI suite installed with the ack-fluid component deployed.

    • If you have not yet installed the cloud-native AI suite, enable Fluid acceleration during installation. For more information, see Deploy the cloud-native AI suite.

    • If you have already installed the suite, go to the Cloud-native AI Suite page in the ACK console and deploy the ack-fluid component.

    Important

    If you have already installed open source Fluid, uninstall it before deploying ack-fluid.

  • Virtual nodes deployed in the ACK Pro cluster. For more information, see Schedule pods to elastic container instances through virtual nodes.

  • A kubectl client connected to the ACK Pro cluster. For more information, see Connect to a cluster by using kubectl.

  • OSS activated and a bucket created. For more information, see Activate OSS and Create buckets.

Limitations

This feature is mutually exclusive with the elastic scheduling feature of ACK. For more information, see Configure priority-based resource scheduling.

Step 1: Upload the test dataset to the OSS bucket

  1. Download a test dataset of 2 GB. This example uses the BERT wwm_uncased_L-24_H-1024_A-16 dataset.

  2. Upload the dataset to your OSS bucket using ossutil. For more information, see Install ossutil.

Step 2: Create a Dataset and JindoRuntime

Deploy the Dataset and JindoRuntime resources to define your data source and configure caching. Deployment typically takes a few minutes.

  1. Create secret.yaml with the following content to store your OSS credentials:

    apiVersion: v1
    kind: Secret
    metadata:
      name: access-key
    stringData:
      fs.oss.accessKeyId: ****
      fs.oss.accessKeySecret: ****
  2. Deploy the Secret:

    kubectl create -f secret.yaml
  3. Create dataset.yaml with the following content. The file defines two resources:

    • Dataset: tells Fluid where to find the remote data and how to authenticate.

    • JindoRuntime: enables JindoFS caching in your cluster.

    Important

    The default dataset access mode is read-only. To use read/write mode, see Configure the access mode of a dataset.

    Choose a cache medium and volume type. Pick the combination that fits your workload before configuring the parameters:

    Cache medium mediumtype Recommended volumeType When to use
    Memory MEM emptyDir Fastest access; use emptyDir to prevent residual cache data on the node
    Local system disk SSD or HDD emptyDir Use when memory is insufficient; emptyDir prevents residual cache data
    Local data disk SSD or HDD hostPath Set path to the mount path of the data disk on the host

    Parameter reference

    Parameter Description
    mountPoint The UFS path to mount, in the format oss://<oss_bucket>/<bucket_dir>. Do not include endpoint information. Example: oss://mybucket/path/to/dir. If you have only one mount target, set path to /.
    fs.oss.endpoint The public or internal endpoint of your OSS bucket. Using an internal endpoint (oss-cn-<region>-internal.aliyuncs.com) is recommended when your ACK cluster and OSS bucket are in the same region.
    fs.oss.accessKeyId The AccessKey ID used to access the bucket.
    fs.oss.accessKeySecret The AccessKey secret used to access the bucket.
    replicas The number of JindoFS worker nodes to create.
    mediumtype The cache medium type. Valid values: HDD, SSD, MEM. For recommended configurations, see Policy 2: Select proper cache media.
    volumeType The volume type for the cache medium. Valid values: emptyDir and hostPath. Default value: hostPath. Use emptyDir for memory or local system disks to prevent residual cache data; use hostPath for local data disks. For recommended configurations, see Policy 2: Select proper cache media.
    path The cache path on the node. Only one path can be specified.
    quota The maximum cache size. For example, 5Gi sets the limit to 5 GiB.
    high The upper storage watermark. JindoFS starts evicting cache when usage exceeds this ratio.
    low The lower storage watermark. JindoFS stops evicting cache when usage drops to this ratio.
    apiVersion: data.fluid.io/v1alpha1
    kind: Dataset
    metadata:
      name: serverless-data
    spec:
      mounts:
      - mountPoint: oss://<oss_bucket>/<bucket_dir>
        name: demo
        path: /
        options:
          fs.oss.endpoint: <oss_endpoint>
        encryptOptions:
          - name: fs.oss.accessKeyId
            valueFrom:
              secretKeyRef:
                name: access-key
                key: fs.oss.accessKeyId
          - name: fs.oss.accessKeySecret
            valueFrom:
              secretKeyRef:
                name: access-key
                key: fs.oss.accessKeySecret
    ---
    apiVersion: data.fluid.io/v1alpha1
    kind: JindoRuntime
    metadata:
      name: serverless-data
    spec:
      replicas: 1
      tieredstore:
        levels:
          - mediumtype: MEM
            volumeType: emptyDir
            path: /dev/shm
            quota: 5Gi
            high: "0.95"
            low: "0.7"
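    To see what the tieredstore settings above mean in practice, the high and low watermarks translate into byte thresholds against the quota. A minimal sketch of that arithmetic (the GiB figures are derived for illustration, not Fluid output):

    ```python
    GIB = 1024 ** 3

    quota = 5 * GIB         # quota: 5Gi from the JindoRuntime spec
    high, low = 0.95, 0.70  # upper and lower storage watermarks

    # JindoFS starts evicting cached blocks once usage rises above
    # quota * high and keeps evicting until usage falls to quota * low.
    evict_above = quota * high
    evict_until = quota * low

    print(f"evict above {evict_above / GIB:.2f} GiB")  # evict above 4.75 GiB
    print(f"evict until {evict_until / GIB:.2f} GiB")  # evict until 3.50 GiB
    ```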


  4. Deploy the Dataset and JindoRuntime:

    kubectl create -f dataset.yaml
  5. Verify that the Dataset is ready:

    kubectl get dataset serverless-data

    Expected output:

    NAME              UFS TOTAL SIZE   CACHED   CACHE CAPACITY   CACHED PERCENTAGE   PHASE   AGE
    serverless-data   1.16GiB          0.00B    5.00GiB          0.0%                Bound   2m8s

    The Dataset is ready when PHASE shows Bound.

  6. Verify that the JindoRuntime is ready:

    kubectl get jindo serverless-data

    Expected output:

    NAME              MASTER PHASE   WORKER PHASE   FUSE PHASE   AGE
    serverless-data   Ready          Ready          Ready        2m51s

    The JindoRuntime is ready when FUSE PHASE shows Ready.

(Optional) Step 3: Prefetch data

Prefetching loads OSS data into the local JindoFS cache before your Job runs. This eliminates cold-start latency on the first run and is recommended if your dataset is large or your network connection to OSS is slow.

  1. Create dataload.yaml with the following content:

    apiVersion: data.fluid.io/v1alpha1
    kind: DataLoad
    metadata:
      name: serverless-data-warmup
    spec:
      dataset:
        name: serverless-data
        namespace: default
      loadMetadata: true
  2. Deploy the DataLoad:

    kubectl create -f dataload.yaml
  3. Check prefetching progress:

    kubectl get dataload

    Expected output:

    NAME                     DATASET           PHASE      AGE     DURATION
    serverless-data-warmup   serverless-data   Complete   2m49s   45s

    Prefetching is complete when PHASE shows Complete. In this example, it takes 45 seconds.

  4. Verify the cache fill:

    kubectl get dataset

    Expected output:

    NAME              UFS TOTAL SIZE   CACHED    CACHE CAPACITY   CACHED PERCENTAGE   PHASE   AGE
    serverless-data   1.16GiB          1.16GiB   5.00GiB          100.0%              Bound   5m20s

    CACHED PERCENTAGE shows 100.0% after prefetching, confirming that all data is cached locally. Before prefetching, CACHED PERCENTAGE shows 0.0%.
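If you only need part of the dataset warmed up, the DataLoad spec also accepts a target list of paths to prefetch. A sketch, assuming a subdirectory named /train exists under the mount (the path is a placeholder):

```yaml
apiVersion: data.fluid.io/v1alpha1
kind: DataLoad
metadata:
  name: serverless-data-warmup-partial
spec:
  dataset:
    name: serverless-data
    namespace: default
  loadMetadata: true
  target:
    # Prefetch only this subdirectory; replicas controls how many
    # cache workers hold a copy of the prefetched data.
    - path: /train
      replicas: 1
```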

Step 4: Run a Job to access OSS data

Create a Deployment that mounts the cached dataset. Fluid handles the pod adaptation automatically; no changes to your application container are required.

Choose the target compute environment for your pods:

Deploy an application pod as an Elastic Container Instance

Add the alibabacloud.com/fluid-sidecar-target=eci label to declare that the pod runs as an Elastic Container Instance (ECI). Fluid automatically converts the pod to an ECI-compatible format when it is created.

  1. Create job.yaml with the following content:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: model-serving
    spec:
      selector:
        matchLabels:
          app: model-serving
      template:
        metadata:
          labels:
            app: model-serving
            alibabacloud.com/fluid-sidecar-target: eci
            alibabacloud.com/eci: "true"
        spec:
          containers:
            - image: fluidcloudnative/serving
              name: serving
              ports:
                - name: http1
                  containerPort: 8080
              env:
                - name: TARGET
                  value: "World"
              volumeMounts:
                - mountPath: /data
                  name: data
          volumes:
            - name: data
              persistentVolumeClaim:
                claimName: serverless-data

Deploy an application pod as an ACS pod

Add the alibabacloud.com/fluid-sidecar-target=acs label to declare that the pod uses Alibaba Cloud Container Compute Service (ACS) compute resources. Fluid automatically adapts the pod to the ACS environment when it is created.

Important
  • ack-fluid v1.0.11 or later is required to access cached Fluid data in ACS application containers.

  • Accessing cached Fluid data in ACS containers relies on advanced ACS pod features. Submit a support ticket to enable this feature.

  1. Create job.yaml with the following content:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: model-serving
    spec:
      selector:
        matchLabels:
          app: model-serving
      template:
        metadata:
          labels:
            app: model-serving
            alibabacloud.com/fluid-sidecar-target: acs
            alibabacloud.com/acs: "true"
            alibabacloud.com/compute-qos: default
            alibabacloud.com/compute-class: general-purpose
        spec:
          containers:
            - image: fluidcloudnative/serving
              name: serving
              ports:
                - name: http1
                  containerPort: 8080
              env:
                - name: TARGET
                  value: "World"
              volumeMounts:
                - mountPath: /data
                  name: data
          volumes:
            - name: data
              persistentVolumeClaim:
                claimName: serverless-data

Deploy and verify

  1. Deploy the application:

    kubectl create -f job.yaml
  2. Check the container log to verify data access performance. Replace <pod-name> with the name of a pod created by the model-serving Deployment (for example, from kubectl get pods -l app=model-serving):

    kubectl logs <pod-name> -c serving

    Expected output:

    real    0m1.760s
    user    0m0.002s
    sys     0m0.740s

    The real time of 0m1.760s shows how long it took to replicate the file from the cached dataset. Compared with no cache mode (23.644 seconds), cache mode is more than 13 times faster.

Step 5: Clean up

After testing, delete the resources to avoid leaving orphaned cache data on your nodes.

  1. Delete the application:

    kubectl delete deployment model-serving
  2. Delete the Dataset and JindoRuntime (both are defined in dataset.yaml):

    kubectl delete -f dataset.yaml

What's next