Fluid's mutating webhook injects affinity rules into pod specs so that kube-scheduler places application pods on the nodes that hold cached data—or on nodes in the same zone or region as the cache when no cache-holding node is available. With Fluid, you can:
- Schedule pods to the node holding cached data (node affinity, weight 100)
- Schedule pods to any node in the same zone as the cache (zone affinity, weight 50)
- Schedule pods to any node in the same region as the cache (region affinity, weight 20)
- Force pods onto cache-holding nodes when data locality is critical (required affinity)
- Steer pods that do not use datasets away from cache-holding nodes to reduce resource contention
Limitations
- Supported only on ACK Pro clusters.
- Incompatible with Elastic Container Instance-based scheduling and priority-based resource scheduling.
- If `spec.affinity` or `spec.nodeSelector` is already set in a pod spec, Fluid skips affinity injection for that pod.
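For instance, a pod that already declares its own placement constraint keeps that constraint untouched. The sketch below is illustrative (the pod name and node label are hypothetical, not from Fluid):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app              # hypothetical pod name
  labels:
    fuse.serverful.fluid.io/inject: "true"
spec:
  # Because nodeSelector is already set, Fluid's webhook leaves this pod's
  # scheduling constraints unchanged and injects no cache affinity rules.
  nodeSelector:
    disktype: ssd           # hypothetical pre-existing node label
  containers:
  - name: app
    image: registry.openanolis.cn/openanolis/nginx:1.14.1-8.6
```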
Prerequisites
Before you begin, ensure that you have:
- An ACK Pro cluster running Kubernetes 1.18 or later. For more information, see Create an ACK Pro cluster.
- The cloud-native AI suite and ack-fluid 1.0.6 or later deployed in the cluster. For more information, see Deploy the cloud-native AI suite.
  Important: If you already have open-source Fluid installed, uninstall it before deploying the ack-fluid component.
- A kubectl client connected to the cluster. For more information, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.
How it works
When a pod is created, Fluid's mutating webhook reads the pod's labels and the Dataset it references, then injects `nodeAffinity` rules into the pod spec before kube-scheduler evaluates placement. The injected rules use `preferredDuringSchedulingIgnoredDuringExecution` (soft affinity) by default. To force hard placement on cache-holding nodes, add the label `fluid.io/dataset.<dataset_name>.sched: required` to the pod.
The fallback order is: node → zone → region. If a pod cannot be placed on a node with cached data, kube-scheduler falls back to a node in the same zone, then the same region, based on the configured weights.
Pods that do not reference a dataset are handled by the PreferNodesWithoutCache plugin, which steers them away from nodes reserved for data caching.
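Taken together, the default policy translates into an injected soft affinity rule of roughly this shape. This is a sketch only: the dataset label key follows the `fluid.io/s-<namespace>-<dataset_name>` pattern used in the examples below, and the zone and region values are illustrative:

```yaml
# Sketch of the soft affinity Fluid injects by default (values are illustrative)
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100                       # first choice: a node holding the cache
      preference:
        matchExpressions:
        - key: fluid.io/s-default-demo-dataset
          operator: In
          values: ["true"]
    - weight: 50                        # fallback: a node in the same zone
      preference:
        matchExpressions:
        - key: topology.kubernetes.io/zone
          operator: In
          values: ["cn-beijing-i"]      # illustrative zone ID
    - weight: 20                        # last resort: a node in the same region
      preference:
        matchExpressions:
        - key: topology.kubernetes.io/region
          operator: In
          values: ["cn-beijing"]        # illustrative region ID
```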
Configure the scheduling policy
Default configuration
The scheduling policy is stored in the `webhook-plugins` ConfigMap in the `fluid-system` namespace. To inspect it:

```bash
kubectl get cm -n fluid-system webhook-plugins -oyaml
```
Expected output:
```yaml
apiVersion: v1
data:
  pluginsProfile: |
    pluginConfig:
    - args: |
        preferred:
        # fluid.io/node: built-in, name cannot be changed. Schedules pods to the node holding cached data.
        - name: fluid.io/node
          weight: 100
        # topology.kubernetes.io/zone: schedules pods to nodes in the same zone as the cache. Adjust key for your cluster.
        - name: topology.kubernetes.io/zone
          weight: 50
        # topology.kubernetes.io/region: schedules pods to nodes in the same region as the cache. Adjust key for your cluster.
        - name: topology.kubernetes.io/region
          weight: 20
        # required: applies when a pod carries the label fluid.io/dataset.{dataset name}.sched=required
        required:
        - fluid.io/node
      name: NodeAffinityWithCache
    plugins:
      serverful:
        withDataset:
        - RequireNodeWithFuse
        - NodeAffinityWithCache
        - MountPropagationInjector
        withoutDataset:
        - PreferNodesWithoutCache
      serverless:
        withDataset:
        - FuseSidecar
        withoutDataset: []
```
The `preferred` list controls soft affinity weights. The `required` list controls which label must match when a pod opts into hard scheduling. The `fluid.io/node` name in both sections cannot be changed.
Custom configuration
ACK clusters may use different node labels to represent topology. To add, remove, or replace topology keys:
- Edit the ConfigMap:

  ```bash
  kubectl edit -n fluid-system cm webhook-plugins
  ```

- Modify the `preferred` list. Two common scenarios are described below.

  Example: Ignore node-level cache affinity

  Comment out `fluid.io/node` to drop the node-level placement preference. Fluid still prefers zone and region affinity.

  ```yaml
  preferred:
  # - name: fluid.io/node    # commented out: node affinity disabled
  #   weight: 100
  - name: topology.kubernetes.io/zone
    weight: 50
  - name: topology.kubernetes.io/region
    weight: 20
  ```

  Example: Add node pool affinity

  Insert a custom topology key between node-level and zone-level affinity to prefer scheduling within the same node pool.

  ```yaml
  preferred:
  - name: fluid.io/node
    weight: 100
  - name: alibabacloud.com/nodepool-id   # custom topology key
    weight: 80
  - name: topology.kubernetes.io/zone
    weight: 50
  - name: topology.kubernetes.io/region
    weight: 20
  ```
- Restart the Fluid webhook to apply the changes:

  ```bash
  kubectl rollout restart deployment -n fluid-system fluid-webhook
  ```
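To confirm the webhook picked up the new configuration, you can wait for the rollout to complete before creating new pods:

```bash
# Blocks until the fluid-webhook deployment finishes rolling out
kubectl rollout status deployment -n fluid-system fluid-webhook
```

Pods created before the rollout finishes may still receive affinity rules generated from the old policy.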
Examples
Example 1: Preferred node affinity (soft scheduling)
This example schedules a pod to the node that holds the cached data. If no cache-holding node is schedulable, kube-scheduler falls back to zone and region affinity based on the configured weights.
- Create a Secret with your OSS credentials:

  ```yaml
  apiVersion: v1
  kind: Secret
  metadata:
    name: mysecret
  stringData:
    fs.oss.accessKeyId: <ACCESS_KEY_ID>
    fs.oss.accessKeySecret: <ACCESS_KEY_SECRET>
  ```
- Create a Dataset and a JindoRuntime:

  Important: This example uses JindoRuntime. To use other cache runtimes, see Use EFC to accelerate access to NAS or CPFS. For JindoFS and Object Storage Service (OSS) acceleration, see Use JindoFS to accelerate access to OSS.

  ```yaml
  apiVersion: data.fluid.io/v1alpha1
  kind: Dataset
  metadata:
    name: demo-dataset
  spec:
    mounts:
    - mountPoint: oss://<oss_bucket>/<bucket_dir>
      options:
        fs.oss.endpoint: <oss_endpoint>
      name: hadoop
      path: "/"
      encryptOptions:
      - name: fs.oss.accessKeyId
        valueFrom:
          secretKeyRef:
            name: mysecret
            key: fs.oss.accessKeyId
      - name: fs.oss.accessKeySecret
        valueFrom:
          secretKeyRef:
            name: mysecret
            key: fs.oss.accessKeySecret
  ---
  apiVersion: data.fluid.io/v1alpha1
  kind: JindoRuntime
  metadata:
    name: demo-dataset
  spec:
    replicas: 2
    tieredstore:
      levels:
      - mediumtype: MEM
        path: /dev/shm
        quota: 10G
        high: "0.99"
        low: "0.8"
  ```
- Create the application pod with affinity injection enabled:

  ```yaml
  apiVersion: v1
  kind: Pod
  metadata:
    name: nginx
    labels:
      fuse.serverful.fluid.io/inject: "true"   # enables Fluid affinity injection
  spec:
    containers:
    - name: nginx
      image: registry.openanolis.cn/openanolis/nginx:1.14.1-8.6
      volumeMounts:
      - mountPath: /data
        name: data-vol
    volumes:
    - name: data-vol
      persistentVolumeClaim:
        claimName: demo-dataset   # PVC auto-created by Fluid, named after the Dataset
  ```
- Verify that Fluid injected the affinity rule:

  ```bash
  kubectl get pod nginx -oyaml
  ```

  The pod spec should contain a `preferredDuringSchedulingIgnoredDuringExecution` rule targeting the cache-holding node:

  ```yaml
  spec:
    affinity:
      nodeAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - preference:
            matchExpressions:
            - key: fluid.io/s-default-demo-dataset
              operator: In
              values:
              - "true"
          weight: 100
  ```
- Confirm the pod was scheduled to a cache-holding node:

  ```bash
  kubectl get pod nginx -o custom-columns=NAME:metadata.name,NODE:.spec.nodeName
  ```

  The node shown should be one of the JindoRuntime worker nodes where data is cached.
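You can also list the cache-holding nodes directly by selecting on the dataset label that appears in the injected rule. This assumes the Dataset is named demo-dataset and lives in the default namespace, matching the `fluid.io/s-<namespace>-<dataset_name>` pattern:

```bash
# Nodes carrying cache for the demo-dataset Dataset in the default namespace
kubectl get nodes -l fluid.io/s-default-demo-dataset=true
```

The node reported for the nginx pod should appear in this list.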
Example 2: Preferred zone affinity (soft scheduling)
This example schedules pods to any node in the zone where cached data resides. Both node-level and zone-level affinity rules are injected, so kube-scheduler first tries cache-holding nodes, then other nodes in the same zone.
To enable zone affinity, pin the Dataset and JindoRuntime master to a specific zone.
- Create a Secret (same as Example 1):

  ```yaml
  apiVersion: v1
  kind: Secret
  metadata:
    name: mysecret
  stringData:
    fs.oss.accessKeyId: <ACCESS_KEY_ID>
    fs.oss.accessKeySecret: <ACCESS_KEY_SECRET>
  ```
- Create a Dataset and a JindoRuntime, both pinned to the target zone:

  ```yaml
  apiVersion: data.fluid.io/v1alpha1
  kind: Dataset
  metadata:
    name: demo-dataset
  spec:
    nodeAffinity:
      required:
        nodeSelectorTerms:
        - matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values:
            - "<ZONE_ID>"   # e.g., cn-beijing-i
    mounts:
    - mountPoint: oss://<oss_bucket>/<bucket_dir>
      options:
        fs.oss.endpoint: <oss_endpoint>
      name: hadoop
      path: "/"
      encryptOptions:
      - name: fs.oss.accessKeyId
        valueFrom:
          secretKeyRef:
            name: mysecret
            key: fs.oss.accessKeyId
      - name: fs.oss.accessKeySecret
        valueFrom:
          secretKeyRef:
            name: mysecret
            key: fs.oss.accessKeySecret
  ---
  apiVersion: data.fluid.io/v1alpha1
  kind: JindoRuntime
  metadata:
    name: demo-dataset
  spec:
    replicas: 2
    master:
      nodeSelector:
        topology.kubernetes.io/zone: <ZONE_ID>   # e.g., cn-beijing-i
    tieredstore:
      levels:
      - mediumtype: MEM
        path: /dev/shm
        quota: 10G
        high: "0.99"
        low: "0.8"
  ```

  The `nodeAffinity.required.nodeSelectorTerms` constraint on the Dataset tells Fluid which zone the cache lives in. Fluid reads this to generate the zone-level affinity rule injected into application pods.
- Create the application pod:

  ```yaml
  apiVersion: v1
  kind: Pod
  metadata:
    name: nginx
    labels:
      fuse.serverful.fluid.io/inject: "true"
  spec:
    containers:
    - name: nginx
      image: registry.openanolis.cn/openanolis/nginx:1.14.1-8.6
      volumeMounts:
      - mountPath: /data
        name: data-vol
    volumes:
    - name: data-vol
      persistentVolumeClaim:
        claimName: demo-dataset
  ```
- Verify the injected affinity rules:

  ```bash
  kubectl get pod nginx -oyaml
  ```

  Both a node-level and a zone-level affinity rule should appear:

  ```yaml
  spec:
    affinity:
      nodeAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - preference:
            matchExpressions:
            - key: fluid.io/s-default-demo-dataset
              operator: In
              values:
              - "true"
          weight: 100
        - preference:
            matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
              - <ZONE_ID>   # e.g., cn-beijing-i
          weight: 50
  ```
- Confirm the pod was scheduled to a node in the target zone:

  ```bash
  kubectl get pod nginx -o custom-columns=NAME:metadata.name,NODE:.spec.nodeName
  kubectl get node <node_name> --show-labels | grep topology.kubernetes.io/zone
  ```

  The node should have the `topology.kubernetes.io/zone=<ZONE_ID>` label.
Example 3: Required node affinity (hard scheduling)
This example forces the pod onto a node that holds cached data. If no cache-holding node is schedulable, the pod stays pending. Use this when data locality is non-negotiable—for example, in latency-sensitive training jobs.
- Create a Secret (same as Example 1):

  ```yaml
  apiVersion: v1
  kind: Secret
  metadata:
    name: mysecret
  stringData:
    fs.oss.accessKeyId: <ACCESS_KEY_ID>
    fs.oss.accessKeySecret: <ACCESS_KEY_SECRET>
  ```
- Create a Dataset and a JindoRuntime (same as Example 1):

  ```yaml
  apiVersion: data.fluid.io/v1alpha1
  kind: Dataset
  metadata:
    name: demo-dataset
  spec:
    mounts:
    - mountPoint: oss://<oss_bucket>/<bucket_dir>
      options:
        fs.oss.endpoint: <oss_endpoint>
      name: hadoop
      path: "/"
      encryptOptions:
      - name: fs.oss.accessKeyId
        valueFrom:
          secretKeyRef:
            name: mysecret
            key: fs.oss.accessKeyId
      - name: fs.oss.accessKeySecret
        valueFrom:
          secretKeyRef:
            name: mysecret
            key: fs.oss.accessKeySecret
  ---
  apiVersion: data.fluid.io/v1alpha1
  kind: JindoRuntime
  metadata:
    name: demo-dataset
  spec:
    replicas: 2
    tieredstore:
      levels:
      - mediumtype: MEM
        path: /dev/shm
        quota: 10G
        high: "0.99"
        low: "0.8"
  ```
- Create the application pod with the required-scheduling label:

  ```yaml
  apiVersion: v1
  kind: Pod
  metadata:
    name: nginx
    labels:
      fuse.serverful.fluid.io/inject: "true"
      fluid.io/dataset.demo-dataset.sched: required   # forces hard affinity for demo-dataset
  spec:
    containers:
    - name: nginx
      image: registry.openanolis.cn/openanolis/nginx:1.14.1-8.6
      volumeMounts:
      - mountPath: /data
        name: data-vol
    volumes:
    - name: data-vol
      persistentVolumeClaim:
        claimName: demo-dataset
  ```

  The label format is `fluid.io/dataset.<dataset_name>.sched: required`. Replace `<dataset_name>` with the name of your Dataset.
- Verify the injected affinity rule:

  ```bash
  kubectl get pod nginx -oyaml
  ```

  A `requiredDuringSchedulingIgnoredDuringExecution` rule should appear, blocking the pod from scheduling to any node that does not hold the cached data:

  ```yaml
  spec:
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: fluid.io/s-default-demo-dataset
              operator: In
              values:
              - "true"
  ```
- Confirm the pod was scheduled to a cache-holding node:

  ```bash
  kubectl get pod nginx -o custom-columns=NAME:metadata.name,NODE:.spec.nodeName
  ```

  The node shown should be one of the JindoRuntime worker nodes. If no such node is available, the pod remains in `Pending` state until a cache-holding node becomes schedulable.
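If the pod stays `Pending`, the scheduler's reasoning is visible in the pod's events. A FailedScheduling event mentioning node affinity indicates that no cache-holding node is currently schedulable (the exact event wording varies by Kubernetes version):

```bash
kubectl describe pod nginx
# Look for a FailedScheduling event in the Events section, for example one
# stating that the available nodes didn't match the pod's node affinity/selector.
```

In that case, check that the JindoRuntime workers are running and that their nodes have capacity for the pod.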