This topic describes how to install and use the OSS Connector for AI/ML component in Kubernetes deployments to load model data from OSS for inference services.
Install the Connector component
To use the Connector in a Kubernetes deployment, you must install the Connector component in your application container. Choose one of the following installation methods based on your use case.
Method 1: Init Container
Use an Init Container to download and extract the Connector component to a shared volume before the application container starts.
initContainers:
  # Init Container: runs before the application container to prepare dependencies
  - name: install-connector
    image: busybox
    command:
      - sh
      - -c
      - |
        # Download the Connector DEB package
        wget -q https://gosspublic.alicdn.com/oss-connector/oss-connector-lib-1.2.0.x86_64.deb -O /tmp/connector.deb
        # Create a temp directory and extract the DEB package
        mkdir -p /tmp/extract && cd /tmp/extract
        ar x /tmp/connector.deb
        # Extract only the required .so file to the shared directory
        # The /shared directory is mounted as a volume visible to the main container
        mkdir -p /shared/usr/local/lib
        tar -xf data.tar.gz -O ./usr/local/lib/libossc_preload.so > /shared/usr/local/lib/libossc_preload.so
    volumeMounts:
      # Mount the connector-lib volume at /shared
      # The main container also mounts this volume for file sharing
      - name: connector-lib
        mountPath: /shared
containers:
  - name: vllm
    image: vllm/vllm-openai:latest
    volumeMounts:
      # Mount the connector-lib volume at /usr/local/lib in the container
      # This makes the .so file extracted by the Init Container available
      - name: connector-lib
        mountPath: /usr/local/lib
        subPath: usr/local/lib
volumes:
  # Shared volume for passing files between the Init Container and the main container
  - name: connector-lib
    emptyDir: {}
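
The Init Container writes the library through a shell redirect, so the target file is created even if the extraction itself fails. A minimal sanity check, sketched here on the assumption that libossc_preload.so is a regular ELF shared object, could be appended to the script. The helper name is_elf_file is hypothetical and is not part of the Connector.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Connector): succeeds only if the given
# file exists, is non-empty, and starts with the ELF magic bytes 0x7f 'E' 'L' 'F'.
is_elf_file() {
    file_path="$1"
    # -s is true only for an existing, non-empty file
    [ -s "$file_path" ] || return 1
    # Compare the first four bytes of the file with the ELF magic number
    magic=$(head -c 4 "$file_path")
    [ "$magic" = "$(printf '\177ELF')" ]
}

# Possible use as the last line of the Init Container script
# (path taken from the manifest above):
# is_elf_file /shared/usr/local/lib/libossc_preload.so || exit 1
```

With a check like this as the final command, a truncated or empty file would make the Init Container exit with a non-zero status, so the application container would not start with a broken library.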
Method 2: Install at startup
Download and install the Connector component directly in the container startup command.
containers:
  - name: vllm
    image: vllm/vllm-openai:latest
    command: ["/bin/bash", "-c"]
    args:
      - |
        # Download and install the Connector DEB package
        wget https://gosspublic.alicdn.com/oss-connector/oss-connector-lib-1.2.0.x86_64.deb
        dpkg -i oss-connector-lib-1.2.0.x86_64.deb
        # Start the model serving process
        ENABLE_CONNECTOR=1 python3 -m vllm.entrypoints.openai.api_server --model ${MODEL_DIR} ...
Method 3: Custom Dockerfile
Build a custom image with the Connector pre-installed, using the official vLLM image vllm/vllm-openai as the base.
FROM vllm/vllm-openai:latest
RUN wget https://gosspublic.alicdn.com/oss-connector/oss-connector-lib-1.2.0.x86_64.deb && \
    dpkg -i oss-connector-lib-1.2.0.x86_64.deb
Build and push the image:
docker build -t myregistry/vllm-with-connector:latest .
docker push myregistry/vllm-with-connector:latest
Use the image with the pre-installed Connector component:
containers:
  - name: vllm
    image: myregistry/vllm-with-connector:latest
Comparison of installation methods
| Method | Use case | Pros | Cons |
| --- | --- | --- | --- |
| Init Container / Install at startup | Quick testing, validation, development, when you do not want to modify the base image | No custom image required, flexible deployment, simple configuration | Downloads and extracts on every startup, longer startup time, requires external network access |
| Custom Dockerfile | Production environments, long-running services, large-scale clusters | Fastest startup, self-contained image, high stability, reusable | Requires maintaining a custom image, version-locked |
Deploy a model inference service
The following example uses the official vLLM image vllm/vllm-openai with the Init Container method to install the Connector component and deploy an inference service that loads model data from OSS.
# ConfigMap: stores the Connector configuration file
# Purpose: mounts the Connector config as a file inside the container
# Mount path: /etc/oss-connector/config.json
apiVersion: v1
kind: ConfigMap
metadata:
  name: connector-config
data:
  config.json: |
    {
      "logLevel": 1,
      "logPath": "/var/log/oss-connector/connector.log",
      "auditPath": "/var/log/oss-connector/audit.log",
      "expireTimeSec": 120,
      "prefetch": {
        "vcpus": 16,
        "workers": 16
      }
    }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: model-connector-deployment
spec:
  selector:
    matchLabels:
      app: model-connector
  template:
    metadata:
      labels:
        app: model-connector
    spec:
      # Init Container: runs before the application container to prepare dependencies
      initContainers:
        - name: install-connector
          image: busybox
          command: ["/bin/sh", "-c"]
          args:
            - |
              # Download the Connector DEB package
              wget -q https://gosspublic.alicdn.com/oss-connector/oss-connector-lib-1.2.0.x86_64.deb -O /tmp/connector.deb
              # Create a temp directory and extract the DEB package
              mkdir -p /tmp/extract && cd /tmp/extract
              ar x /tmp/connector.deb
              # Extract only the required .so file to the shared directory
              # The /shared directory is mounted as a volume visible to the main container
              mkdir -p /shared/usr/local/lib
              tar -xf data.tar.gz -O ./usr/local/lib/libossc_preload.so > /shared/usr/local/lib/libossc_preload.so
          volumeMounts:
            # Mount the connector-lib volume at /shared
            # The main container also mounts this volume for file sharing
            - name: connector-lib
              mountPath: /shared
      containers:
        - name: vllm
          image: vllm/vllm-openai:latest
          imagePullPolicy: IfNotPresent
          resources:
            requests:
              cpu: "16"
              memory: "70Gi"
            limits:
              cpu: "20"
              memory: "80Gi"
          command: ["/bin/bash", "-c"]
          args:
            - |
              # Add the Connector library to LD_PRELOAD to intercept file system calls
              export LD_PRELOAD="/usr/local/lib/libossc_preload.so${LD_PRELOAD:+:$LD_PRELOAD}"
              # Start the vLLM server
              # ENABLE_CONNECTOR=1 enables OSS Connector for accelerated model loading
              # The model path ${MODEL_DIR}/qwen/Qwen3-8B/ points to an OSS path,
              # intercepted and redirected by the Connector
              ENABLE_CONNECTOR=1 python3 -m vllm.entrypoints.openai.api_server \
                --model ${MODEL_DIR}/qwen/Qwen3-8B/ \
                --trust-remote-code \
                --tensor-parallel-size 1 \
                --disable-custom-all-reduce
          env:
            # OSS access configuration: specify the internal endpoint and region
            - name: OSS_ENDPOINT
              value: "oss-cn-beijing-internal.aliyuncs.com"
            - name: OSS_REGION
              value: "cn-beijing"
            # Root path on OSS. The Connector maps local paths to this OSS path
            - name: OSS_PATH
              value: "oss://examplebucket/"
            # Local mount directory (intercepted by the Connector, data is loaded from OSS)
            - name: MODEL_DIR
              value: "/var/model"
            # Read OSS access credentials from the Secret (oss-access-key-connector)
            # Create the Secret in advance:
            # kubectl create secret generic oss-access-key-connector \
            #   --from-literal=key=<OSS_ACCESS_KEY_ID> \
            #   --from-literal=secret=<OSS_ACCESS_KEY_SECRET>
            - name: OSS_ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  name: oss-access-key-connector
                  key: key
            - name: OSS_ACCESS_KEY_SECRET
              valueFrom:
                secretKeyRef:
                  name: oss-access-key-connector
                  key: secret
          volumeMounts:
            # Mount the Connector configuration file
            - name: connector-config
              mountPath: /etc/oss-connector/
            # Mount the connector-lib volume at /usr/local/lib in the container
            # This makes the .so file extracted by the Init Container available
            - name: connector-lib
              mountPath: /usr/local/lib
              subPath: usr/local/lib
      terminationGracePeriodSeconds: 10
      volumes:
        # Connector configuration ConfigMap
        - name: connector-config
          configMap:
            name: connector-config
        # Shared volume for passing files between the Init Container and the main container
        - name: connector-lib
          emptyDir: {}
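
The export line in the startup command relies on the shell expansion ${LD_PRELOAD:+:$LD_PRELOAD}, which appends a colon and the previous value only when LD_PRELOAD is already set and non-empty. The sketch below isolates that behavior so it can be tried in any POSIX shell. The function name prepend_preload and the second library path are hypothetical, used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper mirroring the export line in the manifest:
# prints the LD_PRELOAD value that results from putting a library
# in front of an existing (possibly empty) value.
prepend_preload() {
    lib="$1"
    current="$2"
    # ${current:+:$current} expands to ":<current>" only when current is non-empty,
    # so no stray trailing colon is produced for an empty value
    printf '%s\n' "${lib}${current:+:$current}"
}

# No library was preloaded before
prepend_preload /usr/local/lib/libossc_preload.so ""
# prints /usr/local/lib/libossc_preload.so

# Another library (hypothetical path) was already preloaded
prepend_preload /usr/local/lib/libossc_preload.so /opt/other/libfoo.so
# prints /usr/local/lib/libossc_preload.so:/opt/other/libfoo.so
```

Placing the Connector library first keeps any library that the base image already preloads, while the Connector library is loaded ahead of it.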
Deploy a multi-instance model broadcast service
The following example uses an image with the Connector pre-installed (myregistry/vllm-with-connector) to enable model broadcast across multiple replicas. For more information about model broadcast, see Model Broadcast.
# ConfigMap: stores the Connector configuration template
# Purpose: provides a template (config.json.tmpl) that the init container renders
# into the final config file for the main container
# Template mount path (init container): /tmpl/config.json.tmpl
# Rendered output path (main container): /etc/oss-connector/config.json
apiVersion: v1
kind: ConfigMap
metadata:
  name: connector-config
data:
  config.json.tmpl: |
    {
      "logLevel": 1,
      "logPath": "/var/log/oss-connector/connector.log",
      "auditPath": "/var/log/oss-connector/audit.log",
      "expireTimeSec": 120,
      "prefetch": {
        "vcpus": 16,
        "workers": 16
      },
      "broadcast": {
        "enableBroadcast": true,
        "tenant": "${REDIS_TENANT}",
        "db": {
          "host": "${REDIS_HOST}",
          "port": 6379,
          "username": "${REDIS_USERNAME}",
          "password": "${REDIS_PASSWORD}"
        }
      },
      "bindPort": 19989
    }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: model-connector-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: model-connector
  template:
    metadata:
      labels:
        app: model-connector
    spec:
      # Init container: renders the config template into the final config.json
      # before the main container starts
      initContainers:
        - name: render-config
          image: busybox
          command: ["/bin/sh", "-c"]
          args:
            # Use sed to replace template placeholders with actual environment variable values
            - |
              sed -e "s|\${REDIS_HOST}|$REDIS_HOST|g" \
                  -e "s|\${REDIS_USERNAME}|$REDIS_USERNAME|g" \
                  -e "s|\${REDIS_PASSWORD}|$REDIS_PASSWORD|g" \
                  -e "s|\${REDIS_TENANT}|$REDIS_TENANT|g" \
                  /tmpl/config.json.tmpl > /etc/oss-connector/config.json
          env:
            # Read Redis connection details from the Secret (redis-secret)
            # Create the Secret in advance:
            # kubectl create secret generic redis-secret \
            #   --from-literal=host=<host> \
            #   --from-literal=username=<username> \
            #   --from-literal=password=<password>
            - name: REDIS_HOST
              valueFrom:
                secretKeyRef:
                  name: redis-secret
                  key: host
            - name: REDIS_USERNAME
              valueFrom:
                secretKeyRef:
                  name: redis-secret
                  key: username
            - name: REDIS_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: redis-secret
                  key: password
            - name: REDIS_TENANT
              value: "broadcast-demo"
          volumeMounts:
            # Mount the ConfigMap template as the input for sed
            - name: connector-config
              mountPath: /tmpl
            # Mount the shared emptyDir to write the rendered config for the main container
            - name: rendered-config
              mountPath: /etc/oss-connector
      containers:
        - name: vllm
          image: myregistry/vllm-with-connector:latest
          imagePullPolicy: IfNotPresent
          resources:
            requests:
              cpu: "16"
              memory: "70Gi"
            limits:
              cpu: "20"
              memory: "80Gi"
          command: ["/bin/bash", "-c"]
          args:
            - |
              # Add the Connector library to LD_PRELOAD to intercept file system calls
              export LD_PRELOAD="/usr/local/lib/libossc_preload.so${LD_PRELOAD:+:$LD_PRELOAD}"
              # Start the vLLM server
              # ENABLE_CONNECTOR=1 enables OSS Connector for accelerated model loading
              ENABLE_CONNECTOR=1 python3 -m vllm.entrypoints.openai.api_server \
                --model ${MODEL_DIR}/qwen/Qwen3-8B/ \
                --trust-remote-code \
                --tensor-parallel-size 1 \
                --disable-custom-all-reduce
          env:
            # OSS access configuration: specify the internal endpoint and region
            - name: OSS_ENDPOINT
              value: "oss-cn-beijing-internal.aliyuncs.com"
            - name: OSS_REGION
              value: "cn-beijing"
            # Root path on OSS. The Connector maps local paths to this OSS path
            - name: OSS_PATH
              value: "oss://examplebucket/"
            # Local mount directory (intercepted by the Connector, data is loaded from OSS)
            - name: MODEL_DIR
              value: "/var/model"
            # Read OSS access credentials from the Secret (oss-access-key-connector)
            - name: OSS_ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  name: oss-access-key-connector
                  key: key
            - name: OSS_ACCESS_KEY_SECRET
              valueFrom:
                secretKeyRef:
                  name: oss-access-key-connector
                  key: secret
          volumeMounts:
            # Connector configuration (rendered by init container)
            - name: rendered-config
              mountPath: /etc/oss-connector/
      terminationGracePeriodSeconds: 10
      volumes:
        # Connector config template ConfigMap (mounted in init container)
        - name: connector-config
          configMap:
            name: connector-config
        # Rendered config (shared from init container to main container)
        - name: rendered-config
          emptyDir: {}
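
The template rendering performed by the render-config init container can be dry-run locally before deploying. The sketch below reuses the same four sed expressions but reads the template from standard input. The function name render_config and all sample values are hypothetical placeholders, not real connection details.

```shell
#!/bin/sh
# Hypothetical helper: renders a Connector config template read from stdin,
# using the same sed expressions as the render-config init container.
render_config() {
    # Inside double quotes, \$ keeps the placeholder's dollar sign literal,
    # while $REDIS_HOST and the other variables are expanded by the shell
    sed -e "s|\${REDIS_HOST}|$REDIS_HOST|g" \
        -e "s|\${REDIS_USERNAME}|$REDIS_USERNAME|g" \
        -e "s|\${REDIS_PASSWORD}|$REDIS_PASSWORD|g" \
        -e "s|\${REDIS_TENANT}|$REDIS_TENANT|g"
}

# Placeholder values for a local dry run (assumptions, not real credentials)
REDIS_HOST="redis.example.internal"
REDIS_USERNAME="demo-user"
REDIS_PASSWORD="demo-pass"
REDIS_TENANT="broadcast-demo"

# Render a shortened template fragment and print the result
printf '%s\n' '{"tenant": "${REDIS_TENANT}", "db": {"host": "${REDIS_HOST}", "port": 6379}}' | render_config
# prints {"tenant": "broadcast-demo", "db": {"host": "redis.example.internal", "port": 6379}}
```

Because the values are spliced directly into sed expressions, a value containing |, &, or a backslash would be interpreted by sed rather than inserted literally. A dry run with realistic values is one way to catch this before the rendered config reaches the main container.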