
Object Storage Service:Performance metrics of OSS accelerator

Last Updated: Mar 20, 2026

OSS accelerator caches OSS data at edge nodes close to your compute resources, reducing the distance data must travel on each request. This page shows benchmark results across five workloads so you can estimate the impact for your scenario.

The gains range from ~1.6× to 10×. The higher your data volume, request concurrency, and throughput demand, the more headroom OSS accelerator creates.

Benchmark summary

| Workload | Speedup vs. OSS | Key metric |
| --- | --- | --- |
| Batch downloads (ossutil) | ~10× faster | 2.2 MB/s → 24 MB/s |
| ML/DL training data reads | ~1.6× faster | Up to 123,043 img/s |
| Download response latency | ~10× lower latency | P50 and P999 both reduced |
| Data lake queries (analytics) | 2–2.5× faster (large scans) | 85% of local ESSD CacheFS speed |
| Simulation training (containers) | 60% shorter training time | 100 Gbps → 300 Gbps peak bandwidth |

Batch downloads with ossutil

Test setup: ossutil cp command downloading 10,000 objects (100 KB each, 976 MB total) from an OSS bucket to a local computer. Compares the OSS internal endpoint against the OSS accelerator accelerated endpoint with data preloading enabled.

| Tool | OSS internal endpoint | OSS accelerator accelerated endpoint |
| --- | --- | --- |
| ossutil | 2.2 MB/s | 24 MB/s |

Result: ~10× faster. OSS accelerator significantly improves throughput for batch data transfers using tools like ossutil.
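To translate the throughput difference into wall-clock time for this 976 MB batch, a short back-of-the-envelope calculation (the rates come from the table above; treating them as sustained averages is an assumption):

```python
# Estimate wall-clock time for the 976 MB batch at each measured rate.
# Assumes the benchmark's average rate holds for the whole transfer.
TOTAL_MB = 976

def transfer_seconds(rate_mb_s: float) -> float:
    """Seconds to move TOTAL_MB at a sustained rate in MB/s."""
    return TOTAL_MB / rate_mb_s

print(f"OSS internal endpoint: {transfer_seconds(2.2):.0f} s")  # ~444 s
print(f"OSS accelerator:       {transfer_seconds(24):.0f} s")   # ~41 s
```

In other words, the same batch drops from roughly seven and a half minutes to well under a minute.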

Machine learning and deep learning

Test setup: Reading data from OssIterableDataset and OssMapDataset datasets created by OSS Connector for AI/ML. Dataset: 10,000,000 objects averaging 100 KB each (1 TB total).

| Parameter | Value |
| --- | --- |
| Dataloader batch size | 256 |
| Dataloader workers | 32 |
| Transform | No preprocessing (object.read(), returns object.key and object.label) |

| Dataset type | OSS internal endpoint | OSS accelerator accelerated endpoint |
| --- | --- | --- |
| OssIterableDataset | 99,920 img/s | 123,043 img/s |
| OssMapDataset | 56,564 img/s | 78,264 img/s |

Result: ~1.6× faster. OSS Connector for AI/ML already handles high-concurrency access at high bandwidth on its own. OSS accelerator adds further throughput on top of that baseline.
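A hypothetical sketch of the read path this benchmark exercises. The class name and `from_prefix()` call follow the osstorchconnector package shipped with OSS Connector for AI/ML, but the bucket URI, endpoint, and credential/config paths below are placeholders, not values from the benchmark; verify the exact signature against the connector's own documentation:

```python
# Hypothetical sketch: URIs, endpoint, and file paths are placeholders.
def build_loader():
    from osstorchconnector import OssIterableDataset
    from torch.utils.data import DataLoader

    def transform(obj):
        # No preprocessing, matching the benchmark parameters:
        # read the object body and return the key plus the label.
        return obj.read(), obj.key, obj.label

    dataset = OssIterableDataset.from_prefix(
        "oss://examplebucket/train/",                  # placeholder URI
        endpoint="oss-accelerate.example.aliyuncs.com",  # placeholder endpoint
        cred_path="/root/.alibabacloud/credentials",   # placeholder path
        config_path="/etc/oss-connector/config.json",  # placeholder path
        transform=transform,
    )
    # Batch size and worker count match the benchmark parameters above.
    return DataLoader(dataset, batch_size=256, num_workers=32)
```

Switching between the baseline and accelerated runs is then only a matter of pointing `endpoint` at the internal or the accelerated endpoint.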

Download response latency

Test setup: Downloading 10 MB objects multiple times, measuring response latency in milliseconds with OSS accelerator disabled (direct OSS access) and enabled.

P50 is the 50th percentile — half of all requests complete within this time. P999 is the 99.9th percentile — effectively the worst-case tail latency. Tail latency matters for interactive workloads: even if median response is fast, slow outliers degrade the user experience.
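To make the percentile definitions concrete, here is a minimal nearest-rank computation over a latency sample. The function and the sample values are illustrative, not part of the benchmark:

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest value such that at least
    p% of the samples are at or below it."""
    s = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(s)))
    return s[rank - 1]

# Illustrative latencies in ms: nine fast requests and one slow outlier.
latencies_ms = [11, 11, 12, 12, 12, 13, 13, 14, 15, 200]

print(percentile(latencies_ms, 50))    # P50  = 12 ms (typical request)
print(percentile(latencies_ms, 99.9))  # P999 = 200 ms (the outlier)
```

Note how the outlier is invisible at P50 but defines P999, which is why tail percentiles are tracked separately.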

[Figure] Latency comparison chart: P50 and P999 with and without OSS accelerator

Result: ~10× lower latency. The improvement appears at both P50 and P999, so OSS accelerator reduces both typical and worst-case response times.

Data lakes and data warehouses

Test setup: Query performance on a lineitem table (~2 billion rows, 760 GB), comparing local ESSD CacheFS, direct OSS access, and OSS accelerator.

| Scenario | Local ESSD CacheFS | OSS | OSS accelerator |
| --- | --- | --- | --- |
| Point queries | 382 ms | 2,451 ms | 1,160 ms |
| Random queries on 1,000 rows | 438 ms | 3,786 ms | 1,536 ms |
| Random queries on 10% of data | 130,564 ms | 345,707 ms | 134,659 ms |
| Full scan | 171,548 ms | 398,681 ms | 197,134 ms |

Results:

  • Full scans and large random queries (10% of data): OSS accelerator is 2–2.5× faster than direct OSS access and reaches ~85% of local ESSD CacheFS performance.

  • Point queries and small random queries (1,000 rows): OSS accelerator is 1.5–3× faster than direct OSS access and reaches ~30% of local ESSD CacheFS performance. The fixed per-request latency of 8–10 ms limits gains when individual requests are very small.
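The pattern in these two bullets can be sketched with a simple cost model: each query pays a fixed per-request latency plus a transfer time proportional to the bytes scanned. The row size, throughput, and the 9 ms figure below are illustrative assumptions for the model, not measurements from the benchmark:

```python
def query_time_ms(rows, row_bytes, fixed_ms, throughput_mb_s):
    """Fixed per-request latency plus bytes-scanned / throughput."""
    transfer_ms = rows * row_bytes / (throughput_mb_s * 1e6) * 1e3
    return fixed_ms + transfer_ms

# Point query: one ~400-byte row with an assumed 9 ms fixed latency
# and 4 GB/s cache throughput -- the fixed latency dominates.
point = query_time_ms(1, 400, 9, 4000)

# Full scan: 2 billion rows -- transfer time dominates and the
# fixed latency is negligible.
scan = query_time_ms(2_000_000_000, 400, 9, 4000)
```

Under these assumptions the point query spends over 99% of its time in fixed latency, which is why a faster cache barely helps it, while the scan spends essentially all of its time moving bytes, which is exactly where the accelerator's bandwidth pays off.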

Simulation training for containers and autonomous driving

Test setup: Simultaneous container startup to pull images, maps, and log data for simulation training. Total OSS data: 204 TB.

| Storage configuration | Data volume | Peak bandwidth | Training duration |
| --- | --- | --- | --- |
| OSS only | 204 TB | 100 Gbps | 2.2 hours |
| OSS + OSS accelerator | 204 TB (OSS) + 128 TB (OSS accelerator cache) | 300 Gbps | 40 minutes |

Result: 60% reduction in total training time. The 3× bandwidth increase — from 100 Gbps to 300 Gbps — drives the speedup when many containers read data simultaneously at startup.
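As a back-of-the-envelope consistency check, if the startup phase were purely bandwidth-bound (an assumption, since other phases also contribute), tripling peak bandwidth would cut the duration to about a third:

```python
# Assumes a purely bandwidth-bound startup phase -- a simplification.
baseline_min = 2.2 * 60          # 132 minutes at 100 Gbps
speedup = 300 / 100              # peak bandwidth ratio
predicted_min = baseline_min / speedup

print(predicted_min)             # 44.0 min; the measured result is 40 min
```

The prediction lands close to the measured 40 minutes, consistent with bandwidth being the dominant bottleneck during simultaneous container startup.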

What's next