OSS Connector for AI/ML optimizes data throughput for PyTorch training jobs that read datasets from or write checkpoints to Object Storage Service (OSS), so you can focus on your training code rather than storage I/O management.
Prerequisites
Before you begin, ensure that you have:
A 64-bit x86 Linux environment
glibc 2.17 or later
Python 3.8–3.12
PyTorch 2.0 or later
(Optional) A Linux kernel that supports userfaultfd, required only for the OSS checkpoint feature
The following example uses Ubuntu. To check whether your kernel supports userfaultfd, run:
sudo grep CONFIG_USERFAULTFD /boot/config-$(uname -r)
CONFIG_USERFAULTFD=y: the OSS checkpoint feature is available.
CONFIG_USERFAULTFD=n: the OSS checkpoint feature is not available on this kernel.
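The prerequisites above can be checked in one pass with a short script. The following is a minimal sketch, not an official tool: the helper names (version_ge, userfaultfd_state, status, report) and the PYTHON variable are illustrative assumptions, and it assumes a sort that supports -V (GNU coreutils). It treats any kernel config without a CONFIG_USERFAULTFD=y line as not supporting the checkpoint feature.

```shell
#!/bin/sh
# Prerequisite check sketch. Thresholds come from the list above; the
# helper names and the PYTHON variable are illustrative assumptions.

PYTHON="${PYTHON:-python3}"   # interpreter to inspect (assumed default)

# version_ge A B: succeeds when version A >= version B (needs sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# userfaultfd_state: reads kernel config text on stdin and prints
# "enabled" if a CONFIG_USERFAULTFD=y line is present, else "disabled".
userfaultfd_state() {
    if grep -q '^CONFIG_USERFAULTFD=y$'; then echo enabled; else echo disabled; fi
}

# status CMD...: prints OK if the command succeeds, CHECK otherwise.
status() { if "$@"; then echo OK; else echo CHECK; fi; }

# report LABEL VALUE STATUS: prints one aligned result line.
report() { printf '%-12s %-14s %s\n' "$1" "${2:-not found}" "$3"; }

arch="$(uname -m)"
report arch "$arch" "$(status test "$arch" = x86_64)"

# getconf prints e.g. "glibc 2.35"; the value is empty on non-glibc systems.
glibc="$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')"
report glibc "$glibc" "$(status version_ge "${glibc:-0}" 2.17)"

py="$("$PYTHON" -c 'import sys; print("%d.%d" % sys.version_info[:2])' 2>/dev/null)"
py_ok() { version_ge "${py:-0}" 3.8 && version_ge 3.12 "${py:-0}"; }
report python "$py" "$(status py_ok)"

torch="$("$PYTHON" -c 'import torch; print(torch.__version__)' 2>/dev/null)"
torch="${torch%%+*}"          # drop a local build suffix such as +cu121
report pytorch "$torch" "$(status version_ge "${torch:-0}" 2.0)"

cfg="/boot/config-$(uname -r)"
if [ -r "$cfg" ]; then
    report userfaultfd "$(userfaultfd_state < "$cfg")" "(optional)"
else
    report userfaultfd "" "(optional; $cfg not readable)"
fi
```

Each line of output shows the detected value and either OK or CHECK. To inspect a specific interpreter, set PYTHON before running the script, for example PYTHON=python3.12.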
Install OSS Connector for AI/ML
The following example uses Python 3.12. Replace pip3.12 with the pip version that matches your Python installation.
Install the package in a Linux environment, or in a container built from a Linux-based image:
pip3.12 install osstorchconnector
Verify the installation:
pip3.12 show osstorchconnector
A successful installation returns package metadata that includes fields such as Name: osstorchconnector and Version: x.x.x.
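In a setup script or container build, the verification step can be automated by relying on the exit status of pip show, which is nonzero when the package is not found. The following is a minimal sketch: the PIP variable and the pkg_version helper are illustrative assumptions, not part of the connector.

```shell
#!/bin/sh
# Installation check sketch. PIP and pkg_version are illustrative names.

PIP="${PIP:-pip3.12}"   # replace with the pip that matches your Python

# pkg_version: reads `pip show` output on stdin and prints the Version field.
pkg_version() {
    awk -F': ' '$1 == "Version" { print $2 }'
}

# pip show exits with a nonzero status when the package is not installed.
if info="$("$PIP" show osstorchconnector 2>/dev/null)"; then
    echo "osstorchconnector $(printf '%s\n' "$info" | pkg_version) is installed"
else
    echo "osstorchconnector was not found by $PIP" >&2
fi
```

If the package is reported as missing, confirm that PIP points to the same Python installation that your training job uses.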
What's next
Configure access credentials and connector settings so OSS Connector for AI/ML can communicate with OSS. See Configure OSS Connector for AI/ML.