This topic describes how to use the Alibaba Cloud Linux 3 AI Extension Edition with Alibaba Cloud AI container images to improve performance.
Enable the keentune optimization tool.
The keentune optimization tool is pre-installed on the Alibaba Cloud Linux 3 AI Extension Edition image. This tool provides optimizations for various scenarios. Follow these steps to enable optimization for AI scenarios.
systemctl stop tuned
systemctl disable tuned
systemctl start keentune-target
systemctl enable keentune-target
systemctl enable keentuned
systemctl start keentuned
keentune profile set ai_train.profile
The keentune optimizations require an OS restart to take effect. To disable the optimizations, run the keentune profile rollback command. This change also requires an OS restart.
Install Docker.
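The enable sequence above can be wrapped in a small script with basic error handling. This is a sketch, not something shipped with the image; the DRY_RUN switch (default on here, so the script only prints the commands) is an addition for previewing what would run:

```shell
#!/bin/sh
# Sketch: apply the keentune AI-training profile in one pass.
# DRY_RUN=1 (the default here) prints each command instead of executing it;
# set DRY_RUN=0 to actually run them, then reboot for the changes to apply.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@" || { echo "failed: $*" >&2; exit 1; }
    fi
}

run systemctl stop tuned
run systemctl disable tuned
run systemctl start keentune-target
run systemctl enable keentune-target
run systemctl enable keentuned
run systemctl start keentuned
run keentune profile set ai_train.profile
```

Running the script without arguments lists the seven commands; DRY_RUN=0 executes them in order and stops at the first failure.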
To install Docker and its related components, see Train models using PyTorch GPU images.
Obtain the test image.
docker pull ac2-registry.cn-hangzhou.cr.aliyuncs.com/ac2/openclip-bevformer:v0.1-torch2.6-cuda12.6-py3.10-ubuntu22.04
Download the datasets.
The container image does not include datasets. After you download the image, run the following commands to download the required model datasets:
OpenCLIP training and inference datasets
# Download the training dataset
mkdir -p /workspace/dataset && cd /workspace/dataset
wget -O mscoco.parquet "https://hf-mirror.com/datasets/ChristophSchuhmann/MS_COCO_2017_URL_TEXT/resolve/main/mscoco.parquet?download=true"
pip3 install img2dataset webdataset==0.2.86 numpy==1.23.5 --ignore-installed && NO_ALBUMENTATIONS_UPDATE=1 img2dataset --url_list mscoco.parquet --input_format "parquet" --url_col "URL" --caption_col "TEXT" --output_format webdataset --output_folder COCO_2017_Captions-webdataset-592k-256x256-296shards --processes_count 16 --thread_count 64 --image_size 256 --number_sample_per_shard 2000
# Download the inference dataset
mkdir -p /workspace/dataset && cd /workspace/dataset
wget https://image-net.org/data/ILSVRC/2012/ILSVRC2012_img_val.tar --no-check-certificate
mkdir -p ILSVRC2012_img_val
tar xvf ILSVRC2012_img_val.tar -C ILSVRC2012_img_val
cd ILSVRC2012_img_val/ && wget https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh && bash valprep.sh
For more information, see GitHub - mlfoundations/open_clip: An open source implementation of CLIP.
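The img2dataset call above packs the 591,753 captioned images into shards of 2,000 samples each, which is where the 00000 to 00295 shard range in the later training command comes from. A quick sanity check of that layout (plain arithmetic, not part of the image):

```shell
# Verify the shard layout img2dataset should produce for the training set.
TOTAL_SAMPLES=591753    # matches --train-num-samples in the training command
PER_SHARD=2000          # matches --number_sample_per_shard above

# Ceiling division: the last shard is only partially filled.
NUM_SHARDS=$(( (TOTAL_SAMPLES + PER_SHARD - 1) / PER_SHARD ))
LAST_SHARD=$(printf "%05d.tar" $((NUM_SHARDS - 1)))

echo "$NUM_SHARDS"     # 296
echo "$LAST_SHARD"     # 00295.tar
```

If the output folder contains fewer .tar files than this, some download workers failed and the training command's shard range will not match.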
BEVFormer training dataset
mkdir -p /workspace/BEVFormer/data && cd /workspace/BEVFormer/data
wget https://d36yt3mvayqw5m.cloudfront.net/public/v1.0/v1.0-mini.tgz
mkdir -p nuscenes && tar -xzf v1.0-mini.tgz -C ./nuscenes/
wget https://d36yt3mvayqw5m.cloudfront.net/public/v1.0/can_bus.zip
unzip -q can_bus.zip
cd /workspace/BEVFormer && python tools/create_data.py nuscenes --root-path ./data/nuscenes --out-dir ./data/nuscenes --extra-tag nuscenes --version v1.0-mini --canbus ./data
For more information, see BEVFormer/docs/prepare_dataset.md at master · fundamentalvision/BEVFormer.
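Before running create_data.py it is worth confirming that both archives extracted into the expected places. The sketch below checks for the usual nuScenes v1.0-mini directories plus can_bus; the exact set of entries is an assumption based on the archive layouts, not something the image enforces:

```shell
#!/bin/sh
# Sketch: verify the nuScenes mini layout under the BEVFormer data root.
# The expected sub-directories are an assumption based on the v1.0-mini
# archive contents plus the extracted can_bus archive.
check_nuscenes_layout() {
    # $1 = data root; prints any missing entries and returns non-zero if found
    missing=""
    for d in nuscenes/maps nuscenes/samples nuscenes/sweeps nuscenes/v1.0-mini can_bus; do
        [ -d "$1/$d" ] || missing="$missing $d"
    done
    [ -z "$missing" ] && return 0
    echo "Missing under $1:$missing"
    return 1
}

check_nuscenes_layout /workspace/BEVFormer/data || echo "Finish the download steps above before running tools/create_data.py."
```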
Run tests in the container image.
OpenCLIP
Training command:
cd /workspace/open_clip/src && torchrun --nproc_per_node 8 -m open_clip_train.main --model RN50 --train-data /workspace/dataset/COCO_2017_Captions-webdataset-592k-256x256-296shards/\{00000..00295\}.tar --train-num-samples 591753 --dataset-type webdataset --batch-size 1152 --precision amp --workers 8 --epochs 4 --log-every-n-steps 1 --torchcompile
Inference command:
cd /workspace/open_clip/src && torchrun --nproc_per_node 1 -m open_clip_train.main --imagenet-val /workspace/dataset/ILSVRC2012_img_val --model RN50 --batch-size 1152 --workers 8 --pretrained openai
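The training command launches 8 processes, each with a per-GPU batch size of 1152, so the effective global batch and the approximate number of steps per epoch follow directly. An arithmetic sketch, shown for orientation only:

```shell
# Effective batch geometry of the OpenCLIP training run above.
NPROC=8               # --nproc_per_node
BATCH_PER_GPU=1152    # --batch-size is per process in open_clip
TRAIN_SAMPLES=591753  # --train-num-samples

GLOBAL_BATCH=$((NPROC * BATCH_PER_GPU))
STEPS_PER_EPOCH=$((TRAIN_SAMPLES / GLOBAL_BATCH))

echo "$GLOBAL_BATCH"       # 9216
echo "$STEPS_PER_EPOCH"    # 64
```

With --log-every-n-steps 1 in the command above, each epoch therefore produces roughly 64 log lines per training run.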
BEVFormer
Training command:
cd /workspace/BEVFormer && TORCH_ALLOW_TF32_CUBLAS_OVERRIDE=1 tools/dist_train.sh projects/configs/bevformer/bevformer_base.py 8
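The TORCH_ALLOW_TF32_CUBLAS_OVERRIDE=1 prefix in the command above is a PyTorch environment variable that forces cuBLAS matrix multiplies to use TF32 regardless of what the training script sets. PyTorch reads the variable itself at startup; the helper below only mirrors the "set to 1" convention for pre-launch checks and is not PyTorch's actual parsing code:

```shell
#!/bin/sh
# Sketch: check whether the TF32 cuBLAS override is active for a launch.
tf32_override_enabled() {
    [ "${TORCH_ALLOW_TF32_CUBLAS_OVERRIDE:-0}" = "1" ]
}

TORCH_ALLOW_TF32_CUBLAS_OVERRIDE=1
if tf32_override_enabled; then
    echo "TF32 cuBLAS override active for the training processes."
fi
```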