Importing PyTorch on a GPU-accelerated Linux instance that runs Alibaba Cloud Linux 3 may fail with the following error due to a CUDA version mismatch:
>>> import torch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/dist-packages/torch/__init__.py", line 235, in <module>
    from torch._C import *  # noqa: F403
ImportError: /usr/local/lib/python3.8/dist-packages/torch/lib/../../nvidia/cusparse/lib/libcusparse.so.12: undefined symbol: __nvJitLinkAddData_12_1, version libnvJitLink.so.12
Cause
The installed CUDA version is incompatible with the installed PyTorch version.
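Running sudo pip3 install torch installs PyTorch 2.1.2, which is built against CUDA 12.1. However, the GPU-accelerated instance automatically installs CUDA 12.0. The CUDA 12.1 build of libcusparse that ships with PyTorch looks for the symbol __nvJitLinkAddData_12_1, which the CUDA 12.0 libnvJitLink library does not export, so the import fails with the undefined symbol error.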
For CUDA-to-PyTorch version mappings, see Previous PyTorch Versions.
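The symbol name in the error message hints at the CUDA version that PyTorch was built against. The following is a minimal sketch, assuming that the numeric suffix of a versioned nvJitLink symbol (for example, _12_1) corresponds to the CUDA major and minor version. The helper name required_cuda_from_error is hypothetical and is not part of PyTorch or CUDA.

```python
import re
from typing import Optional

# Hypothetical helper. Assumption: a suffix such as "_12_1" on an nvJitLink
# symbol name corresponds to CUDA 12.1.
SYMBOL_RE = re.compile(r"undefined symbol: __nvJitLink\w+?_(\d+)_(\d+)\b")


def required_cuda_from_error(message: str) -> Optional[str]:
    """Return "major.minor" parsed from an undefined nvJitLink symbol, or None."""
    match = SYMBOL_RE.search(message)
    if match is None:
        return None
    return f"{match.group(1)}.{match.group(2)}"


# Tail of the ImportError message shown at the top of this topic.
error = (
    "libcusparse.so.12: undefined symbol: __nvJitLinkAddData_12_1, "
    "version libnvJitLink.so.12"
)
print(required_cuda_from_error(error))  # prints 12.1
```

If the function returns a version that differs from the CUDA version installed on the instance, the cause described in this section likely applies.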
Diagnose the issue
Confirm the version mismatch by running the following commands on the instance:
Check the installed CUDA version:
nvcc --version
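Check the CUDA version that PyTorch expects. Because import torch fails while the mismatch exists, read the version from the pip package metadata instead of importing the module:
pip3 list | grep -i -E "torch|nvidia"
The nvidia-*-cu12 packages that pip installs together with torch carry the CUDA version in their version numbers. For example, a 12.1.x version of nvidia-cuda-runtime-cu12 indicates CUDA 12.1.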
If the two versions differ, apply one of the following solutions.
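The comparison can also be scripted without importing torch, which matters here because the import itself fails. The following is a minimal sketch, not an official tool. It assumes that nvcc is on the PATH and that pip installed the nvidia-cuda-runtime-cu12 package together with torch. The function names are illustrative.

```python
import re
import shutil
import subprocess
from importlib import metadata
from typing import Optional

# Assumption: pip installed torch together with this CUDA runtime package.
RUNTIME_PACKAGE = "nvidia-cuda-runtime-cu12"


def parse_nvcc_release(nvcc_output: str) -> Optional[str]:
    """Return "major.minor" from the 'release X.Y' line of nvcc --version."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def major_minor(version: str) -> str:
    """Reduce a version such as "12.1.105" to "12.1"."""
    return ".".join(version.split(".")[:2])


def system_cuda() -> Optional[str]:
    """CUDA version reported by nvcc, or None if nvcc is not installed."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(
        ["nvcc", "--version"], capture_output=True, text=True, check=False
    )
    return parse_nvcc_release(result.stdout)


def pytorch_cuda() -> Optional[str]:
    """CUDA version implied by the pip-installed runtime package, or None."""
    try:
        return major_minor(metadata.version(RUNTIME_PACKAGE))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    installed, expected = system_cuda(), pytorch_cuda()
    print(f"System CUDA: {installed}, CUDA expected by PyTorch packages: {expected}")
    if installed and expected and installed != expected:
        print("Version mismatch detected.")
```

Reading the version from package metadata rather than from torch.version.cuda keeps the check usable while the import is still broken.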
Solution
These solutions apply when Auto-install GPU Driver was selected on the Public Images tab in the Image section during instance purchase in the Elastic Compute Service (ECS) console.
- Method 1: Manually install CUDA
Install CUDA 12.1 on the existing instance. This approach does not require a new instance.
For installation steps, see NVIDIA CUDA Installation Guide for Linux.
After installation, verify the fix:
python3 -c "import torch; print(torch.__version__)"
- Method 2: Install CUDA by using a custom script
This method requires releasing the existing instance and purchasing a new one with a custom startup script that installs the correct CUDA version.
- Release the existing GPU-accelerated instance.
For more information, see Release instances.
- Purchase a new GPU-accelerated instance.
For more information, see Create a GPU-accelerated instance. Configure the following settings:
- On the Public Images tab in the Image section, do not select Auto-install GPU Driver.
- In the User Data field of the Advanced Settings (Optional) section, enter the following custom script. The script installs NVIDIA Tesla driver 535.154.05, CUDA 12.1.1, and cuDNN 8.9.7.29:
Sample custom script
#!/bin/sh
# Versions to install
DRIVER_VERSION="535.154.05"
CUDA_VERSION="12.1.1"
CUDNN_VERSION="8.9.7.29"
# Whether to install the eRDMA and RDMA software stacks
IS_INSTALL_eRDMA="FALSE"
IS_INSTALL_RDMA="FALSE"
INSTALL_DIR="/root/auto_install"
# The driver and CUDA are installed from .run packages
auto_install_script="auto_install_v4.0.sh"
# Build the download URL from the source address in the instance metadata
script_download_url="$(curl http://100.100.100.200/latest/meta-data/source-address | head -1)/opsx/ecs/linux/binary/script/${auto_install_script}"
echo "$script_download_url"
# Start from a clean installation directory
rm -rf "$INSTALL_DIR"
mkdir -p "$INSTALL_DIR"
# Download the installer (up to 10 attempts, 10-second timeout each) and run it
cd "$INSTALL_DIR" && wget -t 10 --timeout=10 "$script_download_url" && bash "${INSTALL_DIR}/${auto_install_script}" "$DRIVER_VERSION" "$CUDA_VERSION" "$CUDNN_VERSION" "$IS_INSTALL_RDMA" "$IS_INSTALL_eRDMA"
After the instance starts, verify the fix:
python3 -c "import torch; print(torch.__version__)"
- Method 3: Modify the instance user data and change the OS
This method reinstalls CUDA on the existing instance by modifying the user data script and replacing the OS.
- Stop the GPU-accelerated instance.
For more information, see Stop instances.
- On the Instance page, find the stopped GPU-accelerated instance and click the icon in the Actions column. In the Instance Settings section, click Set User Data.
- Modify the user data and click OK.
Change the values of the DRIVER_VERSION, CUDA_VERSION, and CUDNN_VERSION parameters to the following versions:
...
DRIVER_VERSION="535.154.05"
CUDA_VERSION="12.1.1"
CUDNN_VERSION="8.9.7.29"
...

- Change the OS of the GPU-accelerated instance.
For more information, see Replace the operating system (system disk) of an instance.
After the instance restarts, the system reinstalls the NVIDIA Tesla driver, CUDA, and cuDNN with the updated versions.
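After the instance is running again, you can read the user data back to confirm that the updated versions were saved. The following is a minimal sketch. It assumes that the ECS metadata service returns the user data as plain text at http://100.100.100.200/latest/user-data. The function names parse_versions and fetch_user_data are illustrative.

```python
import re
import urllib.request
from typing import Dict

# Assumption: the metadata service returns the instance user data at this path.
USER_DATA_URL = "http://100.100.100.200/latest/user-data"

# Matches lines such as CUDA_VERSION="12.1.1" in the user data script.
ASSIGNMENT_RE = re.compile(
    r'^(DRIVER_VERSION|CUDA_VERSION|CUDNN_VERSION)="([^"]*)"', re.MULTILINE
)


def parse_versions(script: str) -> Dict[str, str]:
    """Map each version variable found in the script to its value."""
    return dict(ASSIGNMENT_RE.findall(script))


def fetch_user_data(timeout: float = 5.0) -> str:
    """Download the user data. This works only from inside the instance."""
    with urllib.request.urlopen(USER_DATA_URL, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Offline example that uses the values from this topic.
sample = (
    'DRIVER_VERSION="535.154.05"\n'
    'CUDA_VERSION="12.1.1"\n'
    'CUDNN_VERSION="8.9.7.29"\n'
)
print(parse_versions(sample))
```

On the instance itself, call parse_versions(fetch_user_data()) and check that CUDA_VERSION is 12.1.1.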
Verify the fix:
python3 -c "import torch; print(torch.__version__)"
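The one-line check confirms only that the import succeeds. A slightly fuller check, sketched below, also reports the CUDA version that PyTorch was built against and whether a GPU is visible. The function name check_torch is illustrative. The torch attributes it reads are standard PyTorch attributes.

```python
from typing import Dict


def check_torch() -> Dict[str, object]:
    """Try to import torch and summarize its CUDA status."""
    try:
        import torch
    except ImportError as exc:
        # The undefined symbol error from this topic is reported here.
        return {"imported": False, "error": str(exc)}
    return {
        "imported": True,
        "torch_version": torch.__version__,
        "built_for_cuda": torch.version.cuda,
        "gpu_visible": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in check_torch().items():
        print(f"{key}: {value}")
```

After a successful fix, imported is True and built_for_cuda matches the major and minor version of the CUDA installation on the instance.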