GPU-accelerated instances that have an NVIDIA Tesla driver installed can deliver higher computing performance in general computing scenarios such as deep learning and AI, and smoother graphics in graphics acceleration scenarios such as Open Graphics Library (OpenGL), Direct3D, and cloud gaming. If you do not install a Tesla driver when you create a GPU-accelerated compute-optimized Linux instance, you must install the driver after the instance is created. This topic describes how to manually install a Tesla driver on a GPU-accelerated compute-optimized Linux instance.
Procedure
This topic applies to all GPU-accelerated compute-optimized Linux instances. For more information, see GPU-accelerated compute-optimized instance families. The Tesla driver that you install must match the OS of the instance. For example, you can install only a Linux Tesla driver on a GPU-accelerated compute-optimized Linux instance.
Step 1: Download a Tesla driver
Visit the NVIDIA Driver Downloads page on the NVIDIA official website.
Note: For more information about how to install and configure an NVIDIA driver, see NVIDIA Driver Installation Quickstart Guide.
Configure filters and click Search to search for a driver that is suitable for your instance.
The following table describes the filters.
| Filter | Description | Example |
| --- | --- | --- |
| Product Type, Product Series, Product | From the Product Type, Product Series, and Product drop-down lists, select values based on the GPU with which your GPU-accelerated compute-optimized instance is configured. Note: For more information about how to view the details of a GPU-accelerated instance, such as the instance ID, instance type, and OS, see View instance information. | Data Center / Tesla, A-Series, NVIDIA A10 |
| Operating System | Select a Linux version based on the image of the instance. | Linux 64-bit |
| CUDA Toolkit | Select a CUDA Toolkit version. | 11.4 |
| Language | Select a language for the driver. | Chinese (Simplified) |
| Recommended/Beta | By default, All is selected. You can use the default setting. | All |
The following table describes the GPU information about specific GPU-accelerated compute-optimized instance families.
| Instance family | gn5 | gn5i | gn6v | gn6i | gn6e | gn7 | gn7i | gn7e |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Product Type | Data Center / Tesla | Data Center / Tesla | Data Center / Tesla | Data Center / Tesla | Data Center / Tesla | Data Center / Tesla | Data Center / Tesla | Data Center / Tesla |
| Product Series | P-Series | P-Series | V-Series | T-Series | V-Series | A-Series | A-Series | A-Series |
Note: The preceding table lists the GPU information for only some popular GPU-accelerated compute-optimized instance families. Instances that use the same GPU have the same GPU information, such as the same product type, product series, and product family. For example, instances of the ebmgn7i and gn7i instance families both use NVIDIA A10 GPUs, so their product type, product series, and product family are the same.
In the search result, find the driver version that you want to download, such as version 470.161.03, and click the driver name.
On the driver details page, click Download.
In the Download section, right-click Agree & Download and select Copy URL.
Use one of the following methods to connect to your GPU-accelerated compute-optimized Linux instance:
Workbench
Virtual Network Computing (VNC)
Append the download URL that you copied in Substep 5 to the wget command and run the command to download the installation package of the driver. Sample command:
wget https://us.download.nvidia.com/tesla/470.161.03/NVIDIA-Linux-x86_64-470.161.03.run
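Later steps need the name of the downloaded installer and, for NVIDIA Fabric Manager, the driver version. Both can be derived from the copied URL. The following is a minimal sketch, not part of any NVIDIA tool: the function names installer_name and driver_version are hypothetical, and the sketch assumes that the URL ends in a file named NVIDIA-Linux-x86_64-<version>.run, as in the preceding sample command.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# Print the file name component of a download URL.
installer_name() {
    printf '%s\n' "${1##*/}"
}

# Print the driver version embedded in an installer file name or URL.
# Assumes the naming pattern NVIDIA-Linux-x86_64-<version>.run.
driver_version() {
    name=${1##*/}                      # drop everything up to the last "/"
    name=${name#NVIDIA-Linux-x86_64-}  # drop the fixed prefix
    printf '%s\n' "${name%.run}"       # drop the .run suffix
}

url="https://us.download.nvidia.com/tesla/470.161.03/NVIDIA-Linux-x86_64-470.161.03.run"
installer_name "$url"   # NVIDIA-Linux-x86_64-470.161.03.run
driver_version "$url"   # 470.161.03
```

Only POSIX parameter expansion is used, so the sketch does not depend on tools beyond the shell itself.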
Step 2: Install the Tesla driver
The method of installing a Tesla driver on an instance varies based on the OS of the instance. The following section describes how to install a Tesla driver on different OSs.
CentOS
Run the following command to check whether the kernel-devel and kernel-headers packages are installed on the GPU-accelerated instance:
rpm -qa | grep $(uname -r)
If the command output includes the version information about the kernel-devel and kernel-headers packages, the packages are installed. Sample command output:
kernel-3.10.0-1062.18.1.el7.x86_64
kernel-devel-3.10.0-1062.18.1.el7.x86_64
kernel-headers-3.10.0-1062.18.1.el7.x86_64
If the command output does not include the version information about the kernel-devel (kernel-devel-*) and kernel-headers (kernel-headers-*) packages, you must download and install the packages of the required version. For more information, see kernel-devel and kernel-headers.
Important: Make sure that the kernel-devel version is the same as the kernel version. Otherwise, a compilation error occurs when you install the driver. Check the kernel version in the command output before you download the kernel-devel package. In the preceding command output, the kernel version is 3.10.0-1062.18.1.el7.x86_64.
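The version comparison described above can be scripted instead of read by eye. The following is a minimal sketch under stated assumptions: the function name has_kernel_build_packages is hypothetical, and the package names are assumed to follow the kernel-devel-<release> and kernel-headers-<release> pattern shown in the sample output.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if both kernel-devel and kernel-headers
# for the given kernel release appear in a package list read from stdin.
has_kernel_build_packages() {
    release=$1
    pkgs=$(cat)
    for pkg in kernel-devel kernel-headers; do
        # -x matches the whole line, -F treats the pattern as a fixed string
        printf '%s\n' "$pkgs" | grep -qxF "${pkg}-${release}" || {
            echo "missing: ${pkg}-${release}" >&2
            return 1
        }
    done
}

# Typical use on the instance (not run here):
#   rpm -qa | has_kernel_build_packages "$(uname -r)" && echo "ready to install"
```

Reading the package list from standard input keeps the check independent of rpm, so it can be tried against saved command output.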
Grant execute permissions on the installation package of your Tesla driver and install the driver.
In this example, a Linux 64-bit Tesla driver is used. We recommend that you use a .run installation package for your Tesla driver, such as the NVIDIA-Linux-x86_64-xxxx.run package. Run the following commands to grant execute permissions on the installation package and install the Tesla driver:
chmod +x NVIDIA-Linux-x86_64-xxxx.run
sh NVIDIA-Linux-x86_64-xxxx.run
Note: If the installation package of your Tesla driver is in another format, such as .deb or .rpm, see NVIDIA CUDA Installation Guide for Linux for the installation method.
Run the following command to check whether the Tesla driver is installed:
nvidia-smi
If the command output lists the driver version and the GPUs of the instance, the Tesla driver is installed.
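To confirm that the installed driver is the version that you intended, the reported version can be compared in a script. The following is a sketch only: the function name driver_matches is hypothetical, and it expects one version string per GPU on standard input, which is the form that the nvidia-smi query in the usage comment prints.

```shell
#!/bin/sh
# Hypothetical helper: compare the version reported by the driver with the
# version you intended to install. Reads one version per GPU from stdin.
driver_matches() {
    expected=$1
    seen=0
    while IFS= read -r line; do
        [ -n "$line" ] || continue        # ignore blank lines
        seen=1
        [ "$line" = "$expected" ] || return 1
    done
    [ "$seen" -eq 1 ]                     # fail if no GPU was reported at all
}

# Typical use on the instance (not run here):
#   nvidia-smi --query-gpu=driver_version --format=csv,noheader \
#       | driver_matches 470.161.03 && echo "driver OK"
```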
(Optional) Enable the persistence mode (Persistence-M) by using the NVIDIA Persistence Daemon.
After the Tesla driver is installed, Persistence-M is in the disabled (off) state by default. A Tesla driver can achieve more stable performance when Persistence-M is enabled. We recommend that you enable Persistence-M by using the NVIDIA Persistence Daemon to ensure business continuity. For more information, see Persistence Daemon.
Note: Persistence-M is a term for a user-settable driver property that keeps a GPU in the initialized state.
NVIDIA provides the Persistence Mode (Legacy) method to enable Persistence-M by using the nvidia-smi -pm 1 command. The Persistence Mode (Legacy) method is near end-of-life and will be deprecated and replaced by the NVIDIA Persistence Daemon method.
Run the following command to start the NVIDIA Persistence Daemon:
sudo nvidia-persistenced --user username # Replace username with your username.
Run the following command to view the status of Persistence-M:
nvidia-smi
If the Persistence-M column in the command output shows On, Persistence-M is in the enabled (on) state.
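On instances with several GPUs, the persistence status can also be checked in a script. The following is a minimal sketch: the function name count_non_persistent is hypothetical, and it assumes one status per line on standard input, with the value Enabled or Disabled, which is what the nvidia-smi query in the usage comment reports.

```shell
#!/bin/sh
# Hypothetical helper: print the number of GPUs whose persistence mode is
# not "Enabled". Reads one status per line from stdin.
count_non_persistent() {
    awk '{
        gsub(/^[ \t]+|[ \t]+$/, "")            # trim surrounding whitespace
        if ($0 != "" && $0 != "Enabled") n++   # count anything not enabled
    } END { print n + 0 }'
}

# Typical use on the instance (not run here):
#   nvidia-smi --query-gpu=persistence_mode --format=csv,noheader \
#       | count_non_persistent
```

A result of 0 means that every reported GPU has Persistence-M enabled.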
(Optional) Enable Persistence-M after you restart the system.
If you restart the system, Persistence-M does not remain in the enabled (on) state. Perform the following operations to enable Persistence-M again:
When you install the Tesla driver installation package, the installation scripts provided by NVIDIA, such as the sample scripts and the installer script, are installed to the /usr/share/doc/NVIDIA_GLX-1.0/samples/nvidia-persistenced-init.tar.bz2 path.
Run the following commands to decompress the archive and run the installation script provided by NVIDIA:
cd /usr/share/doc/NVIDIA_GLX-1.0/samples/
tar xf nvidia-persistenced-init.tar.bz2
cd nvidia-persistenced-init
sh install.sh
Run the following command to check whether the NVIDIA Persistence Daemon runs as expected:
systemctl status nvidia-persistenced
If the command output shows that the service is active (running), the NVIDIA Persistence Daemon runs as expected.
Note: You can adapt the NVIDIA Persistence Daemon installation script based on your OS to ensure that the NVIDIA Persistence Daemon works as expected.
Run the following command to verify that Persistence-M is in the enabled (on) state:
nvidia-smi
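The restart procedure above relies on the nvidia-persistenced-init.tar.bz2 archive being present at the documented path. A small check before the extraction commands gives a clearer error if the archive is missing. The following is a sketch only: the function name find_persistenced_init is hypothetical, and the default directory is the one named in this topic.

```shell
#!/bin/sh
# Hypothetical helper: print the archive path if it exists under the given
# samples directory; otherwise report the missing path on stderr and fail.
find_persistenced_init() {
    dir=${1:-/usr/share/doc/NVIDIA_GLX-1.0/samples}
    archive=$dir/nvidia-persistenced-init.tar.bz2
    if [ -f "$archive" ]; then
        printf '%s\n' "$archive"
    else
        echo "not found: $archive" >&2
        return 1
    fi
}

# Typical use on the instance (not run here):
#   archive=$(find_persistenced_init) && tar xf "$archive" -C /tmp
```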
(Optional) Disable the NVIDIA Persistence Daemon.
You can disable the NVIDIA Persistence Daemon based on your business requirements. Run the following commands:
systemctl stop nvidia-persistenced
systemctl disable nvidia-persistenced
(Optional) Install NVIDIA Fabric Manager that matches the driver version.
Important: If your GPU-accelerated instance belongs to the ebmgn7 or ebmgn7e instance family, you must install a version of NVIDIA Fabric Manager that matches the driver version. Otherwise, you cannot use the instance as expected.
You can skip this operation if your GPU-accelerated instance does not belong to the ebmgn7 or ebmgn7e instance family.
Install NVIDIA Fabric Manager.
You can install NVIDIA Fabric Manager by using the source code or the installation package. The commands that are required to install NVIDIA Fabric Manager vary based on your OS. In the following examples, the driver version is 460.91.03, and CentOS 7.x and CentOS 8.x are used. Replace driver_version with the version of the driver that you downloaded in Step 1: Download a Tesla driver.
Source code
Installation package
Run the following commands to start NVIDIA Fabric Manager:
systemctl enable nvidia-fabricmanager
systemctl start nvidia-fabricmanager
Run the following command to check whether NVIDIA Fabric Manager is installed:
systemctl status nvidia-fabricmanager
If the command output shows that the service is active (running), NVIDIA Fabric Manager is installed.
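Whether the Fabric Manager step applies depends only on the instance family, so a provisioning script can decide this before it attempts the installation. The following is a minimal sketch under stated assumptions: the function name needs_fabric_manager is hypothetical, and instance type names are assumed to have the form ecs.<family>.<size>.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the instance family requires Fabric Manager.
# Assumes instance type names of the form ecs.<family>.<size>.
needs_fabric_manager() {
    family=${1#ecs.}       # drop the leading "ecs."
    family=${family%%.*}   # keep the text before the next "."
    case $family in
        ebmgn7|ebmgn7e) return 0 ;;
        *)              return 1 ;;
    esac
}

# Example with an assumed instance type name:
#   needs_fabric_manager ecs.ebmgn7e.32xlarge && echo "install Fabric Manager"
```

The family is matched exactly, so a family such as ebmgn7i is not treated as requiring Fabric Manager.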
Other Linux distributions such as Ubuntu
Grant execute permissions on the installation package of your Tesla driver and install the driver.
In this example, a Linux 64-bit Tesla driver is used. We recommend that you use a .run installation package for your Tesla driver, such as the NVIDIA-Linux-x86_64-xxxx.run package. Run the following commands to grant execute permissions on the installation package and install the Tesla driver:
chmod +x NVIDIA-Linux-x86_64-xxxx.run
sh NVIDIA-Linux-x86_64-xxxx.run
Note: If the installation package of your Tesla driver is in another format, such as .deb or .rpm, see NVIDIA CUDA Installation Guide for Linux for the installation method.
Run the following command to check whether the Tesla driver is installed:
nvidia-smi
If the command output lists the driver version and the GPUs of the instance, the Tesla driver is installed.
(Optional) Enable the persistence mode (Persistence-M) by using the NVIDIA Persistence Daemon.
After the Tesla driver is installed, Persistence-M is in the disabled (off) state by default. A Tesla driver can achieve more stable performance when Persistence-M is enabled. We recommend that you enable Persistence-M by using the NVIDIA Persistence Daemon to ensure business continuity. For more information, see Persistence Daemon.
Note: Persistence-M is a term for a user-settable driver property that keeps a GPU in the initialized state.
NVIDIA provides the Persistence Mode (Legacy) method to enable Persistence-M by using the nvidia-smi -pm 1 command. The Persistence Mode (Legacy) method is near end-of-life and will be deprecated and replaced by the NVIDIA Persistence Daemon method.
Run the following command to start the NVIDIA Persistence Daemon:
sudo nvidia-persistenced --user username # Replace username with your username.
Run the following command to view the status of Persistence-M:
nvidia-smi
If the Persistence-M column in the command output shows On, Persistence-M is in the enabled (on) state.
(Optional) Enable Persistence-M after you restart the system.
If you restart the system, Persistence-M does not remain in the enabled (on) state. Perform the following operations to enable Persistence-M again:
When you install the Tesla driver installation package, the installation scripts provided by NVIDIA, such as the sample scripts and the installer script, are installed to the /usr/share/doc/NVIDIA_GLX-1.0/samples/nvidia-persistenced-init.tar.bz2 path.
Run the following commands to decompress the archive and run the installation script provided by NVIDIA:
cd /usr/share/doc/NVIDIA_GLX-1.0/samples/
tar xf nvidia-persistenced-init.tar.bz2
cd nvidia-persistenced-init
sh install.sh
Run the following command to check whether the NVIDIA Persistence Daemon runs as expected:
systemctl status nvidia-persistenced
If the command output shows that the service is active (running), the NVIDIA Persistence Daemon runs as expected.
Note: You can adapt the NVIDIA Persistence Daemon installation script based on your OS to ensure that the NVIDIA Persistence Daemon works as expected.
Run the following command to verify that Persistence-M is in the enabled (on) state:
nvidia-smi
(Optional) Disable the NVIDIA Persistence Daemon.
You can disable the NVIDIA Persistence Daemon based on your business requirements. Run the following commands:
systemctl stop nvidia-persistenced
systemctl disable nvidia-persistenced
(Optional) Install NVIDIA Fabric Manager that matches the driver version.
Important: If your GPU-accelerated instance belongs to the ebmgn7 or ebmgn7e instance family, you must install a version of NVIDIA Fabric Manager that matches the driver version. Otherwise, you cannot use the instance as expected.
You can skip this operation if your GPU-accelerated instance does not belong to the ebmgn7 or ebmgn7e instance family.
Install NVIDIA Fabric Manager.
You can install NVIDIA Fabric Manager by using the source code or the installation package. The commands that are required to install NVIDIA Fabric Manager vary based on your OS. In the following examples, the driver version is 460.91.03, and Ubuntu 16.04, Ubuntu 18.04, and Ubuntu 20.04 are used. Replace driver_version with the version of the driver that you downloaded in Step 1: Download a Tesla driver.
Source code
Installation package
Run the following commands to start NVIDIA Fabric Manager:
systemctl enable nvidia-fabricmanager
systemctl start nvidia-fabricmanager
Run the following command to check whether NVIDIA Fabric Manager is installed:
systemctl status nvidia-fabricmanager
If the command output shows that the service is active (running), NVIDIA Fabric Manager is installed.
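Because NVIDIA Fabric Manager must match the driver version, it can be useful to compare the two versions after installation. The following is a sketch only: the function name fabric_manager_matches is hypothetical, the package name in the usage comment is an assumption, and the sketch assumes that the package version is the driver version optionally followed by a -<revision> suffix, with no epoch prefix.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a package version such as "460.91.03-1"
# corresponds to the driver version "460.91.03".
fabric_manager_matches() {
    driver=$1
    package=${2%%-*}   # drop a trailing "-<revision>" suffix, if any
    [ "$driver" = "$package" ]
}

# Typical use on an Ubuntu instance (not run here). The package name
# nvidia-fabricmanager-460 is an assumption; adjust it to what you installed.
#   fabric_manager_matches \
#       "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)" \
#       "$(dpkg-query -W -f='${Version}' nvidia-fabricmanager-460)"
```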
References
On a GPU-accelerated compute-optimized Windows instance, you can install only a Tesla driver, which is intended for general computing scenarios such as deep learning and AI. For more information, see Install a Tesla driver on a GPU-accelerated compute-optimized Windows instance.
You can install a Tesla driver when you create a GPU-accelerated instance. For more information, see Create a GPU-accelerated instance.
If you no longer need a Tesla driver, you can uninstall the driver. For more information, see Uninstall an NVIDIA Tesla driver.
If the driver version of your GPU-accelerated instance cannot meet your business requirements, or your GPU-accelerated instance becomes unavailable due to an invalid driver type or version, you can uninstall the driver and install a new driver. You can also upgrade the driver. For more information, see Upgrade an NVIDIA Tesla or GRID driver.