This topic describes how to resolve the following issues: Persistence Mode (Persistence-M) that you enabled by running the nvidia-smi -pm 1 command does not take effect, and the error-correcting code (ECC) status or the multi-instance GPU (MIG) feature fails to be configured, after you restart a GPU-accelerated compute-optimized instance on which Tesla driver version 535 or later is installed.
Problem description
You install Tesla driver version 535 or later on a GPU-accelerated compute-optimized Linux instance and then run the nvidia-smi -pm 1 command to enable Persistence Mode. As a result, the following issues may occur:
After you restart the GPU-accelerated compute-optimized instance, Persistence Mode is in the Off state, which indicates that Persistence Mode is disabled.
The ECC status fails to be configured.
The MIG feature fails to be configured.
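To confirm the first symptom after a restart, you can query the Persistence Mode of each GPU. The following is a minimal sketch, not an official procedure. The helper name gpus_without_persistence is hypothetical, and the sketch assumes that nvidia-smi reports the mode as Enabled or Disabled in its CSV output.

```shell
#!/bin/sh
# Hypothetical helper: reads "index, persistence_mode" CSV rows on stdin
# and prints the index of every GPU whose Persistence Mode is Disabled.
gpus_without_persistence() {
  awk -F', *' '$2 == "Disabled" { print $1 }'
}

# Run the query only if the NVIDIA driver tools are installed.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=index,persistence_mode --format=csv,noheader \
    | gpus_without_persistence
fi
```

If the command prints one or more GPU indexes after the restart, Persistence Mode did not survive the restart on those GPUs.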
Cause
The version of the Tesla driver is not compatible with the instance. When you run the nvidia-smi -pm 1 command to enable Persistence Mode and then restart the GPU-accelerated compute-optimized instance, the preceding issues may occur.
Solution
If the dmesg logs contain the following message, enable Persistence Mode by using the NVIDIA Persistence Daemon instead of the nvidia-smi -pm 1 command. For more information, see the Enable Persistence Mode by using the NVIDIA Persistence Daemon step in the "Step 2: Install the Tesla driver" section of the "Manually install the Tesla driver on a GPU-accelerated compute-optimized Linux instance" topic.
NVRM: Persistence mode is deprecated and will be removed in a future release. Please use nvidia-persistenced instead.
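The check described above can be sketched as follows. This is an illustration under stated assumptions, not the documented procedure: the helper name has_persistence_deprecation_notice is hypothetical, and the commented-out systemctl line assumes that your driver installation provides a systemd unit named nvidia-persistenced. Follow the referenced topic for the authoritative steps.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the kernel log text on stdin contains
# the deprecation message shown above.
has_persistence_deprecation_notice() {
  grep -q "Persistence mode is deprecated"
}

# Reading the kernel log may require root privileges on some systems.
if dmesg 2>/dev/null | has_persistence_deprecation_notice; then
  echo "Deprecation message found: enable Persistence Mode with nvidia-persistenced."
  # Assumption: a systemd unit named nvidia-persistenced exists on this instance.
  # sudo systemctl enable --now nvidia-persistenced
fi
```

After the daemon is running, restart the instance and check the Persistence-M field in the nvidia-smi output to verify that the setting persists.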