This topic describes how to build a heterogeneous confidential computing environment on an Alibaba Cloud heterogeneous confidential computing instance (gn8v-tee). This topic also shows how to run sample code to verify the GPU-based confidential computing feature.
Background
Alibaba Cloud heterogeneous confidential computing instances (gn8v-tee) are built on CPU TDX confidential computing instances and integrate a GPU into a Trusted Execution Environment (TEE). This integration protects data transfers between the CPU and GPU, and data computations within the GPU. This topic focuses on verifying the GPU-based confidential computing feature. For more information about how to build a CPU TDX confidential computing environment and verify its remote attestation capabilities, see Build a TDX confidential computing environment. If you want to deploy a large language model inference environment on a heterogeneous confidential computing instance, see Build an LLM inference environment that supports security measurement on a heterogeneous confidential computing instance.
The GPU on a heterogeneous confidential computing instance starts in confidential computing mode. The confidentiality of the instance is ensured by the following mechanisms:
The TDX feature ensures that the Hypervisor/Host OS cannot access the sensitive registers or memory data of the instance.
The PCIe firewall prevents the CPU from accessing the GPU's critical registers and protected video memory. The Hypervisor/Host OS has limited access and can perform only certain operations, such as resetting the GPU, but cannot access sensitive data. This ensures the confidentiality of data within the GPU.
The GPU's NVLink Firewall prevents other GPUs from directly accessing its video memory.
During initialization, the GPU driver and library functions in the CPU TEE establish an encrypted channel with the GPU using the Security Protocol and Data Model (SPDM). After key negotiation, the CPU and GPU transmit only ciphertext data over PCIe. This ensures the confidentiality of the data transmission link between the CPU and GPU.
The GPU's remote attestation capability confirms whether the GPU is in a secure state.
Specifically, applications in the confidential computing instance can use the Attestation SDK to call the GPU driver and obtain an attestation report on the GPU's security status from the hardware. The report contains cryptographically signed measurements of the GPU hardware, VBIOS, and hardware state. A relying party can compare these measurement values with the reference values provided by the GPU vendor to confirm that the GPU is in a secure confidential computing state.
Usage note
Heterogeneous confidential computing is supported only on Alibaba Cloud Linux 3 images. If you use a custom image that is built on Alibaba Cloud Linux 3 to create an instance, ensure that the kernel version is 5.10.134-18 or later.
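To confirm that an image meets this requirement, you can check the running kernel version. A minimal check:
# The kernel version must be 5.10.134-18 or later.
uname -r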
Create a heterogeneous confidential computing instance (gn8v-tee)
ECS console
The steps to create an instance with heterogeneous confidential computing features in the console are similar to creating a regular instance. However, you must select specific options. This section highlights the specific configurations for heterogeneous confidential computing instances. For information about other general configurations, see Create an instance using the wizard.
Go to ECS console - Instance.
In the top navigation bar, select the region and resource group of the resource that you want to manage.
Click Create Instance and configure the instance with the following settings.
Configuration Item | Description |
Region and Zone | Select China (Beijing) Zone L. |
Instance Type | Only ecs.gn8v-tee.4xlarge and higher instance types are supported. |
Image | Select the Alibaba Cloud Linux 3.2104 LTS 64-bit image. |
Public IP Address | Assign Public IPv4 Address. This ensures that you can download the driver from the official NVIDIA website later. |
Important: When you create an 8-GPU confidential instance, do not add extra secondary elastic network interfaces (ENIs). Doing so may prevent the instance from starting.
Follow the on-screen instructions to complete the instance creation.
OpenAPI or CLI
You can call the RunInstances operation or use the Alibaba Cloud command-line interface (CLI) to create an ECS instance that supports heterogeneous confidential computing. The following table describes the key parameters.
Parameter | Description | Example |
RegionId | China (Beijing) | cn-beijing |
ZoneId | Zone L | cn-beijing-l |
InstanceType | Select ecs.gn8v-tee.4xlarge or a higher instance type. | ecs.gn8v-tee.4xlarge |
ImageId | Specify the ID of an image that supports heterogeneous confidential computing. Currently, only Alibaba Cloud Linux 3.2104 LTS 64-bit images with a kernel version of 5.10.134-18.al8.x86_64 or later are supported. | aliyun_3_x64_20G_alibase_20250117.vhd |
CLI example:
aliyun ecs RunInstances \
--RegionId cn-beijing \
--ZoneId cn-beijing-l \
--SystemDisk.Category cloud_essd \
--ImageId 'aliyun_3_x64_20G_alibase_20250117.vhd' \
--InstanceType 'ecs.gn8v-tee.4xlarge' \
--SecurityGroupId 'sg-[SecurityGroupId]' \
--VSwitchId 'vsw-[VSwitchID]' \
--KeyPairName [KEY_PAIR_NAME]
Build the heterogeneous confidential computing environment
Step 1: Install the NVIDIA driver and CUDA Toolkit
Heterogeneous confidential computing instances take a long time to initialize. Wait until the instance status is Running and the operating system has fully started before you proceed with the following operations.
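If you created the instance by using the CLI, you can poll the instance status instead of watching the console. A minimal sketch using the DescribeInstances operation, where i-EXAMPLE is a placeholder for your instance ID:
# Wait until the Status field reports Running.
aliyun ecs DescribeInstances --RegionId cn-beijing --InstanceIds '["i-EXAMPLE"]' | grep '"Status"'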
The installation steps vary based on the instance type:
Single-GPU confidential instances: ecs.gn8v-tee.4xlarge and ecs.gn8v-tee.6xlarge
8-GPU confidential instances: ecs.gn8v-tee-8x.16xlarge and ecs.gn8v-tee-8x.48xlarge
Single-GPU confidential instances
Remotely connect to the confidential computing instance.
For more information, see Log on to a Linux instance using Workbench.
Adjust the kernel parameters to set the SWIOTLB buffer to 8 GB (4,194,304 SWIOTLB slots × 2 KB each).
sudo grubby --update-kernel=ALL --args="swiotlb=4194304,any"
Restart the instance for the configuration to take effect.
For more information, see Restart an instance.
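After the restart, you can optionally confirm that the kernel parameter took effect:
# The kernel command line should now contain the new SWIOTLB setting.
grep -o 'swiotlb=[^ ]*' /proc/cmdline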
Download the NVIDIA driver and CUDA Toolkit.
Single-GPU confidential instances require driver version 550.144.03 or later. This topic uses version 550.144.03 as an example.
wget --referer=https://www.nvidia.cn/ https://cn.download.nvidia.cn/tesla/550.144.03/NVIDIA-Linux-x86_64-550.144.03.run
wget https://developer.download.nvidia.com/compute/cuda/12.4.1/local_installers/cuda_12.4.1_550.54.15_linux.run
Install dependencies and disable the CloudMonitor service.
sudo yum install -y openssl3
sudo systemctl disable cloudmonitor
sudo systemctl stop cloudmonitor
Create and configure nvidia-persistenced.service.
cat > nvidia-persistenced.service << EOF
[Unit]
Description=NVIDIA Persistence Daemon
Wants=syslog.target
Before=cloudmonitor.service

[Service]
Type=forking
ExecStart=/usr/bin/nvidia-persistenced --user root
ExecStartPost=/usr/bin/nvidia-smi conf-compute -srs 1
ExecStopPost=/bin/rm -rf /var/run/nvidia-persistenced

[Install]
WantedBy=multi-user.target
EOF
sudo cp nvidia-persistenced.service /usr/lib/systemd/system/nvidia-persistenced.service
Install the NVIDIA driver and CUDA Toolkit.
sudo bash NVIDIA-Linux-x86_64-550.144.03.run --ui=none --no-questions --accept-license --disable-nouveau --no-cc-version-check --install-libglvnd --kernel-module-build-directory=kernel-open --rebuild-initramfs
sudo bash cuda_12.4.1_550.54.15_linux.run --silent --toolkit
Start the nvidia-persistenced and CloudMonitor services.
sudo systemctl start nvidia-persistenced.service
sudo systemctl enable nvidia-persistenced.service
sudo systemctl start cloudmonitor
sudo systemctl enable cloudmonitor
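At this point, you can optionally confirm that the driver is loaded and responding; for example, query the driver version (the output format may vary slightly across driver releases):
# Prints the driver version and GPU name for each GPU.
nvidia-smi --query-gpu=driver_version,name --format=csv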
8-GPU confidential instances
Remotely connect to the confidential computing instance.
For more information, see Log on to a Linux instance using Workbench.
Important: Confidential computing instances take a long time to initialize. Ensure that the initialization process is complete before you proceed.
Adjust the kernel parameters to set the SWIOTLB buffer to 8 GB (4,194,304 SWIOTLB slots × 2 KB each).
sudo grubby --update-kernel=ALL --args="swiotlb=4194304,any"
Configure the loading behavior of the NVIDIA driver and regenerate the initramfs.
sudo bash -c 'cat > /etc/modprobe.d/nvidia-lkca.conf << EOF
install nvidia /sbin/modprobe ecdsa_generic; /sbin/modprobe ecdh; /sbin/modprobe --ignore-install nvidia
options nvidia NVreg_RegistryDwords="RmEnableProtectedPcie=0x1"
EOF'
sudo dracut --regenerate-all -f
Restart the instance for the configuration to take effect.
For more information, see Restart an instance.
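After the restart, you can optionally confirm that both changes took effect:
# The kernel command line should contain the new SWIOTLB setting.
grep -o 'swiotlb=[^ ]*' /proc/cmdline
# The modprobe configuration should contain the Protected PCIe option.
cat /etc/modprobe.d/nvidia-lkca.conf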
Download the NVIDIA driver and CUDA Toolkit.
8-GPU confidential computing instances require driver version 570.148.08 or later and the corresponding version of Fabric Manager. This topic uses version 570.148.08 as an example.
wget --referer=https://www.nvidia.cn/ https://cn.download.nvidia.cn/tesla/570.148.08/NVIDIA-Linux-x86_64-570.148.08.run
wget https://developer.download.nvidia.com/compute/cuda/12.8.1/local_installers/cuda_12.8.1_570.124.06_linux.run
wget https://developer.download.nvidia.cn/compute/cuda/repos/rhel8/x86_64/nvidia-fabric-manager-570.148.08-1.x86_64.rpm
Install dependencies and disable the CloudMonitor service.
sudo yum install -y openssl3
sudo systemctl disable cloudmonitor
sudo systemctl stop cloudmonitor
Create and configure nvidia-persistenced.service.
cat > nvidia-persistenced.service << EOF
[Unit]
Description=NVIDIA Persistence Daemon
Wants=syslog.target
Before=cloudmonitor.service
After=nvidia-fabricmanager.service

[Service]
Type=forking
ExecStart=/usr/bin/nvidia-persistenced --user root --uvm-persistence-mode --verbose
ExecStartPost=/usr/bin/nvidia-smi conf-compute -srs 1
ExecStopPost=/bin/rm -rf /var/run/nvidia-persistenced
TimeoutStartSec=900
TimeoutStopSec=60

[Install]
WantedBy=multi-user.target
EOF
sudo cp nvidia-persistenced.service /usr/lib/systemd/system/nvidia-persistenced.service
Install Fabric Manager, the NVIDIA driver, and the CUDA Toolkit.
sudo rpm -ivh nvidia-fabric-manager-570.148.08-1.x86_64.rpm
sudo bash NVIDIA-Linux-x86_64-570.148.08.run --ui=none --no-questions --accept-license --disable-nouveau --no-cc-version-check --install-libglvnd --kernel-module-build-directory=kernel-open --rebuild-initramfs
sudo bash cuda_12.8.1_570.124.06_linux.run --silent --toolkit
Start the Fabric Manager, nvidia-persistenced, and CloudMonitor services.
sudo systemctl start nvidia-fabricmanager.service
sudo systemctl enable nvidia-fabricmanager.service
sudo systemctl start nvidia-persistenced.service
sudo systemctl enable nvidia-persistenced.service
sudo systemctl start cloudmonitor
sudo systemctl enable cloudmonitor
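Optionally, confirm that both services reached the active state before you continue. On 8-GPU instances, nvidia-persistenced can take several minutes to start:
# Both units should report "active".
systemctl is-active nvidia-fabricmanager.service nvidia-persistenced.service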
Step 2: Check the TDX status
The heterogeneous confidential computing feature is built on TDX. You must check the TDX status of the instance to verify that it is protected.
Check whether TDX is enabled.
lscpu | grep -i tdx_guest
If the output contains the tdx_guest flag, TDX is enabled.
Check the installation of TDX-related drivers.
ls -l /dev/tdx_guest
If the /dev/tdx_guest device file exists, the TDX-related drivers are installed.
Step 3: Check the GPU-based confidential computing feature status
Single-GPU confidential instances
View the confidential computing feature status.
nvidia-smi conf-compute -f
A return value of CC status: ON indicates that the confidential computing feature is enabled. A return value of CC status: OFF indicates that the feature is disabled and the instance is in an abnormal state. If the instance is in an abnormal state, submit a ticket.
8-GPU confidential instances
View the status of the multi-GPU confidential computing mode.
nvidia-smi conf-compute -mgm
A result of Multi-GPU Mode: Protected PCIe indicates that the multi-GPU confidential computing feature is enabled. A result of Multi-GPU Mode: None indicates that the feature is disabled and the instance is in an abnormal state. If this occurs, submit a ticket.
On an 8-GPU confidential instance, the nvidia-smi conf-compute -f command normally returns CC status: OFF. This is expected: multi-GPU protection is reported through the Protected PCIe mode rather than the per-GPU CC status.
Step 4: Verify the trustworthiness of the GPU/NVSwitch through local attestation
Single-GPU confidential instances
Install the dependencies required for GPU attestation.
sudo yum install -y python3.11 python3.11-devel python3.11-pip
sudo alternatives --install /usr/bin/python3 python3 /usr/bin/python3.11 60
sudo alternatives --set python3 /usr/bin/python3.11
sudo python3 -m ensurepip --upgrade
sudo python3 -m pip install --upgrade pip
sudo python3 -m pip install nv_attestation_sdk==2.5.0.post6914366 nv_local_gpu_verifier==2.5.0.post6914366 nv_ppcie_verifier==1.5.0.post6914366 -f https://attest-public-cn-beijing.oss-cn-beijing.aliyuncs.com/repo/pip/attest.html
Verify the GPU's trust status.
python3 -m verifier.cc_admin --user_mode
If the GPU is in a confidential computing state and the measurement values for the driver, VBIOS, and other components match the expected values, the output reports a successful verification.
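If you script the verification, you can gate later steps on the result. A minimal sketch, assuming the verifier exits with a nonzero status when verification fails:
# Assumption: verifier.cc_admin returns a nonzero exit code on failed attestation.
if python3 -m verifier.cc_admin --user_mode; then
    echo "GPU local attestation passed"
else
    echo "GPU local attestation failed" >&2
    exit 1
fi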
8-GPU confidential instances
Install the dependencies required for GPU attestation.
sudo yum install -y python3.11 python3.11-devel python3.11-pip
sudo alternatives --install /usr/bin/python3 python3 /usr/bin/python3.11 60
sudo alternatives --set python3 /usr/bin/python3.11
sudo python3 -m ensurepip --upgrade
sudo python3 -m pip install --upgrade pip
sudo python3 -m pip install nv_attestation_sdk==2.5.0.post6914366 nv_local_gpu_verifier==2.5.0.post6914366 nv_ppcie_verifier==1.5.0.post6914366 -f https://attest-public-cn-beijing.oss-cn-beijing.aliyuncs.com/repo/pip/attest.html
Install the NVSwitch-related dependencies.
wget https://developer.download.nvidia.cn/compute/cuda/repos/rhel8/x86_64/libnvidia-nscq-570-570.148.08-1.x86_64.rpm
sudo rpm -ivh libnvidia-nscq-570-570.148.08-1.x86_64.rpm
Run the following command to verify the GPU/NVSwitch trust status.
python3 -m ppcie.verifier.verification --gpu-attestation-mode=LOCAL --switch-attestation-mode=LOCAL
The command verifies 8 GPUs and 4 NVSwitches. A final output of SUCCESS indicates that the verification is successful.
Limitations
Because the heterogeneous confidential computing feature is built on TDX, the functional limitations of TDX confidential computing instances also apply to heterogeneous confidential computing instances. For more information, see Known limitations of TDX instances.
After the GPU-based confidential computing feature is enabled, data transmission between the CPU and GPU requires encryption and decryption. This results in some performance loss for GPU-related tasks compared to non-confidential heterogeneous computing instances.
Usage notes
Single-GPU instances use CUDA 12.4. NVIDIA's cuBLAS library has a known issue that may cause errors when you run CUDA tasks or large language model tasks. You must install a specific version of cuBLAS.
pip3 install nvidia-cublas-cu12==12.4.5.8
After the GPU-based confidential computing feature is enabled, initialization is slow, especially for 8-GPU confidential instances. After the guest OS starts, ensure that the nvidia-persistenced service has finished starting before you run nvidia-smi or other commands that use the GPU. To check the status of the nvidia-persistenced service, run the following command:
systemctl status nvidia-persistenced | grep "Active: "
activating (start) indicates that the service is still starting:
Active: activating (start) since Wed 2025-02-19 10:07:54 CST; 2min 20s ago
active (running) indicates that the service is running:
Active: active (running) since Wed 2025-02-19 10:10:28 CST; 22s ago
Any auto-start service that uses the GPU, such as cloudmonitor.service or ollama.service, must be started after nvidia-persistenced.service.
The following is an example of the /usr/lib/systemd/system/nvidia-persistenced.service configuration:
[Unit]
Description=NVIDIA Persistence Daemon
Wants=syslog.target
Before=cloudmonitor.service ollama.service
After=nvidia-fabricmanager.service

[Service]
Type=forking
ExecStart=/usr/bin/nvidia-persistenced --user root --uvm-persistence-mode --verbose
ExecStartPost=/usr/bin/nvidia-smi conf-compute -srs 1
ExecStopPost=/bin/rm -rf /var/run/nvidia-persistenced
TimeoutStartSec=900
TimeoutStopSec=60

[Install]
WantedBy=multi-user.target
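If you prefer not to modify nvidia-persistenced.service itself, you can declare the ordering from the dependent service's side with a systemd drop-in. A minimal sketch for ollama.service (the drop-in file name is arbitrary):
# Create a drop-in that orders ollama.service after nvidia-persistenced.service.
sudo mkdir -p /etc/systemd/system/ollama.service.d
sudo tee /etc/systemd/system/ollama.service.d/10-after-nvidia.conf << 'EOF'
[Unit]
After=nvidia-persistenced.service
Wants=nvidia-persistenced.service
EOF
sudo systemctl daemon-reload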