Elastic Compute Service: Build a heterogeneous confidential computing environment

Last updated: Dec 09, 2025

This topic describes how to build a heterogeneous confidential computing environment on an Alibaba Cloud heterogeneous confidential computing instance (gn8v-tee) and how to run sample code to verify the GPU-based confidential computing features.

Background

Alibaba Cloud heterogeneous confidential computing instances (gn8v-tee) are built on TDX-based CPU confidential computing instances and bring the GPU into the Trusted Execution Environment (TEE). This integration protects data in transit between the CPU and GPU as well as data being computed inside the GPU. This topic focuses on verifying the GPU-based confidential computing features. For information about how to build a TDX CPU confidential computing environment and verify its remote attestation capability, see Build a TDX confidential computing environment. To deploy a large language model (LLM) inference environment on a heterogeneous confidential computing instance, see Build an LLM inference environment that supports security measurement on a heterogeneous confidential computing instance.

As shown in the preceding figure, the GPU on a heterogeneous confidential computing instance starts in confidential computing mode. The following mechanisms ensure the confidentiality of the instance:

  1. TDX ensures that the hypervisor/host OS cannot access the sensitive registers or memory data of the instance.

  2. The PCIe firewall prevents the CPU from accessing critical GPU registers and protected video memory. The hypervisor/host OS has limited access and can perform only specific operations, such as resetting the GPU, but cannot access sensitive data. This ensures the confidentiality of data inside the GPU.

  3. The GPU's NVLink firewall prevents other GPUs from directly accessing its video memory.

  4. During initialization, the GPU driver and library functions inside the CPU TEE establish an encrypted channel with the GPU by using the Security Protocol and Data Model (SPDM). After key negotiation, the CPU and GPU transmit only ciphertext over PCIe. This ensures the confidentiality of the data transmission link between the CPU and GPU.

  5. The GPU's remote attestation capability confirms whether the GPU is in a secure state.

    Specifically, an application in the confidential computing instance can use the Attestation SDK to call the GPU driver and obtain a cryptographic report on the GPU's security status from the hardware. The report contains cryptographically signed GPU hardware information, VBIOS information, and hardware state measurement values. A relying party can compare these measurement values with the reference values provided by the GPU vendor to confirm that the GPU is in a secure confidential computing state.
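The comparison described in mechanism 5 can be pictured with a small shell sketch. It is not the Attestation SDK: the function name compare_measurements and the two-column file format (measurement index, hex digest) are assumptions made purely for illustration, and a real verifier also validates the report signature and certificate chain before it compares anything.

```shell
# Hypothetical helper: compare runtime measurements against vendor reference
# ("golden") values. Both files hold lines of the form "<index> <hex-digest>".
# This file format is an assumption for illustration, not the SDK format.
compare_measurements() {
    runtime_file=$1
    golden_file=$2
    awk '
        NF < 2 { next }                                     # skip blank or malformed lines
        FILENAME == ARGV[1] { golden[$1] = $2; n++; next }  # first file: reference values
        {
            if (($1 in golden) && golden[$1] == $2) { delete golden[$1]; ok++ }
            else { print "mismatch at index " $1; bad++ }
        }
        END { exit !(n > 0 && bad == 0 && ok == n) }        # succeed only on a full match
    ' "$golden_file" "$runtime_file"
}
```

The function succeeds only if every reference index is matched exactly once by a runtime value, so a missing, extra, or altered measurement is treated as a failure.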

Usage notes

Heterogeneous confidential computing is supported only on Alibaba Cloud Linux 3 images. If you create an instance from a custom image based on Alibaba Cloud Linux 3, make sure that the kernel version is 5.10.134-18 or later.
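For a custom image, the kernel requirement above can be checked with a short script. The following is a minimal sketch: the function name kernel_at_least is hypothetical, and it assumes that the kernel release string starts with four numeric fields (for example 5.10.134-18.al8.x86_64), which are the only fields it compares.

```shell
# Return success if kernel release $1 is at least the minimum release $2.
# Only the first four numeric fields (for example 5.10.134-18) are compared.
kernel_at_least() {
    awk -v have="$1" -v want="$2" 'BEGIN {
        split(have, h, /[.-]/)
        split(want, w, /[.-]/)
        for (i = 1; i <= 4; i++) {
            if (h[i] + 0 > w[i] + 0) exit 0
            if (h[i] + 0 < w[i] + 0) exit 1
        }
        exit 0
    }'
}

# Usage on the instance:
#   kernel_at_least "$(uname -r)" "5.10.134-18" && echo "kernel OK"
```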

Create a heterogeneous confidential computing instance (gn8v-tee)

ECS console

The steps to create an instance with heterogeneous confidential computing features in the console are similar to those for creating a regular instance, but you must select specific options. This section highlights the configurations that are specific to heterogeneous confidential computing instances. For information about other general configurations, see Create an instance using the wizard.

  1. Go to ECS console - Instances.

  2. In the top navigation bar, select the region and resource group of the resource that you want to manage.

  3. Click Create Instance and configure the instance with the following settings.

    | Configuration item | Description |
    | --- | --- |
    | Region and zone | China (Beijing) Zone L |
    | Instance Type | Only ecs.gn8v-tee.4xlarge and larger instance types are supported. |
    | Image | Select the Alibaba Cloud Linux 3.2104 LTS 64-bit image. |
    | Public IP address | Select Assign Public IPv4 Address. This ensures that you can download the driver from the official NVIDIA website later. |

    Important

    When you create an 8-GPU confidential instance, do not add extra secondary elastic network interfaces (ENIs). Doing so can prevent the instance from starting.

    Cause and solution

    Elastic Compute Service (ECS) instances with TDX enabled use a dedicated unencrypted memory region (SWIOTLB) for communication with peripherals. This memory region is limited in size. By default, it is 6% of the instance's available memory, up to a maximum of 1 GiB.

    When you create an 8-GPU confidential instance, attaching multiple ENIs can exhaust the SWIOTLB memory. This causes memory allocation failures and prevents the instance from starting.

    If the instance fails to start, use one of the following solutions:

    • Solution 1: Stop the instance and detach the extra secondary ENIs.

    • Solution 2: Re-create the instance with only the primary network interface card.

    To add multiple ENIs to an 8-GPU confidential instance, first complete Step 1 to adjust the SWIOTLB buffer to 8 GB, and then associate an ENI with an ECS instance.

  4. Follow the on-screen instructions to finish creating the instance.
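The default SWIOTLB sizing rule stated in the Important note (6% of memory, capped at 1 GiB) can be worked through with a few lines of shell arithmetic. This is a rough sketch only: the function name default_swiotlb_mib is hypothetical, and it approximates "available memory" with the nominal memory size of the instance, which is an assumption.

```shell
# Estimate the default SWIOTLB size in MiB for a TDX-enabled instance:
# 6% of memory, capped at 1 GiB (1024 MiB), as stated in the note above.
# $1 = instance memory in GiB (used here as a stand-in for available memory).
# Integer arithmetic rounds down.
default_swiotlb_mib() {
    mem_mib=$(( $1 * 1024 ))
    size=$(( mem_mib * 6 / 100 ))
    if [ "$size" -gt 1024 ]; then
        size=1024
    fi
    echo "$size"
}

default_swiotlb_mib 8     # prints 491
default_swiotlb_mib 64    # prints 1024
```

Under this rule the cap is already reached at roughly 17 GiB of memory, so large multi-GPU instances do not get a larger default buffer, which is consistent with the exhaustion described above.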

OpenAPI or CLI

You can call the RunInstances operation or use Alibaba Cloud CLI to create an ECS instance that supports TDX security attributes. The following table describes the required parameters.

| Parameter | Description | Example |
| --- | --- | --- |
| RegionId | China (Beijing) | cn-beijing |
| ZoneId | Zone L | cn-beijing-l |
| InstanceType | Select ecs.gn8v-tee.4xlarge or a larger instance type. | ecs.gn8v-tee.4xlarge |
| ImageId | Specify the ID of an image that supports confidential computing. Only Alibaba Cloud Linux 3.2104 LTS 64-bit images with kernel version 5.10.134-18.al8.x86_64 or later are supported. | aliyun_3_x64_20G_alibase_20250117.vhd |

CLI example:

In this command, <SECURITY_GROUP_ID> represents the security group ID, <VSWITCH_ID> represents the vSwitch ID, and <KEY_PAIR_NAME> represents the name of the SSH key pair.
aliyun ecs RunInstances \
  --RegionId cn-beijing \
  --ZoneId cn-beijing-l \
  --SystemDisk.Category cloud_essd \
  --ImageId 'aliyun_3_x64_20G_alibase_20250117.vhd' \
  --InstanceType 'ecs.gn8v-tee.4xlarge' \
  --SecurityGroupId '<SECURITY_GROUP_ID>' \
  --VSwitchId '<VSWITCH_ID>' \
  --KeyPairName '<KEY_PAIR_NAME>'

Build a heterogeneous confidential computing environment

Step 1: Install the NVIDIA driver and CUDA Toolkit

Important

Heterogeneous confidential computing instances take a long time to initialize. Wait until the instance status is Running and the operating system has fully started before you proceed.

The installation steps vary by instance type:

  • Single-GPU confidential instances: ecs.gn8v-tee.4xlarge and ecs.gn8v-tee.6xlarge

  • 8-GPU confidential instances: ecs.gn8v-tee-8x.16xlarge and ecs.gn8v-tee-8x.48xlarge
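If you script the installation, the choice between the two procedures can be derived from the instance type name. The following is a minimal sketch: the function name gn8v_tee_profile is hypothetical, and the name patterns are an assumption inferred from the four instance types listed above. The minimum driver versions are the ones stated in the steps below.

```shell
# Map an instance type to its GPU layout and minimum driver version.
# The name patterns are assumptions inferred from the four types listed above.
gn8v_tee_profile() {
    case "$1" in
        ecs.gn8v-tee-8x.*) echo "8-gpu 570.148.08" ;;
        ecs.gn8v-tee.*)    echo "single-gpu 550.144.03" ;;
        *)                 echo "unknown"; return 1 ;;
    esac
}

gn8v_tee_profile ecs.gn8v-tee.4xlarge        # prints: single-gpu 550.144.03
gn8v_tee_profile ecs.gn8v-tee-8x.16xlarge    # prints: 8-gpu 570.148.08
```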

Single-GPU confidential instance

  1. Connect to the confidential computing instance remotely.

    For more information, see Log on to a Linux instance using Workbench.

  2. Adjust the kernel parameters to set the SWIOTLB buffer to 8 GB.

    sudo grubby --update-kernel=ALL --args="swiotlb=4194304,any"
  3. Restart the instance for the configuration to take effect.

    For more information, see Restart an instance.

  4. Download the NVIDIA driver and CUDA Toolkit.

    Single-GPU confidential instances require driver version 550.144.03 or later. This topic uses version 550.144.03 as an example.
    wget --referer=https://www.nvidia.cn/ https://cn.download.nvidia.cn/tesla/550.144.03/NVIDIA-Linux-x86_64-550.144.03.run
    wget https://developer.download.nvidia.com/compute/cuda/12.4.1/local_installers/cuda_12.4.1_550.54.15_linux.run
  5. Install dependencies and disable the CloudMonitor service.

    sudo yum install -y openssl3
    sudo systemctl disable cloudmonitor
    sudo systemctl stop cloudmonitor
  6. Create and configure nvidia-persistenced.service.

    cat > nvidia-persistenced.service << EOF
    [Unit]
    Description=NVIDIA Persistence Daemon
    Wants=syslog.target
    Before=cloudmonitor.service
    
    [Service]
    Type=forking
    ExecStart=/usr/bin/nvidia-persistenced --user root
    ExecStartPost=/usr/bin/nvidia-smi conf-compute -srs 1
    ExecStopPost=/bin/rm -rf /var/run/nvidia-persistenced
    
    [Install]
    WantedBy=multi-user.target
    EOF
    
    sudo cp nvidia-persistenced.service /usr/lib/systemd/system/nvidia-persistenced.service
  7. Install the NVIDIA driver and CUDA Toolkit.

    sudo bash NVIDIA-Linux-x86_64-550.144.03.run --ui=none --no-questions --accept-license --disable-nouveau --no-cc-version-check --install-libglvnd --kernel-module-build-directory=kernel-open --rebuild-initramfs
    sudo bash cuda_12.4.1_550.54.15_linux.run --silent --toolkit
  8. Start the nvidia-persistenced and CloudMonitor services.

    sudo systemctl start nvidia-persistenced.service
    sudo systemctl enable nvidia-persistenced.service
    sudo systemctl start cloudmonitor
    sudo systemctl enable cloudmonitor
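The value 4194304 used in step 2 corresponds to 8 GB because the Linux kernel interprets the first swiotlb= value as a number of I/O TLB slabs, and each slab is 2 KiB. The sketch below reproduces that arithmetic and shows one way to read the slab count back from a kernel command line after the restart. The function names are hypothetical, and the meaning of the ,any suffix is not covered here.

```shell
# Convert a swiotlb= slab count to bytes. The Linux kernel documents the first
# swiotlb= value as a number of I/O TLB slabs; each slab is 2 KiB (2048 bytes).
swiotlb_slabs_to_bytes() {
    echo $(( $1 * 2048 ))
}

# Extract the slab count from a kernel command line string, if present.
swiotlb_slabs_from_cmdline() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^swiotlb=\([0-9][0-9]*\).*/\1/p'
}

swiotlb_slabs_to_bytes 4194304    # prints 8589934592, that is 8 GiB

# Usage on the instance after the restart:
#   swiotlb_slabs_from_cmdline "$(cat /proc/cmdline)"
```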

8-GPU confidential instance

  1. Connect to the confidential computing instance remotely.

    For more information, see Log on to a Linux instance using Workbench.

    Important

    Confidential computing instances take a long time to initialize. Make sure that initialization is complete before you proceed.

  2. Adjust the kernel parameters to set the SWIOTLB buffer to 8 GB.

    sudo grubby --update-kernel=ALL --args="swiotlb=4194304,any"
  3. Configure the loading behavior of the NVIDIA driver and regenerate the initramfs.

    sudo bash -c 'cat > /etc/modprobe.d/nvidia-lkca.conf << EOF
    install nvidia /sbin/modprobe ecdsa_generic; /sbin/modprobe ecdh; /sbin/modprobe --ignore-install nvidia
    options nvidia NVreg_RegistryDwords="RmEnableProtectedPcie=0x1"
    EOF'
    
    sudo dracut --regenerate-all -f
  4. Restart the instance for the configuration to take effect.

    For more information, see Restart an instance.

  5. Download the NVIDIA driver and CUDA Toolkit.

    8-GPU confidential instances require driver version 570.148.08 or later and the matching Fabric Manager version. This topic uses version 570.148.08 as an example.
    wget --referer=https://www.nvidia.cn/ https://cn.download.nvidia.cn/tesla/570.148.08/NVIDIA-Linux-x86_64-570.148.08.run
    wget https://developer.download.nvidia.com/compute/cuda/12.8.1/local_installers/cuda_12.8.1_570.124.06_linux.run
    wget https://developer.download.nvidia.cn/compute/cuda/repos/rhel8/x86_64/nvidia-fabric-manager-570.148.08-1.x86_64.rpm
  6. Install dependencies and disable the CloudMonitor service.

    sudo yum install -y openssl3
    sudo systemctl disable cloudmonitor
    sudo systemctl stop cloudmonitor
  7. Create and configure nvidia-persistenced.service.

    cat > nvidia-persistenced.service << EOF
    [Unit]
    Description=NVIDIA Persistence Daemon
    Wants=syslog.target
    Before=cloudmonitor.service
    After=nvidia-fabricmanager.service
    
    [Service]
    Type=forking
    ExecStart=/usr/bin/nvidia-persistenced --user root --uvm-persistence-mode --verbose
    ExecStartPost=/usr/bin/nvidia-smi conf-compute -srs 1
    ExecStopPost=/bin/rm -rf /var/run/nvidia-persistenced
    TimeoutStartSec=900
    TimeoutStopSec=60
    
    [Install]
    WantedBy=multi-user.target
    EOF
    
    sudo cp nvidia-persistenced.service /usr/lib/systemd/system/nvidia-persistenced.service
  8. Install Fabric Manager, the NVIDIA driver, and CUDA Toolkit.

    sudo rpm -ivh nvidia-fabric-manager-570.148.08-1.x86_64.rpm
    sudo bash NVIDIA-Linux-x86_64-570.148.08.run --ui=none --no-questions --accept-license --disable-nouveau --no-cc-version-check --install-libglvnd --kernel-module-build-directory=kernel-open --rebuild-initramfs
    sudo bash cuda_12.8.1_570.124.06_linux.run --silent --toolkit
  9. Start the Fabric Manager, nvidia-persistenced, and CloudMonitor services.

    sudo systemctl start nvidia-fabricmanager.service
    sudo systemctl enable nvidia-fabricmanager.service
    sudo systemctl start nvidia-persistenced.service
    sudo systemctl enable nvidia-persistenced.service
    sudo systemctl start cloudmonitor
    sudo systemctl enable cloudmonitor
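Because the 8-GPU procedure requires a Fabric Manager version that matches the driver, it can be worth checking the two downloaded file names against each other before you install them. This is a minimal sketch: the function name fabric_manager_matches_driver is hypothetical, and it assumes the RPM follows the naming pattern of the file downloaded in step 5.

```shell
# Check that a Fabric Manager RPM file name carries the same version as the
# driver. Assumes names like nvidia-fabric-manager-<version>-<release>.<arch>.rpm.
fabric_manager_matches_driver() {
    driver_version=$1
    rpm_name=$2
    fm_version=$(printf '%s\n' "$rpm_name" |
        sed -n 's/^nvidia-fabric-manager-\([0-9.]*\)-.*\.rpm$/\1/p')
    [ -n "$fm_version" ] && [ "$fm_version" = "$driver_version" ]
}

# Example with the file names used in this topic:
fabric_manager_matches_driver 570.148.08 nvidia-fabric-manager-570.148.08-1.x86_64.rpm \
    && echo "Fabric Manager matches driver"
```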

Step 2: Check the TDX status

The heterogeneous confidential computing feature is built on TDX. Check the TDX status of the instance to verify that the instance is protected.

  1. Check whether TDX is enabled.

    lscpu |grep -i tdx_guest

    If the output contains tdx_guest, TDX is enabled.

  2. Check whether the TDX-related driver is installed.

    ls -l /dev/tdx_guest

    If the command lists the /dev/tdx_guest device, the TDX-related driver is installed.
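The two checks above can be combined into a small script for automated health checks. The following is a sketch under stated assumptions: the function names are hypothetical, and the device path is passed as a parameter only so that the logic can be exercised on a machine without TDX.

```shell
# Succeed if CPU flag text (for example the output of lscpu) lists tdx_guest.
has_tdx_guest_flag() {
    printf '%s\n' "$1" | grep -qi 'tdx_guest'
}

# Succeed if the TDX guest device node exists. The path defaults to
# /dev/tdx_guest and is a parameter only to make the function easy to test.
has_tdx_device() {
    [ -e "${1:-/dev/tdx_guest}" ]
}

# Usage on the instance:
#   has_tdx_guest_flag "$(lscpu)" && has_tdx_device && echo "TDX guest OK"
```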

Step 3: Check the status of the GPU-based confidential computing feature

Single-GPU confidential instance

View the status of the confidential computing feature.

nvidia-smi conf-compute -f

A return value of CC status: ON indicates that the confidential computing feature is enabled. A return value of CC status: OFF indicates that the feature is disabled and the instance is in an abnormal state. If the instance is in an abnormal state, submit a ticket.
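To use this check in a script, you can extract the status value from the command output. This is a minimal sketch: the function name cc_status is hypothetical, and it assumes the output contains a line in the form CC status: ON as described above.

```shell
# Print ON or OFF from text that contains a line like "CC status: ON".
# Returns a non-zero status if no such line is found.
cc_status() {
    status=$(printf '%s\n' "$1" |
        sed -n 's/^[[:space:]]*CC status:[[:space:]]*\([A-Za-z]*\).*/\1/p' |
        head -n 1)
    [ -n "$status" ] && echo "$status"
}

# Usage on the instance:
#   [ "$(cc_status "$(nvidia-smi conf-compute -f)")" = "ON" ] || echo "CC is not enabled"
```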

8-GPU confidential instance

View the status of the confidential computing attributes.

nvidia-smi conf-compute -mgm

A result of Multi-GPU Mode: Protected PCIe indicates that the multi-GPU confidential computing feature is enabled. A result of Multi-GPU Mode: None indicates that the multi-GPU confidential computing feature is disabled and the instance is in an abnormal state. If this occurs, submit a ticket.

Note

On an 8-GPU confidential instance, the nvidia-smi conf-compute -f command normally returns CC status: OFF.
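On an 8-GPU instance, a scripted check should therefore look at the multi-GPU mode rather than the CC status. The following sketch assumes the output contains a line in the form Multi-GPU Mode: Protected PCIe as described above. The function names are hypothetical.

```shell
# Print the value that follows "Multi-GPU Mode:" in the given text.
multi_gpu_mode() {
    printf '%s\n' "$1" |
        sed -n 's/^[[:space:]]*Multi-GPU Mode:[[:space:]]*//p' |
        sed 's/[[:space:]]*$//' |
        head -n 1
}

# Succeed only if the reported mode is "Protected PCIe".
protected_pcie_enabled() {
    [ "$(multi_gpu_mode "$1")" = "Protected PCIe" ]
}

# Usage on the instance:
#   protected_pcie_enabled "$(nvidia-smi conf-compute -mgm)" || echo "Protected PCIe is not enabled"
```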

Step 4: Verify the trustworthiness of the GPU/NVSwitch through local attestation

Single-GPU confidential instance

  1. Install the dependencies required to verify GPU trustworthiness.

    sudo yum install -y python3.11 python3.11-devel python3.11-pip
    sudo alternatives --install /usr/bin/python3 python3 /usr/bin/python3.11 60
    sudo alternatives --set python3 /usr/bin/python3.11
    sudo python3 -m ensurepip --upgrade
    sudo python3 -m pip install --upgrade pip
    
    sudo python3 -m pip install nv_attestation_sdk==2.5.0.post6914366 nv_local_gpu_verifier==2.5.0.post6914366 nv_ppcie_verifier==1.5.0.post6914366 -f https://attest-public-cn-beijing.oss-cn-beijing.aliyuncs.com/repo/pip/attest.html
  2. Verify the trust status of the GPU.

    python3 -m verifier.cc_admin --user_mode

    The output shows that the GPU is in confidential computing state and that the measurement values of the driver, VBIOS, and other components match the expected values.

    Complete sample output

    Generating nonce in the local GPU Verifier ..
    Number of GPUs available : 1
    Fetching GPU 0 information from GPU driver.
    All GPU Evidences fetched successfully
    Set OCSP_NONCE_DISABLED to True while using aliyun's OCSP service
    -----------------------------------
    Verifying GPU: GPU-e1e94012-8c7b-f9a2-d712-fc5b014f364c
            Driver version fetched : 550.144.03
            VBIOS version fetched : 96.00.cf.00.05
            Validating GPU certificate chains.
                    The firmware ID in the device certificate chain is matching with the one in the attestation report.
                    GPU attestation report certificate chain validation successful.
                            The certificate chain revocation status verification successful.
            Authenticating attestation report
                    The nonce in the SPDM GET MEASUREMENT request message is matching with the generated nonce.
                    Driver version fetched from the attestation report : 550.144.03
                    VBIOS version fetched from the attestation report : 96.00.cf.00.05
                    Attestation report signature verification successful.
                    Attestation report verification successful.
            Authenticating the RIMs.
                    Authenticating Driver RIM
                            Fetching the driver RIM from the RIM service.
                            RIM Schema validation passed.
                            driver RIM certificate chain verification successful.
                            The certificate chain revocation status verification successful.
                            driver RIM signature verification successful.
                            Driver RIM verification successful
                    Authenticating VBIOS RIM.
                            Fetching the VBIOS RIM from the RIM service.
                            RIM Schema validation passed.
                            vbios RIM certificate chain verification successful.
                            The certificate chain revocation status verification successful.
                            vbios RIM signature verification successful.
                            VBIOS RIM verification successful
            Comparing measurements (runtime vs golden)
                            The runtime measurements are matching with the golden measurements.                            
                    GPU is in expected state.
            GPU 0 with UUID GPU-e1e94012-8c7b-f9a2-d712-fc5b014f364c verified successfully.
    GPU Attestation is Successful.
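If you run the verifier from a script, you can check its output for the success lines shown in the sample above. This is a rough sketch: the function name attestation_succeeded is hypothetical, and matching on output text is an assumption made for illustration, because the exact wording may change between verifier versions.

```shell
# Succeed if the verifier output contains the final success line shown in the
# sample output and reports at least the expected number of verified GPUs.
# $1 = captured verifier output, $2 = expected GPU count (default 1).
attestation_succeeded() {
    output=$1
    expected_gpus=${2:-1}
    verified=$(printf '%s\n' "$output" | grep -c 'verified successfully' || true)
    printf '%s\n' "$output" | grep -q '^[[:space:]]*GPU Attestation is Successful\.' \
        && [ "$verified" -ge "$expected_gpus" ]
}

# Usage on the instance:
#   attestation_succeeded "$(python3 -m verifier.cc_admin --user_mode 2>&1)" 1 \
#       && echo "GPU attestation passed"
```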

8-GPU confidential instance

  1. Install the dependencies required to verify GPU trustworthiness.

    sudo yum install -y python3.11 python3.11-devel python3.11-pip
    sudo alternatives --install /usr/bin/python3 python3 /usr/bin/python3.11 60
    sudo alternatives --set python3 /usr/bin/python3.11
    sudo python3 -m ensurepip --upgrade
    sudo python3 -m pip install --upgrade pip
    
    sudo python3 -m pip install nv_attestation_sdk==2.5.0.post6914366 nv_local_gpu_verifier==2.5.0.post6914366 nv_ppcie_verifier==1.5.0.post6914366 -f https://attest-public-cn-beijing.oss-cn-beijing.aliyuncs.com/repo/pip/attest.html
  2. Install the NVSwitch-related dependencies.

    wget https://developer.download.nvidia.cn/compute/cuda/repos/rhel8/x86_64/libnvidia-nscq-570-570.148.08-1.x86_64.rpm
    sudo rpm -ivh libnvidia-nscq-570-570.148.08-1.x86_64.rpm
  3. Run the following command to verify the trust status of the GPUs and NVSwitches.

    python3 -m ppcie.verifier.verification --gpu-attestation-mode=LOCAL --switch-attestation-mode=LOCAL

    The sample code verifies 8 GPUs and 4 NVSwitches. A final output of SUCCESS indicates that the verification succeeded.

    Complete sample output

    **************************************************
    *    PPCIE: Starting PPCIE Verification Tool    *
    **************************************************
    **************************************************
    *          PPCIE: Number of GPUs are: 8          *
    **************************************************
    **************************************************
    *       PPCIE: Number of NVSwitches are: 4       *
    **************************************************
    Nonce generated: 006a638b032ae5eed158d6584dd13429de5743ce36498e60b7256703ce6a68ae
    Number of GPUs available : 8
    Fetching GPU 0 information from GPU driver.
    Fetching GPU 1 information from GPU driver.
    Fetching GPU 2 information from GPU driver.
    Fetching GPU 3 information from GPU driver.
    Fetching GPU 4 information from GPU driver.
    Fetching GPU 5 information from GPU driver.
    Fetching GPU 6 information from GPU driver.
    Fetching GPU 7 information from GPU driver.
    All GPU Evidences fetched successfully
    **************************************************
    *           PPCIE: Attesting the GPUs           *
    **************************************************
    Set OCSP_NONCE_DISABLED to True while using aliyun's OCSP service
    -----------------------------------
    Verifying GPU: GPU-db98b8e0-51c7-d188-99ec-6755455abcd9
            Driver version fetched : 570.148.08
            VBIOS version fetched : 96.00.cf.00.05
            Validating GPU certificate chains.
                    The firmware ID in the device certificate chain is matching with the one in the attestation report.
                    GPU attestation report certificate chain validation successful.
                            The certificate chain revocation status verification successful.
            Authenticating attestation report
                    The nonce in the SPDM GET MEASUREMENT request message is matching with the generated nonce.
                    Driver version fetched from the attestation report : 570.148.08
                    VBIOS version fetched from the attestation report : 96.00.cf.00.05
                    Attestation report signature verification successful.
                    Attestation report verification successful.
            Authenticating the RIMs.
                    Authenticating Driver RIM
                            Fetching the driver RIM from the RIM service.
                            RIM Schema validation passed.
                            driver RIM certificate chain verification successful.
                            The certificate chain revocation status verification successful.
                            driver RIM signature verification successful.
                            Driver RIM verification successful
                    Authenticating VBIOS RIM.
                            Fetching the VBIOS RIM from the RIM service.
                            RIM Schema validation passed.
                            vbios RIM certificate chain verification successful.
                            The certificate chain revocation status verification successful.
                            vbios RIM signature verification successful.
                            VBIOS RIM verification successful
            Comparing measurements (runtime vs golden)
                            The runtime measurements are matching with the golden measurements.                            
                    GPU is in expected state.
            GPU 0 with UUID GPU-db98b8e0-51c7-d188-99ec-6755455abcd9 verified successfully.
    -----------------------------------
    Verifying GPU: GPU-d5372674-da51-fe3c-29f7-034c6aad55bd
            Driver version fetched : 570.148.08
            VBIOS version fetched : 96.00.cf.00.05
            Validating GPU certificate chains.
                    The firmware ID in the device certificate chain is matching with the one in the attestation report.
                    GPU attestation report certificate chain validation successful.
                            The certificate chain revocation status verification successful.
            Authenticating attestation report
                    The nonce in the SPDM GET MEASUREMENT request message is matching with the generated nonce.
                    Driver version fetched from the attestation report : 570.148.08
                    VBIOS version fetched from the attestation report : 96.00.cf.00.05
                    Attestation report signature verification successful.
                    Attestation report verification successful.
            Authenticating the RIMs.
                    Authenticating Driver RIM
                            Fetching the driver RIM from the RIM service.
                            RIM Schema validation passed.
                            driver RIM certificate chain verification successful.
                            The certificate chain revocation status verification successful.
                            driver RIM signature verification successful.
                            Driver RIM verification successful
                    Authenticating VBIOS RIM.
                            Fetching the VBIOS RIM from the RIM service.
                            RIM Schema validation passed.
                            vbios RIM certificate chain verification successful.
                            The certificate chain revocation status verification successful.
                            vbios RIM signature verification successful.
                            VBIOS RIM verification successful
            Comparing measurements (runtime vs golden)
                            The runtime measurements are matching with the golden measurements.                            
                    GPU is in expected state.
            GPU 1 with UUID GPU-d5372674-da51-fe3c-29f7-034c6aad55bd verified successfully.
    -----------------------------------
    Verifying GPU: GPU-3865b295-1fd1-a21a-a4d3-07bc47ff31ca
            Driver version fetched : 570.148.08
            VBIOS version fetched : 96.00.cf.00.05
            Validating GPU certificate chains.
                    The firmware ID in the device certificate chain is matching with the one in the attestation report.
                    GPU attestation report certificate chain validation successful.
                            The certificate chain revocation status verification successful.
            Authenticating attestation report
                    The nonce in the SPDM GET MEASUREMENT request message is matching with the generated nonce.
                    Driver version fetched from the attestation report : 570.148.08
                    VBIOS version fetched from the attestation report : 96.00.cf.00.05
                    Attestation report signature verification successful.
                    Attestation report verification successful.
            Authenticating the RIMs.
                    Authenticating Driver RIM
                            Fetching the driver RIM from the RIM service.
                            RIM Schema validation passed.
                            driver RIM certificate chain verification successful.
                            The certificate chain revocation status verification successful.
                            driver RIM signature verification successful.
                            Driver RIM verification successful
                    Authenticating VBIOS RIM.
                            Fetching the VBIOS RIM from the RIM service.
                            RIM Schema validation passed.
                            vbios RIM certificate chain verification successful.
                            The certificate chain revocation status verification successful.
                            vbios RIM signature verification successful.
                            VBIOS RIM verification successful
            Comparing measurements (runtime vs golden)
                            The runtime measurements are matching with the golden measurements.                            
                    GPU is in expected state.
            GPU 2 with UUID GPU-3865b295-1fd1-a21a-a4d3-07bc47ff31ca verified successfully.
    -----------------------------------
    Verifying GPU: GPU-98377e04-ff60-ecac-beb5-28b3e8005c64
            Driver version fetched : 570.148.08
            VBIOS version fetched : 96.00.cf.00.05
            Validating GPU certificate chains.
                    The firmware ID in the device certificate chain is matching with the one in the attestation report.
                    GPU attestation report certificate chain validation successful.
                            The certificate chain revocation status verification successful.
            Authenticating attestation report
                    The nonce in the SPDM GET MEASUREMENT request message is matching with the generated nonce.
                    Driver version fetched from the attestation report : 570.148.08
                    VBIOS version fetched from the attestation report : 96.00.cf.00.05
                    Attestation report signature verification successful.
                    Attestation report verification successful.
            Authenticating the RIMs.
                    Authenticating Driver RIM
                            Fetching the driver RIM from the RIM service.
                            RIM Schema validation passed.
                            driver RIM certificate chain verification successful.
                            The certificate chain revocation status verification successful.
                            driver RIM signature verification successful.
                            Driver RIM verification successful
                    Authenticating VBIOS RIM.
                            Fetching the VBIOS RIM from the RIM service.
                            RIM Schema validation passed.
                            vbios RIM certificate chain verification successful.
                            The certificate chain revocation status verification successful.
                            vbios RIM signature verification successful.
                            VBIOS RIM verification successful
            Comparing measurements (runtime vs golden)
                            The runtime measurements are matching with the golden measurements.                            
                    GPU is in expected state.
            GPU 3 with UUID GPU-98377e04-ff60-ecac-beb5-28b3e8005c64 verified successfully.
    -----------------------------------
    Verifying GPU: GPU-ecab3a2a-0cb3-eebd-6a46-b941338b9e5f
            Driver version fetched : 570.148.08
            VBIOS version fetched : 96.00.cf.00.05
            Validating GPU certificate chains.
                    The firmware ID in the device certificate chain is matching with the one in the attestation report.
                    GPU attestation report certificate chain validation successful.
                            The certificate chain revocation status verification successful.
            Authenticating attestation report
                    The nonce in the SPDM GET MEASUREMENT request message is matching with the generated nonce.
                    Driver version fetched from the attestation report : 570.148.08
                    VBIOS version fetched from the attestation report : 96.00.cf.00.05
                    Attestation report signature verification successful.
                    Attestation report verification successful.
            Authenticating the RIMs.
                    Authenticating Driver RIM
                            Fetching the driver RIM from the RIM service.
                            RIM Schema validation passed.
                            driver RIM certificate chain verification successful.
                            The certificate chain revocation status verification successful.
                            driver RIM signature verification successful.
                            Driver RIM verification successful
                    Authenticating VBIOS RIM.
                            Fetching the VBIOS RIM from the RIM service.
                            RIM Schema validation passed.
                            vbios RIM certificate chain verification successful.
                            The certificate chain revocation status verification successful.
                            vbios RIM signature verification successful.
                            VBIOS RIM verification successful
            Comparing measurements (runtime vs golden)
                            The runtime measurements are matching with the golden measurements.                            
                    GPU is in expected state.
            GPU 4 with UUID GPU-ecab3a2a-0cb3-eebd-6a46-b941338b9e5f verified successfully.
    -----------------------------------
    Verifying GPU: GPU-91acee11-dd57-920c-c2f9-0b67fc4540d6
            Driver version fetched : 570.148.08
            VBIOS version fetched : 96.00.cf.00.05
            Validating GPU certificate chains.
                    The firmware ID in the device certificate chain is matching with the one in the attestation report.
                    GPU attestation report certificate chain validation successful.
                            The certificate chain revocation status verification successful.
            Authenticating attestation report
                    The nonce in the SPDM GET MEASUREMENT request message is matching with the generated nonce.
                    Driver version fetched from the attestation report : 570.148.08
                    VBIOS version fetched from the attestation report : 96.00.cf.00.05
                    Attestation report signature verification successful.
                    Attestation report verification successful.
            Authenticating the RIMs.
                    Authenticating Driver RIM
                            Fetching the driver RIM from the RIM service.
                            RIM Schema validation passed.
                            driver RIM certificate chain verification successful.
                            The certificate chain revocation status verification successful.
                            driver RIM signature verification successful.
                            Driver RIM verification successful
                    Authenticating VBIOS RIM.
                            Fetching the VBIOS RIM from the RIM service.
                            RIM Schema validation passed.
                            vbios RIM certificate chain verification successful.
                            The certificate chain revocation status verification successful.
                            vbios RIM signature verification successful.
                            VBIOS RIM verification successful
            Comparing measurements (runtime vs golden)
                            The runtime measurements are matching with the golden measurements.                            
                    GPU is in expected state.
            GPU 5 with UUID GPU-91acee11-dd57-920c-c2f9-0b67fc4540d6 verified successfully.
    -----------------------------------
    Verifying GPU: GPU-84370594-42f5-cbbe-a71c-b6d22ba45b65
            Driver version fetched : 570.148.08
            VBIOS version fetched : 96.00.cf.00.05
            Validating GPU certificate chains.
                    The firmware ID in the device certificate chain is matching with the one in the attestation report.
                    GPU attestation report certificate chain validation successful.
                            The certificate chain revocation status verification successful.
            Authenticating attestation report
                    The nonce in the SPDM GET MEASUREMENT request message is matching with the generated nonce.
                    Driver version fetched from the attestation report : 570.148.08
                    VBIOS version fetched from the attestation report : 96.00.cf.00.05
                    Attestation report signature verification successful.
                    Attestation report verification successful.
            Authenticating the RIMs.
                    Authenticating Driver RIM
                            Fetching the driver RIM from the RIM service.
                            RIM Schema validation passed.
                            driver RIM certificate chain verification successful.
                            The certificate chain revocation status verification successful.
                            driver RIM signature verification successful.
                            Driver RIM verification successful
                    Authenticating VBIOS RIM.
                            Fetching the VBIOS RIM from the RIM service.
                            RIM Schema validation passed.
                            vbios RIM certificate chain verification successful.
                            The certificate chain revocation status verification successful.
                            vbios RIM signature verification successful.
                            VBIOS RIM verification successful
            Comparing measurements (runtime vs golden)
                            The runtime measurements are matching with the golden measurements.                            
                    GPU is in expected state.
            GPU 6 with UUID GPU-84370594-42f5-cbbe-a71c-b6d22ba45b65 verified successfully.
    -----------------------------------
    Verifying GPU: GPU-4d8767db-a4ed-4ec1-d863-6c7635366dd1
            Driver version fetched : 570.148.08
            VBIOS version fetched : 96.00.cf.00.05
            Validating GPU certificate chains.
                    The firmware ID in the device certificate chain is matching with the one in the attestation report.
                    GPU attestation report certificate chain validation successful.
                            The certificate chain revocation status verification successful.
            Authenticating attestation report
                    The nonce in the SPDM GET MEASUREMENT request message is matching with the generated nonce.
                    Driver version fetched from the attestation report : 570.148.08
                    VBIOS version fetched from the attestation report : 96.00.cf.00.05
                    Attestation report signature verification successful.
                    Attestation report verification successful.
            Authenticating the RIMs.
                    Authenticating Driver RIM
                            Fetching the driver RIM from the RIM service.
                            RIM Schema validation passed.
                            driver RIM certificate chain verification successful.
                            The certificate chain revocation status verification successful.
                            driver RIM signature verification successful.
                            Driver RIM verification successful
                    Authenticating VBIOS RIM.
                            Fetching the VBIOS RIM from the RIM service.
                            RIM Schema validation passed.
                            vbios RIM certificate chain verification successful.
                            The certificate chain revocation status verification successful.
                            vbios RIM signature verification successful.
                            VBIOS RIM verification successful
            Comparing measurements (runtime vs golden)
                            The runtime measurements are matching with the golden measurements.                            
                    GPU is in expected state.
            GPU 7 with UUID GPU-4d8767db-a4ed-4ec1-d863-6c7635366dd1 verified successfully.
    GPU Attestation is Successful.
    **************************************************
    *      PPCIE: GPU Attestation result: True      *
    **************************************************
    **************************************************
    *        PPCIE: GPU Attestation Completed        *
    **************************************************
    **************************************************
    *        PPCIE: GPU state is READY           *
    **************************************************
    +--------------------+---------+
    |       STAGE        |  STATUS |
    +--------------------+---------+
    |   GPU Pre-checks   | SUCCESS |
    | Switch Pre-checks  | SUCCESS |
    |  GPU Attestation   | SUCCESS |
    | Switch Attestation | SUCCESS |
    |  Topology checks   | SUCCESS |
    +--------------------+---------+
    **************************************************
    *     PPCIE: End of PPCIE Verification Tool     *
    **************************************************

Limitations

  • Because the heterogeneous confidential computing feature is built on TDX, the functional limitations of TDX confidential computing instances also apply to heterogeneous confidential computing instances. For more information, see Known limitations of TDX instances.

  • After the GPU-based confidential computing feature is enabled, data transferred between the CPU and GPU must be encrypted and decrypted. As a result, GPU-related tasks run slower than on non-confidential heterogeneous computing instances.

Usage notes

  1. Single-GPU instances use CUDA 12.4. The NVIDIA cuBLAS library has a known issue that can cause errors when you run CUDA tasks or large language model tasks. You must install a specific version of cuBLAS:

    pip3 install nvidia-cublas-cu12==12.4.5.8
  2. After the GPU-based confidential computing feature is enabled, initialization is slow, especially on 8-GPU confidential instances. After the guest OS starts, make sure that the nvidia-persistenced service has finished starting before you run nvidia-smi or any other command that uses the GPU. To check the status of the nvidia-persistenced service, run the following command:

    systemctl status nvidia-persistenced | grep "Active: "
    • activating (start) indicates that the service is still starting.

      Active: activating (start) since Wed 2025-02-19 10:07:54 CST; 2min 20s ago
    • active (running) indicates that the service is running.

      Active: active (running) since Wed 2025-02-19 10:10:28 CST; 22s ago
  3. Any auto-start service that uses the GPU (such as cloudmonitor.service, ollama.service, or nvidia-cdi-refresh.service from the nvidia-container-toolkit-base package) must start after nvidia-persistenced.service.

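    The following is a sample configuration of /usr/lib/systemd/system/nvidia-persistenced.service. The Before= line orders the listed services after nvidia-persistenced.service, and TimeoutStartSec=900 allows up to 900 seconds for startup: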

    [Unit]
    Description=NVIDIA Persistence Daemon
    Wants=syslog.target
    Before=cloudmonitor.service nvidia-cdi-refresh.service ollama.service
    After=nvidia-fabricmanager.service
    
    [Service]
    Type=forking
    ExecStart=/usr/bin/nvidia-persistenced --user root --uvm-persistence-mode --verbose
    ExecStartPost=/usr/bin/nvidia-smi conf-compute -srs 1
    ExecStopPost=/bin/rm -rf /var/run/nvidia-persistenced
    TimeoutStartSec=900
    TimeoutStopSec=60
    
    [Install]
    WantedBy=multi-user.target
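For scripts that call nvidia-smi right after boot, the manual status check in note 2 can be replaced by a polling loop. The following is a minimal sketch, assuming `systemctl is-active` is available in the guest OS. The function name `wait_for_unit` and the `SYSTEMCTL` override variable are hypothetical and exist only for illustration and testing. The default timeout of 900 seconds mirrors TimeoutStartSec in the sample unit file above.

```shell
#!/usr/bin/env bash
# wait_for_unit UNIT [TIMEOUT_SEC] [INTERVAL_SEC]
# Polls `systemctl is-active` until UNIT is active. Returns 0 when the unit
# is active, 1 if it enters the failed state or the timeout elapses.
# SYSTEMCTL is a hypothetical override hook so the loop can be run against a stub.
wait_for_unit() {
  local unit=$1 timeout=${2:-900} interval=${3:-5}
  local waited=0 state
  while :; do
    # is-active prints the state (active, activating, failed, ...) and exits
    # non-zero unless the unit is active, hence the `|| true`.
    state=$("${SYSTEMCTL:-systemctl}" is-active "$unit" 2>/dev/null || true)
    case $state in
      active) return 0 ;;
      failed) echo "$unit failed to start" >&2; return 1 ;;
    esac
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out after ${timeout}s waiting for $unit (state: $state)" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
}
```

A typical call would be `wait_for_unit nvidia-persistenced.service && nvidia-smi`. Polling is meant for ad hoc scripts. For services managed by systemd, the ordering shown in note 3 remains the more robust approach.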