Build a large language model inference environment with security measurement on heterogeneous confidential computing instances - Elastic Compute Service

Background

Alibaba Cloud heterogeneous confidential computing instances (gn8v-tee) extend CPU-based confidential computing instances by integrating a GPU into the Trusted Execution Environment (TEE). This protects data in transit between the CPU and GPU and data processed within the GPU. This topic describes a solution that uses these instances to integrate the security measurement and remote attestation features of Intel® Trust Domain Extensions (TDX) into a large language model (LLM) inference service. The resulting workflow combines security measurement, attestation, and privacy protection for the LLM service, so that models and user data are managed securely, their integrity can be verified, and unauthorized access is prevented throughout the service lifecycle.

This solution is based on two principles:

Confidentiality: Ensures models and user data are processed only within the instance's confidential boundary, preventing plaintext exposure.
Integrity: Ensures all components in the LLM inference environment, such as the inference service framework, model files, and interactive interface, are tamper-proof and verifiable through a strict third-party audit.

Security principles of the solution

This solution involves two security principles: trusted measurement and remote attestation.

Trusted measurement

Intel Trust Domain Extensions (TDX) enhance the security of virtual machines (VMs) by isolating them in hardware-protected Trust Domains (TDs). During startup, the TDX module records the state of the TD guest in two main register sets:
- Build Time Measurement Register (MRTD): Captures measurements of the guest VM's initial configuration and boot image.
- Runtime Measurement Registers (RTMRs): Record measurements of the initial state, kernel image, command-line options, and other runtime services and parameters.
These measurements ensure the integrity of the TD and the running applications throughout their lifecycle. In this solution, measurements of the model service and kernel parameters, including those related to the Ollama and DeepSeek models and the Open WebUI framework, are recorded in the RTMRs.
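
The extend operation behind the RTMRs can be illustrated with a short sketch. It assumes the commonly documented behavior in which a register is updated as SHA-384(old value || new digest), so the final value depends on every measurement and on their order. The helper names (`hex_to_bin`, `rtmr_extend`, `measure`) and the sample inputs are hypothetical; this is not the code used by the TDX module or by CCZoo.

```shell
# Illustrative sketch of a measurement-register "extend" operation.
# Assumption: new_value = SHA-384(old_value || digest), with 48-byte values.

# hex_to_bin HEX -> writes the raw bytes of an even-length hex string to stdout.
hex_to_bin() {
  local hex=$1 byte
  [ $(( ${#hex} % 2 )) -eq 0 ] || return 1
  while [ -n "$hex" ]; do
    byte=${hex%"${hex#??}"}                 # first two hex digits
    hex=${hex#??}                           # remainder
    printf "\\$(printf '%03o' "0x$byte")"   # emit the byte via an octal escape
  done
}

# rtmr_extend OLD_HEX DIGEST_HEX -> prints the new register value as hex.
rtmr_extend() {
  { hex_to_bin "$1"; hex_to_bin "$2"; } | sha384sum | cut -d' ' -f1
}

# measure TEXT -> SHA-384 of TEXT (a stand-in for hashing a real component).
measure() {
  printf '%s' "$1" | sha384sum | cut -d' ' -f1
}

zero=$(printf '%096d' 0)          # a register starts as 48 zero bytes
m1=$(measure "ollama-binary")     # hypothetical measured component
m2=$(measure "model-file")        # hypothetical measured component

r_ab=$(rtmr_extend "$(rtmr_extend "$zero" "$m1")" "$m2")
r_ba=$(rtmr_extend "$(rtmr_extend "$zero" "$m2")" "$m1")
echo "m1 then m2: $r_ab"
echo "m2 then m1: $r_ba"
```

Because each new value hashes the previous one, changing or reordering any measured component yields a different final value. This is what allows a verifier to compare reported RTMRs against reference values.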
Remote attestation

In TDX, remote attestation provides a remote party with cryptographic proof of a TD Confidential VM's integrity and authenticity. The process includes these key steps:
- Obtain a TD Quote
  1. The client requests full remote attestation from Open WebUI.
  2. The Open WebUI backend obtains a hardware-signed remote attestation report. This signed report is called a Quote.
- Verify a TD Quote: The client sends the Quote to a trusted attestation service for verification against a predefined policy. This establishes trust with the model service before the client processes sensitive information.
Note
For more information about Alibaba Cloud Remote Attestation Service, see Remote Attestation Service.

By integrating trusted measurement and attestation, this solution creates a strong security framework for the LLM inference service. It verifies the remote model service's integrity and authenticity, which is crucial for protecting data and privacy.

Architecture

The architecture includes the following components:

Client

The user interface (UI) for accessing the large language model service. The client initiates sessions, verifies the remote model service's trustworthiness, and communicates securely with the backend.
Remote Attestation Service

This service, based on Alibaba Cloud Remote Attestation Service, verifies the security state of the model inference environment. This includes the platform's Trusted Computing Base (TCB) and the inference model service itself.
Inference service components
- Ollama: A model serving framework that handles model inference requests. This solution uses version v0.5.7.
- DeepSeek model: This solution uses a distilled DeepSeek-R1-70B (int4 quantized) model.
- Open WebUI: A web-based interactive interface that runs inside the Confidential VM and receives user model service requests through a RESTful API. This solution uses version v0.5.20.
- CCZoo open source project: This solution uses the Confidential AI source code from CCZoo, version v1.2. For more information about this open source project, see CCZoo.

Note

Confidential Computing Zoo (CCZoo) is a collection of security solutions for cloud computing scenarios, designed to help developers easily build their own end-to-end confidential computing solutions. Its security technologies include TEEs (such as Intel® SGX and TDX), Homomorphic Encryption (HE) with hardware acceleration, remote attestation, LibOS, and hardware-accelerated cryptography. The business scenarios involved include, but are not limited to, cloud-native AI inference, federated learning, big data analytics, key management, and Remote Procedure Calls (RPCs) such as gRPC.

Workflow

This solution follows this workflow:

Service startup and measurement

The platform's TCB module measures the integrity of the model service's runtime environment. The measurement result is stored in the TDX Module within the TCB.
Inference session initialization

The client (a browser) sends a new session request to Open WebUI.
Remote attestation
1. Attestation request: When starting a session, the client requests a TDX Quote from the backend to prove the trustworthiness of the model's runtime environment. This proof verifies the remote service environment, including the user session management service (Open WebUI) and the model service (Ollama + DeepSeek).
2. Quote generation: The Open WebUI service backend forwards the attestation request to the Intel TDX-based Confidential VM. The VM then uses the CPU hardware to generate a TDX Quote, which includes a complete certificate chain.
3. Quote verification: The client submits the received Quote to the Remote Attestation Service for verification. The service validates the Quote, including the digital signature, certificate chain, and security policy, and then returns a result that confirms the security status and integrity of the remote model service environment.
Confidential LLM inference
1. Remote attestation succeeds: The client can trust the remote model service. This assurance means the risk of data leakage is extremely low, though no system is completely risk-free.
2. Remote attestation fails: The attestation service returns an error message, indicating that the remote attestation has failed. The user or system can then choose to abort the request or continue after receiving a security risk warning. If the service continues, it may face data security risks.

Procedure

Step 1: Create a heterogeneous confidential computing instance

Important

Model data downloaded using Ollama is saved to the /usr/share/ollama/.ollama/models directory. Model files are typically large. For example, the DeepSeek-R1 70b quantized model is about 40 GB. When creating the instance, select a Cloud Disk with enough capacity for your models. We recommend a capacity two to three times the size of the model file.
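
As a rough aid for the sizing recommendation above, the following sketch computes the suggested disk range from a model size and reports the free space on a path. The function names are hypothetical, and the two-to-three-times factor is simply the recommendation from this topic, not a hard requirement.

```shell
# recommended_disk_gb MODEL_GB -> prints "<min> <max>" in GB (2x to 3x the model size).
recommended_disk_gb() {
  local model_gb=$1
  echo "$((model_gb * 2)) $((model_gb * 3))"
}

# free_gb PATH -> prints the free space of the file system holding PATH, in whole GB.
free_gb() {
  df -Pk "$1" | awk 'NR==2 {print int($4 / 1024 / 1024)}'
}

sizes=$(recommended_disk_gb 40)
echo "For a 40 GB model, plan for ${sizes% *} to ${sizes#* } GB of disk."
echo "Free space on /: $(free_gb /) GB"
```

On the instance, pass the directory that will hold the models (for example, the Ollama model directory once it exists) to `free_gb` instead of `/`.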

Console

Creating a heterogeneous confidential computing instance is similar to creating a standard instance but requires specific configurations. For general configurations, see Create an instance using the wizard.

Go to ECS console - Instances.
In the upper-left corner of the page, select a region and resource group.

Click Create Instance and configure the instance with the following settings.

Configuration Item	Description
Region and Zone	China (Beijing) Zone L
Instance Type	ecs.gn8v-tee.4xlarge or higher.
Image	Select the Alibaba Cloud Linux 3.2104 LTS 64-bit image.
Public IP Address	Assign Public IPv4 Address. Required to download the NVIDIA driver later.

Important

When creating or restarting a confidential instance with 8 GPUs, do not attach additional secondary ENIs or data disks. This can cause a startup failure.

Cause and solution

ECS instances with TDX enabled use a non-encrypted memory region, the Software Input Output Translation Lookaside Buffer (SWIOTLB), for peripheral communication. By default, SWIOTLB size is 6% of available memory, with a maximum of 1 GiB.

Attaching multiple ENIs or data disks to a confidential instance with 8 GPUs can exhaust the SWIOTLB memory, causing a memory allocation failure that prevents startup.

If the instance fails to start:

Solution 1: Stop the instance, unbind additional secondary ENIs, and detach all data disks.
Solution 2: Create a new instance with only one primary ENI, no data disks, and only a system disk.

To add multiple ENIs or data disks to a confidential instance with 8 GPUs, first complete Step 1 to increase the SWIOTLB buffer to 8 GB. Then attach ENIs to the instance and attach data disks.
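
The default sizing rule described above can be worked through with a small calculation. This is a sketch of the stated rule only (6% of memory, capped at 1 GiB); the function name is hypothetical, and it uses total memory as an approximation of available memory.

```shell
# default_swiotlb_mib MEM_MIB -> default SWIOTLB size in MiB:
# 6% of memory, capped at 1 GiB (1024 MiB), following the rule stated above.
default_swiotlb_mib() {
  local mem_mib=$1 size
  size=$((mem_mib * 6 / 100))
  if [ "$size" -gt 1024 ]; then
    size=1024
  fi
  echo "$size"
}

for mem_gib in 8 16 64 512; do
  echo "${mem_gib} GiB memory -> $(default_swiotlb_mib $((mem_gib * 1024))) MiB SWIOTLB"
done
```

The cap is reached at roughly 17 GiB of memory, so on large instances the default buffer stays at 1 GiB regardless of how many GPUs, ENIs, or disks share it.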

Complete the instance creation by following the on-screen instructions.

API/CLI

Call the RunInstances operation or use the Alibaba Cloud CLI to create a TDX-enabled ECS instance. Key parameters:

Parameter	Description	Example
RegionId	China (Beijing)	cn-beijing
ZoneId	Zone L	cn-beijing-l
InstanceType	ecs.gn8v-tee.4xlarge or higher.	ecs.gn8v-tee.4xlarge
ImageId	ID of an image that supports confidential computing. Only 64-bit Alibaba Cloud Linux 3.2104 LTS images with kernel version 5.10.134-18.al8.x86_64 or later are supported.	aliyun_3_x64_20G_alibase_20250117.vhd

CLI example:

<SECURITY_GROUP_ID>: security group ID. <VSWITCH_ID>: vSwitch ID. <KEY_PAIR_NAME>: SSH key pair name.

aliyun ecs RunInstances \
  --RegionId cn-beijing \
  --ZoneId cn-beijing-l \
  --SystemDisk.Category cloud_essd \
  --ImageId 'aliyun_3_x64_20G_alibase_20250117.vhd' \
  --InstanceType 'ecs.gn8v-tee.4xlarge' \
  --SecurityGroupId '<SECURITY_GROUP_ID>' \
  --VSwitchId '<VSWITCH_ID>' \
  --KeyPairName '<KEY_PAIR_NAME>'

Step 2: Build the TDX remote attestation environment

A TDX Report is a CPU-generated data structure that represents the identity of a TDX instance. It contains key information, such as ATTRIBUTES, Runtime Measurement Registers (RTMRs), and Trusted Computing Base Security Version Numbers (TCB SVNs), and is cryptographically protected against tampering. For more information, see Intel TDX Module.

Add the Alibaba Cloud confidential computing yum repository.
- Public endpoint format: https://enclave-[Region-ID].oss-[Region-ID].aliyuncs.com/repo/alinux/enclave-expr.repo.
- VPC endpoint format: https://enclave-[Region-ID].oss-[Region-ID]-internal.aliyuncs.com/repo/alinux/enclave-expr.repo.
  
  Replace [Region-ID] with the region ID of the TDX instance. The following example uses instance metadata to dynamically obtain the region ID:
```
token=$(curl -s -X PUT -H "X-aliyun-ecs-metadata-token-ttl-seconds: 5" "http://100.100.100.200/latest/api/token")
region_id=$(curl -s -H "X-aliyun-ecs-metadata-token: $token" http://100.100.100.200/latest/meta-data/region-id)

sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://enclave-${region_id}.oss-${region_id}-internal.aliyuncs.com/repo/alinux/enclave-expr.repo
```
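
The two endpoint formats differ only in the `-internal` suffix. The following sketch builds either URL from a region ID, which can help when you script the setup for several regions. The function name and its `public`/`vpc` argument are hypothetical.

```shell
# enclave_repo_url REGION_ID [public|vpc] -> prints the repository URL.
# Defaults to the VPC (internal) endpoint, as in the example above.
enclave_repo_url() {
  local region_id=$1 network=${2:-vpc} suffix=""
  if [ "$network" = "vpc" ]; then
    suffix="-internal"
  fi
  echo "https://enclave-${region_id}.oss-${region_id}${suffix}.aliyuncs.com/repo/alinux/enclave-expr.repo"
}

enclave_repo_url cn-beijing public
enclave_repo_url cn-beijing vpc
```

The VPC endpoint is reachable only from inside Alibaba Cloud, so use the public form when you test the URL from another network.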

Install build tools and sample code.

sudo yum groupinstall -y "Development Tools"
sudo yum install -y sgxsdk libtdx-attest-devel
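
Before you continue, it can be useful to confirm that the guest exposes a TDX device node, because Quote generation depends on it. The following sketch looks for a few device names. The names are an assumption (they vary by kernel version and are not taken from this topic), and the function name is hypothetical.

```shell
# find_tdx_device [DEV_DIR] -> prints the first TDX guest device node found.
# Assumption: the guest driver exposes one of these names; they vary by kernel.
find_tdx_device() {
  local dev_dir=${1:-/dev} name
  for name in tdx_guest tdx-guest tdx-attest; do
    if [ -e "$dev_dir/$name" ]; then
      echo "$dev_dir/$name"
      return 0
    fi
  done
  return 1
}

if dev=$(find_tdx_device); then
  echo "TDX guest device found: $dev"
else
  echo "No TDX guest device found; Quote generation is unlikely to work." >&2
fi
```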

Configure the Alibaba Cloud TDX remote attestation service.

Set PCCS_URL in /etc/sgx_default_qcnl.conf. The following example uses instance metadata to dynamically obtain the region ID and configure the DCAP service:

token=$(curl -s -X PUT -H "X-aliyun-ecs-metadata-token-ttl-seconds: 5" "http://100.100.100.200/latest/api/token")
region_id=$(curl -s -H "X-aliyun-ecs-metadata-token: $token" http://100.100.100.200/latest/meta-data/region-id)

# Use double quotes around the sed script so that the shell expands ${region_id}.
sudo sed -i.$(date "+%m%d%y") "s|PCCS_URL=.*|PCCS_URL=https://sgx-dcap-server.${region_id}.aliyuncs.com/sgx/certification/v4/|" /etc/sgx_default_qcnl.conf
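
The substitution can be tried on a scratch file before you touch /etc/sgx_default_qcnl.conf. The following sketch wraps it in two hypothetical helper functions and places the sed script in double quotes, so that the shell expands the region ID before sed runs. The initial `localhost` value in the scratch file is only a placeholder.

```shell
# pccs_url REGION_ID -> prints the regional PCCS endpoint used in this topic.
pccs_url() {
  echo "https://sgx-dcap-server.${1}.aliyuncs.com/sgx/certification/v4/"
}

# set_pccs_url CONF_FILE REGION_ID -> rewrites the PCCS_URL line in CONF_FILE.
set_pccs_url() {
  local conf=$1 url tmp
  url=$(pccs_url "$2")
  tmp=$(mktemp)
  # Double quotes let the shell substitute ${url} into the sed script.
  sed "s|PCCS_URL=.*|PCCS_URL=${url}|" "$conf" > "$tmp" && cat "$tmp" > "$conf"
  rm -f "$tmp"
}

# Demonstration on a scratch file rather than the real configuration.
demo_conf=$(mktemp)
echo 'PCCS_URL=https://localhost:8081/sgx/certification/v4/' > "$demo_conf"
set_pccs_url "$demo_conf" cn-beijing
grep '^PCCS_URL=' "$demo_conf"
```

After you change the real file, the same `grep` command is a quick way to confirm that the line contains an actual region ID rather than a literal `${region_id}`.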

Step 3: Install Ollama

Connect to a Linux instance by using Workbench.
Run the following command to install Ollama.
```
curl -fsSL https://ollama.com/install.sh | sh
```
Note
The preceding script is the official installation script provided by Ollama. If the installation fails due to network issues, you can refer to the Ollama official website and choose another installation method. For more information, see the Ollama installation guide.

Step 4: Download and run DeepSeek-R1 using Ollama

The model file is large, and the download can be time-consuming. We recommend that you use the tmux tool to keep the session active and prevent the download from being unexpectedly interrupted.

Install the tmux tool.

Run the following command to install the tmux tool.
```
sudo yum install -y tmux
```
Download and run DeepSeek-R1 using Ollama.

Run the following commands to create a tmux session and download and run the DeepSeek-R1 model in the tmux session.
```
# Create a tmux session named run-deepseek.
tmux new -s "run-deepseek"
# In the tmux session, download and run the deepseek-r1 model.
ollama run deepseek-r1:70b
```
The following example shows the command output, which indicates that the model is downloaded and started successfully. You can enter /bye to exit the interactive session.
```
......
verifying sha256 digest 
writing manifest 
success 
>>> 
>>> Send a message (/? for help)
```
(Optional) Reconnect to the tmux session.

If you need to restore the tmux session after a network disconnection, run the following command.
```
tmux attach -t run-deepseek
```

Step 5: Compile Open WebUI

To enable TDX security measurement in Open WebUI, you must download the TDX plugin and compile Open WebUI from source. This section describes the procedure.

Important

The following examples use /home/ecs-user as the working directory. Replace it with your actual working directory.

Install dependencies and the required environment

Install Node.js.

Run the following command to install Node.js.

sudo yum install -y nodejs

Note

If you encounter problems installing Node.js with the package manager, you can try using Node Version Manager (nvm) to install a specific version of Node.js.

# Download and install nvm.
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.0/install.sh | bash
# Load nvm environment variables.
source ~/.bashrc
# Install Node.js version 20.18.1.
nvm install 20.18.1
# Use this version.
nvm use 20.18.1
# Verify the version.
node --version

Install Miniforge3 and configure its environment variables.

Run the following commands to install Miniforge3 and configure its environment variables to manage the open-webui virtual environment.

# Get the Miniforge3 installation package.
wget https://github.com/conda-forge/miniforge/releases/download/24.11.3-2/Miniforge3-24.11.3-2-Linux-x86_64.sh
# Install miniforge3 non-interactively to the /home/ecs-user/miniforge3 directory.
bash Miniforge3-24.11.3-2-Linux-x86_64.sh -bu -p /home/ecs-user/miniforge3
# Set the environment variable for Miniforge3.
export PATH="/home/ecs-user/miniforge3/bin:$PATH"

Initialize Conda and verify its version.

Run the following commands to initialize Conda and verify its version.
```
# Initialize Conda.
conda init
source ~/.bashrc
# Verify the version.
conda --version
```

Manually compile Open WebUI.

Download the TDX security measurement plugin.

Run the following commands to download the TDX security measurement plugin and check out version v1.2.

cd /home/ecs-user
git clone https://github.com/intel/confidential-computing-zoo.git
git config --global --add safe.directory /home/ecs-user/confidential-computing-zoo
cd confidential-computing-zoo
git checkout v1.2

Pull the Open WebUI source code.

Run the following commands to pull the Open WebUI source code, check out the v0.5.20 tag, and apply the CCZoo patch.

cd /home/ecs-user
git clone https://github.com/open-webui/open-webui.git
# Mark the repository as a safe directory, then check out the v0.5.20 tag.
git config --global --add safe.directory /home/ecs-user/open-webui
cd /home/ecs-user/open-webui
git checkout v0.5.20
# Apply the patch provided by CCZoo, which adds support for TDX remote attestation to open-webui.
cd /home/ecs-user
cp /home/ecs-user/confidential-computing-zoo/cczoo/confidential_ai/open-webui-patch/v0.5.20-feature-cc-tdx-v1.0.patch .
git apply --ignore-whitespace --directory=open-webui/ v0.5.20-feature-cc-tdx-v1.0.patch

Create and activate the open-webui environment.

Run the following commands to create and activate the open-webui environment.
```
conda create --name open-webui python=3.11
conda activate open-webui
```
Install the "Get TDX Quote" plugin.
```
cd /home/ecs-user/confidential-computing-zoo/cczoo/confidential_ai/tdx_measurement_plugin/
pip install Cython
python setup.py install
```
After the commands finish, run the following command to verify the installation. If it runs without error, the installation was successful.
```
python3 -c "import quote_generator"
```
Compile Open WebUI.
```
# Install dependencies.
cd /home/ecs-user/open-webui/
# Configure the npm registry.
npm config set registry https://registry.npmmirror.com
sudo npm install
# Compile.
sudo npm run build
```
After the compilation is complete, run the following command to copy the generated build folder to the backend directory and rename it to frontend.
```
rm -rf ./backend/open_webui/frontend
cp -r build ./backend/open_webui/frontend
```
Note
At this point, the Alibaba Cloud Remote Attestation Service is successfully configured in the compiled Open WebUI. You can find the relevant configuration information in the /home/ecs-user/open-webui/external/acs-attest-client/index.js file.

Configure the startup file for the Open WebUI backend service.

Run the following commands to create a startup file for the Open WebUI backend and make it executable.

tee /home/ecs-user/open-webui/backend/dev.sh << 'EOF'
#!/usr/bin/env bash
# Set the service address and port. The default port is 8080.
PORT="${PORT:-8080}"
uvicorn open_webui.main:app --port "$PORT" --host 0.0.0.0 --forwarded-allow-ips '*' --reload
EOF
# Add executable permissions to the startup file.
chmod +x /home/ecs-user/open-webui/backend/dev.sh

Install the required dependency libraries for running Open WebUI.

cd /home/ecs-user/open-webui/backend/
pip install -r requirements.txt -U
conda deactivate

Step 6: Run Open WebUI and verify TDX attestation

Run the large model and start the Open WebUI service.

(Optional) If the Ollama service is not running, you can run the following command to start it.
```
ollama serve
```
Run the following command to run the DeepSeek-R1 model using Ollama.
```
ollama run deepseek-r1:70b
```
Run the following command to activate the open-webui virtual environment.
```
conda activate open-webui
```

Run the following command to start the Open WebUI backend service.

cd /home/ecs-user/open-webui/backend && ./dev.sh

The following example shows the command output, which indicates that the Open WebUI backend service has started successfully.

......
INFO  [open_webui.env] Embedding model set: sentence-transformers/all-MiniLM-L6-v2
/root/miniforge3/envs/open-webui/lib/python3.12/site-packages/pydub/utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
  warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)
WARNI [langchain_community.utils.user_agent] USER_AGENT environment variable not set, consider setting it to identify your requests.
 ██████╗ ██████╗ ███████╗███╗   ██╗    ██╗    ██╗███████╗██████╗ ██╗   ██╗██╗
██╔═══██╗██╔══██╗██╔════╝████╗  ██║    ██║    ██║██╔════╝██╔══██╗██║   ██║██║
██║   ██║██████╔╝█████╗  ██╔██╗ ██║    ██║ █╗ ██║█████╗  ██████╔╝██║   ██║██║
██║   ██║██╔═══╝ ██╔══╝  ██║╚██╗██║    ██║███╗██║██╔══╝  ██╔══██╗██║   ██║██║
╚██████╔╝██║     ███████╗██║ ╚████║    ╚███╔███╔╝███████╗██████╔╝╚██████╔╝██║
 ╚═════╝ ╚═╝     ╚══════╝╚═╝  ╚═══╝     ╚══╝╚══╝ ╚══════╝╚═════╝  ╚═════╝ ╚═╝
v0.5.20 - building the best open-source AI user interface.
https://github.com/open-webui/open-webui

Access the Open WebUI service from a browser.
1. Add a security group rule.
  
  Add a rule to your instance's security group to allow inbound traffic on port 8080. For more information, see Add a security group rule.
2. Access the Open WebUI service from a browser.
  
  In your local browser, go to http://{ip_address}:{port}.
  - {ip_address}: The public IP address of the instance where Open WebUI is located.
  - {port}: The default port number is 8080.
  A green icon indicates successful remote attestation; a red icon indicates failure.
  
  After successful access, Open WebUI displays the chat assistant interface for the selected model, such as deepseek-r1:70b, which includes a dialog input box and a list of recommended questions.
  
  Note
  Each time you click the New Chat button, the backend service automatically obtains the Quote data of the TDX confidential computing environment, sends it to the Remote Attestation Service, and returns the attestation result. Initially, this icon is red, which indicates that the remote attestation is not complete or has failed. It turns green after a successful remote attestation.

Verify the TDX attestation information.

Hover over the first icon in the dialog box to view detailed attestation information from the TDX Quote.

The attestation information includes hash values and attribute fields such as jti, tee (with a value of tdx), exp/iat timestamps, mr_td, rtmr_0 to rtmr_3, mr_seam, seam_attributes, td_attributes, and xfam.

You can view detailed information using the browser's developer tools. The following example shows the result.

user-list {user_ids: Array(1)}
usage ▶ {models: Array(0)}
user-list {user_ids: Array(1)}
mounted
[tiptap warn]: Duplicate extension names found: ['codeBlock']. This can lead to issues.
Attestation Display info:
{jti: 'c1e24a09-daf5-4064-a8a9-642c8c18c7fe', tee: 'tdx', exp: '2025-04-22 00:16:21 Asia/Shanghai', iat: '2025-04-21 18:16:21 Asia/Shanghai', mr_td: 'b0e52c59577523b17ad553c6fffb0f5f3496dbf3ccca69fbb2ea87cf4f938157550005c92a98130d8d30507ca5c652df', …}
  exp: "2025-04-22 00:16:21 Asia/Shanghai"
  iat: "2025-04-21 18:16:21 Asia/Shanghai"
  jti: "c1e24a09-daf5-4064-a8a9-642c8c18c7fe"
  mr_seam: "1cc6a17ab799e9a693fac7536be61c12ee1e0fabada82d0c999e08ccee2aa86de77b0870f558c570e7ffe55d6d47fa04"
  mr_td: "b0e52c59577523b17ad553c6fffb0f5f3496dbf3ccca69fbb2ea87cf4f938157550005c92a98130d8d30507ca5c652df"
  rtmr_0: "78be53d723b6be3f82997e3e8291f133b5d0a9905c17e5f95308c7db488e22da3405fc2e3b60f6291c38304096a17d21"
  rtmr_1: "216e85c7541a45bfbb9fe0521c72886bf8f47493d6027f2e33afe50a6d2f94690435078b0f0205ee447bf08d29f60e4e"
  rtmr_2: "000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
  rtmr_3: "000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
  seam_attributes: "0000000000000000"
  td_attributes: "0000001000000000"
  tee: "tdx"
  xfam: "e742060000000000"
  [[Prototype]]: Object
ALL Attestation info
{jti: 'c1e24a09-daf5-4064-a8a9-642c8c18c7fe', tee: 'tdx', exp: '2025-04-22 00:16:21 Asia/Shanghai', iat: '2025-04-21 18:16:21 Asia/Shanghai', mr_td: 'b0e52c59577523b17ad553c6fffb0f5f3496dbf3ccca69fbb2ea87cf4f938157550005c92a98130d8d30507ca5c652df', …}
  att_key_type: "0200"
  exp: "2025-04-22 00:16:21 Asia/Shanghai"
  iat: "2025-04-21 18:16:21 Asia/Shanghai"
  jti: "c1e24a09-daf5-4064-a8a9-642c8c18c7fe"
  mr_condif_id: "000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
  mr_owner: "000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
  mr_owner_config: "000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
  mr_seam: "1cc6a17ab799e9a693fac7536be61c12ee1e0fabada82d0c999e08ccee2aa86de77b0870f558c570e7ffe55d6d47fa04"
  mr_servicetd: "383c87d3bbb047b2d171eaca95312ede99f258088dc788f6ae2ccf8b6dd848fe8d47629e08b3f6cbd4a0ddd47a5..."
  mr_td: "b0e52c59577523b17ad553c6fffb0f5f3496dbf3ccca69fbb2ea87cf4f938157550005c92a98130d8d30507ca5c652df"
  mrsigner_seam: "000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
  rtmr_0: "78be53d723b6be3f82997e3e8291f133b5d0a9905c17e5f95308c7db488e22da3405fc2e3b60f6291c38304096a17d21"
  rtmr_1: "216e85c7541a45bfbb9fe0521c72886bf8f47493d6027f2e33afe50a6d2f94690435078b0f0205ee447bf08d29f60e4e"
  rtmr_2: "000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
  rtmr_3: "000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
  seam_attributes: "0000000000000000"
  tcb_svn: "05010600000000000000000000000000"
  td_attributes: "0000001000000000"
  td_attributes.debug: false
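
A relying party typically compares selected claims from this result with reference values recorded from a known-good deployment. The following sketch shows such a comparison. The function names and the "name expected actual" input format are hypothetical, and the check does not replace the policy evaluation performed by the Remote Attestation Service.

```shell
# check_claim NAME EXPECTED ACTUAL -> prints a status line; returns 1 on mismatch.
check_claim() {
  local name=$1 expected=$2 actual=$3
  if [ "$expected" = "$actual" ]; then
    echo "OK       $name"
  else
    echo "MISMATCH $name"
    return 1
  fi
}

# verify_claims -> reads "name expected actual" lines from stdin;
# returns 0 only if every claim matches its reference value.
verify_claims() {
  local failed=0 name expected actual
  while read -r name expected actual; do
    check_claim "$name" "$expected" "$actual" || failed=1
  done
  return "$failed"
}

# Short claims from the sample output above, compared with themselves for demonstration.
verify_claims <<'EOF'
tee tdx tdx
td_attributes 0000001000000000 0000001000000000
seam_attributes 0000000000000000 0000000000000000
EOF
```

In practice, the reference column would come from your own records (for example, the mr_td and rtmr values of an image that you built and reviewed), and a mismatch would be treated the same way as a failed remote attestation.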

FAQ

Slow package downloads when using pip

Symptom: pip install is slow or fails.
Cause: Network access to the official Python Package Index (PyPI) is unstable.
Solution: You can use an Alibaba Cloud mirror to accelerate downloads.
Global acceleration

Add the following content to your ~/.pip/pip.conf file.
```
[global]
index-url = https://mirrors.aliyun.com/pypi/simple/
```
Single installation

When you run the pip install command, you can add the -i parameter to specify the software source address to accelerate the installation. The following example shows how to install the torch package. Replace it with your actual package.
```
pip install torch -i https://mirrors.aliyun.com/pypi/simple/
```

`Cannot find package` error when compiling Open WebUI

Symptom: A Cannot find package error occurs when you compile Open WebUI.
Cause: The corresponding package is missing in the compilation environment.
Solution: You need to use npm to install the missing package and then recompile. The following example shows how to install the pyodide package. Replace it with your actual package.
```
npm install pyodide
```
