
Elastic Compute Service: Step 1: Deploy a client

Last Updated: Jan 09, 2025

The client encrypts a trained model and the TLS certificate that is used to establish secure connections, and then uploads the encrypted files to the SGX encrypted computing environment. A key service is deployed on the same Elastic Compute Service (ECS) instance as the client to authenticate Alibaba Cloud virtual SGX (vSGX) ECS instances. This ensures the integrity of the TensorFlow Serving inference service that runs on the cloud and the trustworthiness of the cloud-based SGX encrypted computing environment. After a vSGX instance is authenticated, the client sends a key to the TensorFlow Serving inference service that runs on the instance. This topic describes how to deploy a client. The deployment includes building an SGX encrypted computing environment, creating an encrypted model, and creating a gRPC Transport Layer Security (TLS) certificate.

Procedure

  1. Create a client instance and build an SGX encrypted computing environment on the instance.

    1. Create a client instance and configure security group rules that control access to or from the instance.

      The instance on which the client is deployed must meet the following requirements:

      • Instance Type: The client does not need to run in an SGX environment. You can select an instance type that meets your business requirements. We recommend that you select an instance type that has 2 vCPUs and 4 GiB of memory.

      • Image: Select an Alibaba Cloud Linux 3.2104 LTS 64-bit image.

      • Public IP Address: Select Assign Public IPv4 Address.

      • Security Group: Select a security group that has rules to open port 4433.

      Note

      If the client and the vSGX client are deployed on the same ECS instance, the selected security group does not need to contain rules that open port 4433. An example of how to open the port by using the Alibaba Cloud CLI follows this note.
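
      If you manage security groups by using the Alibaba Cloud CLI (aliyun), you can open port 4433 with a rule similar to the following sketch. The region ID, security group ID, and source CIDR block are placeholders and must be replaced with your own values.

      # Sketch: open TCP port 4433 in a security group by using the Alibaba Cloud CLI.
      # The RegionId, SecurityGroupId, and SourceCidrIp values are placeholders.
      aliyun ecs AuthorizeSecurityGroup \
        --RegionId cn-beijing \
        --SecurityGroupId sg-xxxxxxxxxxxxxxxx \
        --IpProtocol tcp \
        --PortRange 4433/4433 \
        --SourceCidrIp 0.0.0.0/0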

    2. Log on to the ECS instance.

    3. Install dependencies.

      sudo yum install -y wget git python3-pip
      python3 -m pip install --user -U pip -i https://mirrors.aliyun.com/pypi/simple/
      python3 -m pip install --user virtualenv -i https://mirrors.aliyun.com/pypi/simple/
    4. Install Docker Community Edition (Docker-CE).

      For more information, see Install Docker.

      After Docker-CE is installed, you can use Docker as a non-root user by running the sudo usermod -aG docker $USER command to add the user to the docker group, and then logging on to the instance again for the group change to take effect.

      Note

      You must install Docker-CE. If you use podman-docker, incompatibility issues may occur.
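
      The following sketch shows one common way to install Docker-CE on Alibaba Cloud Linux 3 from the Alibaba Cloud mirror. The repository URL is an assumption based on the Alibaba Cloud mirror layout; see the linked installation guide for the authoritative steps.

      # Sketch: install Docker-CE from the Alibaba Cloud mirror (repository URL is an assumption).
      sudo yum install -y yum-utils
      sudo yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
      sudo yum install -y docker-ce docker-ce-cli containerd.io
      # Start Docker now and enable it at boot.
      sudo systemctl enable --now docker
      # Optional: allow the current non-root user to run Docker commands, then log on again.
      sudo usermod -aG docker $USER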

  2. Switch to the working directory, download the TensorFlow Serving script code, and then install the required software packages, such as argparse, aiohttp, and tensorflow.

    Important

    Installing the software packages can take an extended period of time.

    git clone https://gitee.com/cloud_cc/confidential-computing.git
    export CC_DIR=$(realpath ./confidential-computing)
    
    # Create a virtual environment to keep the Python dependencies isolated from the system environment.
    python3 -m virtualenv venv && source venv/bin/activate
    python3 -m pip install -r ${CC_DIR}/Tensorflow_Serving/client/requirements.txt --trusted-host mirrors.cloud.aliyuncs.com -i https://mirrors.cloud.aliyuncs.com/pypi/simple/
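
    To confirm that the virtual environment is active and the key packages are importable, you can run a quick check such as the following. The printed versions depend on requirements.txt.

    # Sanity check: confirm that the required packages can be imported.
    python3 -c "import tensorflow as tf; print('tensorflow', tf.__version__)"
    python3 -c "import aiohttp; print('aiohttp', aiohttp.__version__)"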
  3. Go to the TensorFlow_Serving/client directory and download a trained model.

    source venv/bin/activate
    cd ${CC_DIR}/Tensorflow_Serving/client
    ./download_model.sh

    The downloaded model files are stored in the models/resnet50-v15-fp32 directory.

  4. Convert the model format.

    You must convert the format of the trained model files to ensure that the files are compatible with TensorFlow Serving.

    python3 ./model_graph_to_saved_model.py --import_path `pwd -P`/models/resnet50-v15-fp32/resnet50-v15-fp32.pb --export_dir `pwd -P`/models/resnet50-v15-fp32 --model_version 1 --inputs input --outputs predict

    The converted model is stored as the models/resnet50-v15-fp32/1/saved_model.pb file.

    Note

    When you convert the model format, error logs such as Could not load dynamic library 'libcudart.so.11.0' may be generated. You can ignore these errors. They only indicate that no GPU runtime is available, and the conversion runs on the CPU.
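
    To verify the converted model, you can inspect its signatures with the saved_model_cli tool that ships with TensorFlow. This is an optional check.

    # Optional check: show the signatures of the converted SavedModel.
    saved_model_cli show --dir models/resnet50-v15-fp32/1 --all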

  5. Create a gRPC TLS certificate.

    In this example, gRPC TLS is used to establish the connection between the client and TensorFlow Serving. The TensorFlow Serving and client domain names are configured to create the keys and certificates for a mutual (two-way) TLS channel that is used for secure communications.

    Run the generate_twoway_ssl_config.sh script to create the ssl_configure folder, which contains the certificates of the server and the client.

    service_domain_name=grpc.tf-serving.service.com
    client_domain_name=client.tf-serving.service.com
    ./generate_twoway_ssl_config.sh ${service_domain_name} ${client_domain_name}
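
    To confirm the result, you can list the generated files and inspect a certificate with openssl. The cert.pem path below is an assumption; list the ssl_configure folder to find the actual file names that the script produces.

    # Check the generated TLS material. The exact file names are assumptions.
    ls -R ssl_configure/
    openssl x509 -in ssl_configure/server/cert.pem -noout -subject -issuer -dates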
  6. Create an encrypted model.

    Intel SGX SDK v1.9 and later versions come with a secure file I/O feature. The feature is provided by a component named Intel Protected File System Library and allows developers to securely perform file I/O operations inside enclaves.

    The Intel Protected File System Library ensures the following:

    • Confidentiality of user data. All user data is encrypted and written to disks to prevent data leaks.

    • Integrity of user data. All user data is read from disks and decrypted by using verified message authentication codes (MACs) to detect data tampering.

    • Matched file names. Before an existing file is opened, the metadata of the file is checked to ensure that the name with which the file was created is the same as the name passed to the open operation.

    LibOS Gramine, which is used in this example, provides a reference tool based on the secure file I/O feature. You can use the tool to encrypt and decrypt files. The sgx.protected_files.file_mode=file_name configuration option that is defined in the template configuration file provided by LibOS Gramine allows you to specify the encrypted files that you want the tool to decrypt.

    TensorFlow Serving loads the model from the models/resnet50-v15-fp32/1/saved_model.pb path, and the key stored in the files/wrap-key file is used to encrypt the model file. You can also specify your own 128-bit key. To meet the path matching rule, the path of a file at encryption time must be the same as the path at which the file is used. Run the following commands, which use the gramine-sgx-pf-crypt tool, to encrypt the model file:

    mkdir plaintext/
    mv models/resnet50-v15-fp32/1/saved_model.pb plaintext/
    LD_LIBRARY_PATH=./libs ./gramine-sgx-pf-crypt encrypt -w files/wrap-key -i plaintext/saved_model.pb -o models/resnet50-v15-fp32/1/saved_model.pb
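
    Optionally, you can generate your own 128-bit key before you encrypt the model, and verify that the encryption round-trips correctly by decrypting the file and comparing it with the plaintext copy. This is a sketch that assumes the decrypt subcommand of gramine-sgx-pf-crypt mirrors encrypt; the /tmp output path is arbitrary.

    # Optional: generate your own 128-bit (16-byte) wrap key before you encrypt the model.
    # head -c 16 /dev/urandom > files/wrap-key

    # Optional round-trip check: decrypt the encrypted model and compare it with the plaintext copy.
    LD_LIBRARY_PATH=./libs ./gramine-sgx-pf-crypt decrypt -w files/wrap-key -i models/resnet50-v15-fp32/1/saved_model.pb -o /tmp/saved_model_check.pb
    cmp /tmp/saved_model_check.pb plaintext/saved_model.pb && echo "Round trip OK"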
  7. Start a key authentication service.

    In this example, secret_prov_server_dcap, which is provided by LibOS Gramine, is used as the remote attestation service. The service verifies SGX enclave quotes by calling the quote verification library of SGX Data Center Attestation Primitives (DCAP) at the underlying layer, which obtains quote verification collateral, such as trusted computing base (TCB) information and certificate revocation lists (CRLs), from Alibaba Cloud Provisioning Certificate Caching Service (PCCS). After the remote attestation service determines that an SGX enclave quote is trusted, the service sends the key stored in files/wrap-key in the current directory to the remote application. In this example, the remote application is LibOS Gramine in the vSGX environment. After LibOS Gramine receives the key, it decrypts the encrypted model file and the TLS configuration file.

    1. Switch to the secret_prov_server directory.

      cd ${CC_DIR}/Tensorflow_Serving/docker/secret_prov
    2. Obtain the image of the key authentication service.

      You can use one of the following methods to obtain the image:

      • Download the image of the key authentication service.

        docker pull registry.cn-beijing.aliyuncs.com/tee_sgx/secret_prov_server:v1
      • Compile the image by using a script.

        image_tag="v1"
        ./build_secret_prov_image.sh $image_tag
    3. Start the key authentication service.

      image_tag="registry.cn-beijing.aliyuncs.com/tee_sgx/secret_prov_server:v1"
      
      # If you compiled the image yourself, set image_tag to the tag that you specified.
      container_id=$(./run_secret_prov.sh -i $image_tag | tail -n1)
      
      # You can also run the docker ps command to view all container instances that are running.

      After the service starts, it runs in the background and waits for remote attestation requests. After the service receives a remote attestation request from a remote end and determines that the remote end is trusted, the service sends the key to the remote end.
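
      To confirm that the service is up, you can check the container state and, if the script publishes port 4433 on the host, the listening socket. Both checks are optional sketches.

      # Confirm that the key service container is running.
      docker ps --filter "id=$container_id"
      # Confirm that port 4433 is being listened on (only applies if the port is published on the host).
      ss -lnt | grep 4433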

    4. View the secret_prov_server logs.

      docker logs -f $container_id

      The command output indicates that the key service has started and received a remote attestation request.

What to do next

After the client is deployed, it waits for the vSGX client to start the inference service and send a remote attestation request. For information about how to deploy the vSGX client, see Step 2: Deploy a vSGX client to run the TensorFlow Serving inference service.