
Elastic Container Instance:Deploy Stable Diffusion

Last Updated:Mar 14, 2025

This topic describes how to use a DataCache to accelerate the deployment of a Stable Diffusion application. Before you deploy the application, you can use a DataCache to pull the Stable Diffusion model data in advance. When you create the pod for the Stable Diffusion application, you mount the cached model data to the pod. This eliminates the need to pull the model data at startup and accelerates the deployment of the application.

Background information

Stable Diffusion is a model that can be used to generate and modify images based on text descriptions. Stable Diffusion consists of a text understanding component and an image generating component. Stable Diffusion encodes prompts by using the Contrastive Language-Image Pre-Training (CLIP) model and generates images by using a diffusion model.

Important
  • Alibaba Cloud does not guarantee the legality, security, or accuracy of third-party models. Alibaba Cloud is not liable for any damages caused thereby.

  • You must abide by the user agreements, usage specifications, and relevant laws and regulations of the third-party models. You agree that your use of the third-party models is at your sole risk.

Prerequisites

  • A DataCache custom resource definition (CRD) is deployed in the cluster. For more information, see Deploy a DataCache CRD.

  • The virtual private cloud (VPC) in which the cluster resides is associated with an Internet NAT gateway. An SNAT entry is configured for the Internet NAT gateway to allow resources in the VPC or resources connected to vSwitches in the VPC to access the Internet.

    Note

    If the VPC is not associated with an Internet NAT gateway, you must associate an elastic IP address (EIP) with the VPC when you create the DataCache and deploy the application. This way, you can pull data from the Internet.

Prepare a runtime environment

To deploy a Stable Diffusion application, you must prepare a container image that contains the environment required for running the Stable Diffusion application. The environment must include the Compute Unified Device Architecture (CUDA), the Diffusers library, and other basic dependencies. Elastic Container Instance provides container images with stable runtime environments for most models. If your application does not require special dependencies, you can use the images that are provided by Elastic Container Instance.

  • Images that start the HTTP service

    • The GPU-accelerated image: registry.cn-hangzhou.aliyuncs.com/eci_open/ubuntu:cuda11.7.1-cudnn8-ubuntu20.04

    • The CPU image: registry.cn-hangzhou.aliyuncs.com/eci_open/ubuntu:hf-ubuntu20.04

    The following content describes the images.

    Show the Dockerfile that is used to create the image

    The following Dockerfile builds a development environment based on Ubuntu and Python and pre-installs common dependencies. The image's default command is /bin/bash; when the application is deployed, the container runs the python3 http-server.py command to start the HTTP service.

    FROM nvidia/cuda:11.7.1-cudnn8-runtime-ubuntu20.04
    LABEL maintainer="Alibaba Cloud Serverless Container"
    
    ENV DEBIAN_FRONTEND=noninteractive
    
    RUN apt update && \
        apt install -y bash \
                       vim \
                       build-essential \
                       git \
                       git-lfs \
                       curl \
                       ca-certificates \
                       libsndfile1-dev \
                       libgl1 \
                       python3.8 \
                       python3-pip \
                       python3.8-venv && \
        rm -rf /var/lib/apt/lists/*
    
    # make sure to use venv
    RUN python3 -m venv /opt/venv
    ENV PATH="/opt/venv/bin:$PATH"
    RUN mkdir -p /workspace/pic/
    WORKDIR /workspace
    COPY http-server.py http-server.py
    
    # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
    RUN python3 -m pip install --no-cache-dir --upgrade pip && \
        python3 -m pip install --no-cache-dir \
            torch \
            torchvision \
            torchaudio \
            invisible_watermark && \
        python3 -m pip install --no-cache-dir \
            accelerate \
            datasets \
            hf-doc-builder \
            huggingface-hub \
            Jinja2 \
            librosa \
            numpy \
            scipy \
            tensorboard \
            transformers \
            omegaconf \
            pytorch-lightning \
            xformers \
            safetensors \
            diffusers
            
    
    CMD ["/bin/bash"]

    Show the HTTP service script that is used in the image

    The http-server.py script implements a simple HTTP service that receives a text description, generates the corresponding image, and returns the path of the saved image.

    import os
    import hashlib
    import torch
    from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler
    from http.server import BaseHTTPRequestHandler
    from http.server import HTTPServer
    from urllib.parse import urlparse, parse_qs
    
    MODEL_DIR_ENV = "MODEL_DIR"
    APP_PORT_ENV = "APP_PORT"
    
    
    def text2image(input):
        model_id = os.getenv(MODEL_DIR_ENV, default="/data/model/")
        pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    
        pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
        pipe = pipe.to("cuda")
    
        image = pipe(input).images[0]
        name = "/workspace/pic/" + hashlib.md5(input.encode('utf8')).hexdigest() + ".png"
        image.save(name)
        return name
    
    
    class GetHandler(BaseHTTPRequestHandler):
    
        def do_GET(self):
            query = parse_qs(urlparse(self.path).query)
            # Obtain the parameter value.
            input = query.get('input', [''])[0]
            print("get user input:%s, try generate image" % input)
            picName = text2image(input)
            # Construct a response.
            self.send_response(200)
            self.send_header('Content-type', 'text/html')
            self.end_headers()
            self.wfile.write(bytes("<html><head><title>Stable Diffusion</title></head>", "utf-8"))
            self.wfile.write(bytes("<body><p>Success generate image:%s</p>" % picName, "utf-8"))
            self.wfile.write(bytes("</body></html>", "utf-8"))
    
    
    if __name__ == '__main__':
        server = HTTPServer(('', int(os.getenv(APP_PORT_ENV, default="8888"))), GetHandler)
        print('Starting server')
        server.serve_forever()
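    The script names each generated file after the MD5 hash of the prompt, so repeating a request with the same prompt overwrites the same file. The following minimal sketch isolates that naming scheme (the helper name image_name is illustrative, not part of the script):

```python
import hashlib

def image_name(prompt: str) -> str:
    # Same scheme as http-server.py: MD5 hex digest of the UTF-8 prompt,
    # saved under /workspace/pic/ with a .png extension.
    return "/workspace/pic/" + hashlib.md5(prompt.encode("utf8")).hexdigest() + ".png"

# The name is deterministic: the same prompt always maps to the same file.
print(image_name("a red apple on a table"))
```

    Because the digest is deterministic, identical prompts do not accumulate duplicate files; different prompts produce different names.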
    
  • The image that supports WebUI

    Image address: registry.cn-hangzhou.aliyuncs.com/eci_open/stable-diffusion:1.0.0

    Note

    This image applies to all Stable Diffusion models. To use a model, you only need to mount an existing model DataCache to the /stable-diffusion-webui/models/Stable-diffusion/ directory.

Procedure

Select a procedure based on the image that you use.

Use an image that starts the HTTP service

Create a DataCache

  1. Visit Hugging Face and obtain the ID of the model.

    In this example, the stabilityai/stable-diffusion-2-1 model is used. Find the model in Hugging Face and copy the ID of the model in the upper part of the model details page.

  2. Write a YAML configuration file for the DataCache. Then, use the YAML file to create the DataCache and pull the stable-diffusion-2-1 model data from the DataCache.

    kubectl create -f datacache-test.yaml

    Example: datacache-test.yaml document

    apiVersion: eci.aliyun.com/v1alpha1
    kind: DataCache
    metadata:
      name: stable-diffusion
    spec:
      path: /model/stable-diffusion/                 # Specify the storage path of the model data.
      dataSource:
        type: URL 
        options:
          repoSource: HuggingFace/Model              # Specify the model whose data source is Hugging Face.
          repoId: stabilityai/stable-diffusion-2-1   # Specify the ID of the model.
      retentionDays: 1
      netConfig: 
        securityGroupId: sg-2ze63v3jtm8e6s******
        vSwitchId: vsw-2ze94pjtfuj9vay******         # Specify a vSwitch for which an SNAT gateway is configured.
  3. View the status of the DataCache.

    kubectl get edc stable-diffusion

    After the data is downloaded and the status of the DataCache becomes Available, the DataCache is ready for use. Example:

    StableDiffusion.png

Deploy the Stable Diffusion application

  1. Write a YAML configuration file for the application, and then use the YAML file to deploy the Stable Diffusion application.

    kubectl create -f stable-diffusion.yaml

    The following example shows the content of stable-diffusion.yaml. You can use the configuration file to create a Deployment that contains one pod replica. The container in the pod uses the GPU-accelerated image and mounts the Stable Diffusion v2-1 model data. After the container starts, it runs the python3 http-server.py command to start the HTTP service.

    Note

    In the following example, a GPU-accelerated image is used. When you create the Elastic Container Instance pod, you must specify a GPU-accelerated ECS instance type and the number of GPUs required by the container. You can also use the CPU image. An application that uses the CPU image starts faster than one that uses the GPU-accelerated image, but performs inference more slowly.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: stable-diffusion
      labels:
        app: stable-diffusion
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: stable-diffusion
      template:
        metadata:
          name: stable-diffusion
          labels:
            app: stable-diffusion
            alibabacloud.com/eci: "true" 
          annotations:
            k8s.aliyun.com/eci-use-specs: ecs.gn7i-c16g1.4xlarge     # Specify a GPU-accelerated ECS instance type.
            k8s.aliyun.com/eci-data-cache-bucket: "default"          # Specify the bucket in which you want to store the DataCache.
        spec:
          containers:
          - name: stable-diffusion
            image: registry.cn-hangzhou.aliyuncs.com/eci_open/ubuntu:cuda11.7.1-cudnn8-ubuntu20.04  # Use the GPU-accelerated image.
            resources:
                limits:
                  nvidia.com/gpu: "1"           # Specify the number of GPUs that are required by the container.
            command: ["/bin/sh"]
            args: ["-c","python3 http-server.py"]
            volumeMounts:
            - name: "model"
              mountPath: "/data/model/"         # Specify the mount path of the model data in the container.
          volumes: 
          - name: "model"
            hostPath:             
              path: "/model/stable-diffusion/"    # Mount the model data.
  2. Check whether the application is deployed.

    kubectl get deployment stable-diffusion
    kubectl get pod

    The following example shows that the Stable Diffusion application is deployed.

    StableDiffusion2.png

  3. Check whether the model data is mounted.

    kubectl exec -it <POD_NAME> -- bash
    ls /data/model

    The following example shows that the model data is mounted in the /data/model directory of the container.

    StableDiffusion3.png

  4. Create a Service to allow external access to the Stable Diffusion application.

    kubectl create -f stable-diffusion-svc.yaml

    The following example shows the content of stable-diffusion-svc.yaml. You can use the configuration file to create a pay-as-you-go Internet-facing LoadBalancer Service. The Service exposes port 8888 and forwards traffic to port 8888 of the pods that have the app: stable-diffusion label (the pods of the Stable Diffusion application).

    apiVersion: v1
    kind: Service
    metadata:
      annotations:
        service.beta.kubernetes.io/alibaba-cloud-loadbalancer-address-type: internet
        service.beta.kubernetes.io/alibaba-cloud-loadbalancer-instance-charge-type: PayByCLCU
      name: stable-diffusion-svc
      namespace: default
    spec:
      externalTrafficPolicy: Local
      ports:
      - port: 8888
        protocol: TCP
        targetPort: 8888
      selector:
        app: stable-diffusion
      type: LoadBalancer
  5. View the IP address of the Service.

    kubectl get svc stable-diffusion-svc

    In the following example, the IP address of the Service that is displayed in the EXTERNAL-IP column is 39.106.XX.XX.

    StableDiffusion4.png

Test the model

  1. Pass in a text description to test whether the Stable Diffusion application can generate an image.

    1. Add an inbound rule to the security group to which the pod belongs and open port 8888.

    2. Open a browser and visit the external IP address of the Service over port 8888.

      http://39.106.XX.XX:8888/?input=xxx

      Replace xxx in input=xxx with a text description. The Stable Diffusion application generates an image based on the text description and saves the image to the /workspace/pic directory of the container. Example:

      StableDiffusion5.png
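      A prompt that contains spaces or punctuation must be URL-encoded before it is passed in the input parameter. The following sketch builds such a request URL (the build_url helper is illustrative; the IP address is the placeholder from the example above):

```python
from urllib.parse import urlencode

def build_url(host: str, prompt: str, port: int = 8888) -> str:
    # urlencode percent-encodes the prompt (spaces become "+"),
    # so the whole text survives the query string intact.
    return "http://%s:%d/?%s" % (host, port, urlencode({"input": prompt}))

print(build_url("39.106.XX.XX", "an astronaut riding a horse"))
```

      Pasting the resulting URL into a browser sends the encoded prompt to the HTTP service on port 8888.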

  2. View the image that is generated by the Stable Diffusion application.

    1. Add an inbound rule to the security group to which the pod belongs to open a port to view images. In this example, port 7777 is used.

    2. Update the Service to add port 7777 that is required to view images.

      kubectl patch service stable-diffusion-svc --type='json' -p '[{"op": "add", "path": "/spec/ports/-", "value": {"name":"image", "port": 7777, "targetPort": 7777}}]'
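      The --type='json' flag applies a JSON Patch. The add operation with path /spec/ports/- appends the value to the end of the spec.ports array. The following plain-Python sketch illustrates the effect of this patch on the Service spec (not how kubectl implements it):

```python
# Ports array as defined in stable-diffusion-svc.yaml
spec = {"ports": [{"port": 8888, "protocol": "TCP", "targetPort": 8888}]}

# The patch value from the kubectl patch command
patch_value = {"name": "image", "port": 7777, "targetPort": 7777}

# In JSON Patch, the "-" index in the path means: append at the end of the array
spec["ports"].append(patch_value)

print([p["port"] for p in spec["ports"]])
```

      After the patch, the Service listens on both port 8888 (the application) and port 7777 (the image viewer).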
    3. Start a new HTTP service that uses port 7777.

      kubectl exec -it <POD_NAME> -- bash
      python3 -m http.server 7777 --directory /workspace/pic/ &
    4. Open a browser, visit the external IP address of the Service over port 7777, and view the generated image.

      StableDiffusion6.png

Use the image that supports WebUI

Create a DataCache

  1. Visit Hugging Face and obtain the ID of the model.

    In this example, the hanafuusen2001/BeautyProMix model is used. Find the model in Hugging Face and copy the ID of the model in the upper part of the model details page.

  2. Write a YAML configuration file for the DataCache. Then, use the YAML file to create a DataCache and pull the BeautyProMix model data from the DataCache.

    kubectl create -f datacache-test.yaml

    Example: datacache-test.yaml document

    apiVersion: eci.aliyun.com/v1alpha1
    kind: DataCache
    metadata:
      name: beautypromix
    spec:
      path: /model/BeautyProMix/                 # Specify the storage path of the model data.
      dataSource:
        type: URL 
        options:
          repoSource: HuggingFace/Model              # Specify the model whose data source is Hugging Face.
          repoId: hanafuusen2001/BeautyProMix        # Specify the ID of the model.
      retentionDays: 1
      netConfig: 
        securityGroupId: sg-2ze63v3jtm8e6s******
        vSwitchId: vsw-2ze94pjtfuj9vay******         # Specify a vSwitch for which an SNAT gateway is configured.
  3. View the status of the DataCache.

    kubectl get edc beautypromix

    After the data is downloaded and the status of the DataCache becomes Available, the DataCache is ready for use. Example:

    webui.png

Deploy the Stable Diffusion application

  1. Write a YAML configuration file for the application, and then use the YAML file to deploy the Stable Diffusion application.

    kubectl create -f stable-diffusion.yaml

    The following example shows the content of stable-diffusion.yaml. You can use the configuration file to create a Deployment that contains one pod replica. The container in the pod uses the image that supports WebUI and mounts the BeautyProMix model data. After the container starts, it runs the python3 launch.py --listen --skip-torch-cuda-test --port 8888 --no-half command to start the WebUI.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: stable-diffusion
      labels:
        app: stable-diffusion
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: stable-diffusion
      template:
        metadata:
          name: stable-diffusion
          labels:
            app: stable-diffusion
            alibabacloud.com/eci: "true" 
          annotations:
            k8s.aliyun.com/eci-use-specs: ecs.gn7i-c16g1.4xlarge     # Specify a GPU-accelerated ECS instance type.
            k8s.aliyun.com/eci-data-cache-bucket: "default"          # Specify the bucket in which you want to store the DataCache.
        spec:
          containers:
          - name: stable-diffusion
            image: registry.cn-hangzhou.aliyuncs.com/eci_open/stable-diffusion:1.0.0  # Use the image that supports WebUI.
            resources:
                limits:
                  nvidia.com/gpu: "1"           # Specify the number of GPUs that are required by the container.
            command: ["/bin/sh"]
            args: ["-c","python3 launch.py --listen --skip-torch-cuda-test --port 8888 --no-half"]
            volumeMounts:
            - name: "model"
              mountPath: "/stable-diffusion-webui/models/Stable-diffusion/"         # Specify the mount path of the model data in the container.
          volumes: 
          - name: "model"
            hostPath:             
              path: "/model/BeautyProMix/"    # Mount the model data.
  2. Check whether the application is deployed.

    kubectl get deployment stable-diffusion
    kubectl get pod

    The following example shows that the Stable Diffusion application is deployed.

    webui2.png

  3. Check whether the model data is mounted.

    kubectl exec -it <POD_NAME> -- bash
    ls /stable-diffusion-webui/models/Stable-diffusion

    The following example shows that the model data is mounted in the /stable-diffusion-webui/models/Stable-diffusion directory of the container.

    webui3.png

  4. Create a Service to allow external access to the Stable Diffusion application.

    kubectl create -f stable-diffusion-svc.yaml

    The following example shows the content of stable-diffusion-svc.yaml. You can use the configuration file to create a pay-as-you-go Internet-facing LoadBalancer Service. The Service exposes port 8888 and forwards traffic to port 8888 of the pods that have the app: stable-diffusion label (the pods of the Stable Diffusion application).

    apiVersion: v1
    kind: Service
    metadata:
      annotations:
        service.beta.kubernetes.io/alibaba-cloud-loadbalancer-address-type: internet
        service.beta.kubernetes.io/alibaba-cloud-loadbalancer-instance-charge-type: PayByCLCU
      name: stable-diffusion-svc
      namespace: default
    spec:
      externalTrafficPolicy: Local
      ports:
      - port: 8888
        protocol: TCP
        targetPort: 8888
      selector:
        app: stable-diffusion
      type: LoadBalancer
  5. View the IP address of the Service.

    kubectl get svc stable-diffusion-svc

    In the following example, the IP address of the Service that is displayed in the EXTERNAL-IP column is 101.200.XX.XX.

    webui4.png

Test the model

  1. Add an inbound rule to the security group to which the pod belongs and open port 8888.

  2. Open a browser and visit the external IP address of the Service over port 8888.

  3. Enter a text description to test whether the application can generate an image.

    Example:

    webui5.png