
Elastic Container Instance:Use a DataCache to accelerate the building of an Alpaca-LoRa application

Last Updated:Nov 01, 2023

This topic describes how to use a DataCache to accelerate the building of an Alpaca-LoRa application. To build an Alpaca-LoRa application, you can pull the llama-7b-hf model data and the alpaca-lora-7b weight data into a DataCache in advance. When you create the pod that corresponds to the Alpaca-LoRa application, you can mount the model data and weight data to the pod. This eliminates the need to pull data at startup and accelerates the startup of the Alpaca-LoRa application.

Background information

Alpaca-LoRa is a lightweight language model that is fine-tuned from the LLaMA (Large Language Model Meta AI) model by using the Low-Rank Adaptation (LoRA) technique. Alpaca-LoRa can simulate natural language for dialogue and interaction, generate different texts based on the instructions entered by a user, and help users complete tasks such as writing, translation, and coding.

Important
  • Alibaba Cloud does not guarantee the legality, security, or accuracy of third-party models. Alibaba Cloud is not liable for any damages caused thereby.

  • You must abide by the user agreements, usage specifications, and relevant laws and regulations of the third-party models. You agree that your use of the third-party models is at your sole risk.

Prerequisites

  • A DataCache custom resource definition (CRD) is deployed in the cluster. For more information, see Deploy a DataCache CRD.

  • The virtual private cloud (VPC) in which the cluster resides is associated with an Internet NAT gateway, and an SNAT entry is configured for the Internet NAT gateway to allow resources in the VPC, or resources connected to vSwitches in the VPC, to access the Internet.

    Note

    If the VPC is not associated with an Internet NAT gateway, you must associate an elastic IP address (EIP) when you create the DataCache and when you deploy the application. This way, data can be pulled from the Internet.

Procedure

Create an Alpaca-LoRa image

Create an image based on your business requirements.

  1. Visit alpaca-lora and clone the repository to your on-premises machine.
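    The clone step can be sketched as follows. The repository URL assumes the commonly used tloen/alpaca-lora repository on GitHub; replace it with the repository that you actually use.

```shell
# Clone the alpaca-lora repository to your on-premises machine.
# The URL below is an assumption; adjust it to your repository.
git clone https://github.com/tloen/alpaca-lora.git
cd alpaca-lora
```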

  2. Modify the requirements.txt and Dockerfile in the repository.

    Show the requirements.txt

    accelerate
    appdirs
    loralib
    bitsandbytes
    black
    black[jupyter]
    datasets
    fire
    git+https://github.com/huggingface/peft.git
    transformers>=4.28.0
    sentencepiece
    gradio
    scipy

    Show the Dockerfile

    FROM nvidia/cuda:11.8.0-devel-ubuntu22.04
    
    ARG DEBIAN_FRONTEND=noninteractive
    
    RUN apt-get update && apt-get install -y \
        git \
        curl \
        software-properties-common \
        && add-apt-repository ppa:deadsnakes/ppa \
    && apt-get install -y python3.10 \
        && rm -rf /var/lib/apt/lists/*
    WORKDIR /workspace
    COPY requirements.txt requirements.txt
    RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.10 \
        && python3.10 -m pip install -r requirements.txt \
        && python3.10 -m pip install numpy --pre torch --force-reinstall --index-url https://download.pytorch.org/whl/nightly/cu118 \
        && python3.10 -m pip install --upgrade typing-extensions
    COPY . .
    
    EXPOSE 7860
  3. Use the Dockerfile to build an image.

  4. Push the image to the image repository.
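    Steps 3 and 4 can be sketched as follows. The registry address, namespace, and tag are placeholders; replace them with your own values.

```shell
# Build the image from the modified Dockerfile (run in the repository root).
docker build -t alpaca-lora:v3.5 .

# Tag and push the image to your image repository.
# The registry address and namespace below are placeholders.
docker tag alpaca-lora:v3.5 registry.cn-hangzhou.aliyuncs.com/<your-namespace>/alpaca-lora:v3.5
docker push registry.cn-hangzhou.aliyuncs.com/<your-namespace>/alpaca-lora:v3.5
```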

Create a DataCache

  1. Visit Hugging Face and obtain the IDs of the models.

    In this example, the following models are used. Find the models on Hugging Face and copy the model IDs from the upper part of the model details pages.

    • decapoda-research/llama-7b-hf

    • tloen/alpaca-lora-7b
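    As the DataCache manifests below show, each model is cached under /model/<model name>, where <model name> is the last segment of the Hugging Face model ID. The following sketch illustrates that naming convention; the datacache_path helper is hypothetical and is shown only for illustration.

```python
def datacache_path(repo_id: str) -> str:
    """Map a Hugging Face model ID to the DataCache storage path
    used in this topic: /model/<model name>."""
    model_name = repo_id.split("/")[-1]
    return f"/model/{model_name}"

print(datacache_path("decapoda-research/llama-7b-hf"))  # /model/llama-7b-hf
print(datacache_path("tloen/alpaca-lora-7b"))           # /model/alpaca-lora-7b
```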

  2. Create the DataCaches.

    1. Create a DataCache for llama-7b-hf.

      kubectl apply -f llama-7b-hf.yaml

      The llama-7b-hf.yaml document:

      apiVersion: eci.aliyun.com/v1alpha1
      kind: DataCache
      metadata:
        name: llama-7b-hf
      spec:
        path: /model/llama-7b-hf                          # Specify the storage path of the model data.
        bucket: test                                      # Specify the bucket in which you want to store the DataCache.
        dataSource:
          type: URL 
          options:
            repoSource: "HuggingFace/Model"               # Specify a model whose data source is Hugging Face.
            repoId: "decapoda-research/llama-7b-hf"       # Specify the ID of the model.
        netConfig: 
          securityGroupId: sg-2ze63v3jtm8e6sy******
          vSwitchId: vsw-2ze94pjtfuj9vaym******           # Specify a vSwitch for which a SNAT gateway is configured.
    2. Create a DataCache for alpaca-lora-7b.

      kubectl apply -f alpaca-lora-7b.yaml

      The alpaca-lora-7b.yaml document:

      apiVersion: eci.aliyun.com/v1alpha1
      kind: DataCache
      metadata:
        name: alpaca-lora-7b
      spec:
        path: /model/alpaca-lora-7b                        # Specify the storage path of the model data.
        bucket: test                                       # Specify the bucket in which you want to store the DataCache.
        dataSource:
          type: URL 
          options:
            repoSource: "HuggingFace/Model"               # Specify a model whose data source is Hugging Face.
            repoId: "tloen/alpaca-lora-7b"                # Specify the ID of the model.
        netConfig: 
          securityGroupId: sg-2ze63v3jtm8e6sy******
          vSwitchId: vsw-2ze94pjtfuj9vaym******           # Specify a vSwitch for which a SNAT gateway is configured.
  3. Query the status of the DataCaches.

    kubectl get edc

    After the data is downloaded and the status of the DataCaches becomes Available, the DataCaches are ready for use. Example:

    lora1.png
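    If you want to block until both DataCaches are ready, you can poll their status. The following sketch assumes that the DataCache status appears in the output of kubectl get edc, as shown in the preceding example.

```shell
# Poll until both DataCaches report the Available status.
for name in llama-7b-hf alpaca-lora-7b; do
  until kubectl get edc "$name" | grep -q Available; do
    echo "Waiting for DataCache $name ..."
    sleep 30
  done
done
```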

Deploy the Alpaca-LoRa application

  1. Write a YAML configuration file for the Alpaca-LoRa application, and then use the YAML file to deploy the application.

    kubectl create -f alpacalora.yaml

    The following sample code shows the content of alpacalora.yaml. The file creates two resource objects:

    • Deployment: The name of the Deployment is alpacalora. The Deployment contains one pod replica. The pod has an additional 20 GiB of temporary storage. The llama-7b-hf and alpaca-lora-7b DataCaches are mounted to the pod. The containers in the pod use the Alpaca-LoRa image that you created. After the containers start, they run python3.10 generate.py --load_8bit --base_model /data/llama-7b-hf --lora_weights /data/alpaca-lora-7b.

    • Service: The name of the Service is alpacalora-svc. The type of the Service is LoadBalancer. The Service exposes port 80 and forwards traffic to port 7860 of pods that have the app: alpacalora label.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: alpacalora
      labels:
        app: alpacalora
    spec:
      replicas: 1 
      selector:
        matchLabels:
          app: alpacalora
      template:
        metadata:
          labels:
            app: alpacalora
          annotations:
            k8s.aliyun.com/eci-data-cache-bucket: "test"              # Specify the bucket in which you want to store the DataCache.
            k8s.aliyun.com/eci-extra-ephemeral-storage: "20Gi"        # Increase the temporary storage space by 20 GiB.
        spec:
          containers:
          - name: alpacalora
            image: registry.cn-hangzhou.aliyuncs.com/****/alpaca-lora:v3.5   # Use the image that you created.
            command: ["/bin/sh","-c"]
            args: ["python3.10 generate.py --load_8bit --base_model /data/llama-7b-hf --lora_weights /data/alpaca-lora-7b"] # Replace arguments in the startup command with actual values.
            resources:
              limits:
                cpu: "16000m"
                memory: "64.0Gi"
            ports:
            - containerPort: 7860
            volumeMounts:
            - mountPath: /data/llama-7b-hf         # Specify the mount path of llama-7b-hf in the container.
              name: llama-model
            - mountPath: /data/alpaca-lora-7b   # Specify the mount path of alpaca-lora-7b in the container.
              name: alpacalora-weight
          volumes:
            - name: llama-model
              hostPath:
                path: /model/llama-7b-hf       # Specify the storage path of llama-7b-hf.
            - name: alpacalora-weight
              hostPath:
                path: /model/alpaca-lora-7b   # Specify the storage path of alpaca-lora-7b.
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: alpacalora-svc
    spec:
      ports:
      - port: 80
        targetPort: 7860
        protocol: TCP
      selector:
        app: alpacalora
      type: LoadBalancer
  2. Check the deployment status of the application.

    kubectl get deployment alpacalora
    kubectl get pods

    The following example shows that the Alpaca-LoRa application is deployed.

    lora2.png
  3. View the IP address of the Service.

    kubectl get svc alpacalora-svc 

    In the following example, the IP address of the Service that is displayed in the EXTERNAL-IP column is 123.57.XX.XX.

    lora3.png
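    After the EXTERNAL-IP is assigned and port 80 is opened in the security group (see the next section), you can check reachability from the command line. <EXTERNAL-IP> is a placeholder; replace it with the EXTERNAL-IP of your Service.

```shell
# Check that the Alpaca-LoRa web UI responds. <EXTERNAL-IP> is a placeholder.
curl -I http://<EXTERNAL-IP>:80
# An HTTP 200 response indicates that the application is serving requests.
```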

Test the model

  1. Add an inbound rule that opens port 80 to the security group to which the pod belongs.

  2. Open a browser and visit the external IP address of the Service over port 80.

  3. Enter text to test the model.

    Example:

    lora4.png