Alibaba Cloud Linux: Deploy Qwen-7B-Chat on an NVIDIA GPU

Last Updated: Jun 04, 2026

Deploy the Qwen-7B-Chat AI container image on an NVIDIA GPU using Alibaba Cloud AI Containers (AC2).

Background

Qwen-7B is the 7-billion-parameter model in the Qwen large language model (LLM) series developed by Alibaba Cloud. It is a Transformer-based LLM pre-trained on a large, diverse corpus that includes web text, professional books, and code. Qwen-7B-Chat is an AI assistant built by applying alignment techniques to the base Qwen-7B model.

Important

The code for Qwen-7B-Chat is open-sourced under the LICENSE. To use it for commercial purposes free of charge, you must submit a commercial license application. You must comply with the user agreements, usage specifications, and all applicable laws and regulations for any third-party models you use. You are solely responsible for ensuring the legality and compliance of your use of these models.

Step 1: Create an ECS instance

  1. Go to the instance buy page.

  2. Configure the parameters to create an ECS instance.

    Set the following key parameters. Other parameters are described in Custom launch.

    • Instance: Qwen-7B-Chat requires more than 16 GiB of GPU memory. Select at least ecs.gn6i-c4g1.xlarge.

    • Image: Alibaba Cloud Linux 3.2104 LTS 64-bit.

    • Public IP Address: Select Assign Public IPv4 Address. Set Bandwidth Billing Method to Pay-by-traffic and Maximum Bandwidth to 100 Mbps to accelerate model downloads.

    • Data Disk: Qwen-7B-Chat model files require significant storage. Set the data disk size to at least 100 GiB.
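Once the instance is running, a quick check can confirm that the disk intended for the model has the space recommended above. The following is a minimal sketch and not part of the original procedure: the function names free_gib and require_free_gib are hypothetical, and the path you pass in is an assumption that depends on where your data disk is mounted.

```shell
# Print the free space, in whole GiB, of the filesystem that holds the given path.
free_gib() {
  # df -Pk prints one POSIX-format line per filesystem; column 4 is available KiB.
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Return 0 if the path has at least the requested number of GiB free, 1 otherwise.
require_free_gib() {
  local path="$1" min_gib="$2" have
  have="$(free_gib "$path")"
  if [ "$have" -ge "$min_gib" ]; then
    echo "OK: ${have} GiB free on ${path}"
  else
    echo "WARN: only ${have} GiB free on ${path}, want ${min_gib} GiB" >&2
    return 1
  fi
}
```

For example, require_free_gib "$HOME" 100 checks the filesystem that will later hold $HOME/workspace.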

Step 2: Set up the Docker environment

  1. Install Docker.

    Install Docker on Alibaba Cloud Linux 3 by following Install and use Docker and Docker Compose.

  2. Check the Docker daemon status.

    sudo systemctl status docker

  3. Install the NVIDIA driver and CUDA components.

    sudo dnf install -y anolis-epao-release
    sudo dnf install -y kernel-devel-$(uname -r) nvidia-driver{,-cuda}

  4. Install the NVIDIA Container Toolkit.

    sudo dnf install -y nvidia-container-toolkit

  5. Restart Docker.

    The NVIDIA Container Toolkit adds a prestart hook that exposes GPUs to containers. Restart the Docker daemon so that the change takes effect.

    sudo systemctl restart docker

    After the restart, pass the --gpus <gpu-request> parameter when you create a container to give it access to GPUs. For example, --gpus all exposes every GPU on the host.

  6. Create and run a PyTorch AI container.

    AC2 provides container images for AI scenarios. Use the following image to create a PyTorch runtime environment.

    sudo docker pull ac2-registry.cn-hangzhou.cr.aliyuncs.com/ac2/pytorch:2.2.0.1-3.2304-cu121
    sudo docker run -itd --name pytorch --gpus all --net host -v $HOME/workspace:/workspace \
      ac2-registry.cn-hangzhou.cr.aliyuncs.com/ac2/pytorch:2.2.0.1-3.2304-cu121

    These commands pull the image and start a detached container named pytorch. The container uses the host network, can access all GPUs, and mounts the host directory $HOME/workspace at /workspace.
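Before moving on, it can help to confirm that GPU passthrough works and that the pytorch container is up. The sketch below is an illustration rather than part of the official steps: the helper names gpu_smoke_test and container_running and the DOCKER variable are hypothetical, and it assumes the NVIDIA Container Toolkit makes nvidia-smi available inside the container, as it usually does when --gpus is passed.

```shell
# Assumption: set DOCKER=docker if your user can reach the Docker daemon without sudo.
DOCKER="${DOCKER:-sudo docker}"
IMAGE="ac2-registry.cn-hangzhou.cr.aliyuncs.com/ac2/pytorch:2.2.0.1-3.2304-cu121"

# Run nvidia-smi in a throwaway container to confirm that --gpus all exposes the GPU.
gpu_smoke_test() {
  $DOCKER run --rm --gpus all "${1:-$IMAGE}" nvidia-smi
}

# Print "true" if the named container (default: pytorch) is currently running.
container_running() {
  $DOCKER inspect --format '{{.State.Running}}' "${1:-pytorch}"
}
```

Running gpu_smoke_test should print the usual nvidia-smi table listing the instance's GPU. If it fails, revisit the driver and toolkit installation before deploying the model.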

Step 3: Manually deploy Qwen-7B-Chat

  1. Start a shell in the container.

    sudo docker exec -it -w /workspace pytorch /bin/bash

    Run all subsequent commands inside the container. If you exit, re-enter with the same command. To verify you are in the container, run cat /proc/1/cgroup | grep docker.
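The cgroup check above relies on the cgroup paths of PID 1 mentioning docker, which may not hold on hosts that use cgroup v2. As a hedged alternative, the sketch below also looks for /.dockerenv, a marker file that Docker typically creates in the root of a container. The function name in_container is hypothetical, and its two optional arguments exist only so that the check can be tried against sample files.

```shell
# Return 0 if this shell appears to run inside a Docker container, 1 otherwise.
in_container() {
  local marker="${1:-/.dockerenv}" cgroup="${2:-/proc/1/cgroup}"
  # Docker typically creates /.dockerenv in the root of each container.
  [ -e "$marker" ] && return 0
  # On cgroup v1 hosts, the cgroup paths of PID 1 contain "docker".
  grep -q docker "$cgroup" 2>/dev/null
}
```

A typical use is in_container && echo "inside the container" before running the installation commands that follow.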

  2. Install required software.

    yum install -y git git-lfs wget tmux

  3. Enable Git LFS.

    The pretrained model download requires Git LFS.

    git lfs install

  4. Download the source code and model.

    1. Create a new tmux session.

      tmux

      The model download may take a long time. Use tmux so you can resume with tmux attach if the connection drops.

    2. Download the Qwen-7B source code and pretrained model.

      git clone https://github.com/QwenLM/Qwen.git
      git clone https://www.modelscope.cn/qwen/Qwen-7B-Chat.git qwen-7b-chat --depth=1
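If Git LFS was not active or the transfer was interrupted, the clone can leave small pointer stubs in place of the real weight files, and the model then fails to load. The following sketch, under the assumption that unresolved files still begin with the standard Git LFS pointer header, lists any such stubs. The function name find_lfs_pointers is hypothetical.

```shell
# List files under a directory that are still Git LFS pointer stubs.
find_lfs_pointers() {
  local dir="$1" f
  # Pointer stubs are tiny text files, so only inspect files of at most 1 KiB
  # and skip Git's own metadata directory.
  find "$dir" -name .git -prune -o -type f -size -2k -print |
    while IFS= read -r f; do
      if head -n 1 "$f" | grep -q '^version https://git-lfs\.github\.com/spec/v1'; then
        printf '%s\n' "$f"
      fi
    done
}
```

Running find_lfs_pointers /workspace/qwen-7b-chat should print nothing after a complete download. If it lists files, running git lfs pull inside the model directory is one way to fetch the missing content.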
  5. Set up the runtime environment.

    AC2 containers include prepackaged Python AI dependencies. Install additional dependencies with yum or dnf.

    dnf install -y python-einops \
      python3-datasets \
      python3-gradio \
      python3-mdtex2html \
      python3-protobuf \
      python3-psutil \
      python3-pyyaml \
      python3-rich \
      python3-scikit-learn \
      python3-scipy \
      python3-sentencepiece \
      python3-tensorboard \
      python3-tiktoken \
      python3-transformers \
      python3-transformers-stream-generator \
      yum-utils

    Some dependencies must be downloaded and installed manually with rpm --nodeps, so that dependency resolution does not overwrite components that ship with the AC2 image.

    yumdownloader --destdir ./rpmpkgs python3-timm python3-accelerate
    rpm -ivh --nodeps rpmpkgs/*.rpm && rm -rf rpmpkgs
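Because the last two packages are installed with --nodeps, it is worth confirming that the Python modules actually import before starting the chatbot. The sketch below is one way to do that. The function name check_imports and the PYTHON variable are hypothetical, and the module names in the usage note are an assumption about how the RPM packages map to Python import names.

```shell
# Assumption: the interpreter is python3, as in the commands in this guide.
PYTHON="${PYTHON:-python3}"

# Try to import each named module; print the ones that fail and return 1 if any did.
check_imports() {
  local missing=0 mod
  for mod in "$@"; do
    if ! $PYTHON -c "import ${mod}" >/dev/null 2>&1; then
      echo "missing: ${mod}"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, check_imports torch transformers einops tiktoken accelerate transformers_stream_generator prints nothing when every module loads. To confirm that PyTorch sees the GPU, python3 -c "import torch; print(torch.cuda.is_available())" should print True.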
  6. Start the AI chatbot.

    1. Start the chatbot.

      cd /workspace/Qwen
      python3 cli_demo.py -c ../qwen-7b-chat

      After startup, enter text at the User: prompt to interact with Qwen-7B-Chat.

      Note

      Enter the :exit command to exit the chatbot.