Alibaba Cloud Linux: Deploy Qwen-7B-Chat on an Intel CPU

Last Updated: Jun 03, 2026

Deploy the Qwen-7B-Chat large language model on an Intel CPU by using an Alibaba Cloud AI Containers (AC2) image.

Background

Qwen-7B is a 7-billion-parameter model in the Qwen large language model (LLM) series developed by Alibaba Cloud. It is a Transformer-based LLM pre-trained on a large, diverse corpus that includes web text, professional books, and code. Qwen-7B-Chat is an AI assistant created by applying alignment techniques to the base Qwen-7B model.

Important

The code for Qwen-7B-Chat is open-sourced under the LICENSE. To use it for commercial purposes free of charge, you must submit a commercial license application. You must comply with the user agreements, usage specifications, and all applicable laws and regulations for any third-party models you use. You are solely responsible for ensuring the legality and compliance of your use of these models.

Step 1: Create an ECS instance

  1. Go to the instance buy page.

  2. Configure the following parameters to create an ECS instance.

    For other parameters, refer to Custom launch.

    • Instance: Qwen-7B-Chat requires approximately 30 GiB of memory. Select an instance type of at least ecs.g8i.4xlarge (64 GiB memory) to ensure stable model operation.

    • Image: Alibaba Cloud Linux 3.2104 LTS 64-bit.

    • Public IP Address: Select Assign Public IPv4 Address. Set Bandwidth Billing Method to Pay-by-traffic and the bandwidth to 100 Mbps to accelerate the model download.
    • Data Disk: The model files require significant storage. Set the data disk size to 100 GiB.
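
As a rough cross-check on the memory sizing above, the following shell sketch estimates the space needed for the model weights alone. It assumes exactly 7 billion parameters, 4 bytes per parameter for FP32, and 2 bytes per parameter for BF16. The function name estimate_gib is illustrative. Activations, the KV cache, and runtime overhead are not counted, so actual usage will be higher.

```shell
#!/bin/sh
# Weight-only memory estimate. Assumption: 7e9 parameters, per the model description.
PARAMS=7000000000

estimate_gib() {
    # $1 = bytes per parameter; prints the weight size in GiB with one decimal place
    awk -v p="$PARAMS" -v b="$1" 'BEGIN { printf "%.1f\n", p * b / (1024 ^ 3) }'
}

echo "FP32 weights: $(estimate_gib 4) GiB"   # prints "FP32 weights: 26.1 GiB"
echo "BF16 weights: $(estimate_gib 2) GiB"   # prints "BF16 weights: 13.0 GiB"
```

The FP32 figure sits just below the roughly 30 GiB quoted above, which leaves a 64 GiB instance with comfortable headroom.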

Step 2: Create a Docker runtime environment

  1. Install Docker.

    To install Docker on Alibaba Cloud Linux 3, follow the steps in Install and use Docker and Docker Compose.

  2. Verify that the Docker daemon is running:

    sudo systemctl status docker
  3. Pull and run the PyTorch AI container:

    AC2 provides PyTorch images optimized for Intel CPUs. Use this image to set up a PyTorch runtime environment.

    sudo docker pull ac2-registry.cn-hangzhou.cr.aliyuncs.com/ac2/pytorch:2.0.1-3.2304
    sudo docker run -itd --name pytorch --net host -v "$HOME/workspace:/workspace" ac2-registry.cn-hangzhou.cr.aliyuncs.com/ac2/pytorch:2.0.1-3.2304
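
The run command above sets no restart policy, so after a reboot or a manual stop the container may no longer be running. Below is a minimal sketch for checking this from the host, assuming the container name pytorch used above. The helper name container_running and the DOCKER variable are illustrative and not part of AC2 or Docker.

```shell
#!/bin/sh
# Command used to reach Docker; this guide prefixes docker with sudo.
DOCKER="${DOCKER:-sudo docker}"

container_running() {
    # $1 = container name. Lists the names of running containers
    # and succeeds only on an exact, whole-line match.
    $DOCKER ps --format '{{.Names}}' | grep -qxF -- "$1"
}
```

For example, `container_running pytorch || sudo docker start pytorch` restarts the container only when it is not already running.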

Step 3: Manually deploy Qwen-7B-Chat

  1. Enter the container environment:

    sudo docker exec -it -w /workspace pytorch /bin/bash

    All subsequent commands run inside the container. If you exit accidentally, re-run the preceding command. To verify you are in the container, run cat /proc/1/cgroup | grep docker.

  2. Install the required tools:

    yum install -y tmux git git-lfs wget
  3. Enable Git LFS:

    Git LFS is required to download the pre-trained model.

    git lfs install
  4. Download the source code and model.

    1. Create a tmux session:

      tmux
      Note

      Downloading the pre-trained model may take a long time depending on network conditions. Use a tmux session to prevent interruption if your SSH connection drops.

    2. Download the Qwen-7B project source code and the pre-trained model:

      git clone https://github.com/QwenLM/Qwen.git
      git clone https://www.modelscope.cn/qwen/Qwen-7B-Chat.git qwen-7b-chat --depth=1
    3. Verify the directory contents:

      ls -l
  5. Set up the runtime environment.

    The AC2 container includes many Python AI dependencies. Install the remaining dependencies with yum or dnf.

    yum install -y python3-{transformers{,-stream-generator},tiktoken,accelerate} python-einops
  6. Start the chatbot.

    1. Modify the model loading parameters.

      The source code includes a terminal demo script for running Qwen-7B-Chat as a local chatbot. Modify the model loading parameters to use BF16 precision, which leverages the CPU's AVX-512 instruction set for acceleration.

      cd /workspace/Qwen
      grep "torch.bfloat16" cli_demo.py >/dev/null 2>&1 || sed -i "57i\torch_dtype=torch.bfloat16," cli_demo.py
    2. Start the chatbot:

      cd /workspace/Qwen
      python3 cli_demo.py -c ../qwen-7b-chat --cpu-only

      After deployment, interact with the Qwen-7B-Chat LLM by entering messages at the User> prompt.

      Note

      Run :exit to exit the chatbot.
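
If the chatbot fails to load the model, one possible cause is that Git LFS did not fetch the weight files in step 4, leaving small text pointer files in their place. The sketch below flags such files by checking for the Git LFS pointer header. The function names are illustrative, and only the top level of the directory is scanned.

```shell
#!/bin/sh
# First line of every Git LFS pointer file (42 bytes long).
LFS_HEADER='version https://git-lfs.github.com/spec/v1'

is_lfs_pointer() {
    # $1 = path. Succeeds if it is a regular file that starts with the LFS header.
    # Only the first 42 bytes are read; NUL bytes are dropped so binary files compare cleanly.
    [ -f "$1" ] && [ "$(head -c 42 "$1" | tr -d '\000')" = "$LFS_HEADER" ]
}

list_lfs_pointers() {
    # $1 = directory to scan. Prints one path per line for each pointer stub found.
    for f in "$1"/*; do
        if is_lfs_pointer "$f"; then
            printf '%s\n' "$f"
        fi
    done
}
```

Running `list_lfs_pointers /workspace/qwen-7b-chat` inside the container should print nothing when the download is complete. If it lists files, running `git lfs pull` in that directory fetches the missing content.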