Container Service for Kubernetes: Quickly deploy a FastChat application in an ACK Serverless cluster

Last Updated: Nov 21, 2025

This topic describes how to quickly deploy a FastChat application in an ACK Serverless cluster. You can deploy the application using the console or kubectl. After the deployment is complete, you can access FastChat through an external endpoint to experience AI-generated content (AIGC).

Prerequisites

You have created an ACK Serverless cluster in the China (Beijing), China (Hangzhou), China (Shanghai), or China (Shenzhen) region and enabled Internet access for the cluster. For more information, see Create an ASK cluster.

Introduction to FastChat

FastChat is an open platform for training, serving, and evaluating chatbots based on large language models. It provides a distributed, multi-model serving system for models such as Vicuna and FastChat-T5. FastChat also provides a web interface and is compatible with OpenAI RESTful APIs.

Important
  • Alibaba Cloud does not guarantee the legitimacy, security, or accuracy of the third-party model, FastChat. Alibaba Cloud is not liable for any damages that result from using FastChat.

  • You must abide by the user agreements, usage specifications, and relevant laws and regulations of FastChat. You are responsible for all consequences that result from your use of FastChat.

Step 1: Deploy the FastChat application

You can deploy the FastChat application using the console or by applying a YAML file with kubectl.

Console

  1. Log on to the ACK console. In the left navigation pane, click Clusters.

  2. On the Clusters page, find the cluster you want to manage and click its name. In the left navigation pane, choose Workloads > Deployments.

  3. On the Deployments page, click Create from Image.

  4. On the Basic Information step, set Name to fastchat and Replicas to 1, and then click Next.

  5. On the Container step, set the following parameters and then click Next.

    The following table describes the container parameters. For parameters that are not mentioned, retain their default settings.

    Section        Parameter            Example
    General        Image Name           yunqi-registry.cn-shanghai.cr.aliyuncs.com/lab/fastchat:v1.1.0
    General        Required Resources   CPU: 8 cores; Memory: 32 GB
    Health Check   Readiness            Select Readiness, select TCP, and then set the port to 7860.
    Lifecycle      Start                Set the startup command of the container to ["sh","-c","/root/webui.sh"].

    Important

    In subsequent steps, you will add a pod annotation to specify the ECS instance type for the ECI pod. Therefore, the Required Resources setting does not take effect.

  6. On the Advanced step, click Create to the right of Services and configure the service parameters to expose the FastChat application.

    The following table describes the service parameters. For parameters that are not mentioned, retain their default settings.

    Parameter      Example
    Name           fastchat-svc
    Service Type   Select SLB. Load balancer type: Classic Load Balancer (CLB). Select resource: Create Resource.
    Port Mapping   Name: example-port; Service Port: 7860; Container Port: 7860; Protocol: TCP

  7. On the Advanced step, in the Labels and Annotations section, add the pod annotations in the following table, and then click Create.

    Name                                         Value
    k8s.aliyun.com/eci-use-specs                 ecs.gn6i-c8g1.2xlarge,ecs.gn5-c8g1.2xlarge,ecs.gn6v-c8g1.8xlarge,ecs.gn6i-c16g1.4xlarge
    k8s.aliyun.com/eci-extra-ephemeral-storage   100Gi

  8. Return to the Deployments page. Click the application name to go to the application details page. On the Pods tab, wait for the pod status to change to Running. Then, click the Access Method tab to obtain the IP address from the External Endpoint column.

kubectl

  1. Connect to the ACK Serverless cluster using kubectl. For more information, see Connect to a Kubernetes cluster using kubectl.

  2. Create a file named fastchat.yaml for the FastChat application and copy the following code into the file.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: fastchat
      name: fastchat
      namespace: default
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: fastchat
      template:
        metadata:
          labels:
            app: fastchat
            alibabacloud.com/eci: "true" 
          annotations:
            k8s.aliyun.com/eci-use-specs: ecs.gn6i-c8g1.2xlarge,ecs.gn5-c8g1.2xlarge,ecs.gn6v-c8g1.8xlarge,ecs.gn6i-c16g1.4xlarge
            k8s.aliyun.com/eci-extra-ephemeral-storage: 100Gi
        spec:
          dnsPolicy: Default
          containers:
          - command:
            - sh
            - -c 
            - "/root/webui.sh"
            image: yunqi-registry.cn-shanghai.cr.aliyuncs.com/lab/fastchat:v1.1.0
            imagePullPolicy: IfNotPresent
            name: fastchat
            ports:
            - containerPort: 7860
              protocol: TCP
            readinessProbe:
              failureThreshold: 3
              initialDelaySeconds: 5
              periodSeconds: 10
              successThreshold: 1
              tcpSocket:
                port: 7860
              timeoutSeconds: 1
            resources:
              requests:
                cpu: "8"
                memory: 16Gi
              limits:
                nvidia.com/gpu: 1
    ---
    apiVersion: v1
    kind: Service
    metadata:
      annotations:
        service.beta.kubernetes.io/alibaba-cloud-loadbalancer-address-type: internet
        service.beta.kubernetes.io/alibaba-cloud-loadbalancer-instance-charge-type: PayByCLCU
      name: fastchat-svc
      namespace: default
    spec:
      externalTrafficPolicy: Local
      ports:
      - port: 7860
        protocol: TCP
        targetPort: 7860
      selector:
        app: fastchat
      type: LoadBalancer

  3. Run the following command to deploy the FastChat application.

    kubectl apply -f fastchat.yaml

  4. Run the following command to confirm that the Deployment is ready. When READY shows 1/1, the pod is running and has passed its readiness probe.

    kubectl get deployment fastchat

    Expected output:

    NAME       READY   UP-TO-DATE   AVAILABLE   AGE
    fastchat   1/1     1            1           38m
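
If READY stays at 0/1 for a while, the pod may still be starting. The following is a minimal sketch of how the rollout could be watched from the command line. It assumes the resource names and namespace from fastchat.yaml above; the function name wait_for_fastchat and the 15-minute timeout are illustrative choices, not values from this topic.

```shell
#!/bin/sh
# Sketch: block until the fastchat Deployment finishes rolling out,
# then list its pods. Names and namespace come from fastchat.yaml.
# The 15-minute timeout is an assumption; adjust it as needed.
wait_for_fastchat() {
  kubectl rollout status deployment/fastchat -n default --timeout=15m &&
    kubectl get pods -n default -l app=fastchat -o wide
}
```

Call wait_for_fastchat after kubectl apply. If it times out, kubectl describe pod -n default -l app=fastchat lists the events recorded for the pod, which usually indicate where startup is stuck.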

Step 2: Access the service

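Enter http://<EXTERNAL-IP>:7860 in your browser to access the FastChat application. Replace <EXTERNAL-IP> with the external IP address of the fastchat-svc service; 7860 is the service port configured in Step 1.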
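
With kubectl, the address can be read from the service status. The sketch below assumes that the CLB instance reports an IP address (not a hostname) in the load balancer status of the service; fastchat_url is a hypothetical helper name, not part of any tool.

```shell
#!/bin/sh
# Sketch: build the FastChat URL from the fastchat-svc service.
# Assumes the load balancer publishes an IP in .status.loadBalancer.ingress.
fastchat_url() {
  ip="$(kubectl get service fastchat-svc -n default \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
  if [ -z "$ip" ]; then
    echo "fastchat-svc has no external IP yet" >&2
    return 1
  fi
  # 7860 is the service port defined in fastchat.yaml.
  printf 'http://%s:7860\n' "$ip"
}
```

Running fastchat_url prints the address to open in a browser. It returns a non-zero status while the load balancer is still being provisioned.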

Step 3: Release resources

After you complete the tutorial, release the resources to avoid unnecessary fees.

Delete the created application and service

  1. On the Clusters page of the Container Service for Kubernetes (ACK) console, click the name of the cluster that you want to manage.

  2. In the navigation pane on the left, choose Workloads > Deployments. Select the fastchat application, click Batch Delete, and then follow the prompts to confirm the deletion.
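
If you deployed with kubectl, or if the fastchat-svc service still exists after the console steps, the same cleanup can be done from the command line. This is a minimal sketch that assumes the names and namespace from fastchat.yaml; cleanup_fastchat is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: delete the Deployment and the Service created in Step 1.
# --ignore-not-found keeps the command from failing if a resource
# was already removed in the console.
cleanup_fastchat() {
  kubectl delete deployment fastchat -n default --ignore-not-found &&
    kubectl delete service fastchat-svc -n default --ignore-not-found
}
```

If you still have the manifest, kubectl delete -f fastchat.yaml removes the same two resources in one command.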

Delete a cluster

ACK Serverless clusters are in public preview and offer a free trial. However, you must pay for other Alibaba Cloud services used by your ACK Serverless clusters based on the billing rules of the services. Fees are charged by these Alibaba Cloud services separately. After you complete the configuration, you can manage the cluster in one of the following ways:

  • If you no longer need the cluster, log on to the ACK console. On the Clusters page, choose More > Delete in the Actions column of the cluster to delete the cluster. In the Delete Cluster dialog box, select Delete ALB Instances Created by the Cluster, Delete Alibaba Cloud DNS PrivateZone instances Created by the Cluster, and I understand the above information and want to delete the specified cluster, and then click OK. For more information about how to delete an ACK Serverless cluster, see Delete an ACK Serverless cluster.

  • If you want to continue to use the cluster, recharge your Alibaba Cloud account at least 1 hour before the trial period ends and ensure that your account has a balance of at least CNY 100. For more information about the billing of Alibaba Cloud services used by ACK Serverless Pro clusters, see Cloud service fee.

Contact us

If you have any questions about enabling AIGC services for ACK, join the DingTalk group 31850017754.