
Container Service for Kubernetes: Use an ACK Edge cluster to centrally manage ECS resources in multiple regions

Last Updated: Feb 14, 2025

ACK Edge clusters allow you to centrally manage compute resources that reside in multiple regions. This helps you implement full lifecycle management and efficient resource scheduling for cloud-native applications. This topic describes how to use an ACK Edge cluster to centrally manage ECS resources in multiple regions.

Scenarios

You can use an ACK Edge cluster to centrally manage ECS resources in multiple regions in the following scenarios:

  • You have ECS instances that are spread across multiple virtual private clouds (VPCs).

  • You have ECS instances that are spread across multiple regions.

  • You have ECS instances that are created by multiple Resource Access Management (RAM) users.

Manage applications spread across multiple regions

When you want to centrally deploy or manage the same business on a large number of ECS instances that are spread across multiple regions, as in the following scenarios, you can create an ACK Edge cluster and add the ECS instances to the cluster. For more information, see Example 1: Use an ACK Edge cluster to manage applications that are spread across multiple regions.

  • Security protection

    When you use a distributed computing environment, you need to protect your business from malicious attacks and data leaks. A common protection solution is to deploy network security agents on distributed resources. In this case, you can use an ACK Edge cluster to deploy and maintain the agents in a centralized manner.

  • Distributed stress testing and synthetic monitoring

    When you perform stress tests on a large-scale business, you need to use stress testing tools to initiate stress testing tasks in multiple regions at the same time. In this case, you need to install stress testing tools on compute resources that are spread across regions. You can add the resources to an ACK Edge cluster and then quickly deploy stress testing tools on the resources in a centralized manner.

  • Cache acceleration

    When you use a distributed cache acceleration system, you need to deploy a cache service in each region to accelerate content delivery over networks. In this case, you can use an ACK Edge cluster to deploy and maintain the distributed cache acceleration system across multiple regions in a centralized manner.


Resolve the issue of insufficient GPU resources in a region

When you deploy a task in a region where GPU resources are insufficient, you can purchase a GPU-accelerated ECS instance in another region and then add the instance to your ACK Edge cluster. This way, your cluster can schedule the task to the GPU-accelerated instance. For more information, see Example 2: Purchase a new GPU-accelerated ECS instance in another region to scale out GPU resources when GPU resources are insufficient in a region.


Benefits

  • Cost-effectiveness: This solution provides standard integration with cloud-native technologies to optimize distributed application O&M and reduce O&M costs.

  • Zero O&M: The control plane of an ACK Edge cluster is managed by Alibaba Cloud and does not require manual O&M. In addition, Alibaba Cloud provides a service-level agreement (SLA) guarantee for the control plane.

  • High availability: This solution is integrated with other Alibaba Cloud services to provide various capabilities including elasticity, networking, storage, and observability. This ensures application stability. In addition, this solution provides edge autonomy, cloud-edge O&M channels, and cell-based management to meet requirements for O&M, stability, and business communication in centralized cloud-edge management.

  • High compatibility: This solution can integrate dozens of types of heterogeneous compute resources that use different operating systems.

  • High performance: This solution optimizes cloud-edge communication and reduces communication costs. Each ACK Edge cluster can contain thousands of nodes.

Examples

Example 1: Use an ACK Edge cluster to manage applications that are spread across multiple regions

Environment preparation

  • Select a region as the central region and create an ACK Edge cluster in the region. For more information, see Create an ACK Edge cluster.

  • Install OpenKruise. For more information, see Component management.

  • Create an edge node pool in each region where your ECS instances reside and add your ECS instances to the edge node pools. For more information, see Create an edge node pool.

Procedure

You can use a Kubernetes DaemonSet or an OpenKruise DaemonSet to deploy and manage your business.

Use a Kubernetes DaemonSet

Example
  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, find the cluster that you want to manage and click its name. In the left-side pane, choose Workloads > DaemonSets.

  3. On the DaemonSets page, select a namespace and a deployment method, enter the application name, set Type to DaemonSet, and then follow the on-screen instructions to complete the creation.

    For more information about how to create a DaemonSet, see Create a DaemonSet.
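    As a reference, a minimal DaemonSet manifest might look like the following sketch. The name and image are placeholders for illustration; replace them with your own business container. Because a DaemonSet runs one pod on every matching node, the same agent is automatically deployed to all edge nodes in all regions that you added to the cluster.

    ```yaml
    # Hypothetical example: deploy a business agent (for example, a security
    # or monitoring agent) to every node in the cluster, including edge nodes.
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: business-agent            # placeholder name
      labels:
        app: business-agent
    spec:
      selector:
        matchLabels:
          app: business-agent
      template:
        metadata:
          labels:
            app: business-agent
        spec:
          containers:
          - name: agent
            image: registry.example.com/business-agent:v1.0   # placeholder image
            resources:
              requests:
                cpu: 100m
                memory: 128Mi
    ```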

Business upgrades

On the DaemonSets page, find the DaemonSet you created and click Edit in the Actions column. On the Edit page, you can modify the DaemonSet template to perform version upgrades or configuration updates for your business.
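When you modify the template, Kubernetes rolls the change out to nodes according to the DaemonSet's update strategy. For a gradual rollout, you can set `maxUnavailable` in the manifest, for example:

```yaml
# Fragment of a DaemonSet spec: replace pods on at most one node at a time
# during a template update.
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
```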

Use an OpenKruise DaemonSet

Example
  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, find the cluster that you want to manage and click its name. In the left-side pane, choose Workloads > Pods.

  3. On the Pods page, click Create from YAML, select Custom from the Sample Template drop-down list, copy the YAML template of the DaemonSet to the code editor, and then click Create.
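    As a sketch, an OpenKruise DaemonSet uses the `apps.kruise.io/v1alpha1` API group instead of `apps/v1`; the rest of the pod template is defined in the same way as a native DaemonSet. The name and image below are placeholders for illustration:

    ```yaml
    # Hypothetical example of an OpenKruise (Advanced) DaemonSet manifest.
    apiVersion: apps.kruise.io/v1alpha1
    kind: DaemonSet
    metadata:
      name: business-agent            # placeholder name
    spec:
      selector:
        matchLabels:
          app: business-agent
      template:
        metadata:
          labels:
            app: business-agent
        spec:
          containers:
          - name: agent
            image: registry.example.com/business-agent:v1.0   # placeholder image
    ```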

Business upgrades
  1. On the Clusters page, click the name of the cluster that you want to manage and choose Workloads > Custom Resources in the left-side navigation pane.

  2. On the Custom Resources page, click Resource Objects, find the DaemonSet you created, and then click Edit YAML in the Actions column. You can modify the DaemonSet template to perform version upgrades or configuration updates for your business.
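    For example, to perform a canary-style upgrade across regions, you can edit the `updateStrategy` section of the OpenKruise DaemonSet so that only part of the nodes are updated first. A hypothetical fragment:

    ```yaml
    # Fragment of an OpenKruise DaemonSet spec: keep the old version on 10
    # nodes (partition) and update the remaining nodes one at a time.
    spec:
      updateStrategy:
        type: RollingUpdate
        rollingUpdate:
          partition: 10
          maxUnavailable: 1
    ```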


Example 2: Purchase a new GPU-accelerated ECS instance in another region to scale out GPU resources when GPU resources are insufficient in a region

Environment preparation

Create an ACK Edge cluster

Procedure

In this example, an inference service is deployed in an ACK Edge cluster that contains ECS instances that are spread across multiple regions. When GPU resources in one region become insufficient, you can add GPU-accelerated instances in another region to the cluster. Then, you can schedule the inference service to the newly added instances.

  1. Deploy an inference service and view the service status.

    1. Create a file named tensorflow-mnist.yaml.


      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: tensorflow-mnist
        labels:
          app: tensorflow-mnist
      spec:
        replicas: 1
        selector:
          matchLabels:
            app: tensorflow-mnist
        template:
          metadata:
            name: tensorflow-mnist
            labels:
              app: tensorflow-mnist
          spec:
            containers:
            - name: tensorflow-mnist
              image: registry.cn-beijing.aliyuncs.com/acs/tensorflow-mnist-sample:v1.5
              command:
              - python
              - tensorflow-sample-code/tfjob/docker/mnist/main.py
              - --max_steps=100000
              - --data_dir=tensorflow-sample-code/data
              resources:
                limits:
                  nvidia.com/gpu: "1"
                requests:
                  nvidia.com/gpu: "1"
              workingDir: /root
    2. Deploy an inference service.

      kubectl apply -f tensorflow-mnist.yaml
    3. Check the status of the inference service.

      kubectl get pods

      Expected output:

      NAME                                 READY   STATUS     RESTARTS   AGE
      tensorflow-mnist-664cf976d8-whrbc    0/1     Pending    0          30s

      The output shows that the pod is in the Pending state, which indicates that GPU resources in the region are insufficient to schedule the inference service.

  2. Create an edge node pool. For more information, see Create an edge node pool.

  3. Add GPU-accelerated instances to the edge node pool as edge nodes. For more information, see Add a GPU-accelerated node.

  4. View the status of the edge nodes.

    kubectl get nodes

    Expected output:

    NAME                        STATUS     ROLES    AGE    VERSION
    cn-hangzhou.192.168.XX.XX   Ready      <none>   9d     v1.30.7-aliyun.1
    iz2ze21g5pq9jbesubr****     Ready      <none>   8d     v1.30.7-aliyun.1
    izf8z0dko1ivt5kwgl4****     Ready      <none>   8d     v1.30.7-aliyun.1
    izuf65ze9db2kfcethw****     Ready      <none>   8d     v1.30.7-aliyun.1  # Information about the newly added GPU-accelerated edge nodes.

  5. Check the status of the inference service.

    kubectl get pods -o wide

    Expected output:

    NAME                                 READY   STATUS     RESTARTS   AGE   IP            NODE                       NOMINATED NODE   READINESS GATES
    tensorflow-mnist-664cf976d8-whrbc    1/1     Running    0          23m   10.12.XX.XX   izuf65ze9db2kfcethw****    <none>           <none>

    The output shows that the inference service is scheduled to one of the newly added GPU-accelerated nodes and is successfully deployed.