
Container Service for Kubernetes: Create a hybrid elastic container cluster (elastic ECS)

Last Updated:Mar 26, 2026

A hybrid cluster connects your on-premises self-managed Kubernetes cluster to Alibaba Cloud through a registered cluster. This lets you scale your on-premises cluster with cloud-based Elastic Compute Service (ECS) nodes while managing both environments from a single control plane.

This guide uses a data center cluster running Calico in route reflector mode as the example. On the cloud side, Alibaba Cloud Container Service for Kubernetes (ACK) uses the Terway plugin for container networking.

Prerequisites

Before you begin, make sure you have:

  • Network connectivity between your on-premises cluster and the virtual private cloud (VPC) used by the registered cluster — covering both the compute node network and the container network. Use Cloud Enterprise Network (CEN) to establish this connectivity. For details, see Establish multi-VPC connections in different scenarios.

Important

Network connectivity between your on-premises environment and the VPC is the foundational requirement. Without it, none of the subsequent steps will succeed.

  • The on-premises cluster connected to the registered cluster using the private cluster import agent configuration provided by the registered cluster

  • Cloud-based compute nodes added through the registered cluster that can reach the API Server of your on-premises cluster

  • A kubectl connection to the registered cluster. See Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.

Hybrid cluster architecture

In a hybrid cluster, Calico runs on-premises and Terway runs in the cloud. The two network plugins must not interfere with each other: Calico pods stay on on-premises nodes, and Terway pods run only on cloud-based ECS nodes.

The following diagram shows the network topology for this example.

(Figure: network topology of the hybrid cluster, connecting the on-premises data center to the cloud VPC)

On-premises (data center):

  • Private CIDR block: 192.168.0.0/24

  • Container network CIDR block: 10.100.0.0/16

  • Network plugin: Calico (route reflector mode)

Cloud side:

  • VPC CIDR block: 10.0.0.0/8

  • vSwitch CIDR block for compute nodes: 10.10.24.0/24

  • vSwitch CIDR block for pods: 10.10.25.0/24

  • Network plugin: Terway (shared mode)

Make sure the on-premises container network CIDR block (10.100.0.0/16) does not overlap with the vSwitch CIDR blocks used on the cloud side (10.10.24.0/24 and 10.10.25.0/24). Note that in this example the container CIDR block necessarily falls inside the VPC supernet (10.0.0.0/8); the blocks that must stay disjoint are the ranges that cloud-side nodes and pods actually allocate addresses from.
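The prefix comparison behind this check can be scripted. The sketch below is plain Bash using the example CIDR blocks from the topology above; the helper names (ip2int, check) are ours, not part of any tooling. Two blocks overlap exactly when they agree on the shorter of the two prefixes.

```shell
#!/bin/bash
# ip2int: dotted-quad IPv4 address -> 32-bit integer
ip2int() { local IFS=.; read -r a b c d <<< "$1"; echo $(( (a<<24) + (b<<16) + (c<<8) + d )); }

# check CIDR1 CIDR2: report whether two CIDR blocks overlap.
# They overlap iff their network bits agree under the shorter prefix.
check() {
  local n1=${1%/*} p1=${1#*/} n2=${2%/*} p2=${2#*/}
  local m=$(( p1 < p2 ? p1 : p2 ))
  local mask=$(( (0xFFFFFFFF << (32 - m)) & 0xFFFFFFFF ))
  if (( ($(ip2int "$n1") & mask) == ($(ip2int "$n2") & mask) )); then
    echo "$1 vs $2: overlap"
  else
    echo "$1 vs $2: ok"
  fi
}

check 10.100.0.0/16  10.10.24.0/24   # on-prem pods vs cloud node vSwitch -> ok
check 10.100.0.0/16  10.10.25.0/24   # on-prem pods vs cloud pod vSwitch  -> ok
check 192.168.0.0/24 10.10.24.0/24   # on-prem nodes vs cloud node vSwitch -> ok
```

All three pairs must report `ok` before you proceed; an `overlap` result means the address plan has to change.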

Set up the hybrid cluster

The setup consists of seven steps:

  1. Restrict Calico to on-premises nodes

  2. Grant RAM permissions to Terway

  3. Install Terway

  4. Verify the Terway DaemonSet

  5. Configure the Terway ENI ConfigMap

  6. Create a custom node initialization script

  7. Create a node pool and scale out ECS nodes

Step 1: Restrict Calico to on-premises nodes

Cloud-based ECS nodes added to a registered ACK cluster are automatically labeled alibabacloud.com/external=true. Configure nodeAffinity on the Calico DaemonSet so that Calico pods run only on on-premises nodes (nodes without this label).

cat <<EOF > calico-ds.patch
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: alibabacloud.com/external
                operator: NotIn
                values:
                - "true"
              - key: type
                operator: NotIn
                values:
                - "virtual-kubelet"
EOF
kubectl -n kube-system patch ds calico-node -p "$(cat calico-ds.patch)"

Step 2: Grant RAM permissions to Terway

Terway needs Resource Access Management (RAM) permissions to manage elastic network interfaces (ENIs) on ECS nodes.

Option A: Using onectl (recommended)

  1. Install and configure onectl. See Manage registered clusters using onectl.

  2. Run:

    onectl ram-user grant --addon terway-eniip

    Expected output:

    Ram policy ack-one-registered-cluster-policy-terway-eniip granted to ram user ack-one-user-ce313528c3 successfully.

Option B: Using the console

Grant the following RAM policy to the AccessKey used by Terway. For instructions, see Manage RAM user permissions.

{
    "Version": "1",
    "Statement": [
        {
            "Action": [
                "ecs:CreateNetworkInterface",
                "ecs:DescribeNetworkInterfaces",
                "ecs:AttachNetworkInterface",
                "ecs:DetachNetworkInterface",
                "ecs:DeleteNetworkInterface",
                "ecs:DescribeInstanceAttribute",
                "ecs:AssignPrivateIpAddresses",
                "ecs:UnassignPrivateIpAddresses",
                "ecs:DescribeInstances",
                "ecs:ModifyNetworkInterfaceAttribute"
            ],
            "Resource": [
                "*"
            ],
            "Effect": "Allow"
        },
        {
            "Action": [
                "vpc:DescribeVSwitches"
            ],
            "Resource": [
                "*"
            ],
            "Effect": "Allow"
        }
    ]
}

Step 3: Install Terway

Option A: Using onectl

onectl addon install terway-eniip

Expected output:

Addon terway-eniip, version **** installed.

Option B: Using the console

  1. Log on to the Container Service Management Console. In the left navigation pane, click Clusters.

  2. Click your cluster name, then click Add-ons in the left navigation pane.

  3. On the Add-ons page, search for terway-eniip. Click Install in the lower-right corner of the component card, then click OK.

Step 4: Verify the Terway DaemonSet

Before adding cloud-based nodes, confirm that Terway is not scheduled on any on-premises nodes.

kubectl -n kube-system get ds | grep terway

Expected output:

terway-eniip   0         0         0       0            0           alibabacloud.com/external=true      16s

The alibabacloud.com/external=true node selector confirms that Terway pods will run only on cloud-based ECS nodes.

Step 5: Configure the Terway ENI ConfigMap

Edit the eni-config ConfigMap in the kube-system namespace to set the AccessKey credentials for Terway:

kubectl -n kube-system edit cm eni-config

Set access_key and access_secret in the eni_conf section:

kind: ConfigMap
apiVersion: v1
metadata:
  name: eni-config
  namespace: kube-system
data:
  eni_conf: |
    {
     "version": "1",
     "max_pool_size": 5,
     "min_pool_size": 0,
     "vswitches": {"AZoneID":["VswitchId"]},
     "eni_tags": {"ack.aliyun.com":"{{.ClusterID}}"},
     "service_cidr": "{{.ServiceCIDR}}",
     "security_group": "{{.SecurityGroupId}}",
     "access_key": "",
     "access_secret": "",
     "vswitch_selection_policy": "ordered"
    }
  10-terway.conf: |
    {
     "cniVersion": "0.3.0",
     "name": "terway",
     "type": "terway"
    }

Replace access_key and access_secret with the AccessKey ID and AccessKey Secret of the RAM user that has the permissions granted in Step 2.
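Likewise, replace the vswitches placeholder with the zone and vSwitch ID of the pod vSwitch (10.10.25.0/24 in this guide's topology), so that Terway allocates pod ENI addresses from that range. The zone and vSwitch ID below are invented for illustration; substitute the real values from your VPC:

     "vswitches": {"cn-hangzhou-h": ["vsw-bp1xxxxxxxxxxxxx"]},
     "vswitch_selection_policy": "ordered"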

Step 6: Create a custom node initialization script

When a registered ACK cluster adds an ECS node, it runs a node initialization script and passes cloud-specific environment variables to it. Your existing on-premises initialization script (init-node.sh) must be extended to handle these variables.

Environment variables passed by the registered cluster:

  • ALIBABA_CLOUD_PROVIDER_ID: sets --provider-id on kubelet

  • ALIBABA_CLOUD_NODE_NAME: sets --hostname-override on kubelet

  • ALIBABA_CLOUD_LABELS: sets --node-labels on kubelet

  • ALIBABA_CLOUD_TAINTS: sets --register-with-taints on kubelet

6a. Extend the initialization script

The custom script init-node-ecs.sh starts with the same setup as init-node.sh (installing containerd, kubelet, kubeadm, and so on), then adds a section that reads the Alibaba Cloud environment variables and passes them to kubelet before joining the cluster.

View the init-node.sh script example

#!/bin/bash

export K8S_VERSION=1.24.3

export REGISTRY_MIRROR=https://registry.cn-hangzhou.aliyuncs.com
cat <<EOF | sudo tee /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF
modprobe overlay
modprobe br_netfilter
cat <<EOF | sudo tee /etc/sysctl.d/99-kubernetes-cri.conf
net.bridge.bridge-nf-call-iptables  = 1
net.ipv4.ip_forward                 = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
sysctl --system
yum remove -y containerd.io
yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum install -y containerd.io-1.4.3
mkdir -p /etc/containerd
containerd config default > /etc/containerd/config.toml
sed -i "s#k8s.gcr.io#registry.aliyuncs.com/k8sxio#g"  /etc/containerd/config.toml
sed -i '/containerd.runtimes.runc.options/a\            SystemdCgroup = true' /etc/containerd/config.toml
sed -i "s#https://registry-1.docker.io#${REGISTRY_MIRROR}#g"  /etc/containerd/config.toml
systemctl daemon-reload
systemctl enable containerd
systemctl restart containerd
yum install -y nfs-utils
yum install -y wget
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
       http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
yum remove -y kubelet kubeadm kubectl
yum install -y kubelet-$K8S_VERSION kubeadm-$K8S_VERSION kubectl-$K8S_VERSION
crictl config runtime-endpoint /run/containerd/containerd.sock
systemctl daemon-reload
systemctl enable kubelet && systemctl start kubelet
containerd --version
kubelet --version

kubeadm join 10.200.1.253:XXXX --token cqgql5.1mdcjcvhszol**** --discovery-token-unsafe-skip-ca-verification

View the init-node-ecs.sh script example

#!/bin/bash

export K8S_VERSION=1.24.3

export REGISTRY_MIRROR=https://registry.cn-hangzhou.aliyuncs.com
cat <<EOF | sudo tee /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF
modprobe overlay
modprobe br_netfilter
cat <<EOF | sudo tee /etc/sysctl.d/99-kubernetes-cri.conf
net.bridge.bridge-nf-call-iptables  = 1
net.ipv4.ip_forward                 = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
sysctl --system
yum remove -y containerd.io
yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum install -y containerd.io-1.4.3
mkdir -p /etc/containerd
containerd config default > /etc/containerd/config.toml
sed -i "s#k8s.gcr.io#registry.aliyuncs.com/k8sxio#g"  /etc/containerd/config.toml
sed -i '/containerd.runtimes.runc.options/a\            SystemdCgroup = true' /etc/containerd/config.toml
sed -i "s#https://registry-1.docker.io#${REGISTRY_MIRROR}#g"  /etc/containerd/config.toml
systemctl daemon-reload
systemctl enable containerd
systemctl restart containerd
yum install -y nfs-utils
yum install -y wget
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
       http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
yum remove -y kubelet kubeadm kubectl
yum install -y kubelet-$K8S_VERSION kubeadm-$K8S_VERSION kubectl-$K8S_VERSION
crictl config runtime-endpoint /run/containerd/containerd.sock
systemctl daemon-reload
systemctl enable kubelet && systemctl start kubelet
containerd --version
kubelet --version

####### Start of added section
# Configure node labels, taints, node name, and provider ID
# from environment variables passed by the registered ACK cluster
KUBELET_CONFIG_FILE="/etc/sysconfig/kubelet"

if [[ $ALIBABA_CLOUD_LABELS != "" ]];then
  option="--node-labels"
  if grep -- "${option}=" $KUBELET_CONFIG_FILE &> /dev/null;then
    sed -i "s@${option}=@${option}=${ALIBABA_CLOUD_LABELS},@g" $KUBELET_CONFIG_FILE
  elif grep "KUBELET_EXTRA_ARGS=" $KUBELET_CONFIG_FILE &> /dev/null;then
    sed -i "s@KUBELET_EXTRA_ARGS=@KUBELET_EXTRA_ARGS=${option}=${ALIBABA_CLOUD_LABELS} @g" $KUBELET_CONFIG_FILE
  else
    sed -i "/^\[Service\]/a\Environment=\"KUBELET_EXTRA_ARGS=${option}=${ALIBABA_CLOUD_LABELS}\"" $KUBELET_CONFIG_FILE
  fi
fi

if [[ $ALIBABA_CLOUD_TAINTS != "" ]];then
  option="--register-with-taints"
  if grep -- "${option}=" $KUBELET_CONFIG_FILE &> /dev/null;then
    sed -i "s@${option}=@${option}=${ALIBABA_CLOUD_TAINTS},@g" $KUBELET_CONFIG_FILE
  elif grep "KUBELET_EXTRA_ARGS=" $KUBELET_CONFIG_FILE &> /dev/null;then
    sed -i "s@KUBELET_EXTRA_ARGS=@KUBELET_EXTRA_ARGS=${option}=${ALIBABA_CLOUD_TAINTS} @g" $KUBELET_CONFIG_FILE
  else
    sed -i "/^\[Service\]/a\Environment=\"KUBELET_EXTRA_ARGS=${option}=${ALIBABA_CLOUD_TAINTS}\"" $KUBELET_CONFIG_FILE
  fi
fi

if [[ $ALIBABA_CLOUD_NODE_NAME != "" ]];then
  option="--hostname-override"
  if grep -- "${option}=" $KUBELET_CONFIG_FILE &> /dev/null;then
    sed -i "s@${option}=@${option}=${ALIBABA_CLOUD_NODE_NAME},@g" $KUBELET_CONFIG_FILE
  elif grep "KUBELET_EXTRA_ARGS=" $KUBELET_CONFIG_FILE &> /dev/null;then
    sed -i "s@KUBELET_EXTRA_ARGS=@KUBELET_EXTRA_ARGS=${option}=${ALIBABA_CLOUD_NODE_NAME} @g" $KUBELET_CONFIG_FILE
  else
    sed -i "/^\[Service\]/a\Environment=\"KUBELET_EXTRA_ARGS=${option}=${ALIBABA_CLOUD_NODE_NAME}\"" $KUBELET_CONFIG_FILE
  fi
fi

if [[ $ALIBABA_CLOUD_PROVIDER_ID != "" ]];then
  option="--provider-id"
  if grep -- "${option}=" $KUBELET_CONFIG_FILE &> /dev/null;then
    sed -i "s@${option}=@${option}=${ALIBABA_CLOUD_PROVIDER_ID},@g" $KUBELET_CONFIG_FILE
  elif grep "KUBELET_EXTRA_ARGS=" $KUBELET_CONFIG_FILE &> /dev/null;then
    sed -i "s@KUBELET_EXTRA_ARGS=@KUBELET_EXTRA_ARGS=${option}=${ALIBABA_CLOUD_PROVIDER_ID} @g" $KUBELET_CONFIG_FILE
  else
    sed -i "/^\[Service\]/a\Environment=\"KUBELET_EXTRA_ARGS=${option}=${ALIBABA_CLOUD_PROVIDER_ID}\"" $KUBELET_CONFIG_FILE
  fi
fi

# Reload and restart kubelet with the new configuration
systemctl daemon-reload
systemctl enable kubelet && systemctl start kubelet
####### End of added section

kubeadm join 10.200.1.253:XXXX --token cqgql5.1mdcjcvhszol**** --discovery-token-unsafe-skip-ca-verification
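The sed logic in the added section can be exercised offline before you upload the script. This dry-run harness is ours, not part of the setup: it runs the --node-labels branch against a scratch file instead of /etc/sysconfig/kubelet, with a made-up label set.

```shell
#!/bin/bash
# Scratch copy of a kubelet sysconfig file with an empty KUBELET_EXTRA_ARGS
KUBELET_CONFIG_FILE=$(mktemp)
echo 'KUBELET_EXTRA_ARGS=' > "$KUBELET_CONFIG_FILE"

# Example label set (invented for this dry run)
ALIBABA_CLOUD_LABELS="alibabacloud.com/external=true,workload=batch"

option="--node-labels"
if grep -- "${option}=" "$KUBELET_CONFIG_FILE" &> /dev/null; then
  # Flag already present: prepend the new labels to its value
  sed -i "s@${option}=@${option}=${ALIBABA_CLOUD_LABELS},@g" "$KUBELET_CONFIG_FILE"
elif grep "KUBELET_EXTRA_ARGS=" "$KUBELET_CONFIG_FILE" &> /dev/null; then
  # KUBELET_EXTRA_ARGS exists: insert the flag at the front of it
  sed -i "s@KUBELET_EXTRA_ARGS=@KUBELET_EXTRA_ARGS=${option}=${ALIBABA_CLOUD_LABELS} @g" "$KUBELET_CONFIG_FILE"
fi

result=$(cat "$KUBELET_CONFIG_FILE")
echo "$result"
rm -f "$KUBELET_CONFIG_FILE"
```

The rewritten line should read KUBELET_EXTRA_ARGS=--node-labels= followed by the label set, which is exactly what the registered cluster's environment variables produce on a real ECS node.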

6b. Host the script and register it with the cluster

  1. Upload init-node-ecs.sh to an HTTP file server accessible from the cloud — for example, an Object Storage Service (OSS) bucket:

    https://kubelet-****.oss-cn-hangzhou-internal.aliyuncs.com/init-node-ecs.sh
  2. Set addNodeScriptPath in the ack-agent-config ConfigMap to the script URL:

    apiVersion: v1
    data:
      addNodeScriptPath: https://kubelet-****.oss-cn-hangzhou-internal.aliyuncs.com/init-node-ecs.sh
    kind: ConfigMap
    metadata:
      name: ack-agent-config
      namespace: kube-system

Step 7: Create a node pool and scale out ECS nodes

  1. Log on to the Container Service Management Console. In the left navigation pane, click Clusters.

  2. Click your cluster name, then click Nodes > Node Pools in the left navigation pane.

  3. On the Node Pools page, create a node pool and scale out nodes. For details, see Create and manage a node pool.

Verify the hybrid cluster

After scaling out, confirm that the new ECS nodes have joined the cluster with the correct label:

kubectl get nodes --show-labels

Nodes labeled alibabacloud.com/external=true are cloud-based ECS nodes. Nodes without this label are on-premises nodes. If both appear in the output, the hybrid cluster is operational.
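To see both groups at a glance, the listing can be post-processed. This offline sketch (node names and ages are invented for illustration) tags each node in sample `kubectl get nodes --show-labels` output based on the external label:

```shell
#!/bin/bash
# classify: read `kubectl get nodes --show-labels` output on stdin and
# tag each node as cloud (ECS) or on-prem based on the external label
classify() {
  awk 'NR > 1 { kind = ($0 ~ /alibabacloud\.com\/external=true/) ? "cloud" : "on-prem"; print $1, kind }'
}

# Sample listing (hypothetical nodes)
result=$(classify <<'EOF'
NAME                       STATUS   ROLES    AGE   VERSION   LABELS
master-01                  Ready    master   90d   v1.24.3   node-role.kubernetes.io/master=
worker-dc-01               Ready    <none>   90d   v1.24.3   kubernetes.io/hostname=worker-dc-01
cn-hangzhou.10.10.24.100   Ready    <none>   5m    v1.24.3   alibabacloud.com/external=true
EOF
)
echo "$result"
# Prints:
#   master-01 on-prem
#   worker-dc-01 on-prem
#   cn-hangzhou.10.10.24.100 cloud
```

On a live cluster, pipe the real kubectl output into classify instead of the here-document.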

What's next