
Elastic Container Instance: FAQ

Last Updated: Nov 04, 2022

This topic provides answers to frequently asked questions about deploying a VNode in a self-managed Kubernetes cluster to use elastic container instances.

How do cloud services access the IP addresses of on-premises pods?

If you use Express Connect circuits to connect your cloud and on-premises networks, the cloud and on-premises services can learn routing rules from each other by using Border Gateway Protocol (BGP). The on-premises equipment can then advertise the IP addresses of pods to the cloud over BGP, which allows cloud services to access the IP addresses of the on-premises pods. For more information, see Configure BGP.

How do on-premises services access the IP addresses of cloud pods?

If you use Express Connect circuits to connect your cloud and on-premises networks, the cloud and on-premises services can learn routing rules from each other by using BGP. You can deploy a cloud controller manager (CCM) to automatically synchronize the IP addresses of cloud pods to the virtual private cloud (VPC) route table. For more information about a CCM, see Cloud Controller Manager.

After you deploy a CCM in a self-managed or on-premises cluster, the routes to the IP addresses of the Kubernetes pods are synchronized to the VPC route table. When you deploy the CCM, take note of the following items:

  • Set the providerID value of the Kubernetes cluster nodes to the <region-id>.<ecs-id> format. Example: cn-shanghai.i-ankb8zjh2nzchf*******.

  • Make sure that the pod IP addresses on each cluster node are all within the pod CIDR block of that node. For example, you can configure Calico to use the host-local IPAM plugin. This plugin allocates pod IP addresses from the podCIDR field that each node obtains from the Kubernetes API, which ensures that all pod IP addresses on a node are within the pod CIDR block of the node. A sketch of this configuration follows the node spec example below.

    You can check the pod CIDR blocks in the spec data of the nodes.

    spec:
      podCIDR: 172.23.XX.0/26
      podCIDRs:
      - 172.23.XX.0/26
      providerID: cn-shanghai.i-ankb8zjh2nzchfxxxxxxx
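
    The following is a minimal sketch of the corresponding Calico configuration, trimmed to the IPAM section of the cni_network_config field in the calico-config ConfigMap from the standard Calico manifest. The names and the cniVersion shown here are assumptions; your ConfigMap contains additional plugin settings.

    kind: ConfigMap
    apiVersion: v1
    metadata:
      name: calico-config
      namespace: kube-system
    data:
      # The host-local IPAM plugin allocates pod IP addresses from the
      # podCIDR that each node obtains from the Kubernetes API (usePodCidr).
      cni_network_config: |-
        {
          "name": "k8s-pod-network",
          "cniVersion": "0.3.1",
          "plugins": [
            {
              "type": "calico",
              "ipam": {
                "type": "host-local",
                "subnet": "usePodCidr"
              }
            }
          ]
        }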

What do I do if an internal network domain name cannot be resolved?

Problem description

Cloud and on-premises services cannot invoke each other because the internal network domain names of the services cannot be resolved. Typical resolution failures include:

  • Cloud services cannot resolve the internal network domain names of on-premises networks.

  • On-premises services cannot resolve cloud PrivateZone domain names.

Solutions

On-premises networks and Alibaba Cloud VPCs are deployed in different network environments. If cloud and on-premises services can communicate with each other only after internal network domain names are resolved by Alibaba Cloud DNS, you can configure Alibaba Cloud DNS PrivateZone to resolve the internal network domain names. For more information, see Use Alibaba Cloud DNS PrivateZone and VPN Gateway to allow ECS instances in a VPC to access an on-premises DNS.

Why can't on-premises services access cloud services?

Problem description

On-premises services cannot use leased lines to access Alibaba Cloud services such as ApsaraDB RDS, Object Storage Service (OSS), and Log Service.

Solutions

You can use one of the following solutions. We recommend that you use Solution 1.

  • Solution 1

    Configure the domain name of the cloud service on the cloud. Then, the virtual border router (VBR) advertises the route to the on-premises network over BGP. For more information, see Access cloud services.

  • Solution 2

    Add a static route to the on-premises network to route 100.64.0.0/10 to the leased line.

Why am I unable to pull images from a self-managed container image repository?

Problem description

When I try to pull images from a self-managed container image repository, the following error is reported:

(Screenshot: error during Virtual Kubelet deployment)

Solutions

This problem occurs because the image repository uses a certificate that you issued yourself, and the self-issued certificate cannot be verified. As a result, certificate-based authentication fails when you pull images. When you create a pod, you can add the following annotation to skip certificate-based authentication:

"k8s.aliyun.com/insecure-registry": "<host-name>"

For example, if the address of an NGINX image in the private image repository is test.example.com/test/nginx:alpine, you can add the "k8s.aliyun.com/insecure-registry": "test.example.com" annotation to skip certificate-based authentication.
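
A minimal pod sketch that combines the annotation with the image address from this example (the pod and container names are hypothetical):

apiVersion: v1
kind: Pod
metadata:
  name: nginx-test
  annotations:
    # Skip certificate-based authentication for this self-managed registry.
    "k8s.aliyun.com/insecure-registry": "test.example.com"
spec:
  containers:
  - name: nginx
    image: test.example.com/test/nginx:alpine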

How do I schedule pods to a VNode?

You can use one of the following methods to schedule pods to a VNode based on your business requirements. The pods then run on the elastic container instances that are deployed on the VNode. The scheduling methods include:

  • Manual scheduling

    You can configure the nodeSelector and tolerations parameters or specify the nodeName parameter to schedule pods to the VNode. For more information, see Schedule pods to a VNode. A minimal example follows this list.

  • Automatic scheduling

    The ECI Profile feature allows you to customize selectors. Elastic Container Instance then automatically schedules pods that match the selectors to the VNode. For more information, see Configure an ECI Profile to orchestrate pods.
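
The following is a minimal sketch of manual scheduling that uses the nodeSelector and tolerations parameters. The type: virtual-kubelet label matches the label that is used in the DaemonSet example below. The toleration key virtual-kubelet.io/provider is an assumption; run kubectl describe node to check the actual taints on your VNode.

apiVersion: v1
kind: Pod
metadata:
  name: pod-on-vnode
spec:
  nodeSelector:
    type: virtual-kubelet              # label of the VNode
  tolerations:
  - key: virtual-kubelet.io/provider   # assumed taint key; verify on your VNode
    operator: Exists
    effect: NoSchedule
  containers:
  - name: nginx
    image: nginx:latest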

Why do DaemonSet pods remain in the Pending state after they are scheduled to a VNode?

VNodes are not real nodes and do not support DaemonSets. When you create a DaemonSet, you must configure an anti-affinity scheduling policy to prevent Kubernetes from scheduling DaemonSet pods to VNodes. Sample configurations:

spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: type
            operator: NotIn
            values:
            - virtual-kubelet

Why does the scheduling fail when I attempt to schedule pods to a VNode by configuring pod labels?

This problem occurs because the version of your Kubernetes cluster is earlier than v1.16. Scheduling pods to a VNode by configuring pod labels requires Kubernetes v1.16 or later.

What do I do if the mount of a NAS volume times out?

Cause

After you mount a network attached storage (NAS) file system, Kubernetes recursively runs the chmod and chown commands on the files in the NAS directory based on the permissions and ownership specified in the pod. If the NAS directory contains a large number of files and you configure the permissions and ownership of the files in the security context when you create the pod, the mount of the NAS volume may time out.

Solutions

When you configure the security context, set fsGroupChangePolicy to OnRootMismatch. This way, the system does not run the chmod and chown commands when the permissions and ownership of the root directory of the NAS file system already match those specified in the pod. For more information, see Configure a Security Context for a Pod or Container.
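
The following sketch shows where the setting goes in the pod spec. The fsGroup value and the PVC name are hypothetical:

apiVersion: v1
kind: Pod
metadata:
  name: nas-app
spec:
  securityContext:
    fsGroup: 1000                           # hypothetical group ID
    # Skip the recursive chmod and chown commands when the root directory
    # of the volume already has the expected ownership and permissions.
    fsGroupChangePolicy: "OnRootMismatch"
  containers:
  - name: app
    image: nginx:latest
    volumeMounts:
    - name: nas-volume
      mountPath: /data
  volumes:
  - name: nas-volume
    persistentVolumeClaim:
      claimName: nas-pvc                    # hypothetical PVC bound to the NAS file system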

How do I collect logs to Log Service?

You can install alibaba-log-controller in your self-managed Kubernetes cluster to deploy the Logtail agent and collect logs to Log Service. During the installation, the system automatically performs the following operations:

  1. Creates a ConfigMap named alibaba-log-configuration. The ConfigMap contains the configuration information of Log Service, such as projects.

  2. Optional. Creates a Custom Resource Definition (CRD) named AliyunLogConfig.

  3. Optional. Creates a Deployment controller named alibaba-log-controller. The Deployment controller is used to monitor the changes in the AliyunLogConfig CRD and the creation of Logtail configurations.

  4. Creates a DaemonSet named logtail-ds to collect logs from nodes.

For more information, see Install Logtail in a Kubernetes cluster.
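
For reference, the following is a minimal sketch of an AliyunLogConfig resource that collects container stdout. The resource and Logstore names are hypothetical, and the exact schema can differ across versions of alibaba-log-controller:

apiVersion: log.alibabacloud.com/v1alpha1
kind: AliyunLogConfig
metadata:
  name: stdout-example
spec:
  # Hypothetical Logstore in the project that the installation created.
  logstore: k8s-stdout
  logtailConfig:
    inputType: plugin
    configName: stdout-example
    inputDetail:
      plugin:
        inputs:
        - type: service_docker_stdout   # collect stdout and stderr from containers
          detail:
            Stdout: true
            Stderr: true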

Note

If your cluster runs an early Kubernetes version, such as v1.13, download a CRD of an early version from alibaba-cloud-log-0.1.1 and deploy the CRD. If you have other questions, submit a ticket.

What do I do if the metrics-server reports a 404 error?

Metrics-server v0.5.x and earlier can call the kubelet API to collect metrics from VNodes. If a 404 error occurs, downgrade to metrics-server v0.5.x or earlier.

Note

Metrics-server v0.6.0 and later fetch node metrics from the /metrics/resource endpoint instead of /stats/summary. VNodes do not serve the /metrics/resource endpoint, so these versions of metrics-server cannot collect metrics from VNodes.

The following code shows the startup parameters of metrics-server:

        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --kubelet-insecure-tls