FAQ about using elastic container instances in ACK clusters

This topic provides answers to frequently asked questions about using Elastic Container Instance in Container Service for Kubernetes (ACK).

ECI Pod
Scheduling
- A pod is scheduled to the virtual-kubelet node but fails to run on the node. What do I do?
- In a scenario in which ACK and VNodes are used, kube-proxy and CoreDNS are scheduled to VNodes and fail to be started. What do I do?
Networking
Logging
- Why can't logs be collected for my elastic container instance?
Monitoring
- How does Prometheus obtain the monitoring metrics of Elastic Container Instance pods when the pods are connected to an ACK cluster by using virtual nodes?

How do I create a GPU-accelerated elastic container instance?

You can specify GPU-accelerated Elastic Compute Service (ECS) instance types to create GPU-accelerated elastic container instances. For more information, see Create a GPU-accelerated elastic container instance.

How do I query the ID of an elastic container instance?

In Kubernetes clusters, each pod corresponds to one elastic container instance. You can use one of the following methods to query the ID of an elastic container instance:

Method 1: Run a kubectl command
Run the kubectl describe pod command to view the pod details and then view the ID of the elastic container instance in the Annotations section of the pod details.
The value of the k8s.aliyun.com/eci-instance-id annotation is the ID of the elastic container instance. The ID is in the eci-xxxx format.
Method 2: Use the Elastic Container Instance console
On the Container Group page of the Elastic Container Instance console, query the elastic container instance based on the pod name and then view the instance ID.
The ID of the container group is the ID of the instance. The ID is in the eci-xxxx format.
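
As a sketch of Method 1, the following shell helper reads the k8s.aliyun.com/eci-instance-id annotation directly instead of scanning the full kubectl describe pod output. The function name eci_instance_id and the pod name my-eci-pod are illustrative placeholders, and the helper assumes that kubectl is already configured for the cluster.

```shell
# Hypothetical helper: print the elastic container instance ID of a pod.
# Usage: eci_instance_id <pod-name> [namespace]
eci_instance_id() {
  pod="$1"
  ns="${2:-default}"
  # Dots inside the annotation key must be escaped in a jsonpath expression.
  kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{.metadata.annotations.k8s\.aliyun\.com/eci-instance-id}'
}
```

For example, eci_instance_id my-eci-pod prints the annotation value, which is expected to be in the eci-xxxx format. If the pod has no such annotation, the command prints nothing.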

Why is the creation speed of Elastic Container Instance pods still slow when I use the image cache feature?

Problem description

On a standard node, a pod can be created within 2 or 3 seconds. However, an elastic container instance that is created by using an image cache takes more than 10 seconds to start.

Explanation

This situation is normal. When you request to create a pod on a standard node, the system does not apply for resources but directly creates containers on the node. When you request to create an elastic container instance, the system first applies for the required resources. If you specify multiple zones, the system tries the specified zones one by one to find a zone where available resources are sufficient to create the instance.

Creation takes longer if the system has to retry in different zones because of insufficient resources. To prevent this problem, when you specify multiple zones, we recommend that you list a zone that has sufficient available resources first.

What do I do if the pod remains in the Pending state after I create an Elastic Container Instance pod?

Problem description

A pod remains in the Pending state for several hours after it is created. The pod event list shows that the connection to the API server times out when a volume is mounted.

Solution

This issue is caused by a poor connection between the instance and the API server. You can perform the following operations to troubleshoot the issue:

Check whether the pod and the API server of the cluster are deployed in the same virtual private cloud (VPC).
If you have configured access control for the server load balancer (SLB) instance of the cluster, make sure that the CIDR block of the pod is added to the access control list (ACL).
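
Before you check the network settings, it can help to confirm the timeout in the pod event list. The following is a minimal sketch that assumes kubectl is configured for the cluster. The function name pod_events and the pod name my-eci-pod are illustrative placeholders.

```shell
# Hypothetical helper: list the events recorded for a pod, sorted by time.
# Usage: pod_events <pod-name> [namespace]
pod_events() {
  pod="$1"
  ns="${2:-default}"
  kubectl get events -n "$ns" \
    --field-selector "involvedObject.name=$pod" \
    --sort-by=.lastTimestamp
}
```

For example, run pod_events my-eci-pod and look for volume mount events that report a timeout when connecting to the API server.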

A pod is scheduled to the virtual-kubelet node but fails to run on the node. What do I do?

Problem description

In a scenario in which ACK and VNodes are used, a pod may be scheduled to the virtual-kubelet node without any events being generated for it. In this case, you must query the logs of the virtual-kubelet node and troubleshoot the issue based on the logs.

Note

If an event is generated, you can troubleshoot the issue based on the event.

Solution

1. On the Clusters page of the ACK console, find the cluster and choose More > Open Cloud Shell in the Actions column.
2. Run the following command to obtain the name of the pod that is created by the Virtual Kubelet component:
```
kubectl -n kube-system get pods
```
3. Run the following command to obtain the logs of the pod. Replace ack-virtual-node-controller-xxxxxxxxxx with the pod name obtained in Step 2.
```
kubectl -n kube-system logs ack-virtual-node-controller-xxxxxxxxxx
```
4. Troubleshoot the issue based on the latest error messages in the logs. Alternatively, submit a ticket and provide the request ID and error messages to Alibaba Cloud technical support.
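
The two kubectl commands above can be combined into one step. The following sketch assumes that the controller pod names start with ack-virtual-node-controller, as in the example above. The function name vk_logs and the default of 100 lines are illustrative choices.

```shell
# Hypothetical helper: print the most recent log lines of every
# Virtual Kubelet controller pod in the kube-system namespace.
# Usage: vk_logs [number-of-lines]
vk_logs() {
  lines="${1:-100}"
  # "-o name" prints entries such as pod/ack-virtual-node-controller-xxxxxxxxxx.
  kubectl -n kube-system get pods -o name \
    | grep 'ack-virtual-node-controller' \
    | while read -r pod; do
        kubectl -n kube-system logs --tail="$lines" "$pod"
      done
}
```

For example, vk_logs 200 prints the last 200 log lines of each matching pod, which is usually enough to find the latest error messages.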

In a scenario in which ACK and VNodes are used, kube-proxy and CoreDNS are scheduled to VNodes and fail to be started. What do I do?

Kubernetes ignores taints when it schedules kube-proxy and CoreDNS, so these components may be scheduled to VNodes. To solve this issue, add the following node affinity to the pod specifications in the YAML files of kube-proxy and CoreDNS. The rule keeps the pods off nodes whose type label is set to virtual-kubelet:

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: type
          operator: NotIn
          values:
          - virtual-kubelet
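
To check whether system pods are currently placed on VNodes, and to verify the result after you apply the affinity rule, you can list the kube-system pods per virtual node. This is a minimal sketch that assumes VNodes carry the type=virtual-kubelet label that the affinity rule matches on. The function name pods_on_vnodes is illustrative.

```shell
# Hypothetical helper: list kube-system pods that run on virtual nodes.
pods_on_vnodes() {
  # "-o name" prints entries such as node/<node-name>; strip the prefix.
  kubectl get nodes -l type=virtual-kubelet -o name \
    | sed 's|^node/||' \
    | while read -r node; do
        kubectl -n kube-system get pods -o wide \
          --field-selector "spec.nodeName=$node"
      done
}
```

If kube-proxy or CoreDNS pods still appear in the output after the change, the affinity rule was likely not added to the pod template of the corresponding workload.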

How do I modify ClusterDomain of an Elastic Container Instance pod?

To modify ClusterDomain of Elastic Container Instance pods that are created by the Virtual Kubelet component, add the CLUSTER_DOMAIN environment variable to the container in the Deployment of the Virtual Kubelet component. We recommend that you submit a ticket to contact Alibaba Cloud technical support for assistance.
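
One way to add the environment variable is the kubectl set env command. The following is only a sketch: the Deployment name ack-virtual-node-controller is an assumption inferred from the pod name shown earlier in this topic, and example.local is a placeholder domain. Confirm the actual Deployment name in the kube-system namespace before you run anything like this.

```shell
# Hypothetical helper: set CLUSTER_DOMAIN on a Deployment in kube-system.
# Usage: set_vk_cluster_domain <deployment-name> <cluster-domain>
set_vk_cluster_domain() {
  deploy="$1"
  domain="$2"
  kubectl -n kube-system set env "deployment/$deploy" "CLUSTER_DOMAIN=$domain"
}
```

For example, set_vk_cluster_domain ack-virtual-node-controller example.local. Changing the pod template of a Deployment triggers a rolling restart of its pods.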

Why does the authentication configured in the ingress controller of an ACK Serverless cluster not take effect?

Problem description

The nginx.ingress.kubernetes.io/auth-url annotation is set in nginx-ingress but does not take effect.

Explanation

In ACK Serverless clusters, ingress controllers provide load balancing capabilities based on SLB instances and do not support URL authentication.

ACK clusters support URL authentication.

After a cluster is upgraded, the service IP address cannot be pinged. What do I do?

Before October 2020, each service IP address was assigned to a virtual network interface controller and could be pinged. Starting from October 2020, service IP addresses exist only in IP Virtual Server (IPVS) rules, a change that was made to optimize performance under high concurrency. As a result, service IP addresses can no longer be pinged: IPVS forwards requests based on IP addresses and port numbers and cannot forward ping packets.
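
Because IPVS forwards only traffic that matches an IP address and a port, a TCP connection attempt is a more meaningful connectivity check than ping. The following is a minimal sketch that assumes bash and the coreutils timeout command are available in the environment where you run it. The function name check_tcp and the 3-second limit are illustrative choices.

```shell
# Hypothetical helper: succeed if a TCP connection to host:port can be opened.
# Usage: check_tcp <host> <port>
check_tcp() {
  host="$1"
  port="$2"
  # bash opens a TCP connection when a redirection targets /dev/tcp/<host>/<port>.
  # Inside the inner script, $0 is the host and $1 is the port.
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

For example, check_tcp <service-ip> <service-port> && echo reachable, with the IP address and port of your own service substituted.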

Why can't logs be collected for my elastic container instance?

If you have set the aliyun_logs_{Logstore name} environment variable of Simple Log Service in a pod but no elastic container instance logs are found in Simple Log Service, the issue may be due to one of the following causes:

Short runtime of the elastic container instance
If the application container runs to completion within 20 seconds after the elastic container instance is started, the container may exit and the log-related volume may be unmounted before logs are collected. As a result, Simple Log Service fails to collect logs.
Collection path error
The first time you specify the environment variable for a pod to collect logs, Elastic Container Instance automatically creates a Logstore and a collection path in Simple Log Service. Other pods that use the same Logstore must use this path. If they use a different path, Simple Log Service cannot collect their logs. To use a different path, also specify a different Logstore name. Elastic Container Instance then automatically creates a new Logstore in Simple Log Service.
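
To compare the collection paths that different pods use, you can print the aliyun_logs_ environment variables that are set on a pod. This is a minimal sketch that assumes kubectl is configured for the cluster. The function name log_env_of_pod and the pod name my-eci-pod are illustrative placeholders.

```shell
# Hypothetical helper: print the aliyun_logs_* variables of a pod as NAME=VALUE.
# Usage: log_env_of_pod <pod-name> [namespace]
log_env_of_pod() {
  pod="$1"
  ns="${2:-default}"
  # Emit one NAME=VALUE line per environment variable of every container,
  # then keep only the Simple Log Service variables.
  kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{range .spec.containers[*].env[*]}{.name}={.value}{"\n"}{end}' \
    | grep '^aliyun_logs_'
}
```

Run log_env_of_pod my-eci-pod for each pod that writes to the same Logstore. If two pods show the same variable name with different values, the paths conflict.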

How does Prometheus obtain the monitoring metrics of Elastic Container Instance pods when the pods are connected to an ACK cluster by using virtual nodes?

Virtual nodes are compatible with real nodes. Prometheus (ARMS Prometheus or self-managed open source Prometheus) automatically obtains the basic monitoring metrics of Elastic Container Instance pods that are deployed on virtual nodes. You do not need to make additional configurations.

For information about how to deploy Prometheus in an ACK cluster, see Enable Prometheus Service or Use Prometheus to monitor an ACK cluster.