This topic describes the frequently asked questions about deploying a VNode in a self-managed Kubernetes cluster to use elastic container instances.
- FAQ about networks
- FAQ about image pulling
- FAQ about pod scheduling
- FAQ about storage
- FAQ about logs and monitoring
How do cloud services access the IP addresses of on-premises pods?
If you use Express Connect circuits to connect your cloud and on-premises networks, the cloud and on-premises services can learn routing rules from each other by using Border Gateway Protocol (BGP). Then, the on-premises equipment can broadcast the IP addresses of pods to the cloud service by using BGP. As a result, the cloud service can access the IP addresses of the on-premises pods. For more information, see Configure BGP.
How do on-premises services access the IP addresses of cloud pods?
If you use Express Connect circuits to connect your cloud and on-premises networks, the cloud and on-premises services can learn routing rules from each other by using BGP. You can deploy a cloud controller manager (CCM) to automatically synchronize the IP addresses of cloud pods to the virtual private cloud (VPC) route table. For more information about a CCM, see Cloud Controller Manager.
After you deploy a CCM in a self-managed or an on-premises cluster, the CCM synchronizes the route IP addresses of the Kubernetes pods to the VPC route table. When you deploy the CCM, take note of the following items:
- Change the providerID value of the Kubernetes cluster nodes to the <region-id>.<instance-id> format, for example, cn-shanghai.i-ankb8zjh2nzchfxxxxxxx.
- Make sure that all pod IP addresses on each node are within the pod CIDR block of that node. For example, set the IPAM type in the Calico configuration file to host-local. This setting obtains the podCIDR field of each Kubernetes cluster node from the Kubernetes API, which ensures that all pod IP addresses on a node are within the pod CIDR block of that node.
You can check the pod CIDR blocks in the spec data of the nodes:

```yaml
spec:
  podCIDR: 172.23.XX.0/26
  podCIDRs:
  - 172.23.XX.0/26
  providerID: cn-shanghai.i-ankb8zjh2nzchfxxxxxxx
```
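As a sketch of the Calico setting mentioned above, a CNI configuration that uses host-local IPAM with the node's podCIDR might look like the following. The network name and kubeconfig path are illustrative, and usePodCidr tells the host-local plugin to allocate addresses from the podCIDR that the Kubernetes API reports for the node:

```json
{
  "name": "k8s-pod-network",
  "cniVersion": "0.3.1",
  "plugins": [
    {
      "type": "calico",
      "datastore_type": "kubernetes",
      "ipam": {
        "type": "host-local",
        "subnet": "usePodCidr"
      },
      "policy": { "type": "k8s" },
      "kubernetes": { "kubeconfig": "/etc/cni/net.d/calico-kubeconfig" }
    }
  ]
}
```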
What do I do if an internal network domain name cannot be resolved?
Cloud and on-premises services cannot call each other because the internal network domain names of the services cannot be resolved. Typical resolution failures include:
- Cloud services cannot resolve the internal network domain names of on-premises networks.
- On-premises services cannot resolve cloud PrivateZone domain names.
On-premises solutions and Alibaba Cloud VPC are deployed in different network environments. If cloud and on-premises services can communicate with each other only after the internal network domain names are resolved by using Alibaba Cloud DNS, you can configure Alibaba Cloud DNS PrivateZone to resolve the internal network domain names. For more information, see Use Alibaba Cloud DNS PrivateZone and VPN Gateway to allow ECS instances in a VPC to access an on-premises DNS.
Why can't on-premises services access cloud services?
On-premises services cannot use leased lines to access Alibaba Cloud services such as ApsaraDB RDS, Object Storage Service (OSS), and Log Service.
You can use one of the following solutions. We recommend that you use Solution 1.
- Solution 1: Configure the domain name of the cloud service on the cloud. Then, the virtual border router (VBR) advertises the route to the on-premises network over BGP. For more information, see Access cloud services.
- Solution 2: Add a static route to the on-premises network that routes 100.64.0.0/10 to the leased line.
Why am I unable to pull images from a self-managed container image repository?
When you pull images from a self-managed container image repository, an error is reported.
This problem occurs because the image repository uses a self-signed certificate that cannot be verified. As a result, the certificate-based authentication fails when you pull images. When you create a pod, you can add an annotation to skip the certificate-based authentication.
For example, if the address of an NGINX image in the private image repository is test.example.com/test/nginx:alpine, you can add the "k8s.aliyun.com/insecure-registry": "test.example.com" annotation to skip certificate-based authentication.
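A minimal pod spec that applies this annotation for the example repository address might look like the following (the pod and container names are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-insecure-registry
  annotations:
    # Skip certificate-based authentication for the self-managed repository
    "k8s.aliyun.com/insecure-registry": "test.example.com"
spec:
  containers:
  - name: nginx
    image: test.example.com/test/nginx:alpine
```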
How do I schedule pods to a VNode?
You can use one of the following methods to schedule pods to a VNode based on your business requirements. The pods then run on the elastic container instances that are deployed through the VNode.
- Configure the nodeSelector and tolerations parameters, or specify the nodeName parameter, to schedule pods to the VNode. For more information, see Schedule pods to a VNode.
- Use the ECI Profile feature to customize selectors. Elastic Container Instance then automatically schedules pods that match the selectors to the VNode. For more information, see Configure an ECI Profile to orchestrate pods.
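As a sketch of the first method, assuming the VNode carries the label type=virtual-kubelet (consistent with the node affinity example elsewhere in this topic) and a matching taint whose key may differ in your cluster, a pod spec could look like this:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-on-vnode
spec:
  containers:
  - name: nginx
    image: nginx:latest
  # Select the virtual node by its label
  nodeSelector:
    type: virtual-kubelet
  # Tolerate the taint that keeps ordinary pods off the VNode;
  # the taint key below is an assumption and may differ in your cluster
  tolerations:
  - key: virtual-kubelet.io/provider
    operator: Exists
```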
Why do DaemonSet pods remain in the Pending state after they are scheduled to a VNode?
DaemonSet pods cannot run on elastic container instances. To prevent DaemonSet pods from being scheduled to a VNode, add the following node affinity configuration to the DaemonSet:

```yaml
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: type
            operator: NotIn
            values:
            - virtual-kubelet
```
Why does the scheduling fail when I attempt to schedule pods to a VNode by configuring pod labels?
This problem occurs because the version of your Kubernetes cluster is earlier than v1.16. Scheduling pods to a VNode by configuring pod labels is supported only in Kubernetes v1.16 and later.
What do I do if the mount of a NAS volume times out?
After you mount a network attached storage (NAS) file system, Kubernetes recursively runs the chmod and chown commands on the files in the NAS directory based on the permissions and ownership specified in the pods. If the NAS directory contains a large number of files, and you configure the permissions and ownership of the files in the security context when you create the pod, the mount of a NAS volume times out.
When you configure the security context, set fsGroupChangePolicy to OnRootMismatch. This way, the system does not run the chmod and chown commands when the permissions and ownership of the root directory in the NAS file system match the permissions and ownership specified in the pods. For more information, see Configure a Security Context for a Pod or Container.
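A minimal sketch of such a security context follows; the pod, volume, and claim names are hypothetical, and the persistent volume claim is assumed to be bound to the NAS file system:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nas-app
spec:
  securityContext:
    fsGroup: 1000
    # Skip the recursive chmod/chown when the ownership and permissions
    # of the volume root directory already match fsGroup
    fsGroupChangePolicy: "OnRootMismatch"
  containers:
  - name: app
    image: nginx:latest
    volumeMounts:
    - name: nas-volume
      mountPath: /data
  volumes:
  - name: nas-volume
    persistentVolumeClaim:
      claimName: nas-pvc   # hypothetical PVC bound to the NAS file system
```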
How do I collect logs to Log Service?
You can install the Logtail component alibaba-log-controller in your self-managed Kubernetes cluster to collect logs to Log Service. When you install the Logtail component, the system automatically performs the following operations:
- Creates a custom resource definition (CRD) named aliyunlogconfigs.
- Deploys the alibaba-log-controller workload.
- Installs Logtail in DaemonSet mode.
For more information, see Install Logtail components.
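After the components are installed, log collection is configured through the aliyunlogconfigs CRD. The following is an illustrative sketch that collects container stdout; the configuration and Logstore names are hypothetical, and the exact schema may vary with the component version:

```yaml
apiVersion: log.alibabacloud.com/v1alpha1
kind: AliyunLogConfig
metadata:
  name: demo-stdout-config
spec:
  # Hypothetical Logstore to write to
  logstore: demo-logstore
  logtailConfig:
    inputType: plugin
    configName: demo-stdout-config
    inputDetail:
      plugin:
        inputs:
        - type: service_docker_stdout
          detail:
            Stdout: true
            Stderr: true
```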
If your cluster is of an early version, such as v1.13, download a CRD of an early version from alibaba-cloud-log-0.1.1 and deploy the CRD. If you have other questions, submit a ticket.
What do I do if the metrics-server reports a 404 error?
Metrics-server v0.5.x and earlier call the kubelet /stats/summary API to collect metrics, which VNodes support. Metrics-server v0.6.0 and later fetch node metrics from the /metrics/resource endpoint instead of /stats/summary, and VNodes do not support the /metrics/resource endpoint. If a 404 error occurs, use metrics-server v0.5.x or earlier.
The following code provides the boot parameters of the metrics-server:

```yaml
- --cert-dir=/tmp
- --secure-port=4443
- --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
- --kubelet-use-node-status-port
- --kubelet-insecure-tls
```