This topic describes the best practices for performing different operations on a Kubernetes cluster.
Create a Kubernetes cluster
This section describes only the best practices for creating a Kubernetes cluster.
For production environments, we recommend that you use Container Service for Kubernetes (ACK) Pro managed clusters.
Compared with ACK standard clusters, ACK Pro clusters provide better resource isolation and service level agreements (SLAs). For more information about ACK Pro clusters, see Overview of ACK Pro clusters.
For a production environment, use the Terway network plug-in when you create a Kubernetes cluster.
If you use the Terway network plug-in, you can use a dedicated elastic network interface (ENI) for pods to achieve optimal network performance. For more information about network plug-ins, see Create an ACK Pro cluster.
For a production environment, we recommend that you select the Alibaba Cloud Linux operating system when you create a Kubernetes cluster.
Alibaba Cloud Linux is the default operating system for ACK. It supports security hardening based on classified protection, and it provides better performance and long-term support. For more information about the operating system, see Use Alibaba Cloud Linux 2.
For production environments, we recommend that you use advanced security groups.
Compared with basic security groups, advanced security groups can support a larger node capacity. For more information about security groups, see Overview.
If you use a basic security group and nodes or pods fail to scale out when you scale out an application, check whether the node capacity is limited by the security group specifications.
Do not modify the security group rules that ACK and Enterprise Distributed Application Service (EDAS) add.
Security groups are normally managed by the ACK cluster, which completes the initial configuration of rules, such as those for the Classless Inter-Domain Routing (CIDR) blocks of Elastic Compute Service (ECS) instances and pods. If you modify or delete a security group rule, issues such as the following may occur: the cluster cannot be imported to EDAS, pods cannot be scheduled, pods cannot communicate with each other, and the kubectl logs command fails.
If a security group rule has already been modified, make sure that the CIDR block of the cluster nodes is added to the security group rule. For more information about how to view the security group to which a cluster belongs, see View cluster resources.
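The check described above can be sketched in Python with the standard ipaddress module. This is a minimal illustration, not an ACK or EDAS tool: the function name node_cidr_is_allowed and the sample CIDR blocks are hypothetical, and the list of rule CIDR blocks is assumed to have been copied by hand from the security group in the console.

```python
import ipaddress


def node_cidr_is_allowed(node_cidr, rule_cidrs):
    """Return True if node_cidr lies entirely within at least one rule CIDR.

    node_cidr and rule_cidrs are hypothetical inputs: the node CIDR block of
    the cluster and the CIDR blocks authorized by the security group rules.
    """
    node_net = ipaddress.ip_network(node_cidr, strict=False)
    for cidr in rule_cidrs:
        rule_net = ipaddress.ip_network(cidr, strict=False)
        # subnet_of() raises TypeError for mixed IPv4/IPv6, so compare versions first.
        if node_net.version == rule_net.version and node_net.subnet_of(rule_net):
            return True
    return False


if __name__ == "__main__":
    # Example values only; replace them with the CIDR blocks of your cluster.
    rules = ["10.0.0.0/8", "192.168.0.0/16"]
    print(node_cidr_is_allowed("192.168.1.0/24", rules))
    print(node_cidr_is_allowed("172.16.0.0/16", rules))
```

If the function returns False for the node CIDR block of your cluster, the block is not covered by any of the listed rules and must be added to the security group.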
If you use an Ingress to expose services externally, configure an appropriate number of Ingress controller pod replicas for the cluster.
The Ingress controller carries external access traffic. If the number of replicas is insufficient, the overall throughput of your services is reduced. You can view the monitoring metrics of the Ingress controller in the Prometheus monitoring view to check the load, and then set an appropriate number of pod replicas. For more information about how to view the monitoring metrics of an Ingress controller, see Analyze and monitor the access log of nginx-ingress-controller.
Enable CoreDNS and configure an appropriate number of pod replicas for CoreDNS.
An appropriate ratio of CoreDNS pods to nodes in a cluster improves the performance of service discovery for the cluster. We recommend that you set the ratio to 1:8. For more information about how to configure CoreDNS pods, see Best practices for DNS services.
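As a worked example of the 1:8 ratio, the following Python sketch computes a replica count from the number of nodes. The function name and the min_replicas floor of 2 are assumptions added for illustration; this topic states only the ratio.

```python
import math


def recommended_coredns_replicas(node_count, nodes_per_pod=8, min_replicas=2):
    """Return a CoreDNS replica count based on a 1:8 pod-to-node ratio.

    min_replicas is an assumption (a floor for availability), not a value
    taken from this topic.
    """
    if node_count < 0:
        raise ValueError("node_count must not be negative")
    # Round up so that each CoreDNS pod serves at most nodes_per_pod nodes.
    return max(min_replicas, math.ceil(node_count / nodes_per_pod))


if __name__ == "__main__":
    for nodes in (3, 40, 41):
        print(nodes, recommended_coredns_replicas(nodes))
```

With these assumptions, a cluster of 40 nodes yields 5 replicas and a cluster of 41 nodes yields 6, because the result is rounded up.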
Monitor a Kubernetes cluster
This section describes only Prometheus monitoring, node monitoring, and cluster log collection. For more information about cluster monitoring, see Observability system overview.
View Prometheus monitoring metrics and configure alerts for nodes and workloads.
Enable CloudMonitor for nodes and configure alerts on the resource usage of ECS instances.
Enable the cluster log collection feature so that you can analyze logs in Simple Log Service (SLS).
Delete a Kubernetes cluster
To delete a Kubernetes cluster, perform the following operations in sequence: delete the applications that are deployed in the cluster in the EDAS console, cancel the import of the cluster, and then delete the cluster in ACK.
If you do not perform the operations in this sequence, residual resources may remain. As a result, virtual private clouds (VPCs) may fail to be deleted.
For more information, see the following topics: