Container Service for Kubernetes (ACK) strictly follows the Certified Kubernetes Conformance Program. This topic describes the main changes for ACK in the Kubernetes 1.28 release, including upgrade notes, major changes, features, deprecated features and APIs, and feature gates.
Component versions
The following table lists the supported versions of core components in ACK clusters.
Core component | Version number |
Kubernetes | 1.28.15-aliyun.1, 1.28.9-aliyun.1, and 1.28.3-aliyun.1 |
etcd | v3.5.9 |
CoreDNS | v1.9.3.10-7dfca203-aliyun |
CRI | containerd 1.6.20 |
CSI | Upgrade to the latest supported version of the component. For more information, see the component changelogs for csi-plugin and csi-provisioner. |
CNI | Flannel v0.15.1.22-20a397e6-aliyun; Terway and TerwayControlplane v1.5.0 and later |
NVIDIA Container Runtime | v3.13.0 |
Ingress Controller | v1.8.0-aliyun.1 |
Upgrade notes
Component | Notes |
CephFS and Ceph RBD storage volume plugins | If your cluster uses the CephFS or Ceph RBD volume plugins, confirm that they no longer depend on the in-tree drivers provided by Kubernetes and have switched to the out-of-tree CSI drivers. Evaluate the risks related to compatibility, stability, and performance before you upgrade. |
Concepts
Understand the following concepts before you read about the feature changes and deprecated resources in this Kubernetes version.
Major changes
In Kubernetes v1.28, the scheduler's logic is optimized to reduce invalid retries, which improves its overall performance.
If your cluster uses a custom scheduler plug-in, we recommend that you optimize and update the plug-in to improve scheduler performance. For more information, see Scheduling framework changes.
For CSI migration, the Kubernetes community has been working to replace in-tree storage plug-ins with out-of-tree drivers that implement the standard CSI interface. This migration reached GA in Kubernetes v1.25. In Kubernetes v1.27, the storage.k8s.io/v1beta1 API and the EBS plug-in were removed. In Kubernetes v1.28, the code for the CephFS volume plug-in was removed in favor of the CephFS CSI driver, and kubernetes.io/rbd was deprecated. In addition, support for migrating Ceph RBD volumes to the out-of-tree CSI driver is deprecated in Kubernetes 1.28.
Version 1.28.15-aliyun.1 fixes CVE-2024-10220.
The following CVEs were fixed in version 1.28.9-aliyun.1:
CVE-2023-45288
CVE-2024-24786
Features
In Kubernetes 1.27
The pod termination status is corrected. Pods deleted in the Pending state are set to Failed. Pods deleted in the Running state are set to Succeeded or Failed, depending on the container exit status. This correction fixes an issue where a pod might remain in the Pending state when a pod with a configured failure policy is deleted.
However, if a pod is configured with RestartPolicy=Always, it may terminate with a Succeeded status after being deleted. Therefore, you may need to modify your controllers. For more information, see Set the termination status for pods that do not require a restart.
The ReadWriteOncePod feature for persistent volumes (PVs) has reached Beta. This feature limits volume access to a single pod. For more information, see Single Pod Access Mode for PersistentVolumes Graduates to Beta.
Pod topology spread constraints control how pods are spread across multiple zones in a cluster. Several enhancements have reached Beta: specifying the minimum number of domains (minDomains), considering node taints (nodeTaintsPolicy) and node affinity (nodeAffinityPolicy) when the spread is calculated, and limiting the calculation to pods of the same revision during rolling updates (matchLabelKeys). For more information, see More fine-grained pod topology spread policies.
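The effect of minDomains can be illustrated with a small calculation. The following is a minimal sketch, not the scheduler's implementation: it assumes that skew is the number of matching pods in the candidate domain after placement minus the global minimum across eligible domains, and that the global minimum is treated as 0 when fewer domains exist than minDomains requires. The function names and zone names are hypothetical.

```python
def skew_if_placed(pods_per_domain, candidate, min_domains=None):
    """Skew that would result from placing one more pod in `candidate`.

    pods_per_domain maps each eligible topology domain (for example, a zone)
    to its current number of matching pods.
    """
    global_min = min(pods_per_domain.values())
    # Assumption: with fewer eligible domains than minDomains,
    # the global minimum is treated as 0.
    if min_domains is not None and len(pods_per_domain) < min_domains:
        global_min = 0
    return pods_per_domain[candidate] + 1 - global_min


def fits(pods_per_domain, candidate, max_skew, min_domains=None):
    """True if the placement keeps the skew within max_skew."""
    return skew_if_placed(pods_per_domain, candidate, min_domains) <= max_skew


zones = {"zone-a": 2, "zone-b": 2}
print(fits(zones, "zone-a", max_skew=1))                 # True
print(fits(zones, "zone-a", max_skew=1, min_domains=3))  # False
```

In this sketch, setting min_domains to 3 while only two zones exist blocks further placement, which is how the setting can keep pods pending until a node in a third zone becomes available.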
The server-side field validation feature for validating resources sent to the API server has reached GA. kubectl skips client-side validation, automatically uses server-side field validation in Strict mode, and reports an error if the validation fails. For more information, see Server Side Field Validation and OpenAPI V3 move to GA.
OpenAPI V3 is a new OpenAPI standard. OpenAPI V3 was introduced in Kubernetes 1.23 and has reached GA in Kubernetes 1.27. For more information, see Server Side Field Validation and OpenAPI V3 move to GA.
Horizontal Pod Autoscaler (HPA) lets you configure ContainerResource for containers in a Pod to enable auto-scaling based on the resource usage of each container. This feature reached Beta in Kubernetes 1.27. Unlike the original Resource type that considers the average resource usage of an entire Pod, this approach evaluates the resource usage of each container. This solves the issue where a Pod fails to scale out because it contains a sidecar container with low resource usage and an application container with high resource usage, causing the calculated average to remain below the scale-out threshold.
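The sidecar problem described above can be worked through with numbers. The following is a simplified sketch rather than the HPA controller's actual code: it assumes utilization is usage divided by requests, expressed as an integer percentage, and that the desired replica count is the ceiling of currentReplicas × currentUtilization / targetUtilization. Tolerance, pod readiness, and stabilization windows are ignored, and the container names and values are made up for illustration.

```python
def utilization_percent(usage_millicores, request_millicores):
    """CPU utilization as an integer percentage of the request."""
    return usage_millicores * 100 // request_millicores


def desired_replicas(current_replicas, current_util, target_util):
    """ceil(current_replicas * current_util / target_util) in integer math."""
    return (current_replicas * current_util + target_util - 1) // target_util


# Hypothetical pod: a busy application container and an idle sidecar.
containers = {
    "app": {"usage": 900, "request": 1000},
    "sidecar": {"usage": 50, "request": 1000},
}

# Resource metric: usage and requests are summed over the whole pod.
pod_util = utilization_percent(
    sum(c["usage"] for c in containers.values()),
    sum(c["request"] for c in containers.values()),
)
# ContainerResource metric: only the named container is considered.
app_util = utilization_percent(
    containers["app"]["usage"], containers["app"]["request"]
)

print(pod_util, desired_replicas(4, pod_util, 60))  # 47 4
print(app_util, desired_replicas(4, app_util, 60))  # 90 6
```

With a 60% target and 4 replicas, the pod-level metric sees 47% utilization and keeps 4 replicas, while the container-level metric sees 90% on the application container and scales out to 6.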
Multiple StatefulSet features reached Beta. This includes support for starting pod ordinals from a non-zero number and support for automatically deleting created PVCs during specified deletions and scale-ins.
A new feature lets you resize the CPU and memory resources specified in the resources field for a pod's containers without restarting the pod or its containers. A node allocates resources to a pod based on requests and limits its resource usage based on limits. New fields are added to pods to support this feature. For more information, see Resize CPU and Memory Resources assigned to Containers. This feature has reached Alpha in Kubernetes 1.27 and is disabled by default.
You can set the serializeImagePulls field of the kubelet to false to enable parallel image pulls instead of the default serial image pulls. The maxParallelImagePulls field is added in v1.27 to limit the number of images that can be pulled in parallel. This prevents image pulls from consuming excessive network bandwidth or disk I/O.
In addition to the Volume Snapshot API, a crash-consistent volume group snapshot API was introduced in Kubernetes 1.27 that lets you create snapshots for multiple PVs at a point in time. For more information, see Introducing an API for Volume Group Snapshots.
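The relationship between serial pulls, parallel pulls, and the maxParallelImagePulls cap mentioned above can be pictured as a bounded worker pool. The following is an analogy only, not the kubelet's implementation: a semaphore limits how many simulated pulls run at once, and the function reports the peak concurrency that was observed. The image names and the sleep that stands in for a download are hypothetical.

```python
import threading
import time


def pull_images(images, max_parallel):
    """Simulate image pulls with at most max_parallel running at once.

    Returns the peak number of pulls that were in flight simultaneously.
    """
    slots = threading.Semaphore(max_parallel)
    lock = threading.Lock()
    state = {"in_flight": 0, "peak": 0}

    def pull(image):
        with slots:  # blocks while max_parallel pulls are already running
            with lock:
                state["in_flight"] += 1
                state["peak"] = max(state["peak"], state["in_flight"])
            time.sleep(0.01)  # stand-in for the real download
            with lock:
                state["in_flight"] -= 1

    threads = [threading.Thread(target=pull, args=(img,)) for img in images]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]


images = [f"registry.example.com/app:{i}" for i in range(6)]
print(pull_images(images, max_parallel=1))       # 1, equivalent to serial pulls
print(pull_images(images, max_parallel=3) <= 3)  # True
```

A limit of 1 behaves like the default serial mode, and a larger limit trades faster pod startup for more concurrent network and disk load.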
In Kubernetes 1.28
Non-graceful node shutdown has reached GA. This feature allows a StatefulSet to create pods with the same name on another node when the original node is shut down unexpectedly, such as due to a power failure, which helps avoid service interruptions.
The NodeOutOfServiceVolumeDetach feature gate is now GA. It allows immediate volume detachment for pods terminated on an abnormal node. This helps pods recover quickly on other nodes.
The Retroactive default StorageClass assignment feature has reached GA. Previously, if you created a PVC without a storageClassName when no default StorageClass existed, the PVC would remain in the Pending state indefinitely. Now, when a default StorageClass is created, any PVC without a storageClassName is automatically updated to use the default StorageClass.
Two new features are introduced for handling Job failures.
The JobPodReplacementPolicy (Alpha feature gate) lets you ensure that a pod is replaced only when it reaches the Failed phase (status.phase: Failed), not when it has a deletionTimestamp and is still terminating. This prevents two pods from simultaneously occupying the same index and node resources.
The JobBackoffLimitPerIndex (Alpha feature gate) lets you configure .spec.backoffLimitPerIndex to limit the number of failure retries for individual indexes of an Indexed Job. This prevents the entire Job from failing when a single index persistently fails and reaches the .spec.backoffLimit limit.
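The difference between the Job-wide backoff limit and the per-index limit can be sketched as two counting rules. The following is a simplified model, not the Job controller's logic: it assumes that the Job fails once the total number of pod failures exceeds backoffLimit, and that with a per-index limit only the indexes whose own failure count exceeds backoffLimitPerIndex are marked failed. Other controls, such as pod failure policies, are ignored, and the function names are hypothetical.

```python
from collections import Counter


def job_fails_globally(failures, backoff_limit):
    """failures lists the completion index of each failed pod, in order."""
    return len(failures) > backoff_limit


def failed_indexes(failures, backoff_limit_per_index):
    """Indexes whose own failure count exceeds the per-index limit."""
    per_index = Counter(failures)
    return sorted(i for i, n in per_index.items() if n > backoff_limit_per_index)


# Hypothetical run: index 3 keeps failing, all other indexes are healthy.
failures = [3, 3, 3, 3]

print(job_fails_globally(failures, backoff_limit=3))      # True
print(failed_indexes(failures, backoff_limit_per_index=1))  # [3]
```

Under the Job-wide rule, one bad index exhausts the shared budget and fails the whole Job. Under the per-index rule, only index 3 is marked failed and the remaining indexes can still run to completion.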
If the completions count of an Indexed Job is set to more than 100,000, its parallelism is set to more than 10,000, and many pods fail, you may be unable to track the Job's termination status. To prevent this issue, warnings are displayed if you set these fields to excessively large values when you create a Job.
The reason and fieldPath fields are added to CustomResourceDefinition (CRD) validation rules to return a specified reason and field path when validation fails. For more information, see CRD Validation Expression Language.
Admission webhooks can now use Common Expression Language (CEL) expressions as matching conditions for requests. Up to 64 matching conditions are supported. For more information, see Matching requests: matchConditions.
The .status.resizeStatus field of a PVC is replaced with the .status.allocatedResourceStatuses map field, which indicates the states of resources being resized for the PVC. For more information, see PersistentVolumeClaimStatus.
Pods created by Indexed Jobs and StatefulSets now have the pod index (ordinal number) added to their labels.
ValidatingAdmissionPolicy (in Beta) provides a declarative way to validate resource requests. This serves as an alternative to deploying validating admission webhooks and lets you use CEL expressions to write complex validation rules. The API server validates resource requests against the CEL expressions.
Kube Controller Manager introduces the --concurrent-cron-job-syncs flag to configure the concurrency of the CronJob controller and the --concurrent-job-syncs flag to configure the concurrency of the Job controller. For more information, see --concurrent-cron-job-syncs and --concurrent-job-syncs.
API Server optimizations include the following:
The memory usage of retrieving a list (GetList) from the cache is reduced. For more information, see GetList test data.
Fixed an issue where the endpoint of a Kubernetes Service was not removed when only one API Server replica remained. This ensures that the endpoint is removed promptly during a graceful shutdown.
The OpenAPI v2 controller is set to lazily aggregate CRD information, and the OpenAPI v2 specifications are significantly reduced. When no client sends requests to the OpenAPI v2, the CPU and memory usage of the API server is reduced. In addition, the efficiency of installing large numbers of CRDs is improved. However, this slows down the processing of first-time requests. We recommend that you update your client to a version that supports OpenAPI v3.
The Consistent Reads from Cache feature gate (ConsistentListFromCache) is introduced. It lets the API server serve LIST requests from the watch cache while still guaranteeing consistent reads.
More monitoring metrics are available and can be accessed through the metrics endpoint.
Deprecated features
In Kubernetes 1.27
The in-tree AWS EBS storage plug-in is replaced with the out-of-tree CSI plug-in. For more information, see cloud-provider-aws.
The Node spec.externalID field is deprecated. Warnings are returned if clients send requests to update this field. For more information about how to return warnings to clients, see Helpful Warnings Ahead.
Seccomp (Secure Computing Mode) became GA in Kubernetes v1.19. It improves workload security by restricting the system calls that a pod or container can execute. The Alpha-stage seccomp.security.alpha.kubernetes.io/pod and container.seccomp.security.alpha.kubernetes.io annotations were deprecated in v1.19 and completely removed in v1.27.
We recommend that you use the securityContext.seccompProfile field for pods or containers.
The Kube Controller Manager (KCM) removes the startup flags --pod-eviction-timeout (the grace period for pod eviction from a NotReady node) and --enable-taint-manager (taint-based eviction, enabled by default).
The --container-runtime, --container-runtime-endpoint, and --image-service-endpoint startup flags are removed from kubelet. The --container-runtime flag, whose default value remained remote after the removal of dockershim, was deprecated in v1.24 and removed in v1.27. The --container-runtime-endpoint and --image-service-endpoint flags are no longer supported as startup flags. You must configure these settings in the kubelet configuration file instead.
The SecurityContextDeny admission controller is deprecated and will be removed in future versions.
In Kubernetes 1.28
The in-tree CephFS volume plugin code has been removed.
We recommend that you use the CephFS CSI driver instead.
Support for migrating Ceph RBD volumes to the out-of-tree CSI storage driver plugin is deprecated and will be completely removed in a future version.
Complete the migration before the in-tree code is removed.
The RBD volume plugin (kubernetes.io/rbd) is deprecated and will be removed in a future version.
We recommend that you use the Ceph CSI driver for RBD instead.
Key Management Service (KMS) v1 is deprecated. If you want to continue to use KMSv1, set --feature-gates=KMSv1=true. For more information, see Mark KMS v1beta1 as deprecated with no further fixes.
We recommend that you use KMSv2.
The Kubernetes Controller Manager (KCM) has deprecated the startup flags --volume-host-cidr-denylist and --volume-host-allow-local-loopback.
The --azure-container-registry-config flag in kubelet is deprecated.
We recommend that you use the --image-credential-provider-config and --image-credential-provider-bin-dir flags.
Creating Windows node pools is no longer supported.
You can create node pools that use other operating systems, such as Alibaba Cloud Linux 3 and ContainerOS 3.1. For more information, see Create and manage a node pool.
Deprecated APIs
The CSIStorageCapacity API exposes the available storage capacity to ensure that Pods are scheduled to nodes with sufficient storage capacity. The storage.k8s.io/v1beta1 API version of CSIStorageCapacity was deprecated in v1.24 and removed in v1.27.
We recommend that you use storage.k8s.io/v1. This API is available in Kubernetes v1.24 and later versions. For more information, see Storage Capacity Constraints for Pod Scheduling KEP.
Feature gates
This section lists only the major changes. For more information, see Feature Gates.
In Kubernetes 1.27
The NodeLogQuery feature gate is added in the Alpha stage. After you set enableSystemLogHandler and enableSystemLogQuery to true for the kubelet, you can use kubectl to query node logs.
The StatefulSetStartOrdinal feature gate has reached Beta. This feature gate lets StatefulSets number their pods starting from an ordinal other than zero. By default, this feature gate is enabled.
The StatefulSetAutoDeletePVC feature gate has reached Beta. The new policy controls whether and when StatefulSets delete PVCs created from volumeClaimTemplates.
IPv6DualStack was enabled by default after reaching GA in v1.23 and was completely removed from all component code in v1.27. If you have manually configured this feature gate in your cluster, you must remove the configuration before you upgrade the cluster.
The ServiceNodePortStaticSubrange feature gate is added in the Alpha stage to reduce conflicts in assigning ports to NodePort Services. This feature gate divides the port range for NodePort Services into two bands. Dynamic port assignment uses the high band. The low band, which has a lower risk of port conflicts, can be used to statically assign ports to NodePort Services. For more information, see Avoid Collisions Assigning Ports to NodePort Services.
The InPlacePodVerticalScaling Alpha feature gate is added to allow you to adjust the CPU and memory resources of a pod without restarting the pod or containers.
The following feature gates for expanding volumes have reached GA and are enabled by default: ExpandCSIVolumes (expands CSI volumes), ExpandInUsePersistentVolumes (expands PVs that are in use), and ExpandPersistentVolumes (expands PVs).
The CSIMigration feature gate, which migrates in-tree storage plugins to out-of-tree CSI drivers, is always enabled by default and has been removed.
CSIInlineVolume, a feature gate for inline volumes, has reached GA in Kubernetes 1.25 and is always enabled by default. This feature gate is removed in Kubernetes 1.27.
The EphemeralContainers feature has reached GA in v1.25, is always enabled by default, and its feature gate has been removed.
The LocalStorageCapacityIsolation feature gate provides support for ephemeral storage capacity isolation of emptyDir volumes. This lets you set a hard limit on a pod's local storage usage. If the usage exceeds the limit, the pod is evicted by the kubelet. This feature gate has reached GA in Kubernetes 1.25 and is always enabled by default. The feature gate is removed in Kubernetes 1.27.
NetworkPolicyEndPort is a feature gate that lets you set the endPort field in network policies to specify multiple ports. Before this feature gate was introduced, you could specify only one port. This feature gate has reached GA in Kubernetes 1.25 and is always enabled by default. The feature gate is removed in Kubernetes 1.27.
The StatefulSetMinReadySeconds feature gate lets you configure minReadySeconds for StatefulSets. This feature gate has reached GA in Kubernetes 1.25 and is always enabled by default. The feature gate is removed in Kubernetes 1.27.
The DaemonSetUpdateSurge feature gate lets you configure maxSurge for DaemonSets. This feature gate reached GA in v1.25 and is always enabled by default. The feature gate has been removed.
The IdentifyPodOS feature gate lets you specify an operating system for pods. It reached GA in v1.25 and is always enabled by default. The feature gate has since been removed.
The ReadWriteOncePod feature gate has reached Beta and is enabled by default. This feature gate lets you access PVs in ReadWriteOncePod mode.
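The two bands created by ServiceNodePortStaticSubrange can be computed for a given port range. The following is a minimal sketch that assumes the band offset formula min(max(16, size / 32), 128) as recalled from the upstream Kubernetes documentation; verify it against the current documentation before relying on it. Very small port ranges are not handled, and the function name is hypothetical.

```python
def nodeport_bands(range_start, range_end):
    """Split an inclusive NodePort range into (static_band, dynamic_band).

    Assumed formula: offset = min(max(16, size // 32), 128).
    The low band is preferred for static assignment,
    and the high band is used first for dynamic assignment.
    """
    size = range_end - range_start + 1
    offset = min(max(16, size // 32), 128)
    static_band = (range_start, range_start + offset - 1)
    dynamic_band = (range_start + offset, range_end)
    return static_band, dynamic_band


# Default NodePort range 30000-32767.
print(nodeport_bands(30000, 32767))
# ((30000, 30085), (30086, 32767))
```

For the default range, this sketch reserves the lowest 86 ports as the band for static assignment, so a manually chosen port such as 30080 is unlikely to collide with a dynamically assigned one.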
In Kubernetes 1.28
The NodeOutOfServiceVolumeDetach feature gate has reached GA in Kubernetes 1.28 and is always enabled by default. When the node.kubernetes.io/out-of-service taint is added to mark a node as out-of-service, pods that do not tolerate this taint are forcefully deleted, and their volumes are immediately detached.
The AdmissionWebhookMatchConditions feature gate is enabled by default and lets you use CEL expressions as webhook matching conditions.
The UnknownVersionInteroperabilityProxy feature gate is added in the Alpha stage. With this feature gate, requests are proxied to the correct API server when multiple API server versions exist. For more information, see Mixed Version Proxy.
The IPTablesOwnershipCleanup feature gate has reached GA. The KUBE-MARK-DROP and KUBE-MARK-MASQ iptables chains are no longer created.
The ConsistentListFromCache feature gate is added in the Alpha stage. This feature gate allows the API server to use the watch cache to serve LIST requests, which guarantees consistent reads.
The ProbeTerminationGracePeriod feature gate has reached GA and is enabled by default. This feature gate lets you use probe-level terminationGracePeriodSeconds.
The following feature gates in the GA stage are removed: DelegateFSGroupToCSIDriver, DevicePlugins, KubeletCredentialProviders, MixedProtocolLBService, ServiceInternalTrafficPolicy, ServiceIPStaticSubrange, and EndpointSliceTerminatingCondition.
References
For the complete changelogs for Kubernetes 1.27 and 1.28, see CHANGELOG-1.27 and CHANGELOG-1.28.