Container Service for Kubernetes: ack-node-problem-detector

Last Updated: Jan 12, 2024

The ack-node-problem-detector component is an optimized and enhanced version of the open source Node Problem Detector (NPD) provided by the Kubernetes community. It monitors nodes, detects node anomalies in a Container Service for Kubernetes (ACK) cluster, and supports the event center feature. You can also integrate third-party or custom monitoring plug-ins with ack-node-problem-detector to enhance node monitoring and detect more node anomalies. This topic introduces ack-node-problem-detector and provides usage notes and release notes for the component.

Introduction

  • The DaemonSet of ack-node-problem-detector detects node anomalies. For more information about open source node-problem-detector, see node-problem-detector.

  • If you specify a sink parameter when the event center feature is enabled, ack-node-problem-detector-eventer is configured for the ack-node-problem-detector component. The ack-node-problem-detector-eventer component is used to monitor events of the cluster and report the events to the event center. For more information about kube-eventer, see kube-eventer.
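The custom monitoring plug-ins mentioned above can be illustrated with a minimal sketch. It assumes the exit-code convention of the open source NPD custom plugin monitor (0 for OK, 1 for a detected problem, 2 for unknown) and a short status message on standard output. The function names, the 80% threshold, and the file-descriptor check itself are hypothetical choices for illustration. They are not the plug-ins shipped with ack-node-problem-detector, and open source NPD plug-ins are usually shell scripts or binaries rather than Python.

```python
from typing import Tuple

# Exit codes assumed from the open source NPD custom plugin monitor.
OK, NON_OK, UNKNOWN = 0, 1, 2


def check_open_fds(used: int, limit: int, threshold: float = 0.8) -> Tuple[int, str]:
    """Return (exit_code, message) for a file-descriptor usage check.

    The threshold is a hypothetical default, not a value from ACK.
    """
    if limit <= 0:
        return UNKNOWN, "fd limit unavailable"
    ratio = used / limit
    if ratio >= threshold:
        return NON_OK, f"fd usage {ratio:.0%} exceeds {threshold:.0%}"
    return OK, f"fd usage {ratio:.0%} is normal"


def read_fd_usage(path: str = "/proc/sys/fs/file-nr") -> Tuple[int, int]:
    """Read allocated and maximum file handles from the Linux proc file."""
    with open(path) as f:
        allocated, _unused, maximum = f.read().split()
    return int(allocated), int(maximum)


def main() -> int:
    """Run the check, print the status message, and return the exit code."""
    try:
        used, limit = read_fd_usage()
        code, message = check_open_fds(used, limit)
    except (OSError, ValueError):
        code, message = UNKNOWN, "cannot read fd usage"
    print(message)
    return code

# When used as a standalone script, finish with: raise SystemExit(main())
```

Keeping the decision logic in `check_open_fds` separate from the file read makes the threshold behavior easy to test without access to a node.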

Usage notes

For more information about how to install ack-node-problem-detector, its usage notes, and its new features, see Event monitoring.

Release notes

December 2023

Version number: v1.2.18

Image address:

  • ack-node-problem-detector: registry-vpc.__ACK_REGION_ID__.aliyuncs.com/acs/node-problem-detector:v0.8.13-003ac31-aliyun

  • kube-eventer: registry-vpc.__ACK_REGION_ID__.aliyuncs.com/acs/kube-eventer:v1.2.8-27a468a-aliyun

  • kube-event-init: registry-vpc.__ACK_REGION_ID__.aliyuncs.com/acs/kube-eventer-init:v1.7-48a2acc-aliyun

Release date: 2023-12-18

Description:

  • The following issue is fixed: cached kernel logs cause the system to report pod OOMKilling errors.

  • Custom component configurations can be inherited when you update ack-node-problem-detector.
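Several image addresses in these release notes contain the `__ACK_REGION_ID__` placeholder. The sketch below assumes that the placeholder stands for the region ID of the cluster and shows how a concrete address could be derived from it. The helper name `resolve_image` and the example region `cn-hangzhou` are illustrative assumptions, not part of the component.

```python
PLACEHOLDER = "__ACK_REGION_ID__"


def resolve_image(address: str, region_id: str) -> str:
    """Replace the region placeholder in an image address with a region ID."""
    if not region_id:
        raise ValueError("region_id must not be empty")
    return address.replace(PLACEHOLDER, region_id)


# Example with an address taken from the v1.2.18 entry and an assumed region.
image = resolve_image(
    "registry-vpc.__ACK_REGION_ID__.aliyuncs.com/acs/node-problem-detector:v0.8.13-003ac31-aliyun",
    "cn-hangzhou",
)
print(image)
```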

August 2023

Version number: v1.2.17

Image address:

  • ack-node-problem-detector: registry-vpc.__ACK_REGION_ID__.aliyuncs.com/acs/node-problem-detector:v0.8.12-bf8aff8-aliyun

  • kube-eventer: registry-vpc.__ACK_REGION_ID__.aliyuncs.com/acs/kube-eventer:v1.2.8-27a468a-aliyun

  • kube-event-init: registry-vpc.__ACK_REGION_ID__.aliyuncs.com/acs/kube-eventer-init:v1.7-48a2acc-aliyun

Release date: 2023-08-24

Description:

  • The parameters of ack-node-problem-detector can be modified from the Add-ons page of the ACK console.

  • Labels, such as cluster names, can be sent together with log data to Simple Log Service. These labels are displayed in Simple Log Service data in the event center of ACK by default.

June 2023

Version number: v1.2.16

Image address:

  • ack-node-problem-detector: registry-vpc.__ACK_REGION_ID__.aliyuncs.com/acs/node-problem-detector:v0.8.12-bf8aff8-aliyun

  • kube-eventer: registry-vpc.__ACK_REGION_ID__.aliyuncs.com/acs/kube-eventer:v1.2.8-019546c-aliyun

  • kube-event-init: registry-vpc.__ACK_REGION_ID__.aliyuncs.com/acs/kube-eventer-init:v1.7-48a2acc-aliyun

Release date: 2023-06-27

Description: The resource specifications of ack-node-problem-detector can be configured from the Add-ons page of the ACK console.

Version number: v1.2.15

Image address:

  • ack-node-problem-detector: registry-vpc.__ACK_REGION_ID__.aliyuncs.com/acs/node-problem-detector:v0.8.12-bf8aff8-aliyun

  • kube-eventer: registry-vpc.__ACK_REGION_ID__.aliyuncs.com/acs/kube-eventer:v1.2.8-019546c-aliyun

  • kube-event-init: registry-vpc.__ACK_REGION_ID__.aliyuncs.com/acs/kube-eventer-init:v1.7-48a2acc-aliyun

Release date: 2023-06-06

Description: The following issue is fixed: ack-node-problem-detector degrades the performance of the API server and etcd when OOMKilling errors occur frequently in a large number of pods.

February 2023

Version number: v1.2.14

Image address:

  • ack-node-problem-detector: registry.aliyuncs.com/acs/node-problem-detector:v0.8.11-edc7907-aliyun

  • kube-eventer: registry-vpc.__ACK_REGION_ID__.aliyuncs.com/acs/kube-eventer:v1.2.6-bbf76f7-aliyun

  • kube-event-init: registry.{ .Values.controller.regionId }.aliyuncs.com/acs/kube-eventer-init:v1.7-48a2acc-aliyun

Release date: 2023-02-03

Description:

  • Image pulling is accelerated.

  • migrate-controller is supported by ACK Edge clusters.

September 2022

Version number: v1.2.11

Image address:

  • ack-node-problem-detector: registry.aliyuncs.com/acs/node-problem-detector:v0.8.11-edc7907-aliyun

  • kube-eventer: registry-vpc.__ACK_REGION_ID__.aliyuncs.com/acs/kube-eventer:v1.2.6-bbf76f7-aliyun

  • kube-event-init: registry.{ .Values.controller.regionId }.aliyuncs.com/acs/kube-eventer-init:v1.7-48a2acc-aliyun

Release date: 2022-09-30

Description:

  • The inspection logic of ack-node-problem-detector is optimized. This reduces the load on key components in ACK clusters.

  • Image security hardening is supported.

February 2022

Version number: v1.2.9

Image address:

  • ack-node-problem-detector: registry.aliyuncs.com/acs/node-problem-detector:v0.8.10-e0ff7d2

  • kube-eventer: registry-vpc.__ACK_REGION_ID__.aliyuncs.com/acs/kube-eventer-amd64:v1.2.6-f0efecf-aliyun

  • kube-event-init: registry.{ .Values.controller.regionId }.aliyuncs.com/acs/kube-eventer-init:v1.6-a92aba6-aliyun

Release date: 2022-02-22

Description:

  • Kernel inspection is supported.

  • Security is enhanced.

January 2022

Version number: v1.2.8

Image address:

  • ack-node-problem-detector: registry.aliyuncs.com/acs/node-problem-detector:v0.8.10-e0ff7d2

  • kube-eventer: registry-vpc.__ACK_REGION_ID__.aliyuncs.com/acs/kube-eventer-amd64:v1.2.5-cc7ec54-aliyun

  • kube-event-init: registry.{ .Values.controller.regionId }.aliyuncs.com/acs/kube-eventer-init:v1.6-a92aba6-aliyun

Release date: 2022-01-20

Description:

  • Different containerd modes are supported.

  • The quality of service (QoS) limits on the resources of ack-node-problem-detector are optimized to improve stability.

November 2021

Version number: v1.2.7

Image address:

  • ack-node-problem-detector: registry.aliyuncs.com/acs/node-problem-detector:v0.8.10-e0ff7d2

  • kube-eventer: registry-vpc.__ACK_REGION_ID__.aliyuncs.com/acs/kube-eventer-amd64:v1.2.5-cc7ec54-aliyun

  • kube-event-init: registry.{ .Values.controller.regionId }.aliyuncs.com/acs/kube-eventer-init:v1.6-a92aba6-aliyun

Release date: 2021-11-25

Description:

  • This version is compatible with Alibaba Cloud Linux 3 and CentOS 8.

  • ARM architecture environments are supported.

April 2021

Version number: v1.2.5

Image address:

  • ack-node-problem-detector: registry.aliyuncs.com/acs/node-problem-detector:v0.6.3-28-160499f

  • kube-eventer: registry-vpc.__ACK_REGION_ID__.aliyuncs.com/acs/kube-eventer-amd64:v1.2.4-0f5aaee-aliyun

  • kube-event-init: registry.{ .Values.controller.regionId }.aliyuncs.com/acs/kube-eventer-init:1.5-5e0e7c1-aliyun

Release date: 2021-04-25

Description:

  • The following issue is fixed: kube-event-init in the kube-system namespace returns the "414 Request Too Large" error when the event center feature is enabled.

  • The eventer list-watch mechanism is optimized. This prevents etcd from receiving excessive requests. For more information, see eventer list-watch.

  • The following issue is fixed: kube-eventer fails to parse the timestamps of some system events. For more information, see fix FailedScheduling event write to sls with wrong timestamp.

July 2020

Version number: v0.6.3-28-160499f

Image address: registry.aliyuncs.com/acs/node-problem-detector:v0.6.3-28-160499f

Release date: 2020-07-27

Description:

  • The following information can be added to OOMKilling events: the name of the relevant pod, the namespace to which the pod belongs, and the user IDs (UIDs) of the killed processes.

  • The efficiency of the check_fd plug-in is improved.

  • Node events are optimized to report that the process ID (PID) usage of cluster nodes exceeds the threshold.

  • Plug-ins that detect network connections are upgraded.

  • Alert plug-ins are added to send alerts when the inode usage of the system disks of cluster nodes exceeds the threshold.
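The inode usage check described above can be sketched as follows. This is only an illustration of how such a threshold check might be computed on a Linux node with the standard `os.statvfs` call. The function names and the 80% threshold are assumptions, and the actual alert plug-in in ack-node-problem-detector may work differently.

```python
import os


def inode_usage(path: str = "/") -> float:
    """Return the fraction of inodes in use on the filesystem that holds path."""
    st = os.statvfs(path)
    if st.f_files == 0:
        # Some filesystems do not report inode counts.
        return 0.0
    return (st.f_files - st.f_ffree) / st.f_files


def exceeds_threshold(usage: float, threshold: float = 0.8) -> bool:
    """Return True when usage reaches the (hypothetical) alert threshold."""
    return usage >= threshold


usage = inode_usage("/")
print(f"inode usage: {usage:.1%}, alert: {exceeds_threshold(usage)}")
```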