Metric collection jobs can be created for cluster events. Cluster events can be displayed in the Kubernetes Deployment dashboard.
Self-monitoring metrics can be instrumented based on the service-level agreement (SLA) to stabilize the dashboard data. SLA stability data can be displayed in a self-monitoring dashboard.
ServiceMonitor supports the BasicAuth authentication method. Secrets must be in the same namespace as ServiceMonitor.
Metrics Metadata capabilities are provided to display the description of specific metrics.
The Agent Chart version can be passed to the server. Then, the server initializes or updates the dashboard based on the version.
Remote write self-monitoring metrics are supported to calculate the time consumed to send data in each batch.
Metrics about the errors and latency of basic metric collection are supported.
Metrics about the errors and latency of business metric collection are supported.
The queue_config parameter in remote write settings now uses the following default values: min_shards=10, max_samples_per_send=5000, and capacity=10000. This improves adaptability to large-scale clusters.
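As a rough illustration of how such defaults might combine with user-supplied settings, the sketch below overlays overrides on the three values listed above. The helper name effective_queue_config is hypothetical, and the agent's actual merge logic is not described in this note. Only the parameter names and default values come from the note.

```python
# Default values stated in this release note for the remote write queue_config.
DEFAULT_QUEUE_CONFIG = {
    "min_shards": 10,
    "max_samples_per_send": 5000,
    "capacity": 10000,
}


def effective_queue_config(overrides=None):
    """Return the defaults with any user-supplied overrides applied.

    Hypothetical helper for illustration; not the agent's real code.
    """
    config = dict(DEFAULT_QUEUE_CONFIG)  # copy so the defaults stay untouched
    config.update(overrides or {})
    return config
```

With no overrides the function returns the defaults unchanged. Passing {"capacity": 20000} changes only the capacity field.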
The service discovery methods, especially the PV settings of Container Storage Interface (CSI) data collection, are optimized.
The senderLoop distribution frequency is optimized, and the syncWorkersSeries frequency is modified to reduce unnecessary disturbances.
Some logs are simplified. Detailed information, such as the time consumed for trace capturing, can be displayed in some logs.
The collection period and collection timeout of basic metric collection jobs are now configured separately and no longer inherit the global configurations. This reduces unnecessary interference with basic metric data collection.
The interaction logic in Master-Worker multi-replica mode is optimized so that Masters and Workers no longer affect each other. This improves stability.
The policy that specifies how the Master distributes Targets is optimized. This reduces CPU utilization by about 30% and memory usage by 40%, and improves data collection performance.
The processing of metrics_relabel is optimized. This reduces CPU utilization by 70%.
The multi-tenancy listening logic of Informer is optimized. This reduces CPU utilization by 20% in multi-tenancy scenarios.
Cached IP addresses are used automatically if CoreDNS fails to resolve domain names in real time. This improves the success rate of data transmission.
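The fallback behavior can be pictured with a minimal sketch. It assumes the cache simply remembers the last successfully resolved address per hostname. The function name resolve_with_fallback and the cache layout are hypothetical and do not reflect the agent's implementation.

```python
import socket

# Hypothetical cache: hostname -> most recently resolved IP address.
_last_good_ip = {}


def resolve_with_fallback(hostname, resolver=socket.gethostbyname):
    """Resolve hostname; on failure, fall back to the last cached IP."""
    try:
        ip = resolver(hostname)
    except OSError:
        # socket.gaierror (DNS failure) is a subclass of OSError.
        cached = _last_good_ip.get(hostname)
        if cached is None:
            raise  # nothing cached yet, so surface the original failure
        return cached
    _last_good_ip[hostname] = ip  # refresh the cache on every success
    return ip
```

The resolver argument is injectable so that the fallback path can be exercised without a real DNS outage.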
The distribution and collection configuration logic of SendConfig is optimized to improve configuration stability.
The Master prefetching policy is optimized to reduce the resource overhead of the Master and to improve its service discovery and target scheduling capabilities.
Adaptive control is implemented on data packets that exceed 1 MB in size in a single batch. This reduces data loss caused by backend restrictions.
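This note does not describe how the adaptive control works. One plausible reading, sketched below purely as an assumption, is that an oversized batch is split into smaller batches that each stay within the 1 MB limit. The name split_batch and the representation of samples as already-encoded byte strings are hypothetical.

```python
MAX_BATCH_BYTES = 1024 * 1024  # the 1 MB threshold mentioned in this note


def split_batch(samples, max_bytes=MAX_BATCH_BYTES):
    """Group encoded samples (bytes objects) into batches of at most max_bytes.

    Assumed strategy for illustration only; the agent may behave differently.
    """
    batches, current, current_size = [], [], 0
    for sample in samples:
        size = len(sample)
        if size > max_bytes:
            raise ValueError("a single sample exceeds the batch limit")
        # Start a new batch when adding this sample would cross the limit.
        if current and current_size + size > max_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(sample)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

Greedy packing keeps the sample order intact, which matters when a backend expects samples for a series to arrive in timestamp order.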
Some ScrapeLoop Targets are collected repeatedly. This issue is fixed.
In multi-tenancy scenarios, the Label caches of pods are not updated in a timely manner. As a result, duplicate timelines are generated. This issue is fixed.
Some targets are not collected after out-of-memory (OOM) errors or replica restarts. This issue is fixed.
Secret parsing issues and remote write Header transmission issues are fixed.
Occasionally, the Kubernetes-pods cannot be shut down. This issue is fixed.
The global default parameters and the external_labels parameter do not take effect. This issue is fixed, and these parameters can now be modified.