Node.js Performance Monitoring and Alerting

In this article, we introduce Node.js Performance Platform and show how to use it to monitor applications, raise alerts, and analyze abnormal metrics.

In some scenarios for Node.js troubleshooting, we may need to analyze error logs, disks, and core dump files to eventually locate problems. Node.js Performance Platform is a good choice and has already been used to monitor and troubleshoot almost all of Alibaba Group's online Node.js applications. You can deploy and use it in your production environment without any worries.

Alerting for online applications is essentially a self-discovery mechanism. In our production practice, most online problems can be solved by analyzing error logs, the CPU and memory usage of the Node.js process, core dump files, and disks, so alerting policies can be configured for these five areas. Fortunately, these alerts are already preset in the platform.

When you click Add Quick Rules in Node.js Performance Platform, threshold expression templates and alert description templates are generated automatically. You can modify them based on your project's monitoring requirements. For example, to monitor the heap memory of a Node.js process, select the Memory Alert option.

Click Add Alert to complete the heap memory alert configuration. At this point, click Notification Settings -> Add to Contacts List to add a contact for this rule.

The default rule in this example will send an SMS message to the contact bound to this alert rule when the heap memory allocated to the Node.js process exceeds 80% of the maximum heap memory. (The default maximum heap memory on a 64-bit machine is 1.4 GB.)
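The 80% rule above can be approximated inside a process with Node's built-in v8 module. The following is only a minimal sketch, not the platform's actual threshold expression: the function names and the choice of total_heap_size as the "allocated" figure are assumptions made for illustration.

```javascript
const v8 = require('v8');

// Default threshold mirrors the 80% rule described above (assumed value for this sketch).
const HEAP_ALERT_RATIO = 0.8;

// Returns the fraction of the V8 heap limit that is currently allocated.
// `stats` defaults to live data from v8.getHeapStatistics().
function heapUsageRatio(stats = v8.getHeapStatistics()) {
  return stats.total_heap_size / stats.heap_size_limit;
}

// Returns true when the allocated heap exceeds the given fraction of the limit.
function isHeapAboveThreshold(stats, threshold = HEAP_ALERT_RATIO) {
  return heapUsageRatio(stats) > threshold;
}

console.log(`heap usage: ${(heapUsageRatio() * 100).toFixed(1)}% of limit`);
console.log(`above threshold: ${isHeapAboveThreshold()}`);
```

Passing a plain object with total_heap_size and heap_size_limit fields makes the check easy to exercise without a live process.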

The quick rule list provides some common pre-configured alert policies. If these policies do not meet your needs, refer to the Alert Settings document to learn how to customize service alert policies. In addition to SMS notifications, you can use DingTalk Chatbot to push alert notifications to DingTalk groups so that a whole group is notified of the Node.js application status.
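The platform pushes these notifications for you, but it can help to see what a chatbot message might look like. The sketch below builds a text message body in the shape commonly used by DingTalk custom chatbot webhooks. The function names are hypothetical, the webhook URL must come from your own chatbot settings, and the payload shape should be checked against the current DingTalk documentation before use.

```javascript
// Builds the JSON body for a DingTalk chatbot text message.
// Assumption: the webhook accepts { msgtype, text, at } as sketched here.
function buildDingTalkText(content, atMobiles = []) {
  return {
    msgtype: 'text',
    text: { content },
    at: { atMobiles, isAtAll: false },
  };
}

// Posts the message to a webhook URL (hypothetical helper).
// Uses the global fetch available in recent Node.js versions.
async function sendDingTalkAlert(webhookUrl, content) {
  const res = await fetch(webhookUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildDingTalkText(content)),
  });
  return res.ok;
}

console.log(JSON.stringify(buildDingTalkText('Heap usage above 80% on app-1')));
```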

After you configure suitable alert rules by following the previous section, you can analyze each SMS alert you receive according to its type. Disk Monitoring is used as an example here.

Disk alerts are relatively easy to handle. In the quick rule list, the default disk monitoring rule issues an alert if the server disk usage exceeds 85%. When you receive a disk alert, connect to the server and use the following command to see which directories consume the most disk space:

sudo du -h --max-depth=1 /

After locating the directories and files that consume significant disk space, decide whether these files need to be backed up, and then delete them to free up disk space.
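When the directory list is long, ranking it by size can speed up the search. The sketch below is one possible approach, not part of the platform: it assumes you ran du with -k instead of -h (for example, sudo du -k --max-depth=1 /) so that sizes are plain kilobyte counts, and the function name is hypothetical.

```javascript
// Parses `du -k --max-depth=1 <dir>` output and returns the largest entries.
// Each input line is expected to look like "<size in KB><whitespace><path>".
function largestDirectories(duOutput, limit = 5) {
  return duOutput
    .split('\n')
    .map((line) => line.match(/^(\d+)\s+(.+)$/))
    .filter(Boolean) // drop blank or unparseable lines
    .map(([, kb, path]) => ({ path, kb: Number(kb) }))
    .sort((a, b) => b.kb - a.kb)
    .slice(0, limit);
}

// Sample input made up for illustration; real output comes from the du command.
const sample = '120\t/tmp\n5300000\t/var\n810000\t/home';
console.log(largestDirectories(sample, 2));
```

Note that du also prints a final line with the total for the directory you scanned, which will sort first if it is included in the input.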

For details on Error Logs, Process High CPU Usage, Memory Leaks, and Core Dump alerts, see Node.js Application Troubleshooting Manual - Node.js Performance Platform User Guide.

Related Blog Posts

Node.js Application Troubleshooting Manual - Outline and General Problem Metrics

This article gives a general explanation of how to troubleshoot online Node.js applications and locate the cause when they fail because of several common server problems.

If the error log check above shows nothing suspicious, we should consider whether the problem is caused by the server load or by the Node.js application itself reaching its limits. (The order of checking the error log and the system metrics in this section is not fixed, and you can choose which to do first based on your own needs.)

For Disk Usage, use the df command to observe the current disk usage. This is a very common problem, because many developers overlook monitoring alarms for the server disk. When log dumps, core dumps, and other large files gradually fill the disk to 100%, the Node.js application may fail to run normally. The Node.js Performance Platform also provides disk monitoring, which is explained in more detail in the second part of this manual.
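A df check can be scripted along the same lines as the 85% disk rule mentioned earlier. The following is a hedged sketch that assumes the POSIX output format of df -P (six whitespace-separated columns, with the usage percentage in the fifth) and filesystem names without spaces. The function name and sample data are hypothetical.

```javascript
// Parses `df -P` output and returns mount points whose usage exceeds the threshold.
function filesystemsOverThreshold(dfOutput, thresholdPercent = 85) {
  return dfOutput
    .trim()
    .split('\n')
    .slice(1) // skip the header row
    .map((line) => line.trim().split(/\s+/))
    .filter((cols) => cols.length >= 6)
    .map((cols) => ({
      mount: cols.slice(5).join(' '),
      usedPercent: parseInt(cols[4], 10), // "93%" parses to 93
    }))
    .filter((fs) => fs.usedPercent > thresholdPercent);
}

// Sample output made up for illustration.
const sampleDf = [
  'Filesystem     1024-blocks     Used Available Capacity Mounted on',
  '/dev/sda1         41152736 36000000   3000000      93% /',
  'tmpfs              8000000        0   8000000       0% /dev/shm',
].join('\n');

console.log(filesystemsOverThreshold(sampleDf));
```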

Best Practices for Working with Alibaba Cloud Function Compute

In this article, we discuss the best practices for one of the hottest trends in technology right now – serverless computing – using Alibaba Cloud Function Compute.

Serverless Framework is the most commonly used open-source framework for deploying serverless infrastructure, and it supports serverless functions written in Node.js, Python, Java, Go, C#, or Scala.

In Alibaba Cloud, users can take advantage of the CloudMonitor service, which provides a robust real-time cloud monitoring solution for all resources, including Function Compute. Users can monitor status metrics for Function Compute such as:

  1. Total Invocations
  2. Average Duration (millisecond)
  3. Function Errors
  4. Function Errors Rate (%)
  5. Max Memory Usage (MB)
  6. Billable Invocations
  7. Billable Invocations Rate (%)
  8. Throttles
  9. Throttles Rate (%)
  10. Client Errors
  11. Client Errors Rate (%)
  12. Server Errors
  13. Server Errors Rate (%)

Related Documentation


CloudMonitor provides end-to-end, out-of-the-box, and enterprise-class monitoring solutions for cloud users. CloudMonitor is able to monitor IT infrastructure, external network quality, events, custom metrics, and service logs, and provides you with efficient, comprehensive, and cost-effective monitoring services.

CloudMonitor provides a cloud service event monitoring feature, and more events are being added to it. When events trigger custom functions, custom processing of cloud resources can be performed automatically.

Node.js logs

By default, Node.js logs are printed to the console, which makes data collection and troubleshooting inconvenient. With Log4js, logs can be written to files and the log format can be customized, which is convenient for data collection and coordination.
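As a rough illustration of that idea, the sketch below builds a log4js configuration object with a file appender and a pattern layout. The helper name, file path, and pattern are assumptions chosen for this example, and the block itself does not load log4js, so it runs without the package installed.

```javascript
// Builds a log4js configuration that writes to a file with a custom layout.
// Pattern tokens: %d date, %p level, %c category, %m message.
function buildLog4jsConfig(filename) {
  return {
    appenders: {
      file: {
        type: 'file',
        filename,
        layout: { type: 'pattern', pattern: '%d [%p] %c - %m' },
      },
    },
    categories: {
      default: { appenders: ['file'], level: 'info' },
    },
  };
}

// Usage, assuming the log4js package is installed:
//   const log4js = require('log4js');
//   log4js.configure(buildLog4jsConfig('logs/app.log'));
//   log4js.getLogger('app').info('server started');
console.log(JSON.stringify(buildLog4jsConfig('logs/app.log'), null, 2));
```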

Related Products

CloudMonitor

CloudMonitor collects monitoring metrics for Alibaba Cloud resources as well as custom metrics. The service can be used to detect the availability of your service and lets you set alarms on specific metrics. CloudMonitor gives you a full view of cloud resource usage and of the status and health of your business, so that you can act promptly to keep your application available when an alarm is triggered.

Log Service

Log Service is a complete real-time data logging service that has been developed by Alibaba Group. Log Service supports collection, consumption, shipping, search, and analysis of logs, and improves the capacity of processing and analyzing large amounts of logs.

Alibaba Clouder
