Disclaimer: This article may contain information about third-party products. Such information is for reference only. Alibaba Cloud does not make any guarantee, express or implied, with respect to the performance and reliability of third-party products, as well as potential impacts of operations on the products.
This topic describes how to troubleshoot and solve the problem of high CPU usage in Windows instances.
Alibaba Cloud reminds you that:
- Before you perform operations that may cause risks, such as modifying instance configurations or data, we recommend that you check the disaster recovery and fault tolerance capabilities of the instances to ensure data security.
- If you modify the configurations and data of instances including but not limited to ECS and RDS instances, we recommend that you create snapshots or enable RDS log backup.
- If you have authorized or submitted security information such as the logon account and password in the Alibaba Cloud Management console, we recommend that you modify such information in a timely manner.
This article mainly describes the following steps.
- Locate the problem. Find the process that is causing the high CPU usage.
- Analyze and handle the problem. Troubleshoot the processes that cause the high CPU usage by category.
- For normal processes, optimize the program or upgrade the server configuration.
- For abnormal processes, terminate them manually or with third-party security tools.
- Operation example. This section walks through a specific troubleshooting process and its solution.
- More information. This section describes how to use the troubleshooting tool.
Microsoft provides a number of tools that can help locate the cause of high CPU usage, such as Task Manager, Resource Monitor, Performance Monitor, Process Explorer, Xperf (Windows Server 2008 and later), and KernRate (Windows Server 2003). You can also capture a full memory dump of the system for inspection. If traffic is heavy, you can use Wireshark to capture network packets for a period of time and analyze the traffic usage.
Tip: On Windows Server 2008 and later, the built-in Resource Monitor is usually used to monitor CPU usage.
- Click the Start menu at the bottom of the desktop and select Run.
- In the Run dialog box, enter perfmon -res and click OK.
- On the Resource Monitor page, check whether the CPU usage of each process is too high.
- For processes that consume a large amount of resources, view the corresponding process ID and process name.
- After you find the process ID, open Task Manager to determine whether the program is abnormal and to find where the program file is located.
- To show process IDs in Task Manager, click View > Select Columns.
- In the dialog box that appears, select PID and click OK.
- On the Processes tab of Task Manager, a PID column is added. Click PID to sort the processes, and locate the abnormal process that you found in Resource Monitor. Right-click the process name and choose Open File Location.
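The lookup described in the steps above can also be sketched in code. The following Python snippet is a minimal illustration and not part of any Alibaba Cloud or Microsoft tool. It assumes that you have already noted the PID, process name, and CPU percentage of each process, for example by reading them from Resource Monitor, and it lists the processes at or above a threshold, highest first. The sample readings, the name find_high_cpu_processes, and the 50% threshold are all hypothetical.

```python
from typing import List, NamedTuple


class ProcessSample(NamedTuple):
    pid: int            # process ID as shown in the PID column
    name: str           # image name of the process
    cpu_percent: float  # CPU usage in percent


def find_high_cpu_processes(
    samples: List[ProcessSample], threshold: float = 50.0
) -> List[ProcessSample]:
    """Return the samples at or above the threshold, highest CPU usage first."""
    busy = [s for s in samples if s.cpu_percent >= threshold]
    return sorted(busy, key=lambda s: s.cpu_percent, reverse=True)


# Hypothetical readings, used only to illustrate the function.
samples = [
    ProcessSample(4, "System", 1.5),
    ProcessSample(1320, "svchost.exe", 62.0),
    ProcessSample(2988, "unknown_tool.exe", 91.3),
    ProcessSample(3504, "w3wp.exe", 12.4),
]

for s in find_high_cpu_processes(samples):
    print(f"PID {s.pid}: {s.name} uses {s.cpu_percent:.1f}% CPU")
```

The PIDs that the function returns are the ones to look up in Task Manager. Whether a process is abnormal still has to be judged from its name and file location.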
Analysis and Processing
Handle high CPU usage
First determine whether the processes that cause the high CPU usage are normal or abnormal, and then handle them by category.
Analysis and handling of high CPU usage caused by normal processes
Typically, frequent access to your business applications, or Windows services such as Windows Update, can consume a large amount of network traffic and CPU resources. Troubleshoot the high CPU usage caused by normal processes in the following order.
Note: We recommend that you configure at least 2 GB of memory for Windows Server 2008 or Windows Server 2012 instances.
- Check whether Windows Update is running in the background.
- We recommend that you install anti-virus software on the server. If anti-virus software is installed, check whether it is running a scan in the background when the CPU usage is high. If possible, upgrade the anti-virus software to the latest version, or remove it.
- Check whether the applications in the ECS instance have excessive disk access, network access, or high computing requirements. We recommend that you upgrade the instance type to increase the number of cores or the memory size and relieve the resource bottleneck. For example, you can upgrade the instance.
- If the server configuration is already high, upgrading it again is of little use. From an architecture standpoint, a higher server configuration is not always better. In this case, separate the applications and optimize the related programs. An example follows:
Problem description: When multiple applications, such as MySQL, PHP, and web applications, are deployed on the same server, abnormal resource load is likely even when the configuration is high.
Solution: Use different servers to host different applications. For example, host the databases entirely on RDS to reduce resource consumption on the server and the large number of calls inside it. For program optimization, you can make adjustments according to your configuration, such as the number of connections, the cache configuration, and the parameters used in web and database calls.
Analysis and handling of high CPU usage caused by abnormal processes
Abnormally high CPU usage may be caused by viruses or Trojans. You need to manually detect and terminate the abnormal processes.
Note: If you cannot determine whether a process is a virus or Trojan, search for the process name on the Internet. In addition, we recommend that you create snapshots as a backup before you remove processes.
- Use commercial anti-virus software or the free Microsoft Safety Scanner to scan for viruses in safe mode. The tool link is as follows.
- Run Windows Update to install the latest Microsoft security patch.
- Run msconfig, disable all services and drivers that are not provided by Microsoft, and check whether the problem recurs. For more information, see configure Windows.
- If a server or site suffers a distributed denial of service (DDoS) attack or an HTTP flood attack, a large number of access requests are generated in a short period of time. Log on to the security center to check whether the DDoS protection thresholds have been adjusted. If the thresholds are set, check whether HTTP flood protection is enabled. If the attack does not trigger a threshold and Alibaba Cloud security does not perform traffic scrubbing, contact after-sales personnel to start traffic scrubbing.
High CPU usage may be caused by one of the following:
- Infection by viruses or Trojans.
- Third-party antivirus software that is running.
- Abnormal applications or drivers, high I/O usage, or applications with heavy interrupt handling.
Tip: On a Windows Server 2012 instance with 1 core and 1 GB of memory, the CPU usage can increase suddenly when the Windows Update service updates automatically. This is normal.
Note: This article references many links to Microsoft's official documents and tools. Microsoft owns the copyright to these materials. Take into account that they may become outdated as Microsoft Windows products are updated or if the documents are not updated in time.
- When CPU usage is high, check whether the Windows Update process is being executed in the background.
- Check whether anti-virus software is running a scan in the background. You can upgrade the anti-virus software to the latest version or remove it.
- Click Run, enter msconfig, disable all non-Microsoft services and drivers, and then check whether the problem occurs again. The relevant reference documents are as follows.
- Use commercial antivirus software or Microsoft Safety Scanner to scan for viruses in safe mode. The reference documents for Microsoft Safety Scanner are as follows.
- Run Windows Update to install the latest Microsoft security patches.
- When an ECS instance needs a large amount of disk access, network access, and high computing resources, CPU usage is high. You can upgrade the instance type to cope with insufficient resources. For instructions on how to upgrade instance specifications, see the following documents.
- For more solutions, see the following Microsoft document.
The following troubleshooting tools are recommended for Windows instances.
- Visually check the application list and locate applications that consume a large amount of CPU. The following is the Task Manager page.
- When checking CPU usage on the Performance tab, right-click the CPU usage graph and select Change graph to > Logical processors.
- When the CPU usage of a single process surges to nearly 100%, while the CPU usage of other processes does not change much, it may be caused by network I/O processing.
Process Explorer shows CPU usage visually and can search for processes by handle and module.
- Process Explorer is a Microsoft Sysinternals tool. With the correct symbols configured, you can check the call stack of the threads in an application to locate drivers that may be causing the problem. The link to download the Process Explorer tool is provided below.
- The following is the Process Explorer tool usage page.
- Performance Monitor is Microsoft's professional tool for collecting performance counters for each component. Multiple counters are available to check CPU resource consumption. To open Performance Monitor, click Start > Run and enter perfmon.
- Performance Monitor has the following three core counters.
\Processor(_Total)\% Processor Time
\Processor(*)\% User Time
\Processor(*)\% Privileged Time
\Processor(_Total)\% Processor Time is the sum of % User Time and % Privileged Time.
\Processor(*)\% Privileged Time is the time spent in the kernel handling system calls made by applications (such as driver calls, IRPs, and context switches). If the operating system spends 30% or more of its time in privileged time, as shown in the following figure, the instance is undergoing high I/O throughput operations.
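The relationship between these counters can be illustrated with a short calculation. The Python sketch below is only an illustration under stated assumptions: the counter readings are made-up numbers, the function names are hypothetical, and the 30% cutoff is the figure quoted above rather than a hard limit.

```python
PRIVILEGED_TIME_WARNING = 30.0  # percent, the figure quoted in this article


def total_processor_time(user_time: float, privileged_time: float) -> float:
    """% Processor Time expressed as % User Time plus % Privileged Time."""
    return user_time + privileged_time


def describe_cpu_load(user_time: float, privileged_time: float) -> str:
    """Return a rough hint on where to look next, based on the two counters."""
    if privileged_time >= PRIVILEGED_TIME_WARNING:
        return "high privileged time: check I/O throughput, drivers, and interrupts"
    if user_time > privileged_time:
        return "mostly user time: check which application is consuming CPU"
    return "no clear hint from these two counters"


# Hypothetical readings, in percent.
user, privileged = 20.0, 45.0
print(total_processor_time(user, privileged))
print(describe_cpu_load(user, privileged))
```

With these sample readings, privileged time dominates, so the hint points toward I/O and driver activity rather than application code.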
- When the value of % Privileged Time is high, check the interrupt time counters, such as % Interrupt Time. High interrupt time values indicate a large number of operations or poor performance of a device. For more information, see the following documents.
- If the interrupt time values are very large, use the Microsoft Xperf tool for further analysis. For more information, see the following documents.
- A high value of Context Switches/sec indicates that a large number of threads are in the Ready state. To resolve this issue, reduce the number of threads.
- If the Context Switches/sec value is very large, refer to the following documents.
\Processor(*)\% User Time is the time that the processor spends executing program code. You can use it to determine which application or function call consumes the most time.
- The following figure shows a high