If the Security Center agent goes offline unexpectedly, fails to be installed or uninstalled, or runs processes that cause high CPU utilization, you can use the agent troubleshooting feature of Security Center to diagnose the issue. This topic describes how to use the agent troubleshooting feature.

Background information

The troubleshooting results list the detected issues and suggestions on how to solve them. You can download diagnostic logs to verify and analyze the issues.

Limits

The agent troubleshooting feature is available for the servers that run the following versions of operating systems:
  • Windows Server 2008 and later
  • 64-bit Linux (versions later than CentOS 5)
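
As a rough pre-check before you use the feature, the following Python sketch tests a host against the limits above. The function name is_supported is hypothetical and is not part of Security Center. The sketch assumes that any machine string that contains "64" indicates a 64-bit system, does not verify the Linux distribution version, and compares only the Windows kernel major version (Windows Server 2008 reports 6.0), so it cannot tell server editions from desktop editions.

```python
import platform


def is_supported(system: str, machine: str, version: str = "") -> bool:
    """Heuristic check against the documented limits (hypothetical helper)."""
    if system == "Linux":
        # Only 64-bit Linux is listed. Assumption: a 64-bit machine string
        # contains "64" (for example x86_64). The distribution is not checked.
        return "64" in machine
    if system == "Windows":
        # Windows Server 2008 reports kernel version 6.0, so require major >= 6.
        major = version.split(".")[0]
        return major.isdigit() and int(major) >= 6
    # Any other operating system is not in the list of supported systems.
    return False


if __name__ == "__main__":
    supported = is_supported(
        platform.system(), platform.machine(), platform.version()
    )
    print("supported" if supported else "not supported")
```

Run the script on the server itself. A "not supported" result only means that the host falls outside the limits listed above.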

Scenarios

Troubleshoot issues for servers that are added to Security Center

  1. Log on to the Security Center console.
  2. In the left-side navigation pane, click Assets.
  3. On the Assets page, click the Server(s) tab.
  4. On the Server(s) tab, select one or more servers that you want to troubleshoot from the server list, and then click Client Troubleshooting below the server list.
  5. In the Troubleshoot Agent Issues dialog box, configure the Question type and Mode parameters.
    The following table describes the parameters.
    Parameter       Description
    Question type   The type of the issue that you want to troubleshoot. If you cannot identify the type, select Overall Check (Unknown Issues).
    Mode            The mode that you want to use to troubleshoot issues. Valid values:
                    • Standard: In this mode, logs of the Security Center agent are collected and then reported to Security Center for analysis. The time required for troubleshooting is approximately 1 minute.
                    • Strict: In this mode, the information about the Security Center agent is collected and then reported to Security Center for analysis. The information includes network conditions, processes, and logs. The time required for troubleshooting is approximately 5 minutes.
  6. Click Start Check.
    Note When you troubleshoot issues, the related diagnostic program collects information about the agent that is installed on the servers and reports the information to Security Center for analysis. The information includes the network conditions, the processes of the Security Center agent, and logs.
  7. In the Attention message, click OK. In the Task management panel, view all troubleshooting tasks.
    You can also click Client Mission Management in the upper-right corner of the Assets page to go to the Task management panel.
  8. Find the task whose details you want to view and click Details in the Operation column. The Execution log panel appears.
    The Execution log panel displays the details about the troubleshooting tasks for each server.
    The following table describes the parameters in the Execution log panel.
    Parameter             Description
    Start time/end time   The time when the troubleshooting task starts and ends.
    Server information    The name of the server on which the troubleshooting task is run.
    Status                The status of the troubleshooting task. Valid values:
                          • Success: The command that is used for troubleshooting is issued.
                          • Timeout: The command that is used for troubleshooting was issued some time ago, but no troubleshooting result has been returned.
                          • Failure: The troubleshooting result is generated.
    Problem               The issues that are found after the troubleshooting task is complete.
    Results               The solutions to the issues.
    Operation             The operation that you can perform on the diagnostic logs of the troubleshooting task. You can download the logs to verify and analyze the issues.
    If the solutions to the issues are provided in the Results column, you can follow the solutions to solve the issues. If no solutions are provided in the Results column, click Download Diagnostic Log in the Operation column to download the diagnostic logs. Then, report the downloaded logs and the ID of your Alibaba Cloud account to Alibaba Cloud engineers for verification and analysis.

Troubleshoot issues for servers that are not added to Security Center

If your servers are not added to Security Center, you can run commands on the servers based on the operating system of each server to troubleshoot issues.

  1. Log on to the server for which you want to troubleshoot issues.
    Note
    • You must log on to a Windows server as an administrator.
    • You must log on to a Linux server as a root user.
  2. Run the command on the server.
    The command that you use to troubleshoot issues varies based on the operating system of an Elastic Compute Service (ECS) instance or a server that is not deployed on Alibaba Cloud. The following table describes the commands.
    Server Operating system Mode Command
    ECS instance, Linux, Standard mode: Run the following command on the ECS instance as a root user:
    wget "http://update2.aegis.aliyun.com/download/aegis_client_self_check/linux64/aegis_checker.bin" && chmod +x aegis_checker.bin && ./aegis_checker.bin
    If no network connection is established between the ECS instance and Security Center, you must manually download the aegis_checker program and copy it to the ECS instance. Then, run the following commands on the instance:
    chmod +x aegis_checker.bin
    ./aegis_checker.bin
    Note In Standard mode, logs of the Security Center agent are collected and then reported to Security Center for analysis. The time required for troubleshooting is approximately 1 minute.
    ECS instance, Linux, Strict mode: Run the following command on the ECS instance as a root user:
    wget "http://update2.aegis.aliyun.com/download/aegis_client_self_check/linux64/aegis_checker.bin" && chmod +x aegis_checker.bin && ./aegis_checker.bin -b "ew0KICAgICJ1dWlkIjogIiIsDQogICAgImNtZF9pZHgiOiAiIiwNCiAgICAiaXNzdWUiOiAib3RoZXJfaXNzdWUiLA0KICAgICJtb2RlIjogMywNCiAgICAianNydl9kb21haW4iOiBbXSwNCiAgICAidXBkYXRlX2RvbWFpbiI6IFtdDQp9"
    Note In Strict mode, the information about the Security Center agent is collected and then reported to Security Center for analysis. The information includes network conditions, processes, and logs. The time required for troubleshooting is approximately 5 minutes.
    ECS instance, Windows, Standard mode: Use one of the following methods for troubleshooting:
    • Download the aegis_checker program and run the program as an administrator.
    • Run the following command in Command Prompt as an administrator:
      powershell -executionpolicy bypass -c "(New-Object Net.WebClient).DownloadFile('http://update2.aegis.aliyun.com/download/aegis_client_self_check/win32/aegis_checker.exe', $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath('.\aegis_checker.exe'))"; "./aegis_checker.exe"
    Note Windows servers do not support the Strict mode.
    Server that is not deployed on Alibaba Cloud, Linux, Standard mode: Run the following command on the server as a root user:
    wget "http://aegis.alicdn.com/download/aegis_client_self_check/linux64/aegis_checker.bin" && chmod +x aegis_checker.bin && ./aegis_checker.bin
    Server that is not deployed on Alibaba Cloud, Linux, Strict mode: Run the following command on the server as a root user:
    wget "http://aegis.alicdn.com/download/aegis_client_self_check/linux64/aegis_checker.bin" && chmod +x aegis_checker.bin && ./aegis_checker.bin -b "ew0KICAgICJ1dWlkIjogIiIsDQogICAgImNtZF9pZHgiOiAiIiwNCiAgICAiaXNzdWUiOiAib3RoZXJfaXNzdWUiLA0KICAgICJtb2RlIjogMywNCiAgICAianNydl9kb21haW4iOiBbXSwNCiAgICAidXBkYXRlX2RvbWFpbiI6IFtdDQp9"
    Server that is not deployed on Alibaba Cloud, Windows, Standard mode: Use one of the following methods for troubleshooting:
    • Download the aegis_checker program and run the program as an administrator.
    • Run the following command in Command Prompt as an administrator:
      powershell -executionpolicy bypass -c "(New-Object Net.WebClient).DownloadFile('http://aegis.alicdn.com/download/aegis_client_self_check/win32/aegis_checker.exe', $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath('.\aegis_checker.exe'))"; "./aegis_checker.exe"
    Note Windows servers do not support the Strict mode.
  3. After the troubleshooting is complete, export the generated log package.
    The directory in which the log package is stored varies based on the operating system of a server.
    • Linux servers

      The log package is stored in /root/miniconda2/aegis_checker/output.

    • Windows servers

      The log package is stored in ./miniconda2/aegis_checker/output of the current directory.

    In the extracted log file, entries prefixed with [root cause] describe the issues that the aegis_checker program detects on the Security Center agent. For issues that have been solved, you can view the details in the logs. For issues that remain unsolved, the program may provide solutions that you can follow. If the program does not provide a solution to an issue, take a screenshot of the troubleshooting result. Then, report the screenshot, the log package, and the ID of your Alibaba Cloud account to Alibaba Cloud engineers for verification and analysis.
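
To review the [root cause] entries without reading the whole log, you can filter the extracted log file. The following Python sketch is one possible way to do this. The function name find_root_causes is hypothetical, and the exact layout of each log line is an assumption, so the sketch matches the marker anywhere in a line instead of only at the start.

```python
import sys
from typing import Iterable, List

# Marker that the topic says prefixes the detected issues.
ROOT_CAUSE_MARKER = "[root cause]"


def find_root_causes(lines: Iterable[str]) -> List[str]:
    """Return the stripped log lines that contain the [root cause] marker."""
    return [line.strip() for line in lines if ROOT_CAUSE_MARKER in line]


if __name__ == "__main__" and len(sys.argv) > 1:
    # sys.argv[1] is the path of a log file extracted from the log package.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log_file:
        for entry in find_root_causes(log_file):
            print(entry)
```

Pass the path of an extracted log file as the only argument. The script prints each matching entry on its own line.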
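
The value that the Strict mode commands pass to the -b option is standard Base64 text that decodes to a small JSON object. If you want to inspect the value before you run the command as a root user, the following Python sketch decodes it. The function name decode_checker_argument is hypothetical. This topic does not document the meaning of the individual fields, so the sketch only displays them.

```python
import base64
import json

# The -b value copied from the Strict mode commands in this topic.
STRICT_MODE_ARGUMENT = (
    "ew0KICAgICJ1dWlkIjogIiIsDQogICAgImNtZF9pZHgiOiAiIiwNCiAgICAiaXNzdWUiOiAi"
    "b3RoZXJfaXNzdWUiLA0KICAgICJtb2RlIjogMywNCiAgICAianNydl9kb21haW4iOiBbXSwN"
    "CiAgICAidXBkYXRlX2RvbWFpbiI6IFtdDQp9"
)


def decode_checker_argument(encoded: str) -> dict:
    """Decode a Base64 string and parse the result as JSON."""
    return json.loads(base64.b64decode(encoded).decode("utf-8"))


if __name__ == "__main__":
    # Pretty-print the decoded object for inspection.
    print(json.dumps(decode_checker_argument(STRICT_MODE_ARGUMENT), indent=4))
```

The decoded object contains the keys uuid, cmd_idx, issue, mode, jsrv_domain, and update_domain.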