Elastic Compute Service: Comprehensive instance diagnosis

Last Updated: Jun 25, 2026

The comprehensive instance diagnosis feature diagnoses an instance's system status, network connectivity, and disk health. This helps you understand its health and promptly identify and resolve common issues.

Prerequisites

  • When an instance diagnosis involves the Cost-related diagnosis category, the system checks whether the AliyunServiceRoleForECSSelfService service-linked role exists for your account. If the role does not exist, you are prompted to create it. The system automatically creates the role after your confirmation.

    The AliyunServiceRoleForECSSelfService role includes the AliyunServiceRolePolicyForECSSelfService system policy. You cannot add, modify, or delete the permissions granted by this policy.

    Policy document

    {
      "Version": "1",
      "Statement": [
        {
          "Action": [
            "ecs:StartInstance",
            "ecs:StopInstance",
            "ecs:DescribeInstances",
            "ecs:CreateSnapshot",
            "ecs:DescribeSnapshots",
            "ecs:DeleteSnapshot",
            "ecs:DescribeDisks",
            "ecs:DescribeDisksFullStatus",
            "ecs:ResetDisk",
            "ecs:DescribeInvocationResults",
            "ecs:DescribeInvocations",
            "ecs:RunCommand",
            "ecs:CreateDiagnosticReport",
            "oos:StartExecution",
            "oos:ListExecutions",
            "oos:ListExecutionLogs",
            "oos:ListTaskExecutions",
            "oos:CancelExecution",
            "actiontrail:LookupEvents"
          ],
          "Resource": "*",
          "Effect": "Allow"
        },
        {
          "Action": "ram:DeleteServiceLinkedRole",
          "Resource": "*",
          "Effect": "Allow",
          "Condition": {
            "StringEquals": {
              "ram:ServiceName": "selfservice.ecs.aliyuncs.com"
            }
          }
        }
      ]
    }

    If you are a RAM user, the Alibaba Cloud account owner must first grant you permission to create service-linked roles before you can run an instance diagnosis that involves the Cost-related diagnosis category. For more information, see Create a custom policy on the JSON tab and Grant permissions to a RAM user.

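    The following policy grants a RAM user permission to create the service-linked role for selfservice.ecs.aliyuncs.com, which the diagnosis feature requires. Replace <account ID> with the ID of your Alibaba Cloud account.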

    {
        "Statement": [
            {
                "Action": [
                    "ram:CreateServiceLinkedRole"
                ],
                "Resource": "acs:ram:*:<account ID>:role/*",
                "Effect": "Allow",
                "Condition": {
                    "StringEquals": {
                        "ram:ServiceName": [
                            "selfservice.ecs.aliyuncs.com"
                        ]
                    }
                }
            }
        ],
        "Version": "1"
    }
  • For a comprehensive health check or network exception diagnosis, ensure the instance meets the following requirements:

    • Instance family: The instance must belong to a purchasable instance family. For more information, see Instance families.

      Note

      The comprehensive instance diagnosis feature does not support discontinued instance families.

    • Instance status: The instance must be in the Running state.

    • Operating system: If your diagnosis scenario involves checking OS-level configurations, ensure the operating system meets the requirements in the following table.

      OS architecture

      OS version

      OS configuration

      x86_64

      • Windows Server 2008 and later

      • Alibaba Cloud Linux 2/3

      • AlmaLinux 8.x and later

      • Anolis OS 7.x/8.x

      • CentOS 7.x/8.x

      • CentOS Stream 8 and later

      • Debian 8.x and later

      • Fedora 33/34

      • OpenSUSE 15.x/42.x

      • Rocky Linux 8.x and later

      • SUSE Linux Enterprise Server 12.x/15.x

      • Ubuntu 16.04/18.04/20.04/24.04

      Note

      Diagnostic performance is not guaranteed on unsupported distributions.

  • For an instance startup failure scenario, ensure the instance meets the following requirements:

    • Instance status: The instance must be in the Stopped state.

    • Operating system: If your diagnosis scenario involves checking OS-level configurations, ensure the operating system meets the requirements in the following table.

      OS architecture

      OS version

      x86_64

      • Windows Server 2008 and later

      • Alibaba Cloud Linux 2/3

      • AlmaLinux 8.x and later

      • Anolis OS 7.x/8.x

      • CentOS 7.x/8.x

      • CentOS Stream 8 and later

      • Debian 8.x and later

      • Fedora 33/34

      • OpenSUSE 15.x/42.x

      • Rocky Linux 8.x and later

      • SUSE Linux Enterprise Server 12.x/15.x

      • Ubuntu 16.04/18.04/20.04/24.04

      Note

      Diagnostic performance is not guaranteed on unsupported distributions.
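
The prerequisites above can be sketched as two small helpers. This is a minimal illustration, not an official tool: the function names `build_create_slr_policy` and `meets_status_requirement` and the scenario labels are hypothetical, and the check that an account ID contains only digits is an assumption. The policy body and the required instance states are taken from the text above.

```python
import json

# Service name taken from the policy documents above.
SELF_SERVICE_NAME = "selfservice.ecs.aliyuncs.com"

# Instance state that each diagnosis scenario requires, per the list above.
# The scenario keys are hypothetical labels, not API values.
REQUIRED_STATUS = {
    "comprehensive_health_check": "Running",
    "network_exception": "Running",
    "startup_failure": "Stopped",
}


def build_create_slr_policy(account_id: str) -> dict:
    """Return the RAM user policy shown above with <account ID> filled in."""
    # Assumption: account IDs are numeric. This also catches a placeholder
    # such as "<account ID>" that was left unreplaced.
    if not account_id.isdigit():
        raise ValueError("account_id must contain digits only")
    return {
        "Statement": [
            {
                "Action": ["ram:CreateServiceLinkedRole"],
                "Resource": f"acs:ram:*:{account_id}:role/*",
                "Effect": "Allow",
                "Condition": {
                    "StringEquals": {"ram:ServiceName": [SELF_SERVICE_NAME]}
                },
            }
        ],
        "Version": "1",
    }


def meets_status_requirement(scenario: str, instance_status: str) -> bool:
    """Check whether the instance state matches what the scenario needs."""
    return REQUIRED_STATUS[scenario] == instance_status


if __name__ == "__main__":
    # Example account ID, for illustration only.
    print(json.dumps(build_create_slr_policy("1234567890123456"), indent=4))
    print(meets_status_requirement("startup_failure", "Running"))
```

The generated JSON can be pasted into the JSON tab when you create the custom policy. The status check covers only the instance state; instance family and operating system requirements still need to be verified separately.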

Use cases

Use the comprehensive instance diagnosis feature to understand instance health in the following scenarios:

  • Troubleshoot and resolve issues: If an instance has a problem, such as a network connection failure, run targeted diagnostics to identify the cause and find a solution.

  • Perform regular health checks: Run full checks during routine O&M to understand the overall health of an instance and resolve potential issues before they affect your business.

Note

The comprehensive instance diagnosis feature provides descriptions and recommended solutions for each diagnostic item. For more information, see Diagnostic items and results.

Procedure

ECS console

Create a diagnostic report

  1. Log in to the ECS console.

  2. In the navigation pane on the left, choose Maintenance & Monitoring > Self-service Troubleshooting.

  3. In the top navigation bar, select a region.

  4. Select a time range and an instance ID, and then click Initiate Diagnostics.

    Note

    Only one diagnostic task can run on an instance at a time. A 5-minute interval is required between consecutive diagnoses.

    The diagnostic items that are actually run are the ones displayed on the page. In the diagnostic report, you can click the categories under Diagnostic Item Details to view the specific items and their progress. The diagnosis takes a few minutes. You can monitor the progress on the current page, or close the dialog box and check the report later in the diagnostic task list.

  5. View the diagnostic report.

    The diagnostic report contains the following information:

    • Basic information: Includes the diagnosis time range, resource ID, report ID, and diagnosis time.

    • Diagnosis result: If all checks pass, the result is "No exceptions are detected on the instance." If any issues are found, the report lists the abnormal diagnostic items and recommends solutions.

    • Diagnostic item details: Shows the results for each diagnostic item, with severity levels of Critical, Warning, and Passed.

    Note

    For the Cost-related diagnosis category, you can also obtain more information in the following ways:

    You can use the diagnostic report to resolve issues yourself:

View diagnostic history

To understand an instance's health history, view its diagnostic history.

  1. Log in to the ECS console.

  2. View the diagnostic history of the instance.

    1. In the navigation pane on the left, choose Maintenance & Monitoring > Self-service Troubleshooting.

    2. In the top navigation bar, select a region.

    3. In the Comprehensive Instance Diagnosis area, click View History.

    4. On the Diagnostic History page, on the Instance Diagnostic Report tab, enter a resource ID or report ID and press Enter to search.

  3. For any entry in the diagnostic history, click View Report to see the report or Re-diagnose to start a new diagnosis.

API

  1. Query diagnostic metrics.

    You can call the DescribeDiagnosticMetrics operation to query diagnostic metrics. For a list of published diagnostic metrics, see Diagnostic items and results.

  2. Manage diagnostic metric sets.

    Two types of diagnostic metric sets are available for creating diagnostic reports.

    • Public diagnostic metric sets: These sets are compiled from common user issues to simplify diagnostics.

      Public diagnostic metric sets are maintained by Alibaba Cloud and cannot be modified. You can call the DescribeDiagnosticMetricSets operation to query public diagnostic metric sets. The following table describes the supported public diagnostic metric set.

      Metric set name        Description               Use case

      dms-instancedefault    Default diagnostic set    Perform a comprehensive check on an ECS instance.

    • Custom diagnostic metric sets: If you need to check only a subset of diagnostic metrics, you can call the CreateDiagnosticMetricSet operation to create a custom diagnostic metric set. After a set is created, you can call the DescribeDiagnosticMetricSets operation to query your custom sets.

      The following sample response shows a newly created custom diagnostic metric set named test.

      {
        "RequestId": "6AF68D67-601A-5278-AB10-4195CCA7****",
        "MetricSets": [
          {
            "Type": "User",
            "MetricIds": [
              "Instance.ControllerError",
              "Instance.CPUException",
              "Instance.CPUSplitLock"
            ],
            "MetricSetId": "dms-uf6ck3iljpbft15i****",
            "ResourceType": "instance",
            "MetricSetName": "test"
          }
        ]
      }
  3. Create a diagnostic report.

    You can call the CreateDiagnosticReport operation to create a diagnostic report from a public or custom diagnostic metric set.

    The following sample response shows a successfully created diagnostic report.

    {
      "RequestId": "A1283ACE-2F19-54B9-9464-401EBD1A****",
      "ReportId": "dr-uf6aacg5g2fjp64i****"
    }
  4. Query a diagnostic report.

    You can call the DescribeDiagnosticReports operation to query a report's details. The report returns the diagnosis results for each diagnostic metric in the set. For more information about diagnostic results, see Diagnostic items and results.

    The following sample response shows a report where the diagnosis completed normally and found no issues.

    {
      "RequestId": "20381C19-C31B-52AE-AC9B-8AD672E4****",
      "NextToken": "",
      "Reports": [
        {
          "Status": "Finished",
          "EndTime": "2022-09-07T15:36Z",
          "ResourceId": "i-uf653eye7pkftni****",
          "MetricSetId": "dms-uf6ck3iljpbft15i****",
          "Issues": [],
          "StartTime": "2022-09-05T15:36Z",
          "CreationTime": "2022-09-07T15:36Z",
          "ReportId": "dr-uf6aacg5g2fjp64i****",
          "ResourceType": "instance",
          "Severity": "Normal",
          "FinishedTime": "2022-09-07T15:36Z"
        }
      ]
    }
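
Because a diagnosis takes a few minutes, a caller usually creates the report and then polls until it finishes. The sketch below shows one way to do that against responses shaped like the DescribeDiagnosticReports sample above. It is a minimal illustration under stated assumptions: `describe_reports` is a hypothetical callable that you supply (for example, a thin wrapper around your SDK client), and the polling interval and attempt limit are arbitrary defaults. Only the field names `Reports`, `ReportId`, `Status`, `Severity`, and `Issues` and the status value `Finished` come from the sample.

```python
import time
from typing import Callable, Optional


def summarize_report(response: dict, report_id: str) -> Optional[dict]:
    """Pick one report out of a DescribeDiagnosticReports-style response."""
    for report in response.get("Reports", []):
        if report.get("ReportId") == report_id:
            return {
                "status": report.get("Status"),
                "severity": report.get("Severity"),
                "issue_count": len(report.get("Issues", [])),
            }
    return None  # The response does not contain the requested report.


def wait_for_report(
    describe_reports: Callable[[str], dict],  # hypothetical, caller-supplied
    report_id: str,
    interval_seconds: float = 10.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the report's Status is "Finished" or attempts run out."""
    for _ in range(max_attempts):
        summary = summarize_report(describe_reports(report_id), report_id)
        if summary is not None and summary["status"] == "Finished":
            return summary
        sleep(interval_seconds)
    raise TimeoutError(f"report {report_id} did not finish in time")


if __name__ == "__main__":
    # Stub that returns a trimmed copy of the sample response above.
    def fake_describe(report_id: str) -> dict:
        return {
            "Reports": [
                {
                    "Status": "Finished",
                    "Issues": [],
                    "ReportId": report_id,
                    "Severity": "Normal",
                }
            ]
        }

    print(wait_for_report(fake_describe, "dr-uf6aacg5g2fjp64i****"))
```

The sketch treats any status other than `Finished` as "keep waiting", so a report that ends in some other terminal state surfaces only as a timeout. Check the API reference for the full list of status values before relying on this in automation.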

Related topics