Managed Service for Prometheus: Monitor the data of multiple Prometheus instances across Alibaba Cloud accounts

Last Updated: Apr 11, 2024

You can use the custom authentication feature provided by Managed Service for Prometheus to aggregate the data of multiple Prometheus instances that are owned by different Alibaba Cloud accounts. This way, you can monitor Prometheus instance metrics, visualize them in Grafana dashboards, and configure alerting for them in a unified manner.

Prerequisites

Important

You can aggregate the Prometheus instance data that belongs to multiple Alibaba Cloud accounts into a Prometheus instance that belongs to another Alibaba Cloud account. In this example, Prometheus instance data that belongs to Account B is aggregated into the global aggregation instance that belongs to Account A.

Usage notes

Custom authentication is implemented by using a RAM user and a RAM role. The RAM user assumes the RAM role to access the required resources.

The RAM user and RAM role features are provided by Resource Access Management (RAM). RAM roles can be assumed only by RAM users, and cannot be assumed by Alibaba Cloud accounts.

Workflow

Step 1: Use Account B to create a RAM role

Step 2: Grant permissions to the AliyunPrometheusQueryRole role

(Optional) Step 3: Create a RAM user for Account A (If Account A already has a RAM user that you want to use for the aggregation, proceed to the next step.)

Step 4: Grant permissions to the RAM user

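Step 5: Aggregate Prometheus instance data

Step 6: View the data of the global aggregation instance

Step 7: Create an alert rule for the global aggregation instance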

Step 1: Use Account B to create a RAM role

  1. Use Account B to log on to the RAM console.

  2. In the left-side navigation pane, choose Identities > Roles. Then, click Create Role.

  3. In the panel that appears, select Alibaba Cloud Account and click Next.

  4. In the Configure Role step, set the RAM Role Name parameter to AliyunPrometheusQueryRole, set the Select Trusted Alibaba Cloud Account parameter to Other Alibaba Cloud Account, enter the ID of Account A, and then click OK.

  5. Click the name of the AliyunPrometheusQueryRole role. On the page that appears, click the Trust Policy Management tab, and then click Edit Trust Policy. In the panel that appears, modify the trust policy to grant permissions to Account A.

    Note

    You can enter an array of Alibaba Cloud accounts in the trust policy to grant permissions to them.
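As a rough illustration of the note above, the following Python sketch builds a trust policy document whose principal is an array of Alibaba Cloud accounts. The helper name build_trust_policy and the placeholder account IDs are hypothetical, and the document layout is an assumption based on the general structure of RAM trust policies. Compare it with the policy shown in the Edit Trust Policy panel before you use it.

```python
import json


def build_trust_policy(trusted_account_ids):
    """Return a trust policy dict that lets the listed accounts assume the role.

    Assumption: each trusted account is written as acs:ram::<account-id>:root.
    """
    principals = [f"acs:ram::{account_id}:root" for account_id in trusted_account_ids]
    return {
        "Statement": [
            {
                "Action": "sts:AssumeRole",
                "Effect": "Allow",
                "Principal": {"RAM": principals},
            }
        ],
        "Version": "1",
    }


# Placeholder IDs: replace them with the ID of Account A and any other trusted account.
policy = build_trust_policy(["<ACCOUNT_A_ID>", "<ANOTHER_ACCOUNT_ID>"])
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the trust policy editor after the placeholders are replaced with real account IDs.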

Step 2: Grant permissions to the AliyunPrometheusQueryRole role

Perform the following steps to attach the AliyunRAMReadOnlyAccess and AliyunARMSReadOnlyAccess policies to the AliyunPrometheusQueryRole role.

Important

To grant read-only permissions on all ARMS features to a specific resource group, you must attach the AliyunARMSReadOnlyAccess policy to the resource group and also grant the ReadTraceApp permission to it. Otherwise, ARMS cannot display the list of applications that belong to the authorized resource group.

  1. In the left-side navigation pane, choose Identities > Roles. Click the name of the AliyunPrometheusQueryRole role. On the Permissions tab of the page that appears, click Grant Permission.

  2. In the Select Policy section of the panel that appears, search for the AliyunRAMReadOnlyAccess and AliyunARMSReadOnlyAccess policies, select the policies, and then click OK.


(Optional) Step 3: Create a RAM user for Account A

  • If Account A does not yet have a RAM user that you can use to aggregate the Prometheus instance data of Account B, create a RAM user for Account A by performing the following procedure.

  • If you want to use an existing RAM user of Account A to aggregate the Prometheus instance data of Account B, proceed to the next step.

Procedure

  1. Log on to the RAM console by using an Alibaba Cloud account or a RAM user who has administrative rights.

  2. In the left-side navigation pane, choose Identities > Users.

  3. On the Users page, click Create User.

  4. In the User Account Information section of the Create User page, configure the following parameters:

    • Logon Name: The logon name can be up to 64 characters in length, and can contain letters, digits, periods (.), hyphens (-), and underscores (_).

    • Display Name: The display name can be up to 128 characters in length.

    • Tag: Click the edit icon and enter a tag key and a tag value. You can add one or more tags to the RAM user. This way, you can manage the RAM user based on the tags.

    Note

    You can click Add User to create multiple RAM users at a time.

  5. In the Access Mode section, select an access mode and configure the required parameters.

    To ensure the security of your Alibaba Cloud account, we recommend that you select only one access mode for the RAM user. This way, the RAM user for an individual is separated from the RAM user for a program.

    • Console Access

      If the RAM user represents an individual, we recommend that you select Console Access for the RAM user. This way, the RAM user can use a username and password to access Alibaba Cloud. If you select Console Access, you must configure the following parameters:

      • Set Console Password: You can select Automatically Regenerate Default Password or Reset Custom Password. If you select Reset Custom Password, you must specify a password. The password must meet the complexity requirements. For more information, see Configure a password policy for RAM users.

      • Password Reset: specifies whether the RAM user is required to reset the password upon the next logon.

      • Enable MFA: specifies whether to enable multi-factor authentication (MFA) for the RAM user. After you enable MFA, you must bind an MFA device to the RAM user or allow the RAM user to bind an MFA device. For more information, see Bind an MFA device to a RAM user.

    • OpenAPI Access

      If the RAM user represents a program, we recommend that you select OpenAPI Access for the RAM user. This way, the RAM user can use an AccessKey pair to access Alibaba Cloud. If you select OpenAPI Access, the system automatically generates an AccessKey ID and AccessKey secret for the RAM user. For more information, see Obtain an AccessKey pair.

      Important

      An AccessKey secret for a RAM user is displayed only after you click Create AccessKey. You cannot query the AccessKey secret in subsequent operations. Therefore, you must back up your AccessKey secret.

  6. Click OK.

  7. Complete security verification as prompted.
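If you create RAM users from a script or a spreadsheet, it can help to check names against the limits listed for the Logon Name and Display Name parameters before you enter them. The following Python sketch is one way to do that. The function names are hypothetical, and the sketch assumes that "letters" means ASCII letters and that a name must contain at least one character, neither of which is stated in this topic.

```python
import re

# Up to 64 characters: letters, digits, periods (.), hyphens (-), and underscores (_).
# Assumption: only ASCII letters are allowed and the name must not be empty.
LOGON_NAME_PATTERN = re.compile(r"[A-Za-z0-9._-]{1,64}")


def is_valid_logon_name(name: str) -> bool:
    """Check a logon name against the length and character rules."""
    return LOGON_NAME_PATTERN.fullmatch(name) is not None


def is_valid_display_name(name: str) -> bool:
    """Check that a display name is non-empty and at most 128 characters long."""
    return 1 <= len(name) <= 128


print(is_valid_logon_name("prometheus-reader_01"))
print(is_valid_logon_name("name with spaces"))
```

The RAM console remains the authority on which names are accepted. The sketch only catches obvious violations early.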

Step 4: Grant permissions to the RAM user

  1. Click the name of the RAM user. On the page that appears, click the Permissions tab.

  2. Click Grant Permission. In the Select Policy section of the panel that appears, search for the AliyunSTSAssumeRoleAccess and AliyunARMSFullAccess policies, select the policies, and then click OK.
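At this point, policies have been attached in both accounts: two to the RAM role of Account B in Step 2 and two to the RAM user of Account A in this step. The following Python sketch records that mapping and reports which policies are still missing for an identity. The dictionary keys and the function name are hypothetical labels used only for illustration, and the list of attached policies must be supplied by you, for example by reading it from the Permissions tab.

```python
# Policies named in Step 2 and Step 4 of this topic.
REQUIRED_POLICIES = {
    # RAM role AliyunPrometheusQueryRole, created in Account B.
    "account_b_role": {"AliyunRAMReadOnlyAccess", "AliyunARMSReadOnlyAccess"},
    # RAM user of Account A that assumes the role.
    "account_a_ram_user": {"AliyunSTSAssumeRoleAccess", "AliyunARMSFullAccess"},
}


def missing_policies(identity, attached):
    """Return the required policies that are not yet in the attached list."""
    return sorted(REQUIRED_POLICIES[identity] - set(attached))


# Example: the RAM user has only one of its two policies attached.
print(missing_policies("account_a_ram_user", ["AliyunARMSFullAccess"]))
```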

Step 5: Aggregate Prometheus instance data

  1. Log on to the Managed Service for Prometheus console as the RAM user of Account A.

  2. Find the Prometheus instance into which you want to aggregate the data, and click Edit in the Actions column. In the STEP3 section of the panel that appears, set the Select the instances to be aggregated parameter to Other Accounts (Custom Authentication).

  3. In the search box to the right of Alibaba Cloud Account, enter the UID of Account B and click OK. Then, all Prometheus instances that belong to Account B are displayed.

    Note

    Only a RAM user that has been granted permissions on another Alibaba Cloud account can modify the Prometheus instances in that account. The Alibaba Cloud account that owns the RAM user cannot modify them.

Step 6: View the data of the global aggregation instance

After the Prometheus instance data of Account B is aggregated into the global aggregation instance of Account A, you can view the aggregated performance metrics of the Prometheus instances on the preset Grafana dashboard.

  1. On the Managed Service for Prometheus page, click the name of the global aggregation instance. In the left-side navigation pane, click Dashboards.

  2. On the Dashboards page, click the name of the preset dashboard to view the performance metrics of all aggregated Prometheus instances.

Step 7: Create an alert rule for the global aggregation instance

  1. On the Managed Service for Prometheus page, click the name of the global aggregation instance. In the left-side navigation pane, click Alert Rules.

  2. In the upper-right corner of the page, click Create Prometheus Alert Rule. On the Create Prometheus Alert Rule page, configure parameters for the alert rule. For more information, see Create an alert rule for a Prometheus instance.

    Note

    In the Data Preview section of the Create Prometheus Alert Rule page, the global aggregation instance provides the unique_cluster_id and unique_cluster_name fields. The unique_cluster_id field specifies the unique identifier of a Prometheus instance, and the unique_cluster_name field specifies the name of a Prometheus instance. If an alert is triggered, you can use the fields to quickly locate the Prometheus instance.
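To illustrate how the two fields can help you locate the source of an alert, the following Python sketch groups alerts by the Prometheus instance that produced them. The alert structure (a dict with a labels entry), the sample values, and the function name are assumptions made for this example. Only the unique_cluster_id and unique_cluster_name field names come from this topic.

```python
from collections import defaultdict


def group_alerts_by_instance(alerts):
    """Group alert names by (unique_cluster_id, unique_cluster_name)."""
    grouped = defaultdict(list)
    for alert in alerts:
        labels = alert.get("labels", {})
        key = (
            labels.get("unique_cluster_id", "unknown"),
            labels.get("unique_cluster_name", "unknown"),
        )
        grouped[key].append(labels.get("alertname", "unnamed"))
    return dict(grouped)


# Hypothetical sample data, not real output of Managed Service for Prometheus.
sample_alerts = [
    {"labels": {"alertname": "HighCPU", "unique_cluster_id": "c-001", "unique_cluster_name": "account-b-prod"}},
    {"labels": {"alertname": "PodRestart", "unique_cluster_id": "c-001", "unique_cluster_name": "account-b-prod"}},
    {"labels": {"alertname": "HighCPU", "unique_cluster_id": "c-002", "unique_cluster_name": "account-b-test"}},
]

for (cluster_id, cluster_name), names in group_alerts_by_instance(sample_alerts).items():
    print(f"{cluster_name} ({cluster_id}): {', '.join(names)}")
```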

What to do next

Edit the global aggregation instance

On the Managed Service for Prometheus page, find the global aggregation instance, and click Edit in the Actions column. If you change the endpoint, the alert rule configured for the original endpoint becomes invalid. Therefore, we recommend that you do not change the endpoint unless it is necessary.

Uninstall the global aggregation instance

If you no longer use the global aggregation instance, you can uninstall the instance.

On the Instances page, find the Prometheus instance, and click Uninstall in the Actions column. In the message that appears, click OK. After the Prometheus instance is uninstalled, it is no longer displayed on the Instances page.

Uninstalled global aggregation instances are no longer billed. Global aggregation instances support only the pay-as-you-go billing method, so you are not charged upfront fees and refunds do not apply.

References

  • For information about RAM, see What is RAM?

  • For information about billing of Managed Service for Prometheus, see Refund policy.