
Application Real-Time Monitoring Service: Manage Prometheus monitoring with Terraform

Last Updated: Mar 10, 2026

The alicloud_arms_prometheus_monitoring Terraform resource lets you define ServiceMonitor, PodMonitor, custom scrape jobs, and health check agents (Probes) as code. This lets you version-control, review, and reproduce monitoring configurations across Prometheus instances.

Prerequisites

Before you begin, make sure that you have:

  • A Prometheus instance for Container Service or ECS.

  • Terraform 0.12.28 or later. Run terraform --version to check your version.

  • Alibaba Cloud credentials configured through one of the methods described in the following section.

Configure credentials

To improve the flexibility and security of permission management, we recommend that you create a Resource Access Management (RAM) user named Terraform. Then, create an AccessKey pair for the RAM user and grant permissions to the RAM user. For more information, see Create a RAM user and Grant permissions to a RAM user.

Method 1: Environment variables

export ALICLOUD_ACCESS_KEY="<your-access-key>"
export ALICLOUD_SECRET_KEY="<your-secret-key>"
export ALICLOUD_REGION="cn-beijing"

Method 2: Provider block

provider "alicloud" {
  access_key = "<your-access-key>"
  secret_key = "<your-secret-key>"
  region     = "cn-beijing"
}

Replace the following placeholders with your actual values:

Placeholder       | Description                       | Example
<your-access-key> | AccessKey ID of your RAM user     | LTAI5tXxx
<your-secret-key> | AccessKey secret of your RAM user | xXxXxXx
Note

Specify the region based on your business requirements. For example, cn-beijing.
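If main.tf is committed to version control, hardcoded credentials in the provider block are a security risk. One alternative is to pass them through input variables, as in the following sketch. The variable names are illustrative; supply values through TF_VAR_access_key and TF_VAR_secret_key environment variables or a .tfvars file excluded from version control. Note that sensitive = true requires Terraform 0.14 or later; omit it on older versions.

```hcl
# Sketch: read credentials from input variables instead of hardcoding them.
variable "access_key" {
  type      = string
  sensitive = true   # masks the value in plan/apply output (Terraform 0.14+)
}

variable "secret_key" {
  type      = string
  sensitive = true
}

provider "alicloud" {
  access_key = var.access_key
  secret_key = var.secret_key
  region     = "cn-beijing"
}
```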

Note

Resource Orchestration Service (ROS) is a native infrastructure-as-code (IaC) service provided by Alibaba Cloud. It also supports the integration of Terraform templates. By using Terraform with ROS, you can define and manage resources in Alibaba Cloud, Amazon Web Services (AWS), or Microsoft Azure, specify resource parameters, and configure dependency relationships for the resources. See Create a Terraform template and Create a Terraform stack.

Supported monitoring types

The monitoring types available depend on your Prometheus instance type:

Instance type                    | Supported monitoring types
Prometheus for Container Service | ServiceMonitor, PodMonitor, custom jobs, health check agents
Prometheus for ECS               | Custom jobs and health check agents only

Health check agent constraints:

  • The status parameter is not supported for Probe resources.

  • Name format: <custom-name>-{tcp|http|ping}-blackbox. For example, name1-tcp-blackbox indicates a TCP health check.

  • For ECS instances (fully managed), the namespace must be empty or follow the format <vpc-id>-<user-id>. For example, vpc-0jl4q1q2of2tagvwxxxx-11032353609xxxx.

Argument reference

All monitoring types use the alicloud_arms_prometheus_monitoring resource with the following arguments:

Argument    | Required | Description
cluster_id  | Yes      | The ID of the Prometheus instance
type        | Yes      | Monitoring type: serviceMonitor, podMonitor, customJob, or probe
config_yaml | Yes      | YAML configuration for the monitoring resource (heredoc format)
status      | No       | Run status. Set to run to activate. Not supported for the probe type
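Taken together, the arguments map onto a resource block shaped like the following skeleton. The placeholder values are illustrative; the complete, working examples for each monitoring type appear in the sections below.

```hcl
resource "alicloud_arms_prometheus_monitoring" "example" {
  cluster_id  = "<prometheus-instance-id>"   # ID of the target Prometheus instance
  type        = "serviceMonitor"             # or podMonitor, customJob, probe
  status      = "run"                        # omit for the probe type
  config_yaml = <<-EOT
    # YAML body for the chosen monitoring type goes here
  EOT
}
```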

Deploy a monitoring resource

All monitoring types share the same Terraform workflow. Each section below provides the config_yaml for a specific type.

  1. Create a working directory and add a main.tf file with the provider block:

       provider "alicloud" {
       }
  2. Initialize Terraform:

       terraform init

     Expected output:

       Initializing the backend...

       Initializing provider plugins...
       - Checking for available provider plugins...
       - Downloading plugin for provider "alicloud" (hashicorp/alicloud) 1.90.1...
       ...

       You may now begin working with Terraform. Try running "terraform plan" to see
       any changes that are required for your infrastructure. All Terraform commands
       should now work.

       If you ever set or change modules or backend configuration for Terraform,
       rerun this command to reinitialize your working directory. If you forget, other
       commands will detect it and remind you to do so if necessary.
  3. Add the resource configuration for your monitoring type to main.tf. See the configuration examples in the following sections.

  4. Preview the changes:

       terraform plan

     The output shows the resources that Terraform will create. For example:

       Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
         + create

       Terraform will perform the following actions:
       ...
       Plan: 1 to add, 0 to change, 0 to destroy.
  5. Apply the configuration:

       terraform apply

     The output shows the execution plan and prompts for confirmation:

       ...
       Plan: 1 to add, 0 to change, 0 to destroy.

       Do you want to perform these actions?
         Terraform will perform the actions described above.
         Only 'yes' will be accepted to approve.

         Enter a value: yes

     After you enter yes, Terraform creates the monitoring resource for the current Prometheus instance.

Add a ServiceMonitor

Add the following resource block to main.tf:

resource "alicloud_arms_prometheus_monitoring" "myServiceMonitor1" {
  cluster_id  = "c77e1106f429e4b46b0ee1720cxxxxx"   # The ID of the Prometheus instance.
  status      = "run"
  type        = "serviceMonitor"
  config_yaml = <<-EOT
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: tomcat-demo
      namespace: default
    spec:
      endpoints:
        - interval: 30s
          path: /metrics
          port: tomcat-monitor
      namespaceSelector:
        any: true
      selector:
        matchLabels:
          app: tomcat
  EOT
}

Key fields in config_yaml:

Field                      | Description
metadata.name              | Name of the ServiceMonitor
metadata.namespace         | Namespace where the ServiceMonitor is created
spec.endpoints[].interval  | Scrape interval (for example, 30s)
spec.endpoints[].path      | Metrics endpoint path (typically /metrics)
spec.endpoints[].port      | Named port to scrape
spec.namespaceSelector.any | Set to true to match Services in all namespaces
spec.selector.matchLabels  | Label selector to match target Services
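A Service that exposes metrics on more than one port can be covered by a single ServiceMonitor, because spec.endpoints accepts multiple entries. The following variant is a sketch: the port names, paths, and the app label are illustrative and must match your own Service definition.

```hcl
resource "alicloud_arms_prometheus_monitoring" "myServiceMonitor2" {
  cluster_id  = "c77e1106f429e4b46b0ee1720cxxxxx"   # The ID of the Prometheus instance.
  status      = "run"
  type        = "serviceMonitor"
  config_yaml = <<-EOT
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: multi-port-demo
      namespace: default
    spec:
      endpoints:
        - interval: 30s
          path: /metrics
          port: http-metrics            # illustrative named port
        - interval: 60s
          path: /actuator/prometheus
          port: mgmt-metrics            # illustrative named port
      namespaceSelector:
        any: true
      selector:
        matchLabels:
          app: multi-port-app           # illustrative label
  EOT
}
```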

Verify the ServiceMonitor

  1. Log on to the ARMS console.

  2. In the left-side navigation pane, choose Managed Service for Prometheus > Instances.

  3. Click the name of your Prometheus instance to open the Integration Center.

  4. In the Installed section, click the custom component. On the Service Discovery Configurations tab, verify that the ServiceMonitor appears.

Add a PodMonitor

Add the following resource block to main.tf:

resource "alicloud_arms_prometheus_monitoring" "myPodMonitor1" {
  cluster_id  = "c77e1106f429e4b46b0ee1720cxxxxx"   # The ID of the Prometheus instance.
  status      = "run"
  type        = "podMonitor"
  config_yaml = <<-EOT
    apiVersion: "monitoring.coreos.com/v1"
    kind: "PodMonitor"
    metadata:
      name: "podmonitor-demo"
      namespace: "default"
    spec:
      namespaceSelector:
        any: true
      podMetricsEndpoints:
        - interval: "30s"
          path: "/metrics"
          port: "tomcat-monitor"
      selector:
        matchLabels:
          app: "nginx2-exporter"
  EOT
}

Key fields in config_yaml:

Field                               | Description
metadata.name                       | Name of the PodMonitor
metadata.namespace                  | Namespace where the PodMonitor is created
spec.podMetricsEndpoints[].interval | Scrape interval (for example, 30s)
spec.podMetricsEndpoints[].path     | Metrics endpoint path
spec.podMetricsEndpoints[].port     | Named port to scrape
spec.namespaceSelector.any          | Set to true to match Pods in all namespaces
spec.selector.matchLabels           | Label selector to match target Pods

Verify the PodMonitor

  1. Log on to the ARMS console.

  2. In the left-side navigation pane, choose Managed Service for Prometheus > Instances.

  3. Click the name of your Prometheus instance to open the Integration Center.

  4. In the Installed section, click the custom component. On the Service Discovery Configurations tab, verify that the PodMonitor appears.

Add a custom job

Add the following resource block to main.tf:

resource "alicloud_arms_prometheus_monitoring" "myCustomJob1" {
  cluster_id  = "c77e1106f429e4b46b0ee1720cxxxxx"   # The ID of the Prometheus instance.
  status      = "run"
  type        = "customJob"
  config_yaml = <<-EOT
    scrape_configs:
      - job_name: prometheus1
        honor_timestamps: false
        honor_labels: false
        scheme: http
        metrics_path: /metric
        static_configs:
          - targets:
              - 127.0.0.1:9090
  EOT
}

Key fields in config_yaml:

Field                                  | Description
scrape_configs[].job_name              | Name of the scrape job
scrape_configs[].scheme                | Protocol to use (http or https)
scrape_configs[].metrics_path          | Path to the metrics endpoint
scrape_configs[].static_configs[].targets | List of host:port targets to scrape
scrape_configs[].honor_timestamps      | When true, uses timestamps from scraped metrics instead of server time
scrape_configs[].honor_labels          | When true, keeps scraped labels that conflict with server-attached labels
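A single custom job can also scrape several targets over HTTPS. The following variant is a sketch: the job name and target addresses are illustrative, and the exporters are assumed to serve metrics at /metrics over HTTPS.

```hcl
resource "alicloud_arms_prometheus_monitoring" "myCustomJob2" {
  cluster_id  = "c77e1106f429e4b46b0ee1720cxxxxx"   # The ID of the Prometheus instance.
  status      = "run"
  type        = "customJob"
  config_yaml = <<-EOT
    scrape_configs:
      - job_name: exporter-https        # illustrative job name
        scheme: https
        metrics_path: /metrics
        static_configs:
          - targets:                    # illustrative host:port targets
              - 192.168.0.10:9100
              - 192.168.0.11:9100
  EOT
}
```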

Verify the custom job

  1. Log on to the ARMS console.

  2. In the left-side navigation pane, choose Managed Service for Prometheus > Instances.

  3. Click the name of your Prometheus instance to open the Integration Center.

  4. In the Installed section, click the custom component. On the Service Discovery Configurations tab, verify that the custom job appears.

Add a health check agent

Add the following resource block to main.tf:

resource "alicloud_arms_prometheus_monitoring" "myProbe1" {
  cluster_id  = "c77e1106f429e4b46b0ee1720cxxxxx"   # The ID of the Prometheus instance.
  type        = "probe"
  config_yaml = <<-EOT
    apiVersion: monitoring.coreos.com/v1
    kind: Probe
    metadata:
      name: name1-tcp-blackbox
      namespace: arms-prom
    spec:
      interval: 30s
      jobName: blackbox
      module: tcp_connect
      prober:
        path: /blackbox/probe
        scheme: http
        url: 'localhost:9335'
      targets:
        staticConfig:
          static:
            - 'arms-prom-admin.arms-prom:9335'
  EOT
}
Note

Do not set the status parameter for Probe resources. It is not supported.

Key fields in config_yaml:

Field                           | Description
metadata.name                   | Agent name. Must follow the format <custom-name>-{tcp|http|ping}-blackbox
metadata.namespace              | Namespace. For ECS instances, leave blank or use the format <vpc-id>-<user-id>
spec.interval                   | Health check interval (for example, 30s)
spec.jobName                    | Keep the default value blackbox
spec.module                     | Probe module: tcp_connect, http_2xx, or icmp
spec.prober                     | Prober endpoint configuration. Keep the default values
spec.targets.staticConfig.static | List of host:port targets to check
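For an HTTP availability check, the same resource uses the http_2xx module and a name ending in -http-blackbox, per the required name format. The following variant is a sketch: the agent name and target URL are illustrative.

```hcl
resource "alicloud_arms_prometheus_monitoring" "myProbe2" {
  cluster_id  = "c77e1106f429e4b46b0ee1720cxxxxx"   # The ID of the Prometheus instance.
  type        = "probe"                 # The status parameter is not supported for probes.
  config_yaml = <<-EOT
    apiVersion: monitoring.coreos.com/v1
    kind: Probe
    metadata:
      name: name2-http-blackbox         # http check, per the required name format
      namespace: arms-prom
    spec:
      interval: 30s
      jobName: blackbox
      module: http_2xx                  # succeed when the target returns an HTTP 2xx response
      prober:
        path: /blackbox/probe
        scheme: http
        url: 'localhost:9335'
      targets:
        staticConfig:
          static:
            - 'https://example.com'     # illustrative target URL
  EOT
}
```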

Verify the health check agent

  1. Log on to the ARMS console.

  2. In the left-side navigation pane, choose Managed Service for Prometheus > Instances.

  3. Click the name of your Prometheus instance to open the Integration Center.

  4. In the Installed section, click the Blackbox component. On the Health Check tab, verify that the agent appears.

Delete monitoring resources

To remove all monitoring resources managed by the current Terraform configuration, run:

terraform destroy

Enter yes when prompted to confirm. Expected output:

...
Do you really want to destroy all resources?
  Terraform will destroy all your managed infrastructure, as shown above.
  There is no undo. Only 'yes' will be accepted to confirm.

  Enter a value: yes
...
Destroy complete! Resources: 1 destroyed.

Verify the deletion

  1. Log on to the ARMS console.

  2. In the left-side navigation pane, choose Managed Service for Prometheus > Instances.

  3. Click the name of your Prometheus instance to open the Integration Center.

  4. In the Installed section, click the custom or Blackbox component. On the Service Discovery Configurations or Health Check tab, confirm that the monitoring settings have been removed.

What's next