This topic describes how to use Terraform to manage the monitoring settings of a Prometheus instance, including ServiceMonitor, PodMonitor, custom jobs, and health inspection agents.
Prerequisites
A Prometheus instance for Container Service or ECS is created. For more information, see Use Terraform to manage Prometheus instances.
Terraform is installed.
By default, Cloud Shell is preinstalled with Terraform and configured with your account information. You do not need to modify the configurations.
If you do not use Cloud Shell, you can directly install Terraform. For more information, see Install and configure Terraform.
Note: You must install Terraform V0.12.28 or later. You can run the terraform --version command to query the Terraform version.
While Resource Orchestration Service (ROS) is a native infrastructure-as-code (IaC) service provided by Alibaba Cloud, it also supports the integration of Terraform templates. By using Terraform with ROS, you can define and manage resources in Alibaba Cloud, Amazon Web Services (AWS), or Microsoft Azure, specify resource parameters, and configure dependency relationships for the resources. For more information, see Create a Terraform template and Create a Terraform stack.
Your Alibaba Cloud account information is configured. You can use one of the following methods to configure Alibaba Cloud account information:
NoteTo improve the flexibility and security of permission management, we recommend that you create a Resource Access Management (RAM) user named Terraform. Then, create an AccessKey pair for the RAM user and grant permissions to the RAM user. For more information, see Create a RAM user and Grant permissions to a RAM user.
Method 1: Add environment variables to store authentication information.
export ALICLOUD_ACCESS_KEY="************"
export ALICLOUD_SECRET_KEY="************"
export ALICLOUD_REGION="cn-beijing"
Note: Specify the value of the ALICLOUD_REGION variable based on your business requirements.
Method 2: Specify identity information in the provider section of the configuration file.
provider "alicloud" { access_key = "************" secret_key = "************" region = "cn-beijing" }
Note: Specify the value of the region parameter based on your business requirements.
Limits
Prometheus instances for Container Service: ServiceMonitor, PodMonitor, custom jobs, and health inspection agents are supported.
Prometheus instances for ECS: Only custom jobs and health inspection agents are supported.
Health inspection agents:
The Status parameter is not supported.
The name of a health inspection agent must be in the format {Custom name}-{tcp|http|ping}-blackbox. For example, an agent named xxx-tcp-blackbox performs a TCP inspection.
Prometheus instances for ECS are fully managed. Therefore, the namespace of an agent must be either empty or in the fixed vpcId-userId format, for example, vpc-0jl4q1q2of2tagvwxxxx-11032353609xxxx.
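For example, the following Probe metadata is a minimal sketch that satisfies both the naming rule and the namespace rule for a Prometheus instance for ECS. The custom name, VPC ID, and user ID are placeholders:
metadata:
  name: myprobe-tcp-blackbox                            # Format: {Custom name}-{tcp|http|ping}-blackbox. Indicates a TCP inspection.
  namespace: vpc-0jl4q1q2of2tagvwxxxx-11032353609xxxx   # Fixed vpcId-userId format.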
Add monitoring settings to a Prometheus instance
Add a ServiceMonitor
Create a working directory and a file named main.tf in the directory.
provider "alicloud" { }
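The alicloud_arms_prometheus_monitoring resource used in the following steps is available only in recent versions of the Alibaba Cloud provider. Optionally, you can pin the provider version in main.tf. The following is a minimal sketch; the version constraint is an assumption, so check the provider changelog for the release that introduced the resource:
terraform {
  required_providers {
    alicloud = {
      source  = "aliyun/alicloud"
      version = ">= 1.209.0"  # Assumed minimum version. Verify against the provider changelog.
    }
  }
}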
Run the following command to initialize the runtime environment for Terraform:
terraform init
Expected output:
Initializing the backend...

Initializing provider plugins...
- Checking for available provider plugins...
- Downloading plugin for provider "alicloud" (hashicorp/alicloud) 1.90.1...
...

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
Import the monitoring resources.
Add the monitoring resources to the main.tf file.
# ServiceMonitor configurations of the Prometheus instance.
resource "alicloud_arms_prometheus_monitoring" "myServiceMonitor1" {
  cluster_id  = "c77e1106f429e4b46b0ee1720cxxxxx"  # The ID of the Prometheus instance.
  status      = "run"                              # The status of the ServiceMonitor.
  type        = "serviceMonitor"
  config_yaml = <<-EOT
  apiVersion: monitoring.coreos.com/v1
  kind: ServiceMonitor
  metadata:
    name: tomcat-demo              # The name of the ServiceMonitor.
    namespace: default             # The namespace where the ServiceMonitor resides.
  spec:
    endpoints:
    - interval: 30s                # The interval at which metrics are scraped.
      path: /metrics               # The HTTP path from which metrics are scraped.
      port: tomcat-monitor         # The name of the port from which metrics are scraped.
    namespaceSelector:
      any: true                    # Optional. The namespace where the Service resides.
    selector:
      matchLabels:
        app: tomcat                # Optional. The labels attached to the Service.
  EOT
}
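If you define multiple monitoring resources for the same Prometheus instance, you can declare the instance ID as a variable instead of repeating the literal string. The following optional sketch uses an illustrative variable name:
variable "prometheus_cluster_id" {
  type        = string
  description = "The ID of the Prometheus instance."
  default     = "c77e1106f429e4b46b0ee1720cxxxxx"
}

# In each resource, reference the variable instead of the literal ID:
#   cluster_id = var.prometheus_cluster_id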
Run the following command to create an execution plan:
terraform plan
Expected output:
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # alicloud_arms_prometheus_monitoring.myServiceMonitor1 will be created
  + resource "alicloud_arms_prometheus_monitoring" "myServiceMonitor1" {
      + cluster_id      = "c77e1106f429e4b46b0ee1720cxxxxx"
      + id              = (known after apply)
      + monitoring_name = (known after apply)
      + status          = "run"
      + type            = "serviceMonitor"
      + config_yaml     = <<-EOT
            apiVersion: monitoring.coreos.com/v1
            kind: ServiceMonitor
            metadata:
              name: tomcat-demo
              namespace: default
            spec:
              endpoints:
              - interval: 30s
                path: /metrics
                port: tomcat-monitor
              namespaceSelector:
                any: true
              selector:
                matchLabels:
                  app: tomcat
        EOT
    }

Plan: 1 to add, 0 to change, 0 to destroy.
Run the following command to create a ServiceMonitor:
terraform apply
Expected output:
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # alicloud_arms_prometheus_monitoring.myServiceMonitor1 will be created
  + resource "alicloud_arms_prometheus_monitoring" "myServiceMonitor1" {
      + cluster_id      = "c77e1106f429e4b46b0ee1720c9xxxxx"
      + id              = (known after apply)
      + monitoring_name = (known after apply)
      + status          = "run"
      + type            = "serviceMonitor"
      + config_yaml     = <<-EOT
            apiVersion: monitoring.coreos.com/v1
            kind: ServiceMonitor
            metadata:
              name: tomcat-demo
              namespace: default
            spec:
              endpoints:
              - interval: 30s
                path: /metrics
                port: tomcat-monitor
              namespaceSelector:
                any: true
              selector:
                matchLabels:
                  app: tomcat
        EOT
    }

Plan: 1 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes
Enter yes at the prompt. After the command is run, the ServiceMonitor is created for the current Prometheus instance.
Verify the result
You can log on to the Managed Service for Prometheus console and view the configurations of the ServiceMonitor in the Integration Center of the Prometheus instance. To do this, perform the following steps:
Log on to the Managed Service for Prometheus console.
In the left-side navigation pane, click Instances.
Click the name of the Prometheus instance that you want to manage to go to the Integration Center page.
Click the custom component in the Installed section. In the panel that appears, click the Service Discovery Configurations tab to view the configurations of the ServiceMonitor.
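To pause the ServiceMonitor later without deleting it, you can change the status argument in main.tf and apply the change again. This sketch assumes that stop is the status value that pauses collection; check the resource documentation for the valid values:
# In main.tf, change:
#   status = "run"
# to:
#   status = "stop"  # Assumed value that pauses the ServiceMonitor.
# Then run:
terraform apply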
Add a PodMonitor
Create a working directory and a file named main.tf in the directory.
provider "alicloud" { }
Run the following command to initialize the runtime environment for Terraform:
terraform init
Expected output:
Initializing the backend...

Initializing provider plugins...
- Checking for available provider plugins...
- Downloading plugin for provider "alicloud" (hashicorp/alicloud) 1.90.1...
...

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
Import the monitoring resources.
Add the monitoring resources to the main.tf file.
# PodMonitor configurations of the Prometheus instance.
resource "alicloud_arms_prometheus_monitoring" "myPodMonitor1" {
  cluster_id  = "c77e1106f429e4b46b0ee1720cxxxxx"  # The ID of the Prometheus instance.
  status      = "run"                              # The status of the PodMonitor.
  type        = "podMonitor"
  config_yaml = <<-EOT
  apiVersion: "monitoring.coreos.com/v1"
  kind: "PodMonitor"
  metadata:
    name: "podmonitor-demo"        # The name of the PodMonitor.
    namespace: "default"           # The namespace where the PodMonitor resides.
  spec:
    namespaceSelector:
      any: true                    # Optional. The namespace where the pod resides.
    podMetricsEndpoints:
    - interval: "30s"              # The interval at which metrics are scraped.
      path: "/metrics"             # The HTTP path from which metrics are scraped.
      port: "tomcat-monitor"       # The name of the port from which metrics are scraped.
    selector:
      matchLabels:
        app: "nginx2-exporter"     # Optional. The labels attached to the pod.
  EOT
}
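Optionally, you can expose the ID of the PodMonitor monitoring resource, which appears as (known after apply) in the execution plan, as a Terraform output. A minimal sketch:
output "podmonitor_id" {
  description = "The ID of the PodMonitor monitoring resource."
  value       = alicloud_arms_prometheus_monitoring.myPodMonitor1.id
}
After terraform apply completes, run terraform output podmonitor_id to print the ID.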
Run the following command to create an execution plan:
terraform plan
Expected output:
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # alicloud_arms_prometheus_monitoring.myPodMonitor1 will be created
  + resource "alicloud_arms_prometheus_monitoring" "myPodMonitor1" {
      + cluster_id      = "c77e1106f429e4b46b0ee1720cxxxxx"
      + id              = (known after apply)
      + monitoring_name = (known after apply)
      + status          = "run"
      + type            = "podMonitor"
      + config_yaml     = <<-EOT
            apiVersion: "monitoring.coreos.com/v1"
            kind: "PodMonitor"
            metadata:
              name: "podmonitor-demo"
              namespace: "default"
            spec:
              namespaceSelector:
                any: true
              podMetricsEndpoints:
              - interval: "30s"
                path: "/metrics"
                port: "tomcat-monitor"
              selector:
                matchLabels:
                  app: "nginx2-exporter"
        EOT
    }

Plan: 1 to add, 0 to change, 0 to destroy.
Run the following command to create a PodMonitor:
terraform apply
Expected output:
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # alicloud_arms_prometheus_monitoring.myPodMonitor1 will be created
  + resource "alicloud_arms_prometheus_monitoring" "myPodMonitor1" {
      + cluster_id      = "c77e1106f429e4b46b0ee1720c9xxxxx"
      + id              = (known after apply)
      + monitoring_name = (known after apply)
      + status          = "run"
      + type            = "podMonitor"
      + config_yaml     = <<-EOT
            apiVersion: "monitoring.coreos.com/v1"
            kind: "PodMonitor"
            metadata:
              name: "podmonitor-demo"
              namespace: "default"
            spec:
              namespaceSelector:
                any: true
              podMetricsEndpoints:
              - interval: "30s"
                path: "/metrics"
                port: "tomcat-monitor"
              selector:
                matchLabels:
                  app: "nginx2-exporter"
        EOT
    }

Plan: 1 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes
Enter yes at the prompt. After the command is run, the PodMonitor is created for the current Prometheus instance.
Verify the result
You can log on to the Managed Service for Prometheus console and view the configurations of the PodMonitor in the Integration Center of the Prometheus instance. To do this, perform the following steps:
Log on to the Managed Service for Prometheus console.
In the left-side navigation pane, click Instances.
Click the name of the Prometheus instance that you want to manage to go to the Integration Center page.
Click the custom component in the Installed section. In the panel that appears, click the Service Discovery Configurations tab to view the configurations of the PodMonitor.
Add a custom job
Create a working directory and a file named main.tf in the directory.
provider "alicloud" { }
Run the following command to initialize the runtime environment for Terraform:
terraform init
Expected output:
Initializing the backend...

Initializing provider plugins...
- Checking for available provider plugins...
- Downloading plugin for provider "alicloud" (hashicorp/alicloud) 1.90.1...
...

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
Import the monitoring resources.
Add the monitoring resources to the main.tf file.
# Job configurations of the Prometheus instance.
resource "alicloud_arms_prometheus_monitoring" "myCustomJob1" {
  cluster_id  = "c77e1106f429e4b46b0ee1720cxxxxx"  # The ID of the Prometheus instance.
  status      = "run"                              # The status of the custom job.
  type        = "customJob"
  config_yaml = <<-EOT
  scrape_configs:
  - job_name: prometheus1          # The name of the custom job.
    honor_timestamps: false
    honor_labels: false
    scheme: http
    metrics_path: /metric
    static_configs:
    - targets:
      - 127.0.0.1:9090
  EOT
}
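The config_yaml of a custom job uses the scrape_configs syntax of the open source Prometheus configuration file. Assuming the instance accepts standard scrape_configs fields, you can, for example, attach static labels to the targets. The env label below is illustrative:
static_configs:
- targets:
  - 127.0.0.1:9090
  labels:
    env: prod  # Illustrative label attached to all metrics scraped from these targets.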
Run the following command to create an execution plan:
terraform plan
Expected output:
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # alicloud_arms_prometheus_monitoring.myCustomJob1 will be created
  + resource "alicloud_arms_prometheus_monitoring" "myCustomJob1" {
      + cluster_id      = "c77e1106f429e4b46b0ee1720cxxxxx"
      + id              = (known after apply)
      + monitoring_name = (known after apply)
      + status          = "run"
      + type            = "customJob"
      + config_yaml     = <<-EOT
            scrape_configs:
            - job_name: prometheus1
              honor_timestamps: false
              honor_labels: false
              scheme: http
              metrics_path: /metric
              static_configs:
              - targets:
                - 127.0.0.1:9090
        EOT
    }

Plan: 1 to add, 0 to change, 0 to destroy.
Run the following command to create a custom job:
terraform apply
Expected output:
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # alicloud_arms_prometheus_monitoring.myCustomJob1 will be created
  + resource "alicloud_arms_prometheus_monitoring" "myCustomJob1" {
      + cluster_id      = "c77e1106f429e4b46b0ee1720c9xxxxx"
      + id              = (known after apply)
      + monitoring_name = (known after apply)
      + status          = "run"
      + type            = "customJob"
      + config_yaml     = <<-EOT
            scrape_configs:
            - job_name: prometheus1
              honor_timestamps: false
              honor_labels: false
              scheme: http
              metrics_path: /metric
              static_configs:
              - targets:
                - 127.0.0.1:9090
        EOT
    }

Plan: 1 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes
Enter yes at the prompt. After the command is run, the custom job is created for the current Prometheus instance.
Verify the result
You can log on to the Managed Service for Prometheus console and view the configurations of the custom job in the Integration Center of the Prometheus instance. To do this, perform the following steps:
Log on to the Managed Service for Prometheus console.
In the left-side navigation pane, click Instances.
Click the name of the Prometheus instance that you want to manage to go to the Integration Center page.
Click the custom component in the Installed section. In the panel that appears, click the Service Discovery Configurations tab to view the configurations of the custom job.
Configure a health inspection agent
Create a working directory and a file named main.tf in the directory.
provider "alicloud" { }
Run the following command to initialize the runtime environment for Terraform:
terraform init
Expected output:
Initializing the backend...

Initializing provider plugins...
- Checking for available provider plugins...
- Downloading plugin for provider "alicloud" (hashicorp/alicloud) 1.90.1...
...

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
Import the monitoring resources.
Add the monitoring resources to the main.tf file.
# Agent configurations of the Prometheus instance.
resource "alicloud_arms_prometheus_monitoring" "myProbe1" {
  cluster_id  = "c77e1106f429e4b46b0ee1720cxxxxx"  # The ID of the Prometheus instance.
  type        = "probe"
  config_yaml = <<-EOT
  apiVersion: monitoring.coreos.com/v1
  kind: Probe
  metadata:
    name: name1-tcp-blackbox       # The name of the agent. Format: {Custom name}-{tcp|http|ping}-blackbox.
    namespace: arms-prom           # Optional.
  spec:
    interval: 30s                  # The interval at which health inspections are performed.
    jobName: blackbox              # Keep the default value.
    module: tcp_connect
    prober:                        # The configuration of the prober. Keep the default values.
      path: /blackbox/probe
      scheme: http
      url: 'localhost:9335'
    targets:
      staticConfig:
        static:
        - 'arms-prom-admin.arms-prom:9335'   # The address on which health inspections are performed.
  EOT
}
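For an HTTP inspection, the structure is the same, but the agent name must use the -http-blackbox suffix described in the Limits section. The following sketch is illustrative: the module name http_2xx is the conventional blackbox exporter module for HTTP 200 checks and is an assumption here, as is the target URL:
resource "alicloud_arms_prometheus_monitoring" "myProbe2" {
  cluster_id  = "c77e1106f429e4b46b0ee1720cxxxxx"  # The ID of the Prometheus instance.
  type        = "probe"
  config_yaml = <<-EOT
  apiVersion: monitoring.coreos.com/v1
  kind: Probe
  metadata:
    name: name1-http-blackbox      # The -http-blackbox suffix indicates an HTTP inspection.
    namespace: arms-prom           # Optional.
  spec:
    interval: 30s
    jobName: blackbox
    module: http_2xx               # Assumed blackbox exporter module for HTTP 200 checks.
    prober:
      path: /blackbox/probe
      scheme: http
      url: 'localhost:9335'
    targets:
      staticConfig:
        static:
        - 'http://example.com'     # Illustrative HTTP endpoint to inspect.
  EOT
}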
Run the following command to create an execution plan:
terraform plan
Expected output:
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # alicloud_arms_prometheus_monitoring.myProbe1 will be created
  + resource "alicloud_arms_prometheus_monitoring" "myProbe1" {
      + cluster_id      = "c77e1106f429e4b46b0ee1720cxxxxx"
      + id              = (known after apply)
      + monitoring_name = (known after apply)
      + type            = "probe"
      + config_yaml     = <<-EOT
            apiVersion: monitoring.coreos.com/v1
            kind: Probe
            metadata:
              name: name1-tcp-blackbox
              namespace: arms-prom
            spec:
              interval: 30s
              jobName: blackbox
              module: tcp_connect
              prober:
                path: /blackbox/probe
                scheme: http
                url: 'localhost:9335'
              targets:
                staticConfig:
                  static:
                  - 'arms-prom-admin.arms-prom:9335'
        EOT
    }

Plan: 1 to add, 0 to change, 0 to destroy.
Run the following command to create a health inspection agent:
terraform apply
Expected output:
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # alicloud_arms_prometheus_monitoring.myProbe1 will be created
  + resource "alicloud_arms_prometheus_monitoring" "myProbe1" {
      + cluster_id      = "c77e1106f429e4b46b0ee1720c9xxxxx"
      + id              = (known after apply)
      + monitoring_name = (known after apply)
      + type            = "probe"
      + config_yaml     = <<-EOT
            apiVersion: monitoring.coreos.com/v1
            kind: Probe
            metadata:
              name: name1-tcp-blackbox
              namespace: arms-prom
            spec:
              interval: 30s
              jobName: blackbox
              module: tcp_connect
              prober:
                path: /blackbox/probe
                scheme: http
                url: 'localhost:9335'
              targets:
                staticConfig:
                  static:
                  - 'arms-prom-admin.arms-prom:9335'
        EOT
    }

Plan: 1 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes
Enter yes at the prompt. After the command is run, the health inspection agent is created for the current Prometheus instance.
Verify the result
You can log on to the Managed Service for Prometheus console and view the configurations of the agent in the Integration Center of the Prometheus instance. To do this, perform the following steps:
Log on to the Managed Service for Prometheus console.
In the left-side navigation pane, click Instances.
Click the name of the Prometheus instance that you want to manage to go to the Integration Center page.
Click the Blackbox component in the Installed section. In the panel that appears, click the Health Check tab to view the configurations of the agent.
Delete monitoring settings from a Prometheus instance
Procedure
You can run the following command to delete the monitoring settings created by using Terraform:
terraform destroy
Expected output:
...
Do you really want to destroy all resources?
Terraform will destroy all your managed infrastructure, as shown above.
There is no undo. Only 'yes' will be accepted to confirm.
Enter a value: yes
...
Destroy complete! Resources: 1 destroyed.
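The terraform destroy command removes every resource that is managed by the configuration in the working directory. If the directory defines multiple monitoring resources and you want to delete only one of them, you can use the -target option of Terraform. The resource address below is illustrative:
terraform destroy -target=alicloud_arms_prometheus_monitoring.myServiceMonitor1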
Verify the result
You can log on to the Managed Service for Prometheus console and check whether the monitoring settings are deleted in the Integration Center of the Prometheus instance.
Log on to the Managed Service for Prometheus console.
In the left-side navigation pane, click Instances.
Click the name of the Prometheus instance that you want to manage to go to the Integration Center page.
Click the custom or Blackbox component in the Installed section. In the panel that appears, click the Service Discovery Configurations or Health Check tab to check whether the monitoring settings are deleted.