This topic describes how to define application Service Level Objectives (SLOs) by using ASM.
Prerequisites
- The cluster is added to an ASM instance of version 1.15.3 or later.
- Auto-injection is enabled. For more information, see Configure a sidecar injection policy.
Background
A Service Level Objective (SLO) is a target value or range of values for a service level, measured by one or more Service Level Indicators (SLIs). You can manually define SLOs based on Prometheus metrics, but the process can be complex. ASM simplifies this process by allowing you to generate SLOs and corresponding alert rules from the ASM console. For more information, see Service Level Objective (SLO) overview and SLO CRD field reference.
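To make the relationship between an SLI and an SLO concrete, the following minimal sketch computes an availability SLI from a list of HTTP response codes and compares it with a target. It assumes that 5xx and 429 responses count as errors, which mirrors the response_code=~"(5..|429)" selector in the generated rules shown later in this topic. The function names and the sample data are hypothetical and are not part of ASM.

```python
import re

# Assumption: 5xx and 429 responses count as errors, matching the
# response_code=~"(5..|429)" selector in the generated Prometheus rules.
ERROR_CODE = re.compile(r"(5..|429)")


def availability(response_codes):
    """Return the fraction of requests that did not fail (the SLI)."""
    total = len(response_codes)
    if total == 0:
        # No traffic: treat the error ratio as 0, similar to the
        # "OR on() vector(0)" fallback in the generated rules.
        return 1.0
    # PromQL regex matchers are fully anchored, so use fullmatch here.
    errors = sum(1 for code in response_codes if ERROR_CODE.fullmatch(str(code)))
    return 1 - errors / total


def meets_slo(sli, target_percent):
    """Compare an SLI (0..1) with an SLO target given as a percentage."""
    return sli >= target_percent / 100


# Hypothetical sample: 1,000 requests, 8 of which failed.
codes = [200] * 992 + [500] * 5 + [503] * 2 + [429]
sli = availability(codes)
print(f"availability SLI: {sli:.3%}")
print("meets 99% SLO:", meets_slo(sli, 99))
```

In practice the SLI is computed by Prometheus from the istio_requests_total metric rather than from a list in memory. The sketch only illustrates the arithmetic.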
Preparations
Deploy the httpbin application in an ACK cluster and configure its virtual service and gateway rules.
Deploy httpbin in an ACK cluster
- Create the httpbin.yaml file with the following content.
##################################################################################################
# httpbin service
##################################################################################################
apiVersion: v1
kind: ServiceAccount
metadata:
  name: httpbin
---
apiVersion: v1
kind: Service
metadata:
  name: httpbin
  labels:
    app: httpbin
    service: httpbin
spec:
  ports:
  - name: http
    port: 8000
    targetPort: 80
  selector:
    app: httpbin
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpbin
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpbin
      version: v1
  template:
    metadata:
      labels:
        app: httpbin
        version: v1
    spec:
      serviceAccountName: httpbin
      containers:
      - image: docker.io/kennethreitz/httpbin
        imagePullPolicy: IfNotPresent
        name: httpbin
        ports:
        - containerPort: 80
- Connect to the ACK cluster by using kubectl to manage your cluster and its applications. For more information, see Connect to a cluster by using kubectl.
- Run the following command to deploy httpbin in the ACK cluster.
kubectl apply -f httpbin.yaml
Configure virtual service and gateway rules
- Create the httpbin-gateway.yaml file with the following content.
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: httpbin-gateway
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "*"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpbin
spec:
  hosts:
  - "*"
  gateways:
  - httpbin-gateway
  http:
  - route:
    - destination:
        host: httpbin
        port:
          number: 8000
- Connect to the ASM instance by using kubectl to manage the instance. For more information, see Use kubectl on the control plane to access Istio resources.
- Run the following command to deploy the virtual service and gateway rules.
kubectl apply -f httpbin-gateway.yaml
- In the address bar of your browser, enter http://{IP address of the ingress gateway service}.
The httpbin application page appears, indicating a successful deployment.
Define SLO configuration
This section shows how to create a service availability SLO for the httpbin service in the default namespace with a target value of 99%, a 30-day duration, and two alert levels: Page and Ticket.
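As a worked example of what a 99% target over a 30-day duration implies, the following sketch derives the error budget. The 99 and 30 values come from this example. The request volume is a hypothetical figure chosen only for illustration, and the helper names are not part of ASM.

```python
def error_budget_ratio(target_percent):
    """Fraction of requests that may fail without violating the SLO."""
    return (100 - target_percent) / 100


def allowed_failures(total_requests, target_percent):
    """Number of failed requests the budget tolerates for a given volume."""
    return total_requests * error_budget_ratio(target_percent)


def full_outage_hours(window_days, target_percent):
    """Hours of complete unavailability that would consume the whole budget,
    assuming traffic is spread evenly over the window."""
    return window_days * 24 * error_budget_ratio(target_percent)


TARGET_PERCENT = 99   # SLO target used in this example
WINDOW_DAYS = 30      # SLO duration used in this example

budget = error_budget_ratio(TARGET_PERCENT)
print(f"error budget ratio: {budget:.2%}")
# Hypothetical volume: 1,000,000 requests during the 30-day window.
print(f"allowed failed requests: {allowed_failures(1_000_000, TARGET_PERCENT):.0f}")
print(f"equivalent full outage: {full_outage_hours(WINDOW_DAYS, TARGET_PERCENT):.1f} hours")
```

The budget ratio computed here corresponds to the slo:error_budget:ratio recording rule, vector(1-0.99), in the generated Prometheus rules.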
- Log on to the ASM console. In the left-side navigation pane, navigate to the Mesh Management page.
- On the Mesh Management page, click the name of the ASM instance. In the left-side navigation pane, navigate to the SLO Configuration page.
- On the SLO Configuration page, select the Namespace where the target service is located. In this example, the namespace is default. In the row for the httpbin service, click Create.
- On the Create page, in the Basic Information section, select 30 days for Duration.
- Click the SLO Rule tab. Set Name to asm-slo, Target to 99, and Plugin Type to availability. Turn on the Enable Alert Rule switch, set Alert Rule Name to asm-alert, and then turn on the Enable Page-level Alert Rule and Enable Ticket-level Alert Rule switches. In the Basic Information section, select default for Namespace, enter httpbin for , and set to .
- Optional: At the bottom of the page, click Preview to review the configuration. After confirming the information is correct, click Confirm.
For more information about the fields in the configuration file, see SLO CRD field reference.
- When you are finished, click Create at the bottom of the page.
Automatically generated Prometheus rules
---
# Code generated by Alibaba Cloud Service Mesh (ASM).
# DO NOT EDIT.
groups:
- name: asm-slo-sli-recordings-httpbin-asm-slo
  rules:
  - record: slo:sli_error:ratio_rate5m
    expr: "(\n(\n sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\",response_code=~\"(5..|429)\"
      }[5m])) \n / \n (sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\"
      }[5m])) > 0)\n) OR on() vector(0)\n)"
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
      slo_window: 5m
  - record: slo:sli_error:ratio_rate30m
    expr: "(\n(\n sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\",response_code=~\"(5..|429)\"
      }[30m])) \n / \n (sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\"
      }[30m])) > 0)\n) OR on() vector(0)\n)"
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
      slo_window: 30m
  - record: slo:sli_error:ratio_rate1h
    expr: "(\n(\n sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\",response_code=~\"(5..|429)\"
      }[1h])) \n / \n (sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\"
      }[1h])) > 0)\n) OR on() vector(0)\n)"
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
      slo_window: 1h
  - record: slo:sli_error:ratio_rate2h
    expr: "(\n(\n sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\",response_code=~\"(5..|429)\"
      }[2h])) \n / \n (sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\"
      }[2h])) > 0)\n) OR on() vector(0)\n)"
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
      slo_window: 2h
  - record: slo:sli_error:ratio_rate6h
    expr: "(\n(\n sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\",response_code=~\"(5..|429)\"
      }[6h])) \n / \n (sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\"
      }[6h])) > 0)\n) OR on() vector(0)\n)"
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
      slo_window: 6h
  - record: slo:sli_error:ratio_rate1d
    expr: "(\n(\n sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\",response_code=~\"(5..|429)\"
      }[1d])) \n / \n (sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\"
      }[1d])) > 0)\n) OR on() vector(0)\n)"
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
      slo_window: 1d
  - record: slo:sli_error:ratio_rate3d
    expr: "(\n(\n sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\",response_code=~\"(5..|429)\"
      }[3d])) \n / \n (sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\"
      }[3d])) > 0)\n) OR on() vector(0)\n)"
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
      slo_window: 3d
  - record: slo:sli_error:ratio_rate30d
    expr: |
      sum_over_time(slo:sli_error:ratio_rate5m{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"}[30d])
      / ignoring (slo_window)
      count_over_time(slo:sli_error:ratio_rate5m{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"}[30d])
    labels:
      slo_window: 30d
- name: asm-slo-meta-recordings-httpbin-asm-slo
  rules:
  - record: slo:objective:ratio
    expr: vector(0.99)
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
  - record: slo:error_budget:ratio
    expr: vector(1-0.99)
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
  - record: slo:time_period:days
    expr: vector(30)
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
  - record: slo:current_burn_rate:ratio
    expr: |
      slo:sli_error:ratio_rate5m{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"}
      / on(slo_id, asm_slo, slo_service) group_left
      slo:error_budget:ratio{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"}
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
  - record: slo:period_burn_rate:ratio
    expr: |
      slo:sli_error:ratio_rate30d{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"}
      / on(slo_id, asm_slo, slo_service) group_left
      slo:error_budget:ratio{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"}
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
  - record: slo:period_error_budget_remaining:ratio
    expr: 1 - slo:period_burn_rate:ratio{asm_slo="asm-slo", slo_id="httpbin-asm-slo",
      slo_service="httpbin"}
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
  - record: asm_slo_info
    expr: vector(1)
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_mode: cli-gen-prom
      slo_objective: "99"
      slo_service: httpbin
      slo_spec: prometheus/v1
      slo_version: dev
- name: asm-slo-alerts-httpbin-asm-slo
  rules:
  - alert: asm-alert
    expr: |
      (
          (slo:sli_error:ratio_rate5m{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"} > (14.4 * 0.01))
          and ignoring (slo_window)
          (slo:sli_error:ratio_rate1h{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"} > (14.4 * 0.01))
      )
      or ignoring (slo_window)
      (
          (slo:sli_error:ratio_rate30m{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"} > (6 * 0.01))
          and ignoring (slo_window)
          (slo:sli_error:ratio_rate6h{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"} > (6 * 0.01))
      )
    labels:
      slo_severity: page
    annotations:
      summary: '{{$labels.slo_service}} {{$labels.asm_slo}} SLO error budget burn
        rate is over expected.'
      title: (page) {{$labels.slo_service}} {{$labels.asm_slo}} SLO error budget burn
        rate is too fast.
  - alert: asm-alert
    expr: |
      (
          (slo:sli_error:ratio_rate2h{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"} > (3 * 0.01))
          and ignoring (slo_window)
          (slo:sli_error:ratio_rate1d{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"} > (3 * 0.01))
      )
      or ignoring (slo_window)
      (
          (slo:sli_error:ratio_rate6h{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"} > (1 * 0.01))
          and ignoring (slo_window)
          (slo:sli_error:ratio_rate3d{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"} > (1 * 0.01))
      )
    labels:
      slo_severity: ticket
    annotations:
      summary: '{{$labels.slo_service}} {{$labels.asm_slo}} SLO error budget burn
        rate is over expected.'
      title: (ticket) {{$labels.slo_service}} {{$labels.asm_slo}} SLO error budget
        burn rate is too fast.
You can import these rules into Prometheus to apply the SLOs. For more information, see Import generated rules into Prometheus to implement SLOs.
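The two asm-alert rules above each combine a short and a long window with a burn-rate factor applied to the 0.01 error budget. The following sketch reproduces that logic outside Prometheus so that the thresholds are easier to reason about. The window pairs and factors are copied from the alert expressions. The function names and the sample error ratios are hypothetical, and the sketch is an illustration rather than the way ASM or Prometheus evaluates the rules.

```python
ERROR_BUDGET = 0.01  # 1 - 0.99, the value of slo:error_budget:ratio

# (short window, long window, burn-rate factor), copied from the alert rules.
ALERT_WINDOWS = {
    "page": [("5m", "1h", 14.4), ("30m", "6h", 6)],
    "ticket": [("2h", "1d", 3), ("6h", "3d", 1)],
}


def firing_severities(error_ratios, budget=ERROR_BUDGET):
    """Return the severities whose alert condition holds.

    error_ratios maps a window name such as "5m" to the error ratio recorded
    for that window (the value of slo:sli_error:ratio_rate<window>).
    """
    firing = []
    for severity, pairs in ALERT_WINDOWS.items():
        for short_w, long_w, factor in pairs:
            threshold = factor * budget
            # Both windows must exceed the threshold ("and"); either pair of
            # windows is enough for the severity to fire ("or").
            if error_ratios[short_w] > threshold and error_ratios[long_w] > threshold:
                firing.append(severity)
                break
    return firing


def days_until_budget_exhausted(burn_rate, period_days=30):
    """Days a sustained burn rate takes to use up the whole period budget."""
    return period_days / burn_rate


# Hypothetical error ratios for a sudden spike and for a slow, steady burn.
fast_burn = {"5m": 0.20, "30m": 0.10, "1h": 0.15, "2h": 0.02,
             "6h": 0.02, "1d": 0.005, "3d": 0.002}
slow_burn = {"5m": 0.02, "30m": 0.02, "1h": 0.02, "2h": 0.04,
             "6h": 0.02, "1d": 0.035, "3d": 0.012}

print("fast burn fires:", firing_severities(fast_burn))
print("slow burn fires:", firing_severities(slow_burn))
print(f"days to exhaust budget at 14.4x: {days_until_budget_exhausted(14.4):.2f}")
```

Requiring both a short and a long window to exceed the threshold means that an alert needs a sustained problem that is also still happening, which reduces noise from brief spikes and from incidents that have already recovered.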