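When you need to manage and monitor the service level of an application, you can configure service level objectives (SLOs) and the corresponding alert rules in the ASM console so that the application runs at the expected service level. When the error rate of the application reaches or exceeds a preset threshold, ASM sends alerts of different levels according to the severity of the fault, which improves the efficiency and responsiveness of service level management.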
Prerequisites
The cluster is added to the ASM instance, and the ASM instance is version 1.15.3 or later.
Step 1: Deploy the httpbin sample application
Create a file named httpbin.yaml with the following content.
##################################################################################################
# httpbin service
##################################################################################################
apiVersion: v1
kind: ServiceAccount
metadata:
  name: httpbin
---
apiVersion: v1
kind: Service
metadata:
  name: httpbin
  labels:
    app: httpbin
    service: httpbin
spec:
  ports:
  - name: http
    port: 8000
    targetPort: 80
  selector:
    app: httpbin
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpbin
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpbin
      version: v1
  template:
    metadata:
      labels:
        app: httpbin
        version: v1
    spec:
      serviceAccountName: httpbin
      containers:
      - image: docker.io/kennethreitz/httpbin
        imagePullPolicy: IfNotPresent
        name: httpbin
        ports:
        - containerPort: 80
Use kubectl to connect to the ACK cluster, and run the following command to deploy httpbin in the ACK cluster.
For information about how to connect to an ACK cluster by using kubectl, see Connect to a cluster by using kubectl.
kubectl apply -f httpbin.yaml
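Step 2: Configure the virtual service and gateway rule
Create a file named httpbin-gateway.yaml with the following content.
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: httpbin-gateway
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "*"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpbin
spec:
  hosts:
  - "*"
  gateways:
  - httpbin-gateway
  http:
  - route:
    - destination:
        host: httpbin
        port:
          number: 8000
Use kubectl to connect to the ASM instance, and run the following command to deploy the virtual service and gateway rule.
For information about how to connect to an ASM instance by using kubectl, see Use kubectl on the control plane to access Istio resources.
kubectl apply -f httpbin-gateway.yaml
In the address bar of your browser, enter http://{IP address of the ingress gateway}. For information about how to obtain the gateway IP address, see Obtain the IP address of the ingress gateway. If the httpbin page appears, the httpbin application is deployed successfully.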
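The browser check above can also be scripted. The following is a minimal sketch, not part of the ASM tooling: the helper names gateway_url and is_reachable are hypothetical, and 192.0.2.10 in the usage note is a documentation-range placeholder for your ingress gateway IP address.

```python
import urllib.error
import urllib.request


def gateway_url(gateway_ip: str, path: str = "/") -> str:
    """Build the URL of a path served through the ingress gateway (HTTP, port 80)."""
    return f"http://{gateway_ip}{path}"


def is_reachable(url: str, timeout: float = 5.0, opener=urllib.request.urlopen) -> bool:
    """Return True if the URL answers with HTTP 200, False on any HTTP or network error.

    The opener parameter exists only so the function can be exercised without
    a real network connection; by default it is urllib.request.urlopen.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # HTTPError (4xx/5xx), connection failures, and timeouts all end up here.
        return False
```

For example, is_reachable(gateway_url("192.0.2.10")) should return True once the gateway routes traffic to httpbin, assuming the placeholder is replaced with the real gateway address.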
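Step 3: Define the SLO configuration
This topic creates a service availability SLO for the httpbin service in the default namespace. The objective is 99%, the period is 30 days, and alerts are configured at two levels: Page and Ticket. For the related SLO concepts, see Overview of service level objectives (SLOs).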
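To make the 99% objective concrete, the sketch below works out the error budget it implies. It is only an illustration: the function names are hypothetical, and the request count in the usage note is an assumed figure, not a measurement.

```python
def error_budget_ratio(objective_percent: float) -> float:
    """Fraction of requests that may fail under the objective, e.g. 99 -> 0.01."""
    return (100.0 - objective_percent) / 100.0


def allowed_failed_requests(total_requests: int, objective_percent: float) -> float:
    """Number of failed requests the error budget tolerates over the SLO period."""
    return total_requests * error_budget_ratio(objective_percent)


def budget_remaining(observed_error_ratio: float, objective_percent: float) -> float:
    """Share of the error budget still unspent: 1 - (error ratio / error budget).

    A negative value means the budget for the period is already overspent.
    """
    return 1.0 - observed_error_ratio / error_budget_ratio(objective_percent)
```

Under these assumptions, a service that receives 1,000,000 requests in the 30-day period can fail about 10,000 of them before the 99% objective is missed, and an observed error ratio of 0.4% leaves roughly 60% of the budget.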
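Log on to the ASM console. In the left-side navigation pane, choose Service Mesh > Mesh Management.
On the Mesh Management page, click the name of the target instance. In the left-side navigation pane, choose Observability Management Center > SLO Configuration.
At the top of the SLO Configuration page, select the namespace in which the target service resides (default in this topic). Then click Create to the right of the target service httpbin.
In the Basic Information section of the creation page, set the period to 30 days.
Click SLO Rule, set the name to asm-slo, select availability as the plug-in type, and set the objective to 99. Turn on the Enable Alert Rule switch, set the alert rule name to asm-alert, and then turn on both the Enable Emergency-level Alert Rule and Enable Warning-level Alert Rule switches.

Optional: At the bottom of the page, click Preview to view the configuration. After you confirm that the configuration is correct, click OK.
For descriptions of the fields in the configuration file, see SLO CRD field descriptions.
After the configuration is complete, click Create at the bottom of the page.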
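Step 4: Automatically generate Prometheus rules
After the SLO is configured, go to the SLO Configuration page and click View Prometheus Rules to the right of the target service httpbin to view the generated rules.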
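The alert rules in the generated sample pair a short window with a long window and fire only when the error ratio in both windows exceeds a burn-rate multiple of the 1% error budget. The sketch below mirrors that logic using the window pairs and multipliers that appear in the sample (14.4, 6, 3, and 1). The class and function names are hypothetical, and the code is a simplified model for reasoning about the thresholds, not how ASM or Prometheus evaluates the rules.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

ERROR_BUDGET = 0.01      # 1 - 0.99, the literal used in the sample thresholds
PERIOD_HOURS = 30 * 24   # 30-day SLO period


@dataclass(frozen=True)
class WindowPair:
    short: str          # short window label, e.g. "5m"
    long: str           # long window label, e.g. "1h"
    long_hours: float   # length of the long window in hours
    burn_rate: float    # multiple of the error budget that triggers the alert


PAGE_PAIRS: Tuple[WindowPair, ...] = (
    WindowPair("5m", "1h", 1, 14.4),
    WindowPair("30m", "6h", 6, 6.0),
)
TICKET_PAIRS: Tuple[WindowPair, ...] = (
    WindowPair("2h", "1d", 24, 3.0),
    WindowPair("6h", "3d", 72, 1.0),
)


def pair_fires(ratios: Dict[str, float], pair: WindowPair) -> bool:
    """True when both windows of the pair exceed burn_rate * error budget."""
    threshold = pair.burn_rate * ERROR_BUDGET
    return ratios[pair.short] > threshold and ratios[pair.long] > threshold


def firing_severities(ratios: Dict[str, float]) -> List[str]:
    """Return the severities ("page", "ticket") whose condition holds."""
    fired = []
    if any(pair_fires(ratios, p) for p in PAGE_PAIRS):
        fired.append("page")
    if any(pair_fires(ratios, p) for p in TICKET_PAIRS):
        fired.append("ticket")
    return fired


def budget_consumed(pair: WindowPair) -> float:
    """Share of the 30-day budget spent if burn_rate is sustained for the long window."""
    return pair.burn_rate * pair.long_hours / PERIOD_HOURS
```

Here ratios maps each window label to the error ratio recorded for that window. In this model, a burn rate of 14.4 sustained for one hour spends about 2% of the 30-day budget, which is why that pair is treated as page-level.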

Sample Prometheus rules
groups:
- name: asm-slo-sli-recordings-httpbin-asm-slo
  rules:
  - record: slo:sli_error:ratio_rate5m
    expr: "(\n(\n sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\",response_code=~\"(5..|429)\"
      }[5m])) \n / \n (sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\"
      }[5m])) > 0)\n) OR on() vector(0)\n)"
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
      slo_window: 5m
  - record: slo:sli_error:ratio_rate30m
    expr: "(\n(\n sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\",response_code=~\"(5..|429)\"
      }[30m])) \n / \n (sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\"
      }[30m])) > 0)\n) OR on() vector(0)\n)"
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
      slo_window: 30m
  - record: slo:sli_error:ratio_rate1h
    expr: "(\n(\n sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\",response_code=~\"(5..|429)\"
      }[1h])) \n / \n (sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\"
      }[1h])) > 0)\n) OR on() vector(0)\n)"
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
      slo_window: 1h
  - record: slo:sli_error:ratio_rate2h
    expr: "(\n(\n sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\",response_code=~\"(5..|429)\"
      }[2h])) \n / \n (sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\"
      }[2h])) > 0)\n) OR on() vector(0)\n)"
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
      slo_window: 2h
  - record: slo:sli_error:ratio_rate6h
    expr: "(\n(\n sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\",response_code=~\"(5..|429)\"
      }[6h])) \n / \n (sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\"
      }[6h])) > 0)\n) OR on() vector(0)\n)"
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
      slo_window: 6h
  - record: slo:sli_error:ratio_rate1d
    expr: "(\n(\n sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\",response_code=~\"(5..|429)\"
      }[1d])) \n / \n (sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\"
      }[1d])) > 0)\n) OR on() vector(0)\n)"
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
      slo_window: 1d
  - record: slo:sli_error:ratio_rate3d
    expr: "(\n(\n sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\",response_code=~\"(5..|429)\"
      }[3d])) \n / \n (sum(rate(istio_requests_total{ destination_service_name=\"httpbin\",destination_service_namespace=\"default\"
      }[3d])) > 0)\n) OR on() vector(0)\n)"
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
      slo_window: 3d
  - record: slo:sli_error:ratio_rate30d
    expr: |
      sum_over_time(slo:sli_error:ratio_rate5m{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"}[30d])
      / ignoring (slo_window)
      count_over_time(slo:sli_error:ratio_rate5m{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"}[30d])
    labels:
      slo_window: 30d
- name: asm-slo-meta-recordings-httpbin-asm-slo
  rules:
  - record: slo:objective:ratio
    expr: vector(0.99)
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
  - record: slo:error_budget:ratio
    expr: vector(1-0.99)
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
  - record: slo:time_period:days
    expr: vector(30)
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
  - record: slo:current_burn_rate:ratio
    expr: |
      slo:sli_error:ratio_rate5m{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"}
      / on(slo_id, asm_slo, slo_service) group_left
      slo:error_budget:ratio{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"}
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
  - record: slo:period_burn_rate:ratio
    expr: |
      slo:sli_error:ratio_rate30d{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"}
      / on(slo_id, asm_slo, slo_service) group_left
      slo:error_budget:ratio{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"}
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
  - record: slo:period_error_budget_remaining:ratio
    expr: 1 - slo:period_burn_rate:ratio{asm_slo="asm-slo", slo_id="httpbin-asm-slo",
      slo_service="httpbin"}
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_service: httpbin
  - record: asm_slo_info
    expr: vector(1)
    labels:
      asm_slo: asm-slo
      slo_id: httpbin-asm-slo
      slo_mode: cli-gen-prom
      slo_objective: "99"
      slo_service: httpbin
      slo_spec: prometheus/v1
      slo_version: dev
- name: asm-slo-alerts-httpbin-asm-slo
  rules:
  - alert: asm-alert
    expr: |
      (
      (slo:sli_error:ratio_rate5m{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"} > (14.4 * 0.01))
      and ignoring (slo_window)
      (slo:sli_error:ratio_rate1h{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"} > (14.4 * 0.01))
      )
      or ignoring (slo_window)
      (
      (slo:sli_error:ratio_rate30m{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"} > (6 * 0.01))
      and ignoring (slo_window)
      (slo:sli_error:ratio_rate6h{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"} > (6 * 0.01))
      )
    labels:
      slo_severity: page
    annotations:
      summary: '{{$labels.slo_service}} {{$labels.asm_slo}} SLO error budget burn
        rate is over expected.'
      title: (page) {{$labels.slo_service}} {{$labels.asm_slo}} SLO error budget burn
        rate is too fast.
  - alert: asm-alert
    expr: |
      (
      (slo:sli_error:ratio_rate2h{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"} > (3 * 0.01))
      and ignoring (slo_window)
      (slo:sli_error:ratio_rate1d{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"} > (3 * 0.01))
      )
      or ignoring (slo_window)
      (
      (slo:sli_error:ratio_rate6h{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"} > (1 * 0.01))
      and ignoring (slo_window)
      (slo:sli_error:ratio_rate3d{asm_slo="asm-slo", slo_id="httpbin-asm-slo", slo_service="httpbin"} > (1 * 0.01))
      )
    labels:
      slo_severity: ticket
    annotations:
      summary: '{{$labels.slo_service}} {{$labels.asm_slo}} SLO error budget burn
        rate is over expected.'
      title: (ticket) {{$labels.slo_service}} {{$labels.asm_slo}} SLO error budget
        burn rate is too fast.
Next steps
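You can import the generated Prometheus rules into Prometheus to execute the SLO, and use Grafana to view SLO-related metrics. For details, see Import the generated rules into Prometheus to execute SLOs and Use Grafana to view SLOs.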