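Kruise Rollout is a standard extension component built on Kubernetes. It works with native workloads (Deployment, StatefulSet) and OpenKruise workloads (CloneSet, Advanced StatefulSet) to implement canary releases, A/B testing releases, and batch releases. This topic uses examples to describe how to use Kruise Rollout to perform grayscale releases of cloud-native applications.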
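Prerequisites
- A Kubernetes cluster is created. For more information, see Create an ACK managed cluster.
  - To use A/B testing or canary releases, the cluster must run Kubernetes 1.19 or later.
  - To use batch releases, the cluster must run Kubernetes 1.16 or later.
- kubectl-kruise is installed. For installation instructions, see kubectl-kruise.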
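The version requirements above can be checked from a script before you continue. The following is a minimal sketch and not part of the original procedure: `version_at_least` is a hypothetical helper that compares a Kubernetes version string with a required major.minor, and the version string in the example is illustrative.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 (for example "v1.22.3-aliyun.1")
# is at least the major.minor given in $2 (for example "1.19").
version_at_least() {
  local have="${1#v}"
  local want="$2"
  local have_major="${have%%.*}"
  local rest="${have#*.}"
  local have_minor="${rest%%[!0-9]*}"   # keep only the leading digits of the minor part
  local want_major="${want%%.*}"
  local want_minor="${want#*.}"
  if [ "$have_major" -ne "$want_major" ]; then
    [ "$have_major" -gt "$want_major" ]
  else
    [ "$have_minor" -ge "$want_minor" ]
  fi
}

# Example with an illustrative version string; replace it with your cluster's version.
cluster_version="v1.22.3-aliyun.1"
if version_at_least "$cluster_version" "1.19"; then
  echo "canary, A/B testing, and batch releases are available"
elif version_at_least "$cluster_version" "1.16"; then
  echo "only batch releases are available"
else
  echo "cluster version is too old for the releases described here"
fi
```

Assuming kubectl already points at the cluster, the `serverVersion.gitVersion` field of `kubectl version -o json` is one place to read the version string from.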
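Introduction to Kruise Rollout
Kruise Rollout is a progressive delivery framework open-sourced by the OpenKruise community. It supports grayscale releases that coordinate traffic shifting with instance rollout, blue-green releases, and A/B testing releases. Based on Prometheus metrics, Kruise Rollout can also automate batching and pausing during a release. It attaches to existing workloads as a non-intrusive bypass and is compatible with multiple workload types (Deployment, CloneSet, StatefulSet). For more information, see Kruise Rollout.
To use Kruise Rollout, you only need to configure one Rollout resource and apply it to the K8s cluster. Subsequent application releases and upgrades require no additional operations, and Kruise Rollout integrates with Helm and PaaS platforms at low cost. The following figure shows the architecture of a grayscale release implemented with Kruise Rollout.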
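Preparations
- Install the Kruise Rollout component.
  - Log on to the Container Service console. In the left-side navigation pane, choose Cluster List.
  - On the Cluster List page, click the name of the target cluster. In the left-side navigation pane, click Component Management.
  - On the Component Management page, click the Application Management tab. In the lower-right corner of the ack-kruise card, click Install.
  - In the dialog box that appears, confirm the information and click OK.
  Note: ack-kruise 1.8 or later supports the v1beta1 API. For more information, see API Specifications.
- Deploy the business application (Deployment and Service).
  Note: The business application uses a Deployment to deploy an echoserver service and exposes the service through Nginx Ingress.
  - Create the echoserver.yaml file.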
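The content of echoserver.yaml is not shown at this point. As a rough sketch only: the Deployment name, container name, image, `imagePullPolicy`, and `NODE_NAME` variable below are taken from the excerpts of echoserver.yaml that appear later in this topic, and the Service name and port 80 come from the Ingress backend used in Scenario 1. The `app: echoserver` labels, the replica count of 5, port 8080, and the `PORT` variable are assumptions for illustration and should be checked against the real manifest.

```shell
#!/bin/sh
# Write a sketch of echoserver.yaml (a Deployment plus a Service).
# Assumptions (not from the original): labels, replicas, port 8080, PORT env.
cat > echoserver.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echoserver
  labels:
    app: echoserver            # assumed label
spec:
  replicas: 5                  # assumed replica count
  selector:
    matchLabels:
      app: echoserver
  template:
    metadata:
      labels:
        app: echoserver
    spec:
      containers:
      - name: echoserver
        image: openkruise-registry.cn-shanghai.cr.aliyuncs.com/openkruise/demo:1.10.2
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 8080  # assumed port
        env:
        - name: NODE_NAME
          value: version1
        - name: PORT           # assumed variable
          value: "8080"
---
apiVersion: v1
kind: Service
metadata:
  name: echoserver
spec:
  selector:
    app: echoserver
  ports:
  - name: http
    port: 80                   # port referenced by the Ingress backend
    targetPort: 8080           # assumed, must match the container port
EOF

echo "wrote echoserver.yaml"
# Deploy it with: kubectl apply -f echoserver.yaml
```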
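The following sections use different examples to describe how to implement canary releases, A/B testing releases, and batch releases.
Scenario 1: Canary or A/B testing releases based on Ingress
Nginx Ingress and MSE Ingress are common ways to expose services externally. This example demonstrates how to use Kruise Rollout with Nginx Ingress or MSE Ingress to implement canary or A/B testing releases.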
- Install the Ingress component and create an Ingress for the business application.
  Nginx Ingress Controller:
  - Install Nginx Ingress Controller.
    - New cluster: When you create the cluster, select Install Nginx Ingress in the Ingress configuration section. For more information, see Create an ACK managed cluster.
    - Existing cluster: For more information about how to install Nginx Ingress Controller, see Create and use Nginx Ingress to expose services.
  - Create echoserver-ingress.yaml.

        apiVersion: networking.k8s.io/v1
        kind: Ingress
        metadata:
          name: echoserver
        spec:
          ingressClassName: nginx
          rules:
          - http:
              paths:
              - backend:
                  service:
                    name: echoserver
                    port:
                      number: 80
                path: /apis/echo
                pathType: Exact

  - Deploy the Ingress for the business application.

        kubectl apply -f echoserver-ingress.yaml

  MSE Ingress Controller:
  - Install MSE Ingress Controller and create an MseIngressConfig and an IngressClass. For more information, see Access Container Service through MSE Ingress.
  - Create echoserver-ingress.yaml.

        apiVersion: networking.k8s.io/v1
        kind: Ingress
        metadata:
          name: echoserver
        spec:
          # ingressClassName must be set to mse here.
          ingressClassName: mse
          rules:
          - http:
              paths:
              - backend:
                  service:
                    name: echoserver
                    port:
                      number: 80
                path: /apis/echo
                pathType: Exact

  - Deploy the Ingress for the business application.

        kubectl apply -f echoserver-ingress.yaml

- Verify access.
  - Obtain the external IP address.
    Nginx Ingress:

        export EXTERNAL_IP=$(kubectl get ingress echoserver -o jsonpath="{.status.loadBalancer.ingress[0].ip}")

    MSE Ingress:

        export EXTERNAL_IP=$(kubectl get ingress echoserver -o jsonpath="{.status.loadBalancer.ingress[0].hostname}")

  - Test access.

        curl http://${EXTERNAL_IP}/apis/echo

    Expected output:

        Hostname: echoserver-75d49c475c-ls2bs

        Pod Information:
            node name:      version1
            pod name:       echoserver-75d49c475c-ls2bs
            pod namespace:  default

        Server values:
            server_version=nginx: 1.13.3 - lua: 10008
        ...

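The two export commands in the access check differ only in the status field they read. The following minimal sketch tries `ip` first and falls back to `hostname`, so one command can serve both Ingress types. `ingress_address` is a hypothetical helper name, and the function assumes that kubectl is configured for the cluster.

```shell
#!/bin/sh
# Hypothetical helper: print the external address of the Ingress named in $1.
# Reads .status.loadBalancer.ingress[0].ip first and falls back to .hostname.
ingress_address() {
  name="$1"
  for field in ip hostname; do
    addr=$(kubectl get ingress "$name" -o jsonpath="{.status.loadBalancer.ingress[0].$field}")
    if [ -n "$addr" ]; then
      printf '%s\n' "$addr"
      return 0
    fi
  done
  echo "no external address found for ingress $name" >&2
  return 1
}

# Usage (requires access to the cluster):
#   export EXTERNAL_IP=$(ingress_address echoserver)
#   curl "http://${EXTERNAL_IP}/apis/echo"
```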
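- Define the Kruise Rollout grayscale release rules.
  The following Rollout resource defines the grayscale release rules. The release is divided into three batches:
  Canary:
  - Batch 1: Canary release. 20% of the traffic is routed to the new version, and the rest goes to the old version.
  - Batch 2: Grayscale by traffic ratio. This batch updates 50% of the instances and shifts 50% of the traffic.
  - Batch 3: All remaining instances are updated.
  A/B Test:
  - Batch 1: A/B testing release. Traffic that carries header[User-Agent]=Android is routed to the new version, and the rest goes to the old version.
  - Batch 2: Grayscale by Pod ratio. This batch updates 50% of the instances.
  - Batch 3: All remaining instances are updated.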
  - Create the rollout.yaml file with the following content.
    Canary:

        apiVersion: rollouts.kruise.io/v1alpha1
        kind: Rollout
        metadata:
          name: rollouts-demo
        spec:
          objectRef:
            workloadRef:
              apiVersion: apps/v1
              kind: Deployment
              name: echoserver
          strategy:
            canary:
              steps:
              - weight: 20
                replicas: 1
                pause: {}
              - weight: 50
                replicas: 50%
                pause: {duration: 60}
              - weight: 100
                replicas: 100%
                pause: {duration: 60}
              trafficRoutings:
              - service: echoserver
                ingress:
                  name: echoserver

    A/B Test:

        apiVersion: rollouts.kruise.io/v1alpha1
        kind: Rollout
        metadata:
          name: rollouts-demo
        spec:
          objectRef:
            workloadRef:
              apiVersion: apps/v1
              kind: Deployment
              name: echoserver
          strategy:
            canary:
              steps:
              # Step 1: 1 Pod, match Android traffic
              - matches:
                - headers:
                  - type: Exact
                    name: User-Agent
                    value: Android
                pause: {}
                replicas: 1
              # Step 2: 50% of the Pods, pause automatically for 60 seconds
              - matches:
                - headers:
                  - type: Exact
                    name: User-Agent
                    value: Android
                pause: {duration: 60}
                replicas: 50%
              # Step 3: 100% of the Pods for the matched traffic
              - matches:
                - headers:
                  - type: Exact
                    name: User-Agent
                    value: Android
                pause: {duration: 60}
                replicas: 100%
              trafficRoutings:
              - service: echoserver
                ingress:
                  name: echoserver

  - Run the following command to apply the Rollout resource to the cluster.

        kubectl apply -f rollout.yaml

  - Run the following command to view the status of the Rollout resource.

        kubectl get rollout

    Expected output:

        NAME            STATUS    CANARY_STEP   CANARY_STATE   MESSAGE                            AGE
        rollouts-demo   Healthy   3             Completed      workload deployment is completed   7s

    In the expected output, STATUS=Healthy indicates that the Rollout resource is working properly.
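- Upgrade the application version.
  A Rollout resource is a persistent configuration. After you apply it to the cluster, later releases only require changes to the Deployment configuration, with no further operations on Kruise Rollout. For example, to upgrade the echoserver image to 1.10.3, change the image version and run kubectl apply -f echoserver.yaml to deploy the Deployment to the cluster. Besides kubectl, you can also use tools such as Helm or Vela to apply the Deployment configuration to the K8s cluster.
  - Modify the echoserver.yaml file to upgrade the echoserver image version to 1.10.3.

        # echoserver.yaml
        apiVersion: apps/v1
        kind: Deployment
        metadata:
          name: echoserver
        ...
        spec:
          ...
          containers:
          - name: echoserver
            # On an Apple M1 Mac, you can use the image e2eteam/echoserver:2.2-linux-arm instead.
            image: openkruise-registry.cn-shanghai.cr.aliyuncs.com/openkruise/demo:1.10.3
            imagePullPolicy: IfNotPresent
            env:
            - name: NODE_NAME
              # Optional. To show the grayscale effect clearly, change value to version2.
              value: version2

  - Run the following command to view the status of the Rollout resource.

        kubectl get rollouts rollouts-demo -n default

    Expected output:

        NAME            STATUS        CANARY_STEP   CANARY_STATE   MESSAGE                                                                         AGE
        rollouts-demo   Progressing   1             StepPaused     Rollout is in step(1/3), and you need manually confirm to enter the next step   41m

    The STATUS and CANARY fields in the expected output show the progress and the current step of the rollout.
    - STATUS=Progressing: A canary release is in progress.
    - CANARY_STEP=1: The rollout is in the first batch.
    - CANARY_STATE=StepPaused: The current batch is complete. Manual confirmation is required to continue.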
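A script can read the same three columns that the status check above describes. The following is a minimal sketch, assuming the column layout of `kubectl get rollouts` shown in this topic (NAME, STATUS, CANARY_STEP, CANARY_STATE as the first four columns). `rollout_columns` is a hypothetical helper, and the sample input is the expected output from the status check.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get rollouts` table output on stdin and
# print "STATUS CANARY_STEP CANARY_STATE" for the rollout named in $1.
rollout_columns() {
  awk -v name="$1" 'NR > 1 && $1 == name { print $2, $3, $4 }'
}

# Sample input copied from the expected output in this topic.
sample='NAME            STATUS        CANARY_STEP   CANARY_STATE   MESSAGE   AGE
rollouts-demo   Progressing   1             StepPaused     Rollout is in step(1/3), and you need manually confirm to enter the next step   41m'

columns=$(printf '%s\n' "$sample" | rollout_columns rollouts-demo)
echo "$columns"   # prints: Progressing 1 StepPaused

case "$columns" in
  *StepPaused) echo "the current batch is finished and waits for manual approval" ;;
esac
```

Against a live cluster, the helper can be fed directly: `kubectl get rollouts rollouts-demo -n default | rollout_columns rollouts-demo`.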
- Verify the traffic distribution between the old and new versions.
  Canary:
  - Access the service ten times and check the returned node name value.

        for i in {1..10}; do curl -s http://${EXTERNAL_IP}/apis/echo | grep 'node name'; sleep 1; done

    Expected output:

        node name: version1
        node name: version1
        node name: version2
        node name: version1
        node name: version2
        node name: version1
        node name: version1
        node name: version1
        node name: version1
        node name: version1

    The ratio of version1 to version2 is about 8:2, which matches the weight configured for the first step.
  - Manually approve the switch to the next step.

        kubectl-kruise rollout approve rollouts/rollouts-demo -n default

  - Watch the rollout status.

        kubectl get rollouts rollouts-demo -n default -w

    Expected output:

        NAME            STATUS        CANARY_STEP   CANARY_STATE          MESSAGE                                                                         AGE
        rollouts-demo   Progressing   2             StepTrafficRouting    Rollout is in step(2/3), and upgrade workload to new version                    31m
        rollouts-demo   Progressing   2             StepMetricsAnalysis   Rollout is in step(2/3), and upgrade workload to new version                    31m
        rollouts-demo   Progressing   2             StepPaused            Rollout is in step(2/3), and upgrade workload to new version                    31m
        rollouts-demo   Progressing   2             StepPaused            Rollout is in step(2/3), and wait duration(60 seconds) to enter the next step   31m
        rollouts-demo   Progressing   2             StepReady             Rollout is in step(2/3), and wait duration(60 seconds) to enter the next step   32m
        rollouts-demo   Progressing   3             BeforeStepUpgrade     Rollout is in step(2/3), and wait duration(60 seconds) to enter the next step   32m
        rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    32m
        rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    32m
        rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    32m
        rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    32m
        rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    32m
        rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    32m
        rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    32m
        rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    32m
        rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    32m
        rollouts-demo   Progressing   3             StepTrafficRouting    Rollout is in step(3/3), and upgrade workload to new version                    32m
        rollouts-demo   Progressing   3             StepMetricsAnalysis   Rollout is in step(3/3), and upgrade workload to new version                    32m
        rollouts-demo   Progressing   3             StepPaused            Rollout is in step(3/3), and upgrade workload to new version                    32m
        rollouts-demo   Progressing   3             StepReady             Rollout is in step(3/3), and upgrade workload to new version                    32m
        rollouts-demo   Progressing   3             Completed             Rollout is in step(3/3), and upgrade workload to new version                    32m
        rollouts-demo   Progressing   3             Completed             Rollout has been completed and some closing work is being done                  32m
        rollouts-demo   Progressing   3             Completed             Rollout progressing has been completed                                          33m
        rollouts-demo   Healthy       3             Completed             Rollout progressing has been completed                                          33m

    After the approve command runs, the Rollout resource enters step 2 and, once the wait time elapses, automatically enters step 3. The final STATUS=Healthy and CANARY_STATE=Completed indicate that this rollout is fully complete.
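For the traffic check in the canary steps above, counting responses by hand becomes tedious beyond a few requests. The following minimal sketch tallies the `node name` lines per version. `tally_versions` is a hypothetical helper, and the sample input is the ten-line expected output shown in the traffic check.

```shell
#!/bin/sh
# Hypothetical helper: read "node name: <version>" lines on stdin and print
# "<version> <count>" for each version, sorted by version name.
tally_versions() {
  awk '/node name:/ { count[$NF]++ } END { for (v in count) print v, count[v] }' | sort
}

# Sample input: the ten responses from the expected output in this topic.
sample='node name: version1
node name: version1
node name: version2
node name: version1
node name: version2
node name: version1
node name: version1
node name: version1
node name: version1
node name: version1'

printf '%s\n' "$sample" | tally_versions
# prints:
# version1 8
# version2 2
```

Against the live service, the curl loop can be piped into the helper: `for i in {1..10}; do curl -s http://${EXTERNAL_IP}/apis/echo; done | tally_versions`. With `weight: 20`, roughly one response in five should report the new version, although small samples vary.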
  A/B Test:
  - Access the application without and with the header[User-Agent]=Android request header.

        curl -s http://${EXTERNAL_IP}/apis/echo | grep 'Pod Information:' -A 3
        curl -sH "User-Agent: Android" http://${EXTERNAL_IP}/apis/echo | grep 'Pod Information:' -A 3

    Expected output:

        Pod Information:
            node name:      version1
            pod name:       echoserver-69598f9458-7c66v
            pod namespace:  default
        Pod Information:
            node name:      version2
            pod name:       echoserver-fvhzg-687b4b56-qbhc8
            pod namespace:  default

    The two requests return version1 and version2 respectively, which shows that header-based routing is in effect.
  - Manually approve the switch to the next step.

        kubectl-kruise rollout approve rollouts/rollouts-demo -n default

  - Watch the rollout status.

        kubectl get rollouts rollouts-demo -n default -w

    Expected output:

        NAME            STATUS        CANARY_STEP   CANARY_STATE          MESSAGE                                                                         AGE
        rollouts-demo   Progressing   2             StepTrafficRouting    Rollout is in step(2/3), and upgrade workload to new version                    26m
        rollouts-demo   Progressing   2             StepMetricsAnalysis   Rollout is in step(2/3), and upgrade workload to new version                    26m
        rollouts-demo   Progressing   2             StepPaused            Rollout is in step(2/3), and upgrade workload to new version                    26m
        rollouts-demo   Progressing   2             StepPaused            Rollout is in step(2/3), and wait duration(60 seconds) to enter the next step   26m
        rollouts-demo   Progressing   2             StepReady             Rollout is in step(2/3), and wait duration(60 seconds) to enter the next step   27m
        rollouts-demo   Progressing   3             BeforeStepUpgrade     Rollout is in step(2/3), and wait duration(60 seconds) to enter the next step   27m
        rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    27m
        rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    27m
        rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    27m
        rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    27m
        rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    27m
        rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    27m
        rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    27m
        rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    27m
        rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    27m
        rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    27m
        rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    27m
        rollouts-demo   Progressing   3             StepTrafficRouting    Rollout is in step(3/3), and upgrade workload to new version                    27m
        rollouts-demo   Progressing   3             StepMetricsAnalysis   Rollout is in step(3/3), and upgrade workload to new version                    27m
        rollouts-demo   Progressing   3             StepPaused            Rollout is in step(3/3), and upgrade workload to new version                    27m
        rollouts-demo   Progressing   3             StepReady             Rollout is in step(3/3), and upgrade workload to new version                    27m
        rollouts-demo   Progressing   3             Completed             Rollout is in step(3/3), and upgrade workload to new version                    27m
        rollouts-demo   Progressing   3             Completed             Rollout has been completed and some closing work is being done                  27m
        rollouts-demo   Progressing   3             Completed             Rollout has been completed and some closing work is being done                  27m
        rollouts-demo   Progressing   3             Completed             Rollout has been completed and some closing work is being done                  27m
        rollouts-demo   Progressing   3             Completed             Rollout has been completed and some closing work is being done                  27m
        rollouts-demo   Progressing   3             Completed             Rollout has been completed and some closing work is being done                  27m
        rollouts-demo   Progressing   3             Completed             Rollout has been completed and some closing work is being done                  27m
        rollouts-demo   Progressing   3             Completed             Rollout has been completed and some closing work is being done                  27m
        rollouts-demo   Progressing   3             Completed             Rollout has been completed and some closing work is being done                  27m
        rollouts-demo   Progressing   3             Completed             Rollout progressing has been completed                                          27m
        rollouts-demo   Healthy       3             Completed             Rollout progressing has been completed                                          27m

    After the approve command runs, the Rollout resource enters step 2 and, once the wait time elapses, automatically enters step 3. The final STATUS=Healthy and CANARY_STATE=Completed indicate that this rollout is fully complete.
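The A/B check above compares two responses by eye. The following minimal sketch makes the same comparison in a script. `node_name_for` and `check_ab_routing` are hypothetical helpers, and the sketch assumes the echo service prints a `node name:` line as in the expected output.

```shell
#!/bin/sh
# Hypothetical helper: print the "node name" value returned by the echo service.
# $1 is the base URL; an optional $2 is sent as the User-Agent header.
node_name_for() {
  if [ -n "${2:-}" ]; then
    curl -s -H "User-Agent: $2" "$1/apis/echo"
  else
    curl -s "$1/apis/echo"
  fi | awk '/node name:/ && !seen { print $NF; seen = 1 }'
}

# Hypothetical check: succeeds only if Android traffic reaches a different
# version than traffic without the Android header.
check_ab_routing() {
  default_version=$(node_name_for "$1")
  android_version=$(node_name_for "$1" "Android")
  echo "default=$default_version android=$android_version"
  [ -n "$default_version" ] && [ "$default_version" != "$android_version" ]
}

# Usage (requires EXTERNAL_IP to be set as described earlier):
#   check_ab_routing "http://${EXTERNAL_IP}" && echo "header routing is in effect"
```

The check is only meaningful while the rollout is paused in a step with `matches`; after the rollout completes, both requests reach the new version and the function returns a failure status.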
- (Optional) Roll back the application if the new version is abnormal.
  If you find during the rollout that the new version is abnormal, restore the Deployment configuration to the previous version and run kubectl apply -f echoserver.yaml to deploy it. No changes to the Rollout resource are required.

      # echoserver.yaml
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: echoserver
      ...
      spec:
        ...
        containers:
        - name: echoserver
          # On an Apple M1 Mac, you can use the image e2eteam/echoserver:2.2-linux-arm instead.
          image: openkruise-registry.cn-shanghai.cr.aliyuncs.com/openkruise/demo:1.10.2
          imagePullPolicy: IfNotPresent
          env:
          - name: NODE_NAME
            value: version1
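Both the upgrade and the rollback in this scenario come down to changing the image tag (and, in this demo, the `NODE_NAME` value) in echoserver.yaml and applying the file again. The following is a minimal sketch of that edit. `set_demo_version` is a hypothetical helper, and it assumes the manifest contains the `openkruise/demo:<tag>` image line and a `value: versionN` line as in the excerpts above.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the demo image tag and the NODE_NAME value in a
# manifest. $1 = file, $2 = image tag (e.g. 1.10.2), $3 = value (e.g. version1).
set_demo_version() {
  file="$1"; tag="$2"; value="$3"
  tmp=$(mktemp) || return 1
  sed -e "s|\(openkruise/demo:\).*|\1$tag|" \
      -e "s|\(value: \)version[0-9]*|\1$value|" "$file" > "$tmp" &&
    cat "$tmp" > "$file"   # overwrite in place, keeping the file's permissions
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Demonstration on a scratch copy of the relevant lines.
demo=$(mktemp)
cat > "$demo" <<'EOF'
        image: openkruise-registry.cn-shanghai.cr.aliyuncs.com/openkruise/demo:1.10.3
        env:
        - name: NODE_NAME
          value: version2
EOF
set_demo_version "$demo" 1.10.2 version1
cat "$demo"
```

For a rollback, this could be used as `set_demo_version echoserver.yaml 1.10.2 version1 && kubectl apply -f echoserver.yaml`. The sed patterns are deliberately narrow and would need adjusting for a manifest with other `value: version...` entries.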
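Scenario 2: Batch release based on the number of Pod instances (for applications built on microservice frameworks such as Nacos)
Most applications built on a microservice framework (for example, Nacos) do not need a Service or Ingress when they are deployed to a K8s cluster, because the framework already handles traffic scheduling. Such applications are therefore better suited to the batch release capability of Kruise Rollout.
Because traffic grayscale is handled by the microservice framework, this scenario skips the verification of the old and new versions and only demonstrates the Rollout step transitions.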
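- Define and deploy the Kruise Rollout grayscale release rules.
  The following Rollout resource defines the grayscale release rules (the trafficRoutings field is not required). The release is divided into three batches:
  - Batch 1: Update 1 Pod.
  - Batch 2: Update 50% of the Pods.
  - Batch 3: Update all remaining instances.

      # Save the following content to the file rollout.yaml.
      apiVersion: rollouts.kruise.io/v1alpha1
      kind: Rollout
      metadata:
        name: rollouts-demo
        annotations:
          rollouts.kruise.io/rolling-style: partition
      spec:
        objectRef:
          workloadRef:
            apiVersion: apps/v1
            kind: Deployment
            # Deployment Name
            name: echoserver
        strategy:
          canary:
            steps:
            # Step 1: Update 1 Pod, then pause and wait for manual confirmation.
            - replicas: 1
              pause: {}   # Manually decide whether to enter the next batch.
            # Step 2: Update 50% of the Pod instances.
            - replicas: 50%
              # Pause for 60 seconds, then automatically enter the next batch.
              pause: {duration: 60}
            # Step 3: Full release. Update all Pods to the new version.
            - replicas: 100%
              pause: {duration: 60}
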
- Modify the echoserver.yaml file to upgrade the echoserver image version to 1.10.3.

      # echoserver.yaml
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: echoserver
      ...
      spec:
        ...
        containers:
        - name: echoserver
          # On an Apple M1 Mac, you can use the image e2eteam/echoserver:2.2-linux-arm instead.
          image: openkruise-registry.cn-shanghai.cr.aliyuncs.com/openkruise/demo:1.10.3
          imagePullPolicy: IfNotPresent
          env:
          - name: NODE_NAME
            # Optional. To show the grayscale effect clearly, change value to version2.
            value: version2

- View the status of the Rollout resource.

      kubectl get rollouts rollouts-demo -n default

  Expected output:

      NAME            STATUS        CANARY_STEP   CANARY_STATE   MESSAGE                                                                         AGE
      rollouts-demo   Progressing   1             StepPaused     Rollout is in step(1/3), and you need manually confirm to enter the next step   41m

  The STATUS and CANARY fields in the expected output show the progress and the current step of the rollout.
  - STATUS=Progressing: The release is in progress.
  - CANARY_STEP=1: The rollout is in the first batch.
  - CANARY_STATE=StepPaused: The current batch is complete. Manual confirmation is required to continue.
- Manually approve the switch to the next step.

      kubectl-kruise rollout approve rollouts/rollouts-demo -n default

- Watch the rollout status.

      kubectl get rollouts rollouts-demo -n default -w

  Expected output:

      NAME            STATUS        CANARY_STEP   CANARY_STATE          MESSAGE                                                                         AGE
      rollouts-demo   Progressing   2             StepPaused            Rollout is in step(2/3), and wait duration(60 seconds) to enter the next step   45m
      rollouts-demo   Progressing   2             StepReady             Rollout is in step(2/3), and wait duration(60 seconds) to enter the next step   45m
      rollouts-demo   Progressing   3             BeforeStepUpgrade     Rollout is in step(2/3), and wait duration(60 seconds) to enter the next step   45m
      rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(2/3), and wait duration(60 seconds) to enter the next step   45m
      rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    45m
      rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    45m
      rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    45m
      rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    45m
      rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    45m
      rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    45m
      rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    45m
      rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    45m
      rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    46m
      rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    46m
      rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    46m
      rollouts-demo   Progressing   3             StepUpgrade           Rollout is in step(3/3), and upgrade workload to new version                    46m
      rollouts-demo   Progressing   3             StepMetricsAnalysis   Rollout is in step(3/3), and upgrade workload to new version                    46m
      rollouts-demo   Progressing   3             StepPaused            Rollout is in step(3/3), and upgrade workload to new version                    46m
      rollouts-demo   Progressing   3             StepReady             Rollout is in step(3/3), and upgrade workload to new version                    46m
      rollouts-demo   Progressing   3             Completed             Rollout is in step(3/3), and upgrade workload to new version                    46m
      rollouts-demo   Progressing   3             Completed             Rollout has been completed and some closing work is being done                  46m
      rollouts-demo   Progressing   3             Completed             Rollout has been completed and some closing work is being done                  46m
      rollouts-demo   Progressing   3             Completed             Rollout has been completed and some closing work is being done                  46m
      rollouts-demo   Progressing   3             Completed             Rollout has been completed and some closing work is being done                  46m
      rollouts-demo   Progressing   3             Completed             Rollout has been completed and some closing work is being done                  46m
      rollouts-demo   Progressing   3             Completed             Rollout has been completed and some closing work is being done                  46m
      rollouts-demo   Progressing   3             Completed             Rollout has been completed and some closing work is being done                  46m
      rollouts-demo   Progressing   3             Completed             Rollout progressing has been completed                                          46m
      rollouts-demo   Healthy       3             Completed             Rollout progressing has been completed                                          46m
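The approve-then-watch sequence can also be scripted, for example in a delivery pipeline. The following is a minimal sketch. `wait_for_rollout_phase` is a hypothetical helper that polls `.status.phase`, which is assumed here to be the field behind the STATUS column, and the timeout and polling interval are illustrative values.

```shell
#!/bin/sh
# Hypothetical helper: poll a Rollout in the default namespace until
# .status.phase equals the wanted phase, or give up after a timeout.
# $1 = rollout name, $2 = wanted phase, $3 = timeout (s), $4 = poll interval (s).
wait_for_rollout_phase() {
  name="$1"; wanted="$2"; timeout="${3:-300}"; interval="${4:-5}"
  waited=0
  while :; do
    phase=$(kubectl get rollouts "$name" -n default -o jsonpath='{.status.phase}')
    if [ "$phase" = "$wanted" ]; then
      echo "rollout $name reached phase $phase"
      return 0
    fi
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out waiting for rollout $name (last phase: $phase)" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
}

# Usage (requires access to the cluster):
#   kubectl-kruise rollout approve rollouts/rollouts-demo -n default
#   wait_for_rollout_phase rollouts-demo Healthy 600 5
```

Polling a single field keeps the script independent of the table layout printed by `kubectl get rollouts -w`; the trade-off is that the intermediate CANARY_STATE transitions are not shown.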