>> Get hands-on experience with this tutorial in a lab environment.
This lab shows how to use Knative to deploy an enterprise-class elastic Stable Diffusion Service in a serverless Kubernetes (ASK) cluster, and then run stress tests to verify the elasticity of the ASK cluster.
Experiment keywords | ASK Knative, Stable Diffusion, enterprise-class, elastic |
Experiment duration | 30 minutes |
Difficulty | Easy |
What you can learn from the experiment | How to create an enterprise-class elastic Stable Diffusion Service; how to experience the elasticity of ASK; how to use Stable Diffusion to generate images |
Whether experiment resources and an environment are required | No |
Note
Cautions
● Alibaba Cloud does not guarantee the authenticity, security, and accuracy of third-party models. Alibaba Cloud is not liable for any loss or damage arising from use of third-party models.
● To use third-party models, you shall comply with the user agreement, follow the specifications for proper usage, and adhere to the corresponding laws and regulations. You are liable for the authenticity and security of the third-party models that you use.
1. Click and log on to the Microservices Engine (MSE) console on the right.
2. Click Immediate authorization to activate MSE cloud native gateways.
Click and open the Cloud Resource Access Authorization page on the right window and click Confirm Authorization Policy to authorize MSE to access Elastic Container Instance (ECI).
1. Log on to the ACK console. In the left-side navigation pane, click Clusters.
2. In the upper-right corner of the Clusters page, click Create Kubernetes Cluster.
3. Click the Serverless Kubernetes tab and configure the following parameters as required. Keep the default settings for other parameters.
Parameter | Description | Example |
Cluster Name | Enter a name for the cluster. | knative-sd-demo |
Cluster Specification | Select a cluster type. Valid values: Professional and Standard edition. | Professional |
Region | Select the region in which the cluster is deployed. We recommend that you select the China (Hong Kong) region. | China (Hong Kong) |
VPC | Configure a virtual private cloud (VPC) for the cluster. Container Service for Kubernetes (ACK) clusters support only VPCs. You can choose Create VPC or Select Existing VPC. | Create VPC |
Zone | Specify the zone in which the cluster is deployed. The following zones are available in the China (Hong Kong) region: Hongkong Zone B, Hongkong Zone C, and Hongkong Zone D. | Hongkong Zone D |
4. Click Next:Component Configurations. Then, set Ingress to MSE Ingress and select Enable Knative. Keep the default settings for other parameters. If you do not need Log Service, clear Enable Log Service to avoid unexpected service fees.
5. Click Next:Confirm Order. Then, confirm the configurations, read and select Terms of Service, and click Create Cluster.
1. On the Clusters page, click the cluster named knative-sd-demo to go to the cluster information page. In the left-side navigation pane, choose Applications > Knative.
2. On the Knative page, click the Services tab and then click Create from Template.
3. Select Resource - Knative Service from the Sample Template drop-down list. Paste the following YAML content to the template and click Create. The YAML content creates the Stable Diffusion Service.
By default, a Service named knative-sd-demo is created.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: knative-sd-demo
  annotations:
    serving.knative.dev.alibabacloud/affinity: "cookie"
    serving.knative.dev.alibabacloud/cookie-name: "sd"
    serving.knative.dev.alibabacloud/cookie-timeout: "1800"
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/class: mpa.autoscaling.knative.dev
        autoscaling.knative.dev/maxScale: '10'
        autoscaling.knative.dev/targetUtilizationPercentage: "100"
        k8s.aliyun.com/eci-use-specs: ecs.gn5-c4g1.xlarge,ecs.gn5i-c8g1.2xlarge,ecs.gn5-c8g1.2xlarge
    spec:
      containerConcurrency: 1
      containers:
      - args:
        - --listen
        - --skip-torch-cuda-test
        - --api
        command:
        - python3
        - launch.py
        image: yunqi-registry.cn-shanghai.cr.aliyuncs.com/lab/stable-diffusion@sha256:62b3228f4b02d9e89e221abe6f1731498a894b042925ab8d4326a571b3e992bc
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 7860
          name: http1
          protocol: TCP
        name: stable-diffusion
        readinessProbe:
          tcpSocket:
            port: 7860
          initialDelaySeconds: 5
          periodSeconds: 1
          failureThreshold: 3
4. On the Services tab, refresh the page. After the status of the knative-sd-demo Service changes to Created, the Stable Diffusion Service is deployed.
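Because the container is started with the --api flag, the Service can also be called programmatically. The following Python sketch is an illustration only: it assumes that the image runs the widely used Stable Diffusion web UI, whose HTTP API accepts a JSON POST at /sdapi/v1/txt2img and returns base64-encoded images. The endpoint path, the payload fields, and the helper names build_txt2img_request and decode_images are assumptions, not something this lab confirms.

```python
import base64
import json
import urllib.request
from typing import List


def build_txt2img_request(base_url: str, prompt: str, steps: int = 20) -> urllib.request.Request:
    """Build a POST request for the text-to-image endpoint (assumed path)."""
    payload = {"prompt": prompt, "steps": steps}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/sdapi/v1/txt2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def decode_images(response_body: bytes) -> List[bytes]:
    """Extract raw image bytes from a JSON response of the form {"images": [<base64>, ...]}."""
    body = json.loads(response_body)
    return [base64.b64decode(image) for image in body.get("images", [])]


def generate(base_url: str, prompt: str) -> List[bytes]:
    """Send one request and return the decoded images. Not called at import time."""
    request = build_txt2img_request(base_url, prompt)
    # Image generation can be slow, especially on a cold start, so allow a long timeout.
    with urllib.request.urlopen(request, timeout=300) as response:
        return decode_images(response.read())
```

To try it, call generate() with the address through which your knative-sd-demo Service is reachable and write each returned item to a .png file. Depending on how the gateway routes traffic, you may also need to set a Host header on the request.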
Create a traffic generator named portal-server to test the Stable Diffusion Service and perform stress tests.
1. On the Knative page, click the Services tab and then click Create from Template.
2. Select Custom from the Sample Template drop-down list. Paste the following YAML content to the template and click Create. The YAML content creates a load generator named portal-server.
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: portal-server
  name: portal-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: portal-server
  template:
    metadata:
      labels:
        app: portal-server
    spec:
      serviceAccountName: portal-server
      containers:
      - name: portal-server
        image: registry-vpc.cn-hongkong.aliyuncs.com/acs/sd-yunqi-server:v1.0.2-en
        imagePullPolicy: IfNotPresent
        env:
        - name: MAX_CONCURRENT_REQUESTS
          value: "5"
        - name: POD_NAMESPACE
          value: "default"
        readinessProbe:
          failureThreshold: 3
          periodSeconds: 1
          successThreshold: 1
          tcpSocket:
            port: 8080
          timeoutSeconds: 1
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/alibaba-cloud-loadbalancer-address-type: internet
    service.beta.kubernetes.io/alibaba-cloud-loadbalancer-instance-charge-type: PayByCLCU
  name: portal-server
spec:
  externalTrafficPolicy: Local
  ports:
  - name: http-80
    port: 80
    protocol: TCP
    targetPort: 8080
  - name: http-8888
    port: 8888
    protocol: TCP
    targetPort: 8888
  selector:
    app: portal-server
  type: LoadBalancer
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pod-list-cluster-role
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["list"]
- apiGroups: ["networking.k8s.io"]
  resources: ["ingresses"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: pod-list-cluster-role-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: pod-list-cluster-role
subjects:
- kind: ServiceAccount
  name: portal-server
  namespace: default
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: portal-server
  namespace: default
3. In the left-side navigation pane, choose Network > Services.
4. On the Services page, find the load generator named portal-server. The IP address of the load generator is 8.217.XX.XX.
5. Enter http://8.217.XX.XX into the address bar of your web browser. On the page that appears, click Stable Diffusion to navigate to the Stable Diffusion page.
a) The following figure shows the Stable Diffusion page. For example, enter cat in the prompt text box and click Generate. An image is then generated based on the prompt.
b) On the stress test page, set Concurrency to 5 and Total Number of Requests to 20, and then click Start to view the stress test result.
During the stress test, five pods are created and one image is generated for each request. The generated images are displayed on the page.
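The pod count follows from the Service configuration: containerConcurrency is 1, so each pod handles one request at a time, and five concurrent requests therefore need five pods, up to the maxScale limit of 10. The following Python sketch illustrates this with a simplified model. The estimate_pods formula is an approximation for illustration, not the actual autoscaler algorithm, and run_load and fake_send are hypothetical helpers that simulate requests locally instead of calling the Service.

```python
import math
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def estimate_pods(concurrent_requests: int, container_concurrency: int = 1, max_scale: int = 10) -> int:
    """Simplified estimate: one pod per `container_concurrency` in-flight requests, capped at max_scale."""
    if concurrent_requests <= 0:
        return 0
    return min(math.ceil(concurrent_requests / container_concurrency), max_scale)


def run_load(send: Callable[[int], object], total: int, concurrency: int) -> Tuple[List[object], int]:
    """Call send(i) for i in range(total) with at most `concurrency` calls in flight.

    Returns the results in request order and the peak number of in-flight calls.
    """
    lock = threading.Lock()
    in_flight = 0
    peak = 0

    def tracked(i: int) -> object:
        nonlocal in_flight, peak
        with lock:
            in_flight += 1
            peak = max(peak, in_flight)
        try:
            return send(i)
        finally:
            with lock:
                in_flight -= 1

    # The pool size is what limits the number of simultaneous requests.
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(tracked, range(total)))
    return results, peak


def fake_send(i: int) -> str:
    """Stand-in for an image request; sleeps briefly instead of calling the Service."""
    time.sleep(0.01)
    return f"image-{i}"


results, peak = run_load(fake_send, total=20, concurrency=5)
print(len(results), "responses; peak in-flight requests:", peak)
print("estimated pods for that peak:", estimate_pods(peak))
```

Replacing fake_send with a function that sends a real request would turn this into a small load generator with the same settings as the stress test page (Concurrency 5, Total Number of Requests 20).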
ASK clusters are in public preview and free to use. However, ASK clusters may need to access other Alibaba Cloud services. You are charged based on the billing rules of the cloud services used by ASK clusters. After you complete the experiment, you can handle the ASK Pro cluster in one of the following ways:
● Continue to use the ASK Pro cluster. For more information about the billing of the Alibaba Cloud services used by ASK Pro clusters, see Billing of cloud services.
● If you no longer need the ASK Pro cluster, perform the following steps to release resources and avoid unexpected fees.
1. On the Clusters page, find your ASK Pro cluster and choose More > Delete in the Actions column.
2. In the Delete Cluster dialog box, select Delete ALB Instances Created by the Cluster, Delete Alibaba Cloud DNS PrivateZone instances Created by the Cluster, and I understand the above information and want to delete the specified cluster, and then click OK.
3. Enter the verification code and click OK to delete the ASK Pro cluster.
1. Log on to the MSE console on the right.
3. On the top navigation bar of the Gateways page, select the China (Hong Kong) region in which the MSE cloud native gateway is deployed, click in the Actions column of the gateway, and then click Remove Instance.
4. In the Remove dialog box, select Delete Purchased SLB Instance and click OK to delete the MSE cloud native gateway.
1. Log on to the NAT Gateway console on the right.
2. In the left-side navigation pane, choose NAT Gateway > Internet NAT Gateway.
3. In the top navigation bar of the Internet NAT Gateway page, select the China (Hong Kong) region in which the NAT gateway is deployed, click in the Actions column of the NAT gateway, and then click Delete.
4. In the Delete Gateway dialog box, select Delete (Delete NAT gateway and resources) and click OK to delete the NAT gateway.