Use the Scalable Job service for AI Portrait inference to prevent resource underutilization and request interruptions during scale-in.
Prerequisites
A virtual private cloud (VPC) is created and Internet access is enabled for the VPC.
A VPC, vSwitch, and security group are created. For more information, see VPCs and vSwitches and Use security groups.
An Internet NAT gateway is created in the VPC. An elastic IP address (EIP) is associated with the gateway and SNAT entries are configured on the gateway. For more information, see Internet NAT gateway.
For model training and portrait creation, 5 to 20 training images and 1 template image are prepared. The following image formats are supported: .jpg, .jpeg, and .png. Make sure that the size of each image is greater than 512 x 512 pixels.
  Single-person portrait: The template image must contain the face of a person. The faces in the training images must all belong to the same person.
  Multi-person portrait: The template image must contain multiple faces, and the number of faces must be the same as the value of the model_id parameter specified for model training.
An OSS bucket is created. For more information, see Create a bucket.
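The image requirements above can be checked before upload. The following is a minimal sketch, not part of the AI portrait solution: the function names are hypothetical, the image dimensions are passed in by the caller, and it assumes that "greater than 512 x 512 pixels" means both width and height must exceed 512 and that file extensions are matched case-insensitively.

```python
from pathlib import PurePosixPath

SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png"}
MIN_SIDE = 512  # Assumption: both width and height must exceed 512 pixels.


def check_image(name: str, width: int, height: int) -> list:
    """Return a list of problems found for one image (empty if none)."""
    problems = []
    if PurePosixPath(name).suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"{name}: unsupported format")
    if width <= MIN_SIDE or height <= MIN_SIDE:
        problems.append(f"{name}: {width} x {height} is not larger than 512 x 512")
    return problems


def check_training_set(images: list) -> list:
    """Check a list of (name, width, height) tuples against the prerequisites."""
    problems = []
    if not 5 <= len(images) <= 20:
        problems.append(f"expected 5 to 20 training images, got {len(images)}")
    for name, width, height in images:
        problems.extend(check_image(name, width, height))
    return problems


if __name__ == "__main__":
    sample = [(f"face_{i}.jpg", 1024, 1024) for i in range(5)]
    print(check_training_set(sample))
```

The sketch only covers format, size, and count. Whether the faces belong to the same person is checked later by the verification service.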
Limitations
The AI portrait solution is available only in the China (Beijing) and Singapore regions.
Deploy a scalable job service for inference
Deploy a verification service
Log on to the PAI console. Select a region on the top of the page. Then, select the desired workspace and click Elastic Algorithm Service (EAS).
Click Deploy Service. In the Custom Model Deployment section, click Custom Deployment.
On the Custom Deployment page, configure the following parameters. Use default values for the rest. For more information, see Custom Deployment.
In the Basic Information section, set the service name. Example: photog_check
In the Environment Information section, configure the following parameters:
  Deployment Method: Select Image-based Deployment and then select Asynchronous Queue.
  Image Configuration: Select Image Address and enter the image address:
    China (Beijing): registry.cn-beijing.aliyuncs.com/mybigpai/photog_pub:check.1.0.0.pub
    Singapore: registry.ap-southeast-1.aliyuncs.com/mybigpai/photog_pub:check.1.0.0.pub
  Code Build: Select OSS as the mount type and set the following parameters:
    Uri: Your OSS bucket path. Example: oss://examplebucket/
    Mount Path: Set to /photog_oss
  Command to Run: Set to python app.py
  Port Number: Set to 7860.
In the Resource Information section, set the following parameters:
  Resource Type: Select Public Resources.
  Deployment: Select a GU30-series instance type on the GPU tab. Recommended: ml.gu7i.c32m188.1-gu30
  Configure a system disk: Set to 120 GiB.
In the Asynchronous Queue section, set the following parameters:
  Resource Type: Select Public Resources.
  Deployment: Set the following values:
    Number of replicas: 1
    CPU (cores): 8
    Memory (GB): 64
  Maximum Data for A Single Input Request and Maximum Data for A Single Output: Set each to 20480 KB to ensure sufficient storage for each request in the queue.
In the Service Access section, select the VPC, vSwitch, and security group you created.
In the Service Configurations section, add the following configurations. Refer to the complete configuration example below for the new parameters.
metadata: Add the following parameters:
  keepalive: Maximum processing time for a single request, in milliseconds. Set to 3600000.
  worker_threads: Number of concurrent processing threads per Elastic Algorithm Service (EAS) instance. Default value: 5, which means the first five queued tasks are assigned to the same instance. Set to 1 to process requests sequentially.
queue: Add "max_delivery": 1 to prevent repeated deliveries after a failure.
Complete configuration example:
{
  "metadata": {
    "name": "photog_check",
    "instance": 1,
    "rpc": { "keepalive": 3600000, "worker_threads": 1 },
    "type": "Async"
  },
  "cloud": {
    "computing": { "instance_type": "ml.gu7i.c32m188.1-gu30", "instances": null },
    "networking": {
      "vswitch_id": "vsw-2ze4o9kww55051tf2****",
      "security_group_id": "sg-2ze0kgiee55d0fn4****",
      "vpc_id": "vpc-2ze5hl4ozjl4fo7q3****"
    }
  },
  "features": { "eas.aliyun.com/extra-ephemeral-storage": "100Gi" },
  "queue": {
    "cpu": 8,
    "max_delivery": 1,
    "min_replica": 1,
    "memory": 64000,
    "resource": "",
    "source": { "max_payload_size_kb": 20480 },
    "sink": { "max_payload_size_kb": 20480 }
  },
  "storage": [
    {
      "oss": { "path": "oss://examplebucket/", "readOnly": false },
      "properties": { "resource_type": "code" },
      "mount_path": "/photog_oss"
    }
  ],
  "containers": [
    {
      "image": "registry.cn-beijing.aliyuncs.com/mybigpai/photog_pub:check.1.0.0.pub",
      "script": "python app.py",
      "port": 7860
    }
  ]
}
Click Deploy.
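If you maintain the service configuration as a file rather than editing it in the console, the fields described above can be applied programmatically. The following is a minimal sketch under that assumption. The helper name apply_async_queue_settings is hypothetical, and the sketch only edits a local dictionary; it does not call any EAS API.

```python
import json


def apply_async_queue_settings(config: dict) -> dict:
    """Return a copy of a service config with the settings from this section applied."""
    updated = json.loads(json.dumps(config))  # Deep copy through a JSON round trip.

    metadata = updated.setdefault("metadata", {})
    metadata["type"] = "Async"
    rpc = metadata.setdefault("rpc", {})
    rpc["keepalive"] = 3600000   # Maximum processing time per request, in milliseconds.
    rpc["worker_threads"] = 1    # Process queued requests one at a time.

    queue = updated.setdefault("queue", {})
    queue["max_delivery"] = 1    # Do not redeliver a request after a failure.
    for side in ("source", "sink"):
        queue.setdefault(side, {})["max_payload_size_kb"] = 20480
    return updated


base = {"metadata": {"name": "photog_check", "instance": 1}}
service_config = apply_async_queue_settings(base)
print(json.dumps(service_config, indent=2))
```

The function returns a new dictionary and leaves the input untouched, so the same base configuration can be reused for the training service.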
Deploy a training service
Log on to the PAI console. Select a region on the top of the page. Then, select the desired workspace and click Elastic Algorithm Service (EAS).
Click Deploy Service. In the Custom Model Deployment section, click Custom Deployment.
On the Custom Deployment page, configure the following parameters. Use default values for the rest. For more information, see Custom Deployment.
In the Basic Information section, set the service name. Example: photog_train_pmml
In the Environment Information section, set the following parameters:
  Deployment Method: Select Image-based Deployment and then select Asynchronous Queue.
  Image Configuration: Select Image Address and enter the image address:
    China (Beijing): registry.cn-beijing.aliyuncs.com/mybigpai/photog_pub:train.1.0.0.pub
    Singapore: registry.ap-southeast-1.aliyuncs.com/mybigpai/photog_pub:train.1.0.0.pub
  Code configuration: Select OSS as the mount type and set the following parameters:
    Uri: Path to your OSS bucket. Must match the path specified for the verification service. Example: oss://examplebucket/
    Mount Path: Set to /photog_oss
  Command to Run: Set to python app.py
  Port Number: Set to 7860.
In the Resource Information section, set the following parameters:
  Resource Type: Select Public Resources.
  Deployment: Select a GU30-series instance type on the GPU tab. Recommended: ml.gu7i.c32m188.1-gu30
  Configure a system disk: Set to 120 GiB.
In the Asynchronous Queue section, set the following parameters:
  Resource Type: Select Public Resources.
  Deployment: Set the following values:
    Number of replicas: 1
    CPU (cores): 8
    Memory (GB): 64
  Maximum Data for A Single Input Request and Maximum Data for A Single Output: Set each to 20480 KB to ensure sufficient storage for each request in the queue.
In the Service Access section, select the VPC, vSwitch, and security group you created.
In the Service Configurations section, add the following configurations. Refer to the complete configuration example below for the new parameters.
autoscaler: (Optional) Horizontal auto scaling configuration. For more information, see Horizontal auto scaling.
metadata: Add the following parameters:
  keepalive: Maximum processing time for a single request, in milliseconds. Set to 3600000.
  worker_threads: Number of concurrent processing threads per EAS instance. Default value: 5, which means the first five queued tasks are assigned to the same instance. Set to 1 to process requests sequentially.
queue: Add "max_delivery": 1 to prevent multiple redeliveries after a failure.
Complete configuration example:
{
  "autoscaler": {
    "behavior": { "scaleDown": { "stabilizationWindowSeconds": 60 } },
    "max": 5,
    "min": 1,
    "strategies": { "queue[backlog]": 1 }
  },
  "metadata": {
    "name": "photog_train_pmml",
    "instance": 1,
    "rpc": { "keepalive": 3600000, "worker_threads": 1 },
    "type": "Async"
  },
  "cloud": {
    "computing": { "instance_type": "ml.gu7i.c32m188.1-gu30", "instances": null },
    "networking": {
      "vswitch_id": "vsw-2ze4o9kww55051tf2****",
      "security_group_id": "sg-2ze0kgiee55d0fn4****",
      "vpc_id": "vpc-2ze5hl4ozjl4fo7q3****"
    }
  },
  "features": { "eas.aliyun.com/extra-ephemeral-storage": "120Gi" },
  "queue": {
    "cpu": 8,
    "max_delivery": 1,
    "min_replica": 1,
    "memory": 64000,
    "resource": "",
    "source": { "max_payload_size_kb": 20480 },
    "sink": { "max_payload_size_kb": 20480 }
  },
  "storage": [
    {
      "oss": { "path": "oss://examplebucket/", "readOnly": false },
      "properties": { "resource_type": "code" },
      "mount_path": "/photog_oss"
    }
  ],
  "containers": [
    {
      "image": "registry.cn-beijing.aliyuncs.com/mybigpai/photog_pub:train.1.0.0.pub",
      "script": "python app.py",
      "port": 7860
    }
  ]
}
Click Deploy.
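To build intuition for the optional autoscaler block in the training configuration (min 1, max 5, "queue[backlog]": 1), the following sketch models a backlog-driven replica count. It is a simplified illustration, not the actual EAS scaling algorithm: it assumes the queue[backlog] value acts as a target number of backlogged requests per instance, and it ignores the 60-second scale-down stabilization window. The function name desired_replicas is hypothetical.

```python
import math


def desired_replicas(backlog: int, per_instance_threshold: int = 1,
                     min_replicas: int = 1, max_replicas: int = 5) -> int:
    """Estimate a replica count from the queue backlog, clamped to [min, max]."""
    wanted = math.ceil(backlog / per_instance_threshold)
    return max(min_replicas, min(max_replicas, wanted))


for backlog in (0, 1, 3, 12):
    print(backlog, desired_replicas(backlog))
```

Under these assumptions, an empty queue keeps the service at the minimum of 1 replica, and a backlog of 12 requests is capped at the maximum of 5 replicas.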
Deploy a prediction service
The prediction service is deployed as a Scalable Job service.
Click Deploy Service. In the Custom Model Deployment section, click JSON Deployment.
Enter the configuration in the JSON editor.
{
  "metadata": {
    "name": "photog_pre_pmml",
    "instance": 1,
    "rpc": { "keepalive": 3600000, "worker_threads": 1 },
    "type": "ScalableJob"
  },
  "cloud": {
    "computing": { "instance_type": "ecs.gn6v-c8g1.2xlarge", "instances": null },
    "networking": {
      "vswitch_id": "vsw-2ze4o9kww55051tf2****",
      "security_group_id": "sg-2ze0kgiee55d0fn4****",
      "vpc_id": "vpc-2ze5hl4ozjl4fo7q3****"
    }
  },
  "features": { "eas.aliyun.com/extra-ephemeral-storage": "120Gi" },
  "queue": {
    "cpu": 8,
    "max_delivery": 1,
    "min_replica": 1,
    "memory": 64000,
    "resource": "",
    "source": { "max_payload_size_kb": 20480 },
    "sink": { "max_payload_size_kb": 20480 }
  },
  "storage": [
    {
      "oss": { "path": "oss://examplebucket/", "readOnly": false },
      "properties": { "resource_type": "code" },
      "mount_path": "/photog_oss"
    }
  ],
  "containers": [
    {
      "image": "registry.cn-beijing.aliyuncs.com/mybigpai/photog_pub:infer.1.0.0.pub",
      "env": [
        { "name": "URL", "value": "http://127.0.0.1:8000" },
        { "name": "AUTHORIZATION", "value": "=" }
      ],
      "script": "python app.py",
      "port": 7861
    },
    {
      "image": "eas-registry-vpc.cn-beijing.cr.aliyuncs.com/pai-eas/stable-diffusion-webui:3.2",
      "port": 8000,
      "script": "./webui.sh --listen --port 8000 --skip-version-check --no-hashing --no-download-sd-model --skip-install --api --filebrowser --sd-dynamic-cache --data-dir /photog_oss/webui/"
    }
  ]
}
The key parameters are described below. For details about other parameters, see JSON Deployment.
metadata
  Name: Service name. Must be unique within the region.
  Type: Set to ScalableJob to deploy the asynchronous inference service as a Scalable Job service.
containers
  Image: Image addresses for the AI portrait prediction service and WebUI prediction service. The following list provides supported images. This solution uses images for the China (Beijing) region.
    Image addresses for China (Beijing):
      AI portrait prediction service: registry.cn-beijing.aliyuncs.com/mybigpai/photog_pub:infer.1.0.0.pub
      WebUI prediction service: eas-registry-vpc.cn-beijing.cr.aliyuncs.com/pai-eas/stable-diffusion-webui:3.2
    Image addresses for Singapore:
      AI portrait prediction service: registry.ap-southeast-1.aliyuncs.com/mybigpai/photog_pub:infer.1.0.0.pub
      WebUI prediction service: eas-registry-vpc.ap-southeast-1.cr.aliyuncs.com/pai-eas/stable-diffusion-webui:3.2
storage
  Path: OSS mount path. Use the same OSS bucket path as the verification service. Example: oss://examplebucket/
    Download and extract the model files required by WebUI, and store them in your OSS bucket at oss://examplebucket/photog_oss/webui with the directory structure shown below. For more information about uploading files to OSS, see Command-line tool ossutil 1.0. For more information about uploading files to NAS, see Quick start (Linux) and Use Workbench to manage files on an ECS instance.
  Mount path: Set to /photog_oss
Click Deploy.
When you deploy a Scalable Job service, a queue service is automatically created with horizontal auto scaling enabled by default.
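The image addresses used across the three services follow a regular pattern, so switching a configuration between the two supported regions can be scripted. The following is a minimal sketch that rebuilds the addresses listed in this topic. The function names are hypothetical, and the region IDs cn-beijing and ap-southeast-1 are taken from the addresses themselves.

```python
SUPPORTED_REGIONS = ("cn-beijing", "ap-southeast-1")
ROLES = ("check", "train", "infer")


def photog_image(region: str, role: str) -> str:
    """Return the AI portrait image address for a region and service role."""
    if region not in SUPPORTED_REGIONS:
        raise ValueError(f"unsupported region: {region}")
    if role not in ROLES:
        raise ValueError(f"unknown role: {role}")
    return f"registry.{region}.aliyuncs.com/mybigpai/photog_pub:{role}.1.0.0.pub"


def webui_image(region: str) -> str:
    """Return the WebUI prediction service image address for a region."""
    if region not in SUPPORTED_REGIONS:
        raise ValueError(f"unsupported region: {region}")
    return f"eas-registry-vpc.{region}.cr.aliyuncs.com/pai-eas/stable-diffusion-webui:3.2"


if __name__ == "__main__":
    for region in SUPPORTED_REGIONS:
        print(photog_image(region, "infer"))
        print(webui_image(region))
```

The sketch rejects any region other than the two in which the solution is available, which helps catch a mistyped region before deployment.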
Call the service
After the service is deployed, call it to generate AI portraits.
When calling the service, set taskType to query to send an inference request, as described in Call services. Use the following sample code:
import json
from eas_prediction import QueueClient

# Create an input queue client to write input data.
input_queue = QueueClient('182848887922****.cn-shanghai.pai-eas.aliyuncs.com', 'photog_check')
input_queue.set_token('<token>')
input_queue.init()

datas = json.dumps(
    {
        'request_id': 12345,
        'images': ["xx.jpg", "xx.jpg"],  # A list of image URLs.
        'configure': {
            'face_reconize': True,  # Check whether all images show the same person.
        }
    }
)

# Specify taskType as query.
tags = {"taskType": "query"}
index, request_id = input_queue.put(datas, tags)
print(index, request_id)

# View the details of the input queue.
attrs = input_queue.attributes()
print(attrs)

Related documentation
For information about how to use the Scalable Job service for training, see Deploy a scalable Kohya training service.
For information about the Scalable Job service, see Scalable Job service overview.