In Elastic Algorithm Service (EAS), you can define and deploy online services using a JSON configuration file. Once you prepare the file, you can deploy the service using the PAI console, the EASCMD client, or SDKs.
1. Prepare a JSON configuration file
Service deployment centers on a JSON file that defines the service configuration. For first-time users, we recommend using the console to configure basic settings on the service deployment page. The system automatically generates the corresponding JSON content, which you can then modify and extend.
The following code provides an example of a service.json file. For a complete list of parameters and their descriptions, see Appendix: JSON parameter descriptions.
{
"cloud": {
"computing": {
"instances": [
{
"type": "ecs.c7a.large"
}
]
}
},
"containers": [
{
"image": "****-registry.cn-beijing.cr.aliyuncs.com/***/***:latest",
"port": 8000,
"script": "python app.py"
}
],
"metadata": {
"cpu": 2,
"instance": 1,
"memory": 4000,
"name": "demo"
}
}
2. Deploy a service using a JSON file
Console
Log on to the PAI console. Select a region at the top of the page. Then, select the desired workspace and click Elastic Algorithm Service (EAS).
On the Inference Service tab, click Deploy Service. On the Deploy Service page, select the JSON-based deployment option.
Paste the content of your prepared JSON file and click Deploy. The service is successfully deployed when its status changes to Running.
EASCMD
You can use the EASCMD client tool to manage model services on your own server, including creating, viewing, deleting, and updating services. Follow these steps:
Download and authenticate the client
If you use a Data Science Workshop (DSW) development environment with an official image, the EASCMD client is pre-installed at /etc/dsw/eascmd64. Otherwise, you must download and authenticate the client.
Run the deployment command
In the directory where your JSON file is located, run the following command to deploy the service. This example uses the 64-bit version for Windows. For more information about other operations, see Command reference.
eascmdwin64 create <service.json>
Replace <service.json> with the actual name of your JSON file.
Note: If you use a DSW development environment, you must first upload the JSON configuration file to it. For more information, see Upload and download files.
The system returns a result similar to the following.
[RequestId]: 1651567F-8F8D-4A2B-933D-F8D3E2DD****
+-------------------+-----------------------------------------------------------------------------------+
| Intranet Endpoint | http://166233998075****.cn-shanghai.pai-eas.aliyuncs.com/api/predict/test_eascmd |
| Token             | YjhjOWQ2ZjNkYzdiYjEzMDZjOGEyNGY5MDIxMzczZWUzNGEyMzhi****                          |
+-------------------+-----------------------------------------------------------------------------------+
[OK] Creating api gateway
[OK] Building image [registry-vpc.cn-shanghai.aliyuncs.com/eas/test_eascmd_cn-shanghai:v0.0.1-20221122114614]
[OK] Pushing image [registry-vpc.cn-shanghai.aliyuncs.com/eas/test_eascmd_cn-shanghai:v0.0.1-20221122114614]
[OK] Waiting [Total: 1, Pending: 1, Running: 0]
[OK] Waiting [Total: 1, Pending: 1, Running: 0]
[OK] Service is running
Appendix: JSON parameter descriptions
| Parameter | Required | Description |
| --- | --- | --- |
| metadata | Yes | The metadata of the service. For more information about the parameters, see metadata parameter descriptions. |
| cloud | No | The configurations of computing resources and VPCs. For more information, see cloud parameter descriptions. |
| containers | No | The image configurations. For more information, see containers parameter descriptions. |
| dockerAuth | No | If the image comes from a private repository, set dockerAuth to the Base64-encoded credentials of the image repository, as shown in the JSON configuration example at the end of this topic. |
| networking | No | The call configurations of the service. For more information about the parameters, see networking parameter descriptions. |
| storage | No | The information about service storage mounting. For more information about the configurations, see Mount storage. |
| token | No | The token string for access authentication. If you do not specify this parameter, the system automatically generates a token. |
| aimaster | No | Enables the computing power check and fault tolerance features for multi-machine distributed inference services. |
| model_path | Yes | Required when you deploy a service using a processor. The model_path and processor_path parameters specify the paths of the input data sources for the model and the processor. Supported address formats include OSS paths such as oss://examplebucket/exampledir/ (see the JSON configuration example at the end of this topic). |
| oss_endpoint | No | The endpoint of OSS. Example: oss-cn-beijing.aliyuncs.com. For other values, see OSS regions and endpoints. Note: By default, you do not need to specify this parameter; the system uses the internal OSS endpoint of the current region to download the model file or processor file. Specify this parameter only for cross-region OSS access. For example, if you deploy a service in the China (Hangzhou) region and set model_path to an OSS address in the China (Beijing) region, set this parameter to the public OSS endpoint of the China (Beijing) region. |
| model_entry | No | The entry file of the model. It can be any file. If you do not specify this parameter, the file name in model_path is used. The path of the main file is passed to the initialize() function in the processor. |
| model_config | No | The configuration of the model. Any text is supported. The value of this parameter is passed to the second parameter of the initialize() function in the processor. |
| processor | No | The built-in processor used to deploy the service, such as tensorflow_cpu_1.12 (see the JSON configuration example at the end of this topic). |
| processor_path | No | The path of the processor file package. For more information, see the description of the model_path parameter. |
| processor_entry | No | The main file of the processor, such as libprocessor.so or app.py, which contains the implementations of the initialize() and process() functions. Required when processor_type is cpp or python. |
| processor_mainclass | No | The main class of the processor in the JAR package. Example: com.aliyun.TestProcessor. Required when processor_type is java. |
| processor_type | No | The language in which the processor is implemented. Valid values: cpp, java, and python. |
| warm_up_data_path | No | The path of the request file used for model prefetching. For more information about the model prefetch feature, see Prefetch a model service. |
| runtime.enable_crash_block | No | Specifies whether a service instance automatically restarts after it crashes due to an exception in the processor code. Valid values: true (the instance does not automatically restart, which facilitates troubleshooting) and false (default: the instance automatically restarts). |
| autoscaler | No | The configuration information for automatic horizontal scaling of the model service. For more information about the parameters, see Horizontal auto scaling. |
| labels | No | The labels to configure for the EAS service, in the form of key-value pairs: {"key": "value"}. |
| unit.size | No | The number of machines deployed for a single instance in a distributed inference configuration. Default value: 2. |
| sinker | No | Persists all service requests and responses to MaxCompute or Simple Log Service (SLS). For more information about the parameters, see sinker parameter descriptions. |
| confidential | No | Configures the trusted service so that information such as data, models, and code is securely encrypted during service deployment and invocation, enabling secure and verifiable inference services. For the format, see the confidential block in the JSON configuration example at the end of this topic, which sets trustee_endpoint and decryption_key. Note: The secure encryption environment mainly protects your mounted storage files. Complete the storage mounting before you enable this feature. |
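To tie the processor-related parameters together, the following is a minimal sketch of a processor-based service.json. The processor code, OSS path, and metadata values are placeholders drawn from the examples in this topic, not values to copy verbatim:
{
  "processor": "tensorflow_cpu_1.12",
  "model_path": "oss://examplebucket/exampledir/",
  "metadata": {
    "name": "demo",
    "instance": 1,
    "cpu": 1,
    "memory": 2000
  }
}
Because no token is specified, the system generates one automatically, as described in the table above.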
metadata parameter descriptions
Advanced parameters
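The JSON configuration example at the end of this topic exercises most metadata fields. As a reference, here is a minimal sketch of the common basic fields together with two advanced blocks, rpc and rolling_strategy, using only fields and values that appear in that example:
"metadata": {
  "name": "demo",
  "instance": 1,
  "cpu": 2,
  "memory": 4000,
  "rpc": {
    "worker_threads": 5,
    "keepalive": 5000
  },
  "rolling_strategy": {
    "max_surge": 1,
    "max_unavailable": 1
  }
}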
cloud parameter descriptions
| Parameter | Sub-parameter | Required | Description |
| --- | --- | --- | --- |
| computing | instances | No | Required when you deploy a service in a public resource group. Specifies a list of resource specifications; each entry sets an instance type (type) and can also set capacity or spot_price_limit, as shown in the following example. If a spot instance bid fails or inventory is insufficient, the system attempts to create the service using the next instance specification in the configured order. |
| computing | disable_spot_protection_period | No | Required when you use spot instances. Specifies whether to disable the spot instance protection period. Valid values: true and false. |
| networking | vpc_id | No | The VPC to bind to the EAS service. |
| networking | vswitch_id | No | The vSwitch to bind to the EAS service. |
| networking | security_group_id | No | The security group to bind to the EAS service. |
Example:
{
"cloud": {
"computing": {
"instances": [
{
"type": "ecs.c8i.2xlarge",
"spot_price_limit": 1
},
{
"type": "ecs.c8i.xlarge",
"capacity": "20%"
}
],
"disable_spot_protection_period": false
},
"networking": {
"vpc_id": "vpc-bp1oll7xawovg9*****",
"vswitch_id": "vsw-bp1jjgkw51nsca1e****",
"security_group_id": "sg-bp1ej061cnyfn0b*****"
}
}
}
containers parameter descriptions
When deploying a service using a custom image, see Custom images.
| Parameter | Sub-parameter | Required | Description |
| --- | --- | --- | --- |
| image | | Yes | Required when deploying with an image. The address of the image used to deploy the model service. |
| env | name | No | The name of an environment variable for the container. |
| env | value | No | The value of the environment variable for the container. |
| command | | One of command and script is required. | The entry point command for the image. Only single commands are supported; complex scripts, such as commands chained with &&, are not supported. For such scripts, use script instead. |
| script | | | The entry point script for the image. More complex script formats can be specified; use line breaks or semicolons to separate commands. |
| port | | No | The container port. Important: Do not use ports 8080 and 9090, because the EAS engine listens on these ports. |
| prepare | pythonRequirements | No | A list of Python requirements to install before the instance starts. The image must have python and pip installed and available in the default paths. |
| prepare | pythonRequirementsPath | No | The path to a requirements.txt file whose packages are installed before the instance starts. The image must have python and pip installed. |
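As a combined reference, the following sketch assembles the containers parameters above. The image address and package version come from the examples in this topic; the environment variable name EXAMPLE_ENV and the /workspace working directory are hypothetical placeholders, and the env block follows the name/value structure described in the table. script is used instead of command because the entry point chains two commands with &&:
"containers": [
  {
    "image": "registry-vpc.cn-shanghai.aliyuncs.com/xxx/yyy:zzz",
    "env": [
      {
        "name": "EXAMPLE_ENV",
        "value": "1"
      }
    ],
    "prepare": {
      "pythonRequirements": [
        "numpy==1.16.4"
      ]
    },
    "script": "cd /workspace && python app.py",
    "port": 8000
  }
]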
networking parameter descriptions
| Parameter | Required | Description |
| --- | --- | --- |
| gateway | No | The dedicated gateway configured for the EAS service. |
| gateway_policy | No | The traffic policy, such as rate limiting, configured for the dedicated gateway. |
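For example, a minimal sketch that binds a service to a dedicated gateway, reusing the gateway ID format from the JSON configuration example at the end of this topic:
"networking": {
  "gateway": "gw-m2vkzbpixm7mo****"
}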
sinker parameter descriptions
| Parameter | Sub-parameter | Required | Description |
| --- | --- | --- | --- |
| type | | No | The storage type for persisting records. Valid values: maxcompute and sls. |
| config | maxcompute.project | No | The MaxCompute project name. |
| config | maxcompute.table | No | The MaxCompute data table. |
| config | sls.project | No | The SLS project name. |
| config | sls.logstore | No | The SLS Logstore. |
The following are configuration examples:
Store in MaxCompute
"sinker": {
"type": "maxcompute",
"config": {
"maxcompute": {
"project": "cl****",
"table": "te****"
}
}
}
Store in Simple Log Service
"sinker": {
"type": "sls",
"config": {
"sls": {
"project": "k8s-log-****",
"logstore": "d****"
}
}
}
Appendix: JSON configuration example
The following is a sample JSON file that shows how the preceding parameters can be configured:
{
"token": "****M5Mjk0NDZhM2EwYzUzOGE0OGMx****",
"processor": "tensorflow_cpu_1.12",
"model_path": "oss://examplebucket/exampledir/",
"oss_endpoint": "oss-cn-beijing.aliyuncs.com",
"model_entry": "",
"model_config": "",
"processor_path": "",
"processor_entry": "",
"processor_mainclass": "",
"processor_type": "",
"warm_up_data_path": "",
"runtime": {
"enable_crash_block": false
},
"unit": {
"size": 2
},
"sinker": {
"type": "maxcompute",
"config": {
"maxcompute": {
"project": "cl****",
"table": "te****"
}
}
},
"cloud": {
"computing": {
"instances": [
{
"capacity": 800,
"type": "dedicated_resource"
},
{
"capacity": 200,
"type": "ecs.c7.4xlarge",
"spot_price_limit": 3.6
}
],
"disable_spot_protection_period": true
},
"networking": {
"vpc_id": "vpc-bp1oll7xawovg9t8****",
"vswitch_id": "vsw-bp1jjgkw51nsca1e****",
"security_group_id": "sg-bp1ej061cnyfn0b****"
}
},
"autoscaler": {
"min": 2,
"max": 5,
"strategies": {
"qps": 10
}
},
"storage": [
{
"mount_path": "/data_oss",
"oss": {
"endpoint": "oss-cn-shanghai-internal.aliyuncs.com",
"path": "oss://bucket/path/"
}
}
],
"confidential": {
"trustee_endpoint": "xx",
"decryption_key": "xx"
},
"metadata": {
"name": "test_eascmd",
"resource": "eas-r-9lkbl2jvdm0puv****",
"instance": 1,
"workspace_id": "1405**",
"gpu": 0,
"cpu": 1,
"memory": 2000,
"gpu_memory": 10,
"gpu_core_percentage": 10,
"qos": "",
"cuda": "11.2",
"enable_grpc": false,
"enable_webservice": false,
"rdma": 1,
"rpc": {
"batching": false,
"keepalive": 5000,
"io_threads": 4,
"max_batch_size": 16,
"max_batch_timeout": 50,
"max_queue_size": 64,
"worker_threads": 5,
"rate_limit": 0,
"enable_sigterm": false
},
"rolling_strategy": {
"max_surge": 1,
"max_unavailable": 1
},
"eas.termination_grace_period": 30,
"scheduling": {
"spread": {
"policy": "host"
}
},
"resource_rebalancing": false,
"workload_type": "elasticjob",
"shm_size": 100
},
"features": {
"eas.aliyun.com/extra-ephemeral-storage": "100Gi",
"eas.aliyun.com/gpu-driver-version": "tesla=550.127.08"
},
"networking": {
"gateway": "gw-m2vkzbpixm7mo****"
},
"containers": [
{
"image": "registry-vpc.cn-shanghai.aliyuncs.com/xxx/yyy:zzz",
"prepare": {
"pythonRequirements": [
"numpy==1.16.4",
"absl-py==0.11.0"
]
},
"command": "python app.py",
"port": 8000
}
],
"dockerAuth": "dGVzdGNhbzoxM*******"
}