Configure liveness, readiness, and startup probes using HTTP GET, TCP socket, or custom commands to monitor container health and prevent traffic to failed instances.
## Limitation
Health checks are available only for services deployed using custom images that include health check logic.
## How it works
EAS uses the Kubernetes health check mechanism with probes and health check methods to monitor service health and availability.
Probe types:

| Probe type | Description |
| --- | --- |
| Liveness probes | Determine whether a container is running. If a liveness probe fails, the kubelet kills the container and applies the restart policy. If no liveness probe is configured, the kubelet assumes the probe always returns Success. |
| Readiness probes | Determine whether a container is ready to serve requests. Only ready Pods can receive traffic. The association between a Service and its Endpoints depends on Pod readiness: if a Pod is not ready, its IP address is removed from the Endpoint list; when the Pod becomes ready, its IP address is added back. |
| Startup probes | Determine when a container's application has started. A startup probe delays liveness and readiness checks until the container fully initializes, preventing termination of slow-starting containers. |
Health check methods:

| Health check method | Description |
| --- | --- |
| http_get | Sends an HTTP GET request to check service health. The response status code determines success. |
| tcp_socket | Opens a TCP connection to check service health. The check succeeds if the connection can be established. |
| exec | Executes a specified command inside the container. The check succeeds if the command exits with code 0. |
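The three methods map onto standard checks that you can reproduce locally. The following sketch simulates each one with the Python standard library; the throwaway local HTTP server exists only so the checks have something to probe and is not part of EAS:

```python
import socket
import subprocess
import sys
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Throwaway local server that answers 200 OK, standing in for a healthy service.
class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b'ok')

    def log_message(self, *args):
        pass  # silence per-request logging

server = HTTPServer(('127.0.0.1', 0), Handler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

def http_get_check(host, port, path='/'):
    """http_get: success if the HTTP status code indicates health."""
    try:
        with urllib.request.urlopen(f'http://{host}:{port}{path}', timeout=1) as r:
            return 200 <= r.status < 400
    except OSError:
        return False

def tcp_socket_check(host, port):
    """tcp_socket: success if a TCP connection can be opened."""
    try:
        with socket.create_connection((host, port), timeout=1):
            return True
    except OSError:
        return False

def exec_check(cmd):
    """exec: success if the command exits with code 0."""
    return subprocess.run(cmd).returncode == 0

ok_http = http_get_check('127.0.0.1', port)
ok_tcp = tcp_socket_check('127.0.0.1', port)
ok_exec = exec_check([sys.executable, '-c', 'pass'])
print(ok_http, ok_tcp, ok_exec)
```

All three checks succeed against the healthy local server; in EAS, the same logic runs against your container on the configured port.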
## Prepare a custom image
Use a web framework to encapsulate your prediction logic. The following example uses the Flask framework and an app.py file:
```python
import json

from flask import Flask, request, make_response

app = Flask(__name__)

@app.route('/', methods=['GET', 'POST'])
def process_handle_func():
    # Parse the request body based on your requirements.
    data = request.get_data().decode('utf-8')
    body = json.loads(data)
    res = process(body)
    # Set the response based on your requirements.
    response = make_response(res)
    response.status_code = 200
    return response

def process(data):
    # Your prediction logic
    return 'result'

if __name__ == '__main__':
    # Note: Set host to '0.0.0.0'. Otherwise, the health check fails during
    # service deployment. The port must match the port specified in the JSON
    # deployment configuration file for the service.
    app.run(host='0.0.0.0', port=8000)
```
Write a simple Dockerfile to copy the prediction code and install required packages. The following is an example Dockerfile:
```dockerfile
# This example uses Python.
FROM registry.cn-shanghai.aliyuncs.com/eas/bashbase-amd64:0.0.1
COPY ./process_code /eas
RUN /xxx/pip install required_packages
CMD ["/xxx/python", "/eas/xxx/app.py"]
```
For steps on building a custom image, see Build Images on a Container Registry Enterprise Edition Instance. For more information, see Custom images. Alternatively, store your code in a NAS file system or Git repository and attach it to the service instance via storage mount during deployment. For more information, see Storage mount. This topic describes how to configure health checks during service deployment.
## Configure health checks
### Custom deployment
1. Log on to the PAI console. Select a region at the top of the page. Then, select the desired workspace and click Elastic Algorithm Service (EAS).
2. On the Inference Service tab, click Deploy Service. In the Custom Model Deployment section, click Custom Deployment.
3. In the Environment Information section, configure the key parameters. For other parameters, see Deploy a custom inference service.

   | Parameter | Description |
   | --- | --- |
   | Image Configuration | Select Image Address and enter the custom image address. For example, registry-vpc.cn-shanghai.aliyuncs.com/xxx/yyy:zzz. |
   | Command | Entry command for the image. Only single commands are supported, not complex scripts. This command must match the Dockerfile command. For example, /data/eas/ENV/bin/python /data/eas/app.py. |
   | Port | Enter a port number. This is the local HTTP port the image listens on after starting, such as 8000. Important: The EAS engine listens on fixed ports 8080 and 9090, so your container port cannot be 8080 or 9090. This port must also match the port specified in the xxx.py file referenced by the run command. |
   | Health Check | Turn on the Health Check switch, configure the parameters, and click OK. For parameter details, see the Health check parameters table. Note: You can add up to three health checks, each with a unique probe type. |

4. After configuring the parameters, click Deploy.
### JSON deployment
Create a JSON file named service.json. The following is an example of the file content.
```json
{
  "metadata": {
    "name": "test",
    "instance": 1,
    "enable_webservice": true
  },
  "cloud": {
    "computing": {
      "instance_type": "ml.gu7i.c16m60.1-gu30"
    }
  },
  "containers": [
    {
      "image": "registry-vpc.cn-shanghai.aliyuncs.com/xxx/yyy:zzz",
      "env": [
        {
          "name": "VAR_NAME",
          "value": "var_value"
        }
      ],
      "liveness_check": {
        "http_get": {
          "path": "/",
          "port": 8000
        },
        "initial_delay_seconds": 3,
        "period_seconds": 3,
        "timeout_seconds": 1,
        "success_threshold": 2,
        "failure_threshold": 4
      },
      "command": "/data/eas/ENV/bin/python /data/eas/app1.py",
      "port": 8000
    }
  ]
}
```
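Before deploying, it can help to assemble and sanity-check the container section programmatically. The following sketch is illustrative only: `make_container` is not an EAS API, but the field names mirror the service.json example above, and the port check enforces the reserved-port rule from the console parameters:

```python
import json

RESERVED_PORTS = {8080, 9090}  # fixed ports used by the EAS engine

def make_container(image, command, port,
                   path='/', initial_delay=3, period=3,
                   timeout=1, success=2, failure=4):
    """Assemble a container entry with an http_get liveness check.

    Illustrative helper; field names mirror the service.json example.
    """
    if port in RESERVED_PORTS:
        raise ValueError(f'port {port} is reserved by the EAS engine')
    return {
        'image': image,
        'command': command,
        'port': port,
        'liveness_check': {
            'http_get': {'path': path, 'port': port},
            'initial_delay_seconds': initial_delay,
            'period_seconds': period,
            'timeout_seconds': timeout,
            'success_threshold': success,
            'failure_threshold': failure,
        },
    }

container = make_container(
    'registry-vpc.cn-shanghai.aliyuncs.com/xxx/yyy:zzz',
    '/data/eas/ENV/bin/python /data/eas/app1.py',
    8000,
)
print(json.dumps(container, indent=2))
```

Passing 8080 or 9090 raises a ValueError instead of producing a configuration that would conflict with the EAS engine.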
The following table describes the key parameters. For other parameters, see JSON deployment.
| Parameter | Description |
| --- | --- |
| image | Address of the custom image used to deploy the model service. EAS does not provide public network access, so use a VPC internal registry address. For example, registry-vpc.cn-shanghai.aliyuncs.com/xxx/yyy:zzz. |
| env.name | Name of the environment variable. |
| env.value | Value of the environment variable. |
| command | Entry command for the image. Supports only single commands, not complex scripts. For example, /data/eas/ENV/bin/python /data/eas/app1.py. |
| port | Network port that the process in the image listens on. For example, 8000. Important: This port must match the port configured in the xxx.py file specified in the command. |
| liveness_check | Health check configuration. Note: This example uses a liveness probe; you can also use a readiness or startup probe. The sub-parameters are described in the following rows. |
| liveness_check.http_get | Uses the HTTP GET method to check the specified port. Parameters: path (the request path) and port (the port to check). The two other health check methods are tcp_socket and exec. |
| liveness_check.initial_delay_seconds | Delay in seconds after the container starts before the first health check runs. Default: 0. |
| liveness_check.period_seconds | Interval in seconds between health checks. Default: 10. A short interval increases Pod overhead, while a long interval delays failure detection. |
| liveness_check.timeout_seconds | Number of seconds after which the health check times out. Default: 1. A timeout is counted as a failure. |
| liveness_check.failure_threshold | Number of consecutive failures after a success required to mark the container as failed. Default: 3 for readiness probes, 1 for liveness and startup probes. |
| liveness_check.success_threshold | Number of consecutive successes after a failure required to mark the container as successful. Default: 1. |
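These timing parameters interact: a failing container is not acted on immediately, but only after failure_threshold consecutive failed probes. The following back-of-the-envelope estimate (an approximation, not an EAS-documented formula) shows a rough upper bound on detection time:

```python
def worst_case_detection_seconds(initial_delay, period, timeout, failure_threshold):
    """Rough upper bound on time to declare failure after container start:
    wait out the initial delay, then failure_threshold probes, each of which
    may take up to `timeout` seconds before counting as a failure.
    """
    return initial_delay + failure_threshold * (period + timeout)

# Values from the service.json example above: 3 + 4 * (3 + 1) = 19 seconds.
print(worst_case_detection_seconds(3, 3, 1, 4))
```

Shortening period_seconds or lowering failure_threshold detects failures sooner, at the cost of more probe overhead and a higher risk of restarting a container on a transient blip.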