All Products
Search
Document Center

Platform For AI:Automatic service stress testing

Last Updated:Dec 05, 2025

Elastic Algorithm Service (EAS) provides stress testing methods for large language model (LLM) services and general services. You can easily create stress testing tasks and perform stress testing with a few clicks. This helps you fully understand the performance of EAS services. This topic describes how to create and manage stress testing tasks.

Step 1: Go to the Create Stress Testing Task page

  1. Log on to the PAI console. On the page that appears, select the desired region in the top navigation bar. Then, select the workspace that you want to manage, and click Enter Elastic Algorithm Service (EAS).

  2. On the Benchmark Task tab, click Create Stress Testing Task.

Step 2: Create a stress testing task

LLM service stress testing

If your EAS service is an LLM service, we recommend that you select the LLM Service check box. This way, you will receive a stress testing report designed for LLM scenarios.

You can create LLM service stress testing tasks only in the console. The EASCMD client does not support such tasks.

Important

LLM service stress testing only supports two OpenAI APIs: /v1/completions and /v1/chat/completions. Therefore, only EAS services deployed by using inference engines that are compatible with the two OpenAI APIs, such as vLLM, SGLang, LMDeploy, and BladeLLM, can use the LLM service stress testing feature.

On the Create Stress Testing Task page, configure the following parameters. After the parameters are configured, click Confirm.

Basic information

image

Parameter

Description

Service

Select a service on which you want to perform stress testing and select LLM Service.

Service Endpoint

Only the following two OpenAI API-compatible endpoints are supported:

  • Completions: /v1/completions, used for single-turn completion tasks.

  • Chat: /v1/chat/completions, used for multi-turn conversational tasks.

Stress Testing URL

After you select the service endpoint, the system automatically generates a URL.

Model ID

Required. Enter the ID of the model from the ModelScope or Hugging Face open source community. Example: Qwen/Qwen2.5-7B-Instruct.

  • Used to load the related tokenizer to accurately calculate the number of tokens during stress testing.

  • Used as the model parameter in the request if the model name is not specified.

Model Name

Optional. Enter the model name to construct the model parameter in the request. This parameter takes precedence over the Model ID parameter. If this parameter is not specified, the model ID is used as the model parameter.

  • You do not need to configure the model name in the following scenarios:

    • The LLM service is deployed using the BladeLLM inference engine.

    • The LLM service is deployed using vLLM, SGLang, or LMDeploy inference engines, and the model parameter in the request is the model ID.

  • You must configure the model name in the following scenario:

    The LLM service is deployed using vLLM, SGLang, or LMDeploy inference engines, and the model parameter in the request is not the model ID. Instead, the model parameter is the local path of the model or a custom name specified by --served-model-name when the vLLM service is started.

Data Type and related parameters

Data Type

Description

Public Dataset

The public ShareGPT dataset is used for stress testing. The dataset contains multiple conversation records and can be used to evaluate the performance of LLM services. You must configure the following parameters:

  • Dataset: Only supports ShareGPT.

  • Output Length: The length of generated text. If you leave this parameter empty, the output result will not be truncated. This parameter helps you test the stability and performance of LLM services under different loads.

Custom Dataset

Based on your specific usage scenario, configure a custom dataset:

  • Data Source: Custom data files of the following data sources can be used for stress testing:

    • Single Data Entry: Enter the stress testing request data in the Single Data Entry field. The format must be a Base64-encoded string.

    • Data Address: Enter the HTTP path of the test data source in the Data Address field. You can enter the path of a single file or a ZIP package. ZIP packages will be automatically decompressed after download.

    • OSS Object: Select an Object Storage Service (OSS) path to store the stress testing files.

    • Upload Local File: Select an OSS bucket and upload local stress testing files to the OSS bucket.

  • Split File Data By Line: If Data Source is set to Data Address, OSS Object, or Upload Local File, this switch is available. If you turn on this switch, the uploaded stress testing file is separated by lines and each line is used as a stress testing request. If you turn off this switch, the entire file content is used as a single stress testing request.

    Note

    For stress testing file configuration examples, see benchmark_demo.json. Each data entry in the file is an actual user request in JSON format. First use the online debugging feature to verify whether the format of each single request data is correct.

Simulation Data

  • Data Generation Mode: Only supports Uniform Distribution.

  • Input Tokens: The length range of input tokens. The minimum value is 10, and the maximum value is 10000. Default value: 1024.

  • Output Tokens: The length range of output tokens. The minimum value is 10, and the maximum value is 10000. Default value: 128.

Test Mode and related parameters

The following three test modes are supported:

  • Fixed Concurrency Test: Specify a fixed concurrency to test the service performance under the specific concurrency level.

  • Fixed Request Rate Test: Specify a fixed request rate to test the service performance under the specific request rate.

  • Extreme Throughput Test: All requests are sent at once to find the maximum request queries per second (QPS) that the inference service can handle. This mode is suitable for testing the maximum capacity of the system.

For Fixed Concurrency Test and Fixed Request Rate Test modes, you can enable Continuous Testing.

  • If you enable Continuous Testing, the task will run until the testing duration ends, regardless of the Number Of Request Samples.

  • If you disable Continuous Testing, the task will stop after completing the specified Number Of Request Samples or reaching the Maximum Duration.

The parameters for different modes:

Test mode

Description

Fixed Concurrency Test

  • Concurrency: Specify a concurrency to simulate the number of concurrent users. Value range: [1, 200].

  • Maximum Duration (s): The stress testing duration, in seconds. Default value: 300. Minimum value: 30.

  • Number Of Request Samples: If you enable Continuous Test, you do not need to set this parameter. This parameter specifies the number of requests sent during stress testing. Value range: [100, 1000].

Fixed Request Rate Test

  • Request Rate: The number of requests sent per second.

  • Maximum Concurrency: Specify the maximum concurrency to simulate the number of concurrent users.

  • Maximum Duration (s): The stress testing duration, in seconds. Default value: 300. Minimum value: 30.

  • Number Of Request Samples: If you enable Continuous Testing, you do not need to set this parameter. This parameter specifies the number of requests sent during stress testing. Value range: [100, 1000].

Extreme Throughput Test

  • Maximum Duration (s): The stress testing duration, in seconds. Default value: 300. Minimum value: 30.

  • Number Of Request Samples: The number of requests sent during stress testing. Value range: [100, 1000].

Other Configurations

Parameter

Description

HTTP Header

The request header, in key-value pairs. Examples:

  • Pass authorization information: Authorization: EAS_TOKEN

  • Set the data format of the request body: Content-Type: application/json

Burstiness

  • Default value: 1. Granularity: 0.1. Range: 0.1-200.

  • Burstiness controls the time distribution pattern of request generation and is only effective in Fixed Request Rate mode. The default value is 1, which follows a Poisson distribution. Other values follow a gamma distribution. The smaller the value, the more bursty the request flow. The larger the value, the more uniform the request flow.

Random Seed

Default value: 0. Data type: Integer. Range: 0-4294967295 (2**32-1).

Ignore EOS

Enabling Ignore EOS means that the model will ignore the End-of-Sequence (EOS) token when generating text, forcing generation until the preset maximum generation length is reached.

Common service stress testing

Common service stress testing supports the following three modes:

  • auto mode: Automatic load increase mode. In this mode, eas-benchmark automatically creates agent workers for stress testing, configures the concurrency, and uses the optimal algorithm to determine the maximum capacity of the service.

  • scan mode: Periodic mode. In this mode, eas-benchmark dynamically increases the service load based on parameters such as minQPS, maxQPS, adjustInterval, and qpsGrowthDelta, until the actual value of the maxRT, maxQPS, or faultTolerate parameter reaches the specified one.

  • manual mode: Manual mode. In this mode, you can manually adjust the number of agents and the concurrency of each agent during stress testing.

The console only supports auto mode, while the EASCMD client supports all three stress testing modes: auto, scan, and manual. Usage:

Use the console

Note

The stress testing console limits the timeout for stress testing requests to 20 seconds. If you see a 512 return code in the stress testing report, it is likely due to request timeout. Currently, the EAS stress testing console does not support custom timeout configuration.

On the Create Stress Testing Task page, configure the following parameters. After the parameters are configured, click Confirm.

image

Parameter

Description

Basic Information

Service

Select a service on which you want to perform stress testing.

Stress Testing URL

The URL to call the service.

Stress Testing Configurations

Data Source

Valid values: Single Data Entry, Data Address, OSS Object, and Upload Local File. For information about how to construct stress testing data and supported file types, see Appendix 1: Stress testing data.

Note
  • The format of a single stress testing request must be a Base64-encoded string.

  • Stress testing supports single files or ZIP packages. ZIP packages will be automatically decompressed after download.

Split File Data By Line

If Data Source is set to Data Address, OSS Object, or Upload Local File, this switch is available.

If you turn on this switch, the uploaded stress testing file is separated by lines and each line is used as a stress testing request. If you turn off this switch, the entire file content is used as a single stress testing request.

Maximum Duration (s)

The stress testing duration, in seconds. Default value: 300.

Maximum QPS

The maximum QPS for stress testing. Default value: 10000.

Maximum Response Time (ms)

The maximum response time of the stress testing, in milliseconds. If the response time exceeds this threshold, the system automatically adjusts QPS until the real-time response time meets your business requirements.

HTTP Header

The request header, in key-value pairs. Examples:

  • Pass authorization information: Authorization: EAS_TOKEN

  • Set the data format of the request body: Content-Type: application/json

Use the EASCMD client

Run the bench create command to create a stress testing task (Download the EASCMD client and complete identity authentication). After the task is created, you can view the real-time monitoring data through the returned URL. In this example, the 64-bit Windows version of the EASCMD client is used. Sample command:

eascmdwin64.exe bench create [bench_desc_json]

The bench_desc_json parameter indicates the JSON configuration file for the stress testing task. Examples:

Single stress testing request data

{
    "service": {
        "serviceName": "xgb_test"
    },
    "data": {
        "content": "W1sxLDAsMCwwLDEsMSwwLDEsMCwxLDEsMCwwLDEsMCwxLDAsMSwwLDAsMSwxLDEsMCwxLDEsMCwwLDAsMSwxLDEsMCwxLDEsMSwxLDAsMSwxLDEsMCwxLDAsMCwwLDEsMSwwLDAsMCwxLDAsMSwwLDEsMCwwLDEsMCwwLDEsMCwxLDAsMCwxLDAsMCwwLDAsMSwwLDEsMCwxLDAsMCwxLDEsMSwwLDAsMSwwLDAsMCwwLDEsMSwxLDAsMSwxLDAsMCwxLDAsMSwwLDEsMSwxLDEsMCwxLDAsMCwxLDEsMSwxLDAsMCwwLDEsMSwwXV0K"
    }
}
                

OSS file stress testing data

Use the path parameter to specify the path of multiple OSS objects that store stress testing data.

You can also package multiple stress testing files into a ZIP file and then set the oss://XX.zip path for the path parameter.

{
    "service": {
        "serviceName": "xgb_test"
    },
    "data": {
        "path": "oss://examplebucket/test1.bin,oss://examplebucket/test2.bin"
    }
}
                

For JSON parameter descriptions, see Appendix 3: JSON configuration parameters for stress testing.

Sample output:

[RequestId]: DE240637-4976-59AF-A28C-BAA55C0A****
[OK] Task [benchmark-xgb-test-b514] is creating
[OK] [Agnet: 0/1]: Succeed to start benchmark master
[OK] [Agnet: 0/1]: Succeed to start benchmark master
[OK] [Agnet: 1/1]: Benchmark task is Running
[OK] Benchmark task is Running
[OK] Click the link http://127.0.0.1:18222/eas-benchmark/statsview to observe realtime visualization details, you can turn it off with CTRL+C.
Turning off will not interrupt the benchmark test task, and you can reopen it by the visualize command:
eascmd -c [config_file] bench visualize benchmark-xgb-test-b514

Step 3: View details of the stress testing task

View real-time monitoring data

When the Status of the stress testing task changes to Running, click the task name to view real-time monitoring data.image

View stress testing report

When the Status of the stress testing task changes to Completed, click the task name to view the stress testing report.

The stress testing report includes Basic Information, Stress Testing Configurations, Test Result, and Test Monitoring sections. The following content describes the details of the Test Monitoring section:

  • Metrics only available for LLM services

    TTFT (Time To First Token)

    The latency of the first output token. This metric displays the time from a request is sent to the time when the first token generated by the service is received.

    image

    TPOT (Time per Output Token)

    The latency of an output token. This metric displays the time interval between two adjacent tokens generated by the service.

    image

    TPS (Token Per Second)

    The number of tokens transmitted per second.

    image

  • Metrics available for all services

    QPS Distribution

    The distribution of the number of requests received by the service per second.

    image

    Response Time Distribution

    The distribution of the number of responses returned by the service within the specified time range.

    image

    Data Transfer Distribution

    The distribution of the requests sent from the client to the service and responses returned by the service to the client within the specified time range.

    image

    Response Time Distribution in Range

    The proportion of response times returned by the service within different ranges. Unit: milliseconds.

    image

    Response Time Distribution

    The end-to-end latency of requests at different percentiles. Unit: milliseconds.

    image

    Status Code Distribution

    The distribution of status codes returned by the service.

    image

Step 4: Manage the stress testing task

Use the console

You can view the created stress testing tasks and enable, clone, copy report, and delete tasks on the Stress Testing Tasks tab.

image

Use the EASCMD client

  • View the stress testing tasks

    Run the bench list command to view the stress testing tasks that you created. In this example, the 64-bit Windows version of the EASCMD client is used. Sample command:

    eascmdwin64.exe bench ls

    Sample output:

    [RequestId]: 7F953F8E-8897-5785-808A-CA648302****
    +-------------------------+--------------------------+-------------+----------------+---------+---------------------+
    |        TASKNAME         |          TASKID          |   REGION    | AVAILABLEAGENT | STATUS  |     CREATETIME      |
    +-------------------------+--------------------------+-------------+----------------+---------+---------------------+
    | benchmark-***-test-**** | eas-b-ql470xog6qeh25**** | cn-shanghai |              0 | Stopped | 2022-06-17 17:58:01 |
    | benchmark-***-test-**** | eas-b-bdnzvwq0z0h3xq**** | cn-shanghai |              2 | Running | 2022-06-20 12:18:54 |
    +-------------------------+--------------------------+-------------+----------------+---------+---------------------+
  • View the details of a stress testing task

    Run the bench desc command to view the details of a specific stress testing task. In this example, the 64-bit Windows version of the EASCMD client is used. Sample command:

    eascmdwin64.exe bench desc [benchmark_task_name]

    Replace [benchmark_task_name] with the name of the stress testing task that you want to manage.

    Sample output:

    +----------------+------------------------------------------------------------------------------+
    |     TaskName   | benchmark-***-test-b514                                                      |
    |     TaskId     | eas-b-bdnzvwq0z0h3xq****                                                     |
    |    ServiceName | xgb_test                                                                     |
    |         Region | cn-shanghai                                                                  |
    |   DesiredAgent | 2                                                                            |
    | AvailableAgent | 2                                                                            |
    |         Status | Running                                                                      |
    |        Message | Benchmark task is running                                                    |
    |     CreateTime | 2021-10-20 12:38:35                                                          |
    |     UpdateTime | 2021-10-20 12:38:45                                                          |
    |         Config | {                                                                            |
    |                |   "base": {                                                                  |
    |                |     "agentCount": 2,                                                         |
    |                |     "concurrency": 40,                                                       |
    |                |     "duration": 1200,                                                        |
    |                |     "requestCount":                                                          |
    |                | 922337203685477****,                                                         |
    |                |   },                                                                         |
    |                |  ...                                                                         |
    |                | }                                                                            |
    +----------------+------------------------------------------------------------------------------+
  • Enable real-time visualization for a stress testing task

    Run the bench visualize command to enable real-time visualization for a stress testing task. This command launches a web server on your local machine at 127.0.0.1, which provides a web page that displays the real-time monitoring data of the stress testing task. In this example, the 64-bit Windows version of the EASCMD client is used. Sample command:

    eascmdwin64.exe bench visualize [benchmark_task_name]

    Replace [benchmark_task_name] with the name of the stress testing task that you want to manage.

    Sample output:

    [OK] Click the link http://127.0.0.1:18734/eas-benchmark/statsview to observe realtime visualization details, you can turn it off with CTRL+C.
    Turning off will not interrupt the benchmark test task, and you can reopen it by the visualize command:
    eascmd -c [config_file] bench visualize benchmark-xgb-test-b514

    To view the real-time monitoring data, visit http://127.0.0.1:18734/eas-benchmark/statsview in a browser.

  • Obtain the stress testing report

    When the status of a stress testing task changes to Stopped, the task is completed. The stress testing report is saved to OSS, and you can run the bench report command to obtain the report. In this example, the 64-bit Windows version of the EASCMD client is used. Sample command:

    eascmdwin64.exe  bench report [benchmark_task_name]

    Replace [benchmark_task_name] with the name of the stress testing task that you want to manage.

    Sample output:

    [OK] Benchmark task benchmark-demo-test-c7eb report url: http://eas-benchmark.oss-cn-chengdu.aliyuncs.com/summary/benchmark-demo-test-c7eb-10004.html

    Copy the link after url: and paste it in a browser to view the stress testing report. The following figure shows a sample report.image.png

  • Dynamically change the number of agents and concurrency

    If you use the manual mode for stress testing, you can use the bench update command to dynamically change the number of agents and concurrency. In this example, the 64-bit Windows version of the EASCMD client is used. Sample command:

    eascmdwin64.exe bench update [benchmark_task_name] -Doptional.concurrency=<attr_value> -Doptional.agentCount=<attr_value>

    Replace <attr_value> with a specific value that you want to configure. Example:

    eascmdwin64.exe bench update benchmark-demo-b99c -Doptional.concurrency=2 -Doptional.agentCount=1

    Sample output:

    [RequestId]: 9920C672-4D41-5CC4-8EC0-C690F76EB2BA
    [OK] Running [TaskName: benchmark-demo-b99c, DesiredAgent:1, AvailableAgent: 1, Message: Benchmark task is Updating]
    [OK] Benchmark task benchmark-demo-b99c was updated successfully
  • Stop a stress testing task

    Run the bench stop command to stop a stress testing task that is running. In this example, the 64-bit Windows version of the EASCMD client is used. Sample command:

    eascmdwin64.exe bench stop [benchmark_task_name]

    Replace [benchmark_task_name] with the name of the stress testing task that you want to manage.

    Sample output:

    Are you sure to stop the benchmark task [benchmark-***-test-b514] in [cn-shanghai]? [Y/n]
    [OK] Task [benchmark-***-test-b514] is stopping
    [OK] [Agnet: 0/1]: Benchmark task is Running
    [OK] [Agnet: 0/1]: Benchmark task is Stopped
    [OK] Benchmark task is stopped

    If you ran the command that enables real-time visualization in a terminal window before you stop the task, the system generates the stress testing report in the terminal. You can also run the bench report command to obtain the detailed HTML report with charts and text.

  • Start a stress testing task

    Run the bench start command to start a stopped stress testing task. In this example, the 64-bit Windows version of the EASCMD client is used. Sample command:

    eascmdwin64.exe bench start [benchmark_task_name]
    Note

    Compared with the bench create command, this command restarts a stress testing task based on the most recently updated configuration of the task.

    Replace [benchmark_task_name] with the name of the stress testing task that you want to manage.

    Sample output:

    Are you sure to start the benchmark task [benchmark-***-test-b514] in [cn-shanghai]? [Y/n]
    [OK] Task [benchmark-***-test-b514] is starting
    [OK] [Agnet: 0/1]: Succeed to start benchmark master
    [OK] [Agnet: 1/1]: Benchmark task is Running
    [OK] Benchmark task is Running
    [OK] Click the link http://127.0.0.1:18947/eas-benchmark/statsview to observe realtime visualization details, you can turn it off with CTRL+C.
    Turning off will not interrupt the benchmark test task, and you can reopen it by the visualize command:
    eascmd -c [config_file] bench visualize benchmark-xgb-test-b514
  • Delete a stress testing task

    After a stress testing task is completed, the stress testing controller retains records of the task based on the final state of the task. The following table describes the retention periods.

    Final state

    Retention period

    Stopped

    48 hours

    CreateFailed, UpdateFailed, Terminated, or Error

    10 minutes

    When the retention period ends, the system automatically deletes the task.

    You can also run the bench delete command to manually delete a stress testing task. Format:

    eascmdwin64.exe bench delete [benchmark_task_name]

    Replace [benchmark_task_name] with the name of the stress testing task that you want to manage.

    Sample output:

    Are you sure to delete the benchmark task [benchmark-***-test-b514] in [cn-shanghai]? [Y/n]
    [OK] Benchmark task benchmark-***-test-b514 is Deleting
    [OK] Benchmark task was deleted successfully

Appendix 1: Stress testing data

Format

The format of service request data varies based on the model and processor of your service:

  • For unstructured request data, such as audio, images, text, directly upload the files that contain the request data.

  • For structured request data, such as data in the TFRequest format, construct request data using the EAS SDK (see Warm up model services), and upload the generated binary data as a file.

File type

Use a supported file type based on your business requirements, such as .txt, .jpg, .bin, and .zip.

Appendix 2: Stress testing configuration examples

When you use the EASCMD client to create stress testing tasks for common services, you can use auto, scan, or manual mode.

You can configure the mode parameter in the optional section of the configuration file to specify a stress testing mode. Configuration examples:

Auto mode

In auto mode, you need to only specify the service name and stress testing data in the configuration file. Retain the default settings of other parameters. Sample configuration:

{
    "service": {
        "serviceName": "demo"
    },
    "data": {
        "path": "https://examplebucket.oss-cn-chengdu.aliyuncs.com/data/warmup.tf.bin"
    },
    "optional": {
        "maxQPS": 1000,
        "duration": 300
     }
}

Scan mode

{
    "service": {
        "serviceName": "demo"
    },
    "data": {
        "content": "aGVsbG8K"
    },
    "optional": {
        "mode": "scan",
        "maxQPS": 1000,
        "minQPS": 500,
        "qpsGrowthDelta": 100,
        "adjustInterval": 30
    }
}

Manual mode

{
    "service": {
        "serviceName": "demo"
    },
    "data": {
        "content": "aGVsbG8K"
    },
    "optional": {
        "mode": "manual",
        "agentCount": 1,
        "concurrency": 5
    }
}

Appendix 3: JSON configuration parameters for stress testing

Item

Parameter

Required

Description

service

serviceName

Yes

The name of the service on which you want to perform stress testing.

data

content

No

A single entry of stress testing data in the Base64-encoded string format.

You can specify the path parameter to configure multiple requests. For information about how to construct stress testing data and supported file types, see Appendix 1: Stress testing data.

path

No

The path where the stress testing data is stored. You can specify multiple HTTP or OSS paths and separate them with commas (,). You can also package multiple stress testing files into a ZIP file and then specify the path of the ZIP file.

Note

You do not need to use base64 to encode the files that store the stress testing data.

multiLine

No

Specifies whether to separate the stress testing data by line. Valid values: true and false. Default value: false. If you set the parameter to true, the downloaded data is processed line by line.

http

headers

No

The HTTP request headers. The value of this parameter is of the LIST type. Example: ["Authorization:aaa", "Content-Type:text"].

timeout

No

The HTTP request latency, in milliseconds. Default value: 20000.

optional

mode

No

The stress testing mode. Valid values:

  • auto: automatic mode. This is the default value.

  • scan: periodic mode.

  • manual: manual mode.

duration

No

The stress testing duration in seconds. Default value: 600. Maximum value: 1200.

agentCount

No

The number of stress testing agents when you set the mode parameter to manual. A larger number of agents results in higher loads. Default value: 1.

concurrency

No

The concurrency of each agent when you set the mode parameter to manual. Default value: 2. A larger value indicates a higher load. If you want to increase the loads of the service, increase the concurrency value first. If the loads stop increasing with the concurrency, try to add more agent workers.

adjustInterval

No

The interval at which the load is automatically increased when you set the mode parameter to scan. Unit: seconds. Default value: 60.

minQPS

No

The initial QPS value when you set the mode parameter to scan (automatic load increase mode). Default value: 100.

maxQPS

No

The maximum QPS value when you set the mode parameter to scan or auto.

maxRT

No

The maximum RT value (TP99) when you set the mode parameter to scan or auto. If the actual RT exceeds the threshold, the QPS is automatically adjusted until the RT drops below the threshold.

qpsGrowthDelta

No

The increment of QPS for each adjustment when you set the mode parameter to scan (automatic load increase mode). Default value: 50.

faultTolerate

No

The maximum tolerance rate for request errors, which occur if the returned status code is not 200. A value of 0.01 specifies that the error handling mechanism is triggered when over 1% of requests encounter an error. Default value: 0.001.

faultAction

No

The operation the stress testing controller performs when you set the mode parameter to scan or auto and the request error rate exceeds the threshold specified by the faultTolerate parameter. Valid values:

  • stop: maintains the current QPS.

  • revise: dynamically adjusts the QPS until the request error rate drops below the threshold. This is the default value.

References

  • You can create and manage stress testing tasks by calling APIs. For more information, see Benchmark Task.

  • After the performance of the service meets your business requirements, you can call the service to run inference tasks. For more information, see Overview.