This topic describes the background information, scenarios, benefits, and limits of instance concurrency, and explains how to configure instance concurrency in the Function Compute console.
Background
Function Compute bills you based on the duration for which instances execute requests. For example, if database access latency is 10 seconds and three requests are processed by three separate instances, the total execution duration across the three instances is 30 seconds. If one instance processes the three requests concurrently, the total execution duration is only 10 seconds. To help you reduce costs, Function Compute allows a single instance to process multiple requests concurrently. You can use the InstanceConcurrency parameter to specify the number of requests that an instance can process at the same time. The following figure shows the difference between processing requests concurrently on a single instance and across multiple instances.

- If InstanceConcurrency is set to 1, each instance processes one request at a time. Function Compute must create three instances to process the three requests.
- If InstanceConcurrency is set to 10, each instance can process up to 10 requests at a time. Function Compute needs to create only one instance to process the three requests.
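The relationship above can be sketched as a quick calculation. The helper below is hypothetical, written only to illustrate the arithmetic; it is not part of any Function Compute SDK:

```javascript
// Hypothetical helper: how many instances are needed to serve a burst of
// concurrent requests at a given InstanceConcurrency setting.
function instancesNeeded(concurrentRequests, instanceConcurrency) {
  return Math.ceil(concurrentRequests / instanceConcurrency);
}

console.log(instancesNeeded(3, 1));  // 3 instances, one request each
console.log(instancesNeeded(3, 10)); // 1 instance handles all three
```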
Scenarios
If a function spends most of its execution time waiting for responses from downstream services, we recommend that you use a single instance to process multiple requests concurrently. While requests wait for responses, the instance consumes few resources, so processing multiple requests concurrently on one instance reduces costs.
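The effect can be seen in a small simulation. The timings below are illustrative only: 100 ms stands in for the downstream latency, and the handler is a sketch, not a real Function Compute entry point:

```javascript
// Sketch: three requests that each spend ~100 ms waiting on a downstream
// service. Handled concurrently by one instance, they finish in roughly
// one latency period (~100 ms) instead of three (~300 ms).
const downstreamCall = () => new Promise(resolve => setTimeout(resolve, 100));

async function handleRequest() {
  await downstreamCall(); // the instance is mostly idle here
  return 'done';
}

async function main() {
  const start = Date.now();
  await Promise.all([handleRequest(), handleRequest(), handleRequest()]);
  return Date.now() - start; // ~100 ms, not ~300 ms
}

const elapsedPromise = main();
elapsedPromise.then(ms => console.log(`finished in ~${ms} ms`));
```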
Benefits
- Shorter execution duration and lower costs
For example, for functions that perform many input/output (I/O) operations, you can use a single instance to process multiple requests concurrently. This reduces the number of instances needed to process requests and therefore the total billed execution duration.
- State shared among requests
Requests processed by the same instance can share state, such as a database connection pool, which minimizes the number of connections established between requests and the database.
- Lower frequency of cold starts
Fewer instances need to be created because one instance can process multiple requests. This reduces the frequency of cold starts.
- Fewer IP addresses used in a VPC
For a fixed number of requests, fewer instances are required when each instance can process multiple requests. This reduces the number of IP addresses occupied in the VPC.
Important Make sure that the vSwitch associated with your VPC has at least two available IP addresses. Otherwise, the service may be unavailable, leading to request errors.
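The shared-state benefit above can be sketched as follows. Here `createPool` is a stand-in for a real database client and `handler` mirrors a function entry point; neither is a Function Compute API:

```javascript
// A pool created at module scope is initialized once per instance and
// shared by every request that the instance processes concurrently.
let poolsCreated = 0;

function createPool() {
  poolsCreated += 1; // counts how many pools this instance has opened
  return { query: async (sql) => `result of ${sql}` };
}

const pool = createPool(); // runs once, when the instance starts

const handler = async (event) => {
  // Concurrent requests all reuse the same pool instead of each
  // opening its own database connection.
  return pool.query('SELECT 1');
};
```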
Impacts
This section describes the differences between the scenarios in which a single instance processes a single request at a time (InstanceConcurrency = 1) and the scenarios in which a single instance can process multiple requests at a time (InstanceConcurrency > 1):
Billing
- A single instance processes a single request at a time
You are charged based on the execution duration of each individual request.
- A single instance concurrently processes multiple requests
You are charged based on the execution duration of the instance. The billing duration starts when the first request starts to be processed and ends when the last request is processed.
For more information, see Billing overview.
Concurrency throttling
By default, Function Compute supports a maximum of 300 on-demand instances in a region. The maximum number of requests that can be concurrently processed in a region is calculated based on the following formula: 300 × Value of InstanceConcurrency. For example, if you set InstanceConcurrency to 10, a maximum of 3,000 requests can be concurrently processed in a region. If the number of concurrent requests exceeds the maximum number of requests that Function Compute can process, the ResourceExhausted error is returned.
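The formula above can be expressed directly. The 300-instance default and the 1 to 200 range come from this topic; the helper function itself is hypothetical:

```javascript
// Region-wide request concurrency ceiling:
// default on-demand instance limit × InstanceConcurrency.
const MAX_ON_DEMAND_INSTANCES = 300; // default per-region limit

function maxConcurrentRequests(instanceConcurrency) {
  if (instanceConcurrency < 1 || instanceConcurrency > 200) {
    throw new RangeError('InstanceConcurrency must be in the range 1 to 200');
  }
  return MAX_ON_DEMAND_INSTANCES * instanceConcurrency;
}

console.log(maxConcurrentRequests(10)); // 3000 concurrent requests
```

Requests beyond this ceiling fail with the ResourceExhausted error described above.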
Logs
- For a single instance that processes a single request at a time, if you specify X-Fc-Log-Type: Tail in the HTTP header when you invoke a function, Function Compute returns the function logs in the X-Fc-Log-Result field of the response header. For a single instance that concurrently processes multiple requests, the response header does not include function logs, because the logs of a specific request cannot be separated from those of other concurrent requests.
- For the Node.js runtime, the console.info() function writes the ID of the current request to the log. If an instance concurrently processes multiple requests, console.info() cannot attach the correct ID to every log line. In the following sample log, both "logger end" lines are recorded with the ID req2, even though one of them belongs to req1:
2019-11-06T14:23:37.587Z req1 [info] logger begin
2019-11-06T14:23:37.587Z req1 [info] ctxlogger begin
2019-11-06T14:23:37.587Z req2 [info] logger begin
2019-11-06T14:23:37.587Z req2 [info] ctxlogger begin
2019-11-06T14:23:40.587Z req1 [info] ctxlogger end
2019-11-06T14:23:40.587Z req2 [info] ctxlogger end
2019-11-06T14:23:40.587Z req2 [info] logger end
2019-11-06T14:23:40.587Z req2 [info] logger end
In this case, use the context.logger.info() function to write logs. This ensures that the correct request ID is attached to each log line. The following sample code shows an example:
exports.handler = (event, context, callback) => {
    console.info('logger begin');
    context.logger.info('ctxlogger begin');
    setTimeout(function() {
        context.logger.info('ctxlogger end');
        console.info('logger end');
        callback(null, 'hello world');
    }, 3000);
};
Error handling
When an instance concurrently processes multiple requests, an unexpected process exit caused by one failed request also affects the other requests being processed. Therefore, write logic in your function code to catch request-level exceptions so that a failure in one request does not affect the others. The following example shows sample Node.js code:
exports.handler = (event, context, callback) => {
    try {
        JSON.parse(event);
    } catch (ex) {
        // Catch the request-level exception and return it through the
        // callback so that the process keeps serving the other requests.
        return callback(ex);
    }
    callback(null, 'hello world');
};
Shared variables
When an instance concurrently processes multiple requests, errors may occur if multiple requests modify a shared variable at the same time. Use mutual exclusion, such as locks, to make modifications to shared variables thread-safe in your function code. The following example shows sample Java code:
public class App implements StreamRequestHandler {
    private static int counter = 0;

    @Override
    public void handleRequest(InputStream inputStream, OutputStream outputStream, Context context) throws IOException {
        // Synchronize on the class object because counter is static,
        // so concurrent requests cannot interleave the read and the write.
        synchronized (App.class) {
            counter = counter + 1;
        }
        outputStream.write("hello world".getBytes());
    }
}
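A related pitfall exists in the Node.js runtime. The event loop is single-threaded, so a synchronous update to a shared variable is safe, but a read-modify-write that spans an await can interleave with other concurrent requests and lose updates. The counters below are a hypothetical sketch, not Function Compute APIs:

```javascript
let counter = 0;

// Unsafe: the value read before the await is stale by the time it is
// written back, so concurrent increments can be lost.
async function unsafeIncrement() {
  const current = counter;
  await new Promise(resolve => setImmediate(resolve)); // simulated async I/O
  counter = current + 1;
}

// Safe: the update happens in one synchronous step before awaiting.
async function safeIncrement() {
  counter += 1;
  await new Promise(resolve => setImmediate(resolve));
}

async function demo() {
  counter = 0;
  await Promise.all([unsafeIncrement(), unsafeIncrement(), unsafeIncrement()]);
  const lost = counter; // 1: two increments were lost

  counter = 0;
  await Promise.all([safeIncrement(), safeIncrement(), safeIncrement()]);
  const safe = counter; // 3: all increments kept
  return [lost, safe];
}

const demoPromise = demo();
demoPromise.then(([lost, safe]) => console.log(lost, safe)); // 1 3
```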
Monitoring metrics

Limits
| Item | Limit |
| --- | --- |
| Supported runtime environments | |
| Number of requests that can be concurrently processed by a single instance | 1 to 200 |
| Function execution logs provided in the X-Fc-Log-Result field in the response header | Not supported if the InstanceConcurrency parameter is set to a value greater than 1 |
Configure instance concurrency for a function
You can configure InstanceConcurrency when you create or update a function. For more information, see Manage functions.

If you use provisioned instances, functions running on them can also process multiple requests concurrently. For more information, see Configure provisioned instances and auto scaling rules.
References
For more information about how to use the SDK for Node.js to configure instance concurrency, see Specify the instance concurrency.