This topic describes the background, benefits, scenarios, and impacts of concurrent requests in a single instance.

Background

Function Compute bills you for the total execution duration on instances. Suppose that three requests each spend 10 seconds waiting for a database. If each request is processed in its own instance, the total duration is 30 seconds. If the three requests are processed concurrently in one instance, the total duration is only 10 seconds. Function Compute therefore supports concurrent request processing in a single instance, which saves instance resources. You can set the InstanceConcurrency parameter of a function to specify the maximum number of requests that an instance can process at the same time. The following figure shows the difference between processing one request and processing concurrent requests in an instance.

Assume that three requests need to be processed concurrently. When the instance concurrency is set to 1, Function Compute must create three instances, and each instance processes one request. When the instance concurrency is set to 10 (one instance can process up to 10 requests concurrently), Function Compute needs to create only one instance to process the three requests.

Note: By default, the value of InstanceConcurrency is 1, which means that an instance can process only one request at a time. After you set InstanceConcurrency to a value greater than 1, Function Compute creates a new instance only after the concurrency of the existing instances is fully used.
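
The arithmetic in the background example can be written out as a short Node.js sketch. The function name `estimate` and its parameters are hypothetical, and the sketch assumes that all requests arrive at the same time and take the same amount of time.

```javascript
// Estimate how many instances are needed, and the total duration across them,
// when `requests` identical requests arrive at the same time.
// Assumption: every request takes `latencySeconds` and all requests overlap fully.
function estimate(requests, latencySeconds, instanceConcurrency) {
    const instances = Math.ceil(requests / instanceConcurrency);
    // Each instance is busy for one latency period, however many requests it holds.
    const totalDurationSeconds = instances * latencySeconds;
    return { instances, totalDurationSeconds };
}

console.log(estimate(3, 10, 1));  // { instances: 3, totalDurationSeconds: 30 }
console.log(estimate(3, 10, 10)); // { instances: 1, totalDurationSeconds: 10 }
```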

Benefits

  • Reduces the total duration and saves costs

    Functions that spend most of their time on input/output (I/O) operations can process multiple requests in one instance, which reduces the total duration that would otherwise be spread across multiple instances.

  • Provides shared state for requests

    Multiple requests in one instance can share a database connection pool, which reduces the number of connections to the database.

  • Reduces the frequency of cold starts

    Fewer instances need to be created because multiple requests can be processed in one instance, which reduces the frequency of cold starts.

  • Reduces the number of IP addresses used in a virtual private cloud (VPC)

    For a fixed number of requests to be processed, the number of occupied instances is reduced when each instance can handle multiple requests. Therefore, the number of IP addresses used in the VPC can be reduced.

Scenarios

This feature is not applicable to all functions. The following table describes the scenarios in which it is and is not applicable.

| Scenario | Applicable | Reason |
| --- | --- | --- |
| Requests wait for responses from a downstream service for an extended period of time | Yes | Resources are generally not consumed while requests wait for responses. Multiple requests can be processed in a single instance to save costs. |
| Requests use shared state that cannot be concurrently accessed | No | If multiple requests are processed concurrently and change shared state such as global variables, errors may occur. |
| A request consumes a large amount of CPU and memory resources | No | Multiple requests compete for resources, which leads to insufficient memory or longer latency. |

Impacts

After you set InstanceConcurrency to a value greater than 1, the behavior of your function differs from the behavior when the value is 1 in the following aspects:

  • Billing
    • Single request processing in a single instance
      A function instance can process only one request at a time. The billing duration starts when the first request starts and ends when the last request is complete.
    • Concurrent request processing in a single instance

      When an instance processes multiple requests concurrently, Function Compute calculates charges based on the duration on an instance. This duration begins when the first request starts and ends when the last request is complete.


    For more information, see Billing.
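
The billing rule for concurrent processing can be illustrated with a short Node.js sketch. The function name `billedDuration` and the `{ start, end }` request shape are hypothetical, and the sketch assumes that the requests overlap so that the instance is never idle between them.

```javascript
// Billed duration on one instance, per the rule above: from the start of the
// first request to the completion of the last one.
// Assumption: requests are given as { start, end } in seconds and overlap
// enough that the instance is never idle in between.
function billedDuration(requests) {
    const firstStart = Math.min(...requests.map((r) => r.start));
    const lastEnd = Math.max(...requests.map((r) => r.end));
    return lastEnd - firstStart;
}

const sampleRequests = [
    { start: 0, end: 10 },
    { start: 2, end: 12 },
    { start: 5, end: 15 },
];

// Sum of the individual request durations, for comparison with the case
// in which each request runs in its own instance.
const summedDuration = sampleRequests.reduce((total, r) => total + (r.end - r.start), 0);

console.log(billedDuration(sampleRequests)); // 15
console.log(summedDuration);                 // 30
```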

  • Concurrent request limit

    Function Compute allows a maximum of 300 pay-as-you-go instances in a region by default. The maximum number of requests that can be concurrently processed in a region is 300 × InstanceConcurrency. For example, if InstanceConcurrency is set to 10, a maximum of 3,000 requests can be concurrently processed in a region. If the number of concurrent requests exceeds the maximum number of requests that Function Compute can process, the ResourceExhausted error occurs.

    Note: To increase the number of pay-as-you-go instances in a region, contact us.
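
The regional ceiling described above is a simple product, sketched here in Node.js. The helper names are hypothetical. The value 300 is the default instance limit quoted above and will differ if your quota has been raised.

```javascript
// Regional ceiling on concurrent requests: instance limit × InstanceConcurrency.
// 300 is the default pay-as-you-go instance limit quoted above.
const DEFAULT_INSTANCE_LIMIT = 300;

function maxConcurrentRequests(instanceConcurrency, instanceLimit = DEFAULT_INSTANCE_LIMIT) {
    return instanceLimit * instanceConcurrency;
}

// Hypothetical helper: would this many in-flight requests exceed the ceiling?
function wouldExhaust(inFlight, instanceConcurrency) {
    return inFlight > maxConcurrentRequests(instanceConcurrency);
}

console.log(maxConcurrentRequests(1));  // 300
console.log(maxConcurrentRequests(10)); // 3000
console.log(wouldExhaust(3001, 10));    // true
```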
  • Logs
    • When an instance processes one request, if you specify X-Fc-Log-Type: Tail in the HTTP header, Function Compute returns the function logs in the X-Fc-Log-Result field that is in the response header. When an instance processes multiple requests concurrently, the response header does not include function logs because the logs of a specific request cannot be obtained among concurrent requests.
    • For the Node.js runtime, console.info() writes the ID of the current request to the log. When an instance processes multiple requests concurrently, console.info() cannot attribute every log entry to the correct request, and entries may carry the ID of a different request. In the following sample log, both logger end entries carry the ID req2, although one of them belongs to req1:
      2019-11-06T14:23:37.587Z req1 [info] logger begin
      2019-11-06T14:23:37.587Z req1 [info] ctxlogger begin
      2019-11-06T14:23:37.587Z req2 [info] logger begin
      2019-11-06T14:23:37.587Z req2 [info] ctxlogger begin
      2019-11-06T14:23:40.587Z req1 [info] ctxlogger end
      2019-11-06T14:23:40.587Z req2 [info] ctxlogger end
      2019-11-06T14:23:40.587Z req2 [info] logger end
      2019-11-06T14:23:40.587Z req2 [info] logger end
      Therefore, use context.logger.info() to write logs. This ensures that each log entry carries the correct request ID. The following sample code calls both functions:
      exports.handler = (event, context, callback) => {
          console.info('logger begin');
          context.logger.info('ctxlogger begin');
      
          setTimeout(function() {
              context.logger.info('ctxlogger end');
              console.info('logger end');
              callback(null, 'hello world');
          }, 3000);
      };                   
  • Error handling

    When an instance processes multiple requests concurrently, a failed request that causes the process to exit unexpectedly also affects the other requests that are in progress on the instance. Therefore, you must write error-handling logic so that one failed request does not affect the others. The following example shows how to handle exceptions in Node.js:

    exports.handler = (event, context, callback) => {
        try {
            JSON.parse(event);
        } catch (ex) {
            // Report the failure and stop, so that callback is not invoked a second time.
            callback(ex);
            return;
        }

        callback(null, 'hello world');
    };
  • Shared variables

    When an instance processes multiple requests concurrently, errors may occur if multiple requests modify the same variable at the same time. When you write your function, use mutual exclusion to protect modifications that are not thread-safe. The following example shows sample Java code:

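    import com.aliyun.fc.runtime.Context;
    import com.aliyun.fc.runtime.StreamRequestHandler;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.nio.charset.StandardCharsets;

    public class App implements StreamRequestHandler
    {
        private static int counter = 0;

        @Override
        public void handleRequest(InputStream inputStream, OutputStream outputStream, Context context) throws IOException {
            // counter is static, so lock on the class rather than on this instance.
            synchronized (App.class) {
                counter = counter + 1;
            }
            outputStream.write("hello world".getBytes(StandardCharsets.UTF_8));
        }
    }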
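
Node.js runs handler code on a single thread, but a read-modify-write sequence that spans an `await` can still interleave with other requests. The following sketch shows the lost update and one way to serialize access with a promise chain. The `Mutex` class, `pause`, and the increment functions are hypothetical names written for this illustration, not a library API.

```javascript
// Minimal promise-based mutex (a sketch, not a library API).
class Mutex {
    constructor() {
        this.tail = Promise.resolve();
    }
    // Runs `task` after every previously queued task has settled.
    runExclusive(task) {
        const run = this.tail.then(() => task());
        // Keep the chain usable even if `task` rejects.
        this.tail = run.catch(() => {});
        return run;
    }
}

let counter = 0;
const counterLock = new Mutex();

// Hypothetical slow step: the await lets other requests interleave.
const pause = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function unsafeIncrement() {
    const current = counter;
    await pause(10);
    counter = current + 1; // may overwrite another request's update
}

function safeIncrement() {
    return counterLock.runExclusive(unsafeIncrement);
}

async function demo() {
    counter = 0;
    await Promise.all([unsafeIncrement(), unsafeIncrement(), unsafeIncrement()]);
    const unsafeResult = counter;
    counter = 0;
    await Promise.all([safeIncrement(), safeIncrement(), safeIncrement()]);
    return { unsafeResult, safeResult: counter };
}

const demoResult = demo();
demoResult.then((result) => console.log(result)); // { unsafeResult: 1, safeResult: 3 }
```

Without the lock, all three calls read 0 before any of them writes, so two updates are lost. With the lock, each increment starts only after the previous one has finished.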
  • Monitoring metrics
    After you specify the instance concurrency for your function, the instance monitoring chart shows that the number of used instances decreases.

Limits

| Item | Description |
| --- | --- |
| Supported runtimes | Node.js Runtime, Java Runtime, Custom Runtime |
| Valid values of instance concurrency | 1 to 100 |
| Function logs returned in the X-Fc-Log-Result field in the response header | Not supported when InstanceConcurrency is set to a value greater than 1 |

References

Specify the request concurrency in a single instance