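The number of requests per second that an application can process and the maximum number of pay-as-you-go instances in use at a specific time are two different concepts. The first is a throughput figure; the second also depends on how long each request takes to process.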

You can estimate the number of pay-as-you-go instances required for your environment by using the following formulas:

  • Single concurrency: An instance processes one request at a time.

    Instances in use = requests per second × request processing time (seconds)

    For example, assume that 10,000 requests are processed per second. If the average request processing time is 1 second, 10,000 × 1 = 10,000 pay-as-you-go instances are required. However, if the average request processing time is 10 milliseconds, only 10,000 × 0.01 = 100 pay-as-you-go instances are required.

  • Multiple concurrency: An instance can process multiple requests simultaneously.

    Instances in use = requests per second × request processing time (seconds) / instance concurrency

    For example, assume that 10,000 requests are processed per second and the instance concurrency is 10. If the average request processing time is 1 second, 10,000 × 1 / 10 = 1,000 pay-as-you-go instances are required. However, if the average request processing time is 10 milliseconds, only 10,000 × 0.01 / 10 = 10 pay-as-you-go instances are required.

    For more information, see Configure instance concurrency.
    Note: To increase the upper limit of on-demand instances in a region, Contact Us.
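
The two formulas above can be sketched as a small helper for quick capacity estimates. This is an illustrative sketch only, not part of any SDK: the function name `instances_in_use` and its parameter names are hypothetical, and rounding the result up to a whole instance is an assumption that the formulas themselves do not state.

```python
import math


def instances_in_use(requests_per_second: float,
                     processing_time_seconds: float,
                     instance_concurrency: int = 1) -> int:
    """Estimate the pay-as-you-go instances in use at a given time.

    Hypothetical helper. With instance_concurrency=1 it reduces to the
    single-concurrency formula.
    """
    if requests_per_second < 0 or processing_time_seconds < 0:
        raise ValueError("request rate and processing time must be non-negative")
    if instance_concurrency < 1:
        raise ValueError("instance concurrency must be at least 1")

    estimate = requests_per_second * processing_time_seconds / instance_concurrency
    # Assumption: round up to a whole instance. The inner round() guards
    # against floating-point noise pushing an exact result up by one.
    return math.ceil(round(estimate, 9))


# The worked examples from the text:
print(instances_in_use(10_000, 1))          # single concurrency, 1 s
print(instances_in_use(10_000, 0.01))       # single concurrency, 10 ms
print(instances_in_use(10_000, 1, 10))      # concurrency 10, 1 s
print(instances_in_use(10_000, 0.01, 10))   # concurrency 10, 10 ms
```

The four calls reproduce the figures in the examples above: 10,000, 100, 1,000, and 10 instances.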