API Gateway: Billing for auto scaling

Last Updated: Apr 21, 2025

This topic describes the billable items for Cloud-native API Gateway instances during automatic scale-outs, so that you can understand and estimate the fees incurred when you use the auto scaling feature.

Capacity water levels

Cloud-native API Gateway supports auto scaling. You can configure a period for auto scaling to better cope with traffic surges. Scale-outs are performed automatically, one or more units at a time. The capacity of a unit is described by two kinds of thresholds:

  • Secure thresholds: At or below the secure thresholds, your gateway instance maintains high throughput and low latency even if traffic doubles.

  • Warning thresholds: Once your instance reaches a warning threshold, latency increases and the instance is at risk of instability during traffic surges.

Scale-out unit: 1 unit

| Metric | Secure threshold | Warning threshold |
|---|---|---|
| Client connections | 12,000 | 24,000 |
| New HTTPS connections per second | 400 | 800 |
| CPU utilization | 30% | 60% |
| Memory usage | 75% | 75% |
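As an illustration of how these thresholds might be used in monitoring, the following minimal sketch compares a metric reading against the per-unit values listed above. The names `THRESHOLDS` and `classify`, the label `between`, and the assumption that connection thresholds scale linearly with the number of units (while percentage thresholds do not) are hypothetical and are not part of the product.

```python
# Per-unit thresholds copied from the capacity table: (secure, warning).
THRESHOLDS = {
    "client_connections": (12_000, 24_000),
    "new_https_connections_per_second": (400, 800),
    "cpu_utilization_percent": (30, 60),
    "memory_usage_percent": (75, 75),
}


def classify(metric: str, value: float, units: int = 1) -> str:
    """Return 'secure', 'between', or 'warning' for one metric reading."""
    secure, warning = THRESHOLDS[metric]
    # Assumption: absolute counts scale with the unit count; percentages do not.
    if not metric.endswith("_percent"):
        secure, warning = secure * units, warning * units
    if value >= warning:
        return "warning"
    if value > secure:
        return "between"  # above the secure threshold, below the warning one
    return "secure"


print(classify("client_connections", 30_000, units=2))
print(classify("cpu_utilization_percent", 60))
```

Checking the warning threshold first means that a metric whose two thresholds are equal, such as memory usage here, is reported as `warning` as soon as it reaches that value.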

QPS reference

The queries per second (QPS) throughput of a gateway instance depends on several factors, such as the response size, whether HTTPS is enabled, and whether GZIP compression is used. The following table lists worst-case QPS values for one unit under different conditions, measured at 30% CPU utilization:

Note

Establishing new HTTPS connections consumes a large amount of CPU. If your workload needs to establish many HTTPS connections at the same time, estimate your instance capacity requirements from the short-lived connection data in the following table:

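Scale-out specification: 1 unit

| Connection type | Response size (KB) | HTTPS | GZIP | QPS reference at a CPU utilization of 30% |
|---|---|---|---|---|
| Short-lived connection | 1 | No | No | 1,700 |
| Short-lived connection | 1 | Yes | No | 500 |
| Long-lived connection | 1 | No | No | 2,200 |
| Long-lived connection | 1 | Yes | No | 2,000 |
| Long-lived connection | 1 | Yes | Yes | 1,700 |
| Long-lived connection | 10 | No | No | 1,800 |
| Long-lived connection | 10 | Yes | No | 1,700 |
| Long-lived connection | 10 | Yes | Yes | 1,000 |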
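The QPS reference values can be turned into a rough capacity estimate. The sketch below looks up the per-unit QPS for a traffic profile and rounds the required unit count up. The names `QPS_PER_UNIT` and `units_needed` are hypothetical, and the calculation assumes that throughput scales linearly with the number of units, which the table itself does not state.

```python
import math

# Per-unit QPS at 30% CPU utilization, copied from the QPS reference table.
# Key: (connection type, response size in KB, HTTPS enabled, GZIP enabled).
QPS_PER_UNIT = {
    ("short", 1, False, False): 1700,
    ("short", 1, True, False): 500,
    ("long", 1, False, False): 2200,
    ("long", 1, True, False): 2000,
    ("long", 1, True, True): 1700,
    ("long", 10, False, False): 1800,
    ("long", 10, True, False): 1700,
    ("long", 10, True, True): 1000,
}


def units_needed(target_qps: float, connection: str, response_kb: int,
                 https: bool, gzip: bool) -> int:
    """Estimate the units required to serve target_qps at about 30% CPU."""
    per_unit = QPS_PER_UNIT[(connection, response_kb, https, gzip)]
    # Assumption: capacity grows linearly with the unit count.
    return math.ceil(target_qps / per_unit)


# 5,000 QPS of short-lived HTTPS requests with 1 KB responses.
print(units_needed(5000, "short", 1, True, False))
```

Profiles that are not in the table raise a `KeyError` rather than being interpolated, because the source provides no data for them.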

Scale-outs of different instance specifications

| Specification | Existing units | Scale-out step size | Scale-out range |
|---|---|---|---|
| apigw.dev.x1 | 1 | N/A (This specification does not support auto scale-out.) | N/A |
| apigw.small.x1 | 2 | 1 unit | 3-4 units |
| apigw.small.x2 | 4 | 1 unit | 5-8 units |
| apigw.small.x4 | 8 | 1 unit | 9-16 units |
| apigw.medium.x1 | 16 | 4 units | 20-32 units |
| apigw.medium.x2 | 32 | 4 units | 36-64 units |
| apigw.medium.x3 | 48 | 4 units | 52-96 units |
| apigw.large.x1 | 64 | 4 units | 68-128 units |
| apigw.large.x2 | 128 | 4 units | 132-256 units |
| apigw.large.x3 | 192 | 4 units | 196-384 units |
| apigw.large.x4 | 256 | 4 units | 260-512 units |
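To make the relationship between step size and range concrete, the following sketch enumerates the total unit counts that a specification could reach. `SPECS` and `scale_out_sizes` are hypothetical names. The sketch assumes that the total grows only in whole steps from the existing unit count up to the top of the listed range. The table gives the step size and the range but does not spell out this rule.

```python
# (existing units, step size, maximum total units) copied from the table.
# A step of None marks a specification without auto scale-out support.
SPECS = {
    "apigw.dev.x1": (1, None, None),
    "apigw.small.x1": (2, 1, 4),
    "apigw.small.x2": (4, 1, 8),
    "apigw.small.x4": (8, 1, 16),
    "apigw.medium.x1": (16, 4, 32),
    "apigw.medium.x2": (32, 4, 64),
    "apigw.medium.x3": (48, 4, 96),
    "apigw.large.x1": (64, 4, 128),
    "apigw.large.x2": (128, 4, 256),
    "apigw.large.x3": (192, 4, 384),
    "apigw.large.x4": (256, 4, 512),
}


def scale_out_sizes(spec: str) -> list[int]:
    """List the total unit counts reachable by auto scale-out for a spec."""
    existing, step, maximum = SPECS[spec]
    if step is None:
        return []  # auto scale-out is not supported
    # Assumption: the instance grows one whole step at a time.
    return list(range(existing + step, maximum + 1, step))


print(scale_out_sizes("apigw.medium.x1"))
```

In every row that supports scaling, the top of the range equals twice the existing unit count, so an instance can at most double its capacity.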

Billing description

  • Elastic nodes added by automatic scale-outs are billed separately on a pay-as-you-go basis. Their fees are listed separately in the billing details.

    | Region | Scale-out unit | Pay-as-you-go unit price (USD/hour) |
    |---|---|---|
    | Chinese mainland | Unit | 0.146 |
    | China (Hong Kong) and Japan (Tokyo) | Unit | 0.218 |
    | Singapore, Indonesia (Jakarta), and Germany (Frankfurt) | Unit | 0.202 |
    | US (Virginia) and US (Silicon Valley) | Unit | 0.174 |

  • Scale-out fee = Number of scaled-out units x Unit price x Duration in hours. For example, if your instance is scaled out by 2 units between 8:00 and 10:00 in a Chinese mainland region, the scale-out fee is 2 x 0.146 x 2 = USD 0.584.
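The fee formula can be checked with a short script. This is a minimal sketch, not an official calculator: the names `UNIT_PRICE_USD_PER_HOUR` and `scale_out_fee` are hypothetical, the region keys are labels chosen for this example, and `Decimal` is used only to avoid floating-point rounding in the result.

```python
from decimal import Decimal

# Pay-as-you-go price per scaled-out unit per hour (USD), from the price table.
UNIT_PRICE_USD_PER_HOUR = {
    "Chinese mainland": Decimal("0.146"),
    "China (Hong Kong)": Decimal("0.218"),
    "Japan (Tokyo)": Decimal("0.218"),
    "Singapore": Decimal("0.202"),
    "Indonesia (Jakarta)": Decimal("0.202"),
    "Germany (Frankfurt)": Decimal("0.202"),
    "US (Virginia)": Decimal("0.174"),
    "US (Silicon Valley)": Decimal("0.174"),
}


def scale_out_fee(region: str, scaled_out_units: int, hours: float) -> Decimal:
    """Scale-out fee = scaled-out units x unit price x duration in hours."""
    price = UNIT_PRICE_USD_PER_HOUR[region]
    return price * scaled_out_units * Decimal(str(hours))


# The example from the text: 2 units for 2 hours in a Chinese mainland region.
print(scale_out_fee("Chinese mainland", 2, 2))  # 0.584
```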