API Gateway: Capacity specifications

Last Updated: Oct 17, 2025

This topic describes the capacity thresholds and queries per second (QPS) performance data for different cloud-native API gateway instance types. This helps you choose the right instance type.

Capacity thresholds

The following table lists the capacity thresholds for different gateway instance types. You receive full Service-Level Agreement (SLA) coverage when gateway capacity metrics are below the alert level. For core services, keep capacity metrics below the safe level to ensure better stability.

  • Safe level: The gateway can handle a sudden traffic burst of up to twice the normal volume while maintaining high throughput and low latency.

  • Alert level: When the capacity reaches this level, gateway latency may increase. There might be stability risks during traffic bursts.

  • The apigw.dev.x1 instance type is a single-node gateway deployment and does not provide SLA coverage. Use it for testing purposes only. For production services, use gateway instance types that are deployed across multiple nodes.

| Gateway instance type | Client connections (safe level) | Client connections (alert level) | New HTTPS connections per second (safe level) | New HTTPS connections per second (alert level) | CPU utilization (safe level) | CPU utilization (alert level) | Memory usage (safe level) | Memory usage (alert level) |
|---|---|---|---|---|---|---|---|---|
| apigw.dev.x1 | 12,000 | 24,000 | 400 | 800 | 30% | 60% | 75% | 75% |
| apigw.small.x1 | 24,000 | 48,000 | 800 | 1,600 | 30% | 60% | 75% | 75% |
| apigw.small.x2 | 48,000 | 96,000 | 1,600 | 3,200 | 30% | 60% | 75% | 75% |
| apigw.small.x4 | 96,000 | 192,000 | 3,200 | 6,400 | 30% | 60% | 75% | 75% |
| apigw.medium.x1 | 192,000 | 384,000 | 6,400 | 12,800 | 30% | 60% | 75% | 75% |
| apigw.medium.x2 | 384,000 | 768,000 | 12,800 | 25,600 | 30% | 60% | 75% | 75% |
| apigw.medium.x3 | 576,000 | 1,152,000 | 19,200 | 38,400 | 30% | 60% | 75% | 75% |
| apigw.large.x1 | 768,000 | 1,536,000 | 25,600 | 51,200 | 30% | 60% | 75% | 75% |
| apigw.large.x2 | 1,536,000 | 3,072,000 | 51,200 | 102,400 | 30% | 60% | 75% | 75% |
| apigw.large.x3 | 2,304,000 | 4,608,000 | 76,800 | 153,600 | 30% | 60% | 75% | 75% |
| apigw.large.x4 | 3,072,000 | 6,144,000 | 102,400 | 204,800 | 30% | 60% | 75% | 75% |
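As an illustration of how the safe and alert levels can be applied to monitoring data, the following minimal Python sketch classifies a single metric reading. The names `THRESHOLDS` and `classify_metric` are hypothetical and not part of any API Gateway SDK. Only two instance types are copied from the thresholds above, and treating a value equal to a level as having reached that level is an assumption.

```python
# (safe level, alert level) pairs copied from the capacity thresholds above.
# Only two instance types are included here; add the others as needed.
THRESHOLDS = {
    "apigw.small.x1": {
        "client_connections": (24_000, 48_000),
        "new_https_connections_per_second": (800, 1_600),
        "cpu_utilization_percent": (30, 60),
        "memory_usage_percent": (75, 75),
    },
    "apigw.medium.x1": {
        "client_connections": (192_000, 384_000),
        "new_https_connections_per_second": (6_400, 12_800),
        "cpu_utilization_percent": (30, 60),
        "memory_usage_percent": (75, 75),
    },
}


def classify_metric(instance_type: str, metric: str, value: float) -> str:
    """Return "safe", "above_safe", or "alert" for one metric reading.

    Assumption: a value equal to a level counts as having reached it.
    """
    safe, alert = THRESHOLDS[instance_type][metric]
    if value >= alert:
        return "alert"
    if value >= safe:
        return "above_safe"
    return "safe"


if __name__ == "__main__":
    print(classify_metric("apigw.small.x1", "client_connections", 30_000))
    print(classify_metric("apigw.medium.x1", "cpu_utilization_percent", 20))
```

Because the safe and alert levels for memory usage are both 75%, a memory reading never falls in the intermediate "above_safe" band in this sketch.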

QPS performance reference

Gateway QPS throughput is affected by several factors, such as response size and whether HTTPS or gzip is enabled. The following table provides worst-case QPS reference values when the gateway is at 30% CPU utilization.

Note

Creating new HTTPS connections consumes significant CPU resources. For scenarios with a high volume of concurrent HTTPS connections, evaluate the gateway capacity based on the data for short-lived HTTPS connections in the table below.

QPS reference at safe CPU level (30%), by gateway instance type:

| Connection type | Response size (KB) | HTTPS enabled | Gzip enabled | apigw.dev.x1 | apigw.small.x1 | apigw.small.x2 | apigw.small.x4 | apigw.medium.x1 | apigw.medium.x2 | apigw.medium.x3 | apigw.large.x1 | apigw.large.x2 | apigw.large.x3 | apigw.large.x4 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Short-lived connection | 1 | No | No | 1,700 | 3,400 | 6,800 | 13,600 | 28,000 | 56,000 | 84,000 | 112,000 | 224,000 | 336,000 | 448,000 |
| Short-lived connection | 1 | Yes | No | 500 | 1,000 | 2,000 | 4,000 | 8,700 | 17,400 | 26,100 | 34,800 | 69,600 | 104,400 | 139,200 |
| Persistent connection | 1 | No | No | 2,200 | 4,400 | 8,800 | 17,600 | 35,000 | 70,000 | 105,000 | 140,000 | 280,000 | 420,000 | 560,000 |
| Persistent connection | 1 | Yes | No | 2,000 | 4,000 | 8,000 | 16,000 | 32,000 | 64,000 | 96,000 | 128,000 | 256,000 | 384,000 | 512,000 |
| Persistent connection | 1 | Yes | Yes | 1,700 | 3,400 | 6,800 | 13,600 | 28,000 | 56,000 | 84,000 | 112,000 | 224,000 | 336,000 | 448,000 |
| Persistent connection | 10 | No | No | 1,800 | 3,600 | 7,200 | 14,400 | 30,000 | 60,000 | 90,000 | 120,000 | 240,000 | 360,000 | 480,000 |
| Persistent connection | 10 | Yes | No | 1,700 | 3,400 | 6,800 | 13,600 | 28,000 | 56,000 | 84,000 | 112,000 | 224,000 | 336,000 | 448,000 |
| Persistent connection | 10 | Yes | Yes | 1,000 | 2,000 | 4,000 | 8,000 | 16,000 | 32,000 | 48,000 | 64,000 | 128,000 | 192,000 | 256,000 |
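To show how the QPS reference values might be used when choosing an instance type, here is a minimal Python sketch that returns the smallest instance type whose reference QPS covers an expected peak. The names `INSTANCE_TYPES`, `QPS_AT_SAFE_CPU`, and `smallest_instance_type` are hypothetical, and only two scenarios (1 KB responses over HTTPS without gzip, for short-lived and persistent connections) are copied from the reference values above. Skipping apigw.dev.x1 by default reflects the guidance that it is for testing only.

```python
from typing import Optional

# Instance types in ascending order of capacity, as listed in this topic.
INSTANCE_TYPES = [
    "apigw.dev.x1", "apigw.small.x1", "apigw.small.x2", "apigw.small.x4",
    "apigw.medium.x1", "apigw.medium.x2", "apigw.medium.x3",
    "apigw.large.x1", "apigw.large.x2", "apigw.large.x3", "apigw.large.x4",
]

# QPS reference at the safe CPU level (30%), copied for two scenarios:
# 1 KB responses, HTTPS enabled, gzip disabled.
QPS_AT_SAFE_CPU = {
    "short_lived_https_1kb": [
        500, 1_000, 2_000, 4_000, 8_700, 17_400,
        26_100, 34_800, 69_600, 104_400, 139_200,
    ],
    "persistent_https_1kb": [
        2_000, 4_000, 8_000, 16_000, 32_000, 64_000,
        96_000, 128_000, 256_000, 384_000, 512_000,
    ],
}


def smallest_instance_type(
    scenario: str, peak_qps: int, production: bool = True
) -> Optional[str]:
    """Return the smallest instance type whose reference QPS covers peak_qps.

    Returns None if no listed instance type covers the requested QPS.
    """
    for name, qps in zip(INSTANCE_TYPES, QPS_AT_SAFE_CPU[scenario]):
        if production and name == "apigw.dev.x1":
            continue  # single-node deployment without SLA coverage
        if qps >= peak_qps:
            return name
    return None


if __name__ == "__main__":
    print(smallest_instance_type("persistent_https_1kb", 20_000))
    print(smallest_instance_type("short_lived_https_1kb", 20_000))
```

For the same 20,000 QPS peak, the sketch selects a larger instance type for short-lived HTTPS connections than for persistent ones, which mirrors the CPU cost of creating new HTTPS connections.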

Quota limits

Global quotas

Global quotas are independent of the gateway instance type. To request a quota increase, submit a ticket.

| Item | Default quota | Maximum quota |
|---|---|---|
| Instances per region | 50 | 100 |
| Total API operations per region | 10,000 | 20,000 |
| Operations per API | 1,000 | 2,000 |

Instance type quotas

Instance type quotas depend on the gateway instance type. If the quota is still insufficient after you upgrade to a higher instance type, submit a ticket to request a further increase.

| Item | Dev & Small: default quota | Dev & Small: maximum quota | Medium & Large: default quota | Medium & Large: maximum quota |
|---|---|---|---|---|
| Published domain names | 50 | 100 | 200 | 500 |
| Associated services | 100 | 200 | 300 | 500 |
| Total routes | 200 | 500 | 1,000 | 2,000 |
| Total online API operations | 1,000 | 2,000 | 3,000 | 5,000 |
| K8s service sources | 2 | 3 | 3 | 5 |
| Associated environments | 5 | 10 | 15 | 20 |

Resource quotas for Ingress scenarios

Resource quotas for Ingress scenarios depend on the gateway instance type. Do not exceed the resource quotas for the corresponding instance type. Exceeding the quotas can cause stability issues.

Note

Quota limits are tied to the instance type. You can increase the quota only by changing the instance type or adding a new gateway cluster. Scaling out an instance of the same type does not increase the quota.

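| Resource | Dev | Small | Medium | Large |
|---|---|---|---|---|
| Domain names | 500 | 1,000 | 2,500 | 7,500 |
| Services | 1,000 | 2,000 | 4,000 | 10,000 |
| Routes | 1,000 | 2,000 | 4,000 | 10,000 |
| Ingress | 1,500 | 1,000 | 2,500 | 7,500 |
| Endpoints | 2,500 | 5,000 | 10,000 | 25,000 |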
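The Ingress resource quotas above can be checked against a cluster's current resource counts before an instance type is chosen. The following minimal Python sketch does this. The names `INGRESS_QUOTAS` and `exceeded_quotas` are hypothetical, and the resource counts are assumed to be collected separately, for example from a cluster inventory.

```python
from typing import Dict, List

# Resource quotas for Ingress scenarios, copied as listed in this topic.
INGRESS_QUOTAS: Dict[str, Dict[str, int]] = {
    "Dev": {"domain_names": 500, "services": 1_000, "routes": 1_000,
            "ingress": 1_500, "endpoints": 2_500},
    "Small": {"domain_names": 1_000, "services": 2_000, "routes": 2_000,
              "ingress": 1_000, "endpoints": 5_000},
    "Medium": {"domain_names": 2_500, "services": 4_000, "routes": 4_000,
               "ingress": 2_500, "endpoints": 10_000},
    "Large": {"domain_names": 7_500, "services": 10_000, "routes": 10_000,
              "ingress": 7_500, "endpoints": 25_000},
}


def exceeded_quotas(tier: str, usage: Dict[str, int]) -> List[str]:
    """Return the sorted names of resources whose count exceeds the tier quota."""
    quotas = INGRESS_QUOTAS[tier]
    return sorted(name for name, count in usage.items() if count > quotas[name])


if __name__ == "__main__":
    usage = {"routes": 2_500, "services": 1_500, "endpoints": 5_000}
    print(exceeded_quotas("Small", usage))
    print(exceeded_quotas("Medium", usage))
```

A non-empty result indicates that the instance type is too small for the workload. As noted above, scaling out an instance of the same type does not raise these quotas.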