API Gateway: Capacity planning

Last Updated: Dec 11, 2025

This topic provides capacity thresholds and queries per second (QPS) performance references for different instance types of Cloud-native API Gateway. This information helps you select the right instance type for your needs.

Capacity thresholds

The following table lists the capacity thresholds for different gateway instance types. You receive full Service-Level Agreement (SLA) coverage when gateway capacity metrics are below the alert level. For core services, keep gateway capacity metrics below the safe level to ensure greater stability.

  • Safe level: The gateway system can maintain high throughput and low latency even if burst traffic doubles.

  • Alert level: When the gateway operates above this level, latency may increase and stability is at risk during traffic bursts.

  • The apigw.dev.x1 instance type is a single-node deployment and does not include an SLA. Use it for testing purposes only. For production services, ensure you use gateway instance types with multiple nodes.

  • The SLA does not cover request failures caused by CPU or memory usage exceeding the alert level. The gateway generates alerts when CPU or memory usage reaches the alert level. Monitor the gateway's load and respond to alerts promptly.

| Gateway instance type | Client connections (safe level) | Client connections (alert level) | New HTTPS connections per second (safe level) | New HTTPS connections per second (alert level) | CPU usage (safe level) | CPU usage (alert level) | Memory usage (safe level) | Memory usage (alert level) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| apigw.dev.x1 | 12,000 | 24,000 | 400 | 800 | 30% | 60% | 75% | 75% |
| apigw.small.x1 | 24,000 | 48,000 | 800 | 1,600 | 30% | 60% | 75% | 75% |
| apigw.small.x2 | 48,000 | 96,000 | 1,600 | 3,200 | 30% | 60% | 75% | 75% |
| apigw.small.x4 | 96,000 | 192,000 | 3,200 | 6,400 | 30% | 60% | 75% | 75% |
| apigw.medium.x1 | 192,000 | 384,000 | 6,400 | 12,800 | 30% | 60% | 75% | 75% |
| apigw.medium.x2 | 384,000 | 768,000 | 12,800 | 25,600 | 30% | 60% | 75% | 75% |
| apigw.medium.x3 | 576,000 | 1,152,000 | 19,200 | 38,400 | 30% | 60% | 75% | 75% |
| apigw.large.x1 | 768,000 | 1,536,000 | 25,600 | 51,200 | 30% | 60% | 75% | 75% |
| apigw.large.x2 | 1,536,000 | 3,072,000 | 51,200 | 102,400 | 30% | 60% | 75% | 75% |
| apigw.large.x3 | 2,304,000 | 4,608,000 | 76,800 | 153,600 | 30% | 60% | 75% | 75% |
| apigw.large.x4 | 3,072,000 | 6,144,000 | 102,400 | 204,800 | 30% | 60% | 75% | 75% |
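
As an illustration of how the safe and alert levels might be applied in monitoring, the following minimal Python sketch classifies observed metrics for one instance type. The threshold values are copied from the table above for apigw.small.x1. The names `Threshold`, `classify`, and `assess`, the three status labels, and the handling of values that sit exactly on a boundary are assumptions made for this example and are not part of the gateway product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    safe: float
    alert: float


# Thresholds for apigw.small.x1, copied from the capacity table.
SMALL_X1 = {
    "client_connections": Threshold(24_000, 48_000),
    "new_https_connections_per_second": Threshold(800, 1_600),
    "cpu_usage_percent": Threshold(30, 60),
    "memory_usage_percent": Threshold(75, 75),
}


def classify(value, threshold):
    """Classify one metric value (labels are hypothetical).

    Assumption: reaching the alert level counts as "alert", and only
    values strictly below the safe level count as "safe".
    """
    if value >= threshold.alert:
        return "alert"
    if value < threshold.safe:
        return "safe"
    return "elevated"


def assess(metrics, thresholds):
    """Classify every observed metric that has a known threshold."""
    return {
        name: classify(value, thresholds[name])
        for name, value in metrics.items()
        if name in thresholds
    }


if __name__ == "__main__":
    observed = {"client_connections": 10_000, "cpu_usage_percent": 45}
    print(assess(observed, SMALL_X1))
```

In this sketch, a metric between the two levels is reported as "elevated", which corresponds to the range where the SLA still applies but the recommended headroom for core services is no longer available.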

QPS performance reference

Gateway QPS depends on several factors, such as the response size and whether HTTPS or gzip is enabled. The following table provides conservative (worst-case) QPS reference values for a gateway running at the safe CPU level of 30%.

Note

Creating new HTTPS connections consumes significant CPU resources. For services with a high number of concurrent, short-lived HTTPS connections, evaluate gateway capacity based on the HTTPS short-lived connection data in the table below.

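QPS reference at safe CPU level (30%), by gateway instance type:

| Connection type | Response size (KB) | HTTPS enabled | gzip enabled | apigw.dev.x1 | apigw.small.x1 | apigw.small.x2 | apigw.small.x4 | apigw.medium.x1 | apigw.medium.x2 | apigw.medium.x3 | apigw.large.x1 | apigw.large.x2 | apigw.large.x3 | apigw.large.x4 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Short-lived connection | 1 | No | No | 1,700 | 3,400 | 6,800 | 13,600 | 28,000 | 56,000 | 84,000 | 112,000 | 224,000 | 336,000 | 448,000 |
| Short-lived connection | 1 | Yes | No | 500 | 1,000 | 2,000 | 4,000 | 8,700 | 17,400 | 26,100 | 34,800 | 69,600 | 104,400 | 139,200 |
| Persistent connection | 1 | No | No | 2,200 | 4,400 | 8,800 | 17,600 | 35,000 | 70,000 | 105,000 | 140,000 | 280,000 | 420,000 | 560,000 |
| Persistent connection | 1 | Yes | No | 2,000 | 4,000 | 8,000 | 16,000 | 32,000 | 64,000 | 96,000 | 128,000 | 256,000 | 384,000 | 512,000 |
| Persistent connection | 1 | Yes | Yes | 1,700 | 3,400 | 6,800 | 13,600 | 28,000 | 56,000 | 84,000 | 112,000 | 224,000 | 336,000 | 448,000 |
| Persistent connection | 10 | No | No | 1,800 | 3,600 | 7,200 | 14,400 | 30,000 | 60,000 | 90,000 | 120,000 | 240,000 | 360,000 | 480,000 |
| Persistent connection | 10 | Yes | No | 1,700 | 3,400 | 6,800 | 13,600 | 28,000 | 56,000 | 84,000 | 112,000 | 224,000 | 336,000 | 448,000 |
| Persistent connection | 10 | Yes | Yes | 1,000 | 2,000 | 4,000 | 8,000 | 16,000 | 32,000 | 48,000 | 64,000 | 128,000 | 192,000 | 256,000 |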
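
To show how the reference values might be used for sizing, the following Python sketch looks up the smallest instance type whose reference QPS at 30% CPU covers a target load. Only four of the table's scenarios are included to keep the example short, and their values are copied from the table above. The function name `smallest_instance_for`, the scenario tuple layout, and the `production` flag are assumptions for this example. The flag skips apigw.dev.x1, which this topic describes as single-node, without an SLA, and for testing only.

```python
INSTANCE_TYPES = [
    "apigw.dev.x1", "apigw.small.x1", "apigw.small.x2", "apigw.small.x4",
    "apigw.medium.x1", "apigw.medium.x2", "apigw.medium.x3",
    "apigw.large.x1", "apigw.large.x2", "apigw.large.x3", "apigw.large.x4",
]

# Key: (connection type, size in KB, HTTPS enabled, gzip enabled).
# Value: reference QPS at 30% CPU, in the same order as INSTANCE_TYPES.
QPS_AT_SAFE_CPU = {
    ("short-lived", 1, False, False): [
        1_700, 3_400, 6_800, 13_600, 28_000, 56_000,
        84_000, 112_000, 224_000, 336_000, 448_000],
    ("short-lived", 1, True, False): [
        500, 1_000, 2_000, 4_000, 8_700, 17_400,
        26_100, 34_800, 69_600, 104_400, 139_200],
    ("persistent", 1, True, False): [
        2_000, 4_000, 8_000, 16_000, 32_000, 64_000,
        96_000, 128_000, 256_000, 384_000, 512_000],
    ("persistent", 10, True, True): [
        1_000, 2_000, 4_000, 8_000, 16_000, 32_000,
        48_000, 64_000, 128_000, 192_000, 256_000],
}


def smallest_instance_for(scenario, target_qps, production=True):
    """Return the first instance type whose reference QPS meets the target.

    Returns None when no listed instance type is large enough.
    """
    for name, qps in zip(INSTANCE_TYPES, QPS_AT_SAFE_CPU[scenario]):
        if production and name == "apigw.dev.x1":
            continue  # single-node, no SLA: testing only
        if qps >= target_qps:
            return name
    return None


if __name__ == "__main__":
    # 5,000 QPS over short-lived HTTPS connections with 1 KB payloads.
    print(smallest_instance_for(("short-lived", 1, True, False), 5_000))
```

Because the table values are worst-case references at the safe CPU level, a lookup like this gives a conservative starting point rather than a guaranteed capacity figure.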

Quota limits

Global quotas

Global quotas are independent of the gateway instance type. To request a quota increase, submit a ticket.

| Item | Default quota | Quota limit |
| --- | --- | --- |
| Instances per region | 50 | 100 |
| Total API operations per region | 10,000 | 20,000 |
| API operations per API | 1,000 | 2,000 |

Instance type quotas

Instance type quotas depend on the gateway instance type. If a quota is still insufficient after you upgrade to a higher instance type, you can submit a ticket to request a further increase.

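| Item | Dev & Small (default quota) | Dev & Small (quota limit) | Medium & Large (default quota) | Medium & Large (quota limit) |
| --- | --- | --- | --- | --- |
| Published domain names | 50 | 100 | 200 | 500 |
| Associated services | 100 | 200 | 300 | 500 |
| Total routes | 200 | 500 | 1,000 | 2,000 |
| Total online API operations | 1,000 | 2,000 | 3,000 | 5,000 |
| Number of K8s service sources | 2 | 3 | 3 | 5 |
| Associated environments | 5 | 10 | 15 | 20 |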
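
The following Python sketch shows one way to compare a planned resource count against the instance type quotas. The values are copied from the table above. The names `INSTANCE_TYPE_QUOTAS` and `quota_status`, the group keys, and the three status strings are hypothetical. The sketch also assumes that "default quota" is the value that applies without any request and that "quota limit" is the upper bound for that instance type group; this topic does not define the two terms explicitly.

```python
# (default quota, quota limit) per instance type group, from the table.
INSTANCE_TYPE_QUOTAS = {
    "dev_small": {
        "published_domain_names": (50, 100),
        "associated_services": (100, 200),
        "total_routes": (200, 500),
        "total_online_api_operations": (1_000, 2_000),
        "k8s_service_sources": (2, 3),
        "associated_environments": (5, 10),
    },
    "medium_large": {
        "published_domain_names": (200, 500),
        "associated_services": (300, 500),
        "total_routes": (1_000, 2_000),
        "total_online_api_operations": (3_000, 5_000),
        "k8s_service_sources": (3, 5),
        "associated_environments": (15, 20),
    },
}


def quota_status(group, resource, needed):
    """Compare a planned count with the default quota and the quota limit."""
    default, limit = INSTANCE_TYPE_QUOTAS[group][resource]
    if needed <= default:
        return "within default"
    if needed <= limit:
        return "above default, within limit"
    return "above limit"


if __name__ == "__main__":
    for group in INSTANCE_TYPE_QUOTAS:
        print(group, quota_status(group, "total_routes", 600))
```

For 600 routes, the sketch reports "above limit" for the Dev & Small group and "within default" for the Medium & Large group, which matches the guidance to upgrade the instance type before requesting a further increase.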

Resource quotas for Ingress scenarios

Resource quotas for Ingress scenarios depend on the gateway instance type. Do not exceed the quotas for your instance type to avoid stability issues.

Note

Quota limits are determined by the instance type. You can increase quotas only by upgrading the instance type or adding a new gateway cluster. Scaling out an instance of the same type does not increase its quotas.

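| Resource | Dev | Small | Medium | Large |
| --- | --- | --- | --- | --- |
| Domain Names | 500 | 1,000 | 2,500 | 7,500 |
| Services | 1,000 | 2,000 | 4,000 | 10,000 |
| Routes | 1,000 | 2,000 | 4,000 | 10,000 |
| Ingress | 1,500 | 1,000 | 2,500 | 7,500 |
| Endpoints | 2,500 | 5,000 | 10,000 | 25,000 |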
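
As a rough planning aid, the following Python sketch finds the smallest instance tier whose Ingress resource quotas cover all planned resource counts. The quota values are copied as published in the table above. The names `TIERS`, `INGRESS_QUOTAS`, and `smallest_tier_for` and the resource keys are hypothetical and chosen only for this example.

```python
TIERS = ["Dev", "Small", "Medium", "Large"]

# Quotas per resource, in the same order as TIERS (values as published).
INGRESS_QUOTAS = {
    "domain_names": [500, 1_000, 2_500, 7_500],
    "services": [1_000, 2_000, 4_000, 10_000],
    "routes": [1_000, 2_000, 4_000, 10_000],
    "ingress": [1_500, 1_000, 2_500, 7_500],
    "endpoints": [2_500, 5_000, 10_000, 25_000],
}


def smallest_tier_for(usage):
    """Return the first tier whose quotas cover every planned count.

    Returns None when even the Large tier is exceeded.
    """
    for index, tier in enumerate(TIERS):
        if all(count <= INGRESS_QUOTAS[resource][index]
               for resource, count in usage.items()):
            return tier
    return None


if __name__ == "__main__":
    print(smallest_tier_for({"domain_names": 800, "routes": 1_500}))
```

A result of None means that no single instance type covers the planned counts. In that case the note above applies: scaling out an instance of the same type does not help, so the remaining option is to add a new gateway cluster.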