This topic describes the capacity thresholds, queries per second (QPS) performance data, and quota limits for the different cloud-native API gateway instance types. Use this data to choose the right instance type.
Capacity thresholds
The following table lists the capacity thresholds for different gateway instance types. You receive full Service-Level Agreement (SLA) coverage when gateway capacity metrics are below the alert level. For core services, keep capacity metrics below the safe level to ensure better stability.
Safe level: The gateway can handle a sudden traffic burst of up to twice the normal volume while maintaining high throughput and low latency.
Alert level: When the capacity reaches this level, gateway latency may increase. There might be stability risks during traffic bursts.
The apigw.dev.x1 instance type is a single-node gateway deployment and does not provide SLA coverage. Use it for testing purposes only. For production services, use gateway instance types that are deployed across multiple nodes.
Gateway instance type | Client connections (safe level) | Client connections (alert level) | New HTTPS connections per second (safe level) | New HTTPS connections per second (alert level) | CPU utilization (safe level) | CPU utilization (alert level) | Memory usage (safe level) | Memory usage (alert level) |
apigw.dev.x1 | 12,000 | 24,000 | 400 | 800 | 30% | 60% | 75% | 75% |
apigw.small.x1 | 24,000 | 48,000 | 800 | 1,600 | 30% | 60% | 75% | 75% |
apigw.small.x2 | 48,000 | 96,000 | 1,600 | 3,200 | 30% | 60% | 75% | 75% |
apigw.small.x4 | 96,000 | 192,000 | 3,200 | 6,400 | 30% | 60% | 75% | 75% |
apigw.medium.x1 | 192,000 | 384,000 | 6,400 | 12,800 | 30% | 60% | 75% | 75% |
apigw.medium.x2 | 384,000 | 768,000 | 12,800 | 25,600 | 30% | 60% | 75% | 75% |
apigw.medium.x3 | 576,000 | 1,152,000 | 19,200 | 38,400 | 30% | 60% | 75% | 75% |
apigw.large.x1 | 768,000 | 1,536,000 | 25,600 | 51,200 | 30% | 60% | 75% | 75% |
apigw.large.x2 | 1,536,000 | 3,072,000 | 51,200 | 102,400 | 30% | 60% | 75% | 75% |
apigw.large.x3 | 2,304,000 | 4,608,000 | 76,800 | 153,600 | 30% | 60% | 75% | 75% |
apigw.large.x4 | 3,072,000 | 6,144,000 | 102,400 | 204,800 | 30% | 60% | 75% | 75% |
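The safe and alert levels above can be applied to monitoring data with a small helper. The following is a minimal sketch, not an official tool: the function names, metric keys, and zone labels are hypothetical, and only three instance types are copied from the table to keep it short. Memory usage is omitted because the table lists the same value (75%) for both levels.

```python
# Thresholds copied from the capacity table: (safe level, alert level).
# Only three instance types are included; extend the dict as needed.
THRESHOLDS = {
    "apigw.small.x1": {
        "client_connections": (24_000, 48_000),
        "new_https_per_second": (800, 1_600),
    },
    "apigw.medium.x1": {
        "client_connections": (192_000, 384_000),
        "new_https_per_second": (6_400, 12_800),
    },
    "apigw.large.x1": {
        "client_connections": (768_000, 1_536_000),
        "new_https_per_second": (25_600, 51_200),
    },
}

# CPU utilization levels (percent) are identical for every instance type.
CPU_LEVELS = (30.0, 60.0)


def classify(value, safe_level, alert_level):
    """Place one metric value in a zone (the labels are hypothetical)."""
    if value < safe_level:
        return "below_safe"
    if value < alert_level:
        return "between_safe_and_alert"
    return "at_or_above_alert"


def assess(instance_type, client_connections, new_https_per_second, cpu_percent):
    """Classify each observed metric against the table's thresholds."""
    limits = dict(THRESHOLDS[instance_type], cpu_utilization=CPU_LEVELS)
    observed = {
        "client_connections": client_connections,
        "new_https_per_second": new_https_per_second,
        "cpu_utilization": cpu_percent,
    }
    return {name: classify(value, *limits[name]) for name, value in observed.items()}


if __name__ == "__main__":
    print(assess("apigw.small.x1", 30_000, 500, 25.0))
```

In this sketch, any metric reported as `at_or_above_alert` corresponds to the range where the text above says latency may increase, and `between_safe_and_alert` is the range to avoid for core services.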
QPS performance reference
Gateway QPS throughput is affected by several factors, such as response size and whether HTTPS or gzip is enabled. The following table provides worst-case QPS reference values when the gateway is at 30% CPU utilization.
Creating new HTTPS connections consumes significant CPU resources. For scenarios with a high volume of concurrent HTTPS connections, evaluate the gateway capacity based on the data for short-lived HTTPS connections in the table below.
The values in the instance type columns are QPS references at the safe CPU level (30%).
Connection type | Response size (KB) | HTTPS enabled | gzip enabled | apigw.dev.x1 | apigw.small.x1 | apigw.small.x2 | apigw.small.x4 | apigw.medium.x1 | apigw.medium.x2 | apigw.medium.x3 | apigw.large.x1 | apigw.large.x2 | apigw.large.x3 | apigw.large.x4 |
Short-lived connection | 1 | No | No | 1,700 | 3,400 | 6,800 | 13,600 | 28,000 | 56,000 | 84,000 | 112,000 | 224,000 | 336,000 | 448,000 |
Short-lived connection | 1 | Yes | No | 500 | 1,000 | 2,000 | 4,000 | 8,700 | 17,400 | 26,100 | 34,800 | 69,600 | 104,400 | 139,200 |
Persistent connection | 1 | No | No | 2,200 | 4,400 | 8,800 | 17,600 | 35,000 | 70,000 | 105,000 | 140,000 | 280,000 | 420,000 | 560,000 |
Persistent connection | 1 | Yes | No | 2,000 | 4,000 | 8,000 | 16,000 | 32,000 | 64,000 | 96,000 | 128,000 | 256,000 | 384,000 | 512,000 |
Persistent connection | 1 | Yes | Yes | 1,700 | 3,400 | 6,800 | 13,600 | 28,000 | 56,000 | 84,000 | 112,000 | 224,000 | 336,000 | 448,000 |
Persistent connection | 10 | No | No | 1,800 | 3,600 | 7,200 | 14,400 | 30,000 | 60,000 | 90,000 | 120,000 | 240,000 | 360,000 | 480,000 |
Persistent connection | 10 | Yes | No | 1,700 | 3,400 | 6,800 | 13,600 | 28,000 | 56,000 | 84,000 | 112,000 | 224,000 | 336,000 | 448,000 |
Persistent connection | 10 | Yes | Yes | 1,000 | 2,000 | 4,000 | 8,000 | 16,000 | 32,000 | 48,000 | 64,000 | 128,000 | 192,000 | 256,000 |
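One way to use the QPS reference table for sizing is to pick the smallest instance type whose reference value covers the expected peak QPS for a given scenario. The sketch below assumes that approach; the function name and the scenario key format are hypothetical, and the numbers are copied from the table.

```python
# Instance types in the same left-to-right order as the table columns.
INSTANCE_TYPES = [
    "apigw.dev.x1", "apigw.small.x1", "apigw.small.x2", "apigw.small.x4",
    "apigw.medium.x1", "apigw.medium.x2", "apigw.medium.x3",
    "apigw.large.x1", "apigw.large.x2", "apigw.large.x3", "apigw.large.x4",
]

# Key: (connection type, response size in KB, HTTPS enabled, gzip enabled).
# Value: QPS reference at 30% CPU utilization, one entry per instance type.
QPS_REFERENCE = {
    ("short-lived", 1, False, False): [1700, 3400, 6800, 13600, 28000, 56000, 84000, 112000, 224000, 336000, 448000],
    ("short-lived", 1, True, False): [500, 1000, 2000, 4000, 8700, 17400, 26100, 34800, 69600, 104400, 139200],
    ("persistent", 1, False, False): [2200, 4400, 8800, 17600, 35000, 70000, 105000, 140000, 280000, 420000, 560000],
    ("persistent", 1, True, False): [2000, 4000, 8000, 16000, 32000, 64000, 96000, 128000, 256000, 384000, 512000],
    ("persistent", 1, True, True): [1700, 3400, 6800, 13600, 28000, 56000, 84000, 112000, 224000, 336000, 448000],
    ("persistent", 10, False, False): [1800, 3600, 7200, 14400, 30000, 60000, 90000, 120000, 240000, 360000, 480000],
    ("persistent", 10, True, False): [1700, 3400, 6800, 13600, 28000, 56000, 84000, 112000, 224000, 336000, 448000],
    ("persistent", 10, True, True): [1000, 2000, 4000, 8000, 16000, 32000, 48000, 64000, 128000, 192000, 256000],
}


def smallest_instance_type(required_qps, connection, size_kb, https, gzip):
    """Return the smallest instance type whose reference QPS covers required_qps.

    Returns None when no listed instance type is large enough.
    Raises KeyError when the scenario is not a row in the table.
    """
    values = QPS_REFERENCE[(connection, size_kb, https, gzip)]
    for instance_type, qps in zip(INSTANCE_TYPES, values):
        if qps >= required_qps:
            return instance_type
    return None


if __name__ == "__main__":
    # 5,000 QPS over persistent HTTPS connections, 1 KB responses, no gzip.
    print(smallest_instance_type(5000, "persistent", 1, True, False))
    # The same QPS over short-lived HTTPS connections needs a larger type.
    print(smallest_instance_type(5000, "short-lived", 1, True, False))
```

Because the reference values correspond to the safe CPU level, a type selected this way leaves the burst headroom described for the safe level. Real workloads may differ from the table's scenarios, so treat the result as a starting point for load testing.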
Quota limits
Global quotas
Global quotas are independent of the gateway instance type. To request a quota increase, submit a ticket.
Quota item | Default quota | Maximum quota |
Instances per region | 50 | 100 |
Total API operations per region | 10,000 | 20,000 |
Operations per API | 1,000 | 2,000 |
Instance type quotas
Instance type quotas depend on the gateway instance type. If the quota is still insufficient after you upgrade to a higher instance type, submit a ticket to request a further increase.
Quota item | Dev & Small (default quota) | Dev & Small (maximum quota) | Medium & Large (default quota) | Medium & Large (maximum quota) |
Published domain names | 50 | 100 | 200 | 500 |
Associated services | 100 | 200 | 300 | 500 |
Total routes | 200 | 500 | 1,000 | 2,000 |
Total online API operations | 1,000 | 2,000 | 3,000 | 5,000 |
K8s service sources | 2 | 3 | 3 | 5 |
Associated environments | 5 | 10 | 15 | 20 |
Resource quotas for Ingress scenarios
Resource quotas for Ingress scenarios depend on the gateway instance type. Do not exceed the resource quotas for the corresponding instance type. Exceeding the quotas can cause stability issues.
Quota limits are tied to the instance type. You can increase the quota only by changing the instance type or adding a new gateway cluster. Scaling out an instance of the same type does not increase the quota.
Resource | Dev | Small | Medium | Large |
Domain names | 500 | 1,000 | 2,500 | 7,500 |
Services | 1,000 | 2,000 | 4,000 | 10,000 |
Routes | 1,000 | 2,000 | 4,000 | 10,000 |
Ingress | 1,500 | 1,000 | 2,500 | 7,500 |
Endpoints | 2,500 | 5,000 | 10,000 | 25,000 |
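Before migrating Ingress resources to a gateway, it can help to compare the current resource counts with the quotas for the target instance type. The sketch below shows one way to do that; the function name and the usage dictionary are hypothetical, and the quota values are copied unchanged from the table above.

```python
# Ingress resource quotas per instance type family, copied from the table.
INGRESS_QUOTAS = {
    "Dev": {"Domain names": 500, "Services": 1000, "Routes": 1000, "Ingress": 1500, "Endpoints": 2500},
    "Small": {"Domain names": 1000, "Services": 2000, "Routes": 2000, "Ingress": 1000, "Endpoints": 5000},
    "Medium": {"Domain names": 2500, "Services": 4000, "Routes": 4000, "Ingress": 2500, "Endpoints": 10000},
    "Large": {"Domain names": 7500, "Services": 10000, "Routes": 10000, "Ingress": 7500, "Endpoints": 25000},
}


def over_quota(family, usage):
    """Return {resource: (used, quota)} for every resource that exceeds its quota.

    `usage` maps resource names (as in the table) to current counts.
    A count equal to the quota is treated as within the quota.
    """
    quotas = INGRESS_QUOTAS[family]
    return {
        resource: (used, quotas[resource])
        for resource, used in usage.items()
        if used > quotas[resource]
    }


if __name__ == "__main__":
    current = {"Domain names": 1200, "Routes": 1500, "Endpoints": 5000}
    print(over_quota("Small", current))   # only the domain name count exceeds its quota
    print(over_quota("Medium", current))  # nothing exceeds the Medium quotas
```

An empty result means the listed counts fit within the quotas for that family. A non-empty result points to changing the instance type or adding a gateway cluster, since scaling out instances of the same type does not raise these quotas.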