This topic provides capacity thresholds and queries per second (QPS) performance references for different instance types of Cloud-native API Gateway. This information helps you select the right instance type for your needs.
Capacity thresholds
The following table lists the capacity thresholds for different gateway instance types. You receive full Service-Level Agreement (SLA) coverage when gateway capacity metrics are below the alert level. For core services, keep gateway capacity metrics below the safe level to ensure greater stability.
Safe level: The gateway system can maintain high throughput and low latency even if burst traffic doubles.
Alert level: When the gateway operates above this level, latency may increase and stability is at risk during traffic bursts.
The apigw.dev.x1 instance type is a single-node deployment and does not include an SLA. Use it for testing purposes only. For production services, ensure you use gateway instance types with multiple nodes.
The SLA does not cover request failures caused by CPU or memory usage exceeding the alert level. The gateway generates alerts when CPU or memory usage reaches the alert level. Monitor the gateway's load and respond to alerts promptly.
Gateway instance type | Client connections (safe level) | Client connections (alert level) | New HTTPS connections per second (safe level) | New HTTPS connections per second (alert level) | CPU usage (safe level) | CPU usage (alert level) | Memory usage (safe level) | Memory usage (alert level) |
apigw.dev.x1 | 12,000 | 24,000 | 400 | 800 | 30% | 60% | 75% | 75% |
apigw.small.x1 | 24,000 | 48,000 | 800 | 1,600 | 30% | 60% | 75% | 75% |
apigw.small.x2 | 48,000 | 96,000 | 1,600 | 3,200 | 30% | 60% | 75% | 75% |
apigw.small.x4 | 96,000 | 192,000 | 3,200 | 6,400 | 30% | 60% | 75% | 75% |
apigw.medium.x1 | 192,000 | 384,000 | 6,400 | 12,800 | 30% | 60% | 75% | 75% |
apigw.medium.x2 | 384,000 | 768,000 | 12,800 | 25,600 | 30% | 60% | 75% | 75% |
apigw.medium.x3 | 576,000 | 1,152,000 | 19,200 | 38,400 | 30% | 60% | 75% | 75% |
apigw.large.x1 | 768,000 | 1,536,000 | 25,600 | 51,200 | 30% | 60% | 75% | 75% |
apigw.large.x2 | 1,536,000 | 3,072,000 | 51,200 | 102,400 | 30% | 60% | 75% | 75% |
apigw.large.x3 | 2,304,000 | 4,608,000 | 76,800 | 153,600 | 30% | 60% | 75% | 75% |
apigw.large.x4 | 3,072,000 | 6,144,000 | 102,400 | 204,800 | 30% | 60% | 75% | 75% |
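As a rough illustration of how the thresholds above can be applied, the following Python sketch classifies observed metrics against the safe and alert levels. The function names `level` and `classify_load` and the label "above safe" are hypothetical and not part of any gateway API. The sketch also assumes that a value exactly at a level counts as having reached it.

```python
# (safe client connections, safe new HTTPS connections per second), copied
# from the capacity table. In that table each alert level is twice the safe level.
SAFE_LEVELS = {
    "apigw.dev.x1": (12_000, 400),
    "apigw.small.x1": (24_000, 800),
    "apigw.small.x2": (48_000, 1_600),
    "apigw.small.x4": (96_000, 3_200),
    "apigw.medium.x1": (192_000, 6_400),
    "apigw.medium.x2": (384_000, 12_800),
    "apigw.medium.x3": (576_000, 19_200),
    "apigw.large.x1": (768_000, 25_600),
    "apigw.large.x2": (1_536_000, 51_200),
    "apigw.large.x3": (2_304_000, 76_800),
    "apigw.large.x4": (3_072_000, 102_400),
}
CPU_SAFE, CPU_ALERT = 30, 60        # percent
MEMORY_SAFE, MEMORY_ALERT = 75, 75  # percent, as listed in the table


def level(value, safe, alert):
    """Classify one metric (hypothetical helper).

    Assumption: a value equal to a level counts as having reached it.
    """
    if value >= alert:
        return "alert"
    if value < safe:
        return "safe"
    return "above safe"


def classify_load(instance_type, connections, new_https_per_s, cpu_pct, memory_pct):
    """Classify all four capacity metrics for one gateway instance type."""
    safe_conn, safe_https = SAFE_LEVELS[instance_type]
    return {
        "client_connections": level(connections, safe_conn, 2 * safe_conn),
        "new_https_per_second": level(new_https_per_s, safe_https, 2 * safe_https),
        "cpu": level(cpu_pct, CPU_SAFE, CPU_ALERT),
        "memory": level(memory_pct, MEMORY_SAFE, MEMORY_ALERT),
    }
```

For example, `classify_load("apigw.small.x1", 30_000, 100, 65, 50)` reports the client connections as above the safe level and the CPU as at the alert level, while the other two metrics remain safe.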
QPS performance reference
Gateway QPS is affected by multiple factors, such as the response size and whether HTTPS or gzip is enabled. The following table provides conservative (worst-case) reference QPS values when gateway CPU usage is at the safe level of 30%.
Creating new HTTPS connections consumes significant CPU resources. For services with a high number of concurrent, short-lived HTTPS connections, evaluate gateway capacity based on the HTTPS short-lived connection data in the table below.
QPS reference at the safe CPU level (30%), by gateway instance type:
Connection type | Response size (KB) | HTTPS enabled | gzip enabled | apigw.dev.x1 | apigw.small.x1 | apigw.small.x2 | apigw.small.x4 | apigw.medium.x1 | apigw.medium.x2 | apigw.medium.x3 | apigw.large.x1 | apigw.large.x2 | apigw.large.x3 | apigw.large.x4 |
Short-lived connection | 1 | No | No | 1,700 | 3,400 | 6,800 | 13,600 | 28,000 | 56,000 | 84,000 | 112,000 | 224,000 | 336,000 | 448,000 |
Short-lived connection | 1 | Yes | No | 500 | 1,000 | 2,000 | 4,000 | 8,700 | 17,400 | 26,100 | 34,800 | 69,600 | 104,400 | 139,200 |
Persistent connection | 1 | No | No | 2,200 | 4,400 | 8,800 | 17,600 | 35,000 | 70,000 | 105,000 | 140,000 | 280,000 | 420,000 | 560,000 |
Persistent connection | 1 | Yes | No | 2,000 | 4,000 | 8,000 | 16,000 | 32,000 | 64,000 | 96,000 | 128,000 | 256,000 | 384,000 | 512,000 |
Persistent connection | 1 | Yes | Yes | 1,700 | 3,400 | 6,800 | 13,600 | 28,000 | 56,000 | 84,000 | 112,000 | 224,000 | 336,000 | 448,000 |
Persistent connection | 10 | No | No | 1,800 | 3,600 | 7,200 | 14,400 | 30,000 | 60,000 | 90,000 | 120,000 | 240,000 | 360,000 | 480,000 |
Persistent connection | 10 | Yes | No | 1,700 | 3,400 | 6,800 | 13,600 | 28,000 | 56,000 | 84,000 | 112,000 | 224,000 | 336,000 | 448,000 |
Persistent connection | 10 | Yes | Yes | 1,000 | 2,000 | 4,000 | 8,000 | 16,000 | 32,000 | 48,000 | 64,000 | 128,000 | 192,000 | 256,000 |
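One way to use the QPS reference is to look up the smallest instance type whose reference value covers an expected peak. The Python sketch below shows that lookup. The names `INSTANCE_TYPES`, `QPS_REFERENCE`, and `smallest_instance_type` are hypothetical, and the scenario key layout (connection type, size in KB, HTTPS enabled, gzip enabled) is an assumption made for this example. The numbers are copied from the table.

```python
INSTANCE_TYPES = [
    "apigw.dev.x1", "apigw.small.x1", "apigw.small.x2", "apigw.small.x4",
    "apigw.medium.x1", "apigw.medium.x2", "apigw.medium.x3",
    "apigw.large.x1", "apigw.large.x2", "apigw.large.x3", "apigw.large.x4",
]

# Key: (connection type, size in KB, HTTPS enabled, gzip enabled).
# Values: reference QPS at 30% CPU, in the same order as INSTANCE_TYPES.
QPS_REFERENCE = {
    ("short-lived", 1, False, False): [1_700, 3_400, 6_800, 13_600, 28_000, 56_000, 84_000, 112_000, 224_000, 336_000, 448_000],
    ("short-lived", 1, True, False): [500, 1_000, 2_000, 4_000, 8_700, 17_400, 26_100, 34_800, 69_600, 104_400, 139_200],
    ("persistent", 1, False, False): [2_200, 4_400, 8_800, 17_600, 35_000, 70_000, 105_000, 140_000, 280_000, 420_000, 560_000],
    ("persistent", 1, True, False): [2_000, 4_000, 8_000, 16_000, 32_000, 64_000, 96_000, 128_000, 256_000, 384_000, 512_000],
    ("persistent", 1, True, True): [1_700, 3_400, 6_800, 13_600, 28_000, 56_000, 84_000, 112_000, 224_000, 336_000, 448_000],
    ("persistent", 10, False, False): [1_800, 3_600, 7_200, 14_400, 30_000, 60_000, 90_000, 120_000, 240_000, 360_000, 480_000],
    ("persistent", 10, True, False): [1_700, 3_400, 6_800, 13_600, 28_000, 56_000, 84_000, 112_000, 224_000, 336_000, 448_000],
    ("persistent", 10, True, True): [1_000, 2_000, 4_000, 8_000, 16_000, 32_000, 48_000, 64_000, 128_000, 192_000, 256_000],
}


def smallest_instance_type(target_qps, scenario):
    """Return the first instance type whose reference QPS covers target_qps.

    Returns None when even the largest listed type is below the target.
    """
    for instance_type, qps in zip(INSTANCE_TYPES, QPS_REFERENCE[scenario]):
        if qps >= target_qps:
            return instance_type
    return None
```

For a service expecting 3,000 QPS over short-lived HTTPS connections, `smallest_instance_type(3_000, ("short-lived", 1, True, False))` returns `apigw.small.x4`. The lookup can return `apigw.dev.x1` for small targets, so production sizing should skip that type because it is intended for testing only.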
Quota limits
Global quotas
Global quotas are independent of the gateway instance type. To request a quota increase, submit a ticket.
Quota item | Default quota | Quota limit |
Instances per region | 50 | 100 |
Total API operations per region | 10,000 | 20,000 |
API operations per API | 1,000 | 2,000 |
Instance type quotas
Instance type quotas depend on the gateway instance type. If a quota is still insufficient after you upgrade to a higher instance type, you can submit a ticket to request a further increase.
Quota item | Dev & Small (default quota) | Dev & Small (quota limit) | Medium & Large (default quota) | Medium & Large (quota limit) |
Published domain names | 50 | 100 | 200 | 500 |
Associated services | 100 | 200 | 300 | 500 |
Total routes | 200 | 500 | 1,000 | 2,000 |
Total online API operations | 1,000 | 2,000 | 3,000 | 5,000 |
Number of K8s service sources | 2 | 3 | 3 | 5 |
Associated environments | 5 | 10 | 15 | 20 |
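The following Python sketch shows one way to check a planned resource count against these quotas. It assumes that the "Quota limit" column is the ceiling a quota can be raised to for that tier, which the table implies but does not state. The names `QUOTAS` and `quota_status`, the tier keys, and the status strings are hypothetical.

```python
# (default quota, quota limit) per tier, copied from the table above.
QUOTAS = {
    "Published domain names": {"dev_small": (50, 100), "medium_large": (200, 500)},
    "Associated services": {"dev_small": (100, 200), "medium_large": (300, 500)},
    "Total routes": {"dev_small": (200, 500), "medium_large": (1_000, 2_000)},
    "Total online API operations": {"dev_small": (1_000, 2_000), "medium_large": (3_000, 5_000)},
    "Number of K8s service sources": {"dev_small": (2, 3), "medium_large": (3, 5)},
    "Associated environments": {"dev_small": (5, 10), "medium_large": (15, 20)},
}


def quota_status(item, tier, needed):
    """Compare a planned count with the default quota and quota limit.

    Assumption: the quota limit is the highest value the quota can be
    raised to for the given tier.
    """
    default, limit = QUOTAS[item][tier]
    if needed <= default:
        return "within default quota"
    if needed <= limit:
        return "needs quota increase"
    return "exceeds quota limit"
```

For example, 300 routes exceed the default quota on a Dev or Small instance (`quota_status("Total routes", "dev_small", 300)` returns `"needs quota increase"`), but fit within the default quota on a Medium or Large instance.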
Resource quotas for Ingress scenarios
Resource quotas for Ingress scenarios depend on the gateway instance type. Do not exceed the quotas for your instance type to avoid stability issues.
Quota limits are determined by the instance type. You can increase quotas only by upgrading the instance type or adding a new gateway cluster. Scaling out an instance of the same type does not increase its quotas.
Resource | Dev | Small | Medium | Large |
Domain Names | 500 | 1,000 | 2,500 | 7,500 |
Services | 1,000 | 2,000 | 4,000 | 10,000 |
Routes | 1,000 | 2,000 | 4,000 | 10,000 |
Ingress | 1,500 | 1,000 | 2,500 | 7,500 |
Endpoints | 2,500 | 5,000 | 10,000 | 25,000 |