Billing for auto scaling

This topic describes the billable items of Cloud-native API Gateway instances during automatic scale-outs. This helps you understand and calculate the fees that may be incurred when you use the auto scaling feature of Cloud-native API Gateway.

Capacity thresholds

Cloud-native API Gateway supports auto scaling. You can configure a period for auto scaling to better cope with traffic surges. Scale-outs are automatically performed by unit. The following items describe the capacity of a unit:

Secure thresholds: At or below the secure thresholds, your gateway instance maintains high throughput and low latency even if traffic doubles.
Warning thresholds: When your instance reaches the warning thresholds, latency increases and the instance is at risk of instability during traffic surges.

Metric	Threshold type	Value per scale-out unit (1 unit)
Client connections	Secure threshold	12,000
Client connections	Warning threshold	24,000
New HTTPS connections per second	Secure threshold	400
New HTTPS connections per second	Warning threshold	800
CPU utilization	Secure threshold	30%
CPU utilization	Warning threshold	60%
Memory usage	Secure threshold	75%
Memory usage	Warning threshold	75%
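
The per-unit thresholds above can be turned into a rough sizing estimate. The following Python snippet is a minimal sketch, not an official sizing tool: it assumes that capacity grows linearly with the number of units, and the function name `units_for_secure_capacity` is hypothetical. It covers only the two metrics that are expressed as absolute numbers (client connections and new HTTPS connections per second).

```python
import math

# Per-unit secure thresholds, copied from the table above.
SECURE_PER_UNIT = {
    "client_connections": 12_000,
    "new_https_connections_per_second": 400,
}


def units_for_secure_capacity(client_connections: int, new_https_cps: int) -> int:
    """Return the smallest unit count that keeps both metrics at or below
    the secure thresholds, ASSUMING capacity scales linearly with units."""
    by_connections = math.ceil(
        client_connections / SECURE_PER_UNIT["client_connections"]
    )
    by_https = math.ceil(
        new_https_cps / SECURE_PER_UNIT["new_https_connections_per_second"]
    )
    # An instance always has at least one unit.
    return max(1, by_connections, by_https)


if __name__ == "__main__":
    # 30,000 connections and 1,000 new HTTPS connections per second.
    print(units_for_secure_capacity(30_000, 1_000))
```

In this example, both metrics call for 2.5 units, so the estimate rounds up to 3 units.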

QPS reference

The queries per second (QPS) throughput of a gateway instance is affected by various factors, such as the response size, whether HTTPS is enabled, and whether GZIP compression is used. The following table lists worst-case reference QPS values for a single scale-out unit under different conditions, based on a CPU utilization of 30%:

Note

New HTTPS connections consume large amounts of CPU resources. In business scenarios in which a large number of HTTPS connections need to be established at the same time, you can estimate your instance capacity requirements based on the data of short-lived connections in the following table:

Scale-out specification: 1 unit
Connection type	Response size (KB)	HTTPS	GZIP	QPS reference at a CPU utilization of 30%
Short-lived connection	1	No	No	1,700
Short-lived connection	1	Yes	No	500
Long-lived connection	1	No	No	2,200
Long-lived connection	1	Yes	No	2,000
Long-lived connection	1	Yes	Yes	1,700
Long-lived connection	10	No	No	1,800
Long-lived connection	10	Yes	No	1,700
Long-lived connection	10	Yes	Yes	1,000
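
As an illustration of how the reference values might be used for capacity planning, the following Python sketch looks up the per-unit QPS for a traffic profile and divides a target QPS by it. The lookup keys, the function name `units_for_qps`, and the assumption that throughput scales linearly with the number of units are all illustrative and not part of the product. Rows whose leading cells are blank in the table are read as continuing the connection type and response size of the row above.

```python
import math

# (connection type, response size in KB, HTTPS, GZIP) -> reference QPS of
# one unit at 30% CPU utilization, taken from the table above.
QPS_REFERENCE = {
    ("short", 1, False, False): 1_700,
    ("short", 1, True, False): 500,
    ("long", 1, False, False): 2_200,
    ("long", 1, True, False): 2_000,
    ("long", 1, True, True): 1_700,
    ("long", 10, False, False): 1_800,
    ("long", 10, True, False): 1_700,
    ("long", 10, True, True): 1_000,
}


def units_for_qps(target_qps: int, connection: str, response_kb: int,
                  https: bool, gzip: bool) -> int:
    """Estimate the units needed to serve target_qps at about 30% CPU,
    ASSUMING throughput scales linearly with the number of units."""
    per_unit = QPS_REFERENCE[(connection, response_kb, https, gzip)]
    return max(1, math.ceil(target_qps / per_unit))


if __name__ == "__main__":
    # 5,000 QPS of short-lived HTTPS requests with 1 KB responses.
    print(units_for_qps(5_000, "short", 1, True, False))
```

Profiles that are not listed in the table raise a `KeyError` rather than being interpolated, because the source gives no data for them.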

Scale-outs of different instance specifications

Specification	Existing units	Scale-out step size	Scale-out range
apigw.dev.x1	1	N/A	N/A (This specification does not support auto scale-out.)
apigw.small.x1	2	1 unit	3-4 units
apigw.small.x2	4	1 unit	5-8 units
apigw.small.x4	8	1 unit	9-16 units
apigw.medium.x1	16	4 units	20-32 units
apigw.medium.x2	32	4 units	36-64 units
apigw.medium.x3	48	4 units	52-96 units
apigw.large.x1	64	4 units	68-128 units
apigw.large.x2	128	4 units	132-256 units
apigw.large.x3	192	4 units	196-384 units
apigw.large.x4	256	4 units	260-512 units
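
The table follows a regular pattern: each scale-out range starts at the existing unit count plus one step and ends at twice the existing unit count. The following Python sketch lists the total unit counts that an instance could reach. It assumes that scale-outs move through the range in whole steps, which the table implies but does not state explicitly. The function name `scale_out_targets` is hypothetical.

```python
# specification -> (existing units, scale-out step size), from the table above.
# A step of None marks a specification that does not support auto scale-out.
SPECS = {
    "apigw.dev.x1": (1, None),
    "apigw.small.x1": (2, 1),
    "apigw.small.x2": (4, 1),
    "apigw.small.x4": (8, 1),
    "apigw.medium.x1": (16, 4),
    "apigw.medium.x2": (32, 4),
    "apigw.medium.x3": (48, 4),
    "apigw.large.x1": (64, 4),
    "apigw.large.x2": (128, 4),
    "apigw.large.x3": (192, 4),
    "apigw.large.x4": (256, 4),
}


def scale_out_targets(spec: str) -> list[int]:
    """Return the total unit counts reachable by auto scale-out, ASSUMING
    the range is traversed in increments of the step size."""
    existing, step = SPECS[spec]
    if step is None:
        return []
    # The range runs from existing + step up to and including 2 * existing.
    return list(range(existing + step, 2 * existing + 1, step))


if __name__ == "__main__":
    print(scale_out_targets("apigw.medium.x1"))
```

For `apigw.medium.x1`, this yields 20, 24, 28, and 32 units, which matches the 20-32 range in the table.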

Billing description

Elastic nodes that are added by automatic scale-outs are billed separately on a pay-as-you-go basis. The fees for these elastic nodes are listed as separate items in your billing details.

Region	Scale-out unit	Pay-as-you-go unit price (USD/hour)
Chinese mainland	Unit	0.146
China (Hong Kong) and Japan (Tokyo)	Unit	0.218
Singapore, Indonesia (Jakarta), and Germany (Frankfurt)	Unit	0.202
US (Virginia) and US (Silicon Valley)	Unit	0.174

Scale-out fee = Number of scaled-out units x Unit price x Duration in hours. For example, if your instance is scaled out by 2 units between 8:00 and 10:00 in a Chinese mainland region, the scale-out fee is 2 x 0.146 x 2 = USD 0.584.
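
The formula can be checked with a few lines of Python. This is a minimal sketch: the region labels are copied from the price table, the function name `scale_out_fee` is hypothetical, and the code simply multiplies by the duration in hours. How partial hours are metered is not described here, so treat fractional durations as an assumption.

```python
from decimal import Decimal

# Unit prices in USD per unit per hour, grouped as in the price table above.
_PRICE_GROUPS = [
    (("Chinese mainland",), "0.146"),
    (("China (Hong Kong)", "Japan (Tokyo)"), "0.218"),
    (("Singapore", "Indonesia (Jakarta)", "Germany (Frankfurt)"), "0.202"),
    (("US (Virginia)", "US (Silicon Valley)"), "0.174"),
]
UNIT_PRICE = {
    region: Decimal(price)
    for regions, price in _PRICE_GROUPS
    for region in regions
}


def scale_out_fee(region: str, scaled_out_units: int, hours) -> Decimal:
    """Scale-out fee = number of scaled-out units x unit price x hours.

    Decimal avoids binary floating-point rounding in currency amounts.
    """
    return UNIT_PRICE[region] * scaled_out_units * Decimal(str(hours))


if __name__ == "__main__":
    # 2 extra units for 2 hours (8:00 to 10:00) in a Chinese mainland region.
    print(scale_out_fee("Chinese mainland", 2, 2))  # 0.584
```

The printed value reproduces the USD 0.584 from the example above.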