This topic describes the billable items and pricing for AI Gateway. It covers two product types: dedicated instances and Serverless.
Billing methods
AI Gateway supports the following billing methods:
Pay-as-you-go: This is a postpaid billing method. Charges are calculated hourly, and usage of less than one hour is billed as one full hour. A bill is generated every 24 hours, and fees are automatically deducted from your Alibaba Cloud account balance. The actual billing time may vary.
Subscription: This is a prepaid billing method. You are billed on a monthly basis. An annual subscription is for a 12-month period. Fees are automatically deducted from your Alibaba Cloud account balance. The actual billing time may vary.
Billable items
AI Gateway is offered in two types: dedicated instances and Serverless.
Dedicated instances: Billable items include instance fees, data processing fees, and Internet traffic costs.
Serverless: Billable items include request CU fees and Internet traffic costs.
Dedicated instances
Instance fees
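Region | Instance type | Pay-as-you-go price (USD/hour) | Subscription price (USD/month) |
The Chinese mainland (excluding Hong Kong (China), Macao (China), and Taiwan (China)) | aigw.small.x1 | 1.11069 | 559.65 |
The Chinese mainland (excluding Hong Kong (China), Macao (China), and Taiwan (China)) | aigw.small.x2 | 2.124948 | 1,071.084 |
The Chinese mainland (excluding Hong Kong (China), Macao (China), and Taiwan (China)) | aigw.small.x4 | 4.022018 | 2,026.794 |
The Chinese mainland (excluding Hong Kong (China), Macao (China), and Taiwan (China)) | aigw.medium.x1 | 7.854 | 3,958.64 |
The Chinese mainland (excluding Hong Kong (China), Macao (China), and Taiwan (China)) | aigw.medium.x2 | 15.46776 | 7,795.76 |
The Chinese mainland (excluding Hong Kong (China), Macao (China), and Taiwan (China)) | aigw.medium.x3 | 23.08096 | 11,632.88 |
The Chinese mainland (excluding Hong Kong (China), Macao (China), and Taiwan (China)) | aigw.large.x1 | 29.927352 | 15,083.25 |
The Chinese mainland (excluding Hong Kong (China), Macao (China), and Taiwan (China)) | aigw.large.x2 | 59.619378 | 30,048.018 |
The Chinese mainland (excluding Hong Kong (China), Macao (China), and Taiwan (China)) | aigw.large.x3 | 89.31195 | 45,013.332 |
China (Hong Kong), Japan (Tokyo) | aigw.small.x1 | 1.666322 | 839.475 |
China (Hong Kong), Japan (Tokyo) | aigw.small.x2 | 3.187422 | 1,606.626 |
China (Hong Kong), Japan (Tokyo) | aigw.small.x4 | 6.033314 | 3,040.191 |
China (Hong Kong), Japan (Tokyo) | aigw.medium.x1 | 11.78128 | 5,937.96 |
China (Hong Kong), Japan (Tokyo) | aigw.medium.x2 | 23.20192 | 11,693.64 |
China (Hong Kong), Japan (Tokyo) | aigw.medium.x3 | 34.62144 | 17,449.32 |
China (Hong Kong), Japan (Tokyo) | aigw.large.x1 | 44.891028 | 22,624.875 |
China (Hong Kong), Japan (Tokyo) | aigw.large.x2 | 89.428794 | 45,072.027 |
China (Hong Kong), Japan (Tokyo) | aigw.large.x3 | 133.968198 | 67,519.998 |
Singapore, Indonesia (Jakarta), Germany (Frankfurt) | aigw.small.x1 | 1.53258 | 772.317 |
Singapore, Indonesia (Jakarta), Germany (Frankfurt) | aigw.small.x2 | 2.932566 | 1,478.09592 |
Singapore, Indonesia (Jakarta), Germany (Frankfurt) | aigw.small.x4 | 5.55058 | 2,796.97572 |
Singapore, Indonesia (Jakarta), Germany (Frankfurt) | aigw.medium.x1 | 10.8388 | 5,462.9232 |
Singapore, Indonesia (Jakarta), Germany (Frankfurt) | aigw.medium.x2 | 21.34552 | 10,758.1488 |
Singapore, Indonesia (Jakarta), Germany (Frankfurt) | aigw.medium.x3 | 31.85168 | 16,053.3744 |
Singapore, Indonesia (Jakarta), Germany (Frankfurt) | aigw.large.x1 | 41.299986 | 20,814.885 |
Singapore, Indonesia (Jakarta), Germany (Frankfurt) | aigw.large.x2 | 82.274556 | 41,466.26484 |
Singapore, Indonesia (Jakarta), Germany (Frankfurt) | aigw.large.x3 | 123.250764 | 62,118.39816 |
US (Virginia), US (Silicon Valley) | aigw.small.x1 | 1.332828 | 671.58 |
US (Virginia), US (Silicon Valley) | aigw.small.x2 | 2.549708 | 1,285.3008 |
US (Virginia), US (Silicon Valley) | aigw.small.x4 | 4.826192 | 2,432.1528 |
US (Virginia), US (Silicon Valley) | aigw.medium.x1 | 9.4248 | 4,750.368 |
US (Virginia), US (Silicon Valley) | aigw.medium.x2 | 18.5612 | 9,354.912 |
US (Virginia), US (Silicon Valley) | aigw.medium.x3 | 27.69704 | 13,959.456 |
US (Virginia), US (Silicon Valley) | aigw.large.x1 | 35.912604 | 18,099.9 |
US (Virginia), US (Silicon Valley) | aigw.large.x2 | 71.543472 | 36,057.6216 |
US (Virginia), US (Silicon Valley) | aigw.large.x3 | 107.17434 | 54,015.9984 |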
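As a rough illustration of how the pay-as-you-go rule applies to instance fees, the following Python sketch rounds usage up to whole hours and multiplies by the hourly price. The function name is hypothetical, and rounding the total usage time up to the next full hour is an assumption based on the "less than one hour is billed as one hour" rule. It is not an official billing implementation.

```python
import math
from decimal import Decimal


def pay_as_you_go_instance_fee(hours_used, hourly_price):
    """Estimate the instance fee in USD for a pay-as-you-go instance.

    Assumption: a partial hour is billed as a full hour, so the total
    usage time is rounded up to the next whole hour.
    """
    billable_hours = math.ceil(hours_used)
    return billable_hours * hourly_price


# aigw.small.x1 in the Chinese mainland (hourly price from the table above),
# used for 2.5 hours and therefore billed as 3 hours.
fee = pay_as_you_go_instance_fee(2.5, Decimal("1.11069"))
print(fee)  # 3.33207
```

Using Decimal rather than float keeps the six-digit prices exact.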
Data processing fees
When you create an AI Gateway instance, you can select a network type: Internet, private network, or Internet + private network. Data processing for different network types is billed separately. Within a billing cycle, the data processing volume is the total volume of request and response data.
Region | Network type | Price (USD/GB/hour) |
Public Cloud | Internet | 0.005 |
Public Cloud | Private network | 0.007 |
Billing formula
Internet: Hourly fee = Data processing volume × Price.
For example, you use a gateway instance for two hours. The instance processes 5 GB of data in the first hour and 10 GB in the second. The total data processing fee for the two hours is 0.005 × 5 + 0.005 × 10 = 0.075 USD.
Private network: Hourly fee = Data processing volume × Price.
For example, you use a gateway instance for two hours. The instance processes 5 GB of data in the first hour and 10 GB in the second. The total data processing fee for the two hours is 0.007 × 5 + 0.007 × 10 = 0.105 USD.
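The two examples above can be reproduced with a short Python sketch. The dictionary keys and the function name are hypothetical. The prices are the per-GB prices from the table in this section.

```python
from decimal import Decimal

# Per-GB prices from the data processing price table (USD per GB, settled hourly).
PRICE_PER_GB = {
    "internet": Decimal("0.005"),
    "private": Decimal("0.007"),
}


def data_processing_fee(network_type, hourly_volumes_gb):
    """Sum the hourly fees: each hour's volume in GB times the per-GB price."""
    price = PRICE_PER_GB[network_type]
    total = Decimal("0")
    for volume_gb in hourly_volumes_gb:
        total += Decimal(str(volume_gb)) * price
    return total


# 5 GB in the first hour and 10 GB in the second hour.
print(data_processing_fee("internet", [5, 10]))  # 0.075
print(data_processing_fee("private", [5, 10]))   # 0.105
```

For an instance with the Internet + private network type, the source states that the network types are billed separately, so one reasonable reading is to call the function once per network type and add the results.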
Serverless
Request CU fees
A Capacity Unit (CU) is the smallest unit used to measure the resource consumption of a Serverless instance.
CU fees are charged hourly. Usage for less than one hour is billed as one hour. The minimum billing granularity is 1,000 CUs, and usage is rounded up to the nearest 1,000 CUs. If the total CUs used in an hour is less than 100,000, you are billed for 100,000 CUs.
Hourly CU fee = max{CEILING{Actual CUs for the hour / 1,000} × 1,000, 100,000} / 1,000 × CU price (USD per 1,000 CUs)
Billing examples:
If the actual number of CUs used in an hour is 50,200, which is less than 100,000, the billable CU count for that hour is 100,000.
If the actual number of CUs used in an hour is 150,200, which is greater than 100,000, the final billable CU count for that hour is 151,000. This is because the minimum billing granularity is 1,000 CUs.
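The rounding and minimum-charge rules can be sketched in Python as follows. The function and constant names are hypothetical, and the sketch assumes the CU price is quoted per 1,000 CUs, as in the CU price table in this topic.

```python
from decimal import Decimal

MIN_BILLABLE_CUS = 100_000  # hourly minimum charge
GRANULARITY = 1_000         # usage is rounded up to the nearest 1,000 CUs


def billable_cus(actual_cus):
    """Round usage up to the next 1,000 CUs, then apply the hourly minimum."""
    rounded = (actual_cus + GRANULARITY - 1) // GRANULARITY * GRANULARITY
    return max(rounded, MIN_BILLABLE_CUS)


def hourly_cu_fee(actual_cus, price_per_1000_cus):
    """Hourly fee in USD, given a price per 1,000 CUs as a Decimal."""
    return Decimal(billable_cus(actual_cus)) / GRANULARITY * price_per_1000_cus


print(billable_cus(50_200))   # 100000
print(billable_cus(150_200))  # 151000
```

With the Chinese mainland price of 0.0007 USD per 1,000 CUs, `hourly_cu_fee(150_200, Decimal("0.0007"))` evaluates to 151 × 0.0007 = 0.1057 USD.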
CU measurement rules for different APIs
The CUs for a single API call are determined by the CU coefficient of the API type and the payload size. The formula is as follows:
CUs for a single API call = CU coefficient × CEILING{(Request payload + Response payload) / 512 KB}
The CU coefficients for each API type are as follows:
Model API: 10
MCP Server: 5
Agent API: 2
Billing examples:
For a Model API request with a 100 KB request payload and an 800 KB response payload, the CUs for this request are calculated as: 10 × CEILING{(100 + 800) / 512} = 20
For an MCP Server request with a 50 KB request payload and a 300 KB response payload, the CUs for this request are calculated as: 5 × CEILING{(50 + 300) / 512} = 5
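The per-call formula can be checked with a small Python sketch. The dictionary keys and the function name are hypothetical. The coefficients and the 512 KB block size come from the rules above.

```python
import math

# CU coefficients per API type, as listed above.
CU_COEFFICIENT = {
    "model_api": 10,
    "mcp_server": 5,
    "agent_api": 2,
}
BLOCK_KB = 512  # payload is measured in 512 KB blocks, rounded up


def cus_for_call(api_type, request_kb, response_kb):
    """CUs for one call: coefficient times the number of 512 KB blocks."""
    blocks = math.ceil((request_kb + response_kb) / BLOCK_KB)
    return CU_COEFFICIENT[api_type] * blocks


print(cus_for_call("model_api", 100, 800))  # 20
print(cus_for_call("mcp_server", 50, 300))  # 5
```

The sketch applies the formula literally. How a call with an empty payload is metered is not stated here, so that case is left out.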
CU price
Region | Price (USD/1,000 CUs) |
The Chinese mainland (excluding Hong Kong (China), Macao (China), and Taiwan (China)) | 0.0007 |
China (Hong Kong), Japan (Tokyo) | 0.001 |
Singapore, Indonesia (Jakarta), Germany (Frankfurt) | 0.001 |
US (Virginia), US (Silicon Valley) | 0.00085 |
Internet traffic costs
Internet traffic is billed through Cloud Data Transfer (CDT). For more information, see Internet traffic.