API Gateway: Billing details

Last Updated: Sep 30, 2025

This topic describes the billable items and pricing for AI Gateway. It covers two product types: dedicated instances and Serverless.

Billing methods

AI Gateway supports the following billing methods:

  • Pay-as-you-go: This is a postpaid billing method. Charges are calculated hourly, and usage of less than one hour is billed as one full hour. A bill is generated every 24 hours, and fees are automatically deducted from your Alibaba Cloud account balance. The actual billing time may vary.

  • Subscription: This is a prepaid billing method. You are billed monthly, and an annual subscription covers a 12-month period. Fees are automatically deducted from your Alibaba Cloud account balance. The actual billing time may vary.

Billable items

AI Gateway is offered in two types: dedicated instances and Serverless.

  • Dedicated instances: Billable items include instance fees, data processing fees, and Internet traffic costs.

  • Serverless: Billable items include request CU fees and Internet traffic costs.

Dedicated instances

Instance fees

| Region | Instance type | Pay-as-you-go price (USD/hour) | Subscription price (USD/month) |
| --- | --- | --- | --- |
| The Chinese mainland (excluding Hong Kong (China), Macao (China), and Taiwan (China)) | aigw.small.x1 | 1.11069 | 559.65 |
| | aigw.small.x2 | 2.124948 | 1,071.084 |
| | aigw.small.x4 | 4.022018 | 2,026.794 |
| | aigw.medium.x1 | 7.854 | 3,958.64 |
| | aigw.medium.x2 | 15.46776 | 7,795.76 |
| | aigw.medium.x3 | 23.08096 | 11,632.88 |
| | aigw.large.x1 | 29.927352 | 15,083.25 |
| | aigw.large.x2 | 59.619378 | 30,048.018 |
| | aigw.large.x3 | 89.31195 | 45,013.332 |
| China (Hong Kong), Japan (Tokyo) | aigw.small.x1 | 1.666322 | 839.475 |
| | aigw.small.x2 | 3.187422 | 1,606.626 |
| | aigw.small.x4 | 6.033314 | 3,040.191 |
| | aigw.medium.x1 | 11.78128 | 5,937.96 |
| | aigw.medium.x2 | 23.20192 | 11,693.64 |
| | aigw.medium.x3 | 34.62144 | 17,449.32 |
| | aigw.large.x1 | 44.891028 | 22,624.875 |
| | aigw.large.x2 | 89.428794 | 45,072.027 |
| | aigw.large.x3 | 133.968198 | 67,519.998 |
| Singapore, Indonesia (Jakarta), Germany (Frankfurt) | aigw.small.x1 | 1.53258 | 772.317 |
| | aigw.small.x2 | 2.932566 | 1,478.09592 |
| | aigw.small.x4 | 5.55058 | 2,796.97572 |
| | aigw.medium.x1 | 10.8388 | 5,462.9232 |
| | aigw.medium.x2 | 21.34552 | 10,758.1488 |
| | aigw.medium.x3 | 31.85168 | 16,053.3744 |
| | aigw.large.x1 | 41.299986 | 20,814.885 |
| | aigw.large.x2 | 82.274556 | 41,466.26484 |
| | aigw.large.x3 | 123.250764 | 62,118.39816 |
| US (Virginia), US (Silicon Valley) | aigw.small.x1 | 1.332828 | 671.58 |
| | aigw.small.x2 | 2.549708 | 1,285.3008 |
| | aigw.small.x4 | 4.826192 | 2,432.1528 |
| | aigw.medium.x1 | 9.4248 | 4,750.368 |
| | aigw.medium.x2 | 18.5612 | 9,354.912 |
| | aigw.medium.x3 | 27.69704 | 13,959.456 |
| | aigw.large.x1 | 35.912604 | 18,099.9 |
| | aigw.large.x2 | 71.543472 | 36,057.6216 |
| | aigw.large.x3 | 107.17434 | 54,015.9984 |
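To compare the two billing methods for a given instance type, the following minimal Python sketch may help. It uses the listed prices for aigw.small.x1 in the Chinese mainland and assumes that the "less than one hour is billed as one hour" rule applies to the total running time of the instance. The function names are illustrative and are not part of any Alibaba Cloud SDK.

```python
import math

# Prices for aigw.small.x1 in the Chinese mainland, taken from the price list.
PAYG_USD_PER_HOUR = 1.11069
SUBSCRIPTION_USD_PER_MONTH = 559.65


def payg_instance_fee(hours_used: float, price_per_hour: float) -> float:
    """Pay-as-you-go instance fee (assumption: a partial hour counts as a full hour)."""
    billable_hours = math.ceil(hours_used)
    return billable_hours * price_per_hour


def break_even_hours(price_per_hour: float, price_per_month: float) -> int:
    """Smallest number of billed hours at which pay-as-you-go reaches the monthly subscription price."""
    return math.ceil(price_per_month / price_per_hour)


print(payg_instance_fee(2.5, PAYG_USD_PER_HOUR))  # billed as 3 hours
print(break_even_hours(PAYG_USD_PER_HOUR, SUBSCRIPTION_USD_PER_MONTH))  # 504
```

Under these assumptions, an instance that runs for fewer than about 504 hours in a month costs less on pay-as-you-go than on a one-month subscription. Data processing fees and Internet traffic costs are not included in this comparison.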

Data processing fees

When you create an AI Gateway instance, you can select a network type: Internet, private network, or Internet + private network. Data processing for different network types is billed separately. Within a billing cycle, the data processing volume is the total volume of request and response data.

| Region | Network type | Price (USD/GB/hour) |
| --- | --- | --- |
| Public Cloud | Internet | 0.005 |
| | Private network | 0.007 |

Billing formula

  • Internet: Hourly fee = Data processing volume × Price.

    For example, you use a gateway instance for two hours. The instance processes 5 GB of data in the first hour and 10 GB in the second. The total data processing fee for the two hours is 0.005 × 5 + 0.005 × 10 = 0.075 USD.

  • Private network: Hourly fee = Data processing volume × Price.

    For example, you use a gateway instance for two hours. The instance processes 5 GB of data in the first hour and 10 GB in the second. The total data processing fee for the two hours is 0.007 × 5 + 0.007 × 10 = 0.105 USD.
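The two worked examples above can be reproduced with a short Python sketch. The dictionary keys and the function name are illustrative labels chosen for this example, not official identifiers.

```python
# Data processing prices from the price list, in USD per GB.
PRICE_USD_PER_GB = {"internet": 0.005, "private": 0.007}


def data_processing_fee(hourly_volumes_gb, network_type):
    """Sum the hourly fees: each hour's volume (request + response data, in GB) times the price."""
    price = PRICE_USD_PER_GB[network_type]
    return sum(volume_gb * price for volume_gb in hourly_volumes_gb)


# 5 GB in the first hour and 10 GB in the second hour.
print(round(data_processing_fee([5, 10], "internet"), 3))  # 0.075
print(round(data_processing_fee([5, 10], "private"), 3))   # 0.105
```

For an instance that uses the Internet + private network type, the sketch would be called once per network type, because the two are billed separately.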

Serverless

Request CU fees

A Capacity Unit (CU) is the smallest unit used to measure the resource consumption of a Serverless instance.

CU fees are charged hourly. Usage for less than one hour is billed as one hour. The minimum billing granularity is 1,000 CUs, and usage is rounded up to the nearest 1,000 CUs. If the total CUs used in an hour is less than 100,000, you are billed for 100,000 CUs.

Hourly CU fee = max{Actual CUs for the hour rounded up to the nearest 1,000, 100,000} / 1,000 × CU price (USD per 1,000 CUs)

Billing examples:

  • If the actual number of CUs used in an hour is 50,200, which is less than 100,000, the billable CU count for that hour is 100,000.

  • If the actual number of CUs used in an hour is 150,200, which is greater than 100,000, the final billable CU count for that hour is 151,000. This is because the minimum billing granularity is 1,000 CUs.
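The rounding and minimum-charge rules can be expressed as a small Python sketch. The function and constant names are hypothetical; only the two numeric rules come from the text above.

```python
GRANULARITY_CUS = 1_000   # usage is rounded up to the nearest 1,000 CUs
MINIMUM_CUS = 100_000     # hourly minimum billable CU count


def billable_cus(actual_cus: int) -> int:
    """Return the billable CU count for one hour."""
    # Integer ceiling division avoids floating-point rounding issues.
    rounded = -(-actual_cus // GRANULARITY_CUS) * GRANULARITY_CUS
    return max(rounded, MINIMUM_CUS)


print(billable_cus(50_200))   # 100000
print(billable_cus(150_200))  # 151000
```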

CU measurement rules for different APIs

The CUs for a single API call are determined by the CU coefficient of the API type and the payload size. The formula is as follows:

CUs for a single API call = CU coefficient × CEILING{(Request payload + Response payload) / 512 KB}

The CU coefficients for each API type are as follows:

  • Model API: 10

  • MCP Server: 5

  • Agent API: 2

Billing examples:

  1. For a Model API request with a 100 KB request payload and an 800 KB response payload, the CUs for this request are calculated as: 10 × CEILING{(100 + 800) / 512} = 20

  2. For an MCP Server request with a 50 KB request payload and a 300 KB response payload, the CUs for this request are calculated as: 5 × CEILING{(50 + 300) / 512} = 5
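The per-call formula can be checked with the following Python sketch. The dictionary keys are illustrative labels for the three API types, and payload sizes are assumed to be given in KB.

```python
import math

# CU coefficients per API type, as listed above.
CU_COEFFICIENT = {"model_api": 10, "mcp_server": 5, "agent_api": 2}
BLOCK_KB = 512  # payload is counted in blocks of 512 KB, rounded up


def cus_for_call(api_type: str, request_kb: float, response_kb: float) -> int:
    """CUs for a single API call: coefficient x CEILING((request + response) / 512 KB)."""
    blocks = math.ceil((request_kb + response_kb) / BLOCK_KB)
    return CU_COEFFICIENT[api_type] * blocks


print(cus_for_call("model_api", 100, 800))   # 20
print(cus_for_call("mcp_server", 50, 300))   # 5
```

By the same formula, an Agent API call with a combined payload of 1,100 KB would span three 512 KB blocks and consume 2 × 3 = 6 CUs.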

CU price

| Region | Price (USD/1,000 CUs) |
| --- | --- |
| The Chinese mainland (excluding Hong Kong (China), Macao (China), and Taiwan (China)) | 0.0007 |
| China (Hong Kong), Japan (Tokyo) | 0.001 |
| Singapore, Indonesia (Jakarta), Germany (Frankfurt) | 0.001 |
| US (Virginia), US (Silicon Valley) | 0.00085 |
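Combining the regional CU price with the hourly rounding and minimum rules gives an estimate of the hourly Serverless fee. The sketch below is an approximation under the rules stated in this topic. The region keys are hypothetical labels, not official region IDs.

```python
# CU prices in USD per 1,000 CUs, from the price list.
CU_PRICE_USD_PER_1000 = {
    "chinese_mainland": 0.0007,
    "hong_kong_tokyo": 0.001,
    "singapore_jakarta_frankfurt": 0.001,
    "us_virginia_silicon_valley": 0.00085,
}


def hourly_cu_fee(actual_cus: int, region: str) -> float:
    """Estimated CU fee for one hour in the given price region."""
    rounded = -(-actual_cus // 1_000) * 1_000   # round up to the nearest 1,000 CUs
    billable = max(rounded, 100_000)            # apply the 100,000 CU hourly minimum
    return billable / 1_000 * CU_PRICE_USD_PER_1000[region]


print(round(hourly_cu_fee(50_200, "chinese_mainland"), 5))              # 0.07
print(round(hourly_cu_fee(150_200, "us_virginia_silicon_valley"), 5))   # 0.12835
```

Because of the 100,000 CU minimum, the lowest hourly CU fee in the Chinese mainland is 100 × 0.0007 = 0.07 USD, even for an hour with little traffic.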

Internet traffic costs

Internet traffic is billed through Cloud Data Transfer (CDT). For more information, see Internet traffic.