Application Load Balancer (ALB) Extensible Edition supports inbound API key authentication. This feature validates credentials before requests are forwarded to backend Large Language Model (LLM) services, so unauthorized requests are rejected before they reach your AI services.
Solution architecture
An ALB Extensible Edition instance receives client requests, and a forwarding rule matches these requests based on HTTP headers. The API key authentication component, associated with the forwarding rule through a service extension, is executed before the forwarding action. This component extracts the API key from the HTTP request header and compares it with preconfigured credentials. If the key matches, the request is forwarded to the backend AI service. If the key does not match or is not provided, the ALB returns a 401 response and blocks the request.
ALB Extensible Edition instance: Provides load balancing and traffic forwarding capabilities.
AI Service-type server group: Connects to the backend LLM service.
HTTPS listener: Receives client requests.
Forwarding rule: Matches and forwards requests based on HTTP header conditions.
Service extension: Implements inbound authentication and forwarding control using the API key authentication component.
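The authentication decision described above can be sketched in a few lines of Python. This is only an illustrative model of the documented behavior (extract the key, compare it, forward or return 401), not the ALB implementation. The function names and the returned labels are hypothetical, and the constant-time comparison is an assumption about good practice rather than a documented detail.

```python
import hmac
from typing import Mapping, Optional


def extract_bearer_token(headers: Mapping[str, str]) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' header, or None."""
    for name, value in headers.items():
        if name.lower() != "authorization":
            continue
        scheme, _, token = value.strip().partition(" ")
        token = token.strip()
        if scheme.lower() == "bearer" and token:
            return token
        return None
    return None


def decide(headers: Mapping[str, str], expected_key: str) -> str:
    """Model the documented outcome: forward on a match, otherwise reject with 401."""
    token = extract_bearer_token(headers)
    if token is None:
        return "reject-401"  # key not provided
    # Assumption: compare in constant time to avoid leaking key contents via timing.
    if hmac.compare_digest(token.encode(), expected_key.encode()):
        return "forward"
    return "reject-401"  # key does not match
```

A request that carries the configured key yields "forward"; a wrong key, a missing header, or a non-Bearer scheme yields "reject-401".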
Prerequisites
You have access to the ALB Extensible Edition public preview.
You have created a virtual private cloud (VPC) in the Singapore region. You have also created one vSwitch in Singapore Zone A and one in Singapore Zone B, and have enabled Internet SNAT for both vSwitches. This allows the AI Service-type server group to access the public LLM service.
You have activated Alibaba Cloud Model Studio and obtained an API key.
You have registered a custom domain name.
You have a server certificate that matches your custom domain name. If the certificate was not purchased from Alibaba Cloud, you must upload it to Alibaba Cloud Certificate Management Service.
Procedure
1. Create an ALB Extensible Edition instance
Log on to the ALB console, select the Singapore region, and click Create ALB.
On the buy page, perform the following configuration and click Buy Now.
For Region, select Singapore.
For Network Type, select Internet.
VPC and Zone: Select the target VPC, the Singapore Zone A and Singapore Zone B check boxes, and the corresponding vSwitches. Then, select Automatically assign EIP.
IP Version: Select IPv4.
For Edition, select Extensible.
On the Confirm Order page, review the instance configuration and click Activate Now.
2. Create an AI Service-type server group
Create an AI Service-type server group to connect to Alibaba Cloud Model Studio.
On the Server Groups page, click Create Server Group, set Server Group Type to AI Service, enter a name such as sgp-ai-qwen, and click Create.
In the dialog box titled The server group is created, click Add Backend Server.
In the Add AI Service panel, set the following configurations and click OK.
For Model provider, select Alibaba Cloud Model Studio.
Endpoint: This field populates automatically after you select a Model provider.
API Key: Enter your API key from Alibaba Cloud Model Studio.
3. Create a listener
In the ALB console, click the ID of the target instance to open the Instance Details page. On the Listener tab, click Create Listener.
In the Configure Listener step, set Listener Protocol to HTTPS and Listener Port to 443. Then, click Next.
In the Configure SSL Certificate step, select the server certificate that corresponds to your custom domain name, and click Next.
In the Select Server Group step, select the AI Service type and the sgp-ai-qwen server group. Then, click Next.
The server group that you select is used for the listener's default forwarding rule. This rule processes requests that do not match any other forwarding rules. You can change this setting as needed.
In the Configuration Review step, confirm your configuration and click Submit.
4. Create a service extension
Create a service extension and add an API key authentication component to extract the API key from the HTTP header and apply inbound authentication.
On the Service Extensions page, click Create Service Extension. Then, in the Service Extension Configuration section, enter an Extension name such as ext-apikey-auth.
For Extension Type, select the default option, Plug-in. For Component name, select API Key Authentication. Configure the authentication policy and click Create.
Credential Source: The default value is Authorization:Bearer<token>.
The <token> is a placeholder for the API key that the client includes after Authorization: Bearer in the request. ALB extracts and validates the API key from this field.
Generation Method: The default is System.
This topic uses the default values for Timeout and Processing policy: 1000 and Stop. You can adjust these values as needed.
The Credential Source parameter supports multiple methods: Authorization:Bearer<token> (default), Custom HTTP header, Custom Query String, and Custom Cookie. Select the method that meets your business requirements.
The System method automatically generates API key credentials. After the service extension is created, you can view and copy the credentials on its Details page. Alternatively, you can select the Custom method to manually enter an API key.
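To make the four credential sources concrete, the following Python sketch shows where a client-supplied key would sit in a request for each option. It is a hedged illustration built on the standard library only. The source labels ("bearer", "header", "query", "cookie"), the header name X-Api-Key, and the parameter name api_key are hypothetical and do not come from the ALB console.

```python
from http.cookies import SimpleCookie
from typing import Mapping, Optional
from urllib.parse import parse_qs, urlsplit


def extract_credential(source: str, headers: Mapping[str, str],
                       url: str, name: str = "") -> Optional[str]:
    """Return the API key found at the given credential source, or None."""
    lowered = {key.lower(): value for key, value in headers.items()}
    if source == "bearer":  # Authorization: Bearer <token>
        scheme, _, token = lowered.get("authorization", "").partition(" ")
        if scheme.lower() == "bearer" and token.strip():
            return token.strip()
        return None
    if source == "header":  # custom HTTP header
        return lowered.get(name.lower()) or None
    if source == "query":  # custom query string parameter
        values = parse_qs(urlsplit(url).query).get(name)
        return values[0] if values else None
    if source == "cookie":  # custom cookie
        jar = SimpleCookie()
        jar.load(lowered.get("cookie", ""))
        morsel = jar.get(name)
        return morsel.value if morsel is not None else None
    raise ValueError(f"unknown credential source: {source}")
```

Note that the query string variant places the key in the URL itself, which is why it tends to appear in access logs and browser history.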
5. Configure a forwarding rule
Create a forwarding rule on the listener, add an HTTP header condition, and associate the service extension.
In the ALB console, click the ID of the target instance to go to the Instance Details page. On this page, click the Listener tab. Then, click the ID of the target listener to go to the Listener Details page. On this page, click the Forwarding Rules tab.
Click Add New Rule, set the following configurations, and click OK.
Add Condition: Select HTTP Header, then set Key to k and Value to v. k: v is only an example. For production environments, you can configure the HTTP header key-value pair or use other types of forwarding conditions as needed.
Service Extension (Optional): The Use Existing Service Extension option is selected by default. Select ext-apikey-auth from the drop-down list.
Action: Select Forward to and select the AI Service-type server group sgp-ai-qwen.
After the forwarding rule is created, requests that contain the HTTP header k: v match this rule. The service extension extracts the <token> from the Authorization HTTP request header and uses it as an API key for authentication. If authentication is successful, the request is forwarded to the sgp-ai-qwen server group.
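The interaction between the custom rule and the listener's default forwarding rule can be modeled as follows. This Python sketch is a simplified assumption about rule evaluation (first match in priority order, default rule last), not a description of ALB internals. The Rule class, the rule names, and the returned strings are hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Rule:
    name: str
    header: Optional[Tuple[str, str]]  # (key, value) condition; None marks the default rule
    requires_api_key: bool             # True when the service extension is attached
    server_group: str


# Rules in priority order; the default rule catches everything else.
RULES = (
    Rule("rule-k-v", ("k", "v"), True, "sgp-ai-qwen"),
    Rule("default", None, False, "sgp-ai-qwen"),
)


def select_rule(headers: Mapping[str, str], rules: Sequence[Rule]) -> Rule:
    """Return the first rule whose header condition matches the request."""
    lowered = {name.lower(): value for name, value in headers.items()}
    for rule in rules:
        if rule.header is None:
            return rule
        key, value = rule.header
        if lowered.get(key.lower()) == value:
            return rule
    raise LookupError("no rule matched and no default rule is defined")


def route(headers: Mapping[str, str], rules: Sequence[Rule], expected_key: str) -> str:
    """Apply API key authentication only when the matched rule carries the extension."""
    rule = select_rule(headers, rules)
    if rule.requires_api_key:
        lowered = {name.lower(): value for name, value in headers.items()}
        if lowered.get("authorization") != f"Bearer {expected_key}":
            return "401"
    return f"forward:{rule.server_group}"
```

In this model, a request without the k: v header falls through to the default rule, which has no service extension, and is forwarded without authentication. If that is not intended, point the default rule at a different server group or broaden the condition on the authenticated rule.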
6. Configure DNS resolution
Point your custom domain name to the DNS name of the ALB instance using a CNAME record. This allows clients to access the ALB instance through your custom domain name.
This topic uses Alibaba Cloud DNS as an example. If your domain name is not registered with Alibaba Cloud, you must first add the domain name to Alibaba Cloud DNS.
In the ALB console, copy the Domain Name for the target instance.
Log on to the Alibaba Cloud DNS console. For the target domain name, click Settings in the Actions column. On the Settings page, click Add Record.
Add a CNAME record with the following information and click OK.
For Record Type, select CNAME.
Hostname: Enter a domain name prefix, such as ai. For example, if your root domain is example.com, the full domain name becomes ai.example.com.
Query Source and TTL: Use the default values.
Record Value: The DNS name of the ALB instance.
In the Change Resource Record Confirmation dialog box, verify the details and click OK.
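One way to confirm that the CNAME record has taken effect is to check that the custom domain name and the ALB DNS name resolve to at least one common address. The Python sketch below uses only the standard library. The function names are hypothetical, and the resolver parameter exists only so the check can be exercised without network access.

```python
import socket
from typing import Callable, List


def full_domain(prefix: str, root: str) -> str:
    """Join a hostname prefix and a root domain, for example ai + example.com."""
    return f"{prefix}.{root}"


def resolved_addresses(hostname: str, port: int = 443,
                       resolver: Callable = socket.getaddrinfo) -> List[str]:
    """Return the sorted, de-duplicated addresses that a hostname resolves to."""
    try:
        infos = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []  # the name does not resolve (yet)
    return sorted({info[4][0] for info in infos})


def cname_in_effect(custom_domain: str, alb_dns_name: str,
                    resolver: Callable = socket.getaddrinfo) -> bool:
    """True when both names resolve and share at least one address."""
    custom = set(resolved_addresses(custom_domain, resolver=resolver))
    target = set(resolved_addresses(alb_dns_name, resolver=resolver))
    return bool(custom & target)
```

Calling cname_in_effect with your custom domain name and the Domain Name copied from the ALB console returns False until the record has propagated to the resolver you are using.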
7. Verify the configuration
Use the curl command to send a request to verify the API key authentication feature. The request must meet the following conditions:
API key authentication header: The request must include the k: v header to match the forwarding rule that is associated with the service extension.
OpenAI-compatible protocol: The request path must be /v1/completions, /v1/chat/completions, or /v1/embeddings, and the request body must comply with the protocol format.
The domain name ai.example.com in the following test commands is an example. When you perform the test, replace it with the actual domain name that you configured in Step 6. Make sure that the domain name resolution has taken effect.
Request with correct credentials
The request includes the Authorization: Bearer <token> header field, where <token> is the API key that was generated by the system in Step 4.
curl -v \
  -H "k: v" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen-turbo",
    "messages": [
      {
        "role": "user",
        "content": "Who are you"
      }
    ]
  }' \
  https://ai.example.com/v1/chat/completions
A successful request returns an HTTP 200 status code and a response from the AI service:
{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "Hello! I am Qwen, a super-large language model developed by the Tongyi Lab of Alibaba Group..."
      },
      "finish_reason": "stop",
      "index": 0
    }
  ],
  "object": "chat.completion",
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 53,
    "total_tokens": 67
  },
  "model": "qwen-turbo"
}
Request with incorrect credentials or no credentials
Request with incorrect credentials
The request includes the Authorization: Bearer <token> header field, but <token> is not a valid credential.
curl -v \
  -H "k: v" \
  -H "Authorization: Bearer wrong-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen-turbo",
    "messages": [
      {
        "role": "user",
        "content": "Who are you"
      }
    ]
  }' \
  https://ai.example.com/v1/chat/completions
Request with no credentials
The request does not include the Authorization: Bearer <token> header field.
curl -v \
  -H "k: v" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen-turbo",
    "messages": [
      {
        "role": "user",
        "content": "Who are you"
      }
    ]
  }' \
  https://ai.example.com/v1/chat/completions
A failed request returns an HTTP 401 status code, and the response body contains an authentication failure message.
HTTP header:
HTTP/2 401
content-length: 29
content-type: text/plain
date: Wed, 21 Jan 2026 05:58:31 GMT
Response body:
Client authentication failed
Response description:
HTTP status code: 401. This status code indicates that the request was rejected because identity authentication failed.
Response body: The plain text message Client authentication failed, which indicates that the client did not pass authentication.
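The three curl tests above can also be scripted. The following Python sketch builds the same request with the standard library and reports the status code for a valid key, a wrong key, and a missing key. The function names are hypothetical, the base URL and key are placeholders that you must replace, and the k: v header is the example condition from Step 5.

```python
import json
import urllib.error
import urllib.request
from typing import Dict, Optional, Tuple


def build_request(base_url: str, api_key: Optional[str]) -> urllib.request.Request:
    """Build the chat completions request used in the curl examples."""
    headers = {"k": "v", "Content-Type": "application/json"}
    if api_key is not None:
        headers["Authorization"] = f"Bearer {api_key}"
    body = json.dumps({
        "model": "qwen-turbo",
        "messages": [{"role": "user", "content": "Who are you"}],
    }).encode("utf-8")
    # Supplying data makes urllib send the request as a POST.
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions", data=body, headers=headers)


def send(request: urllib.request.Request, timeout: float = 30.0) -> Tuple[int, str]:
    """Send the request and return (status code, body), including for error statuses."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8", "replace")


def run_checks(base_url: str, api_key: str) -> Dict[str, int]:
    """Return the status code observed for each of the three documented cases."""
    return {
        "valid": send(build_request(base_url, api_key))[0],
        "wrong": send(build_request(base_url, "wrong-api-key"))[0],
        "missing": send(build_request(base_url, None))[0],
    }
```

Based on the behavior described in this topic, run_checks("https://ai.example.com", "<token>") should report 200 for the valid key and 401 for the other two cases once DNS resolution has taken effect.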
More information
Billing
ALB Extensible Edition: Currently in public preview and free to use.
Internet data transfer fees: The Internet NAT gateway is charged an instance fee and a capacity unit (CU) fee. The NAT gateway and the elastic IP addresses (EIPs) that are associated with the ALB Extensible Edition instance are billed under separate rules, and EIP fees are charged per EIP.
Domain name and DNS resolution fees: In addition to the domain name fee from your provider, you must pay public authoritative DNS resolution fees if you configure DNS resolution on Alibaba Cloud.
Certificate fees: You must pay server certificate fees if you purchase a certificate from Alibaba Cloud or upload a certificate to Alibaba Cloud.
Model Studio model fees: You are charged for calling Alibaba Cloud Model Studio LLM APIs.
Regions that support ALB Extensible Edition
Area | Region | Zone |
China | China (Ulanqab) | Zone A, Zone B, and Zone C |
China | China (Hangzhou) | Zone J and Zone K |
Asia Pacific | Singapore | Zone A, Zone B, and Zone C |
Europe and Americas | Germany (Frankfurt) | Zone A and Zone B |
Apply in production
API key management: Rotate API keys on a regular basis. Avoid using the same credential for an extended period. If you use the system-generated method, securely store the generated API key to prevent leaks.
Credential source selection: Select a credential source that fits your clients. The default Authorization:Bearer<token> method is OpenAI compatible and suitable for most scenarios. For custom configurations, prefer the Custom HTTP header method over the Custom Query String method, because a key that is passed in the query string is exposed in the URL.
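If you use the Custom generation method and rotate keys yourself, a key can be produced with a cryptographically secure random source. The Python sketch below is one possible approach. The sk- prefix, the 32-byte length, and the 90-day rotation window are assumptions chosen for illustration, and this topic does not state which key formats or lengths the console accepts.

```python
import secrets
from datetime import datetime, timedelta, timezone


def generate_api_key(prefix: str = "sk-", nbytes: int = 32) -> str:
    """Return a random, URL-safe API key built from nbytes of secure randomness."""
    return prefix + secrets.token_urlsafe(nbytes)


def needs_rotation(created_at: datetime, now: datetime,
                   max_age_days: int = 90) -> bool:
    """True when the key is older than the chosen rotation window."""
    return now - created_at > timedelta(days=max_age_days)


# Example: decide whether a key created on 1 January 2026 is due for rotation.
created = datetime(2026, 1, 1, tzinfo=timezone.utc)
due = needs_rotation(created, datetime.now(timezone.utc))
```

Store the generated key in a secrets manager rather than in source code or shell history.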
FAQ
Error: upstream connect error or disconnect/reset before headers. reset reason: connection timeout
This error usually indicates that the backend service is unreachable. Make sure that SNAT is correctly configured for the vSwitch where the ALB instance is located. This ensures that ALB can forward requests to the public Model Studio LLM service.
API key authentication is configured, but requests without credentials succeed
Make sure that the forwarding conditions match the request format and that the forwarding rule has a high enough priority. This ensures that requests that require authentication match the correct forwarding rule.
Make sure that the API key authentication component is correctly added to the service extension and that the extension is associated with the forwarding rule.