Server Load Balancer: Implement inbound API key authentication using an ALB Extensible Edition instance

Last Updated: Mar 20, 2026

Application Load Balancer (ALB) Extensible Edition supports inbound API key authentication. This feature validates client credentials before requests are forwarded to backend Large Language Model (LLM) services, so unauthorized requests are rejected at the load balancer and never reach your AI services.

Solution architecture

An ALB Extensible Edition instance receives client requests, and a forwarding rule matches these requests based on HTTP headers. The API key authentication component, associated with the forwarding rule through a service extension, is executed before the forwarding action. This component extracts the API key from the HTTP request header and compares it with preconfigured credentials. If the key matches, the request is forwarded to the backend AI service. If the key does not match or is not provided, the ALB returns a 401 response and blocks the request.

  • ALB Extensible Edition instance: Provides load balancing and traffic forwarding capabilities.

  • AI Service-type server group: Connects to the backend LLM service.

  • HTTPS listener: Receives client requests.

  • Forwarding rule: Matches and forwards requests based on HTTP header conditions.

  • Service extension: Implements inbound authentication and forwarding control using the API key authentication component.
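
The authentication decision described above can be modeled in a few lines of shell. This is only a conceptual sketch of the documented behavior, not how ALB implements the component. The function name authorize and the key sk-demo are hypothetical.

```shell
#!/bin/sh
# Conceptual model only: NOT the ALB implementation.
# authorize <Authorization header value> <expected API key>
# Prints the outcome the architecture describes: "forward" or "401".
authorize() {
    header_value=$1
    expected_key=$2
    case "$header_value" in
        "Bearer "*) token=${header_value#"Bearer "} ;;  # strip the scheme prefix
        *)          token="" ;;                         # header missing or not a Bearer credential
    esac
    if [ -n "$token" ] && [ "$token" = "$expected_key" ]; then
        echo "forward"
    else
        echo "401"
    fi
}

authorize "Bearer sk-demo" "sk-demo"   # prints: forward
authorize "Bearer wrong"   "sk-demo"   # prints: 401
authorize ""               "sk-demo"   # prints: 401
```

The empty-token check mirrors the statement that a missing key is rejected in the same way as a mismatched key.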


Prerequisites

Procedure

1. Create an ALB Extensible Edition instance

  1. Log on to the ALB console, select the Singapore region, and click Create ALB.

  2. On the buy page, perform the following configuration and click Buy Now.

    • For Region, select Singapore.

    • For Network Type, select Internet.

    • VPC and Zone: Select the target VPC, the Singapore Zone A and Singapore Zone B check boxes, and the corresponding vSwitches. Then, select Automatically assign EIP.

    • IP Version: Select IPv4.

    • For Edition, select Extensible.

  3. On the Confirm Order page, review the instance configuration and click Activate Now.

2. Create an AI Service-type server group

Create an AI Service-type server group to connect to Alibaba Cloud Model Studio.

  1. On the Server Groups page, click Create Server Group, set Server Group Type to AI Service, enter a name such as sgp-ai-qwen, and click Create.

  2. In the dialog box that confirms the server group is created, click Add Backend Server.

  3. In the Add AI Service panel, set the following configurations and click OK.

    • For Model provider, select Alibaba Cloud Model Studio.

    • Endpoint: This field populates automatically after you select a Model provider.

    • API Key: Enter your API key from Alibaba Cloud Model Studio.

3. Create a listener

  1. In the ALB console, click the ID of the target instance to open the Instance Details page. On the Listener tab, click Create Listener.

  2. In the Configure Listener step, set Listener Protocol to HTTPS and Listener Port to 443. Then, click Next.

  3. In the Configure SSL Certificate step, select the server certificate that corresponds to your custom domain name, and click Next.

  4. In the Select Server Group step, select the AI Service type and the sgp-ai-qwen server group. Then, click Next.

    The server group that you select is used for the listener's default forwarding rule. This rule processes requests that do not match any other forwarding rules. You can change this setting as needed.
  5. In the Configuration Review step, confirm your configuration and click Submit.

4. Create a service extension

Create a service extension and add an API key authentication component to extract the API key from the HTTP header and apply inbound authentication.

  1. On the Service Extensions page, click Create Service Extension. Then, in the Service Extension Configuration section, enter an Extension name such as ext-apikey-auth.

  2. For Extension Type, select the default option, Plug-in. For Component name, select API Key Authentication. Configure the authentication policy and click Create.

    • Credential Source: The default value is Authorization:Bearer<token>.

      The <token> is a placeholder for the API key that the client includes after Authorization: Bearer in the request. ALB extracts and validates the API key from this field.
    • Generation Method: The default is System.

    • Timeout and Processing policy: This topic uses the default values, 1000 and Stop. You can adjust these values as needed.

    The Credential Source parameter supports multiple methods: Authorization:Bearer<token> (default), Custom HTTP header, Custom Query String, and Custom Cookie. Select the method that meets your business requirements.
    The System method automatically generates API key credentials. After the service extension is created, you can view and copy the credentials on its Details page. Alternatively, select the Custom method to manually enter an API key.
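
To make the difference between the credential sources concrete, the following sketch shows where the key would sit in a request for each method and how it could be pulled out. It is an illustration only: the parameter and cookie name api_key, the key sk-demo, and all function names are hypothetical, and the sketch does no percent-decoding. The exact matching rules that ALB applies are not described in this topic. With the Custom HTTP header method, the header value itself is the key, so no parsing is shown for it.

```shell
#!/bin/sh
# Illustration only: hypothetical helpers, not ALB's parsing logic.

# from_bearer <Authorization header value>
from_bearer() {
    case "$1" in
        "Bearer "*) printf '%s\n' "${1#"Bearer "}" ;;
        *)          return 1 ;;
    esac
}

# lookup <pairs> <separator> <name>: find name=value in a separated list
lookup() {
    rest=$1; sep=$2; name=$3
    while [ -n "$rest" ]; do
        pair=${rest%%"$sep"*}                  # first name=value segment
        case "$rest" in
            *"$sep"*) rest=${rest#*"$sep"} ;;  # drop the segment just read
            *)        rest="" ;;
        esac
        case "$pair" in
            "$name="*) printf '%s\n' "${pair#"$name="}"; return 0 ;;
        esac
    done
    return 1
}

# Query strings separate pairs with "&", Cookie headers with "; ".
from_query()  { lookup "$1" "&"  "$2"; }
from_cookie() { lookup "$1" "; " "$2"; }

from_bearer "Bearer sk-demo"                          # prints: sk-demo
from_query  "a=1&api_key=sk-demo&b=2"      "api_key"  # prints: sk-demo
from_cookie "session=abc; api_key=sk-demo" "api_key"  # prints: sk-demo
```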

5. Configure a forwarding rule

Create a forwarding rule on the listener, add an HTTP header condition, and associate the service extension.

  1. In the ALB console, click the ID of the target instance to go to the Instance Details page. On this page, click the Listener tab. Then, click the ID of the target listener to go to the Listener Details page. On this page, click the Forwarding Rules tab.

  2. Click Add New Rule, set the following configurations, and click OK.

    • Add Condition: Select HTTP Header, then set Key to k and Value to v.

      k: v is only an example. For production environments, you can configure the HTTP header key-value pair or use other types of forwarding conditions as needed.
    • Service Extension (Optional): The Use Existing Service Extension option is selected by default. Select ext-apikey-auth from the drop-down list.

    • Action: Select Forward to and select the AI Service-type server group sgp-ai-qwen.

After the forwarding rule is created, requests that contain the HTTP header k: v match this rule. The service extension extracts the <token> from the Authorization HTTP request header and uses it as an API key for authentication. If authentication is successful, the request is forwarded to the sgp-ai-qwen server group.
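
The interaction between the custom forwarding rule and the listener's default rule can be sketched as follows. This is a simplified model under the configuration used in this topic, where the default rule also points to sgp-ai-qwen and has no service extension attached. The function name route and the key sk-demo are hypothetical.

```shell
#!/bin/sh
# Simplified model of the rule evaluation in this topic; not ALB code.
# route <value of header k> <Authorization header value> <expected API key>
route() {
    k_value=$1; auth=$2; expected=$3
    if [ "$k_value" != "v" ]; then
        # No custom rule matched, so the listener's default rule applies.
        echo "default rule: forward to sgp-ai-qwen, no authentication"
        return 0
    fi
    # Custom rule matched: the service extension runs before the forward action.
    if [ -n "$expected" ] && [ "$auth" = "Bearer $expected" ]; then
        echo "custom rule: authenticated, forward to sgp-ai-qwen"
    else
        echo "custom rule: 401 Client authentication failed"
    fi
}

route "v" "Bearer sk-demo" "sk-demo"
route "v" ""               "sk-demo"
route ""  ""               "sk-demo"
```

The last call shows why the matching condition matters: a request without the k: v header never reaches the authentication component and is handled by the default rule instead.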

6. Configure DNS resolution

Point your custom domain name to the DNS name of the ALB instance using a CNAME record. This allows clients to access the ALB instance through your custom domain name.

This topic uses Alibaba Cloud DNS as an example. If your domain name is not registered with Alibaba Cloud, you must first add the domain name to Alibaba Cloud DNS.

  1. In the ALB console, copy the Domain Name for the target instance.

  2. Log on to the Alibaba Cloud DNS console. For the target domain name, click Settings in the Actions column. On the Settings page, click Add Record.

  3. Add a CNAME record with the following information and click OK.

    • For Record Type, select CNAME.

    • Hostname: Enter a domain name prefix, such as ai. For example, if your root domain is example.com, the full domain name becomes ai.example.com.

    • Query Source and TTL: Use the default values.

    • Record Value: The DNS name of the ALB instance.

  4. In the Change Resource Record Confirmation dialog box, verify the details and click OK.
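
Before you test authentication, you can check that the CNAME record has taken effect. The following sketch assumes that the dig utility is installed. The function names are hypothetical, and <alb-dns-name> stands for the Domain Name that you copied from the ALB console.

```shell
#!/bin/sh
# Hypothetical helpers for checking the CNAME record; assumes dig is installed.

# normalize_name <dns name>: lowercase the name and strip the trailing dot
normalize_name() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/\.$//'
}

# cname_matches <resolved name> <expected name>
cname_matches() {
    [ "$(normalize_name "$1")" = "$(normalize_name "$2")" ]
}

# check_cname <custom domain> <ALB DNS name>
check_cname() {
    resolved=$(dig +short "$1" CNAME)
    if cname_matches "$resolved" "$2"; then
        echo "ok: $1 points to $2"
    else
        echo "mismatch: $1 resolves to '${resolved}'"
        return 1
    fi
}

# Usage (performs a DNS query):
#   check_cname ai.example.com "<alb-dns-name>"
```

dig prints fully qualified names with a trailing dot, which is why both names are normalized before they are compared.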

7. Verify the configuration

Use the curl command to send a request to verify the API key authentication feature. The request must meet the following conditions:

  • Forwarding rule header: The request must include the k: v header to match the forwarding rule that is associated with the service extension.

  • OpenAI-compatible protocol: The request path must be /v1/completions, /v1/chat/completions, or /v1/embeddings, and the request body must comply with the protocol format.

The domain name ai.example.com in the following test commands is an example. When you perform the test, replace it with the actual domain name that you configured in Step 6. Make sure that the domain name resolution has taken effect.

Request with correct credentials

The request includes the Authorization: Bearer <token> header field, where <token> is the API key that was generated by the system in Step 4.

curl -v \
    -H "k: v" \
    -H "Authorization: Bearer <token>" \
    -H "Content-Type: application/json" \
    -d '{
        "model": "qwen-turbo",
        "messages": [
            {
                "role": "user", 
                "content": "Who are you"
            }
        ]
    }' \
    https://ai.example.com/v1/chat/completions

A successful request returns an HTTP 200 status code and a response from the AI service:

{
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "Hello! I am Qwen, a super-large language model developed by the Tongyi Lab of Alibaba Group..."
            },
            "finish_reason": "stop",
            "index": 0
        }
    ],
    "object": "chat.completion",
    "usage": {
        "prompt_tokens": 14,
        "completion_tokens": 53,
        "total_tokens": 67
    },
    "model": "qwen-turbo"
}

Request with incorrect credentials or no credentials

Request with incorrect credentials

The request includes the Authorization: Bearer <token> header field, but <token> is not a valid credential.

curl -v \
    -H "k: v" \
    -H "Authorization: Bearer wrong-api-key" \
    -H "Content-Type: application/json" \
    -d '{
        "model": "qwen-turbo",
        "messages": [
            {
                "role": "user", 
                "content": "Who are you"
            }
        ]
    }' \
    https://ai.example.com/v1/chat/completions

Request with no credentials

The request does not include the Authorization: Bearer <token> header field.

curl -v \
    -H "k: v" \
    -H "Content-Type: application/json" \
    -d '{
        "model": "qwen-turbo",
        "messages": [
            {
                "role": "user", 
                "content": "Who are you"
            }
        ]
    }' \
    https://ai.example.com/v1/chat/completions

In both cases, the request fails with an HTTP 401 status code, and the response body contains an authentication failure message.

Response headers:

HTTP/2 401 
content-length: 29
content-type: text/plain
date: Wed, 21 Jan 2026 05:58:31 GMT

Response body:

Client authentication failed

Response description:

  • HTTP status code: 401. This status code indicates that the request was rejected due to an identity authentication failure.

  • Response body: The plain text message Client authentication failed, which indicates that ALB could not authenticate the client.
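
The three test requests can be combined into one repeatable check. The following is a minimal sketch, assuming the configuration from this topic. The function names and the ALB_HOST and API_KEY environment variables are hypothetical. It compares only the HTTP status codes and does not inspect the response bodies.

```shell
#!/bin/sh
# Sketch of a repeatable check; ALB_HOST and API_KEY are hypothetical variables.
ALB_HOST=${ALB_HOST:-ai.example.com}

# probe <Authorization header value, may be empty>: prints the HTTP status code
probe() {
    body='{"model":"qwen-turbo","messages":[{"role":"user","content":"Who are you"}]}'
    if [ -n "$1" ]; then
        curl -s -o /dev/null -w '%{http_code}' \
            -H "k: v" -H "Authorization: $1" -H "Content-Type: application/json" \
            -d "$body" "https://$ALB_HOST/v1/chat/completions"
    else
        curl -s -o /dev/null -w '%{http_code}' \
            -H "k: v" -H "Content-Type: application/json" \
            -d "$body" "https://$ALB_HOST/v1/chat/completions"
    fi
}

# expect_status <label> <expected code> <actual code>
expect_status() {
    if [ "$2" = "$3" ]; then
        echo "PASS $1 ($3)"
    else
        echo "FAIL $1: expected $2, got $3"
        return 1
    fi
}

# run_checks: sends the three requests from this step (requires network access)
run_checks() {
    : "${API_KEY:?set API_KEY to the key shown on the service extension Details page}"
    failed=0
    expect_status "valid key" 200 "$(probe "Bearer $API_KEY")"      || failed=1
    expect_status "wrong key" 401 "$(probe "Bearer wrong-api-key")" || failed=1
    expect_status "no key"    401 "$(probe "")"                     || failed=1
    return "$failed"
}
```

To run the check, define the functions in a shell session and call run_checks with API_KEY set. The function returns a non-zero status if any of the three cases does not produce the expected code.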

More information

Billing

Regions that support ALB Extensible Edition

Area                  Region                Zone
China                 China (Ulanqab)       Zone A, Zone B, and Zone C
China                 China (Hangzhou)      Zone J and Zone K
Asia Pacific          Singapore             Zone A, Zone B, and Zone C
Europe and Americas   Germany (Frankfurt)   Zone A and Zone B

Apply in production

  • API key management: Rotate API keys on a regular basis. Avoid using the same credential for an extended period. If you use the system-generated method, securely store the generated API key to prevent leaks.

  • Credential source selection: The default Authorization:Bearer<token> method is OpenAI compatible and suitable for most scenarios. If you need a custom configuration, prefer the Custom HTTP header method over the Custom Query String method, because a query string exposes the API key in the URL.
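
If you rotate keys with the Custom generation method, you need a source of hard-to-guess values. The following sketch generates a 64-character hexadecimal string from the operating system's random device. The function name is hypothetical, and this topic does not state which lengths or characters the Custom method accepts, so treat the format as an assumption and check the console before you rely on it.

```shell
#!/bin/sh
# Sketch: generate a random key for manual entry with the Custom method.
# The 64-character hex format is an assumption, not a documented requirement.
generate_api_key() {
    # Read 32 random bytes and print them as 64 lowercase hex characters.
    dd if=/dev/urandom bs=32 count=1 2>/dev/null | od -An -tx1 | tr -d ' \n'
    echo
}

# Usage:
#   new_key=$(generate_api_key)
```

Store the generated value in a secrets manager rather than in shell history or source control.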

FAQ

Error: upstream connect error or disconnect/reset before headers. reset reason: connection timeout

This error usually indicates that the backend service is unreachable. Make sure that SNAT is correctly configured for the vSwitch where the ALB instance is located. This ensures that ALB can forward requests to the public Model Studio LLM service.

API key authentication is configured, but requests without credentials succeed

  • Make sure that the forwarding conditions match the request format and that the forwarding rule has a high enough priority. This ensures that requests that require authentication match the correct forwarding rule.

  • Make sure that the API key authentication component is correctly added to the service extension and that the extension is associated with the forwarding rule.