Server Load Balancer: Implement inbound JWT authentication with ALB Extensible Edition

Last Updated: Apr 03, 2026

ALB Extensible Edition can validate JSON Web Tokens (JWTs) on inbound requests before it forwards them to backend large language model (LLM) services. Requests that carry no token or an invalid token are rejected, which protects your AI services from unauthorized access.

How it works

An ALB Extensible Edition instance receives client requests, and a forwarding rule matches requests based on the HTTP header. A JWT authentication component, linked to the forwarding rule by a service extension, runs before the forwarding action. The component extracts the JWT from the HTTP request header and validates it using a public key from a remote JWKS server. If the token is valid, ALB forwards the request to the backend AI service. If the token is invalid or missing, ALB immediately returns a 401 response and blocks the request.

  • ALB Extensible Edition instance: Provides load balancing and traffic forwarding capabilities.

  • AI service-type server group: Connects to backend LLM services.

  • HTTPS listener: Receives client requests.

  • Forwarding rule: Matches and forwards requests based on HTTP header conditions.

  • Service extension: Implements inbound JWT authentication and request control by using the JWT authentication component.

  • JWKS server: This procedure uses Nginx to simulate a JWKS server that provides public key information.
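The decision flow described above can be sketched in a few lines of Python. This is only an illustration, not ALB's implementation: the function name authenticate, the Decision type, and the pluggable verify callback are hypothetical, and treating a header without the expected prefix as a missing token is an assumption. The header name, prefix, and response bodies are taken from the defaults and sample responses shown later in this topic.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional


@dataclass
class Decision:
    """Outcome of the authentication step (hypothetical type for this sketch)."""
    forward: bool
    status: Optional[int] = None
    body: str = ""


def authenticate(headers: Mapping[str, str],
                 verify: Callable[[str], bool],
                 header_key: str = "Authorization",
                 value_prefix: str = "Bearer ") -> Decision:
    """Extract the token from the request header and decide whether to forward."""
    # HTTP header names are case-insensitive, so normalize them before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    raw = normalized.get(header_key.lower())

    # Assumption: a header without the expected prefix counts as a missing token.
    if raw is None or not raw.startswith(value_prefix):
        return Decision(False, 401, "Jwt is missing")

    token = raw[len(value_prefix):].strip()
    if not token:
        return Decision(False, 401, "Jwt is missing")

    # 'verify' stands in for signature validation against the JWKS public key.
    if not verify(token):
        return Decision(False, 401, "Jwt verification fails")

    return Decision(True)
```

In this sketch, verify would be replaced by a real signature check against the key published by the JWKS server. A request is forwarded only when every check passes.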

Prerequisites

Procedure

Step 1: Create an ALB Extensible Edition instance

  1. Log on to the ALB console, select the China (Ulanqab) region, and click Create ALB.

  2. On the purchase page, configure the following parameters and click Create Now.

    • Region: Select China (Ulanqab).

    • Network Type: Select Internet.

    • VPC and Zone: Select the target VPC. Select the Ulanqab Zone A and Ulanqab Zone B checkboxes, choose the corresponding vSwitch for each zone, and then select Automatically assign EIP.

    • IP Version: Select IPv4.

    • Edition: Select Extensible.

  3. On the Confirm Order page, review the instance configuration details and click Activate Now.

Step 2: Create an AI service-type server group

Create an AI service-type server group to connect to Alibaba Cloud Model Studio.

  1. Navigate to the Server Groups page, click Create Server Group, set Server Group Type to AI Service, enter a name such as sgp-ai-qwen, and click Create.

  2. In the "The server group is created" dialog box, click Add Backend Server.

  3. In the Add AI Service dialog box, configure the parameters and click OK.

    • Model provider: Select Alibaba Cloud Model Studio.

    • Endpoint: This parameter is automatically populated after you select a Model provider.

    • API Key: Enter the API Key for Alibaba Cloud Model Studio.

Step 3: Create a listener

  1. In the ALB console, click the ID of the target instance to go to the Instance Details page. On the Listener tab, click Create Listener.

  2. In the Configure Listener step, set Listener Protocol to HTTPS and Listener Port to 443. Then, click Next.

  3. In the Configure SSL Certificate step, select the server certificate that matches your custom domain name and click Next.

  4. In the Select Server Group step, select the AI Service type and the server group sgp-ai-qwen. Then, click Next.

    The server group that you select is used for the default forwarding rule of the listener. It processes requests that do not match other forwarding rules. You can change the server group.
  5. In the Configuration Review step, confirm the configuration and click Submit.

Step 4: Deploy a JWKS server

Deploy Nginx on an ECS instance to simulate a JWKS server that provides JWT public key information. Make sure that ALB can access the JWKS server over the VPC.

Generate a JWT key pair and token

Use a Python script to generate a compliant public key JSON file and a JWT for testing.

  1. Log on to the ECS instance and install the required Python libraries. Alibaba Cloud Linux includes Python 3 by default. You only need to install the jwcrypto library:

    sudo pip3 install --upgrade pip
    sudo pip3 install jwcrypto
  2. Create the generation script generate_jwt_jwks.py.

    sudo vim generate_jwt_jwks.py

    Copy the following code to the file:

    import json
    import time
    import sys
    
    # Check for dependencies
    try:
        from jwcrypto import jwk, jwt
    except ImportError:
        print("Error: Library 'jwcrypto' is not installed.")
        print("Please run: pip3 install jwcrypto")
        sys.exit(1)
    
    def main():
        print(">>> Starting to generate JWKS and JWT...\n")
    
        # --- Configuration ---
        KEY_ID = "aliyun-test-kid"            # Key ID
        ISSUER = "http://your-nginx-server-ip" # Replace with actual Nginx IP or domain, or keep default
        SUBJECT = "test-user-001"             # Test user identifier
        
        # --- 1. Generate an RSA key pair (2048-bit) ---
        # Generate a key by using the RS256 algorithm.
        key = jwk.JWK.generate(kty='RSA', size=2048, alg='RS256', use='sig', kid=KEY_ID)
        
        # --- 2. Export the public key and save it as jwks.json ---
        # Nginx serves this file so that ALB can fetch the public key and verify the token.
        public_key = key.export_public(as_dict=True)
        jwks_data = {
            "keys": [public_key]
        }
    
        jwks_filename = "jwks.json"
        with open(jwks_filename, "w") as f:
            json.dump(jwks_data, f, indent=4)
            
        print(f"[Success] JWKS file created: ./{jwks_filename}")
        print("       Next step: Copy this file to the Nginx directory.")
    
        # --- 3. Generate a signed JWT (by using the private key) ---
        claims = {
            "sub": SUBJECT,
            "iss": ISSUER,
            "name": "Aliyun Doc User",
            "role": "admin",
            "iat": int(time.time()),            
            # Set expiration to 100 years to avoid token expiry during testing.
            "exp": int(time.time()) + 3600 * 24 * 365 * 100 
        }
    
        # Create and sign the token.
        token = jwt.JWT(
            header={"alg": "RS256", "kid": KEY_ID, "typ": "JWT"},
            claims=claims
        )
        token.make_signed_token(key)
        
        print("\n[Success] Test JWT generated (valid for 100 years):")
        print("-" * 60)
        print(token.serialize())
        print("-" * 60)
        print("Action: Use this token for API request tests.")
    
    if __name__ == "__main__":
        main()
  3. Run the script to generate the file:

    sudo python3 generate_jwt_jwks.py

    Note: Record the token string from the script output. This string, which starts with eyJ..., is required for later verification.
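Before you use the token, it can help to check what it contains. The following sketch decodes the header and claims of a JWT with the Python standard library only. It does not verify the signature, so use it for inspection, not for authentication. The function names b64url_decode and inspect_jwt are hypothetical helpers introduced for this example.

```python
import base64
import json
import time


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def inspect_jwt(token: str) -> dict:
    """Return the decoded header and claims of a JWT without verifying it."""
    header_b64, payload_b64, _signature = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    claims = json.loads(b64url_decode(payload_b64))

    # A token counts as expired only if it carries an 'exp' claim in the past.
    exp = claims.get("exp")
    expired = exp is not None and exp < time.time()
    return {"header": header, "claims": claims, "expired": expired}


# Example usage (replace the placeholder with the token printed by the script):
# info = inspect_jwt("<Token_String>")
# print(json.dumps(info, indent=4))
```

For a token produced by the script above, the header should show the kid aliyun-test-kid and the alg RS256, and the claims should include the sub, iss, iat, and exp values set in the script.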

Install and configure Nginx

Install Nginx on the ECS instance and configure it to host the jwks.json file generated in the previous step.

  1. Install Nginx:

    sudo yum install -y nginx
    sudo systemctl start nginx
    sudo systemctl enable nginx
  2. Prepare a directory and copy the generated public key JSON file to it:

    # Create the directory.
    sudo mkdir -p /usr/share/nginx/html/auth
    
    # Copy the jwks.json file.
    sudo cp jwks.json /usr/share/nginx/html/auth/jwks.json
    
    # Grant read permission.
    sudo chmod 644 /usr/share/nginx/html/auth/jwks.json
  3. In the /etc/nginx/default.d/ directory, create a configuration file named jwks.conf:

    sudo vim /etc/nginx/default.d/jwks.conf

    Add the following configuration:

    location /auth/v1 {
        # Disable the cache to ensure that the client always obtains the latest public key configuration.
        add_header Cache-Control "no-store, no-cache, must-revalidate";
        
        # Set the correct response type.
        default_type application/json;
        
        # Use an alias to point directly to the file.
        alias /usr/share/nginx/html/auth/jwks.json;
    }
  4. Verify the Nginx configuration and reload it:

    sudo nginx -t
    sudo systemctl reload nginx
  5. Test the JWKS server:

    curl http://localhost:<port>/auth/v1

    Replace <port> with the port that Nginx listens on (80 by default). A successful response returns JSON data that contains the RSA public key (alg: RS256), which indicates that the JWKS service is deployed successfully.

  6. Record the private IP address of the ECS instance. You will use it to configure the service extension.
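As an optional sanity check, you can validate the structure of the JWKS document before you point ALB at it. The following is a minimal sketch under stated assumptions: check_jwks is a hypothetical helper, and the list of required fields reflects what the generation script in this topic emits for an RSA key, not a complete JWKS validator. The check for the d member guards against publishing the RSA private exponent by mistake.

```python
import json

# Fields that the generation script in this topic emits for an RSA public key.
REQUIRED_RSA_FIELDS = ("kty", "kid", "n", "e")


def check_jwks(document: dict) -> list:
    """Return a list of problems found in a JWKS document (empty if none)."""
    keys = document.get("keys")
    if not isinstance(keys, list) or not keys:
        return ["'keys' must be a non-empty list"]

    problems = []
    for index, key in enumerate(keys):
        for field in REQUIRED_RSA_FIELDS:
            if field not in key:
                problems.append(f"keys[{index}] is missing '{field}'")
        if key.get("kty") != "RSA":
            problems.append(f"keys[{index}] is not an RSA key")
        # 'd' is the RSA private exponent and must never be published.
        if "d" in key:
            problems.append(f"keys[{index}] contains private parameter 'd'")
    return problems


# Example usage against the file served by Nginx:
# with open("/usr/share/nginx/html/auth/jwks.json") as f:
#     print(check_jwks(json.load(f)) or "JWKS looks structurally valid")
```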

Step 5: Create a service extension

Create a service extension, add the JWT authentication component, and configure the remote JWKS server address and JWT extraction method.

  1. Navigate to the Service Extensions page and click Create Service Extension. In the Service Extension Configuration section, enter an Extension name such as jwt-auth-extension.

  2. Extension Type is set to Plug-in by default. Select JWT Authentication from the Component name drop-down list. Configure the following parameters and click Create.

    • Remote Service: Enter the address of the JWKS server in the http://<ECS_private_IP>:<port>/auth/v1 format. Example: http://172.16.11.132:80/auth/v1.

    • Cache Time: The period for which the JWKS public key is cached. Use the default value of 300 seconds.

    • JWKS Token Configuration:

      • Type: Select By HTTP Header.

      • Key: Use the default value Authorization.

      • Value Prefix: Use the default value Bearer followed by a single space.

    • Timeout and Processing policy: Keep the default values 1000 and Terminate. You can modify these values.
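To illustrate what the Cache Time setting implies, the following sketch models a time-based cache for the JWKS document. It is not ALB's implementation: the JwksCache class and its fetch and clock parameters are hypothetical, and the only value taken from this topic is the 300-second default.

```python
import time
from typing import Callable, Optional


class JwksCache:
    """Cache a fetched JWKS document and refresh it after a fixed period."""

    def __init__(self,
                 fetch: Callable[[], dict],
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch          # Callable that retrieves the JWKS document.
        self._ttl = ttl_seconds      # Cache period, 300 seconds by default.
        self._clock = clock          # Injectable clock, which simplifies testing.
        self._value: Optional[dict] = None
        self._expires_at = 0.0

    def get(self) -> dict:
        """Return the cached document, fetching it again once the period elapses."""
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value
```

Under this model, a key that you replace on the JWKS server might not be picked up until the cache period elapses, so plan key rotation with the cache time in mind.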

Step 6: Configure a forwarding rule

Create a forwarding rule for the listener, add an HTTP header condition, and then associate the service extension with the forwarding rule.

  1. In the ALB console, click the ID of the target instance to go to the Instance Details page. Click the Listener tab, click the ID of the target listener to go to the Listener Details page, and then click the Forwarding Rules tab.

  2. Click Add New Rule, configure the following parameters, and then click OK.

    • Add Condition: Select HTTP Header, set Key to k, and set Value to v.

      The key-value pair k: v is for demonstration purposes. In a production environment, you can configure custom HTTP header key-value pairs or use other types of forwarding conditions.
    • Service Extension (Optional): Use Existing Service Extension is selected by default. Select jwt-auth-extension from the drop-down list.

    • Action: Select Forward and the AI service-type server group sgp-ai-qwen.

After you create the forwarding rule, requests that contain the HTTP header k: v match the rule. The service extension extracts the JWT from the Authorization HTTP header, retrieves the public key from the remote JWKS server to validate the token, and then forwards the request to the sgp-ai-qwen server group if the token is valid.
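The following sketch models how a request is matched against forwarding rules, and why a request without the k: v header never reaches the service extension. It is a simplified illustration, not ALB's matching logic: the Rule type and match_rule function are hypothetical, and the sketch assumes that rules are evaluated in priority order with the listener's default rule as the fallback.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Mapping, Optional


@dataclass
class Rule:
    """A forwarding rule with optional header conditions (hypothetical type)."""
    name: str
    required_headers: Dict[str, str] = field(default_factory=dict)
    service_extension: Optional[str] = None


def match_rule(headers: Mapping[str, str],
               rules: List[Rule],
               default_rule: Rule) -> Rule:
    """Return the first rule whose header conditions all match, else the default."""
    # HTTP header names are case-insensitive, so normalize them before comparing.
    normalized = {name.lower(): value for name, value in headers.items()}
    for rule in rules:  # Assumed to be ordered by priority, highest first.
        if all(normalized.get(name.lower()) == value
               for name, value in rule.required_headers.items()):
            return rule
    return default_rule


jwt_rule = Rule("jwt-rule", {"k": "v"}, service_extension="jwt-auth-extension")
default_rule = Rule("default")

# A request with "k: v" matches the rule that carries the service extension.
print(match_rule({"k": "v"}, [jwt_rule], default_rule).service_extension)
# A request without the header falls through to the default rule.
print(match_rule({}, [jwt_rule], default_rule).service_extension)
```

In this model, the second request is handled by the default rule, which has no service extension attached, so no JWT check is performed for it.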

Step 7: Configure DNS resolution

Add a CNAME record to map your domain name to the DNS name of the ALB instance. This lets clients access the instance through your domain name.

This topic uses Alibaba Cloud DNS as an example. If your domain name is not registered with Alibaba Cloud, you must first add the domain name to the Alibaba Cloud DNS console.

  1. In the ALB console, copy the Domain Name of the target instance.

  2. Log on to the Alibaba Cloud DNS console. In the domain name list, find the domain name that you want to manage and click Settings in the Actions column. On the Settings page, click Add Record.

  3. Add a CNAME record with the following information and click OK.

    • Record Type: Select CNAME.

    • Hostname: Enter a prefix such as ai. If your root domain name is example.com, the domain name used to access ALB is ai.example.com.

    • Query Source and TTL: Use the default values.

    • Record Value: Enter the DNS name of the ALB instance.

  4. In the Change Resource Record Confirmation dialog box that appears, confirm the DNS record information and click OK.

Step 8: Test and verify

Use the curl command to send a request and verify the JWT authentication feature. The request must meet the following conditions:

  • Header to match the forwarding rule: The request must contain the k: v header to match the forwarding rule associated with the service extension.

  • Comply with the OpenAI-compatible protocol: The request path must be /v1/completions, /v1/chat/completions, or /v1/embeddings, and the request format must comply with the protocol.

The domain name ai.example.com in the following commands is an example. When testing, replace it with the domain name you configured in Step 7. Make sure that the DNS record has taken effect.

Request with a valid JWT

The request contains the Authorization: Bearer <token> header, where <token> is the JWT generated in Step 4.1.

# Replace <Token_String> with the actual token from the script output in Step 4.1.
token="<Token_String>"

curl -v \
    -H "k: v" \
    -H "Authorization: Bearer $token" \
    -H "Content-Type: application/json" \
    -d '{
        "model": "qwen-turbo",
        "messages": [
            {
                "role": "user", 
                "content": "Who are you"
            }
        ]
    }' \
    https://ai.example.com/v1/chat/completions

A successful request returns the HTTP 200 status code and the response from the AI service:

{
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "content": "I am Qwen, a large-scale language model developed by the Tongyi Lab of Alibaba Group. I can answer questions, create text, perform logical reasoning and programming, and complete various other tasks. You can ask me any question, and I will do my best to help.",
                "role": "assistant"
            }
        }
    ],
    "created": 1767613123,
    "id": "chatcmpl-01fb9300-df1d-98d1-9f5d-874fddebf13f",
    "model": "qwen-turbo",
    "object": "chat.completion",
    "usage": {
        "completion_tokens": 47,
        "prompt_tokens": 14,
        "prompt_tokens_details": {
            "cached_tokens": 0
        },
        "total_tokens": 61
    }
}

Request with an invalid JWT or without a JWT

Request with an invalid JWT

The request contains the Authorization: Bearer <token> header, but <token> is not a valid JWT.

token="wrong-jwt-token"

curl -v \
    -H "k: v" \
    -H "Authorization: Bearer $token" \
    -H "Content-Type: application/json" \
    -d '{
        "model": "qwen-turbo",
        "messages": [
            {
                "role": "user", 
                "content": "Who are you"
            }
        ]
    }' \
    https://ai.example.com/v1/chat/completions

A failed request returns an HTTP 401 status code, and the response body contains the reason for the authentication failure.

HTTP response headers:

HTTP/2 401 
www-authenticate: Bearer realm="https://ai.example.com/v1/chat/completions", error="invalid_token"
content-length: 79
content-type: text/plain
vary: Accept-Encoding
date: Wed, 21 Jan 2026 06:57:46 GMT

HTTP response body (example):

Jwt verification fails

Request without a JWT

The request does not contain the Authorization: Bearer <token> header.

curl -v \
    -H "k: v" \
    -H "Content-Type: application/json" \
    -d '{
        "model": "qwen-turbo",
        "messages": [
            {
                "role": "user", 
                "content": "Who are you"
            }
        ]
    }' \
    https://ai.example.com/v1/chat/completions

A failed request returns the HTTP 401 status code and the response body indicates that the JWT is missing.

HTTP response headers:

HTTP/2 401 
www-authenticate: Bearer realm="https://ai.example.com/v1/chat/completions"
content-length: 14
content-type: text/plain
date: Wed, 21 Jan 2026 07:00:10 GMT

HTTP response body:

Jwt is missing
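If you prefer to script these tests, the three curl requests above can be reproduced with the Python standard library. The following is a sketch, not an official client: build_chat_request and send are hypothetical helper names, and ai.example.com is the same placeholder domain used in the curl commands.

```python
import json
import urllib.error
import urllib.request
from typing import Optional, Tuple


def build_chat_request(base_url: str,
                       token: Optional[str] = None) -> urllib.request.Request:
    """Build the test request; omit the Authorization header when token is None."""
    headers = {"k": "v", "Content-Type": "application/json"}
    if token is not None:
        headers["Authorization"] = f"Bearer {token}"
    body = json.dumps({
        "model": "qwen-turbo",
        "messages": [{"role": "user", "content": "Who are you"}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body, headers=headers, method="POST")


def send(request: urllib.request.Request) -> Tuple[int, str]:
    """Send the request and return (status code, response body)."""
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.status, response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as error:
        # Non-2xx responses such as 401 are raised as HTTPError.
        return error.code, error.read().decode("utf-8", "replace")


# Example usage (requires network access and a working deployment):
# print(send(build_chat_request("https://ai.example.com", "<Token_String>")))
# print(send(build_chat_request("https://ai.example.com", "wrong-jwt-token")))
# print(send(build_chat_request("https://ai.example.com")))
```

Based on the sample responses above, the first call is expected to return 200 and the other two are expected to return 401.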

More information

Billing

  • ALB Extensible Edition: This edition is in public preview and is free of charge.

  • Internet access fees: An Internet NAT gateway charges instance fees and capacity unit (CU) fees. The Elastic IP Addresses (EIPs) that are associated with the Internet NAT gateway and the ALB Extensible Edition instance have independent billing rules. You are charged separately for the EIPs.

  • ECS instance: For more information, see ECS billing overview. If you create an ECS instance for testing, we recommend that you create a low-specification pay-as-you-go instance and release it after the test is complete.

  • Domain name and DNS resolution fees: In addition to the fees that you pay to the domain name provider, you are charged public authoritative DNS fees when you configure DNS resolution on Alibaba Cloud.

  • Certificate fees: You are charged server certificate fees when you purchase a certificate from Alibaba Cloud or upload a certificate to Alibaba Cloud.

  • Model Studio model fees: You are charged for calling Model Studio LLM APIs.

Supported regions

Area              | Region              | Availability zone
------------------|---------------------|-----------------------
China             | China (Ulanqab)     | Zone A, Zone B, Zone C
China             | China (Hangzhou)    | Zone J, Zone K
China             | China (Beijing)     | Zone K, Zone L
Asia Pacific      | Singapore           | Zone A, Zone B, Zone C
Europe & Americas | Germany (Frankfurt) | Zone A, Zone B

Production recommendations

  • Upgrade the authentication solution: This topic uses Nginx to simulate a JWKS server and uses a fixed token for authentication. In a production environment, we recommend that you deploy a standard identity provider (IdP) that provides a standard JWKS endpoint. The IdP can centrally manage key rotation and token issuance to improve security.

  • High-availability deployment: Deploy the authentication service across multiple availability zones or in a cluster to prevent authentication failures or business interruptions that are caused by single points of failure (SPOFs).

FAQ

Troubleshooting the "Jwks remote fetch is failed" error

This error means ALB could not retrieve the public key from the remote JWKS server. You can perform the following steps to troubleshoot the issue:

  • Ensure the remote service URL configured in the service extension is correct and uses the private IP address of the ECS instance.

  • Ensure the ALB instance and the ECS instance are in the same VPC and can communicate with each other.

  • Ensure the ECS instance's security group and its system firewall allow traffic from the vSwitch CIDR blocks to the Nginx service port.

  • Ensure the JWKS server is running as expected. You can run the curl command on the ECS instance to perform a local test.
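The first check in the list above can be partly automated. The following sketch inspects a remote service URL and reports whether it follows the http://<ECS_private_IP>:<port>/auth/v1 format used in this topic. The helper name check_remote_service_url is hypothetical, and the sketch checks only the URL format, not network reachability.

```python
import ipaddress
from urllib.parse import urlsplit


def check_remote_service_url(url: str, expected_path: str = "/auth/v1") -> list:
    """Return a list of problems found in the remote service URL (empty if none)."""
    problems = []
    parts = urlsplit(url)

    if parts.scheme != "http":
        problems.append("scheme is not http")

    try:
        address = ipaddress.ip_address(parts.hostname or "")
        # Private ranges include 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16.
        if not address.is_private:
            problems.append("host is not a private IP address")
    except ValueError:
        problems.append("host is not an IP literal")

    if parts.path != expected_path:
        problems.append(f"path is not {expected_path}")
    return problems


# The example address from Step 5 passes every check.
print(check_remote_service_url("http://172.16.11.132:80/auth/v1"))
```

An empty list means that the URL matches the expected format. You still need to confirm connectivity, security group rules, and the Nginx service itself as described above.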

Successful requests without tokens

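  • Ensure the forwarding condition matches the actual request, and that the forwarding rule has a higher priority than any other rule the request could match. Otherwise, requests that require authentication are handled by a different rule, such as the default forwarding rule, and are not checked by the service extension.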

  • Ensure the JWT authentication component is correctly added to the service extension and is associated with the forwarding rule.