Designing Production API Infrastructure with Alibaba Cloud API Gateway

This article examines how Alibaba Cloud API Gateway provides a managed layer for exposing, securing, and governing backend services and the architectu...

Most modern application systems are not monolithic. They comprise distributed services, containers, managed compute resources, and third-party integrations, each with distinct requirements around authentication, traffic, release schedules, and protocols. Exposing such a system directly to clients makes it fragile: a change to any part of the backend can break existing client integrations, each consumer ends up implementing its own authentication method, and traffic anomalies hit service components that are not equipped to handle them.

This is where the API gateway pattern comes in. A gateway is an intermediary that handles these cross-cutting concerns in a managed layer. Alibaba Cloud API Gateway follows this pattern and offers managed infrastructure for API definition, authentication, traffic control, backend routing, and monitoring. This article outlines how the service works, which configuration choices need to be made, and which operational factors should be considered during deployment.

An API gateway processes each incoming request through a series of stages. Each stage is configurable and can be changed independently of the backend services.

Layer 1: API Definition and Request Routing

API Gateway organises APIs into Groups. A Group is a collection of endpoints that share the same hostname, TLS certificate settings, and environment stages. Within a Group, each API is defined by an HTTP method, a request path, parameter schemas, and a backend. Path parameters are written as {parameter} in the request path and are extracted as named values that backend services and downstream plugins can use.
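To make the path-parameter extraction concrete, here is a minimal sketch of matching a request path against a {parameter} template. It is an illustration only, not the gateway's implementation; the function names compile_template and match_path and the example paths are hypothetical.

```python
import re
from typing import Optional


def compile_template(template: str) -> "re.Pattern":
    """Turn '/orders/{orderId}' into a regex with one named group per parameter."""
    # re.split with a capture group alternates literal text and parameter names.
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)          # literal path text
        else:
            pattern += f"(?P<{part}>[^/]+)"     # one path segment per parameter
    return re.compile(f"^{pattern}$")


def match_path(template: str, path: str) -> Optional[dict]:
    """Return the extracted path parameters, or None if the path does not match."""
    match = compile_template(template).match(path)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(match_path("/orders/{orderId}/items/{itemId}", "/orders/42/items/7"))
```

The sketch assumes parameter names are valid Python identifiers and that a parameter never spans more than one path segment.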

Every API can be published to three distinct stages: RELEASE, PRE, and TEST, corresponding to the production, staging, and testing environments. Stage variables allow endpoint addresses, timeouts, and parameters to differ between stages. This is how a single API definition can route to different Function Compute, ECS, or VPC endpoints depending on the stage.
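The idea of one API definition resolving to different backends per stage can be sketched as a simple lookup. The variable names (backend_host, timeout_ms), the hostnames, and the resolve_backend helper are assumptions made for illustration; the real stage variables are configured in the gateway, not in application code.

```python
# Hypothetical stage variables: one entry per stage named in the text.
STAGE_VARIABLES = {
    "RELEASE": {"backend_host": "orders.prod.internal.example", "timeout_ms": 5000},
    "PRE": {"backend_host": "orders.staging.internal.example", "timeout_ms": 8000},
    "TEST": {"backend_host": "orders.test.internal.example", "timeout_ms": 9000},
}


def resolve_backend(stage: str, path: str) -> tuple:
    """Return (backend_url, timeout_ms) for a single API definition in a given stage."""
    try:
        variables = STAGE_VARIABLES[stage]
    except KeyError:
        raise ValueError(f"unknown stage: {stage}") from None
    return f"https://{variables['backend_host']}{path}", variables["timeout_ms"]


if __name__ == "__main__":
    for stage in ("RELEASE", "PRE", "TEST"):
        print(stage, resolve_backend(stage, "/orders/42"))
```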

Custom domain names can be bound to API Groups, and TLS 1.2 is required. SNI (Server Name Indication) is used when multiple branded domains share the same gateway endpoint, and certificates are supplied either by file upload or through the certificate management service.

Layer 2: Authentication and Authorisation

API Gateway supports four authentication models that can be selected per API: Alibaba Cloud APP authentication, OpenID Connect, JWT, and anonymous (no authentication). Selection is determined by the consumer identity model rather than by technical preference. Different APIs within a single Group can use different mechanisms.

Alibaba Cloud APP authentication is the default model for service-to-service and first-party application access. It uses an AppKey and AppSecret credential pair, with the AppSecret used to sign the request via HMAC-SHA256. The canonical signing string covers the HTTP method, accept and content-type headers, content hash, request timestamp, a selected subset of custom headers, and the path with query parameters. This breadth of coverage means that tampering with any signed element invalidates the signature. Timestamp validation rejects requests outside a configurable window, which defaults to 15 minutes, limiting the practical replay window even if a signature is captured in transit.
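The signing scheme described above can be sketched with the Python standard library. This is a simplified illustration, not the gateway's exact canonical format: the field order, the newline separator, the use of SHA-256 for the content hash, and the X-Request-Id header are all assumptions. Real clients should follow the official signature documentation or use an official SDK.

```python
import base64
import hashlib
import hmac

DEFAULT_WINDOW_MS = 15 * 60 * 1000  # 15-minute default window mentioned in the text


def build_string_to_sign(method, accept, content_type, body, timestamp_ms,
                         signed_headers, path_and_query):
    """Join the signed elements into one canonical string (layout is an assumption)."""
    content_hash = base64.b64encode(hashlib.sha256(body).digest()).decode()
    header_lines = [f"{name.lower()}:{value}"
                    for name, value in sorted(signed_headers.items())]
    return "\n".join([method.upper(), accept, content_type, content_hash,
                      str(timestamp_ms), *header_lines, path_and_query])


def sign(app_secret, string_to_sign):
    """HMAC-SHA256 over the canonical string, keyed with the AppSecret."""
    digest = hmac.new(app_secret.encode(), string_to_sign.encode(),
                      hashlib.sha256).digest()
    return base64.b64encode(digest).decode()


def verify(app_secret, string_to_sign, signature, timestamp_ms, now_ms,
           window_ms=DEFAULT_WINDOW_MS):
    """Reject stale timestamps first, then compare signatures in constant time."""
    if abs(now_ms - timestamp_ms) > window_ms:
        return False
    return hmac.compare_digest(sign(app_secret, string_to_sign), signature)
```

Because the path and query string are part of the signed string, changing a query parameter after signing produces a different string and the comparison fails.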

JWT authentication is the model of choice for browser-based and mobile consumer applications. API Gateway validates the token signature against a configured JSON Web Key Set, checks standard claims (iss, aud, exp, nbf), and extracts custom claims into context variables available to backend services and plugins. The JWKS endpoint is fetched and cached, with cache TTL configurable to balance key rotation responsiveness against the load placed on the identity provider.
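The standard claim checks named above (iss, aud, exp, nbf) can be sketched as follows. The sketch deliberately omits signature verification against the JWKS, which must happen before any claim is trusted; the validate_claims function, its leeway parameter, and the example issuer and audience values are hypothetical.

```python
import time
from typing import Optional


def validate_claims(claims: dict, issuer: str, audience: str,
                    now: Optional[float] = None, leeway: int = 0) -> list:
    """Return the names of the claims that failed validation (empty list = valid).

    Assumes the token signature has ALREADY been verified elsewhere.
    """
    now = time.time() if now is None else now
    errors = []
    if claims.get("iss") != issuer:
        errors.append("iss")
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]  # aud may be a string or a list
    if audience not in audiences:
        errors.append("aud")
    exp = claims.get("exp")
    if exp is None or now >= exp + leeway:               # expired, or no expiry at all
        errors.append("exp")
    nbf = claims.get("nbf")
    if nbf is not None and now < nbf - leeway:           # not yet valid
        errors.append("nbf")
    return errors
```

Treating a missing exp claim as a failure is a design choice of this sketch; whether the gateway does the same is not stated in the text.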

OpenID Connect extends JWT validation with full provider discovery and is appropriate where the identity provider exposes a standards-compliant discovery document. Anonymous access is reserved for genuinely public endpoints, typically health checks, read-only metadata endpoints, or webhooks that perform their own signature verification at the backend.

Layer 3: Traffic Management and Plugin Chain

Traffic control policies are applied to APIs through the plugin mechanism, with throttling, IP access control, CORS, and request transformation each implemented as configurable plugins bound to one or more APIs. Plugin execution order within the request lifecycle is fixed by the gateway runtime: authentication runs before throttling, throttling before transformation, and transformation before backend routing. The order is not user-configurable, which ensures consistent enforcement semantics across deployments.
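A fixed-order chain of this kind can be modelled as an ordered list of stages in which the first rejection short-circuits the rest. The stage functions, request fields (app_key, over_limit), and status codes below are hypothetical stand-ins chosen only to show the ordering.

```python
from typing import Callable, List, Optional, Tuple

Stage = Callable[[dict], Optional[dict]]  # returns a rejection, or None to continue


def authenticate(request: dict) -> Optional[dict]:
    return None if request.get("app_key") else {"status": 401}


def throttle(request: dict) -> Optional[dict]:
    return {"status": 429} if request.get("over_limit") else None


def transform(request: dict) -> Optional[dict]:
    # Example transformation: strip a trailing slash before routing.
    request["path"] = request.get("path", "/").rstrip("/") or "/"
    return None


# The order is fixed here, mirroring the sequence described in the text.
FIXED_CHAIN: List[Tuple[str, Stage]] = [
    ("authentication", authenticate),
    ("throttling", throttle),
    ("transformation", transform),
]


def handle(request: dict) -> dict:
    """Run each stage in order; the first rejection ends processing."""
    for name, stage in FIXED_CHAIN:
        rejection = stage(request)
        if rejection is not None:
            return {"rejected_by": name, **rejection}
    return {"status": 200, "routed_path": request["path"]}
```

An unauthenticated request that would also exceed its quota is rejected by the authentication stage, so it never consumes throttling budget.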

Throttling plugins enforce QPS limits at three granularities: per-API (protecting the backend from total request volume), per-AppKey (preventing a single consumer from saturating shared capacity), and per-source-IP (defending against unauthenticated traffic spikes). Limits at all three levels can be configured simultaneously, with the most restrictive value applied. When a limit is exceeded, the gateway returns HTTP 429 with a Retry-After header indicating the cooldown window.
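The three-level check can be sketched with a fixed-window counter, where a request must pass the per-API, per-AppKey, and per-source-IP limits to be admitted. The gateway's actual algorithm is not described in the text, so the fixed-window approach, the FixedWindowThrottle class, and the Retry-After calculation are assumptions for illustration.

```python
import math
from collections import defaultdict


class FixedWindowThrottle:
    """Counts admitted requests per (scope, identity, window)."""

    def __init__(self, api_qps: int, app_qps: int, ip_qps: int, window_s: float = 1.0):
        self.limits = {"api": api_qps, "app": app_qps, "ip": ip_qps}
        self.window_s = window_s
        self.counts = defaultdict(int)  # old windows are never pruned in this sketch

    def check(self, api: str, app_key: str, source_ip: str, now: float):
        """Return (status, headers); 429 if any of the three limits is exhausted."""
        window = int(now // self.window_s)
        identities = {"api": api, "app": app_key, "ip": source_ip}
        for scope, identity in identities.items():
            if self.counts[(scope, identity, window)] >= self.limits[scope]:
                # Seconds until the next window opens, rounded up to at least 1.
                retry_after = math.ceil((window + 1) * self.window_s - now)
                return 429, {"Retry-After": str(max(retry_after, 1))}
        for scope, identity in identities.items():
            self.counts[(scope, identity, window)] += 1
        return 200, {}
```

Only admitted requests are counted, so a consumer that is already throttled at the AppKey level does not eat into the shared per-API budget.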

Parameter access control plugins block requests in which any parameter value falls outside an allowed list or allowed range, relieving backend services of the responsibility for validating those inputs. This is especially relevant for Function Compute backends, because validating parameters at the gateway ensures that invalid requests never incur invocation charges.
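Allow-list and range checks of this kind can be sketched as a small rule table evaluated before the backend is called. The rule format, the check_parameters function, and the 403 status are assumptions for illustration and do not reflect the plugin's real configuration syntax.

```python
def check_parameters(params: dict, rules: dict) -> tuple:
    """Return (status, message); reject on the first parameter that breaks a rule.

    rules maps a parameter name to {"allowed": set_of_values} and/or
    {"range": (low, high)}. Parameters without a rule are passed through.
    """
    for name, rule in rules.items():
        if name not in params:
            continue  # presence checks are out of scope for this sketch
        value = params[name]
        if "allowed" in rule and value not in rule["allowed"]:
            return 403, f"parameter '{name}' is not in the allowed list"
        if "range" in rule:
            low, high = rule["range"]
            try:
                number = float(value)
            except ValueError:
                return 403, f"parameter '{name}' is not numeric"
            if not low <= number <= high:
                return 403, f"parameter '{name}' is outside the allowed range"
    return 200, "ok"


if __name__ == "__main__":
    rules = {"region": {"allowed": {"eu", "us"}}, "limit": {"range": (1, 100)}}
    print(check_parameters({"region": "eu", "limit": "500"}, rules))
```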

The CORS plugin handles pre-flight OPTIONS requests and adds the correct Access-Control-Allow-* headers to responses, eliminating the need to implement cross-origin support in every backend service.
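What a pre-flight answer looks like can be sketched as a function that builds the response headers for an OPTIONS request. The header names are the standard CORS ones; the preflight_response function, its parameters, the 204 and 403 status codes, and the max-age default are assumptions of this sketch rather than documented plugin behaviour.

```python
def preflight_response(origin: str, requested_method: str,
                       allowed_origins: set, allowed_methods: set,
                       allowed_headers: set, max_age: int = 600) -> tuple:
    """Return (status, headers) for a CORS pre-flight (OPTIONS) request."""
    if origin not in allowed_origins or requested_method not in allowed_methods:
        return 403, {}  # no CORS headers: the browser will block the real request
    return 204, {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": ", ".join(sorted(allowed_methods)),
        "Access-Control-Allow-Headers": ", ".join(sorted(allowed_headers)),
        "Access-Control-Max-Age": str(max_age),
        "Vary": "Origin",  # the response differs per requesting origin
    }


if __name__ == "__main__":
    print(preflight_response("https://app.example.com", "POST",
                             {"https://app.example.com"}, {"GET", "POST"},
                             {"Content-Type", "Authorization"}))
```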

Layer 4: Backend Integration

API Gateway routes authenticated, rate-limited requests to one of several backend types. The selection of backend type is the primary architectural decision at this layer and determines the latency profile, scaling behaviour, and cost model.

HTTP and HTTPS backends route to any endpoint reachable from the gateway, including ECS instances, Server Load Balancer endpoints, third-party APIs, and self-managed services. Backend timeout defaults to 9 seconds and is adjustable up to 30 seconds; requests exceeding this threshold return HTTP 504 to the consumer. Connection reuse is handled by the gateway runtime and reduces per-request handshake overhead on TLS backends.
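The timeout behaviour can be sketched by running a backend call with a deadline and mapping an overrun to 504. The 9-second default and 30-second ceiling come from the text; the call_backend and slow_backend helpers and the use of a worker thread are illustrative assumptions, not how the gateway is implemented.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

DEFAULT_TIMEOUT_S = 9.0   # default backend timeout from the text
MAX_TIMEOUT_S = 30.0      # upper bound from the text


def call_backend(backend, timeout_s: float = DEFAULT_TIMEOUT_S) -> tuple:
    """Run backend() with a deadline; return (504, ...) if it does not finish in time."""
    if not 0 < timeout_s <= MAX_TIMEOUT_S:
        raise ValueError(f"timeout must be in (0, {MAX_TIMEOUT_S}] seconds")
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(backend)
    try:
        return 200, future.result(timeout=timeout_s)
    except FutureTimeout:
        return 504, "backend timeout"
    finally:
        # Do not block on the abandoned call; it keeps running in the background.
        executor.shutdown(wait=False)


def slow_backend(delay_s: float = 0.3) -> str:
    """Hypothetical backend that takes delay_s seconds to answer."""
    time.sleep(delay_s)
    return "done"
```

Note that the abandoned call is not cancelled: the backend may still finish its work after the consumer has received the 504, which is why timeouts along the call path need to be chosen together.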

VPC backends route to private endpoints within a VPC without exposing those endpoints to the public network. The gateway establishes the routing via a VPC authorisation that binds the gateway to the target VPC, instance, and port, applying ECS or SLB security group rules as the access control mechanism.

Function Compute backends invoke the target function directly through the Function Compute service API, passing the request payload, headers, and path parameters of the incoming request to the function as event input. This removes the need for a separately configured endpoint or HTTP trigger. Gateway authentication and rate limiting are applied before any function invocation cost is incurred. This is the typical configuration for APIs hosted on serverless services.

EventBridge backends send the request to an event bus that then delivers the event to one or more destinations (e.g., functions, message queues, third-party SaaS services). This approach effectively separates the API design from the chosen backend implementation and allows for fan-out scenarios, where a single API invocation results in several workflows running in parallel.

Mock backends generate a predefined response without invoking any backend logic. They help establish a contract between producers and consumers, who can build integrations against a predefined response schema while the backend itself is still being developed.

Operational Considerations

Three operational factors determine whether the gateway performs reliably under production load.

  1. Timeout cascade management: The gateway's backend timeout must be set lower than any client-side timeout to ensure the gateway returns a structured 504 response rather than the connection being terminated by the client. For Function Compute backends, the function's configured execution timeout should in turn be set lower than the gateway timeout, so that the function is stopped before the gateway gives up rather than completing work whose response the gateway has already abandoned.
  2. Cache configuration for read-heavy APIs: Response caching can be enabled per API with a configurable TTL between 1 and 7200 seconds. Cache keys include the request path and a selectable subset of query parameters and headers. Caching is appropriate for APIs serving infrequently changing data and significantly reduces backend invocation cost, but must be evaluated against data freshness requirements for any endpoint with consumer-facing operational impact.
  3. Observability and access logging: Access logs are routed to Simple Log Service, where the standard log structure includes request method, path, status code, latency, AppKey, source IP, and gateway-side processing time separated from backend response time. The latter separation is critical for diagnosing whether elevated end-to-end latency originates in the gateway plugin chain or in the backend service, and should be the first signal consulted when investigating user-reported performance regressions.
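For the caching point in the list above, the following sketch shows how a cache key might be derived from the request path plus a selected subset of query parameters and headers, together with a check of the 1 to 7200 second TTL range. The cache_key and validate_ttl functions and the key layout are hypothetical; the gateway's real key format is not described in the text.

```python
import hashlib
from urllib.parse import parse_qsl, urlsplit

MIN_TTL_S, MAX_TTL_S = 1, 7200  # TTL bounds from the text


def validate_ttl(ttl_s: int) -> bool:
    """True if the TTL falls inside the configurable range."""
    return MIN_TTL_S <= ttl_s <= MAX_TTL_S


def cache_key(url: str, headers: dict, key_params: set, key_headers: set) -> str:
    """Hash the path plus only the selected query parameters and headers."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    lowered = {name.lower(): value for name, value in headers.items()}
    selected = [("path", parts.path)]
    selected += [(f"q:{name}", query.get(name, "")) for name in sorted(key_params)]
    selected += [(f"h:{name.lower()}", lowered.get(name.lower(), ""))
                 for name in sorted(key_headers)]
    raw = "\n".join(f"{name}={value}" for name, value in selected)
    return hashlib.sha256(raw.encode()).hexdigest()
```

Parameters that are not selected, such as a tracking identifier, do not affect the key, so requests that differ only in those values share one cached response.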

Conclusion

The processing stages of Alibaba Cloud API Gateway include API definition and routing, authentication and authorisation, traffic management and plugin chain, and backend integration, providing a complete managed layer for exposing backend services on Alibaba Cloud.

  1. API definition organises endpoints into Groups with environment stages, parameter schemas, and custom domain binding.
  2. Authentication supports Alibaba Cloud APP, JWT, OpenID Connect, and anonymous models selectable per API.
  3. Traffic management applies throttling, parameter validation, and CORS through a fixed-order plugin chain.
  4. Backend integration routes requests to HTTP endpoints, VPC services, Function Compute, EventBridge, or Mock responses.

Engineers extending this architecture should evaluate two patterns based on workload. WebSocket APIs, supported through a separate API type within the same gateway, suit bidirectional protocols such as device control channels and live data feeds, with the same authentication and throttling models applied to the initial handshake. Custom plugins, written against the gateway's plugin SDK, enable transformation and validation logic that cannot be expressed through the standard plugin set and are executed within the gateway runtime rather than at the backend.


Disclaimer: The views expressed herein are for reference only and don’t necessarily represent the official views of Alibaba Cloud.

PM - C2C_Yuan