Key Management Service:KMS Agent overview

Last Updated:Jun 04, 2025

The Key Management Service (KMS) Agent is a client-side HTTP proxy that simplifies secret retrieval by centralizing KMS interactions. Applications retrieve secrets through local HTTP requests instead of direct SDK integration, minimizing code changes and ensuring uniform security policies. The agent handles secret management (configured once for all applications), in-memory caching, and periodic secret refreshes to reduce SDK call frequency and network overhead. It can be deployed in local environments, virtual machines such as Elastic Compute Service (ECS), and containerized systems.

How it works

The agent caches secret values in memory and refreshes them periodically based on the Time To Live (TTL) you set. When an application requests a secret from the agent over HTTP, the agent validates the request against the Server-Side Request Forgery (SSRF) token file. If a valid (unexpired) value exists in the cache, the agent returns it. Otherwise, the agent forwards the request to KMS. After KMS verifies the agent's identity, it decrypts the secret and returns it to the agent, which updates the cache and returns the secret value to the application in the HTTP response. The two processes are shown in the following figures:

  • Cache hit process

    [Figure: cache hit process]
  • Cache miss (no cache or expired cache) process

    [Figure: cache miss process]
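
The cache hit and cache miss flows can be sketched as a small read-through cache. This is an illustrative Python sketch, not the agent's implementation. The names `ReadThroughCache`, `fetch_from_kms`, and `clock` are hypothetical, and the sketch refreshes entries lazily on access rather than modelling the agent's periodic refresh.

```python
import time
from typing import Callable, Dict, Tuple


class ReadThroughCache:
    """Illustrative TTL cache: serve fresh entries, otherwise fetch and store."""

    def __init__(self, fetch_from_kms: Callable[[str], str],
                 ttl_seconds: float = 300,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch_from_kms   # hypothetical stand-in for the call to KMS
        self._ttl = ttl_seconds        # plays the role of TtlSeconds
        self._clock = clock            # injectable so the sketch can be tested
        self._entries: Dict[str, Tuple[str, float]] = {}

    def get(self, secret_id: str) -> str:
        now = self._clock()
        entry = self._entries.get(secret_id)
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]            # cache hit: KMS is not contacted
        value = self._fetch(secret_id)  # cache miss or expired entry
        self._entries[secret_id] = (value, now)
        return value
```

With a TTL of 300 seconds, two reads of the same secret within five minutes result in a single call to `fetch_from_kms`.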

Deploy the agent alongside your applications in various environments, including local physical servers, virtual machines (such as ECS), and containers (such as Kubernetes pods). Visit alibabacloud-kms-agent for the code repository.

Architectural components

The agent comprises four components: HTTP server, cache, KMS client, and log.

[Figure: architectural components of the agent]

You can configure these four components with a configuration file as described below. See alibabacloud-kms-agent for the source code.

# All configuration items
[Server]
# Optional. Default: 2025. By default the agent listens on 127.0.0.1:2025.
HttpPort = 2025
# Optional. Default: ["X-KMS-Token", "X-Vault-Token"].
# Requests to the agent must carry one of these SSRF headers; otherwise access is denied.
SSRFHeaders = ["X-KMS-Token"]
# Optional. Default: ["KMS_TOKEN", "KMS_SESSION_TOKEN", "KMS_CONTAINER_AUTHORIZATION_TOKEN"].
# Each variable holds either the token value itself or a file path such as file:///var/run/awssmatoken.
# The agent reads the SSRF token from these environment variables and compares it with the token
# carried in the request header. Access is allowed only if they match.
SSRFEnvVariables = ["KMS_TOKEN"]
# Optional. Default: "/v1/".
# URI prefix for path-based requests.
PathPrefix = "/v1/"
# Optional. Default: 800.
# Maximum number of concurrent requests.
MaxConn = 800
# Optional. Default: 0.
# 0: return the secret in the KMS GetSecretValue API response format.
# 1: return the secret in the AWS Secrets Manager GetSecretValue API response format.
# 2: return the secret in the HashiCorp Vault KV structure.
ResponseType = 0
# Optional. Default: true.
# When true, if the cached entry has expired and the request to KMS fails, the agent returns the expired secret held in memory.
IgnoreTransientErrors = true

[Kms]
# Optional, default value is cn-hangzhou
# Region where KMS is located
Region = "cn-hangzhou"
# Optional, default value is kms.cn-hangzhou.aliyuncs.com
# Endpoint can be a shared gateway Endpoint or a dedicated gateway Endpoint
Endpoint = "kms.cn-hangzhou.aliyuncs.com"

[Cache]
# Optional. Default: InMemory. Only the in-memory cache is currently supported.
CacheType = "InMemory"
# Optional. Default: 1000 secrets. When CacheSize = 0, caching is disabled and every request goes to KMS.
CacheSize = 1000
# Optional. Time to live (TTL) of cached secrets, in seconds. Default: 300.
TtlSeconds = 300
# Optional. Cache eviction policy. Default: false.
# When the number of cached secrets reaches CacheSize, false evicts the secrets that were cached earliest (by cache time),
# and true evicts the least recently used secrets.
EnableLRU = false

[Log]
# Optional. Default log level: Debug.
LogLevel = "Debug"
# Optional. By default, logs are stored in ./logs/ under the directory from which the agent is started.
LogPath = "./logs/"
# Optional. Maximum size of a single log file. Default: 100 MB.
MaxSize = 100
# Optional. Number of log files to retain. Default: 2.
MaxBackups = 2

HTTP server

The HTTP server responds to application requests for secrets. By default, the agent returns secret values in the same response format as the KMS GetSecretValue API. To return a different format, set the ResponseType parameter in the configuration file.

Supported request formats

  • Path-based request

    GET /v1/<secretId>
  • Query-based request

    GET /secretsmanager/get?secretId=<secretid>
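
As a hedged illustration, the following Python sketch builds both request forms without sending them. It assumes the default listening address 127.0.0.1:2025, the default PathPrefix "/v1/", and the X-KMS-Token header from the configuration file. The function name `build_request` is hypothetical, and how the agent treats percent-encoded secret names is not confirmed by this document.

```python
import urllib.parse
import urllib.request

AGENT_URL = "http://127.0.0.1:2025"  # default HttpPort from the configuration file


def build_request(secret_id: str, token: str,
                  path_based: bool = True) -> urllib.request.Request:
    """Build (but do not send) a GET request for one secret."""
    if path_based:
        # Path-based form, using the default PathPrefix "/v1/".
        url = f"{AGENT_URL}/v1/{urllib.parse.quote(secret_id)}"
    else:
        # Query-based form.
        query = urllib.parse.urlencode({"secretId": secret_id})
        url = f"{AGENT_URL}/secretsmanager/get?{query}"
    # The header name must be one of the configured SSRFHeaders.
    return urllib.request.Request(url, headers={"X-KMS-Token": token}, method="GET")
```

To send the request against a running agent, pass the returned object to `urllib.request.urlopen` and read the response body. The token value would normally be read from the SSRF token file.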

Supported response formats

The agent can return secrets in the response formats of AWS Secrets Manager and the HashiCorp Vault KV store, so you can migrate to Alibaba Cloud with minimal modifications. For applications that already integrate Spring Vault, change the access endpoint to the corresponding Alibaba Cloud KMS endpoint and complete the configuration adaptation through the agent to switch to Alibaba Cloud quickly.

  • Alibaba Cloud KMS

    {
       "CreateTime": "2025-01-03T07:59:17Z",
       "RequestId": "cc315250-04c9-4caf-a055-6648f36598b9",
       "SecretData": "{\"k3\":\"v3\"}",
       "SecretDataType": "text",
       "SecretName": "agent-test",
       "SecretType": "Generic",
       "VersionId": "v2",
       "VersionStages": {
          "VersionStage": [
             "ACSCurrent"
          ]
       }
    }
  • AWS Secrets Manager

    {
       "ARN": "",
       "Name": "agent-test",
       "VersionId": "v2",
       "SecretString": "{\"k3\":\"v3\"}",
       "VersionStages": [
          "ACSCurrent"
       ],
       "CreatedDate": "2025-01-03T07:59:17Z"
    }
  • HashiCorp Vault

    {
       "data": {
          "k3": "v3"
       }
    }
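
A client that needs the secret's key-value pairs can normalize the three formats as in the sketch below. It assumes, as in the samples above, that the secret value is itself a JSON object. The function name `extract_secret` is hypothetical, and the `response_type` argument mirrors the ResponseType setting.

```python
import json
from typing import Any, Dict


def extract_secret(body: str, response_type: int) -> Dict[str, Any]:
    """Return the secret's key-value pairs for ResponseType 0, 1, or 2."""
    doc = json.loads(body)
    if response_type == 0:   # Alibaba Cloud KMS format
        return json.loads(doc["SecretData"])
    if response_type == 1:   # AWS Secrets Manager format
        return json.loads(doc["SecretString"])
    if response_type == 2:   # HashiCorp Vault KV format
        return doc["data"]
    raise ValueError(f"unsupported ResponseType: {response_type}")
```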

Cache

The agent has a built-in memory caching mechanism. The secret values are not encrypted in the cache, and applications read from the local cache, reducing frequent requests to KMS. You can set cache time, cache size, and eviction policy to avoid business interruptions caused by expired secrets.

Important

Enhance the storage security of secret values in the cache through measures such as setting memory protection mechanisms, setting reasonable KMS Agent process access permissions, and deploying memory leak detection tools.
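
The difference between the two eviction policies can be illustrated with a short Python sketch. This is not the agent's code: `BoundedCache` is a hypothetical class, `cache_size` and `enable_lru` mirror CacheSize and EnableLRU, and treating a re-cached secret as a new insertion is an assumption.

```python
from collections import OrderedDict
from typing import Optional


class BoundedCache:
    """Size-limited cache with earliest-cached (default) or LRU eviction."""

    def __init__(self, cache_size: int, enable_lru: bool = False) -> None:
        self._size = cache_size
        self._lru = enable_lru
        self._items: "OrderedDict[str, str]" = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        if key not in self._items:
            return None
        if self._lru:
            self._items.move_to_end(key)     # a read counts as recent use
        return self._items[key]

    def put(self, key: str, value: str) -> None:
        if self._size <= 0:
            return                           # cache_size = 0: caching disabled
        if key in self._items:
            del self._items[key]             # assumption: re-caching is a new insertion
        elif len(self._items) >= self._size:
            self._items.popitem(last=False)  # evict the entry at the front of the order
        self._items[key] = value
```

With `enable_lru=False`, reading a secret does not protect it from eviction. With `enable_lru=True`, a recently read secret moves to the back of the order and outlives entries that have not been read.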

KMS client

The KMS client lets you set the region and the gateway endpoint. Both shared and dedicated gateway endpoints are supported.

Note

The agent has built-in CA certificates for dedicated gateways in all regions, so you do not need to configure CA certificates when you use a dedicated gateway endpoint.

Log

Based on the popular Zap logging framework, the agent provides logs in JSON format, and supports configuration of size limits for individual log files and the maximum number of log files to retain.

Security

Authentication and authorization

Agent access to KMS

The agent accesses KMS using Alibaba Cloud's default credential provider chain (prioritizing environment variables, followed by OIDC IdP RAM role, config.json, ECS RAM role, and finally credential URI) unless a specific initialization method is provided in credentials.NewDefaultCredentialsProvider().

To access secrets via KMS using RAM policies, the agent needs permissions to retrieve and decrypt them. Apply the principle of least privilege when configuring these permissions.

Application access to the agent

The agent's HTTP server incorporates built-in SSRF protection. An SSRF token file (such as /var/run/kmstoken) is created upon startup. Applications must include this token in request headers for authentication. The following deployment scenarios ensure restricted access:

  • Linux: By default, access to the SSRF token file is restricted to the agent's user and the application's user.

  • Sidecar container: The agent is deployed in the application's pod. By default, access to the SSRF token is restricted to that pod.
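
The token check described above might look like the following Python sketch. It is an assumption-laden illustration rather than the agent's implementation: `resolve_token` and `is_request_allowed` are hypothetical names, the constant-time comparison with `hmac.compare_digest` is a choice made for this sketch, and stripping whitespace from the token file is also an assumption.

```python
import hmac
from typing import Mapping, Sequence


def resolve_token(env_value: str) -> str:
    """Return the token itself, or read it from a file:// path."""
    if env_value.startswith("file://"):
        with open(env_value[len("file://"):], encoding="utf-8") as handle:
            return handle.read().strip()   # assumption: ignore surrounding whitespace
    return env_value


def is_request_allowed(headers: Mapping[str, str], expected_token: str,
                       ssrf_headers: Sequence[str] = ("X-KMS-Token", "X-Vault-Token")) -> bool:
    """Allow the request only if an accepted header carries the expected token."""
    if not expected_token:
        return False                       # no token configured: reject everything
    lowered = {name.lower(): value for name, value in headers.items()}
    for name in ssrf_headers:
        supplied = lowered.get(name.lower())
        if supplied is not None and hmac.compare_digest(
                supplied.encode("utf-8"), expected_token.encode("utf-8")):
            return True
    return False
```

Header names are matched case-insensitively because HTTP header names are case-insensitive.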

Communication security

  • Agent-KMS service communication uses Transport Layer Security (TLS) to prevent eavesdropping and attacks. For enhanced security, use a dedicated gateway endpoint instead of a shared one. This limits traffic to your VPC network, and prevents public Internet exposure.

  • The agent only listens on 127.0.0.1, restricting access to the local machine.

Auditing and logging

The agent logs all secret retrieval operations in JSON format (using the Zap logging framework), with configurable log file size and retention policies. This ensures auditable operational records.

Stability

The agent maintains service continuity in complex network environments and sudden failure scenarios through a startup self-check, retry mechanisms, and other measures.

  • Startup self-check mechanism.

    When the agent starts, it verifies connectivity to KMS. If the verification fails, the startup is terminated.

  • Error retry mechanism.

    The agent uses Alibaba Cloud SDK (V2) to communicate with KMS. When a network exception occurs, the agent automatically resends the request using the SDK's built-in retry logic. When it encounters server-side throttling (HTTP 429) or an internal server error (HTTP 500), it retries up to 3 times with exponentially increasing intervals.

  • Use of expired cache during failures.

    When the IgnoreTransientErrors parameter in the agent configuration file is set to true and a network or server-side failure occurs, the agent returns the expired value from the cache if one exists, so that short-term failures do not prevent applications from retrieving secrets. IgnoreTransientErrors is enabled by default.

  • High availability guarantee based on systemd or Sidecar containers.

    • Linux (systemd): The agent is managed by systemd, which automatically restarts the agent process if it crashes.

    • Sidecar container: The agent is deployed as an init container. If the agent fails, the container is restarted, which keeps the application stable.
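
The retry and expired-cache behaviors above can be combined as in this Python sketch. It does not reproduce the Alibaba Cloud SDK's retry logic. `TransientKmsError`, `fetch_with_retry`, and `get_secret` are hypothetical names, the 0.5-second base delay is an assumed value, and "retry 3 times" is read here as one initial attempt plus up to three retries.

```python
import time
from typing import Callable, Dict


class TransientKmsError(Exception):
    """Hypothetical stand-in for throttling (HTTP 429) or server errors (HTTP 500)."""


def fetch_with_retry(fetch: Callable[[], str], retries: int = 3,
                     base_delay: float = 0.5,
                     sleep: Callable[[float], None] = time.sleep) -> str:
    """Call fetch once, then retry up to `retries` times with doubling delays."""
    for attempt in range(retries + 1):
        try:
            return fetch()
        except TransientKmsError:
            if attempt == retries:
                raise                           # retries exhausted
            sleep(base_delay * (2 ** attempt))  # 0.5 s, 1 s, 2 s with the defaults
    raise AssertionError("unreachable when retries >= 0")


def get_secret(secret_id: str, fetch: Callable[[str], str],
               stale_cache: Dict[str, str],
               ignore_transient_errors: bool = True,
               sleep: Callable[[float], None] = time.sleep) -> str:
    """Fetch a secret, falling back to an expired cached value on failure."""
    try:
        value = fetch_with_retry(lambda: fetch(secret_id), sleep=sleep)
    except TransientKmsError:
        if ignore_transient_errors and secret_id in stale_cache:
            return stale_cache[secret_id]       # serve the old value
        raise
    stale_cache[secret_id] = value
    return value
```

Passing `sleep` as a parameter keeps the delays out of tests. Production code would leave the default `time.sleep` in place.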

Benefits

  • Performance and reliability

    The agent caches secret values in memory. In high-frequency access scenarios, this reduces the number of requests sent to KMS and avoids the throttling that such access can trigger, which improves performance and business availability.

  • Compatibility

    The agent exposes a standard HTTP interface, so applications written in any programming language can call it. When your applications use different languages, the agent reduces integration effort.

  • Simplified integration

    The agent decouples applications from KMS and reduces the complexity of interacting with it. Applications only need to communicate with the agent and do not need to handle authentication, API calls, or other aspects of accessing KMS.

  • Centralized management and scalability

    In enterprise scenarios with many applications, the agent centralizes the management and control of access permissions. This reduces per-client permission configuration and keeps the integration process uniform across applications. When the business expands, new applications can integrate through the agent, which avoids the permission configuration and code changes that SDK integration may require.

KMS Agent vs. Secret Client

KMS Agent serves as a middle layer: applications access KMS indirectly through the agent. With a secret client, applications call the KMS API through an SDK. The following table shows the differences between the two.

| Aspect | KMS Agent | Secret Client |
| --- | --- | --- |
| Recommended scenario | Enterprises with multiple applications and diverse programming languages, requiring centralized permission control and simplified, standardized integration. | Small or individual applications with simple access control requirements. |
| Deployment | An independent, application-decoupled process. | Integrated within application code as a library. |
| Integration complexity | Simple | Complex |
| Access control | Centralized policy enforcement via a single access point. | Decentralized: Each application self-manages policies. |
| Language support | Supports any language (using a universal HTTP interface). | Supports Java 8 and later, Python, and Go. |
| Performance | In-memory caching minimizes latency and throttling in high-frequency access. | High-frequency access may result in KMS throttling. |
| Secret rotation | Secrets are cached with a configurable TTL, automatically refreshing from KMS when needed to prevent retrieval failures. | Secrets are automatically retrieved from KMS using a refresh mechanism and retry logic. |
| Maintenance costs | Low: Single configuration for all applications. | High: Individual configuration per application. |