The Key Management Service (KMS) Agent is a client-side HTTP proxy that simplifies secret retrieval by centralizing KMS interactions. Applications retrieve secrets through local HTTP requests instead of direct SDK integration, minimizing code changes and ensuring uniform security policies. The agent handles secret management (configured once for all applications), in-memory caching, and periodic secret refreshes to reduce SDK call frequency and network overhead. It can be deployed in local environments, virtual machines such as Elastic Compute Service (ECS), and containerized systems.
How it works
The agent caches secret values in memory and refreshes them periodically based on the Time To Live (TTL) you set. When an application requests a secret value from the agent over HTTP, the agent verifies the legitimacy of the request through the Server-Side Request Forgery (SSRF) token file. If a valid secret value exists in the cache, it is returned. Otherwise, the request is forwarded to the KMS service. After KMS verifies the identity of the agent, it decrypts the requested secret and returns it to the agent. The agent then updates the cache and returns the secret value to the application in the HTTP response. The processes are shown in the following figures:
Cache hit process
Cache miss (no cache or expired cache) process
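The hit and miss paths above can be sketched as a small read-through cache. This is a minimal illustration, not the agent's actual implementation: the class name `TtlSecretCache`, the `fetch` callback (a stand-in for the call to KMS), and the injectable `clock` are all hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class TtlSecretCache:
    """Read-through cache: serve fresh entries locally, otherwise call the backend."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float = 300,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch      # hypothetical stand-in for the request to KMS
        self._ttl = ttl_seconds  # mirrors the TtlSeconds setting
        self._clock = clock      # injectable so the behavior can be tested
        self._entries: Dict[str, Tuple[str, float]] = {}  # name -> (value, expiry)

    def get(self, name: str) -> str:
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]                  # cache hit: no remote call
        value = self._fetch(name)            # cache miss or expired entry
        self._entries[name] = (value, now + self._ttl)
        return value
```

With a TTL of 300 seconds, repeated calls to `get` for the same secret trigger one remote fetch until the entry expires.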
Deploy the agent alongside your applications in various environments, including local physical servers, virtual machines (such as ECS), and containers (such as Kubernetes pods). Visit alibabacloud-kms-agent for the code repository.
Architectural components
The agent comprises four components: HTTP server, cache, KMS client, and log.
You can configure these four components with a configuration file as described below. See alibabacloud-kms-agent for the source code.
# All configuration items
[Server]
# Optional, default value is 2025, Agent default listening address 127.0.0.1:2025
HttpPort = 2025
# Optional, default value is ["X-KMS-Token", "X-Vault-Token"].
# Access to Agent must carry SSRF Header, otherwise access is prohibited.
SSRFHeaders = ["X-KMS-Token"]
# Optional, default value is ["KMS_TOKEN", "KMS_SESSION_TOKEN", "KMS_CONTAINER_AUTHORIZATION_TOKEN"], variable value can be a specific value, or a file path such as file:///var/run/awssmatoken.
# Agent gets SSRF Token from Env, compares with Token carried in application access Header, access is allowed only if they match.
SSRFEnvVariables = ["KMS_TOKEN"]
# Optional, default value is "/v1/".
# URI prefix for path-based access
PathPrefix = "/v1/"
# Optional, default value is 800
# Maximum number of concurrent requests
MaxConn = 800
# Optional, default value is 0
# 0: secret content is returned in KMS GetSecretValue API Response format; 1: secret content is returned in AWS Secrets Manager GetSecretValue API Response format; 2: Returned in HashiCorp KV structure.
ResponseType = 0
# Optional, default value true
# When IgnoreTransientErrors is true and the cache entry has expired, a failed request to the remote KMS returns the expired secret held in memory instead of an error.
IgnoreTransientErrors = true
[Kms]
# Optional, default value is cn-hangzhou
# Region where KMS is located
Region = "cn-hangzhou"
# Optional, default value is kms.cn-hangzhou.aliyuncs.com
# Endpoint can be a shared gateway Endpoint or a dedicated gateway Endpoint
Endpoint = "kms.cn-hangzhou.aliyuncs.com"
[Cache]
# Optional, default is InMemory, currently only memory cache is supported
CacheType = "InMemory"
# Optional, default cache size is 1000 secrets, when CacheSize=0, cache is not used, each request accesses the remote KMS.
CacheSize = 1000
# Optional, cache time effectiveness, default value is 300s.
TtlSeconds = 300
# Optional, cache eviction policy, default is false if not filled.
# When the cache secrets reach the CacheSize limit, false means deleting the earliest cached secrets based on cache time, true means evicting the secrets that have been used least recently based on usage frequency.
EnableLRU = false
[Log]
# Optional, default log level Debug
LogLevel = "Debug"
# Optional, default log storage in ./logs/ of the application startup directory
LogPath = "./logs/"
# Optional, default single log size 100M
MaxSize = 100
# Optional, default retention of 2 log files
MaxBackups = 2
HTTP server
Used to respond to application requests for retrieving secrets. By default, the secret values returned by the agent have the same response format as GetSecretValue. Alternatively, set the ResponseType parameter in the configuration file to return other formats.
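As a rough sketch of what an application-side call could look like, the helper below builds a local HTTP request carrying the SSRF token. The port, path prefix, and header name come from the configuration defaults shown in this document. The URL layout (secret name appended directly to the path prefix) and the function name `build_secret_request` are assumptions for illustration, so check the alibabacloud-kms-agent repository for the exact request format.

```python
import urllib.request
from urllib.parse import quote


def build_secret_request(secret_name: str, token: str, port: int = 2025,
                         path_prefix: str = "/v1/",
                         header: str = "X-KMS-Token") -> urllib.request.Request:
    """Build (but do not send) a GET request to the local agent."""
    # Assumption: path-based access appends the secret name to PathPrefix.
    url = f"http://127.0.0.1:{port}{path_prefix}{quote(secret_name, safe='')}"
    # The agent rejects requests that do not carry the SSRF token header.
    return urllib.request.Request(url, headers={header: token}, method="GET")
```

Passing the returned object to `urllib.request.urlopen` would send the request. The token itself would typically be read from the SSRF token file that the agent creates at startup.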
Cache
The agent has a built-in memory caching mechanism. The secret values are not encrypted in the cache, and applications read from the local cache, reducing frequent requests to KMS. You can set cache time, cache size, and eviction policy to avoid business interruptions caused by expired secrets.
Enhance the storage security of secret values in the cache through measures such as setting memory protection mechanisms, setting reasonable KMS Agent process access permissions, and deploying memory leak detection tools.
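The two eviction policies selected by EnableLRU can be illustrated with a bounded cache built on `collections.OrderedDict`. This is a sketch under assumptions, not the agent's code: the class `BoundedCache` is hypothetical, and it treats a refreshed entry as newly cached for ordering purposes.

```python
from collections import OrderedDict
from typing import Optional


class BoundedCache:
    """Size-limited cache that evicts by cache time or by recency of use."""

    def __init__(self, cache_size: int = 1000, enable_lru: bool = False) -> None:
        self._size = cache_size
        self._lru = enable_lru
        self._items: "OrderedDict[str, str]" = OrderedDict()

    def get(self, name: str) -> Optional[str]:
        if name not in self._items:
            return None
        if self._lru:
            self._items.move_to_end(name)    # a read counts as recent use
        return self._items[name]

    def put(self, name: str, value: str) -> None:
        if self._size <= 0:
            return                           # CacheSize = 0: caching disabled
        if name in self._items:
            self._items[name] = value
            self._items.move_to_end(name)    # refreshed entries become the newest
            return
        if len(self._items) >= self._size:
            self._items.popitem(last=False)  # drop the entry at the front
        self._items[name] = value
```

With `enable_lru=False`, reads do not change the order, so the earliest cached entry is evicted first. With `enable_lru=True`, each read moves the entry to the back, so the least recently used entry is evicted.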
KMS client
Supports setting region and gateway endpoint. Both shared and dedicated gateway endpoints are supported.
When using a dedicated gateway endpoint, the agent has built-in CA certificates for dedicated gateways in all regions, so you do not need to configure CA certificates.
Log
Based on the popular Zap logging framework, the agent provides logs in JSON format, and supports configuration of size limits for individual log files and the maximum number of log files to retain.
Security
Authentication and authorization
Agent access to KMS
The agent accesses KMS using Alibaba Cloud's default credential provider chain (prioritizing environment variables, followed by OIDC IdP RAM role, config.json, ECS RAM role, and finally credential URI) unless a specific initialization method is provided in credentials.NewDefaultCredentialsProvider().
To access secrets via KMS using RAM policies, the agent needs permissions to retrieve and decrypt them. Apply the principle of least privilege when configuring these permissions.
Application access to the agent
The agent's HTTP server incorporates built-in SSRF protection. An SSRF token file (such as /var/run/kmstoken) is created upon startup. Applications must include this token in request headers for authentication. The following deployment scenarios ensure restricted access:
Linux: By default, access to the SSRF token file is restricted to the agent and the application's user.
Sidecar container: Deployed within the application's pod. By default, access to the SSRF token is restricted to the pod.
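The token check described above can be sketched in two steps: resolve the expected token from an environment variable (either a literal value or a `file://` path, as the SSRFEnvVariables comment describes), then compare it with the header sent by the application. The function names `resolve_token` and `is_authorized` are hypothetical, and the constant-time comparison is a design choice of this sketch rather than a documented property of the agent.

```python
import hmac
from typing import Mapping, Optional, Sequence
from urllib.parse import urlparse


def resolve_token(env: Mapping[str, str],
                  names: Sequence[str] = ("KMS_TOKEN",)) -> Optional[str]:
    """Return the first configured token, reading it from disk for file:// values."""
    for name in names:
        raw = env.get(name)
        if not raw:
            continue
        if raw.startswith("file://"):
            with open(urlparse(raw).path, encoding="utf-8") as handle:
                return handle.read().strip()
        return raw
    return None


def is_authorized(headers: Mapping[str, str], expected: Optional[str],
                  header_names: Sequence[str] = ("X-KMS-Token",)) -> bool:
    """Allow the request only if one of the SSRF headers carries the expected token."""
    if not expected:
        return False
    for name in header_names:
        # Note: a real HTTP server would look headers up case-insensitively.
        supplied = headers.get(name)
        if supplied is not None and hmac.compare_digest(
                supplied.encode("utf-8"), expected.encode("utf-8")):
            return True
    return False
```

A caller would typically pass `os.environ` as `env`. `hmac.compare_digest` avoids leaking how many leading characters of a guessed token were correct.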
Communication security
Agent-KMS service communication uses Transport Layer Security (TLS) to prevent eavesdropping and attacks. For enhanced security, use a dedicated gateway endpoint instead of a shared one. This limits traffic to your VPC network, and prevents public Internet exposure.
The agent only listens on 127.0.0.1, restricting access to the local machine.
Auditing and logging
The agent logs all secret retrieval operations in JSON format (using the Zap logging framework), with configurable log file size and retention policies. This ensures auditable operational records.
Stability
The agent ensures service continuity in complex network environments and sudden failure scenarios through a startup self-check, error retries, fallback to expired cache entries, and process supervision.
Startup self-check mechanism.
When the agent starts, it verifies connectivity to KMS. If the verification fails, the startup is terminated.
Error retry mechanism.
The agent relies on Alibaba Cloud SDK (V2) to communicate with KMS. When network exceptions occur, it automatically resends requests using the built-in error retry logic of Alibaba Cloud SDK (V2). When it encounters server-side throttling (HTTP 429) or internal server errors (HTTP 500), it retries 3 times, with exponential backoff between attempts.
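To make the retry behavior concrete, here is a generic exponential backoff loop. It illustrates the technique only and is not the SDK's internal logic: the function `call_with_backoff`, the `send` callback, and the 0.5-second base delay are assumptions.

```python
import time
from typing import Callable, Tuple

RETRYABLE_STATUS = {429, 500}  # throttling and internal server errors


def call_with_backoff(send: Callable[[], Tuple[int, str]], max_retries: int = 3,
                      base_delay: float = 0.5,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Call send(); on a retryable status, retry with a doubling delay."""
    status, body = send()
    for attempt in range(max_retries):
        if status not in RETRYABLE_STATUS:
            break
        sleep(base_delay * (2 ** attempt))  # 0.5 s, 1 s, 2 s with the defaults
        status, body = send()
    return status, body
```

The function returns the last response either way, so a caller can still see the final 429 or 500 after all retries are exhausted.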
Use of expired cache during failures.
When the IgnoreTransientErrors parameter in the agent configuration file is enabled and a network or server-side failure occurs, the agent returns the expired secret still held in the cache, so that applications do not fail to retrieve secrets during short-term failures. The IgnoreTransientErrors parameter is enabled by default.
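The fallback can be sketched as follows. The names `TransientError` and `get_with_stale_fallback`, and the cache layout (a dict mapping a secret name to a value and expiry time), are hypothetical and chosen only to illustrate the behavior.

```python
import time
from typing import Callable, Dict, Tuple


class TransientError(Exception):
    """Hypothetical marker for network or server-side failures."""


def get_with_stale_fallback(name: str, cache: Dict[str, Tuple[str, float]],
                            fetch: Callable[[str], str], ttl_seconds: float = 300,
                            ignore_transient_errors: bool = True,
                            now: Callable[[], float] = time.monotonic) -> str:
    """Return a fresh value when possible, or an expired one during an outage."""
    current = now()
    entry = cache.get(name)
    if entry is not None and current < entry[1]:
        return entry[0]                      # still within its TTL
    try:
        value = fetch(name)
    except TransientError:
        if ignore_transient_errors and entry is not None:
            return entry[0]                  # expired, but better than failing
        raise
    cache[name] = (value, current + ttl_seconds)
    return value
```

If the secret has never been cached, there is nothing to fall back to and the error still reaches the caller.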
High availability guarantee based on systemd or Sidecar containers.
Linux (systemd): Managed by systemd on Linux, ensuring automatic restarts if the agent process crashes.
Sidecar container: Deployed as an init container. If the agent fails, the container is restarted, guaranteeing application stability.
Benefits
Performance and reliability
The agent caches secret values in memory, which reduces requests to the KMS service in high-frequency access scenarios and avoids the throttling that such access can cause, improving performance and business availability.
Compatibility
The agent provides services based on standardized HTTP interfaces, supporting calls from applications in any programming language. When applications use different languages, the agent reduces integration difficulty.
Simplified integration
Through the agent, applications can be decoupled from KMS. This reduces the complexity required for applications to interact with the KMS service. Applications only need to communicate with the agent without handling authentication, API calls, and other aspects when accessing the KMS service.
Centralized management and scalability
For enterprise-level multi-application scenarios, the agent unifies management and control of access permissions. This reduces per-client permission configuration and keeps the integration process uniform across applications. When the business expands, the agent makes it easier to integrate new applications, reducing the permission configuration and code modifications that using an SDK may require.
KMS Agent vs. Secret Client
KMS Agent serves as a middle layer: applications access the KMS service indirectly through the agent. Using a secret client requires applications to call the KMS service's API through an SDK. The differences between the two are shown in the following table.
Aspect | KMS Agent | Secret Client |
Recommended scenario | Enterprises with multiple applications and diverse programming languages, requiring centralized permission control and simplified, standardized integration. | Small or individual applications with simple access control requirements. |
Deployment | An independent, application-decoupled process. | Integrated within application code as a library. |
Integration complexity | Simple | Complex |
Access control | Centralized policy enforcement via a single access point. | Decentralized: Each application self-manages policies. |
Language support | Supports any language (using a universal HTTP interface). | Supports Java 8 and later, Python, and Go. |
Performance | In-memory caching minimizes latency and throttling in high-frequency access. | High-frequency access may result in KMS throttling. |
Secret rotation | Secrets are cached with a configurable TTL, automatically refreshing from KMS when needed to prevent retrieval failures. | Secrets are automatically retrieved from KMS using a refresh mechanism and retry logic. |
Maintenance costs | Low: Single configuration for all applications. | High: Individual configuration per application. |