Addressing Identity Authentication and Secret Management in Automation: Exploring Vault's AppRole Solution

This article introduces HashiCorp Vault's AppRole identity authentication solution, analyzes its overall process, discusses its implementation methods in different scenarios, and summarizes my views.

By Fudai

1. What is Vault?

Vault on its own may not ring a bell, but you are probably familiar with its developer, HashiCorp, whose other well-known open-source product is the IaC tool Terraform. Vault and Terraform complement each other, and both target automation scenarios. Accordingly, one of Vault's main roles is to provide secure access to sensitive data during automated processes.

HashiCorp Vault is an enterprise-grade open-source secrets management system. It uses identity authentication and authorization to control, manage, and safeguard access to sensitive data such as API keys, passwords, certificates, and other confidential information. It supports a wide range of cloud services and infrastructure platforms and offers a rich set of security features, including static and dynamic secret management, data encryption, and identity authentication and authorization, all exposed through a UI, a CLI, and an HTTP API.

2. Core Concepts of Vault

2.1 Authentication and Authorization

Authorization and Authentication are common concepts and terms in information security and access control. Due to the similarity between the English words, the abbreviation authZ is often used to represent the authorization operation, and authN for the authentication operation. Typically, authentication precedes authorization, as users must prove their identities before accessing subsequent operations.

Authentication is the process of confirming a user's identity. It verifies that the user is the entity they claim to be, usually by checking one or more credentials such as passwords, dynamic verification codes, or temporary credentials. Authorization happens after authentication and decides which resources the authenticated user can access and which operations they can perform. During this step, the system checks whether the requested resources and operations fall within the user's permissions.

2.2 Policy

A policy is a set of rules in Vault that defines fine-grained access control for an identity on Vault resources. The policies applied to an identity (a user, application, or service) determine which paths it can access, which operations it can perform, and which behaviors are prohibited.

Policy Example of Key-value Secret Engine

path "secret/data/{{identity.entity.id}}/*" {
  capabilities = ["create", "update", "patch", "read", "delete"]
}

path "secret/metadata/{{identity.entity.id}}/*" {
  capabilities = ["list"]
}

When a user completes authN, a set of policies is attached to the issued token, and these policies determine what the token holder can access. Policies are attached by name and evaluated on each request, so a change to a policy takes effect without the user having to re-authenticate, which makes access control more flexible.

2.3 Token

The token is a core concept in Vault and is used for both authN and authZ. Vault tokens are similar to session tokens or access tokens in other systems: each one identifies a single session and has a validity period, during which the client holding it can access secrets and other resources stored in Vault after a user or service has authenticated successfully.

When a user or service completes identity authN through the Vault authN method, Vault issues a token to the user. The token is associated with a series of policies that define the operations that the token holder can perform and the paths that the token holder can access. In general, a token has an independent lifecycle. During the lifecycle, the token can be renewed or revoked.

2.4 AppRole

The role is the key entity in the AppRole authN solution. You attach a set of policies to a role so that it has access control permissions on the corresponding resources. Policies are defined in advance and managed separately from roles. This separation improves security and limits both the set of resources that an entity authenticated through AppRole can access and the conditions that must be met for access.

AppRole granularity is flexible. You can create one role per application, one role for a group of machine instances that serve the same purpose within an application, or even different roles for different users on a single machine instance. Dividing roles at different granularities makes them both easier to manage and more secure.

2.5 Role ID and Secret ID

The Role ID is an identifier that Vault assigns to a specific "role". It can be simply understood as a user name or account in the account-password authN mode. It does not have sensitive characteristics, so separate encryption is not required.

The Secret ID is a secret identifier in the identity authN process. It is usually a string of random characters automatically generated by Vault, which can be simply understood as a password in the account-password authN mode. The Secret ID is used in conjunction with the Role ID to implement the identity authN process of the role. Due to its sensitive characteristics, some additional security measures, such as Response Wrapping, are required to ensure its security.

2.6 Wrapped Secret ID

In Vault, a wrapped Secret ID is produced by the Cubbyhole Response Wrapping technology and is essentially a wrapping token. The original Secret ID of the AppRole is obtained by unwrapping it.

Cubbyhole Response Wrapping is not specific to secrets; it is a general mechanism that can wrap the response to any request. The caller asks for wrapping by passing a wrap TTL with the request and receives a temporary, single-use wrapping token instead of the response itself. To obtain the original response, the holder must unwrap the wrapping token, through either the API or the CLI. In this sense, Response Wrapping works like a temporary lockbox for a response, with the wrapping token as its key. In short, Response Wrapping protects the original response by passing around a reference to it rather than the value itself.
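
To make the "temporary" and "one-time" properties concrete, here is a minimal in-memory sketch. It is a toy model for illustration only and not how Vault implements wrapping; the class name ToyWrappingStore and its methods are hypothetical.

```python
import secrets
import time


class ToyWrappingStore:
    """In-memory stand-in for response wrapping; NOT Vault's real implementation."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._boxes = {}  # wrapping token -> (expiry time, original response)

    def wrap(self, response, ttl_seconds):
        """Store the response and hand back a random token that references it."""
        token = secrets.token_urlsafe(16)
        self._boxes[token] = (self._clock() + ttl_seconds, response)
        return token

    def unwrap(self, token):
        """Return the response once; later or expired attempts raise KeyError."""
        # pop() makes the token single-use: a second unwrap finds nothing.
        expiry, response = self._boxes.pop(token, (None, None))
        if expiry is None:
            raise KeyError("wrapping token is unknown or already used")
        if self._clock() > expiry:
            raise KeyError("wrapping token has expired")
        return response
```

Only the token travels between systems. A failed unwrap by the intended recipient means the token expired or somebody else opened the box first.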

3. What is AppRole AuthN?

AppRole authN is an identity authN mechanism provided by Vault and an identity authN solution for M2M (Machine-to-Machine) and A2M (App-to-Machine) scenarios. It is specially designed for automation workflows. The purpose is to improve the security and flexibility of obtaining secrets in non-human operation scenarios so that these workloads or automation tools can also securely obtain and manage secrets.

The AppRole authN solution uses two core elements, the Role ID and the Secret ID, to complete the authN process. A Role ID is equivalent to a user name and identifies a specific role. A Secret ID is equivalent to a password and is used together with the Role ID to authenticate. When an application or automation job needs a secret stored in the Vault server, it logs in with its Role ID and Secret ID and receives a token for subsequent operations.

Note that AppRole authN relies on a trusted broker rather than a trusted third-party authenticator. The difference is that with AppRole, the responsibility for trust lies with a securely managed broker that mediates identity verification between the client and Vault.

4. Core Principles of AppRole AuthN

Reduce the Scope of Identities

As previously mentioned, Vault operates on an identity-based secrets management model, requiring authenticated client identities to access secrets. An identity authenticated by Vault must, therefore, be uniquely recognizable, and post-verification, should be constrained to access only the secrets pertinent to the associated user. Importantly, secrets should never be transferred via a proxy between Vault and the actual secret's end-user, nor should the client have the capability to access another end-user's secrets.

Following the principle of least privilege, policies and permissions assigned to an identity should be as restricted as possible to reduce the potential impact radius of that identity.

Shorten the Validity Period of AuthN

When Vault authenticates an entity successfully, it issues a token, and the client must present that token in all subsequent interactions with Vault. The token should therefore be handled securely and have a limited lifecycle: ideally, it is valid only for as long as the client actually needs to access the authorized secrets.

Identities Meet Only at the End-user System

During identity authN from the trusted broker to Vault, the Role ID and Secret ID appear together only on the end-user system that needs to use the key. This principle is the core of the entire identity authN solution.

5. How is the Solution Implemented?

Figure 1 - Vault AppRole authN process (referenced from Vault Blog)

5.1 Send the Role ID to the Application

Step 1: Enable the AppRole Identity AuthN Method

As introduced in Core Concepts, an authN method verifies identity and assigns that identity a set of access control policies. Vault requires authN methods to be enabled explicitly, so any method that is not enabled by default must be enabled before it can be used.

The authN method can be enabled by using the API or CLI, and the path parameter can be used to specify the specific path to enable the authN method. Different paths correspond to different authN instances. This allows the administrator to enable different authN policies and rules for different users or use cases. If this parameter is not specified, the default path name will be used. You can run the following commands to enable the authN method of an AppRole:

Commands for Vault to Enable AppRole AuthN Method

vault auth enable approle
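
The same operation is available over the HTTP API, where the path becomes part of the URL. The sketch below only builds the request and does not send it; the helper name, the address, and the admin token are placeholders chosen for illustration.

```python
import json
import urllib.request


def enable_approle_request(vault_addr, token, mount_path="approle"):
    """Build (without sending) the HTTP request that enables AppRole at mount_path."""
    return urllib.request.Request(
        url=f"{vault_addr}/v1/sys/auth/{mount_path}",
        data=json.dumps({"type": "approle"}).encode("utf-8"),
        headers={"X-Vault-Token": token, "Content-Type": "application/json"},
        method="POST",
    )


# Two independent AppRole instances on different paths (names are illustrative).
enable_requests = [
    enable_approle_request("https://vault.example.com:8200", "admin-token", path)
    for path in ("approle", "approle-ci")
]
```

Passing one of these objects to urllib.request.urlopen would perform the call against a real server. Each path then carries its own roles and settings.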

Step 2: Create an AppRole and Apply an AuthZ Policy

After enabling the authN method, create a role for the corresponding application or instance and attach appropriate policies. In the AppRole authN method, the role is the core identity model and can represent a group of instances: all the instances of one application, or a subset of instances that share a purpose. The policies attached to the role determine which secrets the application or instance can access.

Roles are defined by administrators, and Vault is unaware of their business meaning. Two applications or instances that share a role therefore have the same access to the same secrets. For security, follow the principle of least privilege when defining roles and policies, and scope the role and permissions of each application or instance as narrowly as possible. When two different applications need the same shared secret, you can manage permissions by splitting them into a common policy and application-specific policies.

Vault also provides other safeguards for AppRole. For example, when you create an AppRole, you can limit the Secret ID's validity period, its number of uses, and the source CIDR blocks from which it may be used.
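
As a sketch of how those limits look in a role definition, the following builds (but does not send) a role-creation request for the AppRole HTTP API. The helper name and all values (policy names, TTLs, CIDR block) are illustrative assumptions, not recommendations from the original article.

```python
import json
import urllib.request


def create_role_request(vault_addr, token, role_name, policies):
    """Build the request that creates or updates an AppRole with tight limits."""
    body = {
        "token_policies": policies,                # policies attached to issued tokens
        "secret_id_ttl": "10m",                    # Secret ID validity period
        "secret_id_num_uses": 1,                   # how many logins one Secret ID allows
        "secret_id_bound_cidrs": ["10.0.0.0/24"],  # where the Secret ID may be used from
        "token_ttl": "20m",
        "token_max_ttl": "30m",
    }
    return urllib.request.Request(
        url=f"{vault_addr}/v1/auth/approle/role/{role_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"X-Vault-Token": token, "Content-Type": "application/json"},
        method="POST",
    )
```

Splitting a "common" policy from an application-specific one then simply means passing both names, for example ["common-read", "my-app"].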

Step 3 ~ Step 5: Obtain and Send the Role ID to the Application

In Step 3 and Step 4, the administrator manually obtains the Role ID and places it in an artifact or a static configuration. The artifact is typically the image to be deployed, and the static configuration is typically a configuration file.

In Step 5, the Role ID passes from the artifact or static configuration to the running application. In practice, once the trusted platform has configured, built, and released the application, the running application can read the Role ID from its static configuration.

5.2 Send the Wrapped Secret ID to the Application

Step 6 ~ Step 7: Obtain the Temporary Wrapped Secret ID

This part of the work is completed by the trusted publishing platform or trusted configuration center. Therefore, before all subsequent operations, the trusted platform must complete its trusted authN and prove its legitimacy to Vault. This process is not shown in the preceding figure but will be analyzed in the following CI pipeline scenario. After the trusted platform completes its trusted authN, it will obtain a token with the specified permission policy. The trusted platform can use the token to complete subsequent requests for wrapping secrets.

The trusted platform requests a temporary wrapped Secret ID from Vault and sends it to the application. Note that what the platform requests and forwards is not the Secret ID itself but a wrapped Secret ID, produced with the Cubbyhole Response Wrapping technology described above. To obtain the original response, the recipient must send an unwrapping request for the wrapped response to Vault. This design serves the following purposes:

(1) Hide the Secret ID

By sending a wrapped ID instead of a real Secret ID, the Secret ID can be prevented from being forwarded among multiple intermediate systems. This reduces the number of nodes that may leak and improves the security of the secret transmission process.

(2) Shorten the Validity Period

The Response Wrapping mechanism of Vault allows you to manage the response validity period and secret validity period independently. This shortens the exposure time of sensitive data responses and ensures that the secret validity period is not affected.

(3) Prevent Data Tampering

Because a wrapping token is temporary and single-use, it further secures the whole process. If the intended recipient's unwrap attempt fails, that is an immediate signal that the wrapped Secret ID may already have been intercepted and unwrapped by someone else.
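
On the trusted platform's side, requesting a wrapped Secret ID could look like the sketch below. It builds the HTTP request with the X-Vault-Wrap-TTL header and extracts the wrapping token from a response body; the function names and the sample values are assumptions made for illustration.

```python
import json
import urllib.request


def wrapped_secret_id_request(vault_addr, broker_token, role_name, wrap_ttl="120s"):
    """Build the request for a Secret ID that Vault returns wrapped, not in clear."""
    return urllib.request.Request(
        url=f"{vault_addr}/v1/auth/approle/role/{role_name}/secret-id",
        data=b"{}",
        headers={
            "X-Vault-Token": broker_token,
            "X-Vault-Wrap-TTL": wrap_ttl,  # asks Vault to wrap the response
            "Content-Type": "application/json",
        },
        method="POST",
    )


def wrapping_token_from(response_body):
    """Pull the single-use wrapping token out of a wrapped response body."""
    return json.loads(response_body)["wrap_info"]["token"]
```

The platform forwards only the value returned by wrapping_token_from; it never sees the Secret ID itself.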

Step 8: Send the Wrapped Secret ID to the Application

After obtaining the wrapped Secret ID, the trusted platform needs to send the ID to the running application. In the CI pipeline scenario, this process can be completed by the publishing platform by setting the wrapped Secret ID as an environment variable and then reading it from the environment variable when the application is started. In containerization scenarios, you can mount the volume in a container to complete this process.

Whatever the distribution method, the wrapped Secret ID must not travel through the same channel as the Role ID, because Vault's security assumption holds only if this rule is followed: the Role ID and the Secret ID should meet at, and only at, the end-user system that needs the secret. The application should also delete the wrapped Secret ID promptly after use to avoid accidental leakage.
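
A minimal sketch of the application side of this rule: the Role ID comes from a static configuration file, the wrapped Secret ID comes from the environment, and the latter is removed as soon as it is read. The JSON file layout with a role_id key is a hypothetical choice; WRAPPED_SID is the variable name used later in this article.

```python
import json
import os


def load_credentials(config_path, env=os.environ, env_var="WRAPPED_SID"):
    """Read the Role ID from static config and the wrapped Secret ID from the environment."""
    with open(config_path, encoding="utf-8") as fh:
        role_id = json.load(fh)["role_id"]
    # pop() removes the variable from this process's environment once it is read.
    wrapped_secret_id = env.pop(env_var, None)
    if not wrapped_secret_id:
        raise RuntimeError(f"{env_var} is not set")
    return role_id, wrapped_secret_id
```

The two values arrive through different channels and are combined only inside the running application.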

5.3 Application Requests a Client Token

Step 9: Unwrap to Obtain the Secret ID

After you complete the preceding steps, the application has both a Role ID and a wrapped Secret ID, which meets the conditions for the request to obtain the original Secret ID.

Turning to the running application, its first task is to immediately turn the wrapped Secret ID received from the trusted platform into a usable Secret ID. As described for Cubbyhole Response Wrapping, the original response can be obtained only by unwrapping the wrapped response, so this step unwraps the wrapped Secret ID to obtain the original Secret ID.

The application should send the unwrapping request to Vault as soon as possible, because an interception only becomes visible when a second unwrap attempt fails. The request must arrive before the wrapping token expires (it has a validity period) and can succeed only once (the token is single-use). If unwrapping fails, the wrapping token may have been intercepted and consumed, and this should be investigated immediately. In addition, the creation path of the wrapping token should be verified to guard against a secret that was unwrapped and illegitimately re-wrapped.
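
The checks described here can be sketched with Vault's wrapping endpoints: a lookup that inspects the wrapping token without consuming it, followed by the single-use unwrap. The helper names are hypothetical, and the expected creation path assumes the AppRole method is mounted at the default approle path.

```python
import json
import urllib.request


def lookup_request(vault_addr, wrapping_token):
    """Build the request that inspects a wrapping token without consuming it."""
    return urllib.request.Request(
        url=f"{vault_addr}/v1/sys/wrapping/lookup",
        data=json.dumps({"token": wrapping_token}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def unwrap_request(vault_addr, wrapping_token):
    """Build the single-use unwrap request; the wrapping token authenticates it."""
    return urllib.request.Request(
        url=f"{vault_addr}/v1/sys/wrapping/unwrap",
        data=b"{}",
        headers={"X-Vault-Token": wrapping_token, "Content-Type": "application/json"},
        method="POST",
    )


def created_by_role(lookup_body, role_name):
    """True if the wrapping token was created by the expected secret-id endpoint."""
    expected = f"auth/approle/role/{role_name}/secret-id"
    return json.loads(lookup_body)["data"]["creation_path"] == expected


def secret_id_from(unwrap_body):
    """Extract the real Secret ID from a successful unwrap response body."""
    return json.loads(unwrap_body)["data"]["secret_id"]
```

If created_by_role returns False, or the unwrap call itself fails, the application should stop and raise an alert instead of retrying.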

Step 10 ~ Step 11: Obtain a Client Token

After unwrapping, the application holds the original Secret ID alongside the Role ID, which is everything it needs to obtain a client token. The application logs in to the Vault server with the Role ID and Secret ID and receives a token for subsequent requests. The token carries the policies that grant access to the secrets the application needs.
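
A sketch of the login exchange, assuming the AppRole method is mounted at the default approle path. The functions only build the request and parse a response body; their names are illustrative.

```python
import json
import urllib.request


def login_request(vault_addr, role_id, secret_id):
    """Build the AppRole login request; no token is needed to call it."""
    body = {"role_id": role_id, "secret_id": secret_id}
    return urllib.request.Request(
        url=f"{vault_addr}/v1/auth/approle/login",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def client_token_from(login_body):
    """Extract the client token and its lifetime in seconds from a login response."""
    auth = json.loads(login_body)["auth"]
    return auth["client_token"], auth["lease_duration"]
```

The returned lifetime tells the application when it must renew the token or log in again.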

Step 12: Request to Obtain the Secret

At this point, the application holds an access token with the permissions needed to operate on the set of secrets the application requires. It can use the token for all subsequent secret requests to Vault.
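
For example, reading a key-value secret with the client token could be sketched as follows. This assumes a KV version 2 engine mounted at secret, matching the secret/data/... paths in the policy example earlier; the secret path and helper names are hypothetical.

```python
import json
import urllib.request


def read_secret_request(vault_addr, client_token, secret_path):
    """Build a read request for a KV v2 secret mounted at 'secret/'."""
    return urllib.request.Request(
        url=f"{vault_addr}/v1/secret/data/{secret_path}",
        headers={"X-Vault-Token": client_token},
        method="GET",
    )


def secret_values_from(response_body):
    """KV v2 nests the stored key/value pairs under data.data."""
    return json.loads(response_body)["data"]["data"]
```

Whether the request succeeds depends entirely on the policies attached to the token, not on anything the application asserts about itself.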

5.4 Summary of the Process

Four Interactions of the Trusted Broker and Applications with Vault

While the trusted process is being established, the trusted broker and the application interact with Vault four times in total. The application obtains the access token for subsequent requests only after all of these authN interactions are complete. The process above omits the authN of the trusted broker (trusted platform) itself, which is described in the CI pipeline scenario below. The four interactions are:

Obtain a trusted broker token: the trusted broker requests the original access token from Vault to issue the wrapped Secret ID used by subsequent applications.
Generate a wrapped Secret ID: the trusted broker requests the corresponding wrapped Secret ID from Vault according to the application's role name information.
Unwrap to obtain the Secret ID: the application uses the wrapped Secret ID sent by the trusted broker to initiate an unwrapping operation to Vault to redeem the real Secret ID.
Redeem the application's access token: the application uses the pre-set Role ID and the unwrapped Secret ID to redeem the token for subsequent access to Vault.
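
The four interactions can be sketched as one function. The transport is injected as a send callable (a hypothetical signature: method, path, headers, body, returning the parsed JSON response), and the broker's own login is passed in as broker_login because its method is platform-specific. In a real deployment, steps 1 and 2 run on the broker and steps 3 and 4 run in the application; they are shown together here only to make the order visible.

```python
def approle_flow(send, broker_login, role_name, role_id, wrap_ttl="120s"):
    """Walk through the four Vault interactions and return the application's token."""
    # 1. The trusted broker authenticates and receives its own token.
    broker_token = broker_login()

    # 2. The broker asks for a wrapped Secret ID, addressing the role by name.
    wrapped = send(
        "POST",
        f"auth/approle/role/{role_name}/secret-id",
        {"X-Vault-Token": broker_token, "X-Vault-Wrap-TTL": wrap_ttl},
        None,
    )
    wrapping_token = wrapped["wrap_info"]["token"]

    # 3. The application unwraps the token to obtain the real Secret ID.
    unwrapped = send("POST", "sys/wrapping/unwrap", {"X-Vault-Token": wrapping_token}, None)
    secret_id = unwrapped["data"]["secret_id"]

    # 4. The application logs in with Role ID + Secret ID and gets its access token.
    login = send("POST", "auth/approle/login", {}, {"role_id": role_id, "secret_id": secret_id})
    return login["auth"]["client_token"]
```

Note that the broker's token appears only in step 2 and the Role ID only in step 4, which mirrors the separation the solution depends on.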

Two Types of Secret ID

The two types of Secret ID are the wrapped Secret ID and the real Secret ID. Their lifecycles and configuration are independent of each other, and they differ in usage scenarios and in how problems with them are diagnosed. The wrapped Secret ID is a single-use identifier that is valid for a short time and exists mainly to protect the Secret ID in transit. It is essentially a wrapping token built on the Response Wrapping technology. The real Secret ID is used together with the Role ID to authenticate the role.

6. Specific Implementation Scenario

6.1 CI Pipeline: Sending the Secret ID by the Environment Variable

Figure 2 - Best practice for Vault AppRole authN method in the CI pipeline (referenced from Vault Blog)

CI Worker is Initialized to Obtain an Access Token

In the CI pipeline scenario, the CI Worker acts as the trusted broker and is responsible for sending the Secret ID to the CI Job. Because it handles, in principle, every Secret ID sent to a CI Job, it is the core node of the entire authN and authZ system.

In the whole usage process, the initialization of CI Worker is the starting point of the whole process, and its core steps correspond to Step 1 and Step 2 in the above figure. That is, authenticate to the Vault server to obtain the token issued after authN. This is the first step in the entire trusted process. When Vault completes the authN of the CI Worker, it indicates that Vault has trusted the CI Worker and allows the trusted process to be passed down.

Vault's official documentation recommends platform-level authN for the CI Worker, which generally relies on a trusted third-party platform. However, the "Vault returns a token" section of the referenced article also expresses confidence in the security of issuing tokens to the CI Worker, to the point of saying that an administrator can deliver the token the CI Worker needs in advance, or that the Role ID and Secret ID can simply be hard-coded.

This holds only if role permissions are managed according to best practice: the token Vault issues to the CI Worker should only have permission to obtain wrapped Secret IDs (see the policy example below), so the CI Worker's token alone cannot be used to read the secrets intended for the CI Job.

Policy Example to Obtain a Wrapped Secret ID

path "auth/approle/role/+/secret*" {
  capabilities = [ "create", "read", "update" ]
  min_wrapping_ttl = "100s"
  max_wrapping_ttl = "300s"
}

CI Worker Requests a Wrapped Secret ID

After the CI Worker is initialized, it has obtained a token with subsequent operation permissions and has the permission to request the Vault server to obtain the wrapped Secret ID required by the to-be-used CI Job. This corresponds to Step 3 and Step 4 in the above figure: the CI Worker requests Vault to obtain its corresponding wrapped Secret ID according to the role of the CI Job.

Two concepts deserve attention here. The first is the wrapped Secret ID. As introduced above, the Secret ID is delivered inside a single-use wrapping token with a validity period, so it can be unwrapped with the 'vault unwrap' command only within that period. This follows the second of the three basic principles: keep the validity period of authN as short as possible.

The second is the role. When the CI Worker requests a wrapped Secret ID, the only two pieces of information it has are the CI Job ID and the AppRole role name. It knows nothing about the CI Job's Role ID, so it can request the wrapped Secret ID only by role name.

This is the reason why Vault is so confident about the security of CI Worker tokens. It is because the CI Worker will never obtain the Role ID, but only the wrapped Secret ID. According to Vault's narrative, the core principle of this security mechanism is that during the authN from the broker to Vault, the Role ID and Secret ID will appear together only on the end-user system that needs to use the key.

The request is shown below. The -wrap-ttl flag sets the validity period of the wrapping token that carries the requested Secret ID, 120 seconds in this example. The my-role segment of the request path is the role name, not the Role ID.

Command Example to Request a Wrapped Secret ID

vault write -wrap-ttl=120s -f auth/approle/role/my-role/secret-id
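
The requested wrap TTL has to fall inside the min_wrapping_ttl and max_wrapping_ttl bounds of the CI Worker's policy shown earlier (100s to 300s). The sketch below is a simplified local check of that arithmetic; it handles only single-unit durations such as "120s" or "5m" and is not Vault's own duration parser.

```python
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def to_seconds(duration):
    """Convert strings such as '120s' or '5m'; only single-unit values are handled."""
    return int(duration[:-1]) * UNIT_SECONDS[duration[-1]]


def wrap_ttl_allowed(requested, min_ttl="100s", max_ttl="300s"):
    """Check a requested wrap TTL against the policy's wrapping TTL bounds."""
    return to_seconds(min_ttl) <= to_seconds(requested) <= to_seconds(max_ttl)
```

With these bounds, the 120-second request above is accepted, while a 30-second or 10-minute wrap TTL would be rejected by the policy.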

CI Worker Sends the Wrapped Secret ID to the CI Job

In Step 5, the CI Worker obtains the wrapped Secret ID from Vault and sends the wrapped Secret ID to the CI Job in the form of an environment variable after the CI Job is created. In this way, the CI Job can obtain the value from the environment variable. The following example shows how to request the wrapped Secret ID and execute the sending command: use the name of the CI job as the Role Name, request the corresponding wrapped Secret ID (valid for 300 seconds) from Vault, and then set it to the environment variable WRAPPED_SID of the task.

Example to Request the Wrapped Secret ID and Send it through an Environment Variable

environment {
    // Ask Vault for a wrapped Secret ID for this job's role and keep only the wrapping token.
    WRAPPED_SID = """${sh(
        returnStdout: true,
        script: '''curl --silent --request POST \
            --header "X-Vault-Token: $VAULT_TOKEN" \
            --header "X-Vault-Namespace: ${PROJ_NAME}_namespace" \
            --header "X-Vault-Wrap-TTL: 300s" \
            "$VAULT_ADDR/v1/auth/approle/role/$JOB_NAME/secret-id" \
            | jq -r '.wrap_info.token'
        '''
    ).trim()}"""
}

CI Job Unwraps the Wrapped Secret ID

Once started, the CI Job unwraps the wrapped Secret ID, that is, it exchanges the time-limited wrapping token for a real, valid Secret ID. This step is done with the 'vault unwrap' command or the corresponding API request and corresponds to Step 6 and Step 7 in the figure above. Each wrapped Secret ID can be unwrapped only once, which protects the entire process.

CI Job Obtains the Access Token

After unwrapping, the CI Job holds the real Secret ID. Because the CI Job already keeps its own Role ID, the running job now has both the Secret ID and the Role ID and can obtain from Vault the access token used in subsequent requests. This corresponds to Step 8 and Step 9 in the figure above.

At this point, the trusted authN process is fully established. Throughout the process, as Vault describes, the CI Worker delivers the wrapped Secret ID to the CI Job dynamically, the real Secret ID can be obtained only by unwrapping, and the Role ID is known only to the CI Job. The Role ID and the Secret ID therefore appear together only on the end-user system that needs the secret (here, the CI Job), which greatly reduces the risk of leaking the credentials.

CI Job Uses the Access Token to Perform the Secret Access Operation

Finally, as shown in Step 10, when the CI Job obtains the access token through the Role ID and Secret ID, subsequent request operations of the task are performed based on the token, and the access permissions it has are also controlled by the token.

6.2 Containerization Scenario: Sending the Secret ID by Mounting the Volume in a Container

Application Obtains the Secret ID

The containerization scenario is similar to the CI pipeline scenario, except that the Secret ID is delivered through a volume mounted in the container instead of an environment variable. The process is shown in Figure 3.1 and Figure 3.2 below. Figure 3.1 shows how the application obtains the Secret ID at startup, and Figure 3.2 shows how it obtains the access token and finally requests secrets. The process in Figure 3.1 is as follows:

The Role ID is pre-set before the application is started.
The pipeline requests the Vault server to obtain the wrapped Secret ID.
The pipeline writes the wrapped Secret ID to the storage that can be mapped to the container-mounted volume.
The application reads the wrapped Secret ID from the container-mounted volume when it starts.
The application initiates an unwrapping request for the wrapped Secret ID to the Vault server.
The Vault server returns the unwrapped real Secret ID to the application.

Figure 3.1 - Workflow of the application obtaining the wrapped Secret ID in a containerization scenario

Request a Client Token and Query Secrets

Figure 3.2 assumes that the application already holds both the Role ID and the Secret ID and can use them to request an access token from the Vault server. The process is as follows:

The application initiates a login request to the Vault server by carrying the Role ID and Secret ID to obtain an access token.
After the Vault server passes the verification, the access token corresponding to the role is returned to the application.
After obtaining the access token, the application deletes the wrapped Secret ID from the volume mounted in the container.
From then on, the application can use the access token to request the secrets it is authorized to access.

Figure 3.2 - Workflow of the application requesting the client token and querying secrets in a containerization scenario
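
The file handling in this scenario is small enough to sketch directly: read the wrapped Secret ID from the mounted volume at startup, then delete it once the access token has been obtained. The function names and the example path in the usage note are hypothetical.

```python
from pathlib import Path


def read_wrapped_secret_id(path):
    """Read the wrapped Secret ID that the pipeline wrote to the mounted volume."""
    return Path(path).read_text(encoding="utf-8").strip()


def clear_wrapped_secret_id(path):
    """Delete the file after use; a missing file is not treated as an error."""
    Path(path).unlink(missing_ok=True)
```

An application might call read_wrapped_secret_id("/var/run/secrets/vault/wrapped-sid") at startup and clear_wrapped_secret_id with the same path right after a successful login. Since the wrapping token is single-use, the file has no value once it has been unwrapped, so removing it only reduces clutter and confusion.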

7. Reflections on the Solution

7.1 Advantages

Security

What stands out most after studying the whole solution is Vault's uncompromising pursuit of security, which may frustrate developers who prioritize stability: the design is centralized around a single primary and comes with the operational risks of secret expiration and single-use tokens. As an identity-based secrets management system, however, it takes security control about as far as it can go. Each of the three core principles, whether narrowing the identity scope or shortening the validity period of authN, reduces the risk of secret leakage as much as possible.

The combination of Role ID and Secret ID provides a basic two-part authN mechanism. On top of that, fine-grained access control lets administrators assign different policies to each role, control an application's permissions precisely, and apply the principle of least privilege. The Role ID and Secret ID also have their own security controls, such as a Secret ID validity period and trusted source CIDR limits. Finally, periodic rotation of Secret IDs and access tokens minimizes the risks of using the same secret for a long time.

Auditability

Although this article does not focus on auditing, a look at Vault's audit module shows that auditability is deeply embedded in the product's design. Every operation a client performs after authN must use the access token obtained from that authN, so each client request can be traced precisely through the request records of its access token. This clean audit model follows largely from the design of Vault's external interface: the UI, CLI, and HTTP API are all built on the same API, so every client operation can be decomposed into HTTP operations on specific paths.

Flexibility

The flexibility of the solution shows in several ways. First, Vault provides many options for configuring a role, such as the CIDR blocks from which the role may log in, the validity period of the role's Secret ID, and the maximum number of times an access token can be used. Second, access control policies are flexible and general for the same reason given under Auditability: because Vault's external interface is a standard HTTP REST API, a policy needs only resource paths and operation types to describe the access control of an operation. In other words, what you see is what you get.

Statelessness

Unlike traditional identity authN based on machine fingerprints, Vault uses a combination of dynamic tokens and static identifiers to complete identity authN. During this process, the secret is converted twice, from the wrapped Secret ID to the original Secret ID, and then from the original Secret ID to the final access token.

Vault implements stateless identity authN by having the trusted broker platform deliver short-lived, single-use wrapped tokens, which the application combines with its static Role ID. In other words, it does not need to rely on another centralized data store to perform stateful verification of machine fingerprints during authN. This improves the flexibility and stability of the solution to a certain extent and avoids authN failures caused by inconsistencies among multiple data sources.
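The two conversions described above can be sketched as two HTTP calls. This is a minimal sketch under stated assumptions: the endpoint paths and response fields follow the Vault HTTP API as I understand it, while the `post` transport, the `fake_post` stand-in, and all token values are hypothetical so that the example runs without a Vault server.

```python
from typing import Callable, Dict, Optional

# Hypothetical transport signature: (path, json_body, headers) -> response dict.
Post = Callable[[str, Optional[Dict[str, str]], Dict[str, str]], dict]


def unwrap_secret_id(post: Post, wrapping_token: str) -> str:
    """Conversion 1: wrapped Secret ID -> original Secret ID."""
    # The wrapping token itself authenticates the unwrap request.
    resp = post("/v1/sys/wrapping/unwrap", None,
                {"X-Vault-Token": wrapping_token})
    return resp["data"]["secret_id"]


def login(post: Post, role_id: str, secret_id: str) -> str:
    """Conversion 2: Role ID + Secret ID -> access token."""
    resp = post("/v1/auth/approle/login",
                {"role_id": role_id, "secret_id": secret_id}, {})
    return resp["auth"]["client_token"]


def fake_post(path, body, headers):
    """Stand-in for a real HTTP client, returning canned responses."""
    if (path == "/v1/sys/wrapping/unwrap"
            and headers.get("X-Vault-Token") == "wrap-token-123"):
        return {"data": {"secret_id": "secret-id-abc"}}
    if (path == "/v1/auth/approle/login"
            and body == {"role_id": "role-id-xyz",
                         "secret_id": "secret-id-abc"}):
        return {"auth": {"client_token": "access-token-789"}}
    raise PermissionError(f"request to {path} rejected")


# The broker delivers the wrapping token; the Role ID is static configuration.
secret_id = unwrap_secret_id(fake_post, "wrap-token-123")
access_token = login(fake_post, "role-id-xyz", secret_id)
print(access_token)
```

In a real deployment, `fake_post` would be replaced by an HTTP client pointed at the Vault address; no machine fingerprint or external state is consulted at either step.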

7.2 Possible Problems

Complexity and Cost of Learning

Overall, the learning cost of Vault solutions is relatively high, which can be seen as an inevitable consequence of high flexibility and high configurability. On the one hand, Vault provides many entity models, such as identities, aliases, entities, policies, tokens, and token accessors, each with its own operations. Coupled with the numerous authN methods and secret engines, this makes the whole solution costly to learn. On the other hand, because there are so many entity models, managing and maintaining these entities effectively is also costly.

For example, according to Vault's design, a default policy is usually applied directly to an access token, but you can also apply a policy to the entity behind the token, so that the token's policies are superimposed. You can go further and apply a policy to the identity group that contains the entity, which is then superimposed on the first two, and group policies can in turn be inherited. At this point, flexibility is maximized, but so is complexity.
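The superposition described above can be modelled as a set union. The sketch below is a simplified model rather than Vault's implementation: it assumes that a member of a group also inherits the policies of that group's parent groups, and all policy and group names are hypothetical.

```python
def groups_with_ancestors(direct_groups, parent_of):
    """Collect the entity's groups plus every group they belong to."""
    seen = set()
    stack = list(direct_groups)
    while stack:
        group = stack.pop()
        if group in seen:
            continue  # guards against cycles and duplicate work
        seen.add(group)
        stack.extend(parent_of.get(group, ()))
    return seen


def effective_policies(token_policies, entity_policies, direct_groups,
                       group_policies, parent_of):
    """Union of token, entity, and (inherited) group policies."""
    result = set(token_policies) | set(entity_policies)
    for group in groups_with_ancestors(direct_groups, parent_of):
        result |= set(group_policies.get(group, ()))
    return result


# Hypothetical identity data for one application.
policies = effective_policies(
    token_policies={"default", "app-read"},
    entity_policies={"team-secrets"},
    direct_groups={"backend"},
    group_policies={"backend": {"db-creds"}, "engineering": {"ci-read"}},
    parent_of={"backend": ["engineering"]},
)

print(sorted(policies))
# ['app-read', 'ci-read', 'db-creds', 'default', 'team-secrets']
```

Even in this toy model, answering "what can this token do?" requires looking in four places, which illustrates why the maintenance cost grows with the number of entity models.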

Performance and Extensibility

Since Vault uses a single-primary write architecture built on the Raft consensus protocol, the system may hit performance bottlenecks under a large number of requests, especially during intensive write operations. This problem may be more pronounced in AppRole authN scenarios, which involve a large number of write operations. In the basic open-source solution, the single-primary architecture limits the benefit of horizontal scaling: adding more standby nodes does not improve the write performance of the system; at best it improves read performance and redundancy.

On the one hand, although Vault provides highly available cluster deployment and replication between data centers, setting up and managing these complex architectures remains a significant challenge. With respect to data synchronization and consistency, unexpected failures may occur when network quality is poor or when the deployment spans regions. On the other hand, the solution provides many extended features, such as lifecycle management of secrets and tokens and lease renewal. As the data scale grows, the complexity and cost of maintaining and synchronizing this state information may keep rising and may even affect the basic functions of the system.

Stability and Robustness

From the perspective of technical architecture, one potential problem of the single-primary architecture is its impact on stability. Since only the primary node in the cluster can process write requests, when the primary node goes down, the cluster must fail over and elect a new primary node. This process may cause a short service interruption, which poses a great risk for a key management service that needs to perform high-frequency write and refresh operations.

In terms of security design, the solution involves a large amount of entity lifecycle maintenance, and each entity must continuously renew its secrets to keep them valid, which may amplify the stability risk. At the same time, for security reasons, the user should not persist the obtained Secret ID and token to disk. This often makes the automation system strongly dependent on the broker, so that any operation that bypasses the broker will eventually fail (such as restarting the application directly through a script on the machine in an emergency). Yet such operations may be very common in large-scale microservice systems.

The one-time validity of the wrapped Secret ID may also reduce the robustness of the system in abnormal situations and lower the fault tolerance of the entire system. For example, suppose the Secret ID is lost after a single-use wrapped Secret ID has been unwrapped. Since the wrapped Secret ID has already been consumed, the unwrapping operation cannot be performed again. In this case, the only remedy is for the trusted broker to send a new wrapped Secret ID.
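The failure mode and its only recovery path can be illustrated with an in-memory simulation. This is a toy model, not Vault code: `OneTimeWrapper` stands in for the wrapping mechanism, `request_new_token` stands in for the trusted broker, and all names and values are hypothetical.

```python
import secrets


class OneTimeWrapper:
    """Toy stand-in for single-use wrapping tokens."""

    def __init__(self):
        self._store = {}

    def wrap(self, secret_id):
        token = secrets.token_hex(8)
        self._store[token] = secret_id
        return token

    def unwrap(self, token):
        try:
            # pop() consumes the token, so a second unwrap must fail.
            return self._store.pop(token)
        except KeyError:
            raise PermissionError("wrapping token is invalid or already used")


def fetch_secret_id(wrapper, wrapping_token, request_new_token):
    """Unwrap once; if the token is already consumed, go back to the broker."""
    try:
        return wrapper.unwrap(wrapping_token)
    except PermissionError:
        # The only recovery path: the broker must issue a fresh wrapped
        # Secret ID. A real system would likely also raise an alert here,
        # since an unexpected consumed token can indicate interception.
        return wrapper.unwrap(request_new_token())


wrapper = OneTimeWrapper()
token = wrapper.wrap("secret-id-1")

first = wrapper.unwrap(token)  # normal case: succeeds exactly once
# Simulate losing `first` (for example, the process crashed before login).
recovered = fetch_secret_id(wrapper, token,
                            lambda: wrapper.wrap("secret-id-2"))

print(first, recovered)
```

The sketch makes the dependency explicit: without a reachable broker behind `request_new_token`, the application has no way to recover on its own.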

References

[01] What is Vault?
[02] AppRole auth method
[03] Vault Tokens
[04] Policies
[05] How (and Why) to Use AppRole Correctly in HashiCorp Vault
[06] Recommended pattern for Vault AppRole use
[07] Cubbyhole response wrapping

Disclaimer: The views expressed herein are for reference only and don't necessarily represent the official views of Alibaba Cloud.