Identity and access management - Alibaba Cloud Well-Architected Framework

Design and implement identity and access management (IAM) so that only verified identities can reach the cloud resources they need, under the conditions you define.

In cloud security architecture, identity is the perimeter. Every request to a cloud resource passes through an identity check: a developer logging in, an application calling an API, or a background service accessing a database. A well-designed IAM system verifies who is making the request, decides what that identity is allowed to do, and records a complete audit trail.

Weak IAM design is one of the most common root causes of cloud security incidents. Overly permissive roles, shared credentials, and unmonitored access all create avoidable attack surface.

Terminology

Term	Definition
Authentication (AuthN)	Verifying that an identity is who or what it claims to be.
Authorization (AuthZ)	Determining which resources an authenticated identity is permitted to access and which actions it can perform.
Least privilege	Granting each identity only the permissions required to perform its function, and nothing more.
Workload identity	An identity assigned to an application, service, or automated process rather than a human user.

Identity types

Before designing controls, classify every identity that interacts with your workload. Two broad categories exist:

- Human identities: People who access your Alibaba Cloud environment or applications.
  - Administrators and operators: Internal team members who manage cloud resources, deploy infrastructure, and respond to incidents.
  - Developers: Engineers who need access to build, test, and debug workloads.
  - End users: Customers or partners who interact with your application directly.
- Workload identities: Non-human principals such as applications, microservices, batch jobs, and automation scripts that call Alibaba Cloud APIs or access other services.

Classifying identities into these categories up front helps close coverage gaps in your IAM design. Incidents often involve an identity type that was never formally considered during design.

Authentication

Authentication answers the question: Is this identity who it claims to be?

Design authentication to be strong by default, with no reliance on shared or static credentials.

Best practices for authentication

Use RAM users and RAM roles instead of the root account. The Alibaba Cloud root account has unrestricted access to all resources. Reserve it for initial account setup only, then lock it down with a strong password and multi-factor authentication (MFA). For all day-to-day operations, use Resource Access Management (RAM) users or roles.
Require MFA for human identities. Passwords alone are insufficient for privileged access. Enforce MFA on all RAM users, especially those with administrative permissions. MFA significantly reduces the risk from compromised credentials.
Use RAM roles for workload identities. Applications and services running on Alibaba Cloud — for example, on Elastic Compute Service (ECS) instances, Function Compute, or Container Service for Kubernetes — should assume RAM roles to obtain temporary, automatically rotated credentials. This eliminates the need to embed long-term access keys in code or configuration files.

Tradeoff: Role-based workload identity requires that your runtime environment supports credential injection (for example, instance metadata or an OIDC token endpoint). For environments that do not support this, manage access keys manually — rotate them regularly and store them in a dedicated secrets manager such as Key Management Service (KMS).
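
As a rough illustration of credential injection through instance metadata, the sketch below parses the temporary credentials that an ECS instance with an attached RAM role can read from its metadata service and decides when to refresh them. The role name `my-app-role`, the five-minute refresh margin, and the helper names are assumptions made for this example. The sketch also assumes the metadata service is reachable without a session token (hardened metadata mode would need an extra step). In practice, the official Alibaba Cloud SDK credential providers handle this for you.

```python
import json
from datetime import datetime, timedelta, timezone
from urllib.request import urlopen

# Assumed metadata path; "my-app-role" is a hypothetical RAM role name.
METADATA_URL = (
    "http://100.100.100.200/latest/meta-data/ram/security-credentials/my-app-role"
)


def parse_credentials(body: str) -> dict:
    """Parse the JSON document returned by the metadata service."""
    doc = json.loads(body)
    if doc.get("Code") != "Success":
        raise RuntimeError(f"metadata service returned {doc.get('Code')!r}")
    expires = datetime.strptime(doc["Expiration"], "%Y-%m-%dT%H:%M:%SZ")
    return {
        "access_key_id": doc["AccessKeyId"],
        "access_key_secret": doc["AccessKeySecret"],
        "security_token": doc["SecurityToken"],
        "expires_at": expires.replace(tzinfo=timezone.utc),
    }


def needs_refresh(creds: dict, now: datetime,
                  margin: timedelta = timedelta(minutes=5)) -> bool:
    """Return True once the credentials are within `margin` of expiry."""
    return now >= creds["expires_at"] - margin


def fetch_credentials(url: str = METADATA_URL, timeout: float = 2.0) -> dict:
    """Fetch fresh credentials; only works from inside an ECS instance."""
    with urlopen(url, timeout=timeout) as resp:
        return parse_credentials(resp.read().decode("utf-8"))
```

Because the credentials expire on their own, nothing long-lived needs to be written to disk or checked into configuration; the application calls `fetch_credentials()` again whenever `needs_refresh()` returns true.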

Avoid long-term access keys wherever possible. Static access keys do not expire and are frequently leaked through source repositories, log files, or misconfigured storage. When access keys are required, grant them the minimum necessary permissions and rotate them on a defined schedule.
Enforce strong password policies for RAM users. Set minimum password length, require complexity (mixed case, numbers, special characters), and configure password expiration.
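
Where long-term access keys cannot be avoided, a scheduled check can flag keys that have outlived the rotation window. The following is a minimal sketch: the 90-day window and the shape of the key inventory (a list of dictionaries with `id` and `created_at`) are assumptions for illustration, not an Alibaba Cloud API response.

```python
from datetime import datetime, timedelta, timezone

# Assumed rotation window; choose a value that matches your own policy.
MAX_KEY_AGE = timedelta(days=90)


def keys_due_for_rotation(keys, now, max_age=MAX_KEY_AGE):
    """Return the IDs of access keys whose age is at least `max_age`.

    `keys` is a hypothetical inventory: dicts with "id" and a
    timezone-aware "created_at" datetime.
    """
    return [k["id"] for k in keys if now - k["created_at"] >= max_age]
```

A job like this could run daily and open a ticket for each returned key ID, so rotation follows the defined schedule rather than depending on someone remembering it.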

Non-identity-based secrets

IAM covers identities, but applications also authenticate to other services using non-identity credentials such as API keys, database passwords, and TLS certificates. These secrets require their own management discipline:

Store secrets in a dedicated secret management service (such as Alibaba Cloud KMS) rather than in environment variables, source code, or configuration files checked into version control.
Pull secrets dynamically at runtime rather than baking them into container images or deployment artifacts.
Build in the ability to revoke and rotate any secret quickly. If a secret is compromised, you need to be able to invalidate it within minutes.
Set expiry dates on secrets and enforce rotation before expiry.
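
The practices above (pull at runtime, rotate quickly, bound the lifetime) can be sketched as a small in-process cache that sits in front of the secret store. The `fetch` callable is a stand-in for whatever client actually retrieves the secret, and the class name and the five-minute TTL are assumptions for this example.

```python
import time
from typing import Callable, Dict, Tuple


class SecretCache:
    """Cache secrets in memory for a short time instead of baking them in."""

    def __init__(self, fetch: Callable[[str], str],
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch          # hypothetical call into the secret store
        self._ttl = ttl_seconds      # a rotated value is picked up within one TTL
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}

    def get(self, name: str) -> str:
        """Return the cached value, refetching once the TTL has elapsed."""
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]
        value = self._fetch(name)
        self._entries[name] = (value, now + self._ttl)
        return value

    def invalidate(self, name: str) -> None:
        """Drop a cached value immediately, for example after revocation."""
        self._entries.pop(name, None)
```

A short TTL means a rotated secret reaches every running instance without a redeploy, and `invalidate` gives an incident responder a way to force an immediate refetch.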

Authorization

Authorization answers the question: What is this identity allowed to do?

A secure authorization design limits the blast radius of any compromised identity. Under such a design, an attacker who gains access to a credential can reach only a small, well-defined set of resources.

Best practices for authorization

Apply least privilege to every identity. Grant only the permissions an identity needs to perform its specific function. Avoid wildcard actions (*) and wildcard resources (*) in RAM policies. Review permissions regularly and remove any that are no longer needed.

Tradeoff: Granular permission policies reduce risk but add operational overhead. Each workload or role requires a custom policy, and permission errors can block legitimate operations. Invest in a repeatable policy review process and use Alibaba Cloud's permission simulation tools to test policies before applying them.
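
To make the least-privilege guidance concrete, the sketch below contrasts a narrowly scoped RAM policy with a wildcard one and flags the broad statements. The bucket name, the policies, and the `find_wildcards` helper are illustrative assumptions. The check is deliberately simple and is not a substitute for a real policy analysis tool.

```python
# A scoped policy: one read action on one (hypothetical) bucket prefix.
SCOPED_POLICY = {
    "Version": "1",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["oss:GetObject"],
        "Resource": ["acs:oss:*:*:example-bucket/reports/*"],
    }],
}

# The pattern to avoid: every action on every resource.
BROAD_POLICY = {
    "Version": "1",
    "Statement": [{"Effect": "Allow", "Action": "*", "Resource": "*"}],
}


def _as_list(value):
    """RAM policy fields may hold a single string or a list of strings."""
    return value if isinstance(value, list) else [value]


def find_wildcards(policy: dict) -> list:
    """Flag Allow statements that use '*', 'service:*', or resource '*'."""
    findings = []
    for index, stmt in enumerate(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        for action in _as_list(stmt.get("Action", [])):
            if action == "*" or action.endswith(":*"):
                findings.append(f"statement {index}: broad action {action!r}")
        for resource in _as_list(stmt.get("Resource", [])):
            if resource == "*":
                findings.append(f"statement {index}: wildcard resource")
    return findings
```

Running a check like this in a review pipeline gives the repeatable policy review process a cheap first gate before a human looks at the change.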

Use RAM roles for cross-account and cross-service access. When a service or user in one Alibaba Cloud account needs to access resources in another account, use role assumption (Security Token Service (STS) AssumeRole) with a well-scoped trust policy. This avoids sharing long-term credentials across account boundaries.
Separate duties across roles. No single identity should be able to both create and approve the same change, or both deploy infrastructure and administer security policies. Separation of duties limits the damage from insider threats and compromised accounts.
Scope access by resource and condition. RAM policies support conditions on the request context, such as source IP address ranges, time restrictions, and MFA requirements. Use conditions to narrow access beyond just the action and resource. For example, restrict sensitive administrative operations to requests originating from your corporate network.
Audit and right-size permissions regularly. Over time, identities accumulate permissions that are no longer needed — a pattern called permission creep. Use Alibaba Cloud's access analysis tools to identify unused permissions and remove them.
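
As an illustration of condition-scoped access, the sketch below shows a RAM policy statement that allows selected administrative actions only from a corporate network range and only when MFA was used, together with a simplified evaluator for those two condition operators. The CIDR range (a documentation range) and the chosen actions are assumptions. The evaluator is a teaching aid that covers only `IpAddress` and `Bool`; it does not reproduce RAM's actual evaluation logic.

```python
import ipaddress

ADMIN_POLICY = {
    "Version": "1",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["ram:CreatePolicy", "ram:AttachPolicyToUser"],
        "Resource": "*",
        "Condition": {
            # Assumed corporate egress range (documentation address block).
            "IpAddress": {"acs:SourceIp": ["203.0.113.0/24"]},
            "Bool": {"acs:MFAPresent": ["true"]},
        },
    }],
}


def condition_matches(condition: dict, context: dict) -> bool:
    """Return True only if every condition key is satisfied by the request."""
    for operator, checks in condition.items():
        for key, expected in checks.items():
            values = expected if isinstance(expected, list) else [expected]
            actual = context.get(key)
            if actual is None:
                return False  # a missing context key never satisfies a condition
            if operator == "IpAddress":
                addr = ipaddress.ip_address(actual)
                if not any(addr in ipaddress.ip_network(v) for v in values):
                    return False
            elif operator == "Bool":
                if str(actual).lower() not in [str(v).lower() for v in values]:
                    return False
            else:
                raise ValueError(f"operator {operator!r} not covered by this sketch")
    return True
```

All conditions in a statement must hold together, so a request from inside the corporate range but without MFA is still rejected.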

Inside-out and outside-in access

Map access from two directions to ensure full coverage:

Outside-in access: Who and what accesses your workload from outside? This includes end users calling your application's API, developers connecting to management interfaces, and third-party services sending data. Each entry point must have a defined identity model and access policy.
Inside-out access: What does your workload access? Applications typically read from and write to databases, retrieve secrets, publish messages to queues, and log telemetry to monitoring services. Each outbound call requires a workload identity with the appropriate permissions, scoped as narrowly as possible.

Mapping both directions prevents a common gap where inbound identity is carefully designed but outbound workload access is left with overly broad permissions.
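
One lightweight way to keep such a map honest is to record each flow as data and check it for gaps. The sketch below is an assumed inventory format, not an Alibaba Cloud feature; the field names and example flows are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AccessFlow:
    direction: str            # "outside-in" or "inside-out"
    source: str               # who or what initiates the call
    target: str               # what is being accessed
    identity: Optional[str]   # the identity used, or None if not yet defined


def unmapped_flows(flows: List[AccessFlow]) -> List[AccessFlow]:
    """Return flows that have no identity assigned."""
    return [flow for flow in flows if not flow.identity]


def missing_directions(flows: List[AccessFlow]) -> List[str]:
    """Return directions that have no documented flow at all."""
    seen = {flow.direction for flow in flows}
    return [d for d in ("outside-in", "inside-out") if d not in seen]
```

An inventory in which `missing_directions` returns `["inside-out"]` is a sign of exactly the gap described above: inbound access was designed, outbound access was not.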

Monitoring and auditing

Authentication and authorization controls are incomplete without visibility into when they fail or are misused. Monitoring and auditing close this gap — recording what identities do and alerting when behavior is anomalous.

Best practices for monitoring and auditing

Enable ActionTrail for all accounts. Alibaba Cloud ActionTrail records every API call made in your account, including the caller identity, source IP, timestamp, and outcome. Enable ActionTrail across all accounts in your organization and deliver logs to a centralized, tamper-resistant storage location.

Tradeoff: Storing and querying high-volume audit logs incurs storage and query costs. Define a retention policy based on your compliance requirements and filter out low-value noise — for example, read-only API calls to non-sensitive resources — before long-term archival.
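
The noise filtering described in this tradeoff could look roughly like the sketch below, which keeps every write event and keeps read events only for services treated as sensitive. The field names `eventRW` and `serviceName` follow the ActionTrail event format as understood here and should be verified against real events. The set of sensitive services is an assumption for this example.

```python
# Assumed list of services whose read-only calls are still worth archiving.
SENSITIVE_SERVICES = {"Ram", "Kms", "Sts"}


def keep_for_archive(event: dict) -> bool:
    """Decide whether an audit event goes to long-term storage."""
    if event.get("eventRW") != "Read":
        return True  # keep writes and anything that cannot be classified
    return event.get("serviceName") in SENSITIVE_SERVICES


def filter_events(events):
    """Return only the events selected for long-term archival."""
    return [event for event in events if keep_for_archive(event)]
```

Events with a missing or unexpected `eventRW` value are kept, so a schema change makes the filter archive more rather than silently dropping records.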

Set alerts for high-risk identity events. Define what normal looks like for each identity and alert on deviations. Key events to monitor include: root account logins, MFA disablement, policy changes that expand permissions, access key creation, and access from unfamiliar IP addresses or regions.
Review access logs regularly. Automated alerts catch known patterns, but regular manual review can surface subtle anomalies that rules miss. Include access log review in your security operations cadence.
Centralize logs across accounts. In multi-account environments, aggregate ActionTrail logs into a dedicated security account that operational teams cannot modify. This keeps logs intact even if an attacker compromises an individual workload account.
Implement automated response for critical findings. For the highest-severity identity events — for example, root account usage or unexpected IAM policy changes — automate an immediate response such as disabling the affected credential, notifying the security team, or triggering an incident workflow.
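
The alerting and automated-response practices above can be sketched as a small rule set applied to each audit event. The event field names, the `root-account` identity type, the event names, the known network range, and the response actions are all assumptions made for illustration. Check them against real ActionTrail events and your own incident process before relying on anything like this.

```python
import ipaddress

# Assumed corporate network range (documentation address block).
KNOWN_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]

# Assumed mapping from event names to finding labels.
HIGH_RISK_EVENTS = {
    "UnbindMFADevice": "mfa-disabled",
    "CreateAccessKey": "access-key-created",
    "AttachPolicyToUser": "policy-attached",
    "AttachPolicyToRole": "policy-attached",
}

# Hypothetical response actions; only the most severe findings act automatically.
RESPONSES = {
    "root-account-activity": "page-security-team",
    "mfa-disabled": "open-incident",
}


def classify(event: dict) -> list:
    """Return the finding labels that apply to one audit event."""
    findings = []
    if event.get("userIdentity", {}).get("type") == "root-account":
        findings.append("root-account-activity")
    label = HIGH_RISK_EVENTS.get(event.get("eventName"))
    if label:
        findings.append(label)
    source = event.get("sourceIpAddress")
    if source:
        try:
            addr = ipaddress.ip_address(source)
        except ValueError:
            addr = None  # some events may carry a non-IP source; skip the check
        if addr is not None and not any(addr in net for net in KNOWN_NETWORKS):
            findings.append("unfamiliar-source-ip")
    return findings


def plan_response(findings: list) -> list:
    """Map findings to response actions; findings without a rule only alert."""
    return [RESPONSES.get(finding, "alert-only") for finding in findings]
```

Keeping classification separate from response makes it easier to review new rules in alert-only mode before allowing them to trigger an automated action.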