Overview of ECS instance security - Elastic Compute Service

Protect your ECS instances by securing accounts, hardening operating systems, encrypting data, isolating networks, and building a layered security defense system.

Background

Alibaba Cloud protects the underlying infrastructure, such as data centers and virtualization platforms. You are responsible for following security best practices: protecting your accounts, keeping information confidential, and controlling permissions.

Cloud security trends

Cybersecurity threats

Cybersecurity threats are increasing rapidly. The State of Security 2022 report by Splunk reveals:

49% of organizations suffered data breaches in the past two years, up from 39% a year earlier.
79% of respondents faced ransomware attacks, and 35% lost access to data and systems in one or more of those attacks.
59% of security teams devoted significant time and resources to remediation, up from 42% a year earlier.
Recovery from unplanned downtime averaged 14 hours, at an estimated cost of USD 200,000 per hour.

The shift to cloud-based architectures introduces new security challenges. A single accidental operation can expose applications to the Internet or disclose keys. Security and compliance are essential to your cloud journey.

Prepare to protect cloud information assets

Security is an ongoing process. Build and refine your security posture continuously.

Develop a holistic security strategy with integrated protection policies, tools, and controls.
Build security into DevOps.
Build an automated security defense system to protect your systems.
Follow cloud security compliance standards.

You must also:

Identify and categorize all information assets.
Define what asset data to protect.
Define who can access the asset data and for what purposes.

Protect cloud information assets

Cloud security uses policies, controls, and technologies to protect data, infrastructure, and applications from external and internal threats. To ensure compliance, develop cloud-based applications with security in mind from the start.

Use the following best practices to protect your cloud information assets.

Item	Best practice	Description
Account security	Protect Alibaba Cloud accounts	Enable MFA for accounts. Use RAM users instead of Alibaba Cloud accounts with appropriate permissions policies. Use instance RAM roles instead of AccessKey pairs to call API operations. Prevent AccessKey pair leaks. Follow security suggestions for account and password management.
Application resource management	Manage information assets in bulk	Use tags to manage resources in bulk. Use Cloud Assistant to automate O&M on resource channels. Use Cloud Config for compliance audits on resources.
Information and data security	Enable security compliance when you create instances	Host business that requires high security on security-enhanced instances. Use secure images. Encrypt data on disks. Use snapshots for disaster recovery purposes. Access instance metadata in security hardening mode.
Network environment security	Properly separate permissions on network resources	Follow security suggestions for network resource isolation. Build a secure network environment.
Application security	Use security services to build a security defense system	Use Anti-DDoS Origin Basic (free-of-charge), Anti-DDoS Pro, and Anti-DDoS Premium to mitigate DDoS attacks. Use Security Center Basic for free to protect against exploitation of system vulnerabilities. Use Web Application Firewall (WAF) to protect against exploitation of system vulnerabilities.
Guest OS application security	Protect applications in instance guest operating systems	Configure security settings for secure ECS instance logons. Encrypt data in transit. Monitor and audit log exceptions.

Protect Alibaba Cloud accounts

Enable MFA for accounts

Enable MFA for your Alibaba Cloud account. In addition to a username and password, MFA requires a security code that is dynamically generated by an MFA device.

Use RAM users with least-privilege policies

Grant access permissions on ECS resources to RAM users based on the principle of least privilege. Do not share account information with third parties or grant unnecessary permissions. Create RAM users (or user groups) and attach specific permissions policies for fine-grained access control. See RAM users.

RAM user

If multiple ECS instances need to be accessed by different entities such as employees, systems, and applications, create RAM users with only required permissions to prevent security risks.
RAM user group
- Create multiple user groups with different permissions policies. For example, to enhance network security, deny access to specific ECS resources from IP addresses outside your corporate network and assign the policy to a user group.
- Create user groups to manage permissions by job responsibility. For example, if a developer becomes a system administrator, move the account from the Developers group to the SysAdmins group.
Policy for a user group
- SysAdmins: Needs permissions to create and manage ECS resources. Attach a policy that allows all operations on instances, images, snapshots, and security groups.
- Developers: Needs only permissions to use instances. Attach a policy that allows calling DescribeInstances, StartInstances, StopInstances, RunInstances, and DeleteInstance.
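
The Developers policy above can be sketched as a RAM policy document. This is a minimal illustration that assumes the standard RAM policy layout (`Version`, `Statement`, `Effect`, `Action`, `Resource`) and the `ecs:` action prefix. The helper name `build_allow_policy` is hypothetical, and `Resource` is left as `*` only for brevity.

```python
import json

# Actions the text lists for the Developers group, written with the
# "ecs:" service prefix used in RAM policy documents.
DEVELOPER_ACTIONS = [
    "ecs:DescribeInstances",
    "ecs:StartInstances",
    "ecs:StopInstances",
    "ecs:RunInstances",
    "ecs:DeleteInstance",
]


def build_allow_policy(actions, resource="*"):
    """Return a policy document (as a dict) that allows only `actions`."""
    return {
        "Version": "1",
        "Statement": [
            {"Effect": "Allow", "Action": list(actions), "Resource": resource}
        ],
    }


developers_policy = build_allow_policy(DEVELOPER_ACTIONS)
print(json.dumps(developers_policy, indent=2))
```

In practice, narrow `resource` to specific instances or resource groups so that the policy stays within the principle of least privilege.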

Use instance RAM roles instead of AccessKey pairs to call API operations

Applications deployed on ECS instances typically access other Alibaba Cloud APIs using AccessKey pairs. However, storing AccessKey pairs on instances carries security risks such as leaks and maintenance difficulties. Instance RAM roles address these risks by using RAM for fine-grained permission control without exposing AccessKey pairs.

Attach an instance RAM role to an ECS instance and use a Security Token Service (STS) temporary credential to access other Alibaba Cloud APIs. After attaching the role, access instance metadata in security hardening mode. The STS temporary credential is updated periodically. See Instance RAM roles.
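
Because the STS temporary credential is updated periodically, an application should check the expiration time before it reuses a cached credential. The sketch below assumes that the credential is a JSON object with an `Expiration` field in UTC (`YYYY-MM-DDTHH:MM:SSZ`). The function name `needs_refresh`, the five-minute margin, and the sample values are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone


def needs_refresh(credential, now=None, margin=timedelta(minutes=5)):
    """Return True when the credential expires within `margin` of `now`."""
    now = now or datetime.now(timezone.utc)
    expires = datetime.strptime(credential["Expiration"], "%Y-%m-%dT%H:%M:%SZ")
    expires = expires.replace(tzinfo=timezone.utc)
    return expires - now <= margin


# Placeholder credential; the values are not real.
sample_credential = {
    "AccessKeyId": "STS.example",
    "AccessKeySecret": "example-secret",
    "SecurityToken": "example-token",
    "Expiration": "2030-01-01T06:00:00Z",
}
```

Fetch a new credential from the instance metadata whenever `needs_refresh` returns `True`, instead of storing a long-lived AccessKey pair on the instance.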

Prevent AccessKey pair leaks

AccessKey pairs are API credentials and must be kept secure. Do not expose them to external channels such as GitHub. If your AccessKey pair is disclosed, your resources are at risk.

Security suggestions for AccessKey pairs:

Do not embed AccessKey pairs in code.
Change AccessKey pairs regularly.
Revoke unnecessary AccessKey pairs regularly.
Use RAM users based on the principle of least privilege.
Enable log audit and deliver logs to OSS and Simple Log Service (SLS) for storage and audits.
Set acs:SourceIp to control access from specific public IP addresses to Alibaba Cloud APIs.
Set acs:SecureTransport to true, which indicates that the features and resources are accessed over HTTPS.
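
The last two suggestions can be combined in one policy condition. The following sketch assumes the RAM condition operators `IpAddress` and `Bool`. The helper name `build_restricted_policy` is hypothetical, and the CIDR block `203.0.113.0/24` is a documentation placeholder, not a real corporate range.

```python
import json


def build_restricted_policy(actions, allowed_cidrs):
    """Allow `actions` only over HTTPS and only from `allowed_cidrs`."""
    return {
        "Version": "1",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": "*",
                "Condition": {
                    # Requests must come from one of these public CIDR blocks.
                    "IpAddress": {"acs:SourceIp": list(allowed_cidrs)},
                    # Requests must be sent over HTTPS.
                    "Bool": {"acs:SecureTransport": "true"},
                },
            }
        ],
    }


restricted_policy = build_restricted_policy(
    ["ecs:DescribeInstances"], ["203.0.113.0/24"]
)
print(json.dumps(restricted_policy, indent=2))
```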

Security suggestions for managing accounts and passwords

Category	Policy description
Alibaba Cloud accounts	MFA must be enabled for administrative accounts. Configure tiered permissions and follow the principle of least privilege. Disable root access to APIs or common request methods.
Keys and credentials	Do not use expired certificates or credentials. Delete the access keys for root accounts. Regularly remove keys and credentials that have been unused for more than 30 days. Monitor key and credential usage. Regularly scan your Git repository and its history for potential key leaks.
Password	Change passwords regularly and ensure they meet strength requirements. Enforce password complexity policies. Use unique complex passwords that differ from those on other platforms to prevent cross-platform security threats from password leaks. Store AccessKey pairs and password information securely in Key Management Service (KMS). Do not store them in plaintext on disks. Do not use the same password or key pair for different accounts on a host.
Confidential data in KMS	Storing confidential data in plaintext on disks poses leakage risks. Activate KMS to enable data encryption in cloud services without maintaining your own cryptographic infrastructure. For example, enable disk encryption and trusted boot on ECS instances.
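
A leak scan such as the one suggested for Git repositories can start with a simple pattern match. The sketch below is a heuristic only: it assumes that AccessKey IDs begin with `LTAI` followed by 12 to 20 alphanumeric characters. The function name `scan_text` and the sample string are hypothetical.

```python
import re

# Heuristic pattern (an assumption): "LTAI" plus 12-20 alphanumeric characters.
ACCESS_KEY_ID_PATTERN = re.compile(r"\bLTAI[0-9A-Za-z]{12,20}\b")


def scan_text(text):
    """Return (line_number, match) pairs for strings that look like AccessKey IDs."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in ACCESS_KEY_ID_PATTERN.findall(line):
            findings.append((line_number, match))
    return findings


# Fabricated sample; the value below is not a real AccessKey ID.
sample_text = "region = cn-hangzhou\nkey_id = LTAIabcdefgh12345678\n"
print(scan_text(sample_text))
```

To cover history as well as the working tree, pass the output of a command such as `git log -p` to `scan_text`. Treat every finding as a prompt to revoke and rotate the key, not only to delete the line.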

Manage information assets in bulk

Manage and audit cloud resources in bulk to prevent assets from being left unprotected due to misconfigurations. Use uniform deployment and naming conventions for instances and security groups, and periodically check for non-compliant resources. Use tags for bulk management, Cloud Assistant for automated O&M, and Cloud Config for compliance audits.

Use tags to manage resources in bulk

Use tags to identify, categorize, and find cloud resources in bulk. In a security incident, tags help quickly identify the scope and severity.
Configure security policies with specific tags for resources such as security groups at once.

See Tags.

Use Cloud Assistant to automate O&M on resource channels

Traditional O&M relies on SSH, which requires keys and open ports. Both are security risks if mismanaged. Cloud Assistant is a native O&M tool for ECS that enables batch script execution and file distribution without passwords, logons, or jump servers. Cloud Assistant provides session management for interactive O&M and uses the following security mechanisms:

Access control: Cloud Assistant uses RAM policies to control user access based on instances, resource groups, tags, or source IP addresses.
End-to-end reliability: All resources interact over HTTPS with data encrypted in transit. Instances use an internal security mechanism for inbound access without opening ports, and communicate outbound over the internal network without exposing public IP addresses.
Secure content: Commands are encrypted and verified with signatures to prevent tampering during transmission.
Log audit: Call API operations to audit commands and files transmitted by Cloud Assistant, including execution times, results, content, and usernames. Deliver task logs to OSS or SLS for archiving or analysis.

See Cloud Assistant overview.

Use Cloud Config to conduct compliance audits on resources

Cloud Config evaluates and monitors cloud resources for compliance. It aggregates resources across regions, records configuration changes in timelines, and generates alerts for non-compliant configurations. See What is Cloud Config?

Configure security when creating instances

Host business that requires high security on security-enhanced instances

For business requiring high security and enhanced trust, run workloads on security-enhanced instances that support trusted boot and private data protection.

Supports encrypted memory and confidential computing based on Intel® Software Guard Extensions (SGX) to protect essential code and data from malware attacks.
Implements trusted boot based on Trusted Cryptography Module (TCM) or Trusted Platform Module (TPM) chips. During trusted boot, all modules in the boot chain from the underlying server to the ECS instance are measured and verified.

Use secure images

Use public images and enable security hardening for the images.
- Use public images provided by Alibaba Cloud.
- Enable free security hardening for public images to get features such as webshell detection, security configuration checks, and unusual logon alerts.
Use encrypted custom images.

Use the AES-256 algorithm to encrypt custom images to prevent data leaks. Create encrypted disks and then create encrypted custom images from them. You can also encrypt an image copy using the Copy and Encrypt feature. For shared encrypted images, create separate BYOK keys to prevent key leaks. See Copy a custom image.

Encrypt data on disks by using KMS

Encrypt disks to protect stored data without modifying business or applications. Snapshots created from encrypted disks and disks created from those snapshots are also encrypted. Both system disks and data disks can be encrypted. See Disk encryption.

Use snapshots for disaster recovery purposes

Use snapshots to back up data

Snapshots are the foundation of disaster recovery and help reduce data loss from system failures, accidental operations, and security issues. Create snapshots based on your needs. See Create snapshot manually.

Create automatic snapshots daily and retain them for at least seven days to improve disaster tolerance and minimize data loss.
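
The daily-snapshot, seven-day retention guideline can be checked with a short script. This sketch assumes that you already have the creation dates of a disk's snapshots. The function name `missing_snapshot_days` is hypothetical.

```python
from datetime import date, timedelta


def missing_snapshot_days(snapshot_dates, today, retention_days=7):
    """Return the days in the last `retention_days` (including today) with no snapshot."""
    have = set(snapshot_dates)
    window = [today - timedelta(days=offset) for offset in range(retention_days)]
    return sorted(day for day in window if day not in have)


# Example: snapshots exist for 4-10 January except 7 January.
example_today = date(2024, 1, 10)
example_dates = [date(2024, 1, d) for d in (4, 5, 6, 8, 9, 10)]
print(missing_snapshot_days(example_dates, example_today))
```

An empty result means that every day in the retention window is covered.
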
Use encrypted snapshots

ECS uses AES-256 encryption for snapshots to prevent data leaks. Create encrypted disks and then create encrypted snapshots from them.

Access instance metadata in security hardening mode

ECS instance metadata includes instance information in Alibaba Cloud. You can view and use the metadata of running instances to configure or manage them. See Instance metadata.

Access instance metadata in security hardening mode. In this mode, a session is established between the instance and the metadata server, and the server authenticates your identity with a token. When the token expires, the session closes. Token limits:

Each token can only be used for a single ECS instance. Access with another instance's token is denied.
Each token has a validity period of 1 to 21,600 seconds (6 hours). Tokens can be reused until they expire.
Proxy access is not supported. Requests with the X-Forwarded-For header are denied.
An unlimited number of tokens can be issued for each instance.
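
The token flow above can be sketched with the Python standard library. The example assumes the metadata endpoint `http://100.100.100.200/latest` and the header names `X-aliyun-ecs-metadata-token-ttl-seconds` and `X-aliyun-ecs-metadata-token`; verify both against the Instance metadata topic before use. The function names are hypothetical, and `read_metadata` works only when it runs on an ECS instance.

```python
import urllib.request

METADATA_BASE = "http://100.100.100.200/latest"  # assumed metadata endpoint
TOKEN_TTL_HEADER = "X-aliyun-ecs-metadata-token-ttl-seconds"
TOKEN_HEADER = "X-aliyun-ecs-metadata-token"


def token_request(ttl_seconds=21600):
    """Build the PUT request that asks the metadata server for a token."""
    if not 1 <= ttl_seconds <= 21600:
        raise ValueError("ttl_seconds must be between 1 and 21600")
    return urllib.request.Request(
        f"{METADATA_BASE}/api/token",
        method="PUT",
        headers={TOKEN_TTL_HEADER: str(ttl_seconds)},
    )


def metadata_request(path, token):
    """Build the GET request that reads a metadata item with the token."""
    return urllib.request.Request(
        f"{METADATA_BASE}/meta-data/{path}",
        headers={TOKEN_HEADER: token},
    )


def read_metadata(path, ttl_seconds=60, timeout=2):
    """Fetch a token, then read one metadata item (run on an ECS instance)."""
    with urllib.request.urlopen(token_request(ttl_seconds), timeout=timeout) as resp:
        token = resp.read().decode()
    with urllib.request.urlopen(metadata_request(path, token), timeout=timeout) as resp:
        return resp.read().decode()
```

A short TTL limits how long a leaked token stays useful. Because tokens can be reused until they expire, cache the token for repeated reads instead of requesting a new one each time.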

Isolate and secure network resources

VPC abstracts physical networks into isolated secure virtual networks at the data link layer. VPCs are fully isolated and connect only through elastic IP addresses or NAT IP addresses. You can configure IP address ranges, CIDR blocks, route tables, and gateways within each VPC. Connect on-premises data centers to VPCs using VPN Gateway, Express Connect, or Smart Access Gateway (SAG). Connect networks worldwide using Cloud Enterprise Network (CEN).

Networks are vulnerable to cyberattacks. In Alibaba Cloud, control access to VPCs using security groups, network ACLs, routing policies, or Express Connect circuits. Also configure Cloud Firewall, WAF, and Anti-DDoS for external threat protection.

Follow the security suggestions for isolating network resources

Security suggestions for network isolation:

Create a network administrator account to manage security groups, network ACLs, and traffic logs in a centralized manner.
Use network ACLs to restrict access to private data.
Isolate network resources and preconfigure large subnets to prevent overlapping of subnets.
Configure security groups based on access points instead of resources.

Build a secure network environment

Properly configure security groups to isolate networks and reduce attack surface

Security groups control network access to and from ECS instances. Configure rules to filter traffic by port and IP address to reduce attack surface.

For example, port 22 is the default SSH port on Linux instances. Leaving it open poses security risks. Configure rules to allow only specific IP addresses to access port 22, or use VPN to encrypt logon data.

Suggestion	Description	References
Principle of least privilege	Security groups should work like whitelists. Open only necessary ports and allocate only necessary public IP addresses. Use a VPN or bastion host for troubleshooting. Incorrect configurations may expose ports or IP addresses to the Internet. Open only ports required by business in security groups for both Internet and internal network traffic. For high-risk service ports, allow access only from specific IP addresses. For HTTP service management backends, allow access only from specific IP addresses and enable WAF for the domain name.	Guidelines and use cases
Do not set 0.0.0.0/0 as an authorization object in your security group rules.	Allowing all inbound access is a common mistake. `0.0.0.0/0` means all IP addresses. A rule with this authorization object opens every port in the rule to all source addresses. Deny access over all ports by default, then allow only required ports such as 80, 8080, and 443.	Add a security group rule
Disable inbound security group rules that are no longer needed.	If a rule includes `0.0.0.0/0`, review the ports your applications expose. For ports that should not provide external services, add a Deny rule. For example, if MySQL runs on port 3306, add a Deny rule with the lowest priority (100).	Add a security group rule
Reference security groups as authorization objects in security group rules.	Add rules based on the principle of least privilege. Different application layers should use different security groups with appropriate inbound and outbound rules. For distributed applications with security groups that are not mutually accessible, add rules that reference security groups (instead of IP addresses or CIDR blocks) as authorization objects to allow mutual access. For example, create sg-web for the web layer and sg-database for the database layer. In sg-database, add a rule that references sg-web to allow access over port 3306.	Add a security group rule
Configure appropriate names and tags for security groups.	Clear names and descriptions help identify security groups. Add tags to security groups via the ECS console or API for easy search and management.	Modify a security group Tags
Add ECS instances that require mutual access to the same security group.	Each ECS instance can belong to up to five security groups. Instances in the same security group communicate over the internal network. If you have multiple security groups, create a new group and add instances that need internal network communication. Do not add all instances to the same security group. For distributed applications, each instance plays a different role and needs specific inbound and outbound rules.	Associate security groups with an instance (primary ENI)
Isolate instances within a security group.	Security groups act as virtual firewalls with Stateful Packet Inspection (SPI). Instances in the same security group share the same region and security requirements. To isolate instances within a security group, change the internal access control policy of the group.	Security groups
Use security group quintuple rules.	Security groups control network access to ECS instances and provide logical security isolation. Quintuple rules enable precise access control based on source IP, source port, destination IP, destination port, and transport layer protocol.	Security group quintuple rules
Add ECS instances that provide Internet-facing services and those that provide internal network-facing services to different security groups.	Applications on an ECS instance may be Internet-accessible when the instance exposes ports (such as 80 and 443) or uses port forwarding rules (such as NAT port forwarding or EIP-based forwarding). Apply the strictest security group rules. Deny all protocols and ports by default, then allow only ports required by external services such as 80 and 443. Internet-facing instances in the same security group should have clear responsibilities. Deploy backend services like MySQL and Redis on instances without Internet access, then use security group rules to allow access from specific security groups.	Add a security group rule
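
Several suggestions in the table can be checked automatically. The sketch below flags Accept rules that open high-risk ports to `0.0.0.0/0`. It assumes that each rule is a dictionary with `PortRange` (`start/end`, where `-1/-1` means all ports), `SourceCidrIp`, `SourceGroupId`, and `Policy` fields. The list of high-risk ports and the function names are illustrative assumptions.

```python
import ipaddress

HIGH_RISK_PORTS = {22, 3306, 3389, 6379}  # SSH, MySQL, RDP, Redis (assumed list)


def port_range(rule):
    """Parse "start/end"; "-1/-1" is treated as all ports."""
    start, end = (int(part) for part in rule["PortRange"].split("/"))
    return (1, 65535) if start == -1 else (start, end)


def is_open_to_world(rule):
    """True when the rule's source is a CIDR block that covers every address."""
    cidr = rule.get("SourceCidrIp")
    return bool(cidr) and ipaddress.ip_network(cidr, strict=False).prefixlen == 0


def risky_rules(rules):
    """Return (PortRange, exposed_ports) for Accept rules open to all addresses."""
    flagged = []
    for rule in rules:
        if rule.get("Policy", "Accept").lower() != "accept":
            continue
        if not is_open_to_world(rule):
            continue
        start, end = port_range(rule)
        exposed = sorted(p for p in HIGH_RISK_PORTS if start <= p <= end)
        if exposed:
            flagged.append((rule["PortRange"], exposed))
    return flagged


# Hypothetical inbound rules for illustration.
sample_rules = [
    {"PortRange": "443/443", "SourceCidrIp": "0.0.0.0/0", "Policy": "Accept"},
    {"PortRange": "22/22", "SourceCidrIp": "0.0.0.0/0", "Policy": "Accept"},
    {"PortRange": "3306/3306", "SourceGroupId": "sg-web", "Policy": "Accept"},
]
print(risky_rules(sample_rules))
```

The rule that references `sg-web` is not flagged, which matches the suggestion to reference security groups instead of CIDR blocks for traffic between application layers.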

Configure security domains to isolate services of different security levels within your organization

Build private networks with VPCs to host servers of different security levels separately. Create a VPC, assign a CIDR block, configure route tables and gateways, and store important data in this logically isolated VPC. Use an EIP or jump server for daily O&M. See Create and manage a VPC.
Use jump servers or bastion hosts to protect against internal and external intrusions

Jump servers have broad permissions. Use bastion hosts instead to record and audit all O&M operations, detect anomalies, and attribute actions to specific personnel.

Assign the jump server to a dedicated vSwitch in a VPC and associate the corresponding EIP or NAT port forwarding table. Create a dedicated security group (SG_BRIDGE) and open only required ports (TCP 22 for Linux, RDP 3389 for Windows). Restrict inbound access to specific public IP addresses of your organization. To grant the jump server access to another security group (SG_CURRENT), add a rule allowing access from SG_BRIDGE over specific protocols and ports. For SSH, use key pairs instead of passwords. See SSH key pair overview.
Properly assign public IP addresses to reduce Internet-exposed attack surface
Proper public IP allocation reduces attack surface. For VPC instances, connect instances needing Internet access to a few designated vSwitches for easier auditing.
- Do not assign public IP addresses to instances that don't provide Internet services. For multiple Internet-facing instances, use SLB to distribute traffic. See SLB overview.
- For instances that need outbound Internet access but don't have public IPs, use NAT gateways. Configure SNAT entries for specific CIDR blocks or subnets to avoid exposing services to the Internet. See What is NAT Gateway?

Use security services to build a security defense system

Use Anti-DDoS Origin Basic (free-of-charge), Anti-DDoS Pro, and Anti-DDoS Premium to mitigate DDoS attacks

DDoS attacks flood targets with fraudulent traffic from multiple compromised systems. Alibaba Cloud Security Center defends against Layer 3 to Layer 7 attacks, including SYN, UDP, ACK, ICMP, DNS, and HTTP flood attacks. Anti-DDoS Origin Basic provides up to 5 Gbit/s DDoS mitigation free of charge.

Anti-DDoS Origin Basic is enabled on ECS instances by default, maintaining normal access during DDoS attacks without purchasing scrubbing devices. After creating an instance, set its traffic scrubbing thresholds.

Use Security Center Basic for free to protect against exploitation of system vulnerabilities

Security Center is a centralized security management system that identifies, analyzes, and warns of threats in real time. Features include anti-ransomware, antivirus, web tamper proofing, and compliance checks.

Security Center Basic is available by default and scans for unusual logons, vulnerabilities, and cloud service configuration risks. For advanced features such as threat detection, vulnerability fixing, and virus removal, purchase a paid edition from the Security Center console. See Security Center Basic.

Use WAF to protect against exploitation of system vulnerabilities

WAF protects web applications against common OWASP attacks and HTTP flood attacks, including SQL injections, XSS attacks, webshells, trojans, and unauthorized access. WAF blocks malicious traffic to prevent data leaks and ensure website security and availability. See Getting started.

WAF benefits:

WAF handles various web attacks to ensure security and availability, without requiring you to install software or hardware or to modify code. WAF provides dedicated protection for specific websites across fields such as finance, e-commerce, O2O, gaming, and public services.
Without WAF, you may be vulnerable to web intrusions, data leaks, HTTP flood attacks, and trojans.

See Get started with WAF.

Protect applications in instance guest operating systems

Configure security settings to ensure secure logons to ECS instances

By default, a non-root account can log on to an ECS instance. Grant administrative permissions using su or sudo. The root account cannot log on with a PEM key file by default. For Linux instances, configure only RSA key pair-based logon. For Windows instances, use complex passwords with at least eight characters and special characters.

Linux instance:

By default, non-root accounts are used to log on to Linux instances.

Logging on as root grants the highest permissions but poses data security risks if the instance is attacked. Use the Anolis OS 8.4 or Ubuntu 20.04 public image, which supports logon with the regular ecs-user account.
Use temporary SSH key pairs to log on to Linux instances.

Use the config_ecs_instance_connect plug-in to send an SSH public key to a specific instance. The key is stored for 60 seconds, during which you can connect without a password. See Connect with a temporary key pair.
An SSH key pair consists of a public key and a private key, generated by default with the 2048-bit RSA algorithm. Key pair authentication has the following advantages over password authentication:
- Higher security and reliability for logons.
- Strength far exceeds regular passwords, preventing brute-force attacks.
- Private keys cannot be deduced from public keys.
- SSH key pairs are easy to use.
  - Configure a public key on a Linux instance, then use the private key to run SSH commands without a password.
  - Use a key pair to batch manage multiple Linux instances. Specify a key pair when creating an instance or bind one after creation, then connect using the private key.
Modify the sshd_config file to disable password-based logon and support only RSA key pair-based logon.
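
After you edit sshd_config, you can verify the result with a small check. The sketch below assumes the OpenSSH keywords `PasswordAuthentication` and `PubkeyAuthentication`, both of which default to `yes`, and relies on sshd using the first value it finds for each keyword. It is a simplification: it stops at the first `Match` block and does not handle `Include` files or the `Keyword=value` form. The function names are hypothetical.

```python
def effective_settings(config_text):
    """Return the first value seen for each global keyword, in lowercase keys."""
    settings = {}
    for raw_line in config_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        keyword, value = parts[0].lower(), parts[1].strip()
        if keyword == "match":
            break  # settings after Match apply conditionally; ignore them here
        settings.setdefault(keyword, value)  # first occurrence wins
    return settings


def key_only_logon(config_text):
    """True when password logon is disabled and public key logon is enabled."""
    settings = effective_settings(config_text)
    password = settings.get("passwordauthentication", "yes").lower()
    pubkey = settings.get("pubkeyauthentication", "yes").lower()
    return password == "no" and pubkey == "yes"


hardened = "PasswordAuthentication no\nPubkeyAuthentication yes\n"
print(key_only_logon(hardened))
```

Other settings, such as keyboard-interactive authentication, can also permit password-style logons, so treat a passing check as necessary but not sufficient.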

Windows instance:

Weak passwords are a leading vulnerability. Use complex passwords with at least eight characters, including uppercase letters, lowercase letters, digits, and special characters. Change passwords regularly.
Set strong passwords for ECS instances. The passwords must be 8 to 30 characters in length and contain at least three of the following character types: uppercase letters, lowercase letters, digits, and special characters, including ( ) ` ~ ! @ # $ % ^ & * _ - + = | { } [ ] : ; ' < > , . ? /. For Windows instances, passwords cannot start with a forward slash (/).
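
The password rules above translate directly into a validation function. This is a minimal sketch: the function name `is_valid_password` is hypothetical, and it does not reject characters outside the four listed types because the rules above do not say how those are handled.

```python
SPECIAL_CHARACTERS = set("()`~!@#$%^&*_-+=|{}[]:;'<>,.?/")


def is_valid_password(password, windows=False):
    """Check length, the Windows leading-slash rule, and character variety."""
    if not 8 <= len(password) <= 30:
        return False
    if windows and password.startswith("/"):
        return False
    character_types = [
        any(c.isascii() and c.isupper() for c in password),
        any(c.isascii() and c.islower() for c in password),
        any(c.isascii() and c.isdigit() for c in password),
        any(c in SPECIAL_CHARACTERS for c in password),
    ]
    return sum(character_types) >= 3  # at least three of the four types


print(is_valid_password("Abcdefg1"))
```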

Encrypt data in transit

Configure security groups or firewalls to allow communication only over ports that carry encrypted traffic. Use encryption protocols such as TLS 1.2 or later to encrypt sensitive data in transit between clients and ECS instances.
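
On the client side, a minimum protocol version can be enforced in code. The following sketch uses the Python standard library `ssl` module. The function name `make_tls_context` is hypothetical.

```python
import ssl


def make_tls_context():
    """Create a client context that refuses protocol versions older than TLS 1.2."""
    context = ssl.create_default_context()  # verifies certificates and host names
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


tls_context = make_tls_context()
print(tls_context.minimum_version)
```

Pass the context to the client library that opens the connection, for example as the `context` argument of `urllib.request.urlopen`.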

Monitor and audit log exceptions

The M-Trends 2018 report by FireEye found that the global median dwell time (from attack to detection) was 101 days, and 498 days in Asia Pacific. Reliable log data and audit services are essential to shorten dwell time.

Use CloudMonitor, ActionTrail, Log Audit Service, VPC flow logs, and application logs to build a monitoring and alerting system for abnormal resource and permission access.

Use CloudMonitor to set resource usage alert thresholds, which help you detect DDoS attacks early. See What is CloudMonitor?
Use ActionTrail to detect unauthorized access, incorrect security configurations, and high-risk operations. ActionTrail also supports compliance auditing and threat detection. Use MFA to control access to ActionTrail. See ActionTrail.
Enable the flow log feature in VPC to record inbound and outbound ENI traffic.
Use Log Audit Service to collect, analyze, and visualize data and generate alerts for scenarios such as DevOps, operations, security, and auditing. See Overview of Log Audit Service.
Trace application event logs and API call logs.
Synchronize all logs to SLS or OSS regularly and configure access permissions.
Add instance IDs, regions, zones, and environments (test or production) to logs for easier troubleshooting.
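
The last suggestion can be implemented with a log formatter that attaches the same context to every entry. The sketch below uses the Python standard library `logging` module. The class name `ContextJsonFormatter` and the context values (`i-example`, `cn-hangzhou`, and so on) are placeholders for illustration.

```python
import json
import logging


class ContextJsonFormatter(logging.Formatter):
    """Format each record as JSON and add fixed instance context fields."""

    def __init__(self, context):
        super().__init__()
        self.context = dict(context)

    def format(self, record):
        entry = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "message": record.getMessage(),
        }
        entry.update(self.context)
        return json.dumps(entry)


# Placeholder values; read the real ones from instance metadata or configuration.
log_context = {
    "instance_id": "i-example",
    "region": "cn-hangzhou",
    "zone": "cn-hangzhou-h",
    "environment": "test",
}

handler = logging.StreamHandler()
handler.setFormatter(ContextJsonFormatter(log_context))
logger = logging.getLogger("app")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info("service started")
```

Structured entries like these are easier to filter by instance, region, or environment after the logs are synchronized to SLS or OSS.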