Elastic Compute Service: Disaster recovery solutions

Last Updated: Mar 22, 2024

Disaster recovery solutions are designed to ensure business continuity when a disaster occurs. With Alibaba Cloud Elastic Compute Service (ECS), you can use snapshots or images to back up and restore data, and use multi-zone deployment to keep applications available. Together, these measures improve business availability and continuity.

Use a snapshot or image to back up and restore data

Use a snapshot to back up and restore data

Alibaba Cloud Snapshot is an agentless backup service that allows you to create a snapshot or snapshot-consistent group to capture the point-in-time state of data blocks on a disk or disk group. You can use snapshots to restore data, build development and testing environments, or create custom images to deploy applications in batches. For more information, see Snapshot overview.

  • Create a snapshot for data backup

    • The first backup is a full backup, and subsequent backups are incremental backups. Incremental backups can be quickly created and occupy a small amount of storage space. The amount of time required for backup varies based on the amount of incremental data to be backed up. For more information, see How snapshots work.

    • You can manually create a snapshot or create an automatic snapshot policy for the system disk and data disks. For more information, see Create a snapshot for a disk or Automatic snapshots.

  • Restore data from a snapshot

    If data is lost due to causes such as accidental operations or ransomware, you can use a snapshot to roll back the disk. The disk then reverts to the state it was in when the snapshot was created. For more information, see Roll back a disk by using a snapshot.
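
The full-then-incremental behavior and the rollback described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `Disk` and `SnapshotChain` are hypothetical names invented for illustration, blocks are modeled as dictionary entries, and nothing here calls or reflects the actual ECS snapshot implementation or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Disk:
    """Toy disk (hypothetical): maps a block index to the block's contents."""
    blocks: Dict[int, bytes] = field(default_factory=dict)


class SnapshotChain:
    """Toy snapshot chain (hypothetical): the first snapshot stores every
    block, later snapshots store only blocks changed since the previous one."""

    def __init__(self) -> None:
        self._deltas: List[Dict[int, Optional[bytes]]] = []
        self._last_state: Dict[int, bytes] = {}

    def create(self, disk: Disk) -> int:
        """Capture the disk and return how many blocks this snapshot stores."""
        current = dict(disk.blocks)
        # Keep only blocks that are new or differ from the previous snapshot.
        delta: Dict[int, Optional[bytes]] = {
            i: b for i, b in current.items() if self._last_state.get(i) != b
        }
        # Record blocks that disappeared since the previous snapshot.
        for i in self._last_state:
            if i not in current:
                delta[i] = None
        self._deltas.append(delta)
        self._last_state = current
        return len(delta)

    def rollback(self, disk: Disk, snapshot_index: int) -> None:
        """Revert the disk to its state at the time of the given snapshot."""
        state: Dict[int, bytes] = {}
        for delta in self._deltas[: snapshot_index + 1]:
            for i, b in delta.items():
                if b is None:
                    state.pop(i, None)
                else:
                    state[i] = b
        disk.blocks = state


disk = Disk({0: b"boot", 1: b"data-v1", 2: b"logs"})
chain = SnapshotChain()

full_size = chain.create(disk)         # first snapshot: all blocks are stored
disk.blocks[1] = b"data-v2"
incremental_size = chain.create(disk)  # second snapshot: only the changed block

disk.blocks[1] = b"ENCRYPTED"          # simulated ransomware damage
chain.rollback(disk, 1)                # revert to the second snapshot
```

In this model the second snapshot stores a single block, which is why incremental backups take less time and space when little data has changed.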

Use an image to back up and restore data

An image is a copy of data from one or more cloud disks. An instance image can contain data from only the system disk or from the system disk and data disks.

Use a multi-zone deployment architecture to implement disaster recovery for applications

You can place a Server Load Balancer (SLB) instance in front of an application, deploy multiple ECS instances behind it, and use Auto Scaling to implement disaster recovery. If one of the ECS instances fails or is overloaded, the remaining instances continue to serve traffic, which ensures business continuity and availability. The following figure provides an example of using a multi-zone deployment architecture to implement disaster recovery, in which ECS instances are deployed in data centers in two zones within the same region.

[Figure: multi-zone deployment architecture with ECS instances in two zones within the same region]
  • SLB

    SLB instances are configured to manage incoming traffic. SLB uses scheduling algorithms to distribute incoming traffic across ECS instances in different zones. This helps improve the fault tolerance and scalability of the system. For more information about SLB, see What is SLB?

  • ECS cluster

    ECS instances in different zones provide equivalent capabilities. This ensures business continuity when an instance fails. The failure of a single instance does not affect data-layer applications or the ECS control feature.

    • If a failure occurs, the system automatically performs hot migration so that other ECS instances can continue to provide services. Because equivalent instances run in another zone, neither a single point of failure nor a failed hot migration interrupts services.

    • If hot migration fails, a system event notifies you of the failure so that you can deploy new ECS instances as soon as possible. This ensures high system availability and business stability.

  • Data layer

    • Object Storage Service (OSS): OSS is deployed at the region level. ECS instances in data centers in different zones can access objects in OSS. This accelerates data access and ensures data reliability. For more information, see What is OSS?

    • Database service: We recommend that you use a database service that supports multi-zone deployment, such as ApsaraDB RDS. The primary node can perform read and write operations across zones without causing conflicts with application-layer traffic. Secondary nodes can perform read operations across zones. If the primary node is unavailable, ECS instances can read data from secondary nodes to ensure data availability and business continuity. For more information, see ApsaraDB RDS.
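
The way a load balancer keeps serving traffic when instances in one zone fail can be sketched as follows. This is a minimal illustration, not SLB's implementation: round-robin scheduling is assumed here only for simplicity, and `Backend`, `RoundRobinBalancer`, the instance names, and the zone names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Backend:
    """Hypothetical backend ECS instance with a health flag."""
    name: str
    zone: str
    healthy: bool = True


class RoundRobinBalancer:
    """Toy round-robin balancer that skips backends marked unhealthy."""

    def __init__(self, backends: List[Backend]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = backends
        self._next = 0

    def pick(self) -> Backend:
        """Return the next healthy backend, starting from the current position."""
        n = len(self._backends)
        for offset in range(n):
            candidate = self._backends[(self._next + offset) % n]
            if candidate.healthy:
                self._next = (self._next + offset + 1) % n
                return candidate
        raise RuntimeError("no healthy backend available")


backends = [
    Backend("ecs-a1", "zone-a"),
    Backend("ecs-b1", "zone-b"),
    Backend("ecs-a2", "zone-a"),
    Backend("ecs-b2", "zone-b"),
]
balancer = RoundRobinBalancer(backends)

# With every instance healthy, traffic alternates between the two zones.
before_failure = [balancer.pick().name for _ in range(4)]

# Simulate the loss of zone-a: its instances fail their health checks.
for backend in backends:
    if backend.zone == "zone-a":
        backend.healthy = False

# Requests are now served only by the instances in zone-b.
after_failure = [balancer.pick().name for _ in range(4)]
```

Because the instances in both zones provide equivalent capabilities, the balancer in this sketch needs no special failover logic beyond skipping unhealthy backends.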