
Elastic Compute Service: Big data instance families

Last Updated: Dec 28, 2023

This topic describes the features of big data instance families of Elastic Compute Service (ECS) and lists the instance types of each instance family.

Overview

Big data instance families provide cloud computing and big data storage for enterprises with big data workloads. These instance families are suitable for scenarios that require offline computing and big data storage, such as Hadoop distributed computing, large-scale log processing, and large-scale data warehousing. Big data instance families are ideal for businesses that use distributed networks and have high requirements for storage, capacity, and internal bandwidth.

These instance families are suitable for customers in industries such as the Internet and finance that need to compute, store, and analyze big data. Big data instance families use local storage to provide large amounts of storage space and high storage performance.

Big data instances have the following benefits:

  • Enterprise-level computing power ensures efficient and stable data processing.

  • Network performance is enhanced with higher maximum internal bandwidth per instance and higher maximum packet forwarding rates to satisfy data transfer demands such as shuffling in Hadoop MapReduce at peak times.

  • Each disk can deliver sequential read and write performance of up to 190 MB/s, and each instance can offer a storage throughput of up to 5 GB/s. This reduces the amount of time required to read data from or write data to Hadoop Distributed File System (HDFS) files. When an instance is created or started for the first time, its disks must warm up before they can deliver optimal performance.

  • The cost of local storage is 97% lower than that of standard SSDs. This significantly reduces the cost to build Hadoop clusters.
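As a rough illustration of how the per-disk and per-instance throughput figures relate, the following sketch estimates aggregate sequential throughput for a given number of local disks. The linear-scaling model, the function name estimate_throughput_mbps, and the treatment of 5 GB/s as 5,000 MB/s are assumptions made for illustration. Actual throughput depends on the workload and on whether the disks have warmed up.

```shell
# Rough estimate of aggregate sequential throughput for N local disks.
# Assumptions (for illustration only): throughput scales linearly with the
# number of disks and is capped at the per-instance maximum.
PER_DISK_MBPS=190        # per-disk sequential throughput quoted above
INSTANCE_CAP_MBPS=5000   # per-instance maximum quoted above (5 GB/s)

estimate_throughput_mbps() {
  local disks="$1"
  local total=$(( disks * PER_DISK_MBPS ))
  # Apply the per-instance ceiling.
  if [ "$total" -gt "$INSTANCE_CAP_MBPS" ]; then
    total=$INSTANCE_CAP_MBPS
  fi
  echo "$total"
}

estimate_throughput_mbps 12   # prints 2280
estimate_throughput_mbps 30   # prints 5000 (capped)
```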

When you use big data instances, take note of the following items:

  • Instances equipped with local SSDs do not support instance configuration changes or failovers.

  • Local disks are bound to specific instance types. The number and capacity of local disks attached to an instance vary based on the instance type. You cannot purchase local disks separately, and you cannot detach local disks from one instance and attach them to another instance.

  • You cannot create snapshots for local disks. If you want to create an image from the system disk and data disks of an instance equipped with local SSDs, we recommend that you create an image by combining the snapshots of both the system disk and data disks. In this case, the data disks must be cloud disks.

  • You cannot create images that consist of system disk snapshots and data disk snapshots based on instances equipped with local SSDs.

  • You can attach a standard SSD to an instance equipped with local SSDs and extend the capacity of the standard SSD.

  • Operations on an instance that is equipped with local SSDs may affect the data stored on the local SSDs. For more information, see the "Impacts of instance operations on data stored on local disks" section in Local disks.

Best practices for mounting a file system to a big data instance

The first time you mount a file system such as ext4, its inode table must be initialized. By default, the lazyinit feature is enabled in Linux kernel v2.6.37 and later, which defers inode table initialization until the file system is mounted. Local disks consume a large amount of throughput while they are being initialized, for example, 600 MB/s for 30 local disks, which may affect service stability. The concurrency of lazyinit is improved in Linux kernel v4.x to resolve this issue. For more information, see index: kernel/git/stable/linux.git. We recommend that you perform the following steps to initialize the inode table as early as possible:

  1. Obtain a list of all local serial advanced technology attachment (SATA) HDDs.

  2. Run the following command to initialize each local disk separately.

    In this example, an ext4 file system is created on a local disk whose device name is /dev/vdb.

    mkfs.ext4 -E lazy_itable_init=0,lazy_journal_init=0 /dev/vdb &

  3. After all local disks are initialized, run the iostat -x 5 command and wait until the I/O activity of every local disk drops to 0.

  4. Run the mount command for all local disks in a batch.
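The steps above can be sketched as a small shell script. This is a minimal sketch under stated assumptions, not a definitive procedure: the helper names (run, list_local_disks, init_disks, mount_disks), the assumption that the system disk is /dev/vda, and the /mnt/diskN mount points are all hypothetical. The script defaults to a dry run that only prints the commands, because mkfs.ext4 erases the target device.

```shell
# Sketch of the initialization steps. DRY_RUN=1 (default) only prints commands.
# WARNING: with DRY_RUN=0, mkfs.ext4 ERASES every device passed to init_disks.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 1: list whole disks, skipping the system disk (assumed to be /dev/vda).
list_local_disks() {
  lsblk -dn -o NAME,TYPE | awk '$2 == "disk" && $1 != "vda" {print "/dev/" $1}'
}

# Step 2: format every disk in parallel with lazy initialization disabled,
# then wait for all background mkfs jobs to finish.
init_disks() {
  local dev
  for dev in "$@"; do
    run mkfs.ext4 -E lazy_itable_init=0,lazy_journal_init=0 "$dev" &
  done
  wait
}

# Step 3 is a manual check: watch `iostat -x 5` until disk I/O drops to 0.

# Step 4: mount each disk on a hypothetical mount point /mnt/disk1, /mnt/disk2, ...
mount_disks() {
  local i=1
  local dev
  for dev in "$@"; do
    run mkdir -p "/mnt/disk$i"
    run mount "$dev" "/mnt/disk$i"
    i=$((i + 1))
  done
}
```

To preview the commands for two disks, call init_disks /dev/vdb /dev/vdc followed by mount_disks /dev/vdb /dev/vdc. Set DRY_RUN=0 only after you confirm that the device list contains no disk that holds data you need.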

d3s, storage-intensive big data instance family

Features:

  • This instance family is equipped with 12-TB, large-capacity, high-throughput local SATA HDDs and can provide a maximum network bandwidth of 64 Gbit/s between instances.

  • This instance family supports online replacement and hot swapping of damaged disks to prevent instance shutdown.

    If a local disk fails, you receive a system event. You can handle the system event by initiating the process of fixing the damaged disk. For more information, see O&M scenarios and system events for instances equipped with local disks.

    Important

    After you initiate the process of fixing the damaged disk, data on the damaged disk cannot be restored.

  • Compute:

    • Uses 2.7 GHz Intel® Xeon® Scalable (Ice Lake) processors that deliver an all-core turbo frequency of 3.5 GHz to provide consistent computing performance.

  • Storage:

    • Is an instance family in which all instances are I/O optimized.

    • Supports enhanced SSDs (ESSDs) and ESSD AutoPL disks.

  • Network:

    • Supports IPv6.

    • Provides high network performance based on large computing capacity.

  • Supported scenarios:

    • Big data computing and storage business scenarios in which services such as Hadoop MapReduce, HDFS, Hive, and HBase are used

    • Machine learning scenarios such as Spark in-memory computing and MLlib

    • Search and log data processing scenarios in which solutions such as Elasticsearch and Kafka are used

Instance types

| Instance type | vCPUs | Memory (GiB) | Local storage (GB) | Network baseline/burst bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IP addresses per ENI | Disk baseline/burst IOPS | Disk baseline/burst bandwidth (Gbit/s) |
|---|---|---|---|---|---|---|---|---|---|---|
| ecs.d3s.2xlarge | 8 | 32 | 4 * 12,000 | 10/burstable up to 15 | 2,000,000 | 8 | 7 | 30 | 40,000/burstable up to 60,000 | 3/burstable up to 5 |
| ecs.d3s.4xlarge | 16 | 64 | 8 * 12,000 | 25/none | 3,000,000 | 8 | 8 | 30 | 60,000/none | 5/none |
| ecs.d3s.8xlarge | 32 | 128 | 16 * 12,000 | 40/none | 6,000,000 | 16 | 8 | 30 | 120,000/none | 8/none |
| ecs.d3s.12xlarge | 48 | 192 | 24 * 12,000 | 60/none | 9,000,000 | 24 | 8 | 30 | 180,000/none | 12/none |
| ecs.d3s.16xlarge | 64 | 256 | 32 * 12,000 | 80/none | 12,000,000 | 32 | 8 | 30 | 240,000/none | 16/none |
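
The Local storage (GB) values use a count * size notation, read here as the number of local disks multiplied by the capacity of each disk in GB. As a small illustration, the following helper converts such an entry into total capacity. The function name total_local_gb is hypothetical.

```shell
# Convert a "count * size" local storage entry into total capacity in GB.
total_local_gb() {
  local spec
  # Drop thousands separators and spaces: "4 * 12,000" -> "4*12000".
  spec=$(printf '%s' "$1" | tr -d ', ')
  local count="${spec%%\**}"   # text before "*": number of disks
  local size="${spec##*\*}"    # text after "*": GB per disk
  echo $(( count * size ))
}

total_local_gb "4 * 12,000"    # prints 48000
total_local_gb "32 * 12,000"   # prints 384000
```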


d3c, compute-intensive big data instance family

Features:

  • This instance family is equipped with high-capacity and high-throughput local disks and can provide a maximum bandwidth of 40 Gbit/s between instances.

  • This instance family supports online replacement and hot swapping of damaged disks to prevent instance shutdown.

    If a local disk fails, you receive a system event. You can handle the system event by initiating the process of fixing the damaged disk. For more information, see O&M scenarios and system events for instances equipped with local disks.

    Important

    After you initiate the process of fixing the damaged disk, data on the damaged disk cannot be restored.

  • Compute:

    • Uses the third-generation 2.9 GHz Intel® Xeon® Scalable (Ice Lake) processors that deliver an all-core turbo frequency of 3.5 GHz for consistent computing performance.

  • Storage:

    • Is an instance family in which all instances are I/O optimized.

    • Supports ESSDs and ESSD AutoPL disks.

  • Network:

    • Supports IPv6.

    • Provides high network performance based on large computing capacity.

  • Supported scenarios:

    • Big data computing and storage business scenarios in which services such as Hadoop MapReduce, HDFS, Hive, and HBase are used

    • Scenarios in which EMR JindoFS and Object Storage Service (OSS) are used in combination to separately store hot and cold data and decouple storage from computing

    • Machine learning scenarios such as Spark in-memory computing and MLlib

    • Search and log data processing scenarios in which solutions such as Elasticsearch and Kafka are used

Instance types

| Instance type | vCPUs | Memory (GiB) | Local storage (GB) | Network baseline/burst bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IP addresses per ENI | Disk baseline/burst IOPS | Disk baseline/burst bandwidth (Gbit/s) |
|---|---|---|---|---|---|---|---|---|---|---|
| ecs.d3c.3xlarge | 14 | 56.0 | 1 * 13,740 | 8/burstable up to 10 | 1,600,000 | 8 | 8 | 30 | 40,000/none | 3/none |
| ecs.d3c.7xlarge | 28 | 112.0 | 2 * 13,740 | 16/burstable up to 25 | 2,500,000 | 16 | 8 | 30 | 50,000/none | 4/none |
| ecs.d3c.14xlarge | 56 | 224.0 | 4 * 13,740 | 40/none | 5,000,000 | 28 | 8 | 30 | 100,000/none | 8/none |


d2c, compute-intensive big data instance family

Features:

  • This instance family is equipped with high-capacity and high-throughput local SATA HDDs and can provide a maximum bandwidth of 35 Gbit/s between instances.

  • This instance family supports online replacement and hot swapping of damaged disks to prevent instance shutdown.

    If a local disk fails, you receive a system event. You can handle the system event by initiating the process of fixing the damaged disk. For more information, see O&M scenarios and system events for instances equipped with local disks.

    Important

    After you initiate the process of fixing the damaged disk, data on the damaged disk cannot be restored.

  • Compute:

    • Uses 2.5 GHz Intel® Xeon® Platinum 8269CY (Cascade Lake) processors.

  • Storage:

    • Is an instance family in which all instances are I/O optimized.

    • Supports enhanced SSDs (ESSDs), ESSD AutoPL disks, standard SSDs, and ultra disks.

  • Network:

    • Supports IPv6.

    • Provides high network performance based on large computing capacity.

  • Supported scenarios:

    • Big data computing and storage business scenarios in which services such as Hadoop MapReduce, HDFS, Hive, and HBase are used

    • Scenarios in which EMR JindoFS and OSS are used in combination to separately store hot and cold data and decouple storage from computing

    • Machine learning scenarios such as Spark in-memory computing and MLlib

    • Search and log data processing scenarios in which solutions such as Elasticsearch and Kafka are used

Instance types

| Instance type | vCPUs | Memory (GiB) | Local storage (GB) | Network bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IP addresses per ENI |
|---|---|---|---|---|---|---|---|---|
| ecs.d2c.6xlarge | 24 | 88.0 | 3 * 4,000 | 12.0 | 1,600,000 | 8 | 8 | 20 |
| ecs.d2c.12xlarge | 48 | 176.0 | 6 * 4,000 | 20.0 | 2,000,000 | 16 | 8 | 20 |
| ecs.d2c.24xlarge | 96 | 352.0 | 12 * 4,000 | 35.0 | 4,500,000 | 16 | 8 | 20 |


d2s, storage-intensive big data instance family

Features:

  • This instance family is equipped with high-capacity and high-throughput local SATA HDDs and can provide a maximum bandwidth of 35 Gbit/s between instances.

  • This instance family supports online replacement and hot swapping of damaged disks to prevent instance shutdown.

    If a local disk fails, you receive a system event. You can handle the system event by initiating the process of fixing the damaged disk. For more information, see O&M scenarios and system events for instances equipped with local disks.

    Important

    After you initiate the process of fixing the damaged disk, data on the damaged disk cannot be restored.

  • Compute:

    • Uses 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors.

  • Storage:

    • Is an instance family in which all instances are I/O optimized.

    • Supports ESSDs, ESSD AutoPL disks, standard SSDs, and ultra disks.

  • Network:

    • Supports IPv6.

    • Provides high network performance based on large computing capacity.

  • Supported scenarios:

    • Big data computing and storage business scenarios in which services such as Hadoop MapReduce, HDFS, Hive, and HBase are used

    • Machine learning scenarios such as Spark in-memory computing and MLlib

    • Search and log data processing scenarios in which solutions such as Elasticsearch and Kafka are used

Instance types

| Instance type | vCPUs | Memory (GiB) | Local storage (GB) | Network bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IP addresses per ENI |
|---|---|---|---|---|---|---|---|---|
| ecs.d2s.5xlarge | 20 | 88.0 | 8 * 8,000 | 12.0 | 1,600,000 | 8 | 8 | 20 |
| ecs.d2s.10xlarge | 40 | 176.0 | 15 * 8,000 | 20.0 | 2,000,000 | 16 | 8 | 20 |
| ecs.d2s.20xlarge | 80 | 352.0 | 30 * 8,000 | 35.0 | 4,500,000 | 32 | 8 | 20 |


d1ne, network-enhanced big data instance family

Features:

  • This instance family is equipped with high-capacity and high-throughput local SATA HDDs and can provide a maximum bandwidth of 35 Gbit/s between instances.

  • Compute:

    • Offers a CPU-to-memory ratio of 1:4, which is designed for big data scenarios.

    • Uses 2.5 GHz Intel® Xeon® E5-2682 v4 (Broadwell) processors.

  • Storage:

    • Is an instance family in which all instances are I/O optimized.

    • Supports standard SSDs and ultra disks.

  • Network:

    • Supports IPv6.

    • Provides high network performance based on large computing capacity.

  • Supported scenarios:

    • Scenarios in which services such as Hadoop MapReduce, HDFS, Hive, and HBase are used

    • Machine learning scenarios such as Spark in-memory computing and MLlib

    • Search and log data processing scenarios in which solutions such as Elasticsearch are used

Instance types

| Instance type | vCPUs | Memory (GiB) | Local storage (GB) | Network bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IP addresses per ENI |
|---|---|---|---|---|---|---|---|---|
| ecs.d1ne.2xlarge | 8 | 32.0 | 4 * 6,000 | 6.0 | 1,000,000 | 4 | 4 | 10 |
| ecs.d1ne.4xlarge | 16 | 64.0 | 8 * 6,000 | 12.0 | 1,600,000 | 4 | 8 | 20 |
| ecs.d1ne.6xlarge | 24 | 96.0 | 12 * 6,000 | 16.0 | 2,000,000 | 6 | 8 | 20 |
| ecs.d1ne-c8d3.8xlarge | 32 | 128.0 | 12 * 6,000 | 20.0 | 2,000,000 | 6 | 8 | 20 |
| ecs.d1ne.8xlarge | 32 | 128.0 | 16 * 6,000 | 20.0 | 2,500,000 | 8 | 8 | 20 |
| ecs.d1ne-c14d3.14xlarge | 56 | 160.0 | 12 * 6,000 | 35.0 | 4,500,000 | 14 | 8 | 20 |
| ecs.d1ne.14xlarge | 56 | 224.0 | 28 * 6,000 | 35.0 | 4,500,000 | 14 | 8 | 20 |


d1, big data instance family

Features:

  • This instance family is equipped with high-capacity and high-throughput local SATA HDDs and can provide a maximum bandwidth of 17 Gbit/s between instances.

  • Compute:

    • Offers a CPU-to-memory ratio of 1:4, which is designed for big data scenarios.

    • Uses 2.5 GHz Intel® Xeon® E5-2682 v4 (Broadwell) processors.

  • Storage:

    • Is an instance family in which all instances are I/O optimized.

    • Supports standard SSDs and ultra disks.

  • Network:

    • Provides high network performance based on large computing capacity.

  • Supported scenarios:

    • Scenarios in which services such as Hadoop MapReduce, HDFS, Hive, and HBase are used

    • Machine learning scenarios such as Spark in-memory computing and MLlib

    • Scenarios in which customers in industries such as Internet and finance need to compute, store, and analyze big data

    • Search and log data processing scenarios in which solutions such as Elasticsearch are used

Instance types

| Instance type | vCPUs | Memory (GiB) | Local storage (GB) | Network bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IP addresses per ENI |
|---|---|---|---|---|---|---|---|---|
| ecs.d1.2xlarge | 8 | 32.0 | 4 * 6,000 | 3.0 | 300,000 | 1 | 4 | 10 |
| ecs.d1.3xlarge | 12 | 48.0 | 6 * 6,000 | 4.0 | 400,000 | 1 | 6 | 10 |
| ecs.d1.4xlarge | 16 | 64.0 | 8 * 6,000 | 6.0 | 600,000 | 2 | 8 | 20 |
| ecs.d1.6xlarge | 24 | 96.0 | 12 * 6,000 | 8.0 | 800,000 | 2 | 8 | 20 |
| ecs.d1-c8d3.8xlarge | 32 | 128.0 | 12 * 6,000 | 10.0 | 1,000,000 | 4 | 8 | 20 |
| ecs.d1.8xlarge | 32 | 128.0 | 16 * 6,000 | 10.0 | 1,000,000 | 4 | 8 | 20 |
| ecs.d1-c14d3.14xlarge | 56 | 160.0 | 12 * 6,000 | 17.0 | 1,800,000 | 6 | 8 | 20 |
| ecs.d1.14xlarge | 56 | 224.0 | 28 * 6,000 | 17.0 | 1,800,000 | 6 | 8 | 20 |
