This topic describes the features of big data instance families of Elastic Compute
Service (ECS) and lists the instance types of each instance family.
- Recommended instance families
- Other available instance families (If these instance families are sold out, you can
use the recommended ones.)
Description
Big data instance families are designed to provide cloud computing and big data storage
to support the needs of big data-oriented enterprises. These instance families are
suitable for scenarios that require offline computing and big data storage, such as
Hadoop distributed computing, extensive log processing, and large-scale data warehousing.
Big data instance families are ideal for businesses that use distributed networks and
have high requirements for storage performance, capacity, and internal bandwidth.
These instance families are suitable for customers in industries such as Internet
and finance that need to compute, store, and analyze big data. Big data instance families
use local storage to ensure large amounts of storage space and high storage performance.
Big data instances have the following benefits:
- Enterprise-level computing power ensures efficient and stable data processing.
- Network performance is enhanced with higher maximum internal bandwidth per instance
and higher maximum packet forwarding rates to satisfy data transfer demands such as
shuffling in Hadoop MapReduce at peak times.
- Each disk can deliver sequential read and write performance of up to 190 MB/s, and
each instance can deliver a storage throughput of up to 5 GB/s. This reduces the amount
of time required to read data from or write data to Hadoop Distributed File System
(HDFS) files. When an instance is created or started for the first time, its disks
must be pre-warmed before they can achieve optimal performance.
- The cost of local storage is 97% lower than that of standard SSDs. This significantly
reduces the cost to build Hadoop clusters.
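To make the throughput figures above concrete, the following sketch estimates the aggregate sequential throughput of an instance from its number of local disks. It rests on two assumptions that the documentation does not state: per-disk throughput adds up linearly, and the per-instance ceiling of 5 GB/s is taken as 5000 MB/s. The function name estimate_throughput_mbps is hypothetical.

```shell
#!/bin/sh
# Rough estimate of aggregate sequential throughput for local disks.
# Assumptions (not from the product documentation): throughput scales
# linearly with the disk count and is capped at the per-instance limit.

PER_DISK_MBPS=190       # up to 190 MB/s per disk
INSTANCE_CAP_MBPS=5000  # up to 5 GB/s per instance, taken as 5000 MB/s

estimate_throughput_mbps() {
    disks=$1
    total=$((disks * PER_DISK_MBPS))
    if [ "$total" -gt "$INSTANCE_CAP_MBPS" ]; then
        total=$INSTANCE_CAP_MBPS
    fi
    echo "$total"
}

estimate_throughput_mbps 12   # 12 disks: 2280
estimate_throughput_mbps 30   # 30 disks: 5700 exceeds the cap, so 5000
```

Under these assumptions, an instance needs at least 27 disks before the per-instance ceiling, rather than the disks themselves, becomes the limiting factor.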
When you use big data instances, take note of the following items:
- Instances with local SSDs do not support instance configuration changes or failovers.
- Local disks are tied to specific instance types. The number and capacity of
local disks attached to an instance vary based on the instance type. You cannot separately
purchase local disks, or detach local disks from the associated instances and then
attach the disks to other instances.
- You cannot create snapshots for local disks. If you want to create an image from the
system disk and data disks of an instance with local SSDs, we recommend that you create
an image by combining the snapshots of both the system disk and data disks. In this
case, the data disks must be cloud disks.
- You cannot use an instance ID to create an image that contains snapshots of both
the system disk and data disks.
- You can attach a standard SSD to an instance with local SSDs and extend the capacity
of the standard SSD.
- Operations on an instance with local SSDs may affect the data stored on the local
SSDs. For more information, see Impacts of instance operations on data stored on local disks.
Best practices for mounting a file system to a big data instance
The first time you mount a file system such as ext4, the inode table must be initialized.
By default, the lazyinit feature is enabled in Linux kernel v2.6.37 and later, which
defers inode table initialization until the file system is mounted. Local disks consume
a large amount of throughput while they are being initialized, such as 600 MB/s for
30 local disks, which may affect service stability. The concurrency of lazyinit is
improved in Linux kernel v4.x to resolve this problem. For more information, see the
Linux stable kernel repository (kernel/git/stable/linux.git).
We recommend that you use the following best practices to initialize the inode table
at your earliest opportunity:
- Obtain a list of all local Serial Advanced Technology Attachment (SATA) HDDs.
- Run the following command to initialize each local disk separately.
In this example, an ext4 file system is created on a local disk whose device name
is /dev/vdb.
mkfs.ext4 -E lazy_itable_init=0,lazy_journal_init=0 /dev/vdb &
- After all local disks are initialized, run the iostat -x 5 command and wait until the I/O activity of every local disk is displayed as 0.
- Batch run the mount command.
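The steps above can be sketched as a small script. This is an illustration under stated assumptions, not an official procedure: the device names /dev/vdb and /dev/vdc, the /mnt/<device> mount points, the run helper, and the DRY_RUN variable are all hypothetical. The script assumes that the disks contain no existing file system, and it leaves the iostat check as a manual step. By default it only prints the commands that it would run.

```shell
#!/bin/sh
# Sketch of the initialization steps. Defaults to a dry run that only
# prints commands. Set DRY_RUN=0 to execute them; mkfs.ext4 destroys any
# existing data on each device.

DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create ext4 on every device in parallel with lazy initialization disabled,
# so that mkfs writes the inode table and journal in full before it exits.
init_disks() {
    for dev in "$@"; do
        run mkfs.ext4 -E lazy_itable_init=0,lazy_journal_init=0 "$dev" &
    done
    wait   # block until every background mkfs has exited
}

# Mount each device at a hypothetical mount point /mnt/<device name>.
mount_disks() {
    for dev in "$@"; do
        mnt="/mnt/$(basename "$dev")"
        run mkdir -p "$mnt"
        run mount "$dev" "$mnt"
    done
}

# Example device names; replace them with the list of local disks.
DISKS="/dev/vdb /dev/vdc"
init_disks $DISKS    # unquoted on purpose: one argument per device
mount_disks $DISKS
```

Between the two calls, iostat -x 5 can still be used to confirm that disk I/O has settled before the file systems are mounted.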
d3c, compute-intensive big data instance family
Note: This instance family is in invitational preview. To use this instance family,
submit a ticket.
Features:
- This instance family is equipped with high-capacity and high-throughput local SSDs
and can provide a maximum bandwidth of 32 Gbit/s between instances.
- Supports online replacement and hot swapping of damaged disks to prevent instance
shutdown.
If a local disk fails, you receive a notification about the system event. You can
handle the system event by initiating the process of fixing the damaged disk. For
more information, see
O&M scenarios and system events for instances equipped with local disks.
Notice: After you initiate the process of fixing the damaged disk, data on the damaged disk
cannot be restored.
- Compute:
- Uses 2.7 GHz third-generation Intel® Xeon® Scalable (Ice Lake) processors that deliver an all-core turbo frequency of 3.5 GHz
for consistent computing performance.
- Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports enhanced SSDs (ESSDs), standard SSDs, and ultra disks.
- Network:
- Supports IPv6.
- Provides high network performance based on large computing capacity.
- Supported scenarios:
- Big data computing and storage business scenarios in which services such as Hadoop
MapReduce, HDFS, Hive, and HBase are used
- Scenarios in which EMR JindoFS and Object Storage Service (OSS) are used
in combination to separately store hot and cold data and decouple storage from computing
- Machine learning scenarios such as Spark in-memory computing and MLlib
- Search and log data processing scenarios in which solutions such as Elasticsearch
and Kafka are used
Instance types
| Instance type | vCPUs | Memory (GiB) | Local storage (GiB) | Baseline/burst bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IP addresses per ENI |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ecs.d3c.3xlarge | 14 | 56.0 | 1 × 16000 | 8/burstable up to 10 | 1,600,000 | 8 | 8 | 30 |
| ecs.d3c.7xlarge | 28 | 112.0 | 2 × 16000 | 16/burstable up to 25 | 2,500,000 | 16 | 8 | 30 |
| ecs.d3c.14xlarge | 56 | 224.0 | 4 × 16000 | 32/none | 5,000,000 | 28 | 8 | 30 |
| ecs.d3c.16xlarge | 64 | 256.0 | 4 × 16000 | 32/none | 5,000,000 | 32 | 8 | 30 |
d2c, compute-intensive big data instance family
Features:
- This instance family is equipped with high-capacity and high-throughput local SATA
HDDs and can provide a maximum bandwidth of 35 Gbit/s between instances.
- Supports online replacement and hot swapping of damaged disks to prevent instance
shutdown.
If a local disk fails, you receive a notification about the system event. You can
handle the system event by initiating the process of fixing the damaged disk. For
more information, see
O&M scenarios and system events for instances equipped with local disks.
Notice: After you initiate the process of fixing the damaged disk, data on the damaged disk
cannot be restored.
- Compute:
- Uses 2.5 GHz Intel® Xeon® Platinum 8269CY (Cascade Lake) processors.
- Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports ESSDs, standard SSDs, and ultra disks.
- Network:
- Supports IPv6.
- Provides high network performance based on large computing capacity.
- Supported scenarios:
- Big data computing and storage business scenarios in which services such as Hadoop
MapReduce, HDFS, Hive, and HBase are used
- Scenarios in which EMR JindoFS and Object Storage Service (OSS) are used in combination
to separately store hot and cold data and decouple storage from computing
- Machine learning scenarios such as Spark in-memory computing and MLlib
- Search and log data processing scenarios in which solutions such as Elasticsearch
and Kafka are used
Instance types
| Instance type | vCPUs | Memory (GiB) | Local storage (GiB) | Bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IP addresses per ENI |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ecs.d2c.6xlarge | 24 | 88.0 | 3 × 4000 | 12.0 | 1,600,000 | 8 | 8 | 20 |
| ecs.d2c.12xlarge | 48 | 176.0 | 6 × 4000 | 20.0 | 2,000,000 | 16 | 8 | 20 |
| ecs.d2c.24xlarge | 96 | 352.0 | 12 × 4000 | 35.0 | 4,500,000 | 16 | 8 | 20 |
d2s, storage-intensive big data instance family
Features:
- This instance family is equipped with high-capacity and high-throughput local SATA
HDDs and can provide a maximum bandwidth of 35 Gbit/s between instances.
- Supports online replacement and hot swapping of damaged disks to prevent instance
shutdown.
If a local disk fails, you receive a notification about the system event. You can
handle the system event by initiating the process of fixing the damaged disk. For
more information, see
O&M scenarios and system events for instances equipped with local disks.
Notice: After you initiate the process of fixing the damaged disk, data on the damaged disk
cannot be restored.
- Compute:
- Uses 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors.
- Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports ESSDs, standard SSDs, and ultra disks.
- Network:
- Supports IPv6.
- Provides high network performance based on large computing capacity.
- Supported scenarios:
- Big data computing and storage business scenarios in which services such as Hadoop
MapReduce, HDFS, Hive, and HBase are used
- Machine learning scenarios such as Spark in-memory computing and MLlib
- Search and log data processing scenarios in which solutions such as Elasticsearch
and Kafka are used
Instance types
| Instance type | vCPUs | Memory (GiB) | Local storage (GiB) | Bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IP addresses per ENI |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ecs.d2s.5xlarge | 20 | 88.0 | 8 × 7300 | 12.0 | 1,600,000 | 8 | 8 | 20 |
| ecs.d2s.10xlarge | 40 | 176.0 | 15 × 7300 | 20.0 | 2,000,000 | 16 | 8 | 20 |
| ecs.d2s.20xlarge | 80 | 352.0 | 30 × 7300 | 35.0 | 4,500,000 | 32 | 8 | 20 |
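The Local storage (GiB) column uses a "count × size" notation: the number of local disks multiplied by the capacity of each disk in GiB. As a small worked example, the following sketch computes the total raw local capacity of an instance type, before any file system or replication overhead. The function names total_gib and gib_to_tib are illustrative only.

```shell
#!/bin/sh
# Total raw local storage for an instance type, derived from the
# "count × size" notation in the Local storage (GiB) column.

total_gib() {
    # $1 = number of local disks, $2 = capacity per disk in GiB
    echo $(( $1 * $2 ))
}

gib_to_tib() {
    # 1 TiB = 1024 GiB; awk handles the fractional result
    awk -v gib="$1" 'BEGIN { printf "%.1f\n", gib / 1024 }'
}

total=$(total_gib 30 7300)   # ecs.d2s.20xlarge: 30 × 7300
echo "$total GiB"            # 219000 GiB
gib_to_tib "$total"          # 213.9
```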
d1ne, network-enhanced big data instance family
Features:
- This instance family is equipped with high-capacity and high-throughput local SATA
HDDs and can provide a maximum bandwidth of 35 Gbit/s between instances.
- Compute:
- Offers a CPU-to-memory ratio of 1:4, which is designed for big data scenarios.
- Uses 2.5 GHz Intel® Xeon® E5-2682 v4 (Broadwell) processors.
- Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports only standard SSDs and ultra disks.
- Network:
- Supports IPv6.
- Provides high network performance based on large computing capacity.
- Supported scenarios:
- Scenarios in which services such as Hadoop MapReduce, HDFS, Hive, and HBase are used
- Machine learning scenarios such as Spark in-memory computing and MLlib
- Search and log data processing scenarios in which solutions such as Elasticsearch
are used
Instance types
| Instance type | vCPUs | Memory (GiB) | Local storage (GiB) | Bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IP addresses per ENI |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ecs.d1ne.2xlarge | 8 | 32.0 | 4 × 5500 | 6.0 | 1,000,000 | 4 | 4 | 10 |
| ecs.d1ne.4xlarge | 16 | 64.0 | 8 × 5500 | 12.0 | 1,600,000 | 4 | 8 | 20 |
| ecs.d1ne.6xlarge | 24 | 96.0 | 12 × 5500 | 16.0 | 2,000,000 | 6 | 8 | 20 |
| ecs.d1ne-c8d3.8xlarge | 32 | 128.0 | 12 × 5500 | 20.0 | 2,000,000 | 6 | 8 | 20 |
| ecs.d1ne.8xlarge | 32 | 128.0 | 16 × 5500 | 20.0 | 2,500,000 | 8 | 8 | 20 |
| ecs.d1ne-c14d3.14xlarge | 56 | 160.0 | 12 × 5500 | 35.0 | 4,500,000 | 14 | 8 | 20 |
| ecs.d1ne.14xlarge | 56 | 224.0 | 28 × 5500 | 35.0 | 4,500,000 | 14 | 8 | 20 |
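The 1:4 CPU-to-memory ratio listed in the features can be checked against the table by dividing memory by vCPUs. The following sketch does this for a few rows. The function name mem_per_vcpu is hypothetical, and the values are copied from the table above.

```shell
#!/bin/sh
# Memory (GiB) per vCPU for an instance type, to compare with the
# 1:4 CPU-to-memory ratio stated for the instance family.

mem_per_vcpu() {
    # $1 = vCPUs, $2 = memory in GiB
    awk -v c="$1" -v m="$2" 'BEGIN { printf "%.2f\n", m / c }'
}

mem_per_vcpu 8 32     # ecs.d1ne.2xlarge: 4.00
mem_per_vcpu 56 224   # ecs.d1ne.14xlarge: 4.00
mem_per_vcpu 56 160   # ecs.d1ne-c14d3.14xlarge: 2.86
```

Most instance types in the table provide 4 GiB of memory per vCPU. According to the listed values, ecs.d1ne-c14d3.14xlarge is the exception, at about 2.86 GiB per vCPU.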
d1, big data instance family
Features:
- This instance family is equipped with high-capacity and high-throughput local SATA
HDDs and can provide a maximum bandwidth of 17 Gbit/s between instances.
- Compute:
- Offers a CPU-to-memory ratio of 1:4, which is designed for big data scenarios.
- Uses 2.5 GHz Intel® Xeon® E5-2682 v4 (Broadwell) processors.
- Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports only standard SSDs and ultra disks.
- Network:
- Provides high network performance based on large computing capacity.
- Supported scenarios:
- Scenarios in which services such as Hadoop MapReduce, HDFS, Hive, and HBase are used
- Machine learning scenarios such as Spark in-memory computing and MLlib
- Scenarios in which customers in industries such as Internet and finance need to compute,
store, and analyze big data
- Search and log data processing scenarios in which solutions such as Elasticsearch
are used
Instance types
| Instance type | vCPUs | Memory (GiB) | Local storage (GiB) | Bandwidth (Gbit/s) | Packet forwarding rate (pps) | NIC queues | ENIs | Private IP addresses per ENI |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ecs.d1.2xlarge | 8 | 32.0 | 4 × 5500 | 3.0 | 300,000 | 1 | 4 | 10 |
| ecs.d1.3xlarge | 12 | 48.0 | 6 × 5500 | 4.0 | 400,000 | 1 | 6 | 10 |
| ecs.d1.4xlarge | 16 | 64.0 | 8 × 5500 | 6.0 | 600,000 | 2 | 8 | 20 |
| ecs.d1.6xlarge | 24 | 96.0 | 12 × 5500 | 8.0 | 800,000 | 2 | 8 | 20 |
| ecs.d1-c8d3.8xlarge | 32 | 128.0 | 12 × 5500 | 10.0 | 1,000,000 | 4 | 8 | 20 |
| ecs.d1.8xlarge | 32 | 128.0 | 16 × 5500 | 10.0 | 1,000,000 | 4 | 8 | 20 |
| ecs.d1-c14d3.14xlarge | 56 | 160.0 | 12 × 5500 | 17.0 | 1,800,000 | 6 | 8 | 20 |
| ecs.d1.14xlarge | 56 | 224.0 | 28 × 5500 | 17.0 | 1,800,000 | 6 | 8 | 20 |