This topic describes the features of Super Computing Cluster (SCC) instance families and their instance types.

Recommended instance families

Introduction

SCCs are based on ECS Bare Metal Instances. High-speed Remote Direct Memory Access (RDMA) interconnects greatly improve network performance and the acceleration ratio (speedup) of large-scale clusters. As a result, SCCs offer all the benefits of ECS Bare Metal Instances together with high-bandwidth, low-latency networking.

SCCs are designed for applications such as high performance computing, artificial intelligence, machine learning, scientific and engineering computing, data analysis, and audio and video processing. Within a cluster, nodes are connected over RDMA networks that feature high bandwidth and low latency, which delivers the parallel efficiency that high performance computing applications require. RDMA over Converged Ethernet (RoCE) rivals InfiniBand in connection speed and supports more Ethernet-based applications.
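The acceleration ratio and parallel efficiency mentioned above can be made concrete with a short calculation. The following is a minimal sketch that assumes the usual definitions: speedup is the single-node runtime divided by the N-node runtime, and parallel efficiency is the speedup divided by N. The function names and the sample timings are hypothetical and are not measured SCC results.

```python
def speedup(t_single: float, t_parallel: float) -> float:
    """Acceleration ratio: single-node runtime divided by N-node runtime."""
    if t_single <= 0 or t_parallel <= 0:
        raise ValueError("runtimes must be positive")
    return t_single / t_parallel


def parallel_efficiency(t_single: float, t_parallel: float, nodes: int) -> float:
    """Speedup divided by the node count; 1.0 means ideal linear scaling."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return speedup(t_single, t_parallel) / nodes


if __name__ == "__main__":
    # Hypothetical runtimes in seconds, for illustration only.
    t_1_node, t_16_nodes = 3200.0, 250.0
    print(f"speedup on 16 nodes: {speedup(t_1_node, t_16_nodes):.2f}")
    print(f"parallel efficiency: {parallel_efficiency(t_1_node, t_16_nodes, 16):.2f}")
```

An efficiency well below 1.0 on a communication-heavy job usually points to interconnect latency or bandwidth as the limiting factor, which is the case that low-latency RDMA networking is meant to address.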

Combining SCCs with other Alibaba Cloud computing products such as ECS and GPU instances gives Elastic HPC high performance parallel computing resources and makes supercomputing on the cloud practical.

Comparison of SCCs, physical machines, and virtual machines

The following table compares the features of SCCs, physical machines, and virtual machines. In this table, Y means supported, N means not supported, and N/A means not applicable.

| Feature type | Feature | SCC | Physical machine | Virtual machine |
|---|---|---|---|---|
| Automated O&M | Delivery within minutes | Y | N | Y |
| Computing | Zero performance loss | Y | Y | N |
| | Zero feature loss | Y | Y | N |
| | Zero resource preemption | Y | Y | N |
| Storage | Compatible with ECS disks | Y | N | Y |
| | Startup from disks (system disks) | Y | N | Y |
| | Quick reset of system disks | Y | N | Y |
| | Compatible with ECS images | Y | N | Y |
| | Cold migration between physical and virtual machines | Y | N | Y |
| | Free of OS installation | Y | N | Y |
| | Free of local RAID, and stronger protection of data in disks | Y | N | Y |
| Network | Compatible with ECS VPCs | Y | N | Y |
| | Compatible with ECS classic networks | Y | N | Y |
| | Free of communication bottlenecks between physical and virtual machine clusters in VPCs | Y | N | Y |
| Management | Compatible with existing ECS management systems | Y | N | Y |
| | Consistent user experiences on features such as VNC with that of virtual machines | Y | N | Y |
| | Out-of-band (OOB) network security | Y | N | N/A |

scch5, SCC instance family with high clock speed

Features
  • I/O optimized
  • Supports only standard SSDs and ultra disks
  • Supports both RoCE and VPCs, of which RoCE is dedicated to RDMA communication
  • Has all the features of ECS Bare Metal Instances
  • Equipped with 3.1 GHz Intel® Xeon® Gold 6149 (Skylake) processors
  • CPU-to-memory ratio of 1:3
  • Suitable for the following scenarios:
    • Large-scale machine learning training
    • Large-scale high performance scientific computing and simulations
    • Large-scale data analysis, batch computing, and video encoding
Instance types
| Instance type | vCPUs | Physical cores | Memory (GiB) | GPUs | Bandwidth (Gbit/s) | Packet forwarding rate (Kpps) | RoCE (Gbit/s) | IPv6 support | NIC queues | ENIs (including one primary ENI) | Private IP addresses per ENI |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ecs.scch5.16xlarge | 64 | 32 | 192.0 | None | 10.0 | 4,500 | 2 × 25 | No | 8 | 32 | 10 |

sccg5, general purpose SCC instance family

Features
  • I/O optimized
  • Supports only standard SSDs and ultra disks
  • Supports both RoCE and VPCs, of which RoCE is dedicated to RDMA communication
  • Has all the features of ECS Bare Metal Instances
  • Equipped with 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors
  • CPU-to-memory ratio of 1:4
  • Suitable for the following scenarios:
    • Large-scale machine learning training
    • Large-scale high performance scientific computing and simulations
    • Large-scale data analysis, batch computing, and video encoding
Instance types
| Instance type | vCPUs | Physical cores | Memory (GiB) | GPUs | Bandwidth (Gbit/s) | Packet forwarding rate (Kpps) | RoCE (Gbit/s) | IPv6 support | NIC queues | ENIs (including one primary ENI) | Private IP addresses per ENI |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ecs.sccg5.24xlarge | 96 | 48 | 384.0 | None | 10.0 | 4,500 | 2 × 25 | No | 8 | 32 | 10 |

sccgn6, compute optimized SCC instance family with GPU capabilities

Features
  • I/O optimized
  • CPU-to-memory ratio of 1:4
  • Equipped with 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors
  • Has all the features of ECS Bare Metal Instances
  • Storage:
    • Supports ESSDs, standard SSDs, and ultra disks
    • Supports high performance CPFS
  • Networking:
    • Supports VPCs
    • Supports RoCE v2 networks, which are dedicated to RDMA communication
  • Uses NVIDIA V100 GPUs (SXM2 module):
    • Powered by the NVIDIA Volta architecture
    • 16 GB HBM2 GPU memory
    • 5,120 CUDA Cores
    • 640 Tensor Cores
    • Memory bandwidth of 900 GB/s
    • Supports up to six NVLink connections with a total bidirectional bandwidth of 300 GB/s (25 GB/s per direction per connection)
  • Suitable for the following scenarios:
    • Ultra-large-scale training for machine learning on a distributed GPU cluster
    • Large-scale high performance scientific computing and simulations
    • Large-scale data analysis, batch computing, and video encoding
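The NVLink figures in the feature list (six connections, 25 GB/s per connection, 300 GB/s total) relate by simple arithmetic. The following is a minimal sketch, assuming the 25 GB/s figure is per direction per connection, which is how NVLink on V100 is normally specified; the constant names are illustrative only.

```python
NVLINK_CONNECTIONS_PER_GPU = 6   # from the feature list
GB_PER_S_PER_DIRECTION = 25      # per connection, per direction (assumption noted above)

# Bandwidth summed over all connections in one direction.
per_direction_total = NVLINK_CONNECTIONS_PER_GPU * GB_PER_S_PER_DIRECTION

# Each connection carries traffic both ways, so the bidirectional total doubles.
bidirectional_total = 2 * per_direction_total

print(f"per-GPU NVLink bandwidth, one direction: {per_direction_total} GB/s")
print(f"per-GPU NVLink bandwidth, bidirectional: {bidirectional_total} GB/s")
```

Under this reading, the 300 GB/s total is the bidirectional aggregate across all six connections.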
Instance types
| Instance type | vCPUs | Memory (GiB) | Local storage (GiB) | GPUs | Bandwidth (Gbit/s) | Packet forwarding rate (Kpps) | RoCE (Gbit/s) | IPv6 support | NIC queues | ENIs (including one primary ENI) | Private IP addresses per ENI |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ecs.sccgn6.24xlarge | 96 | 384.0 | None | 8 × V100 | 30 | 4,500 | 2 × 25 | Yes | 8 | 32 | 10 |
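The CPU-to-memory ratios and RoCE figures listed for the three instance types can be cross-checked against the table values. The following is a minimal sketch: the `SccInstanceType` class and its field names are hypothetical, and it assumes that "2 × 25" means two RoCE ports of 25 Gbit/s each.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SccInstanceType:
    """Subset of the instance type table columns (illustrative structure)."""
    name: str
    vcpus: int
    memory_gib: float
    roce_ports: int
    roce_gbit_per_port: int

    @property
    def memory_per_vcpu(self) -> float:
        # GiB of memory per vCPU, the "N" in a 1:N CPU-to-memory ratio.
        return self.memory_gib / self.vcpus

    @property
    def roce_total_gbit(self) -> int:
        # Aggregate RoCE bandwidth across all ports.
        return self.roce_ports * self.roce_gbit_per_port


# Values copied from the instance type tables in this topic.
INSTANCE_TYPES = [
    SccInstanceType("ecs.scch5.16xlarge", 64, 192.0, 2, 25),
    SccInstanceType("ecs.sccg5.24xlarge", 96, 384.0, 2, 25),
    SccInstanceType("ecs.sccgn6.24xlarge", 96, 384.0, 2, 25),
]

for t in INSTANCE_TYPES:
    print(f"{t.name}: CPU-to-memory 1:{t.memory_per_vcpu:g}, "
          f"RoCE {t.roce_total_gbit} Gbit/s aggregate")
```

The computed ratios match the 1:3 figure stated for scch5 and the 1:4 figure stated for sccg5 and sccgn6.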

Billing method

SCCs support pay-as-you-go and subscription billing methods. For more information, see Billing method comparison.