Hybrid Cloud CPFS Storage is a file storage service for high-performance computing (HPC) that supports the standard Portable Operating System Interface (POSIX) and MPI-IO interfaces. Existing HPC programs can run efficiently on it without interface adaptation or performance optimization.
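Because the service exposes standard POSIX semantics, an application should be able to use ordinary file calls against the mounted file system. The following is a minimal sketch rather than an official sample: the mount path `/mnt/cpfs` and the helper name `write_and_read_back` are assumptions made for illustration, and the script falls back to a temporary directory when that path does not exist.

```python
import os
import tempfile

MOUNT_POINT = "/mnt/cpfs"  # hypothetical mount path, not taken from the product documentation


def write_and_read_back(directory: str, name: str, payload: bytes) -> bytes:
    """Write payload with POSIX-style open/write/fsync calls, then read it back."""
    path = os.path.join(directory, name)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(payload)
        while view:  # os.write may write fewer bytes than requested
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)  # ask the file system to persist the data
    finally:
        os.close(fd)
    with open(path, "rb") as handle:
        return handle.read()


# Fall back to a temporary directory when the assumed mount point is absent or read-only.
usable = os.path.isdir(MOUNT_POINT) and os.access(MOUNT_POINT, os.W_OK)
base_dir = MOUNT_POINT if usable else tempfile.mkdtemp()
result = write_and_read_back(base_dir, "posix_check.bin", b"hello cpfs")
print(result == b"hello cpfs")
```

The same code runs unchanged on a local disk, which is the point of POSIX compatibility: no storage-specific client library is involved.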

Architecture

Hybrid Cloud CPFS Storage is a file storage service for high-performance and ultra-large-scale storage scenarios. Its architecture combines multi-level storage pools on-premises and in the cloud.

A single cluster can scale out to a maximum of 16,384 nodes, which provides a distributed file storage service with high performance, high scalability, and low latency. The service applies to scenarios such as autonomous driving model training, genome sequence assembly, and oil exploration data analysis.

Benefits

Benefit Description
Hybrid cloud storage architecture
  • Alibaba Cloud storage services are integrated to provide a premium experience for customers in cloud bursting scenarios.
High scalability
  • A single Hybrid Cloud CPFS Storage cluster can scale out to a maximum of 16,384 nodes.
  • The fully symmetric, distributed architecture allows throughput to scale linearly for both metadata and file data.
  • Seamless scale-out: During scale-out, the load on existing storage nodes is detected automatically and used to control the scale-out speed.
  • Multi-level storage architecture: An on-premises Hybrid Cloud CPFS Storage cluster can be extended to use Cloud Paralleled File System (CPFS) and Object Storage Service (OSS) storage on Alibaba Cloud.
High performance
  • A single Hybrid Cloud CPFS Storage cluster supports a maximum throughput at the TB/s level.
  • 100 Gigabit Ethernet is supported, with a throughput of 2.3 GB/s per node.
  • 100 Gb/s and 200 Gb/s InfiniBand networks are supported, which improves throughput and reduces latency.
High availability and reliability
  • Rolling upgrades are supported, so updates do not interrupt services.
  • Faults are detected within seconds, so damaged disks and failed service nodes are identified at the earliest opportunity.
  • Multiple data protection modes are supported.
    • Multi-replica modes: 2-replica and 3-replica
    • Erasure coding modes: 4+2p, 4+3p, 8+2p, and 8+3p
Rich interface protocols
  A wide range of interface protocols is supported, including POSIX, file protocols such as Network File System (NFS) and Server Message Block (SMB), object storage protocols, and Hadoop Distributed File System (HDFS):
  • NFS v4 and v3
  • SMB 3.0, 2.1, and 2.0
  • OpenStack Swift along with Keystone v3
  • S3
  • HDFS Transparency 3.1.0-X, 3.0.0-X, 2.7.3-X, 2.7.2-X, and 2.7.0-X
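The data protection modes listed above trade usable capacity against fault tolerance. The sketch below works through that arithmetic under two assumptions that the source does not state: that "k+mp" means k data strips plus m parity strips, and that such a layout survives the loss of up to m strips (as with a typical maximum-distance-separable code). The function names are illustrative only.

```python
from fractions import Fraction


def replica_profile(copies: int) -> dict:
    """N-replica mode: one usable copy out of N, survives N - 1 lost copies."""
    return {"usable": Fraction(1, copies), "tolerated_losses": copies - 1}


def erasure_profile(data: int, parity: int) -> dict:
    """k+mp mode (assumed: k data strips, m parity strips, survives m losses)."""
    return {"usable": Fraction(data, data + parity), "tolerated_losses": parity}


modes = {
    "2-replica": replica_profile(2),
    "3-replica": replica_profile(3),
    "4+2p": erasure_profile(4, 2),
    "4+3p": erasure_profile(4, 3),
    "8+2p": erasure_profile(8, 2),
    "8+3p": erasure_profile(8, 3),
}

for name, profile in modes.items():
    usable = float(profile["usable"])
    print(f"{name:10s} usable capacity {usable:.1%}, "
          f"tolerates {profile['tolerated_losses']} lost copies or strips")
```

Under these assumptions, 8+2p keeps 80% of raw capacity usable while tolerating two losses, whereas 3-replica tolerates the same number of losses with only about 33% usable.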

Scenarios

Scenario Description
Autonomous driving model training
  • Hybrid Cloud CPFS Storage provides low-latency and high-IOPS access to the large number of small files collected by vehicle-mounted devices such as cameras, radars, and infrared sensors in the autonomous driving scenario. The model training speed can be increased by more than three times.
Genome sequence assembly
  • Genome sequence assembly requires a large number of concurrent computing jobs. Hybrid Cloud CPFS Storage provides an access bandwidth of up to 100 GB/s to meet the access requirements of hundreds of concurrent jobs. This breaks the file I/O bottleneck and shortens the job completion time by 50%.
Oil exploration data analysis
  • A large amount of geological data needs to be computed, processed, and analyzed. In addition, the raw seismic data and process data need to be stored for a long time. Hybrid Cloud CPFS Storage provides PB-level namespaces and supports flexible quota management for namespaces. Resources can be allocated to different computing jobs to meet the needs of business expansion.
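As a rough planning aid for the bandwidth figures quoted in this document, the sketch below estimates how many storage nodes would be needed to reach a target aggregate bandwidth and how much bandwidth each concurrent job would receive. It assumes ideal linear scaling at the 2.3 GB/s per-node figure quoted for 100 Gigabit Ethernet and perfectly even sharing between jobs; the job count of 200 is a hypothetical stand-in for "hundreds of concurrent jobs". Real workloads will deviate from both assumptions.

```python
import math

PER_NODE_GB_S = 2.3  # per-node throughput over 100 Gigabit Ethernet, as quoted in this document


def nodes_for_bandwidth(target_gb_s: float, per_node_gb_s: float = PER_NODE_GB_S) -> int:
    """Smallest node count that reaches the target, assuming ideal linear scaling."""
    return math.ceil(target_gb_s / per_node_gb_s)


def per_job_share(total_gb_s: float, jobs: int) -> float:
    """Bandwidth per job, assuming the total is shared evenly."""
    return total_gb_s / jobs


genome_target = 100.0   # GB/s, the access bandwidth quoted for genome sequence assembly
assumed_jobs = 200      # hypothetical job count, not from the source

print("nodes for 100 GB/s:", nodes_for_bandwidth(genome_target))
print("nodes for 1 TB/s:", nodes_for_bandwidth(1000.0))
print("GB/s per job:", per_job_share(genome_target, assumed_jobs))
```

The estimate is a lower bound on node count, since it ignores protocol overhead, data protection traffic, and uneven job demand.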

Specifications

For more information about the service specifications, see the following documents: