On March 19, 2019, the GPU version of Alibaba Cloud RDS for PostgreSQL was officially launched. This version uses heterogeneous computing in RDS for parallel acceleration, which brings significant performance improvements: in comparative tests, GPU parallel computing was about 50 times faster on average than CPU computing. By applying heterogeneous computing at the cloud infrastructure level of RDS, it is the first database in China to apply GPUs commercially in the spatial information field.
A GPU is a graphics processor with a high-performance parallel architecture. Traditional CPUs typically have only four or eight computing cores, whereas GPUs can have several thousand. Coupled with data caches and flow control, this makes GPUs well suited to accelerating compute-intensive data processing tasks. GPUs were originally designed for parallel graphics computing. In recent years, however, with their growing use in AI and high-performance computing, GPUs have increasingly come to be seen as powerful general-purpose processors.
With the launch of this version, RDS adapts to the GPU framework at the cloud infrastructure level for the first time, laying the foundation for a parallel heterogeneous computing environment for ApsaraDB. Because spatial workloads involve large volumes of graphics and image data as well as complex computing, the first phase combines GPU acceleration with Ganos to improve spatial data processing performance. Ganos provides efficient computing services, such as storage, query, and analysis, for various types of spatio-temporal data on the cloud.
The system provides a heterogeneous CPU and GPU computing framework that automatically detects the GPU environment and uses rule-based optimization to evaluate whether a task should run on the CPU, on the GPU, or as CPU-GPU hybrid computing. For spatial computing, the GIS spatial parallel model (such as the Raster-Chunk-Cell framework for raster data) can be mapped optimally onto the parallel model of CUDA (the computing platform launched by NVIDIA), reducing GPU task scheduling overhead and maximizing the use of GPU resources.
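The rule-based choice between CPU, GPU, and hybrid execution can be pictured with a small Python sketch. This is not the actual optimizer in RDS: the `Task` fields, the thresholds, and the `choose_backend` function are all hypothetical and exist only to illustrate the idea that small jobs stay on the CPU while large ones are offloaded.

```python
from dataclasses import dataclass


@dataclass
class Task:
    cells: int         # number of raster cells (pixels) to process
    ops_per_cell: int  # rough arithmetic cost per cell


# Hypothetical thresholds, chosen only for illustration.
GPU_MIN_WORK = 10_000_000        # below this, data transfer overhead likely dominates
HYBRID_MIN_WORK = 1_000_000_000  # above this, split the work across CPU and GPU


def choose_backend(task: Task, gpu_available: bool) -> str:
    """Return 'cpu', 'gpu', or 'hybrid' using simple hand-written rules."""
    if not gpu_available:
        return "cpu"
    work = task.cells * task.ops_per_cell
    if work < GPU_MIN_WORK:
        return "cpu"
    if work < HYBRID_MIN_WORK:
        return "gpu"
    return "hybrid"
```

A real optimizer would presumably also weigh memory limits and transfer costs; the sketch only captures the shape of a rule table.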
Spatial raster data (remote sensing images, elevation models, and so on) is split into many chunks, is large in volume, and is slow to migrate to the cloud. Improving the write speed of ApsaraDB by optimizing basic algorithms is therefore a scenario that users care about. Take raster resampling as an example: cubic convolution and more advanced resampling algorithms produce good image quality, but their computing workload is several times higher. Improving the efficiency of these basic algorithms is thus the key to speeding up raster data processing. To output a remote sensing image of 10,000 × 10,000 pixels, the sampling calculation must be run independently 100 million times, which is computation-intensive. Because raster data is a matrix model, GPU parallel acceleration can be fully used to improve resampling efficiency.
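To make the cost of cubic convolution concrete, here is a minimal pure-Python sketch. It assumes the commonly used Keys kernel with a = -0.5 and edge clamping, neither of which is stated in the article, and the function names are illustrative. Each output pixel reads a 4 × 4 neighbourhood and depends on no other output pixel, which is why the work maps naturally onto one GPU thread per pixel.

```python
import math


def cubic_kernel(x: float, a: float = -0.5) -> float:
    """Keys cubic convolution weight for a sample at distance x."""
    x = abs(x)
    if x <= 1.0:
        return (a + 2.0) * x**3 - (a + 3.0) * x**2 + 1.0
    if x < 2.0:
        return a * x**3 - 5.0 * a * x**2 + 8.0 * a * x - 4.0 * a
    return 0.0


def sample_cubic(img, x: float, y: float) -> float:
    """Interpolate img (a list of rows) at fractional column x and row y."""
    h, w = len(img), len(img[0])
    x0, y0 = math.floor(x), math.floor(y)
    total = 0.0
    for j in range(y0 - 1, y0 + 3):          # 4 rows of the neighbourhood
        wy = cubic_kernel(y - j)
        jj = min(max(j, 0), h - 1)           # clamp to the image edge
        for i in range(x0 - 1, x0 + 3):      # 4 columns of the neighbourhood
            wx = cubic_kernel(x - i)
            ii = min(max(i, 0), w - 1)
            total += img[jj][ii] * wx * wy
    return total


def resample(img, out_w: int, out_h: int):
    """Resample img to out_w x out_h; every output pixel is independent."""
    h, w = len(img), len(img[0])
    sx, sy = w / out_w, h / out_h
    return [
        [sample_cubic(img, (c + 0.5) * sx - 0.5, (r + 0.5) * sy - 0.5)
         for c in range(out_w)]
        for r in range(out_h)
    ]
```

For a 10,000 × 10,000 output, `sample_cubic` would be called 100 million times, each call performing 16 weighted reads. On a CPU these calls run one after another, while a GPU can run thousands of them at once.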
In remote sensing applications, coordinate systems are often inconsistent because of different data sources or collection methods. When the data needs to be overlaid and displayed together, it is either converted into the same coordinate system in advance or projected dynamically in real time. The former introduces data redundancy. The latter avoids it, but because of the large amount of computation involved, real-time display is often difficult to achieve on CPUs alone. GPU parallel computing solves this problem by improving the efficiency of dynamic projection.
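As an illustration of why dynamic projection is computation-heavy yet parallel-friendly, the following Python sketch reprojects every cell of a longitude/latitude grid. The article does not name specific coordinate systems; the pair used here (WGS84 geographic coordinates to spherical Web Mercator) and the function names are assumptions made only for the example.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis, as used by Web Mercator


def lonlat_to_web_mercator(lon_deg: float, lat_deg: float):
    """Convert one longitude/latitude pair (degrees) to Web Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4.0 + math.radians(lat_deg) / 2.0)
    )
    return x, y


def project_grid(lon0, lat0, dlon, dlat, cols, rows):
    """Project the centre of every cell in a regular lon/lat grid.

    (lon0, lat0) is the top-left corner; dlon and dlat are cell sizes in degrees.
    Each cell is transformed independently of all the others.
    """
    return [
        [lonlat_to_web_mercator(lon0 + (c + 0.5) * dlon, lat0 - (r + 0.5) * dlat)
         for c in range(cols)]
        for r in range(rows)
    ]
```

Each cell needs a logarithm and a tangent, and no cell depends on another, so the per-cell function is the kind of work that can be assigned to individual GPU threads.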
With the rapid development of Internet-based travel services and location sensing technology, the amount of data generated by moving objects (such as planes, ships, and cars) keeps increasing. Accessing, updating, displaying, transmitting, and storing such large amounts of track data presents a great challenge to the industry. With GPU parallel computing, the track data of moving objects can be thinned dynamically in real time, which reduces storage requirements and allows the data to be transmitted and displayed quickly.
A comparative test found that the parallel computing capability of GPUs is about 50 times higher on average than that of CPUs. The advantage becomes even clearer as data volume increases and computation tasks grow in complexity. For example, when data is stored on an SSD cloud disk, GPUs are 9 to 15 times faster than CPUs at storing remote sensing images (including index creation). When data is stored in Alibaba Cloud OSS, GPUs are 4 to 7 times faster than CPUs at importing the data.
The GPU version of Alibaba Cloud RDS for PostgreSQL has been launched on the public cloud and is currently available only in China East 2 (Shanghai). When you purchase the product, select the PG10 Basic edition, and then select the GPU acceleration model in the Type column. The GPU environment is set up by default, so you can benefit from GPU-accelerated computing without setting any parameters.