Meteorological data is a typical kind of big data: it is large in volume, highly time-sensitive, and varied. It is mainly spatio-temporal data that holds observed and simulated values of individual physical quantities over a range of time or space. The volume produced each day typically ranges from tens to hundreds of TB and is growing rapidly, which makes storing and querying meteorological data efficiently increasingly difficult.
In recent years, both academia and industry have begun to use distributed NoSQL storage to store and query massive amounts of meteorological data in real time. Compared with traditional solutions, distributed NoSQL storage can support larger data sizes, deliver better query performance, and significantly improve stability and manageability.
There is also a growing trend toward parsing, storing, querying, and analyzing data with cloud computing resources. The cloud offers a wide range of products and services along with elastic computing resources, which together can support the whole meteorological data processing workflow: cloud distributed storage stores and serves meteorological data in real time, big data computing services analyze and process it, and application services host the resulting meteorological platforms and apps.
Alibaba Cloud Table Store is a distributed NoSQL service on the cloud that supports high-concurrency data read/write operations and PB-level data storage and provides the ability to read data in milliseconds.
On the one hand, Table Store is a distributed NoSQL storage service. Data is spread across many servers, and a single cluster can hold 10 PB of data, which addresses the storage problem.
On the other hand, Table Store supports fast single-row queries and range queries; from the data model perspective, a table is a large SortedMap. Even if a table holds tens of billions or even trillions of rows, locating a single row does not get slower. This avoids the problem seen when a file system contains an excessive number of small files, where locating the target data takes longer.
Using distributed systems and cloud computing services to solve big data problems is a growing trend across many industries, and more advanced industry solutions are likely to be implemented on the cloud in the future. For details about the traditional solutions and the Table Store solution for storing and querying massive amounts of meteorological model data, and a comparison of their advantages and disadvantages, see Meteorological Data Storage and Querying with NoSQL.
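The SortedMap view can be illustrated with a small in-memory sketch. This is not the Table Store SDK; the `SortedTable` class, its method names, and the `(variable, forecast_time, level)` key layout are all hypothetical and chosen only to show how rows kept in primary-key order allow both point lookups and range scans by binary search.

```python
import bisect


class SortedTable:
    """Toy in-memory stand-in for a table whose rows are sorted by primary key."""

    def __init__(self):
        self._keys = []    # primary keys, always kept in sorted order
        self._values = []  # row data, stored at the same index as its key

    def put_row(self, key, row):
        # Binary search for the key's position, then overwrite or insert.
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._values[i] = row
        else:
            self._keys.insert(i, key)
            self._values.insert(i, row)

    def get_row(self, key):
        # Single-row query: binary search, so the cost grows only
        # logarithmically with the number of rows.
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return None

    def get_range(self, start, end):
        # Range query: all rows with start <= key < end, in key order.
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return list(zip(self._keys[lo:hi], self._values[lo:hi]))


# Hypothetical primary key: (variable, forecast_time, pressure_level).
table = SortedTable()
table.put_row(("temperature", "2019-01-01T06", 850), {"value": 271.9})
table.put_row(("temperature", "2019-01-01T00", 850), {"value": 272.4})
table.put_row(("temperature", "2019-01-01T00", 500), {"value": 251.0})
table.put_row(("wind_speed", "2019-01-01T00", 850), {"value": 12.3})

# All temperature rows for the 00:00 forecast, regardless of level.
hits = table.get_range(
    ("temperature", "2019-01-01T00", 0),
    ("temperature", "2019-01-01T06", 0),
)
```

Because the key starts with the variable and the forecast time, rows that are usually read together sit next to each other, so one range scan returns them without touching unrelated rows.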
Table Store is a multi-model NoSQL database developed by Alibaba Cloud. It provides PB-level structured data storage, 10 million TPS, and millisecond-level latency. In real-time computing, Table Store offers strong write capability and multiple storage models, so it can serve not only as a result table for computations but also as a source table for real-time computing.
Blink is a real-time computing platform that Alibaba Cloud built as a heavily enhanced version of Apache Flink. Like Flink, Blink aims to unify stream processing and batch processing.
This article introduces architectural best practices for real-time computing based on Table Store and Blink, as well as a data analysis job built on the two.
We take a big data analysis system for situational awareness as an example to illustrate the advantages of a real-time computing architecture based on Table Store and Blink. Suppose our client is the CEO of a large catering enterprise with chain stores all over the country. The CEO wants to know whether customers everywhere are receiving good service in the stores. For example, do customers in Taiwan and customers in Sichuan rate the taste of dishes differently? Have any dishes become less popular? To answer these questions, the CEO needs a big data analysis system that, on the one hand, monitors dish sales in each region in real time and, on the other hand, regularly analyzes historical data to report changes in customer trends.
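The real-time half of such a system typically groups sales events into fixed time windows per region and dish. The sketch below shows that idea in plain Python; it is not Blink or Flink code, and the function name, the event tuple layout `(timestamp_seconds, region, dish, quantity)`, and the sample values are assumptions made for illustration.

```python
from collections import defaultdict


def tumbling_window_sales(events, window_seconds):
    """Sum quantities per (window_start, region, dish) over fixed-size windows.

    Each event is a tuple (timestamp_seconds, region, dish, quantity).
    """
    totals = defaultdict(int)
    for ts, region, dish, qty in events:
        # Align the timestamp to the start of its tumbling window.
        window_start = ts - ts % window_seconds
        totals[(window_start, region, dish)] += qty
    return dict(totals)


# Hypothetical sales events.
events = [
    (0, "Sichuan", "mapo tofu", 2),
    (30, "Sichuan", "mapo tofu", 1),
    (65, "Taiwan", "beef noodles", 4),
]
per_minute = tumbling_window_sales(events, window_seconds=60)
# per_minute == {(0, "Sichuan", "mapo tofu"): 3, (60, "Taiwan", "beef noodles"): 4}
```

Running the same aggregation over stored history with a much larger window (for example, one day) gives the periodic trend analysis described above.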
Technology is a rapidly evolving space that offers a vast number of options for varying needs, and you have to weigh those options carefully to find the ones that suit your overall requirements. Databases are no exception: they have kept changing over the decades, from relational to non-relational, from SQL to NoSQL, and from structured databases to huge unstructured ones that require big data environments. Leading organizations such as Alibaba Cloud, Facebook, Twitter, and Google handle huge data streams and big data processing so that they can respond quickly to user queries, and this takes considerable effort. All in all, the database has always been one of the core components of application development, whether small or large in scale.
Table Store is a NoSQL database service that is built on Alibaba Cloud's Apsara distributed file system and lets you store and access massive volumes of structured data in real time. MaxCompute and Table Store are two independent big data services, one for computing and one for storage. This document introduces how to import data from Table Store into the MaxCompute computing environment, which allows seamless connections between multiple data sources.
MongoDB is a document-oriented database that is second only to Oracle and MySQL. A MongoDB connection allows you to read data from and write data to MongoDB by using MongoDB Reader and Writer. You can configure sync nodes for MongoDB by using the code editor.
Table Store is a fully managed NoSQL cloud database service that enables storage of a massive amount of structured and semi-structured data. It provides full/incremental data tunnels, seamlessly interconnecting with various products for big data analysis and real-time stream computing.
Data Transmission Service (DTS) helps you migrate data between data storage types, such as relational databases, NoSQL, and OLAP. The service supports homogeneous migrations as well as heterogeneous migrations between different data storage types.
DTS also provides optimized, high-performance delivery from RDS to AnalyticDB to support customers with their real-time big data analytics initiatives. With this solution, customers can perform ad hoc discovery, organization, and enrichment of low-latency data before it moves on to more refined sets of analytics tools.