Infrastructure improvements drive continuous change in application forms

We divide the past 40 years of technological development into five stages: 1980 - 2000, 2000 - 2005, 2005 - 2010, 2010 - 2020, and 2020 - 2025. Today's press conference concerns the fifth stage. Before looking ahead, let's review how technology developed in the first four stages. What were the main application scenarios? What was the mainstream application form? What new technologies and products emerged?

From 1980 to 2000, computer technology developed and came into widespread use. Computers helped enterprises manage their data better and run their processes more efficiently, so the applications of this stage were mainly enterprise information systems. Many technology companies specializing in serving enterprises were founded, and many great commercial products were created. Application data was stored mainly in databases, chiefly relational databases.

From 2000 to 2005, Internet technology went through its initial development. The Internet made it possible to transmit information more effectively, so a large number of portal websites appeared in this stage, known as the Web 1.0 era. LAMP (Linux+Apache+MySQL+PHP) was the most popular website building stack at the time: a low-cost solution composed entirely of open source products. Application data was still stored mainly in relational databases, but MySQL, being open source and low cost, displaced commercial relational databases in many deployments and became widely used. As the amount of information on the Internet grew, people increasingly needed a way to find the information that mattered, so the search engine was born as a new application form. The search engine was the first application to face the challenge of large-scale data processing. As a great technology company, Google pioneered many big data processing technologies. The well-known trio of GFS, MapReduce, and Bigtable laid the foundation for the development of NoSQL and big data technologies in the following decade.

From 2005 to 2010, as personal computers became common and the cost of Internet access gradually fell, more and more people went online, and new application forms appeared. First, people were no longer satisfied with receiving one-way information from the Internet and wanted to exchange information with each other, which drove the development of social networks. Second, B2C and C2C e-commerce websites, applications built around "people", began to develop. This is the so-called Web 2.0 era. As a new data source, "people" began to generate large amounts of data on the Internet. Some large Internet companies were born in this stage, mainly in e-commerce and social networking. These companies faced two challenges: how to serve data online at such a scale, and how to process and analyze such massive data. No mature solution was available at the time, so they had to develop their own systems. As a result, several popular NoSQL and big data systems were born or incubated at the large Internet companies of this era. For example, Hadoop was first incubated at Yahoo, and Cassandra was first used for inbox search at Facebook.

From 2010 to 2020, with the development of 4G technology and the spread of smartphones, the mobile Internet took off. People could connect to the Internet anytime and anywhere. Mobile applications reached a wider range of people and more everyday scenarios, such as payments and ride hailing. The gradual shift from traditional Internet applications to mobile applications created demand for a large number of new applications. As low-cost, easily accessible data centers, cloud platforms were adopted by more and more enterprises, and this became the golden decade of cloud computing. Cloud computing completely changed the environment in which applications run. Unlike a traditional IDC, the cloud pools computing, storage, and other resources, so that many types of storage and computing resources can be obtained flexibly. Applications built on the cloud's elastic resources are called "cloud native" applications, and more and more big data and database products are built cloud native. Distributed technology is also widely used to provide elastic scalability. For a modern big data or database product, being distributed and cloud native are necessary capabilities.

Finally, in the 2020 - 2025 period, 5G and IoT technologies are gradually maturing, and a new application form, the Internet of Things, is emerging. New application scenarios include the Internet of Vehicles, the Industrial Internet of Things, and the smart home.

The pattern of technology and product evolution over the past decades can be summarized as follows (infrastructure technology -> wider scope of informatization -> more scenarios and larger data scale -> technology and product development):

Each stage starts with the improvement and popularization of a piece of "infrastructure". The core role of infrastructure is to further expand the scope of informatization. For example, the Internet let applications connect to more terminals. The mobile Internet broke the barrier of the terminal itself and let applications connect to more people. The Internet of Things will add more devices to this connection.

As the scope of informatization expands, more new application scenarios appear, and a larger number of "individuals" generate data at a larger scale. This data growth becomes the driving force behind the development of basic technology.

In this process, basic technologies often lag behind the development of application forms. However, with the spread of distributed technology and cloud computing, basic technologies are evolving and being adopted faster and faster. The form of basic technology products has changed as well: from the earliest commercial products, to open source products, to today's cloud native products.

In the new stage of the Internet of Things, the number of devices and the volume of data they generate will be larger still, bringing greater challenges. What kind of technological development will these challenges drive?

What technical challenges does the rapidly growing Internet of Things industry face?

Let's look at how fast the Internet of Things is growing, using the following two market reports to show the overall trend:

Large-scale growth in the number of devices: Gartner predicts that the number of IoT devices will grow to 25 billion by 2021. Managing so many devices is the first challenge.

Large-scale growth in device data: according to an IDC report, IoT data will reach 79.4 ZB by 2025, with an average annual growth rate of 34.91%. Storing and analyzing data at this scale is the second challenge.

Taking the Internet of Vehicles as an example to define the specific requirements for data storage

The theme of this conference is storage solutions for the Internet of Things, so let's look at the specific data storage requirements of an IoT scenario, using a real application from the Internet of Vehicles as an example. Suppose you are a new energy vehicle company that manages hundreds of thousands of vehicles every day to provide online ride-hailing services. You will encounter the following scenarios:

To make these vehicles manageable, each vehicle must report its status in real time, including its location, remaining battery charge, mileage, and driving speed. Besides this dynamic information, each vehicle also has static information, such as its model and owner. The back end needs to collect and manage both kinds of information in real time.

With this real-time status information, you can offer a real-time vehicle status query service to owners, passengers, and back-office staff. Some background computing tasks also depend on real-time status, such as selecting vehicles by location or other specific conditions to carry out particular tasks, or scheduling vehicles based on their current status.

Besides real-time status reporting, each vehicle also needs to maintain a message channel with the management back end. Through this channel, the vehicle reports its own abnormal events, and the back end can send messages or control instructions to the vehicle.

In addition, the vehicle's driving information is reported and stored as track data, and some sensor data collected while driving also needs to be stored. With this data you can query a vehicle's driving track, and you can also run calculations and analysis on it to extract more value, for example by analyzing historical driving data to optimize the scheduling algorithm.

From these scenarios, we can see three main types of vehicle-related data. The first is real-time status data, which we classify as "metadata". The second is the data that flows through the message channel, which we classify as "message data". The third is track data, which we classify as "time series data". These three types place different requirements on the underlying storage. Metadata is updated frequently and demands strong query capabilities, since data must be queried or filtered on conditions that span multiple fields. Message data behaves like a message queue, but the number of queues is very large, because each device needs its own independent queue. Time series data is characterized by high-throughput writes and large data volumes, and is used mostly in analysis scenarios.
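
To make the three data types more concrete, here is a minimal Python sketch. It is not the design of any real product: the class and field names (VehicleStatus, TrackPoint, InMemoryIoTStore, battery_pct and so on) are hypothetical, and plain in-memory dictionaries stand in for real storage. It only illustrates the access pattern each type needs: overwrite plus multi-field filtering for metadata, one independent queue per device for message data, and append plus time-range reads for time series data.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass
class VehicleStatus:
    """Metadata: one frequently updated record per vehicle (hypothetical fields)."""
    vehicle_id: str
    model: str
    city: str
    battery_pct: float
    speed_kmh: float


@dataclass
class TrackPoint:
    """Time series data: one append-only sample of a vehicle's track."""
    vehicle_id: str
    ts: int  # Unix timestamp in seconds
    lat: float
    lon: float


class InMemoryIoTStore:
    """Toy store; dictionaries stand in for real storage engines."""

    def __init__(self):
        self.status = {}                  # vehicle_id -> latest VehicleStatus
        self.queues = defaultdict(deque)  # vehicle_id -> pending messages
        self.tracks = defaultdict(list)   # vehicle_id -> list of TrackPoint

    def report_status(self, status):
        # Metadata is overwritten in place on every report.
        self.status[status.vehicle_id] = status

    def find_vehicles(self, **conditions):
        # Multi-field filtering: every given field must equal the stored value.
        return [
            s for s in self.status.values()
            if all(getattr(s, field) == value for field, value in conditions.items())
        ]

    def send_message(self, vehicle_id, body):
        # Message data: each device has its own independent queue.
        self.queues[vehicle_id].append(body)

    def poll_message(self, vehicle_id):
        queue = self.queues[vehicle_id]
        return queue.popleft() if queue else None

    def append_track(self, point):
        # Time series data is only ever appended.
        self.tracks[point.vehicle_id].append(point)

    def track_between(self, vehicle_id, start_ts, end_ts):
        # Time-range read, inclusive on both ends.
        return [p for p in self.tracks[vehicle_id] if start_ts <= p.ts <= end_ts]
```

A call such as `store.find_vehicles(model="ModelA", city="Hangzhou")` filters on two fields at once, which is the kind of query the metadata requirement describes. A linear scan like this would not scale to hundreds of thousands of vehicles; it is used here only to keep the sketch short.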

In traditional solutions, metadata is generally stored in MySQL. MySQL's biggest problem here is that it cannot flexibly support filtering on multi-field conditions, so it is usually paired with Elasticsearch, which provides the multi-field retrieval capability. Message data behaves like a message queue, but a traditional message queue cannot be used because it cannot support so many topics, so MySQL is generally used to simulate the queues. Time series data is generally stored in HBase, which provides high-throughput writes and supports large-scale storage, but HBase itself does not provide analysis capabilities.
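
As an illustration of how a relational table can simulate one queue per device, here is a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server. The table name device_messages, its columns, and the helper functions are assumptions made for illustration and are not taken from any real system.

```python
import sqlite3


def open_queue_db():
    """Create an in-memory database with one table holding every device queue."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE device_messages (
            device_id TEXT    NOT NULL,
            seq       INTEGER NOT NULL,
            body      TEXT    NOT NULL,
            PRIMARY KEY (device_id, seq)
        )
        """
    )
    return conn


def enqueue(conn, device_id, body):
    """Append a message to one device's queue and return its sequence number."""
    # Sequence numbers are per device. This read-then-insert is only safe
    # with a single writer; a real system would need a safer ID scheme.
    (seq,) = conn.execute(
        "SELECT COALESCE(MAX(seq), 0) + 1 FROM device_messages WHERE device_id = ?",
        (device_id,),
    ).fetchone()
    conn.execute(
        "INSERT INTO device_messages (device_id, seq, body) VALUES (?, ?, ?)",
        (device_id, seq, body),
    )
    conn.commit()
    return seq


def fetch_after(conn, device_id, last_seq, limit=10):
    """Return up to `limit` (seq, body) pairs newer than `last_seq`, oldest first."""
    return conn.execute(
        "SELECT seq, body FROM device_messages "
        "WHERE device_id = ? AND seq > ? ORDER BY seq LIMIT ?",
        (device_id, last_seq, limit),
    ).fetchall()
```

Each device reads only its own rows by remembering the last sequence number it has seen, so the composite primary key (device_id, seq) plays the role of the per-device queue. All queues share one table in this sketch, which keeps it simple but concentrates every device's traffic in one place.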

Basic technology often lags behind the development of application forms. The traditional architecture builds the entire IoT storage system by combining multiple products. This multi-component architecture is complex and expensive to operate and maintain: developers need to understand and use several products, and distributed components are difficult to run, so the overall cost is very high. In addition, none of the components is designed or optimized for the IoT scenario. Device metadata, message data, and time series data in IoT have very typical characteristics, and their overall scale will grow far faster than data did in the Internet era. It is foreseeable that older-generation products will not be able to cope with data growth at IoT scale.

What kind of storage products does the Internet of Things need?

Following the pattern by which technology products have developed over the past decades, the era of the Internet of Things has arrived, and the current technical architecture will struggle to support the future growth of IoT. Faced with this new application form and the challenge of massive numbers of devices and massive data, what kind of new basic product should we build on cloud computing, the new generation of basic platform, using basic technologies such as cloud native and distributed systems?

We hope this new basic product has the following features:

Built on cloud native and distributed technology, with scalability and elasticity

Meet the requirements of one-stop storage, retrieval and analysis of device metadata, message data and time series data

The cost is low enough to support such massive data

Therefore, to meet the challenges brought by the development of the Internet of Things industry, Alibaba Cloud Tablestore will launch a new storage capability: one-stop IoTStor. To learn more about the press conference, see the link below.

Trends, Status Quo and Challenges of the Internet of Things Industry
