From: Alibaba Cloud Realtime Compute for Apache Flink, 2020-09-21
The Internet of Things (IoT) is a network built on information carriers such as the Internet and traditional telecommunication networks. It enables all ordinary objects with independent functions to be interconnected. IoT digitizes the real world and has a wide range of applications. It brings scattered information together and unifies the digital information of physical objects. Its main application fields include transportation and logistics, health care, smart environments (home, office, and factory), and personal and social uses, all of which have broad market prospects. IoT integrates technologies such as intelligent perception, identification, network communication, and ubiquitous computing, and is considered the next wave in the development of the global information industry after computers, the Internet, and smartphones.
According to IDC's estimation, the Internet of Things will generate a global value of 1.46 trillion US dollars by 2020. Chinaidr predicts that China's Internet of Things market will exceed 1.8 trillion RMB by then. It is generally believed that China will become a major participant in the Internet of Things industry. Thanks to the huge population base and low chip manufacturing costs, China will play an important role in promoting the growth of the global Internet of Things market.
Currently, IoT is increasingly widely used in the following areas:
1. Industrial field
The industrial field currently has the most IoT projects. It also covers the largest number of networked things, such as printing equipment, workshop machinery, mines, and factories.
2. Medical field
At present, IoT applications in the medical industry include intelligent personnel management, intelligent medical processes, intelligent supply chain management, intelligent medical waste management, and intelligent health management. The most typical application is wearable devices, which help users manage their own health in a personalized way and have become popular with many health-conscious people.
3. Intelligent transportation and vehicle networking
At present, the application of IoT in intelligent transportation has taken shape and has strong development potential. Applications include real-time monitoring systems, automatic toll collection systems, intelligent parking systems, and real-time vehicle tracking systems. These systems can automatically detect and report the health status of roads and bridges, and they can help the transportation industry reduce energy consumption, pollution, congestion, and other problems.
4. Smart home
IoT solves the problem of device networking in the smart home. Many domestic manufacturers from different fields have entered the smart home industry, including Internet technology companies, traditional household appliance manufacturers, and Internet giants. Intelligent hardware such as smart TVs and smart speakers can also serve as the control center and hub of a smart home. "Artificial Intelligence + Internet of Things" is expected to set off a wave of lifestyle change.
5. Smart logistics
Smart logistics applies IoT technologies such as bar codes, radio frequency identification (RFID), sensors, and the Global Positioning System (GPS) across logistics links such as transportation, warehousing, distribution, packaging, and loading and unloading. The rise of smart logistics is inseparable from the e-commerce boom, and equally from the support of IoT technology.
Data characteristics in the IoT era
IoT systems continuously receive data from a large number of connected devices. As the number of connected devices grows, IoT systems need to scale to accommodate the data inflow. The analysis system processes this data and produces valuable analysis reports, which give enterprises a competitive advantage. IoT big data is generated by the sensors of IoT devices, so its characteristics differ from those of data in the traditional big data field. IoT big data has six characteristics (the 6 Vs):
- Volume: data volume is the decisive factor in determining whether a dataset counts as big data or as traditional large-scale or ultra-large-scale data. IoT devices generate far more data than earlier systems, so IoT data clearly has this characteristic.
- Velocity: IoT data is generated and processed fast enough to make real-time big data available. Such high data rates also mean that advanced tools and analysis techniques are needed to operate effectively.
- Variety: big data generally comes in different forms and types, including structured, semi-structured, and unstructured data. IoT can generate many data types, such as text, audio, video, and sensor data.
- Veracity: veracity refers to the quality, consistency, and credibility of data. Only trustworthy data can be analyzed accurately. This matters especially for IoT, and in particular for crowd-sensed data.
- Variability: this attribute means that data flow rates differ. Because of the nature of IoT applications, different data-generating components may produce inconsistent data streams. In addition, the load from a single data source may vary over time. For example, the data load of a parking service application that uses IoT sensors peaks during rush hours.
- Value: value refers to the transformation of big data into useful information and insight that gives organizations a competitive advantage. The value of data depends not only on the processes or services that handle it, but also on how the data is treated.
According to the data characteristics of IoT, big data and AI analysis tools related to IoT need to solve the following data calculation problems:
- Growing demand for real-time data: more real-time data is needed to support the analysis, operation, and management of IoT devices. For example, IoT device data needs to be analyzed in real time and then fed back directly to industrial lathes to adjust processing parameters.
- Growing demand for semi-structured and unstructured data analysis: more and more IoT devices generate semi-structured or even unstructured data, including logs, images, audio, and video. All of this requires big data and AI analysis tools.
- Growing demand for intelligent data processing: for the large number of image, audio, and video scenarios, AI-enhanced real-time processing tools are needed instead of traditional databases and big data tools.
- A shift from data analysis plus manual decision-making to computed results that control production directly: IoT big data analysis is moving from generating reports for human decisions toward computed results that feed back to control production links directly, which reduces the cost of manual decision-making.
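The lathe example in the first point, analyzing device data in real time and feeding the result straight back as a parameter adjustment, can be sketched as a small feedback loop. This is only an illustration under stated assumptions: the vibration signal, the feed rate parameter, the proportional gain, and every name below are hypothetical and do not come from any product API.

```python
from collections import deque
from statistics import mean


def adjust_feed_rate(current_rate, readings, target, gain=0.5,
                     min_rate=0.0, max_rate=100.0):
    """Return a corrected feed rate from recent sensor readings.

    Hypothetical rule: if the average reading (for example vibration) is
    above the target, lower the feed rate in proportion to the error; if
    it is below, raise it. The result is clamped to [min_rate, max_rate].
    """
    error = target - mean(readings)
    new_rate = current_rate + gain * error
    return max(min_rate, min(max_rate, new_rate))


class FeedbackLoop:
    """Keeps a sliding window of readings and updates the feed rate."""

    def __init__(self, target, window=5, rate=50.0):
        self.target = target
        self.readings = deque(maxlen=window)  # oldest reading drops out
        self.rate = rate

    def on_reading(self, value):
        # Called once per incoming sensor record.
        self.readings.append(value)
        self.rate = adjust_feed_rate(self.rate, self.readings, self.target)
        return self.rate


if __name__ == "__main__":
    loop = FeedbackLoop(target=1.0, window=3)
    for vibration in [3.0, 1.0, 1.0]:
        print(loop.on_reading(vibration))
```

In a streaming job the same per-record update would run inside the stream processor, with the new rate written to whatever channel controls the machine.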
Realtime Compute for Apache Flink is an enterprise-level product built by Alibaba Cloud on Apache Flink. Its underlying engine is powered by Ververica Platform, a commercial product from the founding team of Flink. To meet the requirements of big data analysis in the IoT era, Realtime Compute for Apache Flink combines the Alibaba Cloud IaaS and PaaS infrastructure to solve the following problems for IoT customers:
- Real-time data processing: Realtime Compute for Apache Flink offers low latency, high throughput, and consistency. It is suitable for real-time cleansing, analysis, and processing of IoT data.
- Semi-structured and unstructured data analysis: it supports structured data processing (analysis and statistical computing) through efficient APIs such as SQL, Table API, and PyFlink. It also provides Flink APIs such as DataStream, which you can use to analyze unstructured content such as audio and video.
- Intelligent data processing: it provides the underlying framework for streaming data. Based on the open-source Flink APIs, you can freely combine image, audio, and video processing algorithm packages to meet flexible, intelligent data processing requirements.
Since the introduction of the Alibaba Cloud Apsara big data platform, the yield rate of photovoltaic production has increased by 1 percentage point, saving tens of millions in costs.
Xiexin Photovoltaic is located in Suzhou Industrial Park and is a world-leading manufacturer of photovoltaic materials. Its silicon wafer products account for 70% of the silicon wafers in circulation in the domestic market, which makes it the leader in its industry in China. It also operates at a high level in technology research and development, quality control, and automation upgrades.
Through years of optimizing the production process, Xiexin has kept its production efficiency and product quality at the front of the industry. However, the company gradually found that the traditional approach left less and less room for further optimization. For a company that pursues excellence, covering the last mile of production quality improvement was a major challenge. The general manager of Suzhou Xiexin Photovoltaic once said: "The future breakthrough of Suzhou Xiexin still depends on new technologies and new products."
Alibaba Cloud Realtime Compute for Apache Flink solution
The rise of intelligent manufacturing has brought big data analysis into the manufacturing revolution. By collecting production data and uploading it to the cloud, you can perform real-time and long-term data analysis, monitor the production process, identify the parts of the process that can be optimized, monitor the links that affect product quality, analyze and improve product quality quantitatively, predict equipment conditions, and optimize spare parts.
In 2016, Xiexin Photovoltaic began an official cooperation with Alibaba Cloud, hoping to upgrade its internal management and further improve its market competitiveness through new-generation information technologies such as cloud computing and big data. The main objectives of this cooperation are transparent production, data-driven management, and an improved yield rate. Specifically:
- Store all data from the Xiexin production process at low cost for a long time.
- Build a yield rate prediction model through big data analysis.
- Build a key parameter monitoring model through big data analysis to monitor the production process and raise alarms.
- Perform multi-dimensional statistical analysis of Xiexin's production data through the Alibaba Cloud BI system.
- Build large-screen Kanban dashboards for the workshop and the business departments with Alibaba Cloud's large-screen technology.
The overall technical framework can be divided into three parts: workshop source data, the big data storage and analysis area, and the business area. Specifically, it includes data migration to the cloud, a key parameter monitoring model, standard curve models for key parameters and for all parameters, production process monitoring and alarms, yield rate prediction, spare parts loss analysis, large-screen Kanban dashboards, and BI analysis.
- After the first phase of the project was implemented, tens of millions in costs have been saved each year.
- Using Alibaba Cloud's big data analysis algorithms, Xiexin Photovoltaic can analyze all variables collected in the production process and identify the key variables most closely related to the yield rate. A production parameter monitoring model built on these key variables analyzes them during production. Once a variable goes outside the range defined by the model, Xiexin Photovoltaic's monitoring system issues an early warning in time.
- As a representative of manufacturing enterprises that pursue excellence, Xiexin Photovoltaic has explored a path for the transformation and upgrading of similar enterprises. Big data is an important enterprise asset. With the help of new technologies such as cloud computing, it can drive intelligent transformation and upgrading and complete the last mile of improving production efficiency and product quality.
- The cooperation model between Xiexin and Alibaba Cloud can be replicated directly: combining the production experience of manufacturing enterprises with the stable and efficient big data storage and analysis capabilities of cloud computing creates an enterprise-level data analysis platform.
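The key parameter monitoring described above, where a warning fires once a variable leaves the range defined by the model, can be sketched as follows. The article does not say how the actual model is built, so this sketch assumes the simplest possible stand-in: a per-variable band of mean plus or minus k standard deviations learned from historical readings. The function names and variable names are hypothetical.

```python
from statistics import mean, pstdev


def fit_bounds(history, k=3.0):
    """Learn an allowed range per variable from historical readings.

    history maps a variable name to a list of past values. The band
    mean +/- k * standard deviation is an assumption of this sketch,
    not the model used in the project.
    """
    bounds = {}
    for name, values in history.items():
        mu = mean(values)
        sigma = pstdev(values)
        bounds[name] = (mu - k * sigma, mu + k * sigma)
    return bounds


def check_reading(reading, bounds):
    """Return the names of monitored variables that are out of range."""
    alerts = []
    for name, value in reading.items():
        if name not in bounds:
            continue  # not a monitored key variable
        low, high = bounds[name]
        if not (low <= value <= high):
            alerts.append(name)
    return alerts


if __name__ == "__main__":
    history = {"furnace_temp": [10.0, 12.0, 14.0]}
    bounds = fit_bounds(history, k=1.0)
    print(check_reading({"furnace_temp": 12.5}, bounds))
    print(check_reading({"furnace_temp": 20.0}, bounds))
```

A production system would evaluate `check_reading` on each incoming record and route the returned names to the alarm channel.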
Shanghai Yuxin Software Co., Ltd. focuses on research and development of indoor positioning technology and passenger flow statistics and analysis, with products such as an indoor positioning engine and a passenger flow statistics and analysis system. For customers that adopt the passenger flow system, it provides integrated solutions for commercial retail stores, such as WeChat-based Internet access coverage and regular pushes of targeted business information to customers at specific locations.
- Real-time heat map: the real-time passenger flow analysis system produces a real-time heat map of each floor. Different colors represent different passenger densities.
- Real-time passenger flow statistics: the real-time passenger flow analysis product mainly serves the operators of the shopping mall. It provides the mall heat map, store heat maps, passenger counts, the proportion of new and returning customers, stay time, and the distribution of passenger flow over time, which support operational decisions.
- Precise push: the addresses collected over Wi-Fi are matched against the existing database, user profiles are created for the matched users, and messages are pushed to them more accurately according to the situation of the store.
- Location-oriented advertising: the system connects to the offline advertising screens of the shopping mall. Geo-fences and rules are configured, and related advertisements are recommended when a rule is hit.
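Two of the features above, the per-floor heat map and the geo-fence rule for advertising, reduce to simple computations over located devices. The sketch below assumes each located device is a `(floor, x, y)` tuple and each geo-fence is an axis-aligned rectangle. These data shapes, the grid cell size, and all function names are assumptions made for illustration.

```python
from collections import Counter


def heat_map(positions, cell_size):
    """Count devices per grid cell.

    positions is an iterable of (floor, x, y). The result maps
    (floor, column, row) to a device count, which a front end could
    turn into colors by density.
    """
    counts = Counter()
    for floor, x, y in positions:
        cell = (floor, int(x // cell_size), int(y // cell_size))
        counts[cell] += 1
    return counts


def in_geofence(x, y, fence):
    """fence is (x_min, y_min, x_max, y_max), edges included."""
    x_min, y_min, x_max, y_max = fence
    return x_min <= x <= x_max and y_min <= y <= y_max


def ads_for(position, rules):
    """Return the ads whose floor and geo-fence match the position."""
    floor, x, y = position
    return [rule["ad"] for rule in rules
            if rule["floor"] == floor and in_geofence(x, y, rule["fence"])]


if __name__ == "__main__":
    positions = [(1, 0.5, 0.5), (1, 1.2, 0.1), (1, 0.9, 0.9)]
    print(heat_map(positions, cell_size=1.0))
    rules = [{"floor": 1, "fence": (0.0, 0.0, 1.0, 1.0), "ad": "coffee"}]
    print(ads_for((1, 0.5, 0.5), rules))
```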
The data source of the whole system is Wi-Fi, so the distribution of Wi-Fi devices is the key to the success of the system. When the Wi-Fi devices are installed, the location of each device (floor, plane coordinates, store, and so on) is recorded in advance. Whether the Wi-Fi coverage areas overlap is decided according to the business situation: if accurate positioning is required, multi-point positioning is needed. Otherwise, make sure that the Wi-Fi ranges do not overlap, so that data from different devices does not pollute each other.
- Use Wi-Fi to collect device information.
- Send the collected data to the receiving server through SLB.
- The receiving server sends the data to the message queue (DataHub).
- Realtime Compute for Apache Flink subscribes to the DataHub data.
- Associate the user information collected by each device with the geographic location of that device.
- Complete the processing and write the results for downstream use.
The processing logic in Realtime Compute for Apache Flink is as follows:
- Data cleansing and deduplication.
- Dimension table association: the user MAC address is associated with the device geographic information, and real-time data is associated with historical data.
- Mobile phone brand identification, location identification, and new customer identification.
- Stay time calculation and trajectory generation.
Note: data collection and cleaning are the basis of the whole system. Based on these data, precise push and location advertising services can be carried out.
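The processing steps listed above can be sketched in plain Python to show the logic on a small batch of records. The production job runs on Flink, typically in SQL, so this is not the job itself. The record fields (`mac`, `device_id`, `ts`), the shape of the device dimension table, and the set of previously seen MAC addresses are assumptions of this sketch. Phone brand identification is omitted.

```python
def process(events, device_locations, known_macs):
    """Deduplicate, join with device locations, and summarize visits.

    events: list of dicts with "mac", "device_id", "ts" (seconds).
    device_locations: dict device_id -> {"floor": ..., "store": ...},
        standing in for the dimension table.
    known_macs: set of MAC addresses seen in historical data.
    """
    seen = set()
    visits = {}
    for event in sorted(events, key=lambda e: e["ts"]):
        key = (event["mac"], event["device_id"], event["ts"])
        if key in seen:
            continue  # deduplication: exact repeat of a record
        seen.add(key)

        location = device_locations.get(event["device_id"])
        if location is None:
            continue  # cleansing: record from an unregistered device

        visit = visits.setdefault(event["mac"], {
            "first_ts": event["ts"],
            "last_ts": event["ts"],
            "trajectory": [],
            "is_new": event["mac"] not in known_macs,
        })
        visit["last_ts"] = event["ts"]
        # Append to the trajectory only when the store changes.
        if not visit["trajectory"] or visit["trajectory"][-1] != location["store"]:
            visit["trajectory"].append(location["store"])

    return {
        mac: {
            "stay_seconds": v["last_ts"] - v["first_ts"],
            "trajectory": v["trajectory"],
            "is_new": v["is_new"],
        }
        for mac, v in visits.items()
    }


if __name__ == "__main__":
    locations = {"d1": {"floor": 1, "store": "A"},
                 "d2": {"floor": 1, "store": "B"}}
    events = [{"mac": "aa", "device_id": "d1", "ts": 0},
              {"mac": "aa", "device_id": "d2", "ts": 60}]
    print(process(events, locations, known_macs=set()))
```

In a streaming setting the `seen` set and the `visits` dictionary would be keyed state with an expiry policy rather than unbounded in-memory structures.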
The real-time passenger flow analysis platform of the new shopping mall involves multiple offline devices (2,000 devices). Realtime Compute for Apache Flink processes 30K input records per second and outputs 20K processed records per second, with an overall latency on the order of seconds. The overall benefits include:
- Reduced O&M costs: the service is O&M-free, and Alibaba Cloud provides high security.
- Upstream and downstream connectivity: sources and sinks are registered directly, without development work.
- Reduced development costs: SQL development is efficient and has a low entry threshold. A single job that originally took three days to develop in Java now takes one day, with fewer bugs, and refactoring the entire system took only one week.
This system connects offline and online, provides the operators of the mall with data support along different dimensions, and improves the effectiveness of operational activities. It creates a better shopping experience for customers and also improves the overall revenue of the mall. The real-time passenger flow analysis system is a typical case of combining IoT technology with real-time big data processing technology.
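Realtime Compute for Apache Flink solution: https://developer.aliyun.com/article/765097
Alibaba Cloud Realtime Compute for Apache Flink use cases: https://ververica.cn/corporate-practice
Alibaba Cloud Realtime Compute for Apache Flink product details page: https://www.aliyun.com/product/bigdata/product/ SC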