By Liu Jinlong (Fuchen), AMAP Team
"In the accident-prone area ahead, keep the distance between cars..."
"Please rest in case of driving fatigue..."
"The scenic spot ahead is the Taishan tourist attraction, China's 5A-level scenic spot, which is the first batch of national-level scenic spots..."
These are typical voice prompts from AMAP navigation. During Spring Festival, AMAP delivers tens of billions of such prompts a day to keep travelers safe. Behind the scenes, however, any system jitter or failure affects users' travel safety, so the whole team stays on duty throughout Spring Festival to keep the system stable.
In 2021, AMAP put forward a new brand proposition: "AMAP is familiar with everywhere." The AMAP app added more usage scenarios, and the business systems became more complicated. During every holiday and promotion, colleagues had to be on duty on-site to keep the system stable.
To get out of this dilemma, we rethought the system architecture and agreed that Serverless is the future technology trend. Over the past year, we explored Serverless technology extensively and moved some core businesses onto it. Thanks to the advantages of Serverless, such as low cost, no operation and maintenance (O&M), and high elasticity, colleagues no longer need to work on-site during holidays.
This article describes what AMAP has explored in the Serverless field over the past year and how we combined Serverless with our businesses to build a low-cost, low-code, O&M-free modern Serverless R&D platform.
In July 2021, AMAP announced its new brand: an overall upgrade to the "open service platform for a good life," with the new brand proposition "AMAP is familiar with everywhere." AMAP upgraded from a navigation tool to a travel service platform and an entrance to life information services. The main map, the My page, and the post-trip page now show business recommendations in the form of cards to serve users better and to expand the life information service scenarios related to travel. The following figure shows several typical card recommendation scenarios in the AMAP app.
One characteristic of AMAP's business is that styles change frequently. Festive themes, holiday tourism guidance, traffic information reminders, and similar needs keep changing, so we need an R&D model that supports rapid iteration. In the traditional model, each change is developed in the app and ships with the next app release. This model is inefficient and can hardly keep up with business needs.
The backend code behind the card business keeps growing with the number of business types, and so does the number of business strategies. If obsolete function modules are not retired in time, the code grows in a disorderly way and becomes bloated and complex. Changeable strategies require an architecture in which strategies can be added and removed quickly and take effect in real time.
Integrating more business logic into the app can improve performance to a certain extent, but if the app is too large, users are reluctant to update it. Updates of hundreds of megabytes increase our bandwidth cost and consume users' mobile data. In addition, the more code the app contains, the more businesses it involves, and if each of them needs fast iteration, the app client must be updated quickly and frequently.
This is a vicious circle that hurts the long-term development of the app, so the app client must slim down.
Everyone knows the morning and evening rush hours: traffic jams! Map navigation applications see the same pattern. The morning and evening peaks are pronounced, and the gap between peak and trough is large. If all machine resources are provisioned for the peak, the cost is high and most of the capacity sits idle during off-peak periods. Therefore, we need elastic resources that can be used on demand to reduce machine cost.
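To make the cost gap concrete, here is a minimal sketch in Python. The 24-hour load profile and the function names are hypothetical illustrations, not AMAP data. The sketch only compares provisioning for the peak all day with paying for what each hour actually uses.

```python
def provisioned_cost(hourly_load, unit_cost=1.0):
    """Cost when capacity is fixed at the daily peak for every hour."""
    return max(hourly_load) * len(hourly_load) * unit_cost


def on_demand_cost(hourly_load, unit_cost=1.0):
    """Cost when capacity follows the load hour by hour."""
    return sum(hourly_load) * unit_cost


# Hypothetical load profile in machine-equivalents per hour:
# night trough, morning peak, daytime, evening peak, late evening.
hourly_load = [10] * 7 + [100] * 2 + [40] * 8 + [100] * 2 + [20] * 5

peak_cost = provisioned_cost(hourly_load)
elastic_cost = on_demand_cost(hourly_load)
utilization = elastic_cost / peak_cost

print(f"peak-provisioned: {peak_cost:.0f} machine-hours")
print(f"on-demand:        {elastic_cost:.0f} machine-hours")
print(f"utilization when provisioned for peak: {utilization:.0%}")
```

With this made-up profile, peak provisioning pays for 2,400 machine-hours while only 890 are used, a utilization of about 37%. The wider the gap between peak and trough, the more on-demand resources save.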
After many rounds of discussion in the architecture group, we agreed on one goal: fully embrace Serverless. The advantages of Serverless address our business pain points, and we believe the future belongs to Serverless.
The hottest technology trend on the frontend is the cloud-integrated R&D model, a modern application R&D model based on Serverless technology. It lowers the technical threshold for full-stack development and improves R&D efficiency. It has been fully implemented in many large Alibaba apps, such as Taobao, Tmall, and Fliggy. According to rigorous statistics, Serverless improves R&D efficiency by 38%, which is the most important data point behind our choice. In addition, the Serverless-based CSR/SSR solution for speeding up home page loading is now mature and covers almost all of Alibaba's internal applications.
Engineering excellence has always been our goal. However, because technical products have different focuses, little attention has been paid to the last mile of R&D. As a result, the functions of technical products became disconnected from the user experience, and many business teams gave up on them.
The goal of Serverless is to let you focus as much as possible on business logic and pay little or no attention to O&M work that is not core to the business. It shortens development time and reduces the cost of redundant online resources, so Serverless can carry the banner of reducing costs and improving efficiency.
Millisecond-level request scheduling is the core competitiveness of Serverless. Compared with traditional minute-level elastic scaling, it has a huge cost advantage: the less time scale-out takes, the fewer machine resources must be reserved to absorb traffic growth in the meantime. At the millisecond level, there is no need to reserve resources at all, so cost goes down and, in principle, resource utilization can reach 100%.
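The relationship between scale-out time and reserved resources can be sketched with a simple model. This is an illustration under assumptions of our own: traffic ramps up linearly, and the ramp rate and scale-out times below are hypothetical numbers, not measurements from AMAP or Function Compute.

```python
def reserved_headroom_qps(ramp_qps_per_second, scale_out_ms):
    """Spare capacity (in QPS) that must already be running to absorb
    traffic growth while new capacity is still being started.

    Assumes a linear ramp: headroom = ramp rate x scale-out time.
    """
    return ramp_qps_per_second * scale_out_ms / 1000


# Hypothetical ramp: traffic grows by 500 QPS every second at rush hour.
ramp = 500

minute_level = reserved_headroom_qps(ramp, scale_out_ms=120_000)  # 2 minutes
millisecond_level = reserved_headroom_qps(ramp, scale_out_ms=50)  # 50 ms

print(f"minute-level scaling needs      {minute_level:.0f} QPS of headroom")
print(f"millisecond-level scaling needs {millisecond_level:.0f} QPS of headroom")
```

Under these assumptions, two-minute scaling needs 60,000 QPS of idle headroom, while 50-millisecond scaling needs only 25 QPS, which is negligible by comparison.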
Every programmer knows that the most painful thing is changing other people's code. Serverless is attractive, but how to migrate existing applications to a Serverless architecture troubled us.
We worked with the Serverless Team to develop Runtimes and to solve, in a low-code way, the poor resource utilization that tidal traffic causes in traditional architectures. Since AMAP's services are written in C++, Go, and Rust, we developed Runtimes for these three languages. Each Runtime integrates the group's internal middleware, so the Serverless architecture covers the entire lifecycle of the previous services. This allowed us to switch services to the Serverless platform.
However, as usage grew and business scenarios became more complex, some external services could not use the internal Runtime. This problem kept hindering us and pushed the originally simple architecture toward complexity. It remained unsolved until the Alibaba Cloud Serverless Team launched the Custom Runtime/Container feature, which lets us migrate existing applications by changing a few lines of startup commands. The Serverless Team has also done a lot of innovative optimization for fast image distribution, such as a four-level cache, P2P technology, and on-demand download, so a 3 GB image can be downloaded within seconds.
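The idea behind this kind of migration can be sketched as follows: an existing service that already speaks HTTP keeps its code, and the platform only needs a startup command plus a port to forward requests to. The sketch below is a generic illustration in Python, not AMAP's code and not the Function Compute API. The handler, the `make_server` helper, the `PORT` environment variable, and the default port 9000 are all assumptions made for the example.

```python
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class ExistingAppHandler(BaseHTTPRequestHandler):
    """Stands in for an existing service that already speaks HTTP."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(port=None):
    """Build the server. The port comes from the environment so that only
    the startup configuration changes, not the application code.

    PORT and the default 9000 are assumptions for this sketch.
    """
    if port is None:
        port = int(os.environ.get("PORT", "9000"))
    return ThreadingHTTPServer(("0.0.0.0", port), ExistingAppHandler)
```

A startup command would then only need to call `make_server().serve_forever()`. The application logic itself stays untouched, which is what makes this style of migration cheap.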
Thanks to Serverless, unattended operation has become possible. All O&M operations are handled by the powerful scheduling capability of Serverless. For example, the system automatically scales out within milliseconds during peak periods and scales in gracefully during off-peak periods, without any manual O&M. This also ends the dilemma of having too many people on-site during holiday peaks.
In terms of system design, two layers of Function as a Service (FaaS) are introduced: the end FaaS and the business FaaS.
On the whole, integrating Function Compute (FC) with containerized microservices is a practical and efficient architecture design: it combines the efficiency of Function Compute with the flexibility of traditional microservices.
For a core application, a complete CI/CD process is needed to ensure quality. To ensure the quality of functions, we implemented a set of R&D processes based on the Serverless Devs toolchain. The process supports multiple function environments, gray-scale (canary) traffic shifting for functions, function observability, and more. With this Serverless R&D system, development can start within one minute and a release can be completed within five minutes. It also provides active geo-redundancy by default.
Throttling protection, degradation plans, and active geo-redundancy are the three essential capabilities for the stability of high-traffic applications. In particular, unexpected traffic during a promotion may cause a system avalanche. The Serverless R&D platform integrates the concurrency throttling and QPS throttling capabilities of Alibaba Cloud Function Compute (FC) to achieve throttling at function granularity.
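How the two limits combine for a single function can be sketched as below. This is a generic illustration, not the Function Compute implementation or its API: the `FunctionThrottle` class is hypothetical, it uses a simple one-second fixed window for the QPS limit, and it is not thread-safe.

```python
import time


class FunctionThrottle:
    """Per-function throttle with a concurrency limit and a QPS limit.

    Hypothetical sketch: fixed one-second window, single-threaded use only.
    """

    def __init__(self, max_concurrency, max_qps, clock=time.monotonic):
        self.max_concurrency = max_concurrency
        self.max_qps = max_qps
        self.clock = clock
        self.in_flight = 0              # requests currently executing
        self.window_start = clock()     # start of the current 1 s window
        self.window_count = 0           # requests admitted in this window

    def try_acquire(self):
        """Admit a request only if both limits allow it."""
        now = self.clock()
        if now - self.window_start >= 1.0:
            self.window_start = now
            self.window_count = 0
        if self.in_flight >= self.max_concurrency:
            return False                # too many requests running at once
        if self.window_count >= self.max_qps:
            return False                # too many requests this second
        self.in_flight += 1
        self.window_count += 1
        return True

    def release(self):
        """Mark one admitted request as finished."""
        if self.in_flight > 0:
            self.in_flight -= 1
```

A caller would wrap each invocation in `try_acquire()` and `release()`, and reject or degrade the request when `try_acquire()` returns `False`. Keeping one such object per function is what gives function-granularity throttling: a traffic spike on one function cannot starve the others.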
Another highlight is that the disaster recovery solution of FaaS has huge advantages over that of traditional applications. For traditional applications, active geo-redundancy requires redundant machine resources in the secondary data centers, which costs money. In the Serverless scenario, elasticity is fast and resources can be prepared within seconds, so no redundant resources need to be prepared in advance, which saves that cost: there are three units, but the cost is still that of one. We have implemented three-unit disaster recovery by default for all core function applications.
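The "three units at the cost of one" argument can be illustrated with a small sketch. The unit names, traffic numbers, and helper functions are hypothetical, and the traditional case assumes each unit reserves enough capacity to carry all traffic alone, which is one possible redundancy policy rather than a description of any specific deployment.

```python
def failover(traffic_by_unit, failed_unit):
    """Redistribute the failed unit's traffic evenly over the survivors."""
    survivors = [u for u in traffic_by_unit if u != failed_unit]
    extra = traffic_by_unit[failed_unit] / len(survivors)
    return {u: traffic_by_unit[u] + extra for u in survivors}


def traditional_reserved_capacity(traffic_by_unit):
    """Assumed policy: every unit reserves capacity for the total traffic,
    so any single unit can take over alone."""
    total = sum(traffic_by_unit.values())
    return total * len(traffic_by_unit)


def serverless_billed_capacity(traffic_by_unit):
    """With fast elasticity, capacity follows traffic, wherever it lands."""
    return sum(traffic_by_unit.values())


# Hypothetical steady-state traffic (QPS) across three units.
traffic = {"unit-a": 300, "unit-b": 300, "unit-c": 300}
after = failover(traffic, "unit-a")

print("after unit-a fails:", after)
print("traditional reserved capacity:", traditional_reserved_capacity(traffic))
print("serverless billed capacity:   ", serverless_billed_capacity(after))
```

In this example the surviving units each grow from 300 to 450 QPS, the total stays at 900 QPS, and the Serverless bill follows that total. The traditional policy assumed here keeps 2,700 QPS of capacity running all the time to cover the same failure.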
If your business is sensitive to cold starts and cannot accept an extra 50 to 100 milliseconds of latency, don't worry. The scaling strategy can compensate: reserve resources to reduce the impact of cold starts on the business.
In 2021, AMAP explored the Serverless field extensively and implemented Serverless in some core businesses. The peak traffic of the Serverless businesses alone has exceeded 400,000 QPS. However, this is not the end but only the beginning. We will move more workloads to Serverless in the future. For example, offline and non-online applications (such as map data processing, picture cutting, and message consumption) are also well suited to Serverless. We expect more scenarios to use Serverless and look forward to enjoying the technical dividends it brings!