The Road to Serverless Practice at Focus Media

Focus Media was founded in 2003 and created the elevator media advertising model. In 2005, it became the first Chinese advertising media stock listed on NASDAQ in the United States. In 2015, it returned to the A-share market with a market value of more than 100 billion.

The key to Focus Media's revenue exceeding 10 billion is that it seized one core scenario: the elevator. Elevators are part of a city's infrastructure, and the everyday elevator scene can be summed up in four phrases: mainstream audience, unavoidable, high frequency, and low interference. These four qualities are the scarce core resources that make a brand take off today.

Focus's unique value is that it reaches the mainstream audience in mainstream cities frequently and effectively every day, in the elevator spaces they must pass through, which gives it a strong ability to make brands take off. Focus elevator media covers 310 million mainstream consumers in Chinese cities through more than 2.6 million elevator terminals. In addition to the elevator terminals, a large number of advertising posters are printed and posted. Verifying that these static posters have been posted correctly has become one of Focus's important business indicators.

Focus therefore developed its own image recognition and processing system. When staff replace a poster, they take photos and upload them to the backend server through the mobile app. Static posters are replaced in batches every weekend, so the backend system hits its processing peak then and may need to process millions of pictures in a short period. On weekdays, when replacements are less frequent, the backend is relatively idle. Peak traffic on weekends is on average more than 10 times that of weekdays, as shown in the figure below. If resources are provisioned for the weekend peak, a large share of them sits idle on weekdays.
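To make the idle-resource problem concrete, here is a rough back-of-envelope sketch. It assumes a simplified weekly profile that is not from Focus's real data: weekend load is exactly 10 times weekday load, the load is flat within each day, and capacity is fixed at the weekend peak.

```python
# Illustrative weekly load profile, in units of "weekday load".
# Assumptions (not real traffic data): a flat 10x load on both weekend days.
WEEKDAY_LOAD = 1.0
WEEKEND_LOAD = 10.0


def weekly_utilization(weekday_load, weekend_load, weekdays=5, weekend_days=2):
    """Fraction of peak-provisioned capacity actually used over one week."""
    peak = max(weekday_load, weekend_load)
    used = weekday_load * weekdays + weekend_load * weekend_days
    provisioned = peak * (weekdays + weekend_days)
    return used / provisioned


utilization = weekly_utilization(WEEKDAY_LOAD, WEEKEND_LOAD)
print(f"utilization: {utilization:.1%}, idle: {1 - utilization:.1%}")
```

Under these assumptions, only about a third of the capacity reserved for the peak does useful work over the week; the rest is paid for but idle.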

As the business grows, the business side's demand for elasticity in the backend service keeps increasing. How to let the backend system handle peaks and valleys smoothly while keeping resource spending balanced became the biggest pain point.

In fact, Focus had already come into contact with Function Compute (FC) at the end of 2019, and was also exploring the use of containers. After a period of exploration, it became clear that the Function Compute model suited the business better. The business side focuses mainly on business logic and algorithms and does not want to deal with too many underlying infrastructure concepts, and containers have a higher barrier than Function Compute both for getting started and for later maintenance.

👉 Function Compute (FC): https://www.aliyun.com/product/fc?

Function Compute in practice

Focus originally handled image recognition with a monolithic architecture. After switching to Function Compute, it adopted an architecture that separates the front end from the back end, with the back end built on API Gateway + FC. API Gateway is used to standardize the APIs.

However, adopting FC was not smooth at first. There were many doubts about FC's stability, ease of use, and performance, and FC did have some limitations at the time, such as:

1. There was no way to monitor CPU usage and memory usage;

2. The largest instance specification was only 2C/3GB, and there was concern that 2C could not meet the resource requirements of complex algorithms;

3. The code package was limited to 50MB, while image recognition algorithms often run to more than a gigabyte, and even the smallest compressed package is hundreds of MB;

4. FC cannot keep a resident process, and there was concern that insufficient scaling speed would affect response time.

After discussion and testing with Focus, it became clear that FC works differently from a virtual machine, and several of these concerns could be resolved.

In FC, each request has an instance's resources to itself, and heavy traffic is handled through horizontal scaling. For example, if 10 requests reach FC at the same time, FC can start 10 containers of the same specification to run them in parallel. An instance accepts the next request only after it has finished the current one. This ensures that each request has exclusive CPU resources and that faults are isolated between requests.
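The one-request-per-instance model can be illustrated with a small sketch. This is not FC's scheduler; it is only a simplified model that assumes an instance becomes reusable the moment its request finishes, and the function name `peak_instances` is made up for the example. It counts how many instances must exist at the busiest moment for a given set of requests.

```python
def peak_instances(requests):
    """Return the peak number of overlapping requests.

    requests: iterable of (start_time, duration) pairs in seconds.
    With one request per instance, this is the number of instances
    that must exist at the busiest moment.
    """
    events = []
    for start, duration in requests:
        events.append((start, 1))              # request arrives
        events.append((start + duration, -1))  # request finishes
    # Sort by time; at equal times, process finishes (-1) before arrivals (+1)
    # so that an instance freed at time t can serve a request arriving at t.
    events.sort(key=lambda e: (e[0], e[1]))
    running = peak = 0
    for _, delta in events:
        running += delta
        peak = max(peak, running)
    return peak


# Ten requests arriving at the same moment need ten instances.
burst = [(0.0, 2.0)] * 10
print(peak_instances(burst))

# Ten requests arriving back to back can reuse a single instance.
sequential = [(i * 2.0, 2.0) for i in range(10)]
print(peak_instances(sequential))
```

The burst case matches the 10-requests example above; the sequential case shows why the instance count follows concurrency rather than total request volume.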

Testing showed that a specification of 2GB of memory with about 1.33C meets most image recognition scenarios. Some operations, such as watermarking, can be reduced to 512MB with about 0.33C (the minimum specification is 128MB of memory with about 0.1C), which gives the best resource utilization and saves cost.
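The specifications quoted in this article (2C with 3GB, about 1.33C with 2GB, about 0.33C with 512MB) are consistent with CPU share scaling linearly with memory. The sketch below assumes that linear ratio, which is inferred from those figures rather than taken from an official specification table; `approx_cpu_cores` is a hypothetical helper name.

```python
# Assumption inferred from the figures in this article: CPU share scales
# linearly with memory, at 2 cores per 3 GB (that is, 1536 MB per core).
MB_PER_CORE = 3 * 1024 / 2


def approx_cpu_cores(memory_mb):
    """Approximate CPU share (in cores) for a given memory size in MB."""
    return memory_mb / MB_PER_CORE


for memory_mb in (128, 512, 2048, 3072):
    print(f"{memory_mb:>5} MB -> about {approx_cpu_cores(memory_mb):.2f}C")
```

Under this assumption, 512MB and 2GB come out at about 0.33C and 1.33C, as quoted above, while 128MB comes out slightly below the "about 0.1C" figure.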

Large algorithm packages can be handled by mounting a NAS file system. In terms of elasticity, Function Compute can scale out (cold start) on the order of 100 milliseconds. For the API calls from the app, the end-to-end average response time is about 300 ms, which is basically acceptable; image recognition is asynchronous, so it is not sensitive to latency. After the final launch, the overall business architecture is as follows:

After running online for some time, Function Compute carried the production workload well, and its elasticity and response time basically met business needs. At peak, more than 7,000 container instances are scaled out to process image recognition at the same time, and in off-peak periods the instances are released automatically. Compared with the previous virtual machine setup, costs dropped by at least 50%.

Another significant benefit is that Function Compute makes release and deployment more efficient: release time was shortened by an order of magnitude, and the process is more convenient. Previously, with virtual machine deployment, a full code update had to be copied to and run on each machine in turn. With FC, the code is uploaded once, the underlying instances are automatically replaced with ones running the latest code, and the business is not interrupted.

Optimizing and upgrading Function Compute

However, as the business continued to grow, the number of pictures processed at peak kept increasing. FC, which had always been rock solid, gradually began to report throttling and timeout errors during business peaks, as shown in the following figure:

Troubleshooting showed that the cause was the original approach of running code on FC with the algorithm dependencies mounted from NAS. At business peaks this hits a bandwidth bottleneck, which makes some requests run longer, consumes more concurrency, and eventually leads to throttling and execution timeouts.
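A rough estimate shows why a shared NAS becomes the bottleneck as concurrency rises. The sketch assumes every new instance loads the full dependency set (about 1GB, the size reported for the dependencies on NAS) at the same time and that the NAS bandwidth is split evenly. The bandwidth value is a hypothetical number chosen only for illustration, not a measured or documented one.

```python
def nas_load_seconds(instances, dependency_gb, nas_bandwidth_gbps):
    """Time for every instance to finish loading when they share one NAS.

    Assumes all instances start together and split the bandwidth evenly.
    """
    total_gigabits = instances * dependency_gb * 8
    return total_gigabits / nas_bandwidth_gbps


DEPENDENCY_GB = 1.0        # about 1 GB of dependencies stored on NAS
NAS_BANDWIDTH_GBPS = 10.0  # hypothetical figure, for illustration only

for instances in (10, 100, 1000):
    seconds = nas_load_seconds(instances, DEPENDENCY_GB, NAS_BANDWIDTH_GBPS)
    print(f"{instances:>5} instances -> {seconds:.0f} s to load dependencies")
```

Because the load time grows linearly with the number of instances starting at once, a sudden scale-out can push dependency loading well past the request timeout, whatever the exact bandwidth figure is.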

For example, monitoring showed that the code dependencies placed on NAS were about 1GB. When concurrency rises suddenly, a large number of FC instances load the dependencies from NAS at the same time, causing network congestion. The most direct fix is to upgrade the bandwidth of the NAS instance, but that treats the symptom rather than the root cause. Over more than a year of development, Function Compute had also added many practical features, and after discussion with Focus the recommendation was to deploy directly from a container image. Compared with the original ZIP package deployment, this adds an image build step, but the benefits are clear. First, the dependency packages and business code can be deployed and maintained together, and images are a more standard delivery format. In addition, the NAS file system is no longer needed, which reduces the risk from the network dependency and single point of failure.

Deployment raised another problem: the image was too large. The Python 3.8 base image is close to 1GB, the algorithm dependencies are close to 3GB, and the final image is 4.2GB. When it was deployed to FC directly, loading the image alone took more than 1 minute during a cold start. Fortunately, FC provides image acceleration, which cut the loading time to about 10 seconds. The following is a comparison of the acceleration effect.

In addition, FC now supports large instance specifications, so functions can be deployed directly to 16C/32GB instances. Algorithms that rely heavily on CPU resources can therefore also run directly on FC.

Another welcome improvement is FC's observability. The CPU and memory utilization metrics mentioned earlier are now available. After instance-level monitoring is enabled in the service configuration, the function monitoring view shows each instance's CPU usage, memory usage, network bandwidth, and so on. This is very useful for Focus's business: for different image processing algorithms, the FC instance specification can be adjusted according to CPU usage to strike the best balance between cost and performance.
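As an illustration of adjusting specifications from monitoring data, the sketch below picks the smallest memory size whose CPU share keeps the observed peak CPU usage under a target. The candidate sizes, the 70% target, the helper name `recommend_memory_mb`, and the 2-cores-per-3GB ratio are all assumptions made for this example. It looks at CPU only; a real decision would also need to check memory usage.

```python
# Hypothetical list of memory sizes to choose from, in MB.
CANDIDATE_MEMORY_MB = (128, 256, 512, 1024, 2048, 3072)
# Assumption: CPU share scales with memory at 2 cores per 3 GB.
MB_PER_CORE = 1536


def recommend_memory_mb(current_memory_mb, peak_cpu_utilization,
                        target_utilization=0.7):
    """Pick the smallest size whose CPU share keeps peak usage under target.

    peak_cpu_utilization is the observed peak as a fraction (0.0 to 1.0)
    of the CPU share of the current specification.
    """
    used_cores = (current_memory_mb / MB_PER_CORE) * peak_cpu_utilization
    needed_cores = used_cores / target_utilization
    for memory_mb in CANDIDATE_MEMORY_MB:
        if memory_mb / MB_PER_CORE >= needed_cores:
            return memory_mb
    # Nothing is large enough: fall back to the largest candidate.
    return CANDIDATE_MEMORY_MB[-1]


# A lightweight function that peaks at 15% CPU on a 2 GB instance.
print(recommend_memory_mb(2048, 0.15))
# A heavy function that peaks at 95% CPU on the same instance.
print(recommend_memory_mb(2048, 0.95))
```

With these assumed inputs, the lightweight function is moved down to 512MB and the heavy one up to 3072MB, the kind of per-algorithm tuning that instance-level metrics make possible.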

Summary and outlook

FC is very attractive for reducing costs and increasing efficiency. It is a particularly good choice for workloads with pronounced peaks and valleys that need extreme elasticity. In addition, enhancements such as image deployment, image acceleration, and observability give Focus better control over its business.

FC has also released GPU support, a first in the industry. It is a good choice for algorithm models that will need GPU inference and acceleration in the future. With serverless elastic scaling and pay-as-you-go billing, the problem of GPUs being "unaffordable" can be greatly eased.

Alibaba Cloud's serverless portfolio includes not only the Function Compute platform; for microservice applications, it was also the first in the industry to launch a serverless application engine, SAE. Compared with the K8s-based backend microservices that Focus currently runs, SAE has clear advantages: it can significantly reduce resource maintenance costs, improve overall R&D efficiency, and support migration with zero code changes. Going forward, we will explore best practices for microservices on serverless together with Focus.
