By Mo Jingguo (Technical Head of the Cloud Classroom Livestreaming Platform of New Oriental Education & Technology Group Inc.) and Hao Zheng (Alibaba Cloud Serverless Architect)
New Oriental Education and Technology Group is a comprehensive education group that centers on the all-round growth of students and is driven by technology. The cloud classroom system of New Oriental's online education business supports all of its online education scenarios, such as livestreaming, transcoding, and video-on-demand. Workloads on the task processing platform, such as livestreaming transcription and video transcoding, have pronounced traffic peaks and valleys. As business volume grew, the resulting low resource utilization of the self-built data center became the core pain point of the business. To improve the utilization of computing resources, reduce costs, and improve efficiency, New Oriental embarked on the road to Serverless practices after several attempts. The following content was shared by Mo Jingguo (Technical Head of the Cloud Classroom Livestreaming Platform of New Oriental Education & Technology Group Inc.) at the Apsara Conference 2022.
In addition to its online education business that uses the cloud classroom platform, New Oriental has opened up its livestreaming capability through Metcom Cloud Live. The cloud classroom livestreaming platform mainly supports four business models:
Livestreaming and video-on-demand are the main course delivery modes of New Oriental. Live courses are highly interactive, which stimulates students' interest in learning. Teachers and students can evaluate each other through classroom feedback and interaction, so teachers can adjust their teaching in time and make the teaching process more targeted. Interaction and classroom Q&A also help ensure the quality of the teaching process and its results. Live courses are more suitable for young students.
Recorded courses, which are characterized by flexible learning time, are more suitable for students in high school and above. Students can search the learning content independently and study selectively. The advantage of recorded courses is that the teaching content can be continuously polished and edited. Demand for recorded courses is increasing.
In the beginning, the team adopted client-side screen recording, which records the teacher's livestream on the client so that students can rewatch it. However, this method has a high error rate and high CPU usage, and the layout of the recorded UI cannot be flexibly customized, so we could only record what appeared on the screen. This method can only meet the needs of the younger age groups.
In 2022, New Oriental began to offer online education for college students, which places higher requirements on the quality of recorded courses. The team considered using server-side recording to solve the problem. The two core points of server-side recording are live recording and standardized video production. Our business model makes it difficult for us to predict business volume. Therefore, the key technical task for New Oriental is to achieve computing elasticity.
The team had three candidate technical routes to solve the server-side recording problem:
Use self-managed services on ECS instances: The advantage of this solution is flexibility, but elastic computing is not supported out of the box. Although cloud vendors provide APIs for elastically allocating ECS resources, implementing computing elasticity on top of them requires much development work. In addition, the subsequent operation and maintenance are complex, the resource cost is high, and it is difficult to achieve standardization.
Use SaaS cloud recording: The advantage of this solution is that it provides standardized services and requires less R&D investment and less operation and maintenance work. However, the problems are poor flexibility, high resource costs, and difficulty in performance optimization. We wanted to find a mature SaaS vendor whose services could support the business quickly. However, after trials, the maturity and technical metrics of these platforms could not meet our needs.
Use Alibaba Cloud Function Compute (FC): We found that FC meets the requirement for elastic computing very well. We only need to pay attention to the specific requirements and develop on the platform. The R&D investment is low, and no O&M work is required. The development process is independent and controllable, with high flexibility. On-demand use significantly reduces the cost and makes it relatively easy to achieve standardization. However, FC is a relatively new technology, and it takes some time for the team to get familiar with it.
After repeated comparisons, the New Oriental Team chose FC to solve the server-side recording problem.
First, we tried it in the recording and transcoding scenario. The core requirement of recording and transcoding is to transcode livestreams in real-time and save them in a standard video format for subsequent processing and use.
In this scenario, we saw the elastic advantage brought by FC for the first time. After the teacher enters the livestreaming room and initiates a transcoding request, we can quickly start the function instance for transcoding. After the class ends, we end the transcoding task, upload the temporary audio and video to the cloud storage, and immediately release the function instance without wasting computing resources. After the experience of applying FC in recording and transcoding projects, we have greater confidence in the FC solution.
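The transcoding step above can be illustrated with a short sketch. The article does not name the transcoding tool, so the use of FFmpeg here is an assumption, and the function names, stream URL, and output path are hypothetical. The sketch only shows how a function instance might build and launch a command that reads a livestream and writes a standard MP4 file.

```python
import subprocess
from typing import List


def build_transcode_command(stream_url: str, output_path: str) -> List[str]:
    """Build an FFmpeg command that transcodes a livestream into an MP4 file.

    Assumption: FFmpeg is available in the function's runtime image.
    """
    return [
        "ffmpeg",
        "-i", stream_url,          # pull the live stream
        "-c:v", "libx264",         # H.264 video
        "-c:a", "aac",             # AAC audio
        # Fragmented MP4 stays playable even if the process is stopped abruptly.
        "-movflags", "frag_keyframe+empty_moov",
        "-f", "mp4",
        output_path,
    ]


def start_transcode(stream_url: str, output_path: str) -> subprocess.Popen:
    """Start the transcoding process; the caller stops it when the class ends."""
    return subprocess.Popen(build_transcode_command(stream_url, output_path))
```

When the class ends, the caller would stop the process, upload the output file to cloud storage, and let the instance be released.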
After that, we started the cloud recording project. We used the Chrome browser to join the live room and capture the browser interface. The key to this solution is to elastically provide browser instances.
Therefore, we used Alibaba Cloud Function Compute (FC) to start a Linux container and ran the Chrome browser in it, which provides browser instances elastically. The whole recording process is described below.
After the teacher enters the live classroom, the system starts audio and video stream ingest and whiteboard operations. At the same time, the recording platform initiates a recording request and starts a function instance to process it. The instance receives the audio and video streams and the whiteboard operations of the classroom, and renders the entire classroom in the browser while capturing the screen. After the course ends, the platform initiates an end-recording request, and the FC platform terminates the instance. Before the instance is terminated, temporary results are uploaded to the cloud storage. Then, the function instance is destroyed, so no resources are wasted.
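The lifecycle described above (start on request, capture, upload temporary results, release) can be modeled as a small state machine. This is a minimal sketch under stated assumptions, not New Oriental's implementation: `RecordingSession`, its states, the `upload` callable, and the object key layout are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class State(Enum):
    IDLE = "idle"
    RECORDING = "recording"
    RELEASED = "released"


@dataclass
class RecordingSession:
    room_id: str
    # Hypothetical uploader: receives an object key and the recorded bytes.
    upload: Callable[[str, bytes], None]
    state: State = State.IDLE
    chunks: List[bytes] = field(default_factory=list)

    def start(self) -> None:
        """Handle the start-recording request."""
        if self.state is not State.IDLE:
            raise RuntimeError("session already started")
        self.state = State.RECORDING

    def capture(self, frame: bytes) -> None:
        """Buffer one captured chunk while the class is in progress."""
        if self.state is not State.RECORDING:
            raise RuntimeError("session is not recording")
        self.chunks.append(frame)

    def stop(self) -> str:
        """Handle the end-recording request: upload first, then release."""
        if self.state is not State.RECORDING:
            raise RuntimeError("session is not recording")
        key = f"recordings/{self.room_id}.bin"
        self.upload(key, b"".join(self.chunks))
        self.chunks.clear()
        self.state = State.RELEASED
        return key
```

The ordering inside `stop` mirrors the text: temporary results are uploaded before the session is marked as released, so nothing is lost when the instance is destroyed.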
We believe observability is essential for the FC platform. First, a large number of function instances need to be started during business peak hours, so complete metrics, logs, and traces are required to monitor these instances effectively. Second, since FC instances are created on demand and destroyed after completing their tasks, the platform must keep complete logs so that developers can troubleshoot problems after an instance is gone.
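Because an instance no longer exists by the time someone investigates a failure, each log line needs to carry enough context to be found and correlated later. One way to do this is structured logging, sketched below with Python's standard `logging` module. The field names `task_id` and `instance_id` are assumptions for illustration and are not taken from the FC or SLS APIs.

```python
import json
import logging
import sys
from typing import TextIO


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object so a log service can index it."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "message": record.getMessage(),
            # Hypothetical correlation fields, passed via the `extra` argument.
            "task_id": getattr(record, "task_id", None),
            "instance_id": getattr(record, "instance_id", None),
        }
        return json.dumps(payload)


def make_logger(name: str, stream: TextIO = sys.stdout) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

A call such as `make_logger("recorder").info("recording started", extra={"task_id": "t-1", "instance_id": "i-9"})` then emits one searchable JSON line per event.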
During the development of the recording service, we ran into the following problem: after the function instance started, the Chrome browser accessed the live service, and a network problem at that moment caused the recording to fail. We used the Alibaba Cloud SLS platform to view the logs and found that the Chrome browser kernel was very sensitive to network problems. After identifying the cause, we introduced a retry mechanism, and the problem was solved.
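The article does not describe how the retry mechanism was implemented. A common pattern is retrying with exponential backoff, sketched below. The `retry` helper, its parameters, and the choice of `ConnectionError` as the retriable error are assumptions for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying on ConnectionError with exponential backoff.

    The delay doubles after each failed attempt: base_delay, 2 * base_delay, ...
    The last failure is re-raised so that the caller can log it.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

In the recording scenario, `operation` would be the step that opens the live room in the browser. Injecting `sleep` keeps the helper easy to test without real waiting.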
Before adopting FC, we expected it to start tens of thousands of instances within 100 milliseconds, warm up instances in time to mitigate cold starts, and help us handle the peaks of the transcoding and screen recording workloads during livestreaming. We also expected it to respond effectively to large-scale bursts of online traffic, bill on a pay-as-you-go basis, improve resource utilization, reduce resource costs by 20%, and significantly reduce O&M costs, so that we could focus on business innovation. In practice, we found that FC met these needs and brought some pleasant surprises. Developers can use the platform easily after learning a few new concepts and a few APIs. Since we adopted the FC solution, the cost of cloud resources has dropped significantly. In addition, the FC solution lets us build templates for specific business scenarios, and other business teams can reuse these templates.