How Serverless Task solves task scheduling

1. Task scheduling

Task scheduling is the process by which the system places tasks onto appropriate computing resources according to the current load. A good scheduling system has to balance two requirements: isolation between tasks with different characteristics, and efficient processing. For asynchronous tasks, Function Compute uses an independent-queue model with an automatic load balancing strategy, which provides multi-tenant isolation without hurting processing performance.

Serverless Task task scheduling model

When a user submits a task, the system converts the task into a message and places it in an internal queue through asynchronous dispatch. The processing flow of a message is shown in the following figure:

Figure 1

For multi-tenant isolation and message backlog control, the system relies mainly on how the scheduler consumes and controls the queues. An account-level queue is allocated to each user in advance, and the asynchronous invocations (including task invocations) of all of that user's functions share this queue.

This model ensures that one user's asynchronous invocations (including task invocations) are not affected by the invocations of other users. However, in some large-scale scenarios, for example when a user has many functions and each function is invoked heavily, all of the user's asynchronous messages still share one queue, so invocations inevitably interfere with each other. Some long-tail invocations may consume too much of the queue's resources and starve other functions.

To keep this from affecting important functions, Function Compute provides a finer-grained queue: the function-level queue. You can configure a separate queue for an individual function so that consumption of a high-priority function is not affected by other functions under the same account. The relationship between the queues is shown in the following figure:

Figure 2
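The relationship between account-level and function-level queues can be sketched in a few lines of Python. This is only an illustration of the routing idea described above, not the Function Compute implementation; the class name QueueRouter, its methods, and the account and function names are all hypothetical.

```python
from collections import defaultdict, deque


class QueueRouter:
    """Hypothetical router: picks the queue an asynchronous message lands in."""

    def __init__(self):
        # One shared queue per account, created on first use.
        self.account_queues = defaultdict(deque)
        # Dedicated queues, keyed by (account_id, function_name).
        self.function_queues = {}

    def enable_function_queue(self, account_id, function_name):
        """Give one function its own queue, separate from the account queue."""
        self.function_queues.setdefault((account_id, function_name), deque())

    def enqueue(self, account_id, function_name, payload):
        """Place a message and return the name of the queue that received it."""
        key = (account_id, function_name)
        if key in self.function_queues:
            self.function_queues[key].append(payload)
            return "function:{}/{}".format(account_id, function_name)
        # Default: every function of the account shares the account queue.
        self.account_queues[account_id].append((function_name, payload))
        return "account:{}".format(account_id)


router = QueueRouter()
router.enable_function_queue("acct-1", "critical-task")
print(router.enqueue("acct-1", "critical-task", {"job": 1}))
print(router.enqueue("acct-1", "batch-task", {"job": 2}))
```

In this sketch, only functions that were explicitly given a dedicated queue bypass the shared account queue; everything else keeps the default behavior.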

Typical application scenarios

Suppose a user has two task functions. Task A must process one message at a time because of a limitation in a downstream service. Task B is a high-concurrency task that should finish as soon as possible. In the default mode, tasks A and B share the same user queue, and the following happens: because Task A has a concurrency limit, Function Compute throttles the dequeue rate of the entire queue, which delays Task B.

When Task A's messages finish executing, Task B gets the chance to dequeue. Concurrency then rises, and Task B's messages take over the resource pool. Task A's messages now have difficulty dequeuing and cannot start executing for a long time. As a result, A and B seriously interfere with each other.

After the queues are adjusted, tasks A and B each have their own queue. The consumption rate of each task is no longer affected by the other, and both can meet their requirements.

Serverless Task currently provides a task backlog metric. You can check the number of backlogged tasks in the task interface and use it to decide whether to enable a dedicated queue for a function.
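The first half of this scenario can be reproduced with a deliberately simplified model. The sketch below assumes a FIFO queue whose consumer only looks at the head message, tasks that each run for exactly one tick, and per-function concurrency limits of at least 1. The function name drain and these assumptions are illustrative only and do not describe the real scheduler.

```python
from collections import deque


def drain(messages, limits):
    """Return {function: tick in which its last message started}.

    messages: function names in arrival order (one entry per message).
    limits:   per-function concurrency limits (>= 1); missing means unlimited.
    A throttled head message blocks everything queued behind it.
    """
    queue = deque(messages)
    last_start = {}
    tick = 0
    while queue:
        tick += 1
        started = {}  # messages started during this tick, per function
        while queue:
            fn = queue[0]
            if started.get(fn, 0) >= limits.get(fn, float("inf")):
                break  # head is at its limit, so the whole queue waits
            queue.popleft()
            started[fn] = started.get(fn, 0) + 1
            last_start[fn] = tick
    return last_start


limits = {"A": 1}                              # Task A: one message at a time
shared = drain(["A", "B"] * 3, limits)         # A and B share one queue
only_b = drain(["B"] * 3, limits)              # B has a dedicated queue
print(shared)   # {'A': 3, 'B': 3}
print(only_b)   # {'B': 1}
```

With the shared queue, Task B's last message cannot start until tick 3 because it waits behind Task A's throttled messages; with its own queue, all of Task B's messages start in tick 1.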

Serverless Task task queue load balancing model

The preceding section describes how function-level queues avoid the "noisy neighbor" problem. However, in some scenarios the concurrency of a task is so high that tasks still back up even when the function has its own queue. Solving this requires the load balancing strategy of Serverless Task.

The task processing module of Function Compute is organized around partitions. By default each user belongs to one partition, and the scheduler responsible for that partition listens to the user's task queue. When a serious backlog builds up, the system allocates multiple partitions to the user according to the load and assigns them to different schedulers for consumption, which raises the overall consumption rate of the tasks.

Figure 3
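One way to picture the partition strategy is as a sizing step followed by an assignment step. The sketch below is an assumption-laden illustration: the capacity threshold, the upper bound on partitions, the round-robin policy, and both function names are hypothetical, since the source does not describe how the real system decides.

```python
import math


def partitions_needed(backlog, per_partition_capacity, max_partitions):
    """Hypothetical sizing rule: one partition by default, more under backlog."""
    if backlog <= per_partition_capacity:
        return 1  # default case: the user stays in a single partition
    wanted = math.ceil(backlog / per_partition_capacity)
    return min(max_partitions, wanted)


def assign_partitions(num_partitions, schedulers):
    """Spread partition indexes across schedulers in round-robin order."""
    assignment = {name: [] for name in schedulers}
    for partition in range(num_partitions):
        owner = schedulers[partition % len(schedulers)]
        assignment[owner].append(partition)
    return assignment


count = partitions_needed(backlog=350, per_partition_capacity=100, max_partitions=8)
print(count)                                          # 4
print(assign_partitions(count, ["s1", "s2", "s3"]))   # {'s1': [0, 3], 's2': [1], 's3': [2]}
```

The point of the sketch is only that a growing backlog leads to more partitions, and that those partitions are consumed by different schedulers in parallel.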

As shown above, Alibaba Cloud Function Compute provides multi-tenant isolation in task queue management by default, which suits most scenarios. For scenarios with heavy load, long execution times, and high concurrency, Function Compute also supports horizontal scaling to speed up consumption. For task isolation, it supports isolating functions with different priorities separately to avoid the noisy neighbor problem.

2. Observability

Observability is one of the essential capabilities of a task system. Strong observability reduces the extra work that the business side has to do at each stage of a task's operation.

Development stage: the ability to debug tasks online and to inspect execution results directly affects how quickly the business can go live;

Normal operation stage: monitoring, traffic statistics, and runtime logs help users quickly understand how the business is developing and changing, and quickly locate and handle failures;

Phased audit: the storage and retention of task history gives users good traceability, and later business planning can be based on this historical information.

Serverless Task observability support - development and test phase

The main need during development is to debug and locate problems quickly. For this phase, Serverless Task provides instance login and real-time logs. After the code is developed and uploaded, the cycle of test, debug, modify code, and retest can be completed in the console, which greatly improves development efficiency. If performance debugging is needed, third-party binaries (such as FFmpeg in audio and video processing) can be debugged with the help of instance login. The procedure is as follows:

Select the task whose instance you want to log in to and click the instance link.

On the instance monitoring page that opens, click the instance login button in the upper right corner to log in to the corresponding instance.

Serverless Task observability support - operation phase after business launch

After the business goes live, failures often occur because capacity was underestimated and a downstream system cannot bear the pressure. Serverless Task therefore provides runtime metrics: the number of tasks submitted, completed, and executing within a period of time. Users can quickly understand the current business load from the metric charts. When downstream consumption of a user's tasks is slow, tasks may back up; this is easy to see in the charts, so users can respond quickly. The metrics currently provided by Serverless Task are as follows:

Monitoring metric: Description

• Number of tasks submitted: The total number of tasks submitted in the past 1 minute, including running tasks, completed tasks, and tasks not yet dequeued.

• Number of tasks completed: The number of tasks submitted in the past 1 minute that have completed, whether they succeeded or failed.

• Number of tasks in queue: The number of tasks submitted in the past 1 minute that are still queued. If the value is not 0, there is a backlog of tasks.

• Number of running tasks: The number of tasks submitted in the past 1 minute that are running.

• Number of failed tasks: The number of tasks submitted in the past 1 minute that failed to run.

• Number of instances occupied by running tasks: The number of tasks submitted in the past 1 minute that are running successfully.
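The way these per-minute counts relate to each other can be illustrated with a small aggregation over task records. This is a sketch under stated assumptions: the record fields submitted_at and status, the status values, and the function name task_metrics are hypothetical and are not the format used by the monitoring system.

```python
from collections import Counter


def task_metrics(tasks, now, window_seconds=60):
    """Count tasks submitted within the last window, grouped by status.

    Each task is a dict with hypothetical keys:
      submitted_at: submission time in seconds
      status: "queued", "running", "succeeded", or "failed"
    """
    recent = [t for t in tasks if 0 <= now - t["submitted_at"] < window_seconds]
    by_status = Counter(t["status"] for t in recent)
    return {
        "submitted": len(recent),
        # Completed covers both outcomes, successful and failed.
        "completed": by_status["succeeded"] + by_status["failed"],
        "queued": by_status["queued"],  # non-zero means there is a backlog
        "running": by_status["running"],
        "failed": by_status["failed"],
    }


sample = [
    {"submitted_at": 100, "status": "succeeded"},
    {"submitted_at": 110, "status": "failed"},
    {"submitted_at": 120, "status": "queued"},
    {"submitted_at": 130, "status": "running"},
    {"submitted_at": 10, "status": "succeeded"},  # outside the 60 s window
]
print(task_metrics(sample, now=150))
```

A non-zero queued count in such a summary corresponds to the backlog signal described for the "Number of tasks in queue" metric.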

To locate problems quickly, Function Compute supports real-time viewing of function logs and instance metrics. On the task list page, you can find the task that failed to execute and open its log page and instance page to locate the problem.

Serverless Task observability support - phased audit

After an online task has been running for some time, it often needs a phased audit, for example of the total number of tasks executed in the previous week, the number of failed tasks, and the times at which executions failed. In addition to the console, Function Compute provides rich APIs for auditing tasks. The main capabilities are:

Filter by status, to query only executions in a given state;

Filter by trigger time, for example to query tasks started within a past time range;

Query by task name. If your tasks carry a TraceID that links upstream and downstream business systems, you can specify a meaningful task ID when triggering the task and later run range queries by ID prefix;

These filters can be combined to meet more complex requirements. The filter conditions supported by the console are shown in the following figure:

For more parameters, please refer to ListStatefulAsyncInvocation.
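How status, time range, and ID prefix filters combine can be shown with a local, in-memory sketch. It does not call the Function Compute API: the record fields id, status, and started_at, the status strings, and the function filter_tasks with its parameters are all hypothetical names chosen for illustration.

```python
def filter_tasks(tasks, status=None, started_from=None, started_until=None,
                 id_prefix=None):
    """Return the tasks that match every filter that was supplied.

    Filters left as None are ignored, so they can be combined freely.
    The time range is inclusive at started_from and exclusive at started_until.
    """
    matched = []
    for task in tasks:
        if status is not None and task["status"] != status:
            continue
        if started_from is not None and task["started_at"] < started_from:
            continue
        if started_until is not None and task["started_at"] >= started_until:
            continue
        if id_prefix is not None and not task["id"].startswith(id_prefix):
            continue
        matched.append(task)
    return matched


history = [
    {"id": "order-001", "status": "Succeeded", "started_at": 100},
    {"id": "order-002", "status": "Failed", "started_at": 200},
    {"id": "video-001", "status": "Failed", "started_at": 300},
]
failed_orders = filter_tasks(history, status="Failed", id_prefix="order-")
print([t["id"] for t in failed_orders])   # ['order-002']
```

Choosing task IDs with a meaningful prefix, as suggested above, is what makes the prefix filter useful for tracing one business flow.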

Serverless Task observability support - dead letter queue and service compensation

In messaging, the dead letter queue is a very important concept. Messages that cannot be consumed often need to be stored somewhere for later human intervention, so that unprocessed messages do not cause business losses. Serverless Task supports this as well. You can configure a target for Serverless Task: when a task fails, Function Compute can automatically push the context of the failed execution to a messaging service such as a message queue for later processing. If your handling logic can be automated, Function Compute can also push the context of the failed task back to Function Compute and run a piece of your custom business logic to compensate.

You can configure success and failure targets on the asynchronous call configuration page.

For more configuration content, please refer to PutFunctionAsyncInvokeConfig.
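The success and failure target pattern can be sketched locally as a wrapper around a task handler. This is not the Function Compute mechanism itself: the function run_with_targets, its callback parameters, and the shape of the context dictionary are hypothetical, and a plain Python list stands in for the dead letter queue.

```python
def run_with_targets(handler, event, on_success=None, on_failure=None):
    """Run a task handler and forward the outcome to the matching target.

    on_success and on_failure are callables that stand in for configured
    targets, such as a message queue or another function.
    """
    try:
        result = handler(event)
    except Exception as exc:
        # Forward the failure context so it can be inspected or compensated.
        context = {
            "event": event,
            "error_type": type(exc).__name__,
            "error": str(exc),
        }
        if on_failure is not None:
            on_failure(context)
        return None
    if on_success is not None:
        on_success({"event": event, "result": result})
    return result


def double_positive(event):
    """Example handler that rejects negative input."""
    if event["x"] < 0:
        raise ValueError("negative input")
    return event["x"] * 2


dead_letters = []  # stands in for a dead letter queue
run_with_targets(double_positive, {"x": -1}, on_failure=dead_letters.append)
print(dead_letters)
```

Replacing dead_letters.append with a function that retries or repairs the input would correspond to the automated compensation path described above.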

To sum up, the observability provided by Serverless Task supports monitoring across the whole life cycle of a task. All console capabilities are also available through open APIs, so they can be customized to meet further needs. The targets of Serverless Task can be used not only to compensate for task failures but also as an event source in an event-driven model, automatically delivering processed events to downstream services.

