This topic describes the runtime extension feature of Function Compute. The feature extends the programming model to support conventional long-running applications and helps you eliminate idle costs.
Long-running applications and FaaS execution environment
Conventional long-running virtual machines (VMs) and managed container services are typically billed from the time an instance starts to the time it stops. You are charged even when the instance processes no requests during this interval. In Function Compute, billing is accurate to the millisecond, and you are charged only while requests are being processed. Instances are frozen when no requests are being processed, which nearly eliminates idle costs for event-driven workloads. However, the freezing mechanism breaks assumptions that long-running processes in conventional architectures rely on, which hinders application migration. For example, commonly used open source distributed tracing libraries, such as those of Managed Service for OpenTelemetry (formerly Tracing Analysis), and third-party application performance management (APM) solutions cannot correctly report data in the execution environment of Function Compute.
The following pain points hinder the smooth migration of conventional applications to a serverless architecture:
Asynchronously reported background metrics data is delayed or lost. If data fails to be sent while a request is being executed, the data may be delayed until the next request arrives, or the data points may be discarded.
Latency increases if metrics data is sent synchronously. If a method similar to Flush is called after each request completes, request latencies increase and excessive load is placed on backend servers.
Graceful shutdown of function instances requires applications to close connections, stop processes, and report execution status when instances are stopped. However, developers cannot know when function instances are unpublished in Function Compute, and no webhook is provided to notify them of instance unpublishing events.
Programming model extensions
Function Compute provides the runtime extension feature to resolve the preceding pain points. The feature extends the existing programming model for HTTP servers with two webhooks: PreFreeze and PreStop. Extension developers implement HTTP handlers to monitor lifecycle events of function instances.
PreFreeze: Before Function Compute freezes a function instance, it sends an HTTP GET request to the /pre-freeze path. Extension developers implement the logic needed to ensure that necessary operations, such as waiting for metrics to be sent, are complete before the instance is frozen. The execution time of the PreFreeze hook is not counted toward the duration of the InvokeFunction call.
PreStop: Before Function Compute stops a function instance, it sends an HTTP GET request to the /pre-stop path. Extension developers implement the logic needed to ensure that necessary operations, such as closing database connections, reporting data, and updating execution status, are complete before the instance is unpublished.
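The two hooks above can be sketched as routes on the HTTP server that a custom runtime already exposes. The /pre-freeze and /pre-stop paths come from the description above; the port, handler class, and the flush_metrics and close_connections helpers are illustrative assumptions, not part of the Function Compute API.

```python
# Minimal sketch of a custom-runtime HTTP server that handles the
# PreFreeze and PreStop extension hooks. Only the /pre-freeze and
# /pre-stop paths are taken from the text; everything else is assumed.
from http.server import BaseHTTPRequestHandler, HTTPServer


def flush_metrics():
    # Assumption: push any buffered metric points to the backend here,
    # so nothing is lost when the instance is frozen.
    pass


def close_connections():
    # Assumption: close database connections and report final execution
    # status here, before the instance is unpublished.
    pass


class ExtensionHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/pre-freeze":
            # Function Compute calls this before freezing the instance.
            flush_metrics()
            self._ok()
        elif self.path == "/pre-stop":
            # Function Compute calls this before stopping the instance.
            close_connections()
            self._ok()
        else:
            self.send_error(404)

    def _ok(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"OK")


def run(port=9000):
    # Assumption: the listening port depends on your runtime configuration.
    HTTPServer(("0.0.0.0", port), ExtensionHandler).serve_forever()
```

Because the hooks are plain GET handlers, they can live in the same server process that serves invocation requests, sharing state such as metric buffers and connection pools.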
Billing overview
The billing method for PreFreeze and PreStop calls is the same as that for InvokeFunction calls. You are not charged for the number of requests sent to the extension HTTP hooks. The extensions also apply to scenarios in which multiple requests are concurrently executed on a single instance: when all concurrent requests on an instance have finished, the PreFreeze hook is called before the instance is frozen. In the example shown in the following figure, the instance size is 1 GB and the period from t1, when PreFreeze starts, to t6, when Request 2 finishes, is 1s. The execution time of the instance is calculated as t6 - t1, and the resource usage is calculated as 1s × 1 GB = 1 CU.
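The billing formula above can be checked with a short calculation. The 1 GB instance size and the t1/t6 boundaries come from the example; the concrete clock values are assumed for illustration.

```python
# Worked example of the billing formula: execution time runs from t1
# (when PreFreeze starts) to t6 (when the last concurrent request,
# Request 2, finishes). Times are in seconds; the clock values are
# assumptions chosen to match the 1s interval in the example.
instance_size_gb = 1.0   # instance size from the example
t1, t6 = 0.0, 1.0        # assumed start and end timestamps

execution_time_s = t6 - t1                            # t6 - t1 = 1s
resource_usage_cu = execution_time_s * instance_size_gb  # 1s × 1 GB = 1 CU
```

Note that per-request billing stops when each request completes; the instance-level interval here only determines how long the instance as a whole is considered active.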