Function Compute | How to use layers to solve dependency package problems

If you have run into any of the following problems while using Alibaba Cloud Function Compute, this article should help:

1. My third-party dependency package is very large, so every code update takes a long time, and the package may even exceed the code package size limit. What should I do?
2. After I install a third-party dependency package, my code runs successfully locally but reports an error once it is uploaded to Alibaba Cloud Function Compute. What is going on?
3. Many dependency packages are commonly used by many users. Can't Alibaba Cloud Function Compute build them directly into the runtime environment?
4. Several of my functions share the same dependencies. How do I manage them?

Layers provide a way to centrally manage and share code and data across multiple functions.
In January 2021, Alibaba Cloud Function Compute released the "Custom Layer" feature, which lets users build their own layers and share them across functions. In August 2022, Alibaba Cloud Function Compute released the "Public Layer" feature, which provides official public layers that users can use directly, further improving the user experience.

Next, let's look at the features and uses of the custom layer.

Custom layer

Before the layer feature was released, code had to be packaged and deployed together with its dependencies. These dependencies are often identical across functions, and in many cases they are much larger than the code itself.

With layers, we can package the dependencies of the code, or the parts shared by multiple functions, into a Zip file and publish it as a Function Compute custom layer that different functions can use.

Alibaba Cloud Function Compute loads the layer and the function code together when the function is invoked. For details, see the documents listed at the end of the article: Create a custom layer [1] and Configure a custom layer in a function [2].

Why use custom layers?

Using custom layers has the following advantages:
(1) Code reuse across functions
Extract the code or data that multiple functions have in common, package it into a Zip file, and publish it as a custom layer that different functions can reference. This avoids maintaining the same code or data in several places.
It also separates dependencies from business logic, so users can focus on the core business logic.
(2) Smaller code packages
As a function's code package grows, deployment becomes slower and the function becomes harder to maintain and test.
The size of the function code package is also limited. For example, the code package of Alibaba Cloud Function Compute is limited to 500MB (as of September 2022). Layers are one way to get past this limit. Layers have size limits too: currently, the code package of a single layer is limited to 500MB, a single function can be configured with up to 5 layers, and the total size cannot exceed 2GB.
(3) Faster code deployment and simpler function management
The smaller the function code package, the faster it deploys. This matters most with large dependencies: the core function code may be only a few megabytes, while the dependencies may be hundreds of megabytes. For example, the Puppeteer dependency package exceeds 100MB, and Alibaba Cloud's DataX dependency package exceeds 800MB.
These dependencies are rarely modified, so once they are packaged into layers, changes to the core code no longer require re-uploading them. The dependencies can also be split across multiple layers, so that a change only requires updating the affected layer.
For example, suppose we have built a custom Python 3.10 runtime and a version of the scientific computing library SciPy that is compatible with it. We can split the custom runtime and the dependency package into two layers. When the dependency package needs to be updated, only its layer is updated, while the layer for the custom runtime remains unchanged.
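The size limits quoted in (2) can be checked before deployment. The following is a minimal sketch, not an official tool: the function name check_layer_plan is hypothetical, the limits are the September 2022 figures from this article and may change, and it assumes that 1MB means 1024 * 1024 bytes and that the 2GB total applies to the sum of the configured layers.

```python
MB = 1024 * 1024
GB = 1024 * MB

# Limits quoted in this article (September 2022); treat them as assumptions.
MAX_LAYER_BYTES = 500 * MB
MAX_LAYERS_PER_FUNCTION = 5
MAX_TOTAL_BYTES = 2 * GB


def check_layer_plan(layer_sizes):
    """Return a list of problems for a {layer name: size in bytes} plan.

    An empty list means the plan fits within the limits above.
    """
    problems = []
    if len(layer_sizes) > MAX_LAYERS_PER_FUNCTION:
        problems.append(
            f"too many layers: {len(layer_sizes)} > {MAX_LAYERS_PER_FUNCTION}"
        )
    for name, size in layer_sizes.items():
        if size > MAX_LAYER_BYTES:
            problems.append(f"layer {name!r} is larger than 500MB")
    if sum(layer_sizes.values()) > MAX_TOTAL_BYTES:
        problems.append("total layer size exceeds 2GB")
    return problems
```

For instance, a plan with a 300MB runtime layer and a 200MB SciPy layer passes, while a single 800MB layer is reported as too large and would need to be split.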

The Dilemma of Custom Layers

(1) Building a layer has a learning curve
The Zip package of a layer must follow a specific format, and users need to build it according to that specification. Taking Python's requests library as an example, the packaged file structure is as follows:
my-layer-code.zip
└── python
    └── requests
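One way to produce a Zip file with this layout is sketched below using only the Python standard library. The helper name build_layer_zip and the build directory are hypothetical; the sketch assumes the packages were already installed into a local directory (for example with pip install requests -t build).

```python
import zipfile
from pathlib import Path


def build_layer_zip(package_dir, zip_path, prefix="python"):
    """Zip every file under package_dir so that it lands under prefix/ in the archive.

    zip_path should be outside package_dir, otherwise the archive would
    try to include itself.
    """
    package_dir = Path(package_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(package_dir.rglob("*")):
            if path.is_file():
                # e.g. build/requests/api.py -> python/requests/api.py
                arcname = Path(prefix) / path.relative_to(package_dir)
                zf.write(path, arcname.as_posix())
    return zip_path
```

Calling build_layer_zip("build", "my-layer-code.zip") would then place build/requests under python/requests inside the archive.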

Why is there such a requirement? It comes from how each runtime searches for third-party dependency packages. Taking Python as an example, the Python runtime searches for packages in the directories listed in sys.path, and the Zip package above is decompressed to the /opt directory of the function instance. After decompression, the requests package is located in the /opt/python directory.

The Function Compute platform adds certain directories to the dependency search path of each runtime language. For example, the Python runtime adds /opt/python to sys.path, so the code can import the requests library directly. For other runtimes, please refer to the document at the end of this article - Create a custom layer. You can also build a layer that does not follow this format, in which case you need to add the corresponding search path in your code. For specific methods, please refer to the document at the end of this article - How to reference the dependencies in the layer in Custom Runtime? [3]
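For a layer that does not follow the standard layout, adding the search path in Python code could look like the sketch below. The directory /opt/deps and the helper add_layer_path are hypothetical, chosen only for illustration; the documented method for your runtime is described in [3].

```python
import sys


def add_layer_path(path, search_path=None):
    """Put path at the front of the module search path if it is not already there."""
    search_path = sys.path if search_path is None else search_path
    if path not in search_path:
        search_path.insert(0, path)
    return search_path


# Hypothetical layout: the layer Zip keeps its packages under "deps/",
# so after decompression they are in /opt/deps rather than /opt/python.
add_layer_path("/opt/deps")
```

The call has to run before the first import of a package that lives in the layer, for example at the top of the function's entry file.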

Layers must be built for the target operating system and processor architecture. Some dependencies are specific to an operating system and processor architecture. Take Python's scientific computing library NumPy: if you install it on macOS with an M1 chip, the installed build is:
numpy-1.23.3-cp39-cp39-macosx_11_0_arm64
The name shows that this build targets macOS and the arm64 processor architecture. However, Function Compute instances run Linux on x86_64, and the distribution currently used is Debian 9, so a NumPy library installed on an M1 Mac cannot be used on Alibaba Cloud Function Compute. We recommend installing dependencies on a Debian 9 system, but users may not have such an environment locally. In that case, you can build the dependencies online or use the official Function Compute runtime image to build them; the details are not covered here.
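A quick way to spot this kind of mismatch is to read the platform tag at the end of the wheel name. The sketch below is a rough check written for this article, not part of pip or of Function Compute; the function names are hypothetical, and it only looks at the platform tag, so it does not verify the Python version, the ABI, or the glibc version that a manylinux build requires.

```python
def platform_tags(wheel_name):
    """Return the platform tags of a wheel name such as
    'numpy-1.23.3-cp39-cp39-macosx_11_0_arm64' (a '.whl' suffix is optional)."""
    stem = wheel_name[:-4] if wheel_name.endswith(".whl") else wheel_name
    # The last three dash-separated fields are the python, abi and platform tags.
    platform_field = stem.rsplit("-", 3)[-1]
    # A single wheel may list several platform tags separated by dots.
    return platform_field.split(".")


def runs_on_linux_x86_64(wheel_name):
    """True if any platform tag is pure Python ('any') or targets Linux x86_64."""
    for tag in platform_tags(wheel_name):
        if tag == "any":
            return True
        if "linux" in tag and tag.endswith("x86_64"):
            return True
    return False
```

With this check, the M1 build named above is rejected, while a name ending in manylinux2014_x86_64 or in none-any is accepted.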

Layers must include any additional shared dynamic libraries. Some dependency libraries require extra shared dynamic libraries to be installed, and these must also be included when building the layer's Zip package. For example, the Nodejs library Puppeteer requires more than 20 shared dynamic libraries (such as libxss1 and libnspr4), all of which must be packaged into the layer's Zip package, so installing Puppeteer successfully is not a simple matter. We recommend putting shared dynamic libraries in the lib directory of the Zip package; the Function Compute platform adds the /opt/lib directory to LD_LIBRARY_PATH (only for built-in runtimes).
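Because the platform sets LD_LIBRARY_PATH only for built-in runtimes, a custom runtime has to arrange this itself. The sketch below shows one possible approach in Python, assuming the shared libraries were packaged under lib/ and therefore end up in /opt/lib; the helper name with_layer_libs is hypothetical.

```python
import os


def with_layer_libs(env, lib_dir="/opt/lib"):
    """Return a copy of env whose LD_LIBRARY_PATH starts with lib_dir."""
    env = dict(env)
    parts = [p for p in env.get("LD_LIBRARY_PATH", "").split(":") if p]
    if lib_dir not in parts:
        parts.insert(0, lib_dir)
    env["LD_LIBRARY_PATH"] = ":".join(parts)
    return env


# Environment to hand to a child process that needs the layer's libraries.
child_env = with_layer_libs(os.environ)
```

The dynamic loader reads LD_LIBRARY_PATH when a process starts, so the adjusted environment is meant to be passed to a child process (for example through the env argument of subprocess.run) rather than applied to the process that is already running.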

(2) Unable to share across accounts
By default, a custom layer can only be shared between functions in the same account and the same region, and cannot be shared across accounts. A custom layer created by user A therefore cannot be used by user B. This forces users to repeat the same work and also prevents the same layer from being reused on the host.

Public layer

To address these pain points of custom layers, Alibaba Cloud Function Compute released the public layer feature in August 2022. It enables layer sharing across accounts and provides a set of official public layers [6] that users can use directly, which makes it easier to build sample prototypes quickly. Alibaba Cloud Function Compute mainly provides three types of official public layers:
• Custom runtimes (such as Python 3.10, Nodejs17, PHP 8.1, Java17, .NET 6, etc.)
• Common dependency libraries (such as PyTorch, Scipy, Puppeteer, etc.)
• Aliyun SDK (such as Aliyun DataX)
For details, please refer to the official document at the end of the article - Configure the official public layer in the function [4]. You can also contact us in the Q&A group (DingTalk group number: 11721331), or submit an issue directly on Github [5].

How to make custom layers public?

Currently, the ability to make a layer public is in internal testing. If you need it, you can contact us through DingTalk. We also welcome contributions of public layers to the repository, and we will soon provide contribution methods and examples there.

Example

For the latest versions and usage instructions of the official public layers, please refer to Github. Below we introduce some typical examples of using the official public layers.
Example 1. A sample web page screenshot program based on Nodejs16 + Puppeteer
Puppeteer is a Nodejs library that provides a high-level API to control Chrome (or Chromium) through the DevTools protocol. Put simply, it drives a headless Chrome browser and can be used to automate many tasks, such as:
• Generating screenshots or PDFs of web pages
• Submitting forms automatically, running automated UI tests, simulating keyboard input, etc.
• more...

This example uses Puppeteer to build a sample web page screenshot program.

First, we create a function named start-puppeteer using the built-in Nodejs16 runtime, and select "handle HTTP requests" as the request handler type.

Then, set the memory specification to 1GB in the advanced configuration; the sample program uses about 550MB of memory.

After the function is created, open the index.js file in the console, overwrite its contents with the following code, and click the deploy button.

The core logic of the code is as follows. First, the code parses the query parameter to obtain the URL of the page to capture (if parsing fails, the Serverless Devs official website homepage is used by default). It then uses Puppeteer to take a screenshot of the web page, saves it to the /tmp/example file of the running instance, and returns the file directly as the body of the HTTP response.

Finally, we need to configure the Puppeteer public layer: find the layer setting in the function configuration, click Edit, and choose to add the official public layer.

What are the best practices for layers?

The preceding sections explained what a custom layer is, why to use one, and what a public layer is, and introduced two examples based on official public layers. Some questions about using layers remain, though. In what scenarios are layers recommended? What is the difference between a layer and a code package? Is there any feature similar to layers, and how do layers compare with it? The following sections try to answer these questions.

In what scenarios is it recommended to use layers?

At present, layers are mainly used in two kinds of scenarios: custom runtimes, and dependency libraries for various languages. We strongly recommend building and using custom runtimes through layers. For language dependencies, consider the following suggestions:
• Use an official public layer first, if one is available
• Use layers to manage the dependency libraries of non-compiled languages; for compiled languages, decide case by case (for example, in a custom runtime, a Java program run from a JAR package cannot pull in dependencies from a layer; see the document How to reference the dependencies in the layer in Custom Runtime?)
• If the dependency library is large but does not exceed the layer size limit, use a layer
• If the dependency library requires additional shared dynamic libraries, use a layer (if the build is complicated, you can ask the Function Compute team to build it)
• If code or data needs to be shared between multiple functions or multiple accounts, use layers

What is the difference between a layer and a code package?

Intuitively, a layer is just part of the original code package split off and rebuilt as a separate package, so why introduce the concept of a layer at all? The main reason is that layers are designed differently from code packages.

• Layers have a cleaner versioning scheme
Layer versions are numbered automatically, starting from 1 and incrementing with each release. Currently, a layer supports a maximum of 100 available versions (excluding deleted versions). Code packages have no version concept of their own; versions exist only at the service level. By comparison, layer versioning is more fine-grained.
• Layer versions are read-only and immutable
The content of a layer version cannot be changed after it is created (except for its permissions). To modify the content of a layer, you must publish a new version. Because layer versions are read-only, changes to a layer cannot affect functions that already use an existing version.
• Layer sharing capabilities
Layers can be shared across functions and accounts; code packages cannot.
• Soft-delete policy for layer versions
Deleting a layer version does not affect the normal operation of functions that are already configured with that layer version. When a layer version is deleted, Alibaba Cloud Function Compute does not delete its code immediately. Instead, it first performs a soft delete, which prevents new functions from using the deleted layer version. The layer version is deleted completely only when no function references it any longer.
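The versioning and soft-delete behavior described above can be illustrated with a toy in-memory model. This is only a sketch of the described semantics; the class LayerStore and its methods are hypothetical and do not correspond to the Function Compute API or to its actual implementation.

```python
class LayerStore:
    """Toy model of read-only layer versions with soft delete."""

    def __init__(self):
        # version number -> {"content": bytes, "deleted": bool, "refs": set of function names}
        self._versions = {}
        self._next_version = 1

    def publish(self, content):
        """Create a new immutable version; numbers start at 1 and auto-increment."""
        version = self._next_version
        self._next_version += 1
        self._versions[version] = {
            "content": bytes(content),
            "deleted": False,
            "refs": set(),
        }
        return version

    def attach(self, version, function_name):
        """Configure a function with a layer version; soft-deleted versions are refused."""
        entry = self._versions[version]
        if entry["deleted"]:
            raise ValueError("a deleted layer version cannot be used by new functions")
        entry["refs"].add(function_name)

    def detach(self, version, function_name):
        """Remove a function's reference and purge the version if it is now unused."""
        self._versions[version]["refs"].discard(function_name)
        self._purge_if_unused(version)

    def delete(self, version):
        """Soft delete: existing references keep working, new ones are refused."""
        self._versions[version]["deleted"] = True
        self._purge_if_unused(version)

    def exists(self, version):
        return version in self._versions

    def _purge_if_unused(self, version):
        entry = self._versions[version]
        if entry["deleted"] and not entry["refs"]:
            del self._versions[version]
```

In this model, deleting a version that a function still references keeps its content available to that function, and the content is dropped only once the last reference is removed.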

Is there any feature similar to layers in Function Compute? What are the advantages and disadvantages of layers compared with it?

In Alibaba Cloud Function Compute, the features most similar to layers are "mount NAS file system" and "mount OSS object storage" in the service configuration. Layers and NAS/OSS mounts differ significantly in features and application scenarios.

To summarize briefly: if the size of the code or data exceeds the layer limit, we recommend mounting NAS/OSS; if the code or data is expected to change, or the data needs to be modified at runtime, we also recommend mounting NAS/OSS.

Conclusion

In Alibaba Cloud Function Compute, layers are positioned as immutable infrastructure, and the read-only nature of layer versions guarantees their consistency and reliability. This article first introduced the characteristics and difficulties of custom layers, then introduced the recently released public layer feature and described two sample programs based on official public layers, and finally discussed best practices for layers. We hope it helps readers better understand the concept of layers and their application scenarios.

The layer feature is still being improved, and we will focus on the following directions:
• Improve the official public layer experience by adding more commonly used dependency libraries and custom runtimes as official public layers, and by providing complete application examples.
• Provide methods and examples for contributing public layers, to promote open source co-construction of public layers.
