This blog explains how to compress, migrate, and compile LibreOffice to Alibaba Cloud Function Compute for a serverless file converter.
soffice --convert-to pdf --outdir /tmp /tmp/test.doc
- Recompile and tailor LibreOffice to match the built-in gcc and kernel versions of the FC nodejs8 runtime environment.
- Install the libnss3 dependency that is missing from the runtime.
- Download the package from OSS and extract it at runtime to work around the 50 MB limit for ZIP packages.
- The example project has been created to support one-click deployment for a quick experience.
This document focuses on the entire migration process. It also records the key steps so that similar conversion tools can be migrated to the Function Compute platform in the future. If you are interested in how to quickly build a cheap and scalable Word-to-PDF cloud service, see Launching a Word-to-PDF Cloud Service on Function Compute.
Preparation
- Docker-ce. For its installation method, see the official installation documentation.
- Fun, which is an orchestration tool used to quickly deploy functions on Function Compute.
brew tap vangie/formula
brew install fun
npm install @alicloud/fun -g
The command line tool for OSS is ossutil. Download and store the tool in the directory that $PATH points to.
Compile LibreOffice
We use the aliyunfc/runtime-nodejs8:build docker image provided by fc-docker to compile LibreOffice. fc-docker provides a range of docker images whose runtime environments are very similar to the actual Function Compute environments. Because we will run LibreOffice in the nodejs8 environment, aliyunfc/runtime-nodejs8:build is used in this case. The image with the build tag includes more basic packages than the other images.
Start a Compilation Environment
docker run --name libre-builder --rm -v $(pwd):/code -d -t --cap-add=SYS_PTRACE --security-opt seccomp=unconfined aliyunfc/runtime-nodejs8:build bash
In this case, a container named libre-builder is started, and the current directory is mounted to the /code directory of the container's file system. The additional parameters --cap-add=SYS_PTRACE --security-opt seccomp=unconfined are required for C++ program compilation. If they are missing, you will be prompted with a warning. Here, -d runs the container in the background as a daemon, and -t allocates a tty. The bash command prevents the container from exiting. --rm indicates that the container is automatically deleted once it stops.
Install the Compilation Tool
apt-get install -y ccache
apt-get build-dep -y libreoffice
ccache is a compiler cache that accelerates repeated GCC compilations of the same program. Although the initial compilation takes a relatively long time, ccache significantly speeds up subsequent ones.
The build-dep subcommand of apt-get sets up the build environment for the software to be compiled. Specifically, it installs all the required tools and packages.
Clone the Source Code
git clone --depth=1 git://anongit.freedesktop.org/libreoffice/core libreoffice
cd libreoffice
Add the --depth=1 parameter because a full clone of a project as large as LibreOffice is time-consuming, and the Git commit history is not needed for compilation.
Configure and Compile
# When compiling the program multiple times, a larger cache speeds up subsequent compilations.
ccache --max-size 16G && ccache -s
# The most important part. Run ./autogen.sh --help to see what each option means
./autogen.sh --disable-report-builder --disable-lpsolve --disable-coinmp \
--enable-mergelibs --disable-odk --disable-gtk --disable-cairo-canvas \
--disable-dbus --disable-sdremote --disable-sdremote-bluetooth --disable-gio --disable-randr \
--disable-gstreamer-1-0 --disable-cve-tests --disable-cups --disable-extension-update \
--disable-postgresql-sdbc --disable-lotuswordpro --disable-firebird-sdbc --disable-scripting-beanshell \
--disable-scripting-javascript --disable-largefile --without-helppack-integration \
--without-system-dicts --without-java --disable-gtk3 --disable-dconf --disable-gstreamer-0-10 \
--disable-firebird-sdbc --without-fonts --without-junit --with-theme="no" --disable-evolution2 \
--disable-avahi --without-myspell-dicts --with-galleries="no" \
--disable-kde4 --with-system-expat --with-system-libxml --with-system-nss \
--disable-introspection --without-krb5 --disable-python --disable-pch \
--with-system-openssl --with-system-curl --disable-ooenv --disable-dependency-tracking
make
The compilation result is stored in the ./instdir/ directory.
Reduce the Size
# this will remove ~100 MB of symbols from shared objects
strip ./instdir/**/*
# remove unneeded stuff for headless mode
rm -rf ./instdir/share/gallery \
./instdir/share/config/images_*.zip \
./instdir/readmes \
./instdir/CREDITS.fodt \
./instdir/LICENSE* \
./instdir/NOTICE
Verification
echo "hello world" > a.txt
./instdir/program/soffice --headless --invisible --nodefault --nofirststartwizard \
--nolockcheck --nologo --norestore --convert-to pdf --outdir $(pwd) a.txt
Package
# archive
tar -zcvf lo.tar.gz instdir
docker cp libre-builder:/code/libreoffice/lo.tar.gz ./lo.tar.gz
Gzip vs Zopfli vs Brotli
File | Algorithm | MiB | Compression Ratio | Decompression Duration |
---|---|---|---|---|
chromium | - | 130.62 | - | - |
chromium.gz | Gzip | 44.13 | 66.22% | 0.968s |
chromium.gz | Zopfli | 43.00 | 67.08% | 0.935s |
chromium.br | Brotli | 33.21 | 74.58% | 0.712s |
From the preceding table, we can see that Brotli achieves both the highest compression ratio and the shortest decompression time.
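The compression ratio column can be recomputed from the sizes in the table. The following is a minimal sketch, assuming the ratio is defined as the space saved relative to the uncompressed size. The function name compressionRatio is hypothetical. Because the MiB values in the table are already rounded, the recomputed figures may differ from the table in the last digit.

```javascript
// Space saving as a percentage: (1 - compressed / original) * 100.
// Assumption: this is the definition behind the "Compression Ratio" column.
function compressionRatio(originalMiB, compressedMiB) {
  return (1 - compressedMiB / originalMiB) * 100;
}

const original = 130.62; // uncompressed chromium size in MiB, from the table
const sizes = { Gzip: 44.13, Zopfli: 43.0, Brotli: 33.21 };

// Print one line per algorithm, rounded to two decimals.
for (const [algorithm, size] of Object.entries(sizes)) {
  console.log(`${algorithm}: ${compressionRatio(original, size).toFixed(2)}%`);
}
```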
Because aliyunfc/runtime-nodejs8:build is based on Debian Jessie, on which Brotli is difficult to install, we use an Ubuntu container to convert the tar.gz file to a tar.br file.
docker run --name brotli-util --rm -v $(pwd):/root -w /root -d -t ubuntu:18.04 bash
docker exec -t brotli-util apt-get update
docker exec -t brotli-util apt-get install -y brotli
docker exec -t brotli-util gzip -d lo.tar.gz
docker exec -t brotli-util brotli -q 11 -j -f lo.tar
In the current directory, a lo.tar.br file is generated.
Install Dependencies
Use the @shelf/aws-lambda-brotli-unpacker npm package to unpack tar.br files, and run apt-get to install the libnss3 dependency. Start a nodejs8 container to ensure that the dependency installation and runtime environments are consistent.
docker run --rm --name libreoffice-builder -t -d -v $(pwd):/code --entrypoint /bin/sh aliyunfc/runtime-nodejs8
@shelf/aws-lambda-brotli-unpacker has a native binding, so running npm install on macOS and then packaging and uploading the result does not work. Run npm install inside the container instead.
docker exec -t libreoffice-builder npm install
docker exec -t libreoffice-builder apt-get install -y -d -o=dir::cache=/code libnss3
docker exec -t libreoffice-builder bash -c 'for f in $(ls /code/archives/*.deb); do dpkg -x $f $(pwd) ; done;'
docker exec -t libreoffice-builder bash -c "rm -rf /code/archives/; mkdir -p /code/lib;cd /code/lib; find ../usr/lib -type f \( -name '*.so' -o -name '*.chk' \) -exec ln -sf {} . \;"
Download and Decompress the tar.br Package
Upload lo.tar.br to OSS with ossutil:
ossutil cp $SCRIPT_DIR/../node_modules/fc-libreoffice/bin/lo.tar.br oss://${OSS_BUCKET}/lo.tar.br \
  -i ${ALIBABA_CLOUD_ACCESS_KEY_ID} -k ${ALIBABA_CLOUD_ACCESS_KEY_SECRET} -e oss-${ALIBABA_CLOUD_DEFAULT_REGION}.aliyuncs.com -f
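const fs = require('fs');
const co = require('co');
const OSS = require('ali-oss');

// Assumption: the archive is stored under /tmp, which is writable at runtime.
const binPath = '/tmp/lo.tar.br';
let store;

// Download lo.tar.br from OSS in the initializer, before the handler is invoked.
module.exports.initializer = (context, callback) => {
  store = new OSS({
    region: `oss-${process.env.ALIBABA_CLOUD_DEFAULT_REGION}`,
    bucket: process.env.OSS_BUCKET,
    accessKeyId: context.credentials.accessKeyId,
    accessKeySecret: context.credentials.accessKeySecret,
    stsToken: context.credentials.securityToken,
    internal: process.env.OSS_INTERNAL === 'true'
  });
  // Skip the download if the archive is already on the local disk.
  if (fs.existsSync(binPath) === true) {
    callback(null, "already downloaded.");
    return;
  }
  co(store.get('lo.tar.br', binPath)).then(function (val) {
    callback(null, val)
  }).catch(function (err) {
    callback(err)
  });
};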
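The "download only if missing" check in the initializer can be isolated into a small helper. The following is a minimal sketch, not the project's actual code. The name ensureDownloaded and the download callback are hypothetical, and the callback stands in for any function that fetches the archive (for example, a wrapper around the OSS client) and returns a Promise.

```javascript
const fs = require('fs');

// Hypothetical helper: call `download(targetPath)` only when the file
// does not exist yet, so that later initializations can reuse the local copy.
async function ensureDownloaded(targetPath, download) {
  if (fs.existsSync(targetPath)) {
    return 'already downloaded.';
  }
  await download(targetPath);
  return 'downloaded.';
}

module.exports = { ensureDownloaded };
```

Keeping the check separate from the OSS call makes it possible to test the caching logic locally with a fake download function.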
Use the @shelf/aws-lambda-brotli-unpacker npm package to decompress the lo.tar.br package.
const {unpack} = require('@shelf/aws-lambda-brotli-unpacker');
const {execSync} = require('child_process');
const path = require('path');

// Path to the lo.tar.br archive and to the soffice binary after unpacking.
const inputPath = path.join(__dirname, '..', 'bin', 'lo.tar.br');
const outputPath = '/tmp/instdir/program/soffice';

module.exports.handler = async event => {
  // Unpack the archive, then convert the document to PDF.
  await unpack({inputPath, outputPath});
  execSync(`${outputPath} --convert-to pdf --outdir /tmp /tmp/example.docx`);
};
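The conversion command in the handler can also be assembled as an argument list, which avoids shell quoting problems when file names contain spaces. The following is a minimal sketch under stated assumptions: the helper names buildConvertArgs, pdfPathFor, and convert are hypothetical, and the headless flags are the ones used in the verification step earlier in this article.

```javascript
const { execFileSync } = require('child_process');
const path = require('path');

// Flags taken from the verification command used after compilation.
const HEADLESS_FLAGS = [
  '--headless', '--invisible', '--nodefault', '--nofirststartwizard',
  '--nolockcheck', '--nologo', '--norestore'
];

// Hypothetical helper: argument list for converting one file.
function buildConvertArgs(inputFile, outDir, format = 'pdf') {
  return [...HEADLESS_FLAGS, '--convert-to', format, '--outdir', outDir, inputFile];
}

// Hypothetical helper: soffice writes <basename>.pdf into the output directory.
function pdfPathFor(inputFile, outDir) {
  const base = path.basename(inputFile, path.extname(inputFile));
  return path.join(outDir, `${base}.pdf`);
}

// Hypothetical helper: run soffice without a shell and return the PDF path.
function convert(sofficePath, inputFile, outDir) {
  execFileSync(sofficePath, buildConvertArgs(inputFile, outDir));
  return pdfPathFor(inputFile, outDir);
}
```

For example, convert('/tmp/instdir/program/soffice', '/tmp/example.docx', '/tmp') would run the same conversion as the handler above, with the headless flags added.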
Deploy Functions on Fun
The function is described in a template.yml file and deployed with the fun deploy command.
ROSTemplateFormatVersion: '2015-09-01'
Transform: 'Aliyun::Serverless-2018-04-03'
Resources:
  libre-svc: # service name
    Type: 'Aliyun::Serverless::Service'
    Properties:
      Description: 'fc test'
      Policies:
        - AliyunOSSFullAccess
    libre-fun: # function name
      Type: 'Aliyun::Serverless::Function'
      Properties:
        Handler: index.handler
        Initializer: index.initializer
        Runtime: nodejs8
        CodeUri: './'
        Timeout: 60
        MemorySize: 640
        EnvironmentVariables:
          ALIBABA_CLOUD_DEFAULT_REGION: ${ALIBABA_CLOUD_DEFAULT_REGION}
          OSS_BUCKET: ${OSS_BUCKET}
          OSS_INTERNAL: 'true'
In actual scenarios, it is inappropriate to hard-code keys and environment-specific variables in the template.yml file. To separate the code from the settings, the variable placeholders ${ALIBABA_CLOUD_DEFAULT_REGION} and ${OSS_BUCKET} are used in the example, and they are filled in from the environment before deployment.
SCRIPT_DIR=`dirname -- "$0"`
source $SCRIPT_DIR/../.env
export ALIBABA_CLOUD_DEFAULT_REGION OSS_BUCKET
envsubst < $SCRIPT_DIR/../template.yml.tpl > $SCRIPT_DIR/../template.yml
cd $SCRIPT_DIR/../
All the preceding settings are written to the .env file. dotenv is a common community solution, which is supported by various tools.
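To make the placeholder mechanism concrete, the following sketch mimics in JavaScript what the envsubst step does with values read from a .env file. It is an illustration only: parseDotenv and substitute are hypothetical helpers, the parser handles only plain KEY=VALUE lines, and the bucket and region values in the usage example are made up.

```javascript
// Minimal .env parser: KEY=VALUE lines; blank lines and # comments are ignored.
// Assumption: no quoting or multi-line values, unlike full dotenv implementations.
function parseDotenv(text) {
  const vars = {};
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === '' || trimmed.startsWith('#')) continue;
    const eq = trimmed.indexOf('=');
    if (eq === -1) continue;
    vars[trimmed.slice(0, eq).trim()] = trimmed.slice(eq + 1).trim();
  }
  return vars;
}

// Replace ${NAME} placeholders with values from `vars`.
// Unknown names become an empty string, as envsubst does for unset variables.
function substitute(template, vars) {
  return template.replace(/\$\{(\w+)\}/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(vars, name) ? vars[name] : '');
}

// Usage with made-up values.
const env = parseDotenv('OSS_BUCKET=my-bucket\nALIBABA_CLOUD_DEFAULT_REGION=cn-shanghai\n');
console.log(substitute('OSS_BUCKET: ${OSS_BUCKET}', env));
```

Rendering the template at deploy time keeps template.yml.tpl free of environment-specific values, so the same template can be reused across regions and buckets.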
Conclusion
This document explains how to compile LibreOffice, which is the most challenging part of the migration. The example also installs an npm package with a native binding and installs apt-get packages into a local directory, so it can serve as a reference for handling dependencies on Function Compute in general. The steps in this document depend heavily on the fc-docker images for both compilation and dependency installation. These images remove the environmental differences, which greatly reduces the difficulty of migration.
Loading large files at runtime is another common Function Compute problem. In the conversion tool scenario, the binary program is usually a large file. In the machine learning scenario, the trained model file is normally large. In both scenarios, you can store the package in OSS, then download and decompress it at runtime. Given that Function Compute now supports NAS, mounting shared online storage through NAS is also an option.
For the full source code used in this document, refer to the fc-libreoffice project.