Understand WebAssembly in One Article

By Xulun (from F(x) Team)

As frontend pages grow more complex, JavaScript's performance has been criticized repeatedly. JavaScript was not designed with performance as its primary goal, which is why running near-native code in the browser has become attractive.

From a compatibility perspective, Can I use data puts WebAssembly coverage at 94.7%, about the same as support for JavaScript's await. Browsers released since 2017 support it, so it has been available for about five years. All mainstream browsers support it, including Chrome, Chrome for Android, Android Browser, Safari, Safari on iOS, Edge, Firefox, and Opera.

Since WebAssembly is now practical to ship, let's begin our journey into it.

Getting Started on WebAssembly

WebAssembly is an assembly language defined for an abstract machine. The browser is responsible for compiling it into native code, and this work is still done by the JavaScript engine (such as V8). Because the machine is abstract, the code runs across platforms. We can run it with the interpreter provided by the toolchain or with a local Node.js.

Unlike x86 assembly, which is a flat list of imperative instructions, the WebAssembly text format is written as S-expressions, similar to Lisp. An S-expression is a nested list enclosed in parentheses.

Look at a simple example:

(module
    (func (result i32)
        (i32.const 666)
    )
    (export "const_i32" (func 0))
)

The outermost parentheses define a module; WebAssembly code is organized into modules. Functions are defined with func inside the module. The function here has no name. To make it callable from outside, bind it to a name with export; (func 0) refers to the function by its index.

WebAssembly Binary Toolchain

Turning assembly language into machine instructions requires an assembler. Other binary tools, such as objdump and a disassembler, are needed as well.

The WebAssembly Community Group provides the WebAssembly Binary Toolkit (wabt).

wabt is a project that uses Git submodules. You can download the code like this:

git clone --recursive https://github.com/WebAssembly/wabt
cd wabt
git submodule update --init

Then, cmake can be used to compile:

$ mkdir build
$ cd build
$ cmake ..
$ cmake --build .

After a successful build, the tools (such as the assembler wat2wasm) are ready to use.

For example, if the example in the previous section is saved as test001.wat, it can be compiled like this:

wabt/build/wat2wasm test001.wat

When the compilation succeeds, test001.wasm is generated.

We can use wasm-objdump to view the contents of wasm:

wabt/build/wasm-objdump -s -d -x test001.wasm

-d disassembles the function bodies, -x shows the section details, and -s shows the raw contents of each section.

The following is the output:

test001.wasm:    file format wasm 0x1

Section Details:

Type[1]:
 - type[0]() -> i32
Function[1]:
 - func[0] sig=0 <const_i32>
Export[1]:
 - func[0] <const_i32> -> "const_i32"
Code[1]:
 - func[0] size=5 <const_i32>

Code Disassembly:

000026 func[0] <const_i32>:
 000027: 41 9a 05                   | i32.const 666
 00002a: 0b                         | end

Contents of section Type:
000000a: 0160 0001 7f                             .`...

Contents of section Function:
0000011: 0100                                     ..

Contents of section Export:
0000015: 0109 636f 6e73 745f 6933 3200 00         ..const_i32..

Contents of section Code:
0000024: 0105 0041 9a05 0b                        ...A...
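
To connect the dump to the binary format, here is a minimal sketch in JavaScript that rebuilds test001.wasm by hand from the section contents shown above and runs it. The helper names sleb128 and section and the variable names are made up for this illustration. The section ids (1, 3, 7, 10) and the 8-byte header come from the WebAssembly binary format rather than from the dump itself.

```javascript
// Encode a signed integer as LEB128, the variable-length format used for
// the i32.const immediate. (Hypothetical helper, not part of any library.)
function sleb128(value) {
  const out = [];
  while (true) {
    const byte = value & 0x7f;
    value >>= 7; // arithmetic shift keeps the sign
    const signBitClear = (byte & 0x40) === 0;
    if ((value === 0 && signBitClear) || (value === -1 && !signBitClear)) {
      out.push(byte);
      return out;
    }
    out.push(byte | 0x80); // continuation bit: more bytes follow
  }
}

// A section is: id byte, payload size, payload. All sizes here fit in one byte.
function section(id, payload) {
  return [id, payload.length, ...payload];
}

const exportName = Array.from("const_i32", (c) => c.charCodeAt(0));
const encoded666 = sleb128(666); // [0x9a, 0x05], as in the disassembly
const body = [0x00, 0x41, ...encoded666, 0x0b]; // no locals, i32.const 666, end

const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic "\0asm"
  0x01, 0x00, 0x00, 0x00, // version 1
  ...section(1, [0x01, 0x60, 0x00, 0x01, 0x7f]), // Type: () -> i32
  ...section(3, [0x01, 0x00]), // Function: func[0] uses type 0
  ...section(7, [0x01, exportName.length, ...exportName, 0x00, 0x00]), // Export
  ...section(10, [0x01, body.length, ...body]), // Code: one function body
]);

const instance = new WebAssembly.Instance(new WebAssembly.Module(bytes));
console.log(instance.exports.const_i32()); // 666
```

The synchronous WebAssembly.Module and WebAssembly.Instance constructors are used here only to keep the sketch short; for files read from disk or the network, the asynchronous calls shown in the next section are the usual choice.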

Run Wasm Code

Run through Wasm Interpreter

wabt provides a wasm interpreter, wasm-interp. It can run all exported functions, for example:

wabt/build/wasm-interp --run-all-exports ./test001.wasm

The following is the output:

const_i32() => i32:666

Run in a Node.js Environment

Since WebAssembly is a standard, Node.js supports it without any third-party packages.

Running a function in WebAssembly only takes three steps:

Compile the wasm bytes in a buffer with WebAssembly.compile
Create an instance of the module with WebAssembly.instantiate
Call the function through the instance's exports

We only need to use the fs API to read the file and run it directly in the Node environment:

const {readFileSync} = require('fs')

const outputWasm = './test001.wasm';

async function run(){
    const buffer = readFileSync(outputWasm);
    const module = await WebAssembly.compile(buffer);
    const instance = await WebAssembly.instantiate(module);
    console.log(instance.exports.const_i32());
}

run();

Run in the Browser

Because WebAssembly is a standard, the API in the browser is the same as in Node.js. The only difference is how the wasm file is read.

Start a local server and use the fetch function to obtain the local wasm file:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <script type="text/javascript">
      function fetchAndInstantiate(url) {
        return fetch(url)
          .then((response) => response.arrayBuffer())
          .then((bytes) => WebAssembly.instantiate(bytes))
          .then((results) => results.instance);
      }

      function test001() {
        fetchAndInstantiate("./test001.wasm").then((instance) => {
          alert(instance.exports.const_i32());
        });
      }
    </script>
  </head>
  <body onload="test001()">
    <p>WebAssembly test page</p>
  </body>
</html>

When the page loads, an alert dialog shows the value 666.

Disassembling Wasm

Since wasm is an assembly language, it is easy to disassemble.

Wabt provides the wasm2wat tool to disassemble the wasm file:

wabt/build/wasm2wat test001.wasm

Look at the results of the disassembly:

(module
  (type (;0;) (func (result i32)))
  (func (;0;) (type 0) (result i32)
    i32.const 666)
  (export "const_i32" (func 0)))

The disassembly contains one thing we did not write: the type definition. The assembler can infer this type, so there is no need to write it by hand.

Function Parameter Transfer

The disassembly shows that functions have no names; a name is bound only when a function is exported for external callers. Function parameters have no names either, but we can give them one when writing the assembly. In wat, function parameters are declared with the param keyword and can be given a name starting with $.

Look at an example:

(module
    (func (param $a i32) (result i32)
        (i32.add 
            (local.get $a) 
            (i32.const 1)
        )
    )
    (export "inc_i32" (func 0))
)

The following is the result of disassembling with wasm2wat:

(module
  (type (;0;) (func (param i32) (result i32)))
  (func (;0;) (type 0) (param i32) (result i32)
    local.get 0
    i32.const 1
    i32.add)
  (export "inc_i32" (func 0)))

The name $a has disappeared. It is gone from the declaration, and local.get now refers to the parameter by its index, 0.

The order of the instructions has changed as well. We wrote the expression in prefix (Polish) notation: the i32.add instruction comes first and its two operands follow. The disassembly uses postfix (reverse Polish) notation: the two operands come first and the i32.add instruction follows them.
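
The postfix order is the order in which a stack machine consumes the instructions. As a rough illustration in JavaScript (a toy evaluator with made-up names, not how a real engine works), each instruction pops its operands from a stack and pushes its result:

```javascript
// Toy evaluator for the three instructions in the disassembly above.
// `run` and the [opcode, argument] tuples are illustrative assumptions only.
function run(instructions, locals) {
  const stack = [];
  for (const [op, arg] of instructions) {
    switch (op) {
      case "local.get":
        stack.push(locals[arg]);
        break;
      case "i32.const":
        stack.push(arg | 0);
        break;
      case "i32.add": {
        const b = stack.pop();
        const a = stack.pop();
        stack.push((a + b) | 0); // wrap to 32 bits like i32.add
        break;
      }
      default:
        throw new Error(`unsupported instruction: ${op}`);
    }
  }
  return stack.pop();
}

const incI32 = [["local.get", 0], ["i32.const", 1], ["i32.add"]];
console.log(run(incI32, [41])); // 42
console.log(run(incI32, [0x7fffffff])); // -2147483648, the addition wraps
```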

Arithmetic Operation

WebAssembly, hereinafter referred to as wasm, has four types of numbers:

i32: 32-bit integer
i64: 64-bit integer
f32: 32-bit floating-point number
f64: 64-bit floating-point number

Each of the four types has its own complete set of instructions.

There is no separate unsigned integer type. Where signedness matters, the integer types have separate instructions for signed and unsigned calculation.

Corresponding to the four basic number types, there are four instructions that push a constant of that type onto the stack: i32.const, i64.const, f32.const, and f64.const.

Addition, Subtraction, and Multiplication

There are four types of addition, one for each type:

i32.add
i64.add
f32.add
f64.add

There are also four types of subtraction, one for each type:

i32.sub
i64.sub
f32.sub
f64.sub

The same goes for multiplication:

i32.mul
i64.mul
f32.mul
f64.mul

Look at an example of f32 multiplication:

(module
    (func (param $a f32) (result f32)
        (f32.mul 
            (local.get $a) 
            (f32.const 1024)
        )
    )
    (export "mul_1k_f32" (func 0))
)

Write a Node.js script and run it:

const {readFileSync} = require('fs')

const outputWasm = './test003.wasm';

async function run(){
    const buffer = readFileSync(outputWasm);
    const module = await WebAssembly.compile(buffer);
    const instance = await WebAssembly.instantiate(module);
    console.log(instance.exports.mul_1k_f32(3.14));
}

run();
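
Note that the result will not print as exactly 3215.36. JavaScript numbers are 64-bit floats, so the argument is rounded to the nearest f32 on the way in. A small sketch using the standard Math.fround shows the same effect without the wasm module (the variable names are only for illustration):

```javascript
// Math.fround rounds a JavaScript number to the nearest 32-bit float,
// which is what happens to an argument passed to an f32 parameter.
const asF32 = Math.fround(3.14);

// Multiplying by 1024 (a power of two) only changes the exponent,
// so the product carries the same rounding error as the input.
const product = Math.fround(asF32 * 1024);

console.log(asF32 === 3.14); // false: 3.14 is not exactly representable in f32
console.log(product); // close to, but not exactly, 3215.36
```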

Division

Division is simple for floating-point numbers, with a single div instruction for each type:

f32.div
f64.div

For integers, division is split into signed and unsigned division:

i32.div_s
i64.div_s
i32.div_u
i64.div_u

In addition, there are two instructions for the signed remainder and unsigned remainder:

i32.rem_s
i64.rem_s
i32.rem_u
i64.rem_u

Let's take the 64-bit remainder as an example:

(module
    (func (param $a i64) (param $b i64) (result i64)
        (i64.rem_u 
            (local.get $a) 
            (local.get $b)
        )
    )
    (export "rem_u_i64" (func 0))
)

When calling a function with i64 parameters from Node.js, pass BigInt values, which are written with an n suffix:

const {readFileSync} = require('fs')

const outputWasm = './test_remu.wasm';

async function run(){
    const buffer = readFileSync(outputWasm);
    const module = await WebAssembly.compile(buffer);
    const instance = await WebAssembly.instantiate(module);
    console.log(instance.exports.rem_u_i64(1000n,256n));
}

run();
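
The _u suffix matters once the top bit of an operand is set. As a sketch of the difference, the code below emulates i64.rem_u and i64.rem_s in plain JavaScript with the standard BigInt.asUintN and BigInt.asIntN. The function names remU64 and remS64 are invented for this example.

```javascript
// Emulate i64.rem_u: reinterpret both operands as unsigned 64-bit values.
function remU64(a, b) {
  return BigInt.asUintN(64, a) % BigInt.asUintN(64, b);
}

// Emulate i64.rem_s: operands are signed; the result takes the sign of the dividend.
function remS64(a, b) {
  return BigInt.asIntN(64, a) % BigInt.asIntN(64, b);
}

console.log(remU64(1000n, 256n)); // 232n
console.log(remU64(-1n, 256n)); // 255n, because -1 is 2^64 - 1 when unsigned
console.log(remS64(-1n, 256n)); // -1n
```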

Floating-Point-Specific Instructions

Absolute Value
- f32.abs
- f64.abs
Negation
- f32.neg
- f64.neg
Rounding to an Integer
- Smallest integer greater than or equal to a number: f32.ceil, f64.ceil
- Largest integer less than or equal to a number: f32.floor, f64.floor
- Round toward 0: f32.trunc, f64.trunc
- Round to the nearest integer: f32.nearest, f64.nearest
Square Root
- f32.sqrt
- f64.sqrt
Maximum and Minimum
- f32.min
- f64.min
- f32.max
- f64.max
Copy Sign
- f32.copysign
- f64.copysign

Let's start with the unfamiliar copysign. It replaces the sign of the current number with the sign of another number.

(module
    (func (param $a f64) (result f64)
        (f64.copysign
            (local.get $a) 
            (f64.const -1.0) 
        )
    )
    (export "copysign_f64" (func 0))
)

The copysign_f64 function copies the sign of -1.0 onto its 64-bit floating-point argument.

Try 3.14:

const {readFileSync} = require('fs')

const outputWasm = './copysign.wasm';

async function run(){
    const buffer = readFileSync(outputWasm);
    const module = await WebAssembly.compile(buffer);
    const instance = await WebAssembly.instantiate(module);
    console.log(instance.exports.copysign_f64(3.14));
}

run();

The result is -3.14: the sign of 3.14 has been replaced by the sign of -1.0.
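
To see why this is a pure bit operation, here is a minimal JavaScript sketch that emulates f64.copysign by copying only the sign bit between two doubles. The function name copysignF64 is an assumption made for this example; it is not a built-in.

```javascript
// Emulate f64.copysign: keep the magnitude of `value`, take bit 63 from `sign`.
function copysignF64(value, sign) {
  const view = new DataView(new ArrayBuffer(16));
  view.setFloat64(0, value); // big-endian by default: byte 0 holds the sign bit
  view.setFloat64(8, sign);
  const high = (view.getUint8(0) & 0x7f) | (view.getUint8(8) & 0x80);
  view.setUint8(0, high);
  return view.getFloat64(0);
}

console.log(copysignF64(3.14, -1.0)); // -3.14
console.log(copysignF64(-2.5, 1.0)); // 2.5
console.log(Object.is(copysignF64(0, -1.0), -0)); // true: even zero gets the sign
```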

Comparison Instructions

Equal to 0

Only the integer types have an instruction that tests for 0, one for each of the two types:

i32.eqz
i64.eqz

Look at an example:

(module
    (func (param $a i32) (result i32)
        (i32.eqz 
            (local.get $a)
        )
    )
    (export "i32_eqz" (func 0))
)

Run it:

const {readFileSync} = require('fs')

const outputWasm = './cmp.wasm';

async function run(){
    const buffer = readFileSync(outputWasm);
    const module = await WebAssembly.compile(buffer);
    const instance = await WebAssembly.instantiate(module);
    console.log(instance.exports.i32_eqz(0));
    console.log(instance.exports.i32_eqz(-1));
}

run();

The output is 1, 0.

Equal and Unequal

Four equality instructions:

i32.eq
i64.eq
f32.eq
f64.eq

Four inequality instructions:

i32.ne
i64.ne
f32.ne
f64.ne

Less Than and Greater Than

Floating-point numbers are simple. Less than is lt and greater than is gt:

f32.lt
f32.gt
f64.lt
f64.gt

For integers, the comparisons are again split into signed and unsigned:

i32.lt_s
i32.lt_u
i64.lt_s
i64.lt_u
i32.gt_s
i32.gt_u
i64.gt_s
i64.gt_u

For less than or equal to, replace lt with le. For greater than or equal to, replace gt with ge.
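
The signed and unsigned variants give different answers for the same bits. A short JavaScript sketch of i32.lt_s and i32.lt_u, using the | 0 and >>> 0 operators to reinterpret a value as signed or unsigned 32-bit (the names ltS and ltU are invented here):

```javascript
// i32.lt_s: compare as signed 32-bit integers (| 0 reinterprets as signed).
const ltS = (a, b) => ((a | 0) < (b | 0) ? 1 : 0);

// i32.lt_u: compare as unsigned 32-bit integers (>>> 0 reinterprets as unsigned).
const ltU = (a, b) => ((a >>> 0) < (b >>> 0) ? 1 : 0);

console.log(ltS(-1, 1)); // 1
console.log(ltU(-1, 1)); // 0, because -1 is 0xffffffff when unsigned
```

Like the wasm instructions, both helpers return 1 or 0 rather than a boolean.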

Control Flow Instructions

Function Calls

Although a function has no name, it can be given an identifier in the text format and invoked with the call instruction.

Let's wrap the i32_eqz function from the preceding section in i32_eqz2:

(module
    (func $f1 (param $a i32) (result i32)
        (i32.eqz 
            (local.get $a)
        )
    )
    (func (param $b i32) (result i32)
        (call $f1 (local.get $b))
    )
    (export "i32_eqz2" (func 1))
)

Look at the result of the disassembly. The identifier has been translated into the index of the function:

(module
  (type (;0;) (func (param i32) (result i32)))
  (func (;0;) (type 0) (param i32) (result i32)
    local.get 0
    i32.eqz)
  (func (;1;) (type 0) (param i32) (result i32)
    local.get 0
    call 0)
  (export "i32_eqz2" (func 1)))

Conditional Branches

Wasm provides an if instruction, which reads an integer of type i32 from the top of the stack. If the value is not 0, the then code is executed. Otherwise, the else code is executed.

Let's take an example and rewrite i32_eqz:

(module
    (func (param $a i32)(result i32)
        (local.get $a)
        (if (result i32)
            (then (i32.const 0))
            (else (i32.const 1))
        ) 
    )
    (export "i32_eqz3" (func 0))
)

Like a function, if can produce a value. When it does, both then and else are required.

The condition is not an immediate operand of if. It must be placed on the stack beforehand; otherwise an error is reported.

Look at the result of the disassembly. In this flat form, then is not a keyword, and the structure is written as if...else...end:

(module
  (type (;0;) (func (param i32) (result i32)))
  (func (;0;) (type 0) (param i32) (result i32)
    local.get 0
    if (result i32)  ;; label = @1
      i32.const 0
    else
      i32.const 1
    end)
  (export "i32_eqz3" (func 0)))

We can write this flat form by hand as well. Make sure not to wrap the if in parentheses; a parenthesized (if ...) is expected to contain (then ...) and (else ...) blocks:

(module
    (func (param $a i32)(result i32)
        (local.get $a)
        if (result i32)
            i32.const 0
        else
            i32.const 1
        end 
    )
    (export "i32_eqz4" (func 0))
)

Loop

The loop instruction is used for looping. A br instruction that targets the loop jumps back to its start, which is equivalent to the continue statement in C.

The br_if instruction is the conditional form: it branches only if the value on top of the stack is not 0. When execution reaches the end of the loop body without branching, the loop exits.

Loops read most naturally in an imperative style, so this time I will write the assembly as a flat sequence of instructions:

(module
    (func (param $a i32)(result i32)
        (local $sum i32)
        (local.set $sum (i32.const 0))
        loop
            local.get $a
            i32.const -1
            i32.add
            local.set $a
            local.get $a
            local.get $sum
            i32.add
            local.set $sum
            local.get $a
            br_if 0
        end
        (return (local.get $sum))
    )
    (export "i32_sum" (func 0))
)

The local keyword declares a local variable. The local.set instruction assigns a value to a local variable. The return instruction returns from the function.
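
For comparison, the same loop can be sketched in JavaScript as a do...while, since br_if 0 at the end of the body jumps back to the top while $a is not 0. The function name i32Sum is only for illustration, and | 0 stands in for 32-bit wrap-around.

```javascript
// JavaScript rendering of the i32_sum loop above.
function i32Sum(a) {
  let sum = 0;
  do {
    a = (a - 1) | 0; // local.get $a, i32.const -1, i32.add, local.set $a
    sum = (sum + a) | 0; // local.get $a, local.get $sum, i32.add, local.set $sum
  } while (a !== 0); // local.get $a, br_if 0
  return sum;
}

console.log(i32Sum(5)); // 10 = 4 + 3 + 2 + 1 + 0
console.log(i32Sum(1)); // 0
```

Because the counter is decremented before it is tested, an argument of 0 does not stop immediately: the counter goes to -1 and keeps looping until it wraps all the way around to 0, so avoid that input when experimenting.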

SIMD Instructions

Although WebAssembly is an abstract machine, it still needs to expose the capabilities of the hardware. SIMD is short for Single Instruction, Multiple Data. On Intel CPUs, it dates back to the MMX instruction set of 1996, which can use a 64-bit register as two 32-bit values or eight 8-bit values. In 1999, the SSE instruction set, with 128-bit registers, was introduced on the Pentium III processor. In 2008, Intel announced the AVX instruction set, with 256-bit registers, which shipped in the second-generation Core processors (Sandy Bridge). In 2013, Intel announced the AVX-512 instruction set, with 512-bit registers. AVX-512 increases power consumption, and Linus Torvalds commented, “I hope AVX-512 dies a painful death.” However, I am getting off topic. WebAssembly also supports a 128-bit SIMD instruction set, called the vector instructions.

Look at an example to get a better understanding:

(module
    (func (result i32)
        v128.const i32x4 1 1 1 1 
        v128.const i32x4 2 2 2 2
        i32x4.add
        v128.any_true
        return
    )
    (export "v128_anytrue" (func 0))
)

Unlike the numeric constants (i32, i64, f32, and f64), a v128 constant must state how its 128 bits are to be interpreted. Here we treat it as four 32-bit lanes and operate on it with the i32x4 instructions.

We can also treat a v128 as a whole, using the v128 instructions.
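
As a rough model of what the example computes, the JavaScript sketch below represents a v128 as an Int32Array of four lanes and emulates i32x4.add and v128.any_true. The helper names are invented for this illustration, and a real engine maps these instructions onto hardware SIMD rather than looping over lanes.

```javascript
// Model a v128 as 16 raw bytes viewed as four 32-bit lanes.
function i32x4(a, b, c, d) {
  return new Int32Array([a, b, c, d]);
}

// i32x4.add: add lane by lane; storing into an Int32Array wraps to 32 bits.
function i32x4Add(x, y) {
  return x.map((lane, i) => lane + y[i]);
}

// v128.any_true: 1 if any bit of the 128 is set, otherwise 0.
function v128AnyTrue(v) {
  return new Uint8Array(v.buffer).some((byte) => byte !== 0) ? 1 : 0;
}

const sum = i32x4Add(i32x4(1, 1, 1, 1), i32x4(2, 2, 2, 2));
console.log(Array.from(sum)); // [3, 3, 3, 3]
console.log(v128AnyTrue(sum)); // 1
```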

As another example, let's use the swizzle instruction to reorder sixteen 8-bit lanes.

Write it with i8x16 constants:

        v128.const i8x16 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
        v128.const i8x16 1 2 0 3 4 5 6 7 8 9 10 11 12 13 14 15
        i8x16.swizzle

When we disassemble it, the constants are printed in i32x4 format, but that does not affect the use of the i8x16 instruction.

    v128.const i32x4 0x04030201 0x08070605 0x0c0b0a09 0x100f0e0d
    v128.const i32x4 0x03000201 0x07060504 0x0b0a0908 0x0f0e0d0c
    i8x16.swizzle
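
The two notations describe the same 16 bytes, with each 32-bit lane read in little-endian order. The JavaScript sketch below checks this and emulates the swizzle itself: each index byte selects a byte of the first operand, and an index of 16 or more selects 0. The function names are assumptions for this example only.

```javascript
// i8x16.swizzle: pick bytes of `a` using the indices in `s`.
function i8x16Swizzle(a, s) {
  return Uint8Array.from(s, (index) => (index < 16 ? a[index] : 0));
}

// Print 16 bytes the way the disassembler does: four little-endian 32-bit lanes.
function asI32x4Hex(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return [0, 4, 8, 12].map(
    (offset) => "0x" + view.getUint32(offset, true).toString(16).padStart(8, "0")
  );
}

const a = Uint8Array.from([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]);
const s = Uint8Array.from([1, 2, 0, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]);

console.log(asI32x4Hex(a)); // matches the first disassembled constant
console.log(asI32x4Hex(s)); // matches the second disassembled constant
console.log(Array.from(i8x16Swizzle(a, s)));
// [2, 3, 1, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
```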

SIMD instructions in wasm were first developed as an extension of JavaScript, SIMD.js. Now, they are a part of wasm. Please refer to this link for details.

Summary

This article briefly introduced the instruction set of the WebAssembly abstract machine and how to write and run its assembly language. With this basic knowledge, we can read the wasm code compiled through emsdk, and the wasm-related code in V8 becomes easier to understand.
