The best features of WebAssembly are its high performance compared to interpreted JS and the ability to port projects written in other languages to the browser by compiling them to wasm. Almost everyone with a computer-related education has learned C, so writing a simple wasm module in C to speed up your front-end code is not a high barrier, and it’s fun.
Start by running a simple module
To compile C/C++ code into wasm modules, we need to prepare the Emscripten environment.
Even Windows users can install it directly from inside WSL.
The WebAssembly runtime environment itself can only perform computational tasks, and when loading the wasm module it needs to import an object containing JS functions, memory areas, and other properties. Emscripten is not only for compiling C/C++ code to generate wasm, but also for generating a bunch of JS “glue code” to connect the browser to the WebAssembly runtime environment, to implement those standard libraries of C/C++, and to handle memory allocation, input and output, etc.
The tutorial on MDN gives an example of compiling a wasm file from Hello world with the corresponding glue code and HTML file. However, the glue code is much bigger than the wasm file itself …… which is not good if the module is simple. The good thing is that Emscripten supports the compilation option
-s SIDE_MODULE=1 to build the wasm file as a standalone dynamic library, so you don’t need that bunch of glue code, but you have to handle the interaction with JS yourself. The standard library functions you use need to be implemented in C/C++ (
strlen can be directly copied from Emscripten’s implementation), or imported into the wasm module after being implemented in JS by manipulating memory areas.
According to Emscripten,
SIDE_MODULE can be set to 1 or 2. With 1, all functions in the C/C++ code are exported; with 2, unused code is trimmed out, so you need to
#include <emscripten/emscripten.h> and manually prefix the functions to be exported with
Start with a simple module that multiplies two numbers. Since this is just a library, a main function is not needed, and even if one is written it will not be executed automatically on load.
-O3 means use the highest speed compilation optimization.
You can also use
-Oz to optimize for a smaller compiled file instead, usually at some cost in speed.
Use the following JS code to load in HTML (currently
importObject can be omitted).
WebAssembly.instantiateStreaming is a cleaner way to load, but requires the server to provide the correct MIME type
application/wasm to be provided.
Also, both of the above approaches load the module asynchronously, because loading a large module synchronously can block for a long time (Chrome even has its own rule that synchronously loaded modules cannot exceed 4 KB). If the module is small enough to load in negligible time, and you need to use it in synchronous code, you can write the binary data of the module in
Uint8Array format (though it is recommended to convert from Base64 strings for shorter code) to JS code and load it in the following way.
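As a rough sketch of the synchronous approach (not the original code from this post), the bytes below are a hand-assembled module that exports a single multiply function taking two i32 values. In practice the bytes would come from the compiled wasm file, and the variable names here are only illustrative.

```javascript
// Hand-assembled bytes for a module exporting multiply(i32, i32) -> i32.
// Assumption: a real project would embed the bytes of its compiled .wasm file instead.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // "\0asm" magic, version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // function section: one function of type 0
  0x07, 0x0c, 0x01, 0x08,                               // export section: one export, 8-byte name
  0x6d, 0x75, 0x6c, 0x74, 0x69, 0x70, 0x6c, 0x79,       // "multiply"
  0x00, 0x00,                                           // exported kind: function, index 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section: one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6c, 0x0b,                   // local.get 0, local.get 1, i32.mul, end
]);

// Compile and instantiate synchronously; only sensible for very small modules.
const wasmModule = new WebAssembly.Module(wasmBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule, {});
const multiply = wasmInstance.exports.multiply;

console.log(multiply(6, 7)); // 42
```

This module has no imports, so the import object can be empty; a module that imports memory or functions would need them passed as the second argument to WebAssembly.Instance.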
WebAssembly is also strongly typed (32/64-bit integers and floating-point numbers), so in this example any decimal arguments are automatically converted to integers and the result will not be what you expect; you also need to watch out for overflow.
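To make the conversion and overflow behaviour concrete, here is a small sketch that assumes the multiply function takes and returns 32-bit integers. The module bytes are hand-assembled for this illustration and are not the compiled output from this post.

```javascript
// Hand-assembled module: multiply(i32, i32) -> i32 (illustrative stand-in for the compiled file).
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,
  0x03, 0x02, 0x01, 0x00,
  0x07, 0x0c, 0x01, 0x08, 0x6d, 0x75, 0x6c, 0x74, 0x69, 0x70, 0x6c, 0x79, 0x00, 0x00,
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6c, 0x0b,
]);
const { multiply } = new WebAssembly.Instance(new WebAssembly.Module(wasmBytes)).exports;

// JS numbers are converted to i32 on the way in, so the fraction is dropped.
const truncated = multiply(1.5, 2);       // 1.5 becomes 1, so the result is 2, not 3

// The product is also an i32, so it wraps around modulo 2^32.
const wrapped = multiply(65536, 65536);   // 2^32 wraps to 0
const negative = multiply(2147483647, 2); // wraps to -2 when read back as a signed i32

console.log(truncated, wrapped, negative); // 2 0 -2
```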
Here is the corresponding “assembly code”,
$multiply is the above multiplication function.
With the VSCode extension WebAssembly, you can open wasm files in text format and convert between the two formats.
Using external functions, memory and pointers
The next step is to do something that beginners of every programming language love to do: output a Hello world.
Compile …… Wait a minute, since it is compiled to WebAssembly, where do I find the standard library
stdio.h and this
puts? Look at the text format code.
puts is something you need to import from JS. You can actually leave
#include <stdio.h> out of the C code and just declare it with
int puts(const char *str);.
The strings are all written into a single data segment (the Chinese part uses the same UTF-8 encoding as the source code). When loading the wasm module, you need to provide an area of memory to hold this data from an offset position.
Here we start by using
console.log instead of
puts and create a
WebAssembly.Memory as a memory area, which is passed to the wasm module to be loaded via an object (i.e.
WebAssembly.Memory is a block of memory for wasm modules, allocated in 64 KB “pages” as a base unit, and can be dynamically expanded after creation. It can be read and written to in JS using
Try loading and executing.
The string is written to a location in memory starting at
__memory_base, and as in C, the argument passed to puts (here replaced by
console.log) is a pointer to the start of the string (the array index). If you implement a
puts yourself, you can output the string to the console (or to the innerText of some DOM on the page).
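A minimal sketch of such a puts in JS might look like the following. It assumes strings are NUL-terminated UTF-8 and that the data segment starts at offset 0. The names readCString and memoryBase are made up for this illustration, and the string is written into memory by hand here to stand in for the module's data segment.

```javascript
// Assumption: one 64 KB page is enough, and the data segment starts at offset 0.
const memory = new WebAssembly.Memory({ initial: 1 });
const memoryBase = 0;

// Read a NUL-terminated UTF-8 string starting at the given pointer (byte offset).
function readCString(mem, ptr) {
  const bytes = new Uint8Array(mem.buffer);
  let end = ptr;
  while (end < bytes.length && bytes[end] !== 0) end++;
  return new TextDecoder('utf-8').decode(bytes.subarray(ptr, end));
}

// A stand-in for C's puts: print the string and return a non-negative value.
function puts(ptr) {
  console.log(readCString(memory, ptr));
  return 0;
}

// Simulate what the module's data segment would contain after loading.
const data = new TextEncoder().encode('Hello, world!\0');
new Uint8Array(memory.buffer).set(data, memoryBase);
puts(memoryBase); // prints: Hello, world!

// Changing the first byte to 0x41 ('A') changes what the same pointer prints.
new Uint8Array(memory.buffer)[memoryBase] = 0x41;
puts(memoryBase); // prints: Aello, world!
```

In a real page the puts function would be passed to the module through the import object rather than called directly from JS.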
By the way, there is a library called Locutus that tries to use JS to implement standard libraries for other languages, although there are not many implementations for C ……
For example, to implement
printf, see here.
Try to change the data of the first cell in memory to
0x41 (that is, the letter A), call the function again, and you will see that the output has changed accordingly. Of course, in practice, if you need to write data to memory, you have to reserve memory space according to
__memory_base and the size of the data segment in your code to avoid overwriting the data used inside the wasm module.
If you need to use local variables in your code, common sense tells you that these variables are stored on the stack, and the compiled wasm will require the introduction of
__stack_pointer to set the stack pointer.
As on a real CPU, the stack grows toward lower addresses, so we get a simple process address space model like this:
| read-only data | <- stack | heap -> |, which is starting to get a bit involved in operating system knowledge ……
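The layout above could be set up from JS roughly as follows. This is only a sketch under several assumptions: the import module name env, the import name memory, the choice to split one page in half, and passing the two values as WebAssembly.Global objects are all illustrative, and the exact imports a side module expects depend on the Emscripten version (check the text format of your own module).

```javascript
const PAGE_SIZE = 64 * 1024;
const memory = new WebAssembly.Memory({ initial: 1 });

// Hypothetical layout: data segment at 0, stack grows down from the middle
// of the page, heap grows up from the same address.
const memoryBase = 0;
const stackTop = PAGE_SIZE / 2;
const heapBase = stackTop;

const importObject = {
  env: {
    memory,
    // Immutable global: where the module's data segment is placed.
    __memory_base: new WebAssembly.Global({ value: 'i32', mutable: false }, memoryBase),
    // Mutable global: the module moves it down as local variables are pushed.
    __stack_pointer: new WebAssembly.Global({ value: 'i32', mutable: true }, stackTop),
  },
};

console.log(importObject.env.__stack_pointer.value, heapBase);
```

The stack pointer has to be mutable because the compiled code decrements it on function entry and restores it on exit.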
Now that we mention the heap, can we use malloc and
free to dynamically allocate memory on the heap? However, even without performance considerations, a full memory allocator is very complex and not usable in the case of such simple modules compiled as
SIDE_MODULE, and will not be covered in depth here. The “Building an allocator” section of this article suggests a simple alternative: allocated addresses simply accumulate from the start of the heap, and as for
free? Just leave it empty.
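The same accumulate-only idea can be sketched in JS, for example to hand out buffers that JS fills before calling into the module. The referenced article's version would live in the C code; this is just an illustration, and createBumpAllocator, the 8-byte alignment, and the heap start address are all assumptions.

```javascript
const PAGE_SIZE = 64 * 1024;

// A bump allocator: every allocation is placed right after the previous one.
function createBumpAllocator(memory, heapBase) {
  let next = heapBase;
  return {
    malloc(size) {
      const ptr = (next + 7) & ~7; // round up to an 8-byte boundary
      const end = ptr + size;
      const available = memory.buffer.byteLength;
      if (end > available) {
        // Grow by just enough whole pages to fit the request.
        memory.grow(Math.ceil((end - available) / PAGE_SIZE));
      }
      next = end;
      return ptr;
    },
    free(_ptr) {
      // Intentionally empty: memory is never reclaimed.
    },
  };
}

const memory = new WebAssembly.Memory({ initial: 1 });
const heap = createBumpAllocator(memory, 32768); // hypothetical heap start
const a = heap.malloc(10);
const b = heap.malloc(4);
console.log(a, b); // 32768 32784
```

This only works for short-lived modules or small, bounded allocation patterns, since nothing is ever freed. Note that growing the memory replaces memory.buffer, so any typed array views created earlier have to be recreated.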
An example of using WebAssembly to increase performance to 30x
The next step is to use a real-world example to demonstrate the superior performance of WebAssembly compared to JS for computationally intensive tasks. I implemented the RC4 encryption and decryption algorithm in both WebAssembly and JS. Why RC4?
- Encryption and decryption involve a lot of computational operations
- RC4 is a short algorithm, and encryption and decryption are the same set of algorithms, so it is easy to implement.
- RC4 is a stream cipher, so it places few requirements on plaintext and key length and does not need to handle different modes of operation the way block ciphers do, which makes it easy to use
- It has some practical use (although RC4 is now considered insecure for most purposes ……)
Here is the pseudo-code from Wikipedia.
Follow the pseudo-code and use JS to implement it again (the implementation was verified to be completely correct, the verification process is omitted).
Implementation using JS.
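For reference, a compact JS version following the standard key-scheduling and keystream steps might look like this. It is a sketch, not the original code from this post; the function name rc4 and the choice of Uint8Array for key and data are assumptions. The example values are the "Key" / "Plaintext" test vector listed on Wikipedia.

```javascript
// RC4 over byte arrays; encryption and decryption are the same operation.
function rc4(key, data) {
  // Key-scheduling algorithm (KSA): build the initial permutation S.
  const S = new Uint8Array(256);
  for (let i = 0; i < 256; i++) S[i] = i;
  for (let i = 0, j = 0; i < 256; i++) {
    j = (j + S[i] + key[i % key.length]) & 0xff;
    const t = S[i]; S[i] = S[j]; S[j] = t;
  }

  // Pseudo-random generation algorithm (PRGA): XOR the keystream into the data.
  const out = new Uint8Array(data.length);
  for (let n = 0, i = 0, j = 0; n < data.length; n++) {
    i = (i + 1) & 0xff;
    j = (j + S[i]) & 0xff;
    const t = S[i]; S[i] = S[j]; S[j] = t;
    out[n] = data[n] ^ S[(S[i] + S[j]) & 0xff];
  }
  return out;
}

const encoder = new TextEncoder();
const key = encoder.encode('Key');
const cipher = rc4(key, encoder.encode('Plaintext'));
const hex = Array.from(cipher, (b) => b.toString(16).padStart(2, '0')).join('');
console.log(hex); // bbf316e8d940af0ad3

// Running the cipher text through the same function restores the plaintext.
const roundTrip = new TextDecoder().decode(rc4(key, cipher));
```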
Then ported to C.
Implementation using C language, and associated JS glue code
The test is to use
KB random key to encrypt 4KB, 8KB ……128MB of the same random data using two implementations, and compare the execution times.
I realized after the test that I don’t need such a long key …… more than the first 256 bytes is meaningless, but it doesn’t affect the test results, so I won’t retest it!
As for the test results: I first tested in Firefox, my daily browser, and performance improved by 8x!
Then there are the test results on Chrome. Even against Chrome, which uses the most powerful V8 engine, WebAssembly still has a performance advantage of 2x (although the speed is similar to that of Firefox ……)
As for the 30x in the title: that was measured in the old, now-abandoned Edge, where both WebAssembly and JS run much slower than in the other two browsers.
You can also try it for yourself here ~ (wasm file is already embedded as a Data URL)
The conclusion is obvious: WebAssembly’s performance is much higher than JS. And unless the function names give it away, it’s hard to recognize the RC4 algorithm directly from its “assembly code”, so it might be a good idea to put some key operations into the wasm module to make the front-end code harder to reverse-engineer?
Even if you’re not familiar with C and Emscripten’s set of tools, there are tools on GitHub like Walt and AssemblyScript for writing wasm modules directly in TypeScript-like syntax, so try that later!
Extra: SIMD Support for WebAssembly
The WebAssembly standard has a proposal for a SIMD instruction set, and recently the major browsers and the JS engine for Node.js have finally implemented support for SIMD (see WebAssembly/simd/blob/main/proposals/simd/ImplementationStatus.md). In theory, using SIMD could further increase the execution speed of wasm (although this doesn’t seem to be the case in the simple tests I’ve done ……)
Referring to GoogleChromeLabs/wasm-feature-detect, SIMD support can be tested with the following code.
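The detection idea is to ask the engine to validate a tiny module that uses SIMD instructions. The sketch below is modeled on that approach; the probe bytes here were assembled by hand for this illustration (a function returning a v128 built with i8x16.splat and i8x16.popcnt), so treat them as an assumption rather than a copy of the library's code.

```javascript
// A minimal module whose only function returns a v128 value.
const simdProbe = new Uint8Array([
  0, 97, 115, 109, 1, 0, 0, 0,   // "\0asm" magic, version 1
  1, 5, 1, 96, 0, 1, 123,        // type section: () -> v128
  3, 2, 1, 0,                    // function section: one function of type 0
  10, 10, 1, 8, 0,               // code section: one body, no locals
  65, 0,                         // i32.const 0
  253, 15,                       // i8x16.splat
  253, 98,                       // i8x16.popcnt
  11,                            // end
]);

// validate() returns false instead of throwing when the engine rejects the bytes.
const simdSupported = WebAssembly.validate(simdProbe);
console.log('SIMD supported:', simdSupported);
```

Because validation never executes the module, the check is cheap and can run synchronously before deciding which wasm build to load.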
According to the Emscripten documentation, compiling with the parameter
-msimd128 allows you to use SIMD optimization directly at compile time.