WebAssembly and Rust: performance analysis


Introduction

I have recently been learning the Rust programming language and I absolutely love it! Rust is a systems-level language (like C and C++) which provides compile-time guarantees about memory safety and safe concurrent behaviour. Readers of my Cybersecurity and technology blogs will know that I am interested in computer security, and we’re reminded on an almost monthly basis at the moment that it often leaves a lot to be desired, with vulnerabilities and breaches of the public’s data sadly being a frequent occurrence. Rust simply will not compile a program containing the kinds of memory and concurrency errors which often cause these problems, meaning that bad code can be caught during development rather than later when there is more at stake. This is definitely the way we need to go, so I’d recommend that anyone with an interest in programming or security gives it a try!

As well as Rust, I have also been investigating WebAssembly, which is a new way of deploying compiled programs to the web (and other platforms). WebAssembly is a low-level binary format which can be used as a compiler target, meaning that programs written in compiled languages can be compiled to it and then run in a web browser. There are already some amazing demonstrations of this available, such as the Epic Zen Garden, which uses Unreal Engine 4 compiled to WebAssembly!

The good news is that Rust can already compile to WebAssembly! So, I strongly believe that a combination of Rust and WebAssembly will be a powerful driving force for the web in the years to come.

WebAssembly performance

Despite WebAssembly being brand new, browser support is already very good. Rather than simply checking whether it is usable, I was interested in how WebAssembly in its current state performs relative to plain JavaScript.

JavaScript has been a fundamental part of every major browser for many years now, and a lot of effort has gone into providing the best performance possible. This is of course to be expected; as JavaScript currently powers all client-side web apps, it is in the browser vendors’ best interests to make JavaScript code execute as quickly as possible.

WebAssembly allows non-JavaScript code to be executed in a client-side web environment, which is a huge advancement. Another goal of course is to provide a low-level language which can easily be compiled to native machine code and executed extremely quickly by the browser. Because of this, I decided to create a demonstration to compare the performance of plain JavaScript and WebAssembly, which is what this post is about.

Demonstration project

In order to investigate WebAssembly performance further, I created a demonstration project. The project consists of a simple HTML page which loads three modules: one in plain JavaScript, one WebAssembly and one WebGL. Each of these modules is capable of rendering a representation of the Mandelbrot set for various parameters which are controllable via the page. The page allows each of the three separate renderers to be selected and shows the time taken for each to render a single frame.

My goal for this demonstration is to provide a convenient way of analysing the relative performance of the three modules in different browsers on different platforms. I will keep this demonstration updated as support for WebAssembly improves in order to ensure that all of the latest features are used to provide the best performance.

The plain JavaScript module simply implements the rendering function in JavaScript, which enables us to see how well the browser’s built-in engine optimises JavaScript code.

The WebAssembly module is written in Rust and compiled using Rust’s own wasm32-unknown-unknown target, which is currently only available on the nightly channel. I also hope to include a module built using the wasm32-unknown-emscripten target to see if performance varies between the two.
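To give an idea of the per-pixel work involved, here is a minimal sketch of the standard escape-time iteration for the Mandelbrot set in Rust. It is an illustration rather than the demonstration's actual code: the function name `escape_time` and its parameters are chosen for this example only.

```rust
/// Counts how many iterations of z = z^2 + c it takes for the point
/// c = (cx, cy) to leave the circle of radius 2, capped at `max_iter`.
/// Points that never escape (those inside the set) return `max_iter`.
pub fn escape_time(cx: f64, cy: f64, max_iter: u32) -> u32 {
    let (mut x, mut y) = (0.0_f64, 0.0_f64);
    let mut iterations = 0;
    // Compare the squared magnitude with 4.0 to avoid a square root per step.
    while iterations < max_iter && x * x + y * y <= 4.0 {
        let next_x = x * x - y * y + cx;
        y = 2.0 * x * y + cy;
        x = next_x;
        iterations += 1;
    }
    iterations
}
```

A renderer would call a function like this once per pixel and map the result to a colour. To be callable from JavaScript in a wasm32-unknown-unknown build, it would additionally need to be exported from the module through an `extern "C"` function with an unmangled name, which is omitted here.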

Finally, the WebGL module uses a GPU shader to perform the rendering. This is by far the fastest of the three modules, which is to be expected as the GPU is made for tasks such as this. It is still useful to compare the other two modules against it, however, so that the gap between each of them and the best case can be observed.

Results

Currently, running the demonstration in the latest versions of both Chrome and Firefox on Linux, with an integrated Intel GPU, shows that the WebGL version is the fastest, as expected. Interestingly however, the performance of the WebAssembly version matches the plain JavaScript version almost exactly!

This just goes to show how well modern JavaScript engines perform when optimising JavaScript code. The speed achievable by plain old JavaScript really surprised me, and is a testament to the effort put into JavaScript engines by browser developers. I was also surprised that WebAssembly did not perform better, but there are some possible reasons for this, which I will discuss next.

Threads

As of writing, support for threads is planned on the WebAssembly post-MVP roadmap but is not yet available. There is nothing preventing the browser’s JavaScript engine from breaking a job down into multiple threads to improve performance, but this is not currently an option for WebAssembly. The performance of the WebAssembly version could therefore be significantly improved once threads are introduced.
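As a rough picture of what breaking the job down could look like, the sketch below splits a frame into bands of rows and renders each band on its own thread. It uses native Rust's `std::thread::scope`, which, as noted above, is not something the WebAssembly build can use at present. The name `render_parallel` and the `pixel` callback are hypothetical and exist only for this example.

```rust
use std::thread;

/// Fills a `width * height` buffer by calling `pixel(x, y)` for every pixel.
/// The rows are split into bands and each band is rendered by one thread.
pub fn render_parallel<F>(width: usize, height: usize, threads: usize, pixel: F) -> Vec<u8>
where
    F: Fn(usize, usize) -> u8 + Sync,
{
    let mut buffer = vec![0u8; width * height];
    if width == 0 || height == 0 {
        return buffer;
    }
    // Rows per band, rounded up so that every row is covered.
    let threads = threads.max(1);
    let rows_per_band = (height + threads - 1) / threads;
    let pixel = &pixel;

    thread::scope(|scope| {
        // chunks_mut hands each thread a disjoint slice, so no locking is needed.
        for (band, rows) in buffer.chunks_mut(rows_per_band * width).enumerate() {
            scope.spawn(move || {
                let first_row = band * rows_per_band;
                for (i, out) in rows.iter_mut().enumerate() {
                    let x = i % width;
                    let y = first_row + i / width;
                    *out = pixel(x, y);
                }
            });
        }
    });
    buffer
}
```

Because each pixel of the Mandelbrot set is computed independently of its neighbours, the bands need no coordination beyond waiting for all of them to finish, which is why this kind of rendering is a natural candidate for threads.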

Browser support

As mentioned, WebAssembly is brand new. Browser developers have done a great job of integrating the specification, but this is just the first step. As with most new technologies, the first step is to get things working; optimising the implementation usually follows once support has stabilised. I strongly suspect that the implementations at present are unoptimised and that performance will continue to improve with each new browser version.

Foreign function call costs

The JavaScript engine is obviously optimised heavily to allow JavaScript code to call JavaScript functions. The same is probably not true, however, for calls from JavaScript into WebAssembly module code. There is inevitably going to be some overhead in calling code from another module, such as marshalling of arguments and return values, and WebAssembly module memory is currently accessed from JavaScript through a typed array view (a Uint8ClampedArray in this case). Again, I expect these overheads to be reduced as support is improved in browsers.
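One common way of keeping this overhead small is to cross the boundary once per frame rather than once per pixel: the module owns the pixel buffer, JavaScript asks for its address, and a single call fills the whole frame. The sketch below shows the Rust side of such an arrangement. The function names `alloc_frame`, `render_frame` and `free_frame` are hypothetical, the shading is a placeholder, and in a real WebAssembly build each function would also need an unmangled export name.

```rust
use std::{ptr, slice};

/// RGBA, one byte per channel, which is the layout a Uint8ClampedArray-backed
/// canvas image uses.
const BYTES_PER_PIXEL: usize = 4;

/// Allocates a frame buffer inside the module's memory and returns its address,
/// which the JavaScript side could wrap in a typed array view.
pub extern "C" fn alloc_frame(width: usize, height: usize) -> *mut u8 {
    let buffer = vec![0u8; width * height * BYTES_PER_PIXEL].into_boxed_slice();
    Box::into_raw(buffer) as *mut u8
}

/// Renders a whole frame with a single call across the boundary.
///
/// # Safety
/// `ptr` must come from `alloc_frame` called with the same `width` and `height`.
pub unsafe extern "C" fn render_frame(ptr: *mut u8, width: usize, height: usize) {
    let len = width * height * BYTES_PER_PIXEL;
    let pixels = unsafe { slice::from_raw_parts_mut(ptr, len) };
    for (i, px) in pixels.chunks_exact_mut(BYTES_PER_PIXEL).enumerate() {
        let (x, y) = (i % width, i / width);
        // Placeholder shading; a real renderer would compute the colour here.
        px.copy_from_slice(&[x as u8, y as u8, 0, 255]);
    }
}

/// Releases a buffer returned by `alloc_frame`.
///
/// # Safety
/// `ptr` must come from `alloc_frame` called with the same `width` and `height`,
/// and must not be used again afterwards.
pub unsafe extern "C" fn free_frame(ptr: *mut u8, width: usize, height: usize) {
    let len = width * height * BYTES_PER_PIXEL;
    drop(unsafe { Box::from_raw(ptr::slice_from_raw_parts_mut(ptr, len)) });
}
```

With this shape, the cost of marshalling arguments is paid once per frame, so it should be small compared with the rendering work itself. Whether that holds in practice depends on the browser, which is exactly what a benchmark like this one is for.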

Conclusion

Feel free to run these tests yourself, and please let me know if you have any interesting results!

I plan to update this demonstration project whenever a new development occurs so that it can always be used as an up-to-date way to benchmark WebAssembly. I will also update the project to include WebAssembly modules built in different ways, such as modules compiled from C code.

The code for the Rust implementation is available on GitHub so feel free to take a look and let me know if my implementation could be improved.


