1.1. Lead-in

This article talks about WebAssembly and can be read to get the first glimpse of it. At the same time, I articulate my opinion and doubts. The summary is that WebAssembly is an interesting approach and we will see what it will become.

1. Java in the Browser, the past

There was a time when we could run Java applets in the browsers. There were a lot of problems with it, although the idea was not total nonsense. Nobody could tell that the future of browser programmability is not Java. Today we know that JavaScript was the winner and the applet as it is deprecated in Java 9 and is going to be removed from later Java versions. This, however, does not mean that JavaScript is without issues and it is the only and best possible solution for the purpose that a person can imagine.

JavaScript has language problems, there are a lot of WTF included in the language. The largest shortage, in my opinion, is that it is a single language. Developers are different and like different languages. Projects are different best solved by different programming languages. Even Java would not so immensely successful without the JVM infrastructure supported by so many different languages. There are a lot of languages that run on the JVM, even such a crap as ScriptBasic.

Now you can say that the same is true for the JavaScript infrastructure. There are other languages that are compiled to JavaScript. For example, there is TypeScript or there is even Java with the GWT toolkit. JavaScript is a target language, especially with asm.js. But still, it is a high level, object-oriented, memory-managed language. It is nothing like machine code.

Compiling to JavaScript invokes the compiler once, then the JavaScript syntax analyzer, internal bytecode and then the JIT compiler. Isn’t it a bit too many compilers till we get to the bits that are fed into the CPU? Why should we download the textual format JavaScript to the browser and compile it into bytecode each time a page is opened? The textual format may be larger, though compression technologies are fairly advanced, and the compilation runs millions of times on the client computer emitting a lot of carbon into the air, where we already have enough, no need for more.

(Derail: Somebody told me that he has an advanced compression algorithm that can compress any file into one bit. There is no issue with the compression. Decompression is problematic though.)

2. WebAssembly

Why can’t we have some bytecode based virtual machine in the browser? Something that once the JVM was for the applets. This is something that the WebAssembly guys were thinking in 2015. They created WebAssembly.

WebAssembly is a standard program format to be executed in the browser nearly as fast as native code. The original idea was to "complement JavaScript to speed up performance-critical parts of web applications and later on to enable web development in other languages than JavaScript." (WikiPedia)

Today the interpreter runs in Firefox, Chromium, Google Chrome, Microsoft Edge and in Safari. You can download a binary program to the browser and you can invoke it from JavaScript. There is also some tooling supporting developing programs in "assembly" and also on higher level languages.

2.1. Structure

The binary web assembly contains blocks. Each block describes some characteristics of the code. I would say that most of the blocks are definition and structure tables and there is one, which is the code itself. There is a block that lists the functions that the code exports, and which can be invoked from JavaScript. Also, there is a block that lists the methods that the code wants to invoke from the JavaScript code.

The assembly code is really assembly. When I started to play with it I had some nostalgic feeling. Working with these hex codes is similar to programming the Sinclair ZX80 in Z80 assembly when we had to convert the code manually to hex on paper and then we had to "POKE" the codes from BASIC to the memory. (If you understand what I am talking about you are seasoned. I wanted to write 'old' but my editor told me that is rude. I am just kidding. I have no editor.)

I will not list all the features of the language. If you are interested, visit the WebAssembly page. There is consumable documentation about the binary format.

There are, however, some interesting features that I want to talk about to later express my opinions.

2.1.1. No Objects

The WebAssembly VM is not an object-oriented VM. It does not know objects, classes or any similar high-level structures. It really looks like some machine language. It has some primitive types, like i32, i64, f32, f64 and that it is. The compiler that compiles high-level language has to use these.

2.1.2. No GC

The memory management is also up to the application. It is assembly. There is no garbage collector. The code works on a (virtually) continuous memory segment that can grow or shrink via a system call and it is totally up to the application to decide which code fragment uses which memory address.

2.1.3. Two Stacks

There are two stacks the VM works with. One is the operation stack for arithmetic operations. The other one is the call stack. There are functions that can call each other and return to the caller. The call sequence is stored in a stack. This is a very usual approach. The only shortage is that there is no possibility to mark the call stack and purge it when an exception happens. The only possibility to handle try/catch programming structure is to generate code before and after function calls that check for exception conditions and if the exception is not caught on the caller function level then the code has to return to the higher level caller. This way the exception handling walks through the call stack with the extra generated code around each function call. This slows down not only the exception handling but also the function calls.

2.1.4. Single Thread

There is no threading in WebAssembly.

2.2. Support, Tooling

The fact that most of the browsers support WebAssembly is one half of the bread. There have to be developer tools supporting the concept to have code that can be executed.

There is an LLVM backed compiler solution so technically any language that is compiled to LLVM should be compilable to WebAssembly and run in the browser. There is a C compiler in the tooling and you can also compile RUST to WebAssembly. There is also a textual format in case you want to program directly in assembly level.

2.3. Security

Security is at least questionable. First of all, WebAssembly is binary, therefore it is not possible, or at least complex to look at the code and analyze the code. The download of the code does not require channel encryption (TLS) therefore it is vulnerable to MITM attack. Similarly, WebAssembly does not support code signature that would assert that the code was not tampered with since being generated in the (hopefully protected) development environment.

WebAssembly runs in a sandbox, just like JavaScript or like Flash was running. Fairly questionable architecture from the security point of view.

2.4. Roadmap

WebAssembly was developed for to years to reach a Minimal Viable Product (MVP) that can be used as a PoC. There are features, like garbage collection, multi-thread support, exception handling support, SIMD type instructions, DOM access support directly from WebAssembly, which are developed after MVP.

3. Present and Future

I can say after playing like a weekend with WebAssembly that it is an interesting and nice toy. In its current state, it is a toy, nothing more. Without the features planned after MVP, I see only one viable use case: WebAssembly is the perfect tool to deploy malicious mining code on the client machines. In addition to that, any implementation flaw in the engine is a security risk. Note that these security risks come from a browser functionality that gives no value to the average user. You can disable WebAssembly in some of the browsers. It is a little worrisome that it is enabled by default, although it is needed only for early adopters for PoC and not commercial projects. If I were paranoid I would say that the browser vendors, like Google, have a hidden agenda with the WebAssembly engine in the browser.

I am afraid that we see no security issues currently with WebAssembly only because technology is new and IT felons have not learned yet the tools. I am almost certain that the security holes are currently lurking in the current code waiting to be exploited. Disable WebAssembly in your browser till you want to use it. Perhaps in a few years (or decades).

The original aim was to amend JavaScript. With the features after MVP, I strongly believe that WebAssembly will rather aim to replace JavaScript than amend it. There will be a time when we will be able to write applications to run in the browser in Golang, Swift, Java, C, Rust or whatever language we want to. So looking at the question in the title "will Java get back to the browser?" the answer is definitely NO. But some kind of VM technology, JIT, bytecode definitely will sometime in the future.

But not yet.

Comments imported from Wordpress

tamasrev 2018-02-28 17:29:43

Isn’t it a huuuge second system effect?

Tamás Viktor 2018-03-01 13:11:52

Thanks for the post!

I’m not very convinced about some of the security concerns:

Binary format / static code analysis: Most JS code is also transmitted in minimized and obfuscated format which also makes static code analysis harder, so I don’t think this makes a difference between WebAssembly and JS.

The sandbox should be reliable and preferably open-sourced, this is how it goes generally in all cases.

Secure transmission and integrity checking requires a general solution for all kinds of content, not just for WebAssembly. And I don’t think WebAssembly needs a stronger check, but the general solution should be strong enough. A plain HTML can be also malicious enough, not from the technical point of view but how it deceives users by its content. (For example mimicking a banking page.)

However, I do share the concern about the fact that WebAssembly is switched on by default. It might contain bugs which are real security holes.

Peter Verhas 2018-03-01 15:55:44

Could you elaborate what you mean?

Peter Verhas 2018-03-01 16:05:24

"Most JS code is also transmitted in minimized and obfuscated format which also makes static code analysis harder,"

There are different difficulty levels. The first level is that JavaScript is obfuscated by default. It is the nature of JavaScript. Minified code is one level more difficult to read and analyze. Binary is again one step. My opinion is that this last step is a large one.

"Secure transmission and integrity checking" are two different things. Secure transmission ensures that the same code arrives at the browser, which started from the server. Integrity checking (with digital signature) ensures that the module was not tampered with since it was released by the identifiable author (person or org). If, for example, there is a mining functionality in it that was not advertised then the author is legally responsible and it can be proven that they were doing it.

The fact that there is no general solution to this problem does not mean that we should not want a specific solution for a highly risky part of the system. It really is a problem that I can not be sure of the authenticity of the HTML page when I am doing online banking and I have to solely rely on the TLS channel transmission, I do not want to increase my problem to the next level running uncertified binary code.

tamasrev 2018-03-02 09:48:49

The wikipedia gives the following definition: "The second-system effect (also known as second-system syndrome) is the tendency of small, elegant, and successful systems, to be succeeded by over-engineered, bloated systems, due to inflated expectations and overconfidence."

There was the javascript, which looks like a toy language, with no types and no traditional classes, and doesn’t support multithreading and mutates quickly.

Later we learned that javascript has several advantages. One of them is small footprint for small scripts. Then, prototypes and dynamic typing lets developers to implement features quickly. The lots of different js versions were mitigated with schims and shivs and jquery. Everybody was happy, except those who maintained legacy js libraries.

And now there’s a js alternative. This is basically a bytecode that’s supposed to run super-quickly. Computation speed hasn’t really been a problem, but they speed it up. And then this super-fast webassembly still doesn’t support multithreading. So we got a complicated solution for a no-problem (computation speed), and no solution for an actual problem (single-threaded js prevents devs to fully utilize CPU-s).

Or, maybe, it’s just hindsight bias: I’m judging the outcome, but not the decision.

Peter Verhas 2018-03-02 10:01:38

"Computation speed hasn’t really been a problem"

It was not a problem, only because nobody tried to use it for tasks where it would have been a problem. Write a highly intensive graphics application (a game), some sound or video processing application. Then speed is a problem.

"no solution for an actual problem (single-threaded js prevents devs to fully utilize CPU-s"

Do not judge WebAssembly about this on the MVP (Minimum Viable Product https://en.wikipedia.org/wiki/Minimum_viable_product). After MVP there are plans to support multi-thread, exception handling, garbage collections and so on.

WebAssembly is aiming to be a solution for problems that are not running currently in browsers because JavaScript can not do that. My prediction is that in 5 to 10 years it will scale down to simpler applications as well, where JavaScript currently sufficient and WebAssembly will replace JavaScript.


Comments

Please leave your comments using Disqus, or just press one of the happy faces. If for any reason you do not want to leave a comment here, you can still create a Github ticket.