Folks here might be interested in WebAssembly from the Ground Up (https://wasmgroundup.com) — an online book to learn Wasm by building a simple compiler in JavaScript. (Disclaimer: I'm one of the authors.)
So far the book only covers WebAssembly 1.0 though. We'll likely publish an update to cover the (few) new features in 2.0, but WebAssembly 3.0 is a pretty big update. Garbage collection and typed references especially add quite a lot to the spec. But, they also make a lot more things possible, which is great.
The spec itself is also very readable and accessible. Everything in the spec that's described formally is also described in plain language. I can't think of any other spec that's quite as nice to read.
The WebAssembly spec is quite approachable, but for anyone who is interested in learning Wasm and doesn't want to read the spec —
WebAssembly from the Ground Up (https://wasmgroundup.com/) is an online book to learn Wasm by building a simple compiler in JavaScript. It starts with handcrafting bytecodes in JS, and then slowly builds up a simple programming language that compiles to Wasm.
const MIN_U32 = 0;
const MAX_U32 = 2 ** 32 - 1;

// Encode `v` as a Wasm u32: validate the range, then delegate to the
// LEB128 encoder.
function u32(v) {
  if (v < MIN_U32 || v > MAX_U32) {
    throw new Error(`Value out of range for u32: ${v}`);
  }
  return leb128(v);
}
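(The `leb128` function is developed separately in the book; for anyone curious, a minimal sketch of unsigned LEB128 encoding looks something like the following. Not the book's exact code: each byte carries 7 bits of the value, and the high bit marks "more bytes follow".)

// Minimal unsigned LEB128 encoder (sketch only).
function leb128(v) {
  const bytes = [];
  do {
    let b = v & 0x7f;          // take the low 7 bits
    v = Math.floor(v / 128);   // plain division so values up to 2**32 - 1 work
    if (v !== 0) b |= 0x80;    // set the continuation bit
    bytes.push(b);
  } while (v !== 0);
  return bytes;
}

u32(624485); // => [0xe5, 0x8e, 0x26]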
I love Ada, because you can do this:
subtype U32 is Interfaces.Unsigned_64 range 0 .. 2 ** 32 - 1;
I know. That said, I keep mentioning Ada because it is widely used in mission-critical systems, because it supports contracts (yes, I know, so does Eiffel), and because you can do formal verification using Ada/SPARK, meaning that it could be used in place of Rust, whereas Pascal probably could not.
Is it possible to "instrument" the WASM code to enable in-process debugging? In other words, would it be possible to generate WASM based off some input string (my custom language) on-the-fly, and then run it with breakpoints and memory inspection, all within the same Javascript script hosted on, say, a web page?
That still requires the usage of dev tools and linear source code mapping between the original and the generated WASM, correct? Would it be possible to avoid dev tools, and implement the debugger in Javascript? Or the WASM technology doesn't provide such an opportunity?
I'd like to break on breakpoints, jump back into Javascript, unroll it into the original location in the source code, and display it all in an IDE-like window all within a browser page, and without involvement of dev tools (that can't handle non-linear conversions between source code and generated JS/WASM).
Yes, if you use bytecode rewriting then all the offsets are changed and you need a mapping. This is one of the advantages of engine-side instrumentation; bytecode offsets don't change. It'll be some time before we can get engines to agree on a standard interface for instrumentation, but there have been some discussions.
Whamm can inject arbitrary instrumentation logic, so you could, e.g. inject calls to imports that are implemented in JS. You'll have some heavy lifting to do on the JS side.
Visual Studio supports debugging C# compiled to WASM when your page is made with Blazor.
Granted, you're debugging in another window that isn't a browser; but overall you get about 80% of the experience of debugging a .NET process running outside the browser.
I'm not sure I totally understand what you mean by "in-process" here. But you could have some JavaScript that compiles some code in your custom language to WebAssembly and then executes it, and you can use the browser dev tools to set breakpoints in the Wasm, inspect the memory, etc.
In the book, we don't cover source maps, but it would also be possible to generate source maps so that you can set breakpoints in (and step through) the original source code in your custom language, rather than debugging at the Wasm instruction level.
Sadly, no. I'd like to write a ~Prolog interpreter (a compiler into WASM that would dynamically replace parts of the implementation as the source code evolves), and have the debugger and WASM memory inspector be part of the web page written in JavaScript, which was used to compile the code in the first place.
That is, would it be possible to implement a debugger and memory inspector in JavaScript without resorting to dev tools? Prolog doesn't map 1:1 to WASM/JavaScript via source maps, making it nearly impossible to properly debug it in dev tools.
Inspecting Wasm memory is easy from JS, but to be able to do the debugging, you'd probably either need to rewrite the bytecode (e.g., inserting a call out to JS between every "real" instruction) or use a self-hosted interpreter like wasm3 (https://github.com/wasm3/wasm3).
(Or maybe there are better solutions that I'm not thinking of.)
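For the memory-inspection part, a minimal sketch (the export and variable names here are hypothetical; a module chooses its own export names):

// Inspecting a Wasm module's linear memory from JS, inside an async context.
const { instance } = await WebAssembly.instantiate(moduleBytes, imports);

function inspect(addr, len) {
  // Recreate the view each time: memory.grow() detaches the old buffer.
  const mem = new Uint8Array(instance.exports.memory.buffer);
  return mem.slice(addr, addr + len);
}

console.log(inspect(0x100, 16)); // 16 bytes starting at address 0x100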
It's worth noting that Python 3 uses a PEG-based parser and its performance (when it was introduced at least) was within 10% of the old parser: https://peps.python.org/pep-0617/
In absolute terms, the numbers given in the PEP work out to about a thousand machine instructions per byte of input, for both the old and new parsers. From my point of view, that's pretty slow.
Note that this includes the time for tokenization, which is generally the bulk of the time spent in parsing, and which I believe did not change between the old recursive-descent parser and the new PEG parser.
Typically the tokenizer must iterate over some 32 characters per line of input code, which it reduces to some 8 tokens for the benefit of the parser proper, so usually the tokenizer is the bottleneck. However, the performance in this case seems too poor to attribute to the tokenizer.
That PEP also discusses some avoidable inefficiencies in the old parser that accounted for a substantial fraction of its runtime.
One of the biggest advantages of PEGs is that you don't need a separate tokenization pass. But that means you are incurring Packrat's inefficiency per character instead of per token. As I said above, I believe the CPython implementation does not do this.
I've written and worked on PEG parser generators, and I think PEG parsing is the best option in a wide variety of circumstances. But it is rarely the fastest option and usually far from it.
Yes, you're right that PEG parsing is usually not the fastest option. As with many things, the operative question is: is it fast _enough_?
> Note that this includes the time for tokenization, which is generally the bulk of the time spent in parsing
Hmmm, I've never heard this before and I'm curious if you can point me to any benchmarks or other sources that demonstrate this. If it's true it's surprising to me!
In my line of work (writing JavaScript that will run on Android devices) there is no such thing as fast enough. If you’re not going as fast as possible the experience is bad. After all this is a platform where getting data from the local disk is frequently slower than the network because there’s so little CPU & disk bandwidth…
Also, if you're computing for 3 milliseconds for every touchmove event instead of 0.3 milliseconds, you'll run the battery down a lot faster. Though not actually ten times faster, because the screen keeps sucking power even when the CPU is asleep.
I can't! But if you have a conventional lex/yacc parser handy, it should be easy to test by measuring the time for just lexing a large input file and discarding the tokens. If you do the experiment, I'm interested to hear your results!
I don't believe that software is ever fast enough, but trading off speed is often worthwhile, for example to get composability or readability.
Project             Parsing/Lexing Ratio
----------------------------------------
jdk8                5.34x
Spring Framework    3.81x
Elasticsearch       5.76x
RxJava              5.69x
JUnit4              4.51x
Guava               5.37x
Log4j               2.92x
Obviously this is just one toolkit, one language, and one set of examples. But in these benchmarks, the tokenization (lexing) time is a small fraction of the total parse time.
Well, "small" is relative. It's certainly not a bottleneck, but it's still between 17.4% to 34.2% of the total time. That's definitely still in range where optimizing could have a measurable impact on performance difference (depending on how much room there's left for optimization).
Right, but it's clearly not the situation I was describing where cutting the parse time to zero for the stages after the tokenization stage would only slightly decrease total time.
The main difference in Ohm grammars vs "standard" PEGs is support for left recursion. It's the same approach that was used in OMeta, which is described in the paper "Packrat parsers can support left recursion": https://web.cs.ucla.edu/~todd/research/pepm08.pdf
The other thing Ohm does that's different from many PEG parsing libraries is a strict separation of syntax and semantics. Many PEG tools mix these two concepts, so you write little snippets of code (semantic actions) alongside the rules.
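For example, a small sketch with ohm-js (a toy grammar of my own, not from their docs; check the Ohm documentation for exact API details):

import * as ohm from 'ohm-js';

// The grammar is pure syntax: no semantic actions embedded in the rules.
const g = ohm.grammar(`
  Arith {
    Exp = Exp "+" num  -- plus
        | num
    num = digit+
  }
`);

// Semantics live separately, as named operations over the parse tree.
const semantics = g.createSemantics().addOperation('eval', {
  Exp_plus(left, _op, right) { return left.eval() + right.eval(); },
  num(_digits) { return parseInt(this.sourceString, 10); },
});

console.log(semantics(g.match('1 + 2 + 3')).eval()); // 6

(Note the left-recursive Exp rule, which plain PEG tools would reject.)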
Very close! Alex Warth created OMeta (https://en.wikipedia.org/wiki/OMeta) as part of the STEPS project. Ohm was designed as a kind of successor to OMeta, but was created after STEPS.
This is great! The WebAssembly Core Specification is actually quite readable, although some of the language can be a bit intimidating if you're not used to reading programming language papers.
If anyone is looking for a slightly more accessible way to learn WebAssembly, you might enjoy WebAssembly from the Ground Up: https://wasmgroundup.com
I think it's much better to just learn how to read inference rules. They're actually quite simple, and are used ubiquitously to define PL semantics.
Writing this off with "that's not an option" is a big waste of time: learning it will open up all of the literature written on the subject.
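For instance, here is (roughly) the spec's typing rule for a numeric binary operator, in inference-rule notation. Premises go above the line, the conclusion below; a rule with no premises is an axiom:

\frac{}{C \vdash t.\mathsf{binop} : [t\ t] \to [t]}

Read: in any validation context C, the instruction t.binop consumes two operands of type t from the stack and produces one result of type t.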
The WASM spec is so well defined presumably because Andreas Rossberg is the editor - and he did a bunch of PL research on extensions to Standard ML, which is famous for its specification!
I totally agree. Standard ML set the stage for everyone to finally formally specify a full language, and only WebAssembly has carried the torch as far as I know (other than small research languages).
I know one of WebAssembly's biggest features by design is security / "sandbox".
But I've always gotten confused with... it is secure because by default it can't do much.
I don't quite understand how to view WebAssembly. You write in one language, it compiles things like basic math (nothing with network or filesystem) to another and it runs in an interpreter.
I feel like I have a severe gap or misunderstanding here. There's been a ton of hype for years, lots of investment... but isn't it the case that anywhere you'd want to add Lua to an app, you could add WebAssembly, and vice versa?
WebAssembly can communicate through buffers. WebAssembly can also import foreign functions (Javascript functions in the browser).
You can get output by reading the buffer at the end of execution/when receiving callbacks. So, for instance, you pass a few frames worth of buffers to WASM, WASM renders pixels into the buffers, calls a callback, and the Javascript reads data from the buffer (sending it to a <canvas> or similar).
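A sketch of that round trip (all names here are hypothetical; a real module defines its own interface):

// Wasm renders RGBA pixels into its linear memory; JS blits them to a
// <canvas>. Assumes `instance` came from WebAssembly.instantiate, and the
// module exports `memory`, a `renderFrame` function, and a `FRAME_PTR` global.
const { memory, renderFrame, FRAME_PTR } = instance.exports;
const width = 320, height = 240;
const ctx = document.querySelector('canvas').getContext('2d');

function draw() {
  renderFrame(); // Wasm writes width * height * 4 bytes at FRAME_PTR
  const pixels = new Uint8ClampedArray(
      memory.buffer, FRAME_PTR.value, width * height * 4);
  ctx.putImageData(new ImageData(pixels, width, height), 0, 0);
  requestAnimationFrame(draw);
}
draw();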
The benefit of WASM is that it can't be very malicious by itself. It requires the runtime to provide it with exported functions and callbacks to do any file I/O, network I/O, or spawning new tasks. Lua and similar tools can go deep into the runtime they exist in, altering system state and messing with system memory if they want to, while WASM can only interact with the specific API surface you provide it.
That makes WASM less powerful, but more predictable, and in my opinion better for building integrations with, as there is no risk of internal APIs being accessed (which you will be blamed for if they break in an update).
> Lua and similar tools can go deep into the runtime they exist in, altering system state and messing with system memory if they want to
That's not correct: when you embed Lua you can choose which APIs are available; to make the full stdlib available you must explicitly call `luaL_openlibs` [1].
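The Wasm analogue of that choice is the import object: the guest module can only call what the embedder puts in it. A hypothetical sketch in JS:

// The module sees only these imports; nothing else from the host is
// reachable. The names ("env", "log") are whatever the module declares.
const imports = {
  env: {
    log: (x) => console.log('guest says:', x),
    // Deliberately no file, network, or process APIs here.
  },
};
const { instance } = await WebAssembly.instantiate(moduleBytes, imports);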
The File System Access API is useful for opening a local .ipynb and .csv with JupyterLite, which builds CPython for WASM as Pyodide.
There is a "Direct Sockets API in Chrome 131" but not in FF; so WebRTC and WebSocket relaying is unnecessary for WASM apps like WebVM:
https://news.ycombinator.com/item?id=42029188
The embedder could hand the module functions for manipulating external buffers via externrefs. (I'm not sure if that's a good idea, or not, just that it could.)
But if the module wants to compute on the values in the buffer, at some level it would have to copy the data in/out.
Assuming you're talking about reading binary data like (array i8), the GC MVP doesn't have a great answer right now. You have to call back into wasm to read the bytes. It's something for the group to address in future proposals. Sharing between wasm modules is better right now.
> But I've always gotten confused with... it is secure because by default it can't do much.
Yes. That’s a super accurate description. You’re not confused.
> I don't quite understand how to view WebAssembly. You write in one language, it compiles things like basic math (nothing with network or filesystem) to another and it runs in an interpreter.
Almost. Wasm is cheap to JIT compile and the resulting code is usually super efficient, sometimes at parity with native execution.
> I feel like I have a severe gap or misunderstanding here. There's been a ton of hype for years, lots of investment... but isn't it the case that anywhere you'd want to add Lua to an app, you could add WebAssembly, and vice versa?
It’s definitely a case where the investment:utility ratio is high. ;-)
Here’s the trade off between embedding Lua and embedding Wasm:
- Both have the problem that they are only as secure as the API you expose to the guest program. If you expose `rm -rf /` to either Lua or Wasm, you’ll have a bad time. And it’s surprisingly difficult to convince yourself that you didn’t accidentally do that. Security is hard.
- Wasm is faster than Lua.
- Lua is a language for humans, so there's no need for another language and compiler. That makes Lua a more natural choice for embedded scripting.
- Lua is object oriented, garbage collected, and has a very principled story for how that gets exposed to the host in a safe way. Wasm source languages are usually not GC’d. That means that if you want to expose object oriented API to the guest program, then it’ll feel more natural to do that with Lua.
- The wasm security model is dead simple and doesn’t (necessarily) rely on anything like GC, making it easier to convince yourself that the wasm implementation is free of security vulnerabilities. If you want a sandboxed execution environment then Wasm is better for that reason.
Not quite. WebAssembly isn't a source language, it's a compiler target. So you should be able to write in C, Rust, Fortran, or Lua and compile any of those to WebAssembly.
Except that WebAssembly is a cross-platform assembly language/machine code which is very similar to the native machine code of many/most contemporary CPUs. This means a WebAssembly interpreter can be very straightforward, and could often translate one WebAssembly instruction to one native CPU instruction. Or rather, it can compile a stream of WebAssembly instructions almost one-to-one to native CPU instructions, which it can then execute directly.
A JIT should be able to translate most arithmetic and bitwise instructions to single opcodes; however, anything involving memory or function calls needs safety checks that become multi-instruction sequences. Branches could mostly be direct _unless_ the runtime has any kind of metering (it should) to stop infinite loops (if it also wants to be crash-safe and not just exploit-safe).
Not necessarily; on AMD64 you can do memory accesses in a single instruction relatively easily by using the CPU's paging machinery for safety checks plus some clever use of address space.
> Branches could mostly be direct _unless_ the runtime has any kind of metering (it should) to stop infinite loops
Even with metering the branches would be direct; you'd just insert the metering code at the start of each basic block (roughly two extra instructions per block). Or did you mean something else?
Can't remember exactly which one, but I remember reading an article about some VM that added interruption checks not at block boundaries, but only at _backwards_ branches and call sites, so "safe" forward jumps (if/else/break) wouldn't cost anything extra, while anything that could run "forever" had the checks.
Reserving 4 GB of address space oughta work on any 64-bit machine with a decent OS/paging system though? I was looking into it but couldn't use it in my case, since it needs to cooperate with another VM that already hooks the page-fault handler (although maybe I should take another stab if there's a way to put in a hierarchy).
Yes, interpretation on the fly was never its intention. The intention was to provide interpreted languages with a way to implement fast compiled functions.
I think the biggest advantage of wasm in terms of security is that it doesn't accept machine language written for the target machine, only this artificial machine language. This means that it cannot encode arbitrary code that could be executed by the host machine. Everything it runs necessarily has to go through the wasm interpreter.
> This means that it cannot encode arbitrary code that could be executed by the host machine.
But the host machine still can, so it's not as big of an advantage in that regard. If you could somehow deliver a payload of native code and jump to it, it'd work just fine. The security you get comes from the fact that it's really hard to do that, because there are no wasm instructions to jump to arbitrary memory locations (even though all the host ISAs do have those). Having a VM alone doesn't provide security against attacks.
It's often the case that VMs are used with memory-safe languages, and those languages' runtime bounds checks and other features are what give them safety, more so than their VM. In fact, most bytecode languages provide a JIT (including some wasm deployments), so you're actually running native code regardless.
That's quite interesting. This is way outside of my wheelhouse - has this kind of approach been tried in other security contexts before? What would you even call that, virtualization?
I bought the early access of your book a while ago and completed the first few chapters which were available. I found it a fantastic resource, and I was keen to continue but other responsibilities have gotten in the way since. I recommend it!
If you haven't looked at it in a while — we just published a draft of the final technical chapter, and are planning an official launch on March 4. So, might be a good time to dig back in :-)
You can understand the WASM spec in your sleep if you’ve ever worked through a type-system paper from the last two decades (or a logic paper from even earlier I guess).
Granted, not many people have, but there’s a reason why it makes sense for it to be written in that style: they want it to be very clear that the verification (typechecking, really) algorithm doesn’t have any holes, and for that it’s reasonable to speak the language of the people who prove that type of thing for a living.
The WASM spec is also the ultimate authoritative reference for both programmers and implementers. That’s different from the goals of an ISA manual, which usually only targets programmers and just says “don’t do that” for certain dark corners of the (sole) implementation. (The RISC-V manual is atypical in this respect; still, I challenge you to describe e.g. which PC value the handler will see if the user code traps on a base RV32IMA system.)
> You can understand the WASM spec in your sleep if you’ve ever worked through a type-system paper from the last two decades
Is there a lot of crossover between those people and people who work with assemblers or code generation? There's even more crossover between those people and people who know how to read a minimal ISA document.
WebAssembly isn't a programming language; it's an ISA or a bytecode VM. They've defined it in a very over-engineered way, with different representations all the way through. It has created issues for people working on assemblers and code generation, because it's hard to choose a good representation for compilers to generate. This all could have been avoided.