pdubroy's comments

I know there have been solutions in the Java world for >20 years, though I can't comment on how well they work in practice.

From a quick search, _An Implementation of Scoped Memory for Real-Time Java_ (https://people.csail.mit.edu/rinard/paper/emsoft01.pdf) provides a decent overview:

> Real-Time Java extends this memory model to support two new kinds of memory: immortal memory and scoped memory. Objects allocated in immortal memory live for the entire execution of the program. The garbage collector scans objects allocated in immortal memory to find (and potentially change) references into the garbage collected heap but does not otherwise manipulate these objects.

> Each scoped memory conceptually contains a preallocated region of memory that threads can enter and exit. Once a thread enters a scoped memory, it can allocate objects out of that memory, with each allocation taking a predictable amount of time. When the thread exits the scoped memory, the implementation deallocates all objects allocated in the scoped memory without garbage collection. The specification supports nested entry and exit of scoped memories, which threads can use to obtain a stack of active scoped memories. The lifetimes of the objects stored in the inner scoped memories are contained in the lifetimes of the objects stored in the outer scoped memories. As for objects allocated in immortal memory, the garbage collector scans objects allocated in scoped memory to find (and potentially change) references into the garbage collected heap but does not otherwise manipulate these objects.

> The Real-Time Java specification uses dynamic access checks to prevent dangling references and ensure the safety of using scoped memories. If the program attempts to create either 1) a reference from an object allocated in the heap to an object allocated in a scoped memory or 2) a reference from an object allocated in an outer scoped memory to an object allocated in an inner scoped memory, the specification requires the implementation to throw an exception.
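To make those dynamic access checks concrete, here's a toy JavaScript sketch (not real RTSJ, which is a Java API; all names here are invented for illustration) that models the scope stack and the reference rules the paper describes: heap objects may not point into a scope, and an outer scope may not point into an inner one.

```javascript
// Toy model of Real-Time Java's scoped-memory reference rules.
// Scopes are numbered by nesting depth; the heap is depth -1.
// A reference from object A to object B is illegal if B lives in a
// more deeply nested (i.e. shorter-lived) scope than A.

const HEAP = -1;

class ScopeStack {
  constructor() { this.depth = 0; } // 0 = not inside any scoped memory

  enter(fn) {
    this.depth++;
    try { return fn(this.depth); } finally { this.depth--; }
  }
}

// obj.region: HEAP for heap objects, or the scope depth it was allocated in.
function checkAssignment(from, to) {
  if (to.region > from.region) {
    throw new Error('IllegalAssignmentError: reference would outlive target');
  }
}

const stack = new ScopeStack();
const heapObj = { region: HEAP };
let caught = false;

stack.enter((outer) => {
  const outerObj = { region: outer };
  checkAssignment(outerObj, heapObj);   // OK: scoped -> heap
  stack.enter((inner) => {
    const innerObj = { region: inner };
    checkAssignment(innerObj, outerObj); // OK: inner -> outer
    try {
      checkAssignment(outerObj, innerObj); // illegal: outer -> inner
    } catch (e) {
      caught = true;
    }
  });
});

console.log(caught); // true: the outer -> inner reference is rejected
```

The real implementation described in the paper does these checks on every reference store, which is exactly what makes scope safety dynamic rather than statically verified.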


Awesome! You're exactly the kind of person we were thinking of when we wrote the book: experienced programmers who are interested in understanding things at a lower level.

Let us know how it goes! You can find us in the book Discord, or email us at hello@wasmgroundup.com.


I agree, it really is quite approachable.


Ah, good catch! I see them in one of the screenshots. Those are just inlay hints, they're not in the source code. (The editor is https://zed.dev)


Yes, it can all be found here: https://github.com/wasmgroundup/code

Over the course of the book, we also build up a small library for creating Wasm modules and emitting bytecode; that's available as an NPM package (https://www.npmjs.com/package/@wasmgroundup/emit) and the code is here: https://github.com/wasmgroundup/emit


Hi HN! Co-author of the book here, happy to answer any questions you have.

Beyond the sample chapters which are linked from the landing page, we also have a couple blog posts which may be interesting:

- A WebAssembly Interpreter: https://wasmgroundup.com/blog/wasm-vm-part-1/

- An older blog post, "A WebAssembly compiler that fits in a tweet" (https://wasmgroundup.com/blog/wasm-compiler-in-a-tweet), was also on HN earlier this year: https://news.ycombinator.com/item?id=42814948


It looks good! I may pick up a copy. Besides the convenience, why do you recommend your book over reading the WASM spec as you implement your basic compiler? I find that WASM has a beautifully readable spec. One of its best features!

Either way, I’ll likely buy a copy to support the hard work on a piece of tech that I am very fond of.


> I find that WASM has a beautifully readable spec. One of its best features!

Disclaimer: I'm one of the guys whose face is advertising this book, as someone who bought the early access version and loved it enough to help a bit with proofreading.

I'm a self-taught programmer who essentially started from the lowest level with Z80 assembly on the TI-83+. I just wanted to know how to fit the bytes together directly without dealing with the rest of the toolchain.

I've tried reading the spec multiple times, and what it revealed to me is that my lack of formal training in the subject matter is really holding me back here. I feel like I can follow 90% of it, but that doesn't really matter. It's the remaining 10% I don't understand that does.

The spec is written as a reference and gives you all the pieces, but doesn't really do a great job at fitting all the pieces together.

Everyone I know who does have some relevant background in compiler writing agrees with you, though. So I think that for them it's obvious how to fit the pieces together.

Speaking for myself though, this is the first book that made the bytecode "click" as a whole.

Having said that, I think this book and the spec together are the real combo to go for. The book covers the core, and understanding that foundation makes all the extensions easy to grasp from spec alone.


Thank you!

I would 100% agree that the spec is quite readable. At the top of our Minimum Viable Compiler chapter, we say:

> The binary module format is defined in the WebAssembly Core Specification. You’ll notice that we back up many of our explanations with reference to the relevant part of the spec. One of our goals with this book is to convince you that the spec is a valuable resource that’s worth getting familiar with.

I think the spec is great as reference material, but we wrote the book to be more of a tutorial. We've talked to many people who say they've looked at the spec, but find it too overwhelming. For those people, we hope the book provides a good structure that ultimately helps them become comfortable with the spec!
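As a tiny taste of that tutorial approach (this mirrors the spirit of the book's early chapters, not its actual code): the smallest valid Wasm module is nothing but the magic number and version defined in the Binary Format section of the core spec, and you can check your bytes against a real engine immediately.

```javascript
// The smallest valid Wasm module: 4-byte magic ("\0asm") + 4-byte version (1).
// Both values come straight from the Binary Format section of the core spec.
const emptyModule = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic: \0asm
  0x01, 0x00, 0x00, 0x00, // version: 1
]);

console.log(WebAssembly.validate(emptyModule)); // true
```

From there, every section you append (types, functions, exports…) can be cross-checked against the corresponding part of the spec.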


Doing is a far better way to learn than just reading.


And is there anything you’ve done that has helped you learn WebASM?


Does it cover tail calls?


Edit: changed slightly to provide a more useful answer.

No, it doesn't — not this version of the book at least. We only cover WebAssembly 1.0.

That said, as my co-author says below, there's really not much to tail calls. Once you've worked through the book, you'd be able to grok tail calls pretty quickly.

As an aside — 2.0 was announced just a few weeks after we launched the book, and 3.0 a few months ago. And with 3.0 (which added tail calls), the spec has more than doubled in size vs 1.0, so it would be hard to cover everything.

We've talked about doing a new chapter to cover some of the interesting parts of 2.0 (e.g. SIMD), but covering everything in 3.0 (garbage collection, typed references, exception handling, tail calls…) feels almost like an entire 2nd book!


Co-author here.

If you are interested in tail calls, you just need to understand the call instruction, which we cover in the book, and then replace it with either:

- return_call <funcidx>, the tail-call version of call

- return_call_indirect <tableidx> <typeidx>, the tail-call version of call_indirect

More info here: https://github.com/WebAssembly/tail-call/blob/main/proposals...
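At the bytecode level it's essentially a one-byte change. Per the tail-call proposal, return_call is opcode 0x12 (call is 0x10) and return_call_indirect is 0x13 (call_indirect is 0x11). A sketch in JavaScript, with the instruction encoding written out by hand:

```javascript
// Opcode values from the Wasm core spec and the tail-call proposal.
const opcodes = {
  call: 0x10,
  call_indirect: 0x11,
  return_call: 0x12,          // tail-call version of call
  return_call_indirect: 0x13, // tail-call version of call_indirect
};

// A direct call to function index 0...
const plainCall = [opcodes.call, 0x00];

// ...becomes a tail call just by switching the opcode; the immediate
// (the function index) is encoded the same way.
const tailCall = [opcodes.return_call, 0x00];

console.log(tailCall); // [ 18, 0 ]
```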


Thank you both for the replies.


It's explained in the post:

> Then, my plan was to construct a ProseMirror transaction that would turn the old tree into the new one. To do that, it’s helpful to know which nodes appeared in the old document, but not the new one.

So, it's not actually about reclaiming the memory. It's about taking some action on the nodes that will be reclaimed. It's akin to a destructor/finalizer, but I need that to happen synchronously at a time that I control. JavaScript does now support finalization (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...) but it can't be relied on to actually happen, which makes it useless in this scenario.
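A minimal sketch of the deterministic alternative (invented names, not the post's actual code): explicit retain/release, where the cleanup callback runs synchronously the moment the count reaches zero, rather than whenever the engine gets around to finalization.

```javascript
// Explicit reference counting: cleanup runs synchronously and predictably
// when the last reference is released, unlike FinalizationRegistry.
class RefCounted {
  constructor(onZero) {
    this.count = 1;
    this.onZero = onZero;
  }
  retain() { this.count++; return this; }
  release() {
    if (--this.count === 0) this.onZero(); // "destructor" runs right here
  }
}

let destroyed = false;
const node = new RefCounted(() => { destroyed = true; });
node.retain();          // a second owner appears
node.release();         // first owner done; count = 1, nothing happens
console.log(destroyed); // false
node.release();         // last owner done; cleanup runs synchronously
console.log(destroyed); // true
```

The cost is the usual one: every owner has to remember to call release, which is exactly the bookkeeping a GC would otherwise do for you.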


I think you're skipping over some important distinctions here.

In a mark & sweep GC, the mark phase traverses the object graph, visiting only live objects. You need to recursively visit any objects that are not already marked — this is the process known as tracing. The marking time is proportional to the number of live objects.

In the sweep phase, you typically do a linear traversal of memory and reclaim any objects that are not marked. You do not examine pointers inside any objects, so the graph structure is irrelevant. The time is proportional to the total size of memory.

In reference counting, when a refcount hits 0, you need to decrement the refcount of pointed-to objects, and recursively visit any objects whose refcount is now 0. The time is proportional to the number of objects that have just died.

Structurally, decrementing refcounts is *very* similar to tracing. You're right that its purpose is similar to the sweep phase of a mark & sweep GC, but they aren't algorithmically similar.
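The structural similarity is easy to see side by side. A toy sketch, with objects represented as plain records (this is illustrative, not any particular collector's implementation):

```javascript
// Tracing (mark phase): recursively visit unmarked live objects.
// Time is proportional to the number of live objects.
function mark(obj) {
  if (obj.marked) return;
  obj.marked = true;
  for (const child of obj.refs) mark(child);
}

// Refcounting: when a count hits zero, decrement children and recurse
// into any that just died. Same recursive shape as mark(), but it walks
// only the objects that died, not the live graph.
function release(obj, onDead) {
  if (--obj.rc > 0) return;
  onDead(obj);
  for (const child of obj.refs) release(child, onDead);
}

// A tiny graph: root -> a -> b, each with a single reference.
const b = { rc: 1, refs: [], marked: false };
const a = { rc: 1, refs: [b], marked: false };
const root = { rc: 1, refs: [a], marked: false };

mark(root);
console.log([root.marked, a.marked, b.marked]); // [ true, true, true ]

const dead = [];
release(root, (o) => dead.push(o)); // root dies, then a, then b
console.log(dead.length); // 3
```

Note that neither function looks anything like a sweep, which is a linear pass over memory with no pointer chasing at all.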


It seems that reference counting traces both the live and dead space, in the sense that refcounts have to be carefully maintained as the ownership sharing is propagated, and then the finalization of a refcounted object can trigger other dead ones to be identified.

Garbage collection could also trace both spaces. After we have performed the mark phase of basic mark-and-sweep, we could again go back to the root pointers and traverse the object graph again in the same way, this time looking for objects which are not marked reachable. Similarly to refcounting, we could do the finalization for each such object and then recursively look for others that are unreachable.

We don't do that for various reasons:

- the belief that it's faster to traverse the heaps, even though, in doing so, it may be necessary to skip numerous free objects.

- the use of a copying algorithm, whereby we move the live objects to a new heap, so there is no need to trace or otherwise traverse the dead space in detail.

The belief is justified because there usually aren't any free objects when GC is spontaneously triggered. A recursive trace visits the entire graph of objects that were previously live, some of which are now dead. A sweep through the heaps visits all the same ones, plus also some entries marked free (of which there are none in a spontaneous GC pass).

But the recursive traversal involves inefficient dependent pointer loads, poor caching, and the visitation of the same object multiple times. While in the case of dead objects, we can easily tell that we are visiting a dead object a second time (the first time we finalized it and marked it free), we have to account for multiple visits to the reachable ones; we have to paint each one a new color to indicate that it was visited in the second pass.
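For contrast, a toy sweep over a flat heap array (names invented) shows why that pass avoids the pointer-chasing problem entirely: each cell is visited exactly once, in address order, and the objects' internal pointers are never examined.

```javascript
// Sweep: one linear pass over the whole heap, no pointer chasing.
// Time is proportional to total heap size, not to the object graph.
function sweep(heap, finalize) {
  for (const obj of heap) {
    if (!obj.marked && !obj.free) {
      finalize(obj);
      obj.free = true; // reclaim without looking at obj's pointers
    }
    obj.marked = false; // reset mark bits for the next GC cycle
  }
}

const heap = [
  { marked: true,  free: false }, // live: survives, mark bit cleared
  { marked: false, free: false }, // dead: finalized and freed
  { marked: false, free: true  }, // already free: skipped
];

let finalized = 0;
sweep(heap, () => finalized++);
console.log(finalized);    // 1
console.log(heap[1].free); // true
```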


(OP here) It’s possible, but I doubt it. Perhaps the way I wrote it makes it sound like I was thinking about it as a GC problem from the beginning, but I wasn’t. It wasn’t until I started seeing it as being like GC that (a) I realized that my naive solution was akin to tracing GC, and (b) I came up with the reference counting solution.


Hi HN! I'm the main author of the Ohm Editor, and co-creator and primary maintainer of Ohm itself.

Ohm is a user-friendly parsing toolkit for JavaScript/TypeScript. You can find out more at https://ohmjs.org.

Anyone interested in the editor might like to check out a blog post I wrote about it a few years ago called "Visualizing Packrat Parsing" (https://dubroy.com/blog/visualizing-packrat-parsing/)

Some people might also be interested in our book WebAssembly from the Ground Up (https://wasmgroundup.com): you can learn Wasm by building a simple compiler in JavaScript, using Ohm.

