It's really the other way around: we joined Fastly because we knew it's a place where we could do this kind of work in the open.
None of the code involved here existed a year ago, and none of it was somehow forced to be open source. (Also, the code in these two repositories is in many ways the most boring part of this solution. See this blog post for an excellent overview of the more interesting parts: https://bytecodealliance.org/articles/making-javascript-run-...)
While there was no acquisition involved, a whole group of folks working on WebAssembly at Mozilla (myself included) moved to Fastly last Fall. What I tried to emphasize is that instead of the projects at hand here being open source because they somehow had to be, we all joined Fastly because it's a place where this kind of project can be made open source (and be created in the first place!) :)
For languages that can express unforgeable pointers as a first-class concept, that is indeed a very attractive, fine-grained approach. Unfortunately, bringing that to languages like C/C++/Rust is a different matter altogether.
Since we want to support those languages as first-class citizens, we can't require GC support as a base concept, so we have to treat a nanoprocess as the unit of isolation from the outside.
Once we have GC support, nothing will prevent languages that can use it from expressing finer-grained capabilities even within a nanoprocess, and that seems highly desirable indeed.
(full disclosure: I'm a Mozilla employee and one of the people who set up the Bytecode Alliance.)
That future possibility reminds me of https://en.wikipedia.org/wiki/Singularity_(operating_system) - where process/address-space isolation was replaced with fine-grained static verification of high-level code (presumably not the first experiment in this area).
Indeed: that and many other things are prior art in this space. And there is a lot of prior art for what we're working on—this is not meant as an academic research project! :)
Yes, one of the answers I want to give any time someone asks "why will WASM succeed when the JVM didn't" is that there is 25 years more experience and research to draw upon.
And yet bounds-checked access validation was left out of the design, something most previous research projects took care to address by tainting packages as unsafe when such accesses were present.
> For languages that can express unforgeable pointers as a first-class concept, that is indeed a very attractive, fine-grained approach. Unfortunately bringing that to languages like C/C++/Rust is a different matter altogether.
The semantics of these languages aren’t incompatible with unforgeable references, though: it generally works in practice, but it’s technically undefined to create pointers out of thin air. Why can’t we take advantage of the standard here to disallow illegally created references? (Which, as I understand it, many other vendors are already beginning to do with e.g. pointer authentication and memory tagging.)
What would allow other languages to represent unforgeable pointers as a first-class concept and not C/C++/Rust?
Forging a pointer is UB in all of these languages as far as I know.
It seems like you should be able to have opaque types that represent these unforgeable pointers which you can't do arithmetic on or cast to raw pointers, but can access values in type safe ways, or provide a view to a byte slice which does bounds check on access.
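To make the idea concrete, here is a minimal sketch in Rust of what such an opaque type could look like. The names (`ByteCap`, `get`, `view`) are hypothetical, not part of any real API: the point is only that outside the module there is no pointer arithmetic and no cast to a raw pointer, just bounds-checked accessors.

```rust
// Sketch of an opaque, unforgeable handle (hypothetical names).
mod caps {
    pub struct ByteCap {
        data: Vec<u8>, // backing store; never exposed directly
    }

    impl ByteCap {
        pub fn new(data: Vec<u8>) -> ByteCap {
            ByteCap { data }
        }

        /// Bounds-checked single-byte read; returns None instead of trapping.
        pub fn get(&self, idx: usize) -> Option<u8> {
            self.data.get(idx).copied()
        }

        /// A read-only, bounds-checked view of a sub-range.
        pub fn view(&self, start: usize, len: usize) -> Option<&[u8]> {
            self.data.get(start..start.checked_add(len)?)
        }
    }
}

fn main() {
    let cap = caps::ByteCap::new(vec![10, 20, 30]);
    assert_eq!(cap.get(1), Some(20));
    assert_eq!(cap.get(3), None); // out of bounds: no UB, just None
    assert_eq!(cap.view(1, 2), Some(&[20u8, 30u8][..]));
    assert_eq!(cap.view(2, 5), None);
}
```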
Is there a good place for discussion of this design? I seem to be having this conversation with you and Josh both here and on Reddit, and it seems like a lot of the discussion is spread out in a lot of places.
In unsafe rust you can arbitrarily increase the length of a vector/string by modifying the stored length. You do not need to forge the pointer itself to break the pointer's invariant.
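A small, well-defined demonstration of that point: the code below writes into a `Vec`'s spare capacity behind its back and then bumps the stored length with `set_len`. No pointer is ever forged, yet safe code afterwards observes elements that were never `push`ed.

```rust
use std::ptr;

fn main() {
    let mut v: Vec<u8> = Vec::with_capacity(8);
    v.push(1);
    unsafe {
        // Initialize the spare capacity directly, bypassing the safe API...
        for i in 1..8 {
            ptr::write(v.as_mut_ptr().add(i), 0xAA);
        }
        // ...then modify the stored length. The Vec's invariant about
        // which elements exist was broken without forging any pointer.
        v.set_len(8);
    }
    assert_eq!(v.len(), 8);
    assert_eq!(v[7], 0xAA); // visible to safe code, never pushed
}
```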
You would need to do either static or dynamic bounds checking when accessing memory via these capabilities. You obviously can't just give arbitrary code a pointer and let it read however far it wants past the end of it.
Given that most code in Rust is safe code and includes bounds checks before access, you should be able to have the verifier rely on those when they exist, and add in bounds checks in cases in which the access is not protected by a bounds check.
Maybe that would be intractable, or too inefficient to be worth it with all of the extra bounds checks. I'm not sure. I'm asking because it's something that I feel should be possible, but I haven't been involved in the research or development, so I'm wondering if those who have been more involved have references to discussion about the topic.
That is how we support references in the Rust toolchain right now, via wasm-bindgen, and it's an important part of making unforgeable references work for languages that rely on linear memory.
It doesn't help with making capabilities more fine-grained, though: we have to treat all code that has access to that table as having the same level of trust.
To expand on this, capabilities allow us to go further than pledge(2): it enables selective forwarding of capabilities to other nanoprocesses, such as only forwarding a handle to a single file out of a directory, or a read-only handle from a read-write one, etc...
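A toy Rust sketch of that narrowing pattern (all names here are hypothetical, not WASI's actual API): a directory capability can be attenuated before being forwarded, either down to a single file or from read-write to read-only, while the reverse direction (amplifying rights) simply isn't expressible.

```rust
// Toy capability sketch (hypothetical names, no real filesystem access).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Rights { ReadOnly, ReadWrite }

struct DirCap { path: String, rights: Rights }
struct FileCap { path: String, rights: Rights }

impl DirCap {
    /// Forward a handle to a single file instead of the whole directory.
    fn open_file(&self, name: &str) -> FileCap {
        FileCap { path: format!("{}/{}", self.path, name), rights: self.rights }
    }

    /// Derive a read-only capability from a read-write one. No method
    /// exists to go the other way, so rights can only shrink.
    fn read_only(&self) -> DirCap {
        DirCap { path: self.path.clone(), rights: Rights::ReadOnly }
    }
}

fn main() {
    let dir = DirCap { path: "/preopen/data".into(), rights: Rights::ReadWrite };
    let file = dir.read_only().open_file("config.toml");
    assert_eq!(file.path, "/preopen/data/config.toml");
    assert_eq!(file.rights, Rights::ReadOnly);
}
```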
I fear that at the end of the day, capabilities will have the same fate as other sandboxing mechanisms: nobody will use them. And, just to make their application work and avoid the support burden, developers will tell people to use a setup that grants access to everything.
pledge(2) and unveil(2) learned from the past and are way simpler. I really wish WebAssembly had adopted similar mechanisms.
Agreed, and there are a lot of UX questions to sort out. Many security concepts took many attempts to figure out in full (or to the extent that they have been figured out :))
One important aspect here is that this doesn't just target whole apps. It also targets developers using dependencies: while it's desirable to restrict an application's capabilities, there's a lot of value in developers only giving packages they depend on very limited sets of capabilities. And that seems much more tractable, given that kitchen-sink packages aren't what most people want to use anyway.
On 1, the libc we're working on[1] is based on musl. It won't ever be 100% compatible with all code, because that runs into constraints imposed by our security goals, but the vast majority of code should eventually just compile when targeting this. (Eventually, because this is all early days.)
On 2, yes, that is explicitly the goal. I'd add that it's not just about OSes, but also about platforms and hardware form factors.
We've mainly based the current design on CloudABI/Capsicum, but it's all early days, and Fuchsia is on our list of systems to at the very least take heavy inspiration from :)
And the layout of structs, strings, etc is up to the compiler, within the bounds of the restrictions WebAssembly imposes.
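As a small illustration that layout is a toolchain decision rather than something Wasm dictates: in Rust, the default layout of a struct is unspecified, and only an explicit `#[repr(C)]` pins it down to C's rules (here, 3 bytes of padding after `tag` so that `len` is 4-byte aligned).

```rust
use std::mem::{align_of, size_of};

// With repr(C), the compiler must follow C layout rules.
#[repr(C)]
struct Header {
    tag: u8,  // 1 byte, followed by 3 bytes of padding
    len: u32, // aligned to 4 bytes
}

fn main() {
    assert_eq!(size_of::<Header>(), 8);
    assert_eq!(align_of::<Header>(), 4);
}
```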
We'll definitely have a test suite, but this is all early days, so a lot of all that isn't yet in place.
And yes, this can be targeted by LLVM-based and other compilers. In fact, Emscripten could use this as the foundation for their POSIX-like libc and library packages. The syscalls are indeed exposed as Wasm function imports.
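Concretely, in the WebAssembly text format a WASI syscall looks like any other function import: the module declares a signature under the `wasi_snapshot_preview1` namespace (shown here with `fd_write`), and the host engine supplies the implementation.

```wat
;; A WASI syscall as seen from a module: a plain function import.
(module
  (import "wasi_snapshot_preview1" "fd_write"
    (func $fd_write (param i32 i32 i32 i32) (result i32))))
```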
/tmp/[DE][AD][BE][EF].txt # ext2 / linux
# OR
C:\stuff\[DEED][FFFE].txt # ntfs / windows
# where [hex] indicates a single filesystem character with that value
One fun thing about the capability model is that at the system call level, there are no absolute paths. All filesystem path references are relative to base directory handles. So even if an application thinks it wants something in C:\stuff, it's the job of the libraries linked into the application to map that to something that can actually be named. So there's room for the ecosystem to innovate, above the WASI syscall layer, on what "C:\" should mean in an application intending to be portable.
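That resolution step can be sketched in a few lines of Rust. This is a toy model, not WASI's actual implementation: a hypothetical `resolve` function maps a guest path against a preopened directory, rejecting absolute paths outright and refusing `..` components that would escape the base.

```rust
use std::path::{Component, Path, PathBuf};

// Toy sketch (hypothetical): resolve a guest path against a preopened
// directory handle, mirroring how WASI paths are always relative to a
// base directory rather than to a global root like "C:\" or "/".
fn resolve(preopen: &Path, guest: &Path) -> Option<PathBuf> {
    if guest.is_absolute() {
        return None; // no absolute paths at the syscall layer
    }
    let mut out = preopen.to_path_buf();
    for c in guest.components() {
        match c {
            Component::Normal(p) => out.push(p),
            Component::CurDir => {}
            _ => return None, // `..`, roots, prefixes: refuse to escape
        }
    }
    Some(out)
}

fn main() {
    let base = Path::new("/preopen/tmp");
    assert_eq!(resolve(base, Path::new("a/b.txt")),
               Some(PathBuf::from("/preopen/tmp/a/b.txt")));
    assert_eq!(resolve(base, Path::new("/etc/passwd")), None);
    assert_eq!(resolve(base, Path::new("../escape")), None);
}
```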
Concerning character encodings, and potentially case sensitivity, the current high-level idea is that paths at the WASI syscall layer will be UTF-8, and WASI implementations will perform translation under the covers as needed. Of course, that doesn't fix everything, but it's a starting point.
That’s good to know, but the parent’s examples seem to be referencing the issue of filenames that aren’t valid Unicode. The Linux example is invalid UTF-8, since Linux filenames are natively arbitrary byte sequences. The Windows example contains an unpaired surrogate followed by the reserved codepoint 0xfffe, since Windows filenames are natively UCS-2.
I have a dollar that says all platform difference issues will be solved by just doing whatever POSIX does and expecting the host OS to figure it out if it isn't already POSIX. Whenever you try to abstract away arbitrarily different implementations while retaining their non-common functionality you either end up reimplementing one of them and expecting the others to work around it, or you end up forcing the programmer to bypass the abstraction anyway and implement logic for each implementation.
I have worked on file APIs. There are so many differences between Windows and POSIX that abstracting them away just doesn't work. Undoubtedly, there will eventually be platform-specific APIs that implement one or the other, and cross-platform APIs that implement the intersection.
It's a good question. WASI currently doesn't allow you to set custom access-control permissions when creating files. But we're just getting started, so if we can find a design that works, we can add it.
This is indeed a problem for Wasm/JS integration. The JS WeakRef proposal[1] will address it for many use cases, and the WebAssembly GC proposal[2], combined with JS Typed Objects[3] will address many others. Even before those features will be available, I'd expect the community to iterate on patterns to make APIs nicer.