The "Colored function problem" is a complete fallacy: aside from the fact that async and non-async fns are interoperable in Rust (block_on) and other languages, the same argument could be made against functions with any preconditions.
> The solution to expensive context switches is cheap context switches, plain and simple.
Except the performance difference between a kernel-mode context switch and a user-mode one is only going to narrow in the future. The overhead that cannot be eliminated from context switches is their effect on the cache, since you start to run into the laws of physics at that point...
The real solution to expensive context switches is to just do fewer of them... No context switch is always faster than a "fast" context switch.
> I sincerely believe this is Rust's ballpark
I think it's plausible that Rust could get a library-level solution for fibers that does not rely on unstable details of the compiler. Rust will never again have that baked into the language, as it would make the language completely unsuitable for many low-level tasks.
Fibers, especially the way they are implemented in Go, come with a lot of their own complexity.
Just look at this issue: https://marcan.st/2017/12/debugging-an-evil-go-runtime-bug/
This is just one of the segfaults caused by Go's complex stack control. I don't want to rely on a runtime that contains these sorts of bugs, and the best way to avoid that is to avoid having a runtime in the first place.
Blocking an OS thread as a means of compatibility is not exactly what we're trying to do here.
> Except the performance difference between a kernel-mode context switch and a user-mode one is only going to narrow in the future
OS overhead can be minimized, but program stacks are a function of the language's design. And if you're not right-sizing the stack at preemption points, you'll be switching large parts of it. This means you _must_ have stackful coroutines if you want to keep switching threads.
> The real solution to expensive context switches is to just do fewer of them... No context switch is always faster than a "fast" context switch.
Sure, but writing the perfect assembly and using gotos has always been the fastest. Abstraction has a cost, and some runtimes/languages are currently proving that they can reduce this cost to effectively zero under current conditions, where IO is much costlier than a few hundred nanoseconds. We just happen to be at a time where the compiler is starting to be smarter than the user. But I guess the benchmarks will settle all this.
So that's a bug in the Go compiler. They can either fix it or pay "a small speed penalty (nanoseconds)" as a workaround, which the author qualifies as "acceptable".
Yes, that's not the absolute performance possible. But why care about that? At some point it all comes down to total cost of ownership (except for latencies in HFT), and TCO tells you that it's OK. Development complexity and maintainability matter. Especially when you can max out your IO usage for decades to come.
> Blocking an OS thread as a mean to be compatible is not exactly what we're trying to do here.
It's what Go does whenever you call into a C function or make a system call. As long as those blocking functions are not the bottleneck, it works fine.
My problem with the "colored function" analogy is that it implies that the problem is somehow due to the surface syntax, when in reality the problem still exists in all languages that support procedural IO: some of those languages like to just pretend that the problem doesn't exist.
The only language I'm aware of which truly solves that is Haskell, since all IO happens via a monad.
> Yes, that's not the absolute performance possible. But why care about that?
This point was not about performance. It's about the pitfalls of writing all of your code on top of a complex and buggy runtime.
Programming is a lot simpler, and development is a lot faster, when I don't have to worry about that.
There are also several things that are a lot more complicated when you bake a complex runtime into the language like Go does. For one, thread-local storage is completely broken. If you do any kind of GUI programming, you may need to use `runtime.LockOSThread`, as most GUIs expect function calls from a single thread. Etc., etc.
I don't know too much about Rust, but in .NET, blocking on Task.Result is considered an anti-pattern and a Bad Thing To Do. Not in the least because it very easily leads to deadlocks.
This is considered bad in .NET because callbacks can "Post" back to the original SynchronizationContext, which depends on what async/await is being used for. For example, an await on a WPF UI thread will marshal the continuation back to the calling thread, so if you call Task.Result without configuring the task not to resume on the calling thread, you can deadlock the callback processing queue. To avoid this you would use ConfigureAwait(false), depending on your situation. It's the source of a lot of confusion in .NET. I don't believe Rust has this "feature", and if you somehow wrote code to achieve the same thing as .NET does, it probably wouldn't compile in Rust due to the ownership rules.
Wait, Rust doesn't associate executors with futures? If it does, and there is a single-threaded executor available, then it's absolutely possible to deadlock just like in .NET.
That is a good question. It is possible to avoid deadlocks in a single-threaded executor with higher-priority interrupts, but I'm no authority in this area. Maybe someone else can comment. Most of my understanding in this area comes from reading this article: https://os.phil-opp.com/async-await/
In Rust a task which is started on a given executor never leaves it. It can not switch executors like it can in C# where the continuations could be called on an arbitrary thread. In Rust, wakeups for an async task essentially just schedule it for running again, but the execution happens on the previous executor.
Anyway, in both models you can have deadlocks. And even if there is no deadlock, blocking the eventloop is still an antipattern, since it prevents other tasks which might be able to make progress from running.