Thread locals do solve the problem. You write a wrapper around the original function: the wrapper stores the user data in a thread-local global, then passes in a small trampoline function that calls the user-data-accepting function pointer with that global.
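Roughly what that looks like, as a minimal sketch using `std::qsort` as the context-free API. The names `qsort_with_ctx`, `trampoline`, `CompareWithCtx`, and `compare_ints` are made up for illustration and not taken from any library:

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical callback type: a comparator that also receives user data.
using CompareWithCtx = int (*)(const void* a, const void* b, void* user_data);

// Thread-local slots that the trampoline reads while qsort is running.
static thread_local CompareWithCtx g_compare = nullptr;
static thread_local void* g_user_data = nullptr;

// Has the context-free signature qsort expects; forwards to the real comparator.
static int trampoline(const void* a, const void* b) {
    assert(g_compare != nullptr);
    return g_compare(a, b, g_user_data);
}

// Hypothetical wrapper: set the thread-local context, sort, then clear it.
void qsort_with_ctx(void* base, std::size_t count, std::size_t size,
                    CompareWithCtx compare, void* user_data) {
    g_compare = compare;
    g_user_data = user_data;
    std::qsort(base, count, size, trampoline);
    g_compare = nullptr;
    g_user_data = nullptr;
}

// Example comparator: user_data points to an int direction (+1 or -1).
int compare_ints(const void* a, const void* b, void* user_data) {
    int lhs = *static_cast<const int*>(a);
    int rhs = *static_cast<const int*>(b);
    int direction = *static_cast<const int*>(user_data);
    return direction * ((lhs > rhs) - (lhs < rhs));
}
```

Calling `qsort_with_ctx(v, n, sizeof(int), compare_ints, &direction)` sorts ascending or descending depending on `direction`. The context only lives for the duration of that one call.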
Thread locals don't fully solve the problem. They work well if you immediately call the closure, but what if you want to store the closure and call it later?
Because we create `reverse_sort` between creating `normal_sort` and calling it, we end up with a reverse sort despite clearly asking for a normal sort.
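A small sketch of that failure mode, assuming a hypothetical `make_comparator` factory that stashes its setting in a thread local and hands back a plain function pointer (all names other than `normal_sort` and `reverse_sort` are invented for this example):

```cpp
#include <cassert>
#include <cstdlib>

using PlainCompare = int (*)(const void*, const void*);

// Single thread-local slot shared by every "closure" the factory hands out.
static thread_local bool g_reverse = false;

static int compare_using_global(const void* a, const void* b) {
    int lhs = *static_cast<const int*>(a);
    int rhs = *static_cast<const int*>(b);
    int order = (lhs > rhs) - (lhs < rhs);
    return g_reverse ? -order : order;
}

// Hypothetical factory: records the setting, returns the shared function.
PlainCompare make_comparator(bool reverse) {
    g_reverse = reverse;
    return compare_using_global;
}

// Sorts {3, 1, 2} with the given comparator and returns the first element.
int first_after_sort(PlainCompare cmp) {
    assert(cmp != nullptr);
    int v[] = {3, 1, 2};
    std::qsort(v, 3, sizeof(int), cmp);
    return v[0];
}

int demo_overwritten_context() {
    PlainCompare normal_sort = make_comparator(false);
    PlainCompare reverse_sort = make_comparator(true);  // overwrites g_reverse
    (void)reverse_sort;
    // We asked for a normal sort, but the slot now says "reverse".
    return first_after_sort(normal_sort);
}
```

Here `demo_overwritten_context()` returns 3 rather than 1, because both "closures" are the same function reading the same slot.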
Yep. Thread locals are probably faster than the other solutions shown too.
It’s confusing to me that thread locals are “not the best idea outside small snippets” while the top solution is templating on recursion depth with a constexpr limit of 11.
The method of using static variables to store state in functions is used heavily in the ANSI C book. It’s honestly a beautiful technique when used prudently.
Imagine a comparison function that needs to call sort() as part of its implementation. You could argue that's probably a bad idea, but it would be a problem for this case.
(You could solve that with a manually maintained stack for the context in a thread local, but you'd have to do that case-by-case)
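For what it's worth, a sketch of that manually maintained stack, again wrapping `std::qsort`. `Frame`, `qsort_with_ctx`, `Row`, and the comparators are hypothetical names, and this assumes the underlying sort tolerates being called from inside a comparator on a different array:

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

using CompareWithCtx = int (*)(const void* a, const void* b, void* user_data);

struct Frame {
    CompareWithCtx compare;
    void* user_data;
};

// One stack of contexts per thread; the innermost active sort is on top.
static thread_local std::vector<Frame> g_frames;

static int trampoline(const void* a, const void* b) {
    assert(!g_frames.empty());
    // Copy the frame: a nested push_back may reallocate the vector.
    const Frame frame = g_frames.back();
    return frame.compare(a, b, frame.user_data);
}

void qsort_with_ctx(void* base, std::size_t count, std::size_t size,
                    CompareWithCtx compare, void* user_data) {
    g_frames.push_back({compare, user_data});
    std::qsort(base, count, size, trampoline);
    g_frames.pop_back();  // restore the outer sort's context
}

int compare_ints(const void* a, const void* b, void* user_data) {
    int lhs = *static_cast<const int*>(a);
    int rhs = *static_cast<const int*>(b);
    int direction = *static_cast<const int*>(user_data);
    return direction * ((lhs > rhs) - (lhs < rhs));
}

struct Row {
    int values[3];
};

// Finds a row's smallest value by sorting a copy: a nested sort call.
static int smallest(const Row& row) {
    Row copy = row;
    int ascending = 1;
    qsort_with_ctx(copy.values, 3, sizeof(int), compare_ints, &ascending);
    return copy.values[0];
}

// Comparator that itself sorts, which is the case a single slot cannot handle.
int compare_rows(const void* a, const void* b, void* user_data) {
    int lhs = smallest(*static_cast<const Row*>(a));
    int rhs = smallest(*static_cast<const Row*>(b));
    int direction = *static_cast<const int*>(user_data);
    return direction * ((lhs > rhs) - (lhs < rhs));
}
```

The push and pop have to bracket every call into the context-free API, which is why this ends up being done case-by-case. The sketch also ignores exceptions thrown from a comparator, which would skip the `pop_back`.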