Hacker News | tavianator's comments

Presumably that's just a mistake. The author calls it "stochastic gradient descent" correctly elsewhere in the article.


My bfs project also uses io_uring: https://github.com/tavianator/bfs/blob/main/src/ioq.c

I'm curious how lsr compares to `bfs -ls`, for example. bfs only uses io_uring when multiple threads are enabled, but maybe it's worth using it even for `bfs -j1`.


Oh, that's cool. `find` is another tool I thought could benefit from io_uring, like `ls`. I think it's definitely worth enabling io_uring for single-threaded applications for the batching benefit. The kernel will still spin up a thread pool to get the work done concurrently, but you don't have to manage that in your codebase.


I did try it a while ago and it wasn't profitable, but that was before I added stat() support. Batching those is probably good.
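For flavor, here's roughly what batching stat() calls through io_uring looks like with liburing. This is a hand-written sketch, not bfs's actual ioq.c; `stat_batch`, `paths`, and `stats` are made-up names:

    /* Sketch: submit one statx() per path as a single batch, then reap
     * the completions.  Error handling is minimal, and n is assumed to
     * fit in the ring. */
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <liburing.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/stat.h>

    int stat_batch(const char **paths, struct statx *stats, unsigned n)
    {
        struct io_uring ring;
        if (io_uring_queue_init(n, &ring, 0) < 0)
            return -1;

        /* Queue one statx() per path... */
        for (unsigned i = 0; i < n; ++i) {
            struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
            io_uring_prep_statx(sqe, AT_FDCWD, paths[i], 0,
                                STATX_BASIC_STATS, &stats[i]);
            io_uring_sqe_set_data64(sqe, i);
        }

        /* ...submit them all with one syscall... */
        io_uring_submit(&ring);

        /* ...and reap the completions (not necessarily in order). */
        for (unsigned i = 0; i < n; ++i) {
            struct io_uring_cqe *cqe;
            io_uring_wait_cqe(&ring, &cqe);
            if (cqe->res < 0)
                fprintf(stderr, "statx(%s): %s\n",
                        paths[io_uring_cqe_get_data64(cqe)],
                        strerror(-cqe->res));
            io_uring_cqe_seen(&ring, cqe);
        }

        io_uring_queue_exit(&ring);
        return 0;
    }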


and grep / ripgrep. Or did ripgrep migrate to using io_uring already?


No, ripgrep doesn't use io_uring. Idk if it ever will.


Curious: Why? Is it not a good fit for what ripgrep does? Isn't the sort of "streaming" "line at a time" I/O that ripgrep does a good fit for async io?


For many workloads, ripgrep spends the vast majority of its time searching through files.

But more practically, it would be a terror to implement. ripgrep is built on top of platform-specific standard file system APIs. io_uring would mean a whole heap of code to work with a different syscall pattern, in addition to the existing code path for non-Linux targets.

So to even figure out whether it would be worth doing that, you would need to do a whole bunch of work just to test it. And because of my first point above, there is a hard limit on how much of an impact it could even theoretically have.

Where I would expect this to help is to batch syscalls during directory tree traversal. But I have no idea how much it would help, if at all.


I believe that io_uring does not support getdents (despite multiple patch series being proposed). So you'd get async stat() calls, if you need them, but nothing else.


Oh. Well, that's definitely a deal breaker then. ripgrep already avoids stat calls on most files.


You may want to look into improvements to A* for grids, like Rectangular Symmetry Reduction.



If you just use A*, but rank the open list by lowest (f, h) pairs, then the search frontier dives straight for the goal despite there being multiple optimal paths, as the tie-breaking ensures we prefer nodes that seem closest to the goal.
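A minimal sketch of that tie-break in C (the node struct is hypothetical; any open-list ordering that compares this way works):

    /* Order nodes by f = g + h, breaking ties by smaller h, so that
     * among equally promising nodes we expand the one that looks
     * closest to the goal. */
    struct node {
        double g;  /* cost from the start */
        double h;  /* heuristic estimate to the goal */
    };

    static int node_cmp(const void *a, const void *b)
    {
        const struct node *x = a, *y = b;
        double fx = x->g + x->h, fy = y->g + y->h;
        if (fx != fy)
            return (fx > fy) - (fx < fy);
        return (x->h > y->h) - (x->h < y->h);  /* tie-break on h */
    }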


AFAIK jump point search would work for uniform-cost grids, but not with the exponential term that OP has.



There was no active one. We saw this and thought it would be a nice nod to history. We've actually spoken to some developers at Apple who thought this was really neat :)


They were referencing the title of a specific paper, which invented type classes: https://dl.acm.org/doi/10.1145/75277.75283


Thank you, I did not know this paper and totally missed the reference.


The C standard (since C99) says that `main()` has an implicit `return 0`; you don't need to write it explicitly.
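For example, this is a complete, conforming C99 program:

    #include <stdio.h>

    int main(void)
    {
        puts("hello");
        /* No return statement needed: since C99, reaching the end of
         * main() behaves as if it returned 0. */
    }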


Sure but are we teaching good habits to students, or are we golfing?


Given how many tutorials leave out best practices for proper error handling, strings, and arrays in C, or for doing analysis as part of the build, I would say golfing, most of the time.


I was looking for a relevant paper and found this one, which isn't what I was looking for but is related: https://sigops.org/s/conferences/hotos/2023/papers/liargkova...


My actual "production" implementation of this concept does support that: https://github.com/tavianator/bfs/blob/main/configure

But I wanted the blog post sized version to be simpler for exposition.


Very nice


Just tried reconfiguring LLVM:

    27.24s user 8.71s system 99% cpu 36.218 total
Admittedly the LLVM build time dwarfs the configuration time, but still. If you're only building a smaller component, then the config time dominates:

    ninja libc  268.82s user 26.16s system 3246% cpu 9.086 total


Nice! I used to do something similar; I don't remember exactly why I had to switch, but the two-step process did become necessary at some point.

Just from a quick peek at that repo, nowadays you can write

    #if __has_attribute(cold)

and avoid the configure test entirely. Probably wasn't a thing 10 years ago though :)


The problem is that the various `__has_foo` aren't actually reliable in practice - they don't tell you if the attribute, builtin, include, etc. actually works the way it's supposed to without bugs, or if it includes a particular feature (accepts a new optional argument, or allows new values for an existing argument, etc.).
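So in practice you often end up pairing `__has_attribute` with compiler version checks anyway. A sketch (the GCC < 5 cutoff below is purely illustrative, not a real bug range):

    /* __has_attribute only proves the compiler recognizes the name,
     * not that the attribute behaves correctly in this release. */
    #define ATTR_COLD
    #ifdef __has_attribute
    #  if __has_attribute(__cold__) && !(defined(__GNUC__) && __GNUC__ < 5)
    #    undef ATTR_COLD
    #    define ATTR_COLD __attribute__((__cold__))  /* illustrative cutoff */
    #  endif
    #endif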


    #if __has_attribute(cold)
You should use double underscores on attribute names to avoid conflicts with macros (identifiers beginning with double underscores are reserved, so user code can't legally define macros that would conflict).

    #if __has_attribute(__cold__)
    #  warning "This works too"
    #endif

    static void __attribute__((__cold__))
    foo(void)
    {
        // This works too
    }


Yep. C's really come a long way with the special operators for checking if attributes exist, if builtins exist, if headers exist, etc.

Covers a very large part of what is needed, making fewer and fewer things need to end up in configure scripts. I think most of what's left is checking for the existence and shape of items (types, functions), as you were doing :). I can dream about a nice special operator to check for fields/functions, which would let us remove even more from configure time, but I suspect we won't get one, because that requires type resolution and none of the existing special operators do that.


You still need a configure step for the "where are my deps" part of it, though both autotools and CMake would be way faster if all they were doing was finding, and not any testing.


True. That isn't something the compiler can do.

That said, determining the cflags and ldflags for dependencies is something that might be mixed into compilation a bit more than it is now. Imagine that if we annotated which compilation units need a particular system library, we could start building the code that doesn't depend on that library while determining the library's location/flags (i.e. running pkg-config or doing other funny business) at the same time.

Or since we're in the connected era, perhaps we're downloading the library we require if it's not found and building it as an embedded component.

With that type of design, it becomes clearer why moving as much work as possible to the build stage (where we can better parallelize, because most of the work is in that stage) and more accurately describing dependencies (so we don't block work that could run sooner) can be helpful in builds.

Doing that type of thing requires a more flexible build system, though: we'd really need the pieces of "work" run by the build system to be able to dynamically add additional work for the build system to schedule. I'm not sure there are many build systems that support this.


Download/build on demand is cute when it works, but it's a security nightmare and a problem for Nix which runs the build in an environment that's cut off from the network.

This is already a problem for getting Bazel builds to run nicely under Nix, and the current solution (predownload everything into a single giant "deps" archive in the store, then treat that as a fixed-output derivation with a known hash value) is deeply non-optimal. Basically, I hope that any such schemes have a well-tested fallback path for bubbling the "thing I would download" information outward, in case there are reasons to want to separate those steps.


I agree that there are problems when layering multiple build systems on top of one another, and I see that often as a user of Nix (it's also bad with Rust projects that use Cargo; folks have written a variety of options, but they all have tradeoffs).

To some extent, the issue here is caused by just what I was discussing above: Nix derivations can't dynamically add additional derivations (i.e., build steps not being able to dynamically add additional build steps makes things non-optimal).

I am hopeful that Nix's work on dynamic derivations will improve the situation for Nix (with respect to Bazel, Cargo, and others) over time, and I am hopeful that other build systems will recognize how useful dynamically adding build steps can be.


It's true: fundamentally, nothing about a build realizing partway through that it needs more stuff breaks the Nix philosophy, assuming the build holds a hash for whatever it wants to pull so that everything stays hermetic. It's a bit annoying to not know upfront exactly what your build graph looks like, but honestly it's not the worst; you already don't know how long each derivation is going to take.

In any case, the Tvix devs have definitely understood the assignment here: they're not only making IFD a first-class citizen, but also tackling the larger issue of letting the evaluation step decompose, with the decomposed pieces running in parallel with each other and with builds. That really is the game-changer, particularly with a cluster-backed build: being able to start work immediately rather than waiting out a 30-60 second single-threaded eval.

