Hacker News | tavianator's comments

Presumably that's just a mistake. The author calls it "stochastic gradient descent" correctly elsewhere in the article.


My bfs project also uses io_uring: https://github.com/tavianator/bfs/blob/main/src/ioq.c

I'm curious how lsr compares to `bfs -ls`, for example. bfs only uses io_uring when multiple threads are enabled, but maybe it's worth using it even for `bfs -j1`.


Oh, that's cool. `find` is another tool I thought could benefit from io_uring, like `ls`. I think it's definitely worth enabling io_uring for single-threaded applications for the batching benefit. The kernel will still spin up a thread pool to get the work done concurrently, but you don't have to manage that in your codebase.


I did try it a while ago and it wasn't profitable, but that was before I added stat() support. Batching those is probably good.
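For flavor, here's roughly what batching stat() calls through io_uring looks like with liburing. This is a hand-written sketch, not bfs's actual ioq.c; `stat_batch`, `paths`, and `stats` are made-up names:

    /* Sketch: submit one statx() per path as a single batch, then reap
     * the completions.  Error handling is minimal, and n is assumed to
     * fit in the ring. */
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <liburing.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/stat.h>

    int stat_batch(const char **paths, struct statx *stats, unsigned n)
    {
        struct io_uring ring;
        if (io_uring_queue_init(n, &ring, 0) < 0)
            return -1;

        /* Queue one statx() per path... */
        for (unsigned i = 0; i < n; ++i) {
            struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
            io_uring_prep_statx(sqe, AT_FDCWD, paths[i], 0,
                                STATX_BASIC_STATS, &stats[i]);
            io_uring_sqe_set_data64(sqe, i);
        }

        /* ...submit them all with one syscall... */
        io_uring_submit(&ring);

        /* ...and reap the completions (not necessarily in order). */
        for (unsigned i = 0; i < n; ++i) {
            struct io_uring_cqe *cqe;
            io_uring_wait_cqe(&ring, &cqe);
            if (cqe->res < 0)
                fprintf(stderr, "statx(%s): %s\n",
                        paths[io_uring_cqe_get_data64(cqe)],
                        strerror(-cqe->res));
            io_uring_cqe_seen(&ring, cqe);
        }

        io_uring_queue_exit(&ring);
        return 0;
    }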


and grep / ripgrep. Or did ripgrep migrate to using io_uring already?


No, ripgrep doesn't use io_uring. Idk if it ever will.


Curious: Why? Is it not a good fit for what ripgrep does? Isn't the sort of "streaming" "line at a time" I/O that ripgrep does a good fit for async io?


For many workloads, ripgrep spends the vast majority of its time searching through files.

But more practically, it would be a terror to implement. ripgrep is built on top of platform-specific standard file system APIs. io_uring would mean a whole heap of code to work with a different syscall pattern, in addition to the existing code path for non-Linux targets.

So to even figure out whether it would be worth doing that, you would need to do a whole bunch of work just to test it. And because of my first point above, there is a hard limit on how much of an impact it could even theoretically have.

Where I would expect this to help is to batch syscalls during directory tree traversal. But I have no idea how much it would help, if at all.


I believe that io_uring does not support getdents (despite multiple patch series being proposed). So you'd get async stat() calls, if you need them, but nothing else.


Oh. Well, that's definitely a deal breaker then. ripgrep already avoids stat calls on most files.


You may want to look into improvements to A* for grids, like Rectangular Symmetry Reduction.



If you just use A*, but rank the open list by lowest (f, h) pairs, then the search frontier dives straight for the goal despite there being multiple optimal paths, as the tie-breaking ensures we prefer nodes that seem closest to the goal.
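A minimal sketch of that tie-break in C (the node struct is hypothetical; any open-list ordering that compares this way works):

    /* Order nodes by f = g + h, breaking ties by smaller h, so that
     * among equally promising nodes we expand the one that looks
     * closest to the goal. */
    struct node {
        double g;  /* cost from the start */
        double h;  /* heuristic estimate to the goal */
    };

    static int node_cmp(const void *a, const void *b)
    {
        const struct node *x = a, *y = b;
        double fx = x->g + x->h, fy = y->g + y->h;
        if (fx != fy)
            return (fx > fy) - (fx < fy);
        return (x->h > y->h) - (x->h < y->h);  /* tie-break on h */
    }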


AFAIK jump point search would work for uniform-cost grids, but not with the exponential term that OP has.



There was no active one. We saw this and thought it would be a nice nod to history. We've actually spoken to some developers at Apple who thought this was really neat :)


They were referencing the title of a specific paper, which invented type classes: https://dl.acm.org/doi/10.1145/75277.75283


Thank you, I did not know this paper and totally missed the reference.


The C standard (since C99) says that `main()` has an implicit `return 0`; you don't need to write it explicitly.
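For example, this is a complete, conforming C99 program:

    #include <stdio.h>

    int main(void)
    {
        puts("hello");
        /* No return statement needed: since C99, reaching the end of
         * main() behaves as if it returned 0. */
    }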


Sure but are we teaching good habits to students, or are we golfing?


Given how many tutorials leave out best practices for proper error handling, strings, and arrays in C, or for doing analysis as part of the build, I would say golfing, most of the time.


I was looking for a relevant paper and found this one, which isn't what I was looking for but is related: https://sigops.org/s/conferences/hotos/2023/papers/liargkova...


My actual "production" implementation of this concept does support that: https://github.com/tavianator/bfs/blob/main/configure

But I wanted the blog post sized version to be simpler for exposition.


Very nice


Just tried reconfiguring LLVM:

    27.24s user 8.71s system 99% cpu 36.218 total
Admittedly the LLVM build time dwarfs the configuration time, but still. If you're only building a smaller component, then the config time dominates:

    ninja libc  268.82s user 26.16s system 3246% cpu 9.086 total


Nice! I used to do something similar; I don't remember exactly why I had to switch, but the two-step process did become necessary at some point.

Just from a quick peek at that repo, nowadays you can write

    #if __has_attribute(cold)

and avoid the configure test entirely. Probably wasn't a thing 10 years ago though :)


The problem is that the various `__has_foo` aren't actually reliable in practice - they don't tell you if the attribute, builtin, include, etc. actually works the way it's supposed to without bugs, or if it includes a particular feature (accepts a new optional argument, or allows new values for an existing argument, etc.).
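So in practice you often end up pairing `__has_attribute` with compiler version checks anyway. A sketch (the GCC < 5 cutoff below is purely illustrative, not a real bug range):

    /* __has_attribute only proves the compiler recognizes the name,
     * not that the attribute behaves correctly in this release. */
    #define ATTR_COLD
    #ifdef __has_attribute
    #  if __has_attribute(__cold__) && !(defined(__GNUC__) && __GNUC__ < 5)
    #    undef ATTR_COLD
    #    define ATTR_COLD __attribute__((__cold__))  /* illustrative cutoff */
    #  endif
    #endif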


    #if __has_attribute(cold)
You should use double underscores on attribute names to avoid conflicts with macros (identifiers beginning with double underscores are reserved, so user code can't legally define macros that would conflict).

    #if __has_attribute(__cold__)
    #  warning "This works too"
    #endif

    static void __attribute__((__cold__))
    foo(void)
    {
        // This works too
    }


Yep. C's really come a long way with the special operators for checking if attributes exist, if builtins exist, if headers exist, etc.

Covers a very large part of what is needed, making fewer and fewer things need to end up in configure scripts. I think most of what's left is checking for the existence and shape of items (types, functions), as you were doing :). I can dream about a nice special operator to check for fields/functions, which would let us remove even more from configure time, but I suspect we won't get one, because that requires type resolution and none of the existing special operators do that.


You still need a configure step for the "where are my deps" part of it, though both autotools and CMake would be way faster if all they were doing was finding, and not any testing.


True. That isn't something the compiler can do.

That said, determining the cflags and ldflags for dependencies is something that might be mixed into compilation a bit more than it is now. Imagine that if we annotated which compilation units need a particular system library, we could start building the code that doesn't depend on that library while determining the library's location/flags (i.e. running pkg-config or doing other funny business) at the same time.

Or since we're in the connected era, perhaps we're downloading the library we require if it's not found and building it as an embedded component.

With that type of design, it becomes clearer why moving as much work as possible to the build stage (where we can better parallelize, because most of the work is in that stage) and more accurately describing dependencies (so we don't block work that could run sooner) can be helpful in builds.

Doing that type of thing requires a more flexible build system, though: we'd really need the pieces of "work" run by the build system to be able to dynamically add additional work for the build system to schedule. I'm not sure there are many build systems that support this.


Download/build on demand is cute when it works, but it's a security nightmare and a problem for Nix which runs the build in an environment that's cut off from the network.

This is already a problem for getting Bazel builds to run nicely under Nix, and the current solution (predownload everything into a single giant "deps" archive in the store, then treat that as a fixed-output derivation with a known hash value) is deeply non-optimal. Basically, I hope that any such schemes have a well-tested fallback path for bubbling the "thing I would download" information outward, in case there are reasons to want to separate those steps.


I agree that there are problems when layering multiple build systems on top of one another, and I see that often as a user of Nix (it's also bad with Rust projects that use Cargo; folks have written a variety of options, but they all have tradeoffs).

To some extent, the issue here is caused by just what I was discussing above: Nix derivations can't dynamically add additional derivations (i.e., build steps not being able to dynamically add additional build steps makes things non-optimal).

I am hopeful that Nix's work on dynamic derivations will improve the situation for Nix (with respect to Bazel, Cargo, and others) over time, and I am hopeful that other build systems will recognize how useful dynamically adding build steps can be.


It's true: fundamentally, nothing about a build realizing partway through that it needs more stuff breaks the Nix philosophy, assuming the build holds a hash for whatever it wants to pull so that everything stays hermetic. It's a bit annoying to not know upfront exactly what your build graph looks like, but honestly it's not the worst; you already don't know how long each derivation is going to take.

In any case, the Tvix devs have definitely understood the assignment here: they're not only making IFD a first-class citizen, but also tackling the larger issue of letting the evaluation step decompose, with the decomposed pieces running in parallel with each other and with builds. That really is the game-changer, particularly with a cluster-backed build: being able to start work immediately rather than waiting out a 30-60 second single-threaded eval.

