I maintain the reproducible builds effort for my company and, please, let me tel...

kardos · on July 10, 2024

The reproducible builds effort means that the same input (code, libraries, compiler, flags, build environment) should give the same output (build artifacts). I don't think RB is trying to be reproducible across platforms or across varying compilers and dependencies -- that is a much bigger and much harder goal.
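As a rough illustration of "same input, same output", one way to check it is to compare cryptographic digests of the artifacts from independent rebuilds. This is only a sketch: the function names `artifact_digest` and `is_reproducible` are made up for this example, and a real setup would hash files on disk rather than in-memory bytes.

```python
import hashlib


def artifact_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of one build artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def is_reproducible(artifacts: list[bytes]) -> bool:
    """True if every rebuild produced bit-for-bit identical output.

    `artifacts` holds the output of each independent rebuild of the
    same input (hypothetical: supplied by the caller).
    """
    digests = {artifact_digest(a) for a in artifacts}
    return len(digests) == 1


# Two rebuilds that differ only in an embedded timestamp do not match.
build_a = b"binary-contents built=2024-07-10T10:00:00"
build_b = b"binary-contents built=2024-07-10T10:05:00"
print(is_reproducible([build_a, build_a]))  # identical bytes
print(is_reproducible([build_a, build_b]))  # differing bytes
```

The comparison is deliberately bit-for-bit: anything that leaks into the artifact (timestamps, paths, ordering) shows up as a digest mismatch.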

> There is always going to be a degree of un-reproducibility just due to the nature of math.

Fundamentally the math is deterministic (reproducible): if you do the same sequence of mathematical operations you get the same result. I gather you're getting at the non-associativity of floating point (e.g., additions), which is a fair point, but if you arrange to do your floating point operations in the same sequence then it will be reproducible.
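A small sketch of that point, assuming IEEE 754 double precision (which is what Python's `float` uses on common platforms): regrouping the additions changes the result, while repeating the same sequence does not. The helper `sum_in_order` is just a name picked for this example.

```python
def sum_in_order(xs):
    """Add the values strictly left to right, one at a time."""
    total = 0.0
    for x in xs:
        total += x
    return total


# Same three numbers, two different groupings.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
grouping_matters = left != right

# Same numbers, same order, done twice: identical result.
values = [0.1, 0.2, 0.3]
same_order_twice = sum_in_order(values) == sum_in_order(values)

print(left, right, grouping_matters, same_order_twice)
```

So the non-reproducibility comes from letting the order of operations vary (for example, a parallel reduction that combines partial sums in arrival order), not from the arithmetic itself.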

orbital-decay · on July 10, 2024

Multithreaded code can easily be non-deterministic, with things like OS preemption leaking into the sandbox. Guaranteed determinism is hard; it needs a deterministic emulator.

Quekid5 · on July 10, 2024

Not really for building code, at least. It's a common problem, but it's not insurmountable. Use the equivalent of 'make -j1' or whatever...

This requires engineering on the part of compiler writers, but ultimately is solvable, given enough funding.

(About the compiler writers part: I'm thinking about GHC Haskell and the like... the Clangs/GCCs of the world will have no problems because of Translation Units. Things may change when Modules are a practical choice.)
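To illustrate why parallelism in a build need not make the output non-deterministic, here is a minimal sketch: work is done in whatever order the scheduler allows, but results are emitted in a fixed, sorted order. `compile_unit` and `build` are hypothetical stand-ins, not any real build tool's API, and this says nothing about non-determinism inside a single compiler process.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def compile_unit(name: str) -> str:
    """Stand-in for compiling one translation unit (hypothetical)."""
    return f"obj({name})"


def build(units, workers=4):
    """Compile units in parallel, then emit results in a fixed order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(compile_unit, u): u for u in units}
        # Completion order depends on thread scheduling...
        results = {futures[f]: f.result() for f in as_completed(futures)}
    # ...so the final output is ordered by unit name, not by arrival.
    return [results[name] for name in sorted(results)]


print(build(["b.c", "a.c", "c.c"]))
```

With this arrangement, `workers=1` and `workers=8` give the same output list, which is roughly what independent translation units buy you in a C or C++ build.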

hinkley · on July 12, 2024

You're gonna need a better programming language if you want to get to that level.

bobmcnamara · on July 10, 2024

For these sorts of things, please do not assume hardware is reliable and compilers are bug free.

8organicbits · on July 10, 2024

Bug free is less important for reproducible builds, so long as the bugs are reproducible. Compiler bugs are of course still a problem for other reasons.

bobmcnamara · on July 10, 2024

My point is only that even when the software is correct, either the reproducible build rate will be 100% or we've got a representative sample set.

Until then, those issues will continue to be lumped together with all the software reasons it wasn't reproducible.

kardos · on July 10, 2024

Sounds like a mechanism to test for bad hardware (akin to memtest86) and find compiler bugs =)

yjftsjthsd-h · on July 11, 2024

In a slightly different form, this has already happened: https://julialang.org/blog/2020/09/rr-memory-magic/

bobmcnamara · on July 10, 2024

Bingo - even if we do everything right the percentage of matching builds isn't expected to be 100%. If it is, there just aren't enough samples.

My first Pentium could usually, but not always, do math correctly until Intel replaced it during a recall.
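The sample-size argument above can be put into numbers with a back-of-the-envelope model. Everything here is an assumption for illustration: each build is treated as independent, and `p_fault` is a made-up per-build probability that a hardware fault corrupts the output. Neither the model nor the value comes from measured data.

```python
def prob_all_match(p_fault: float, n_builds: int) -> float:
    """Probability that n independent builds all match.

    Assumes each build is corrupted independently with probability
    p_fault (a hypothetical figure, not a measured one).
    """
    return (1.0 - p_fault) ** n_builds


def expected_mismatches(p_fault: float, n_builds: int) -> float:
    """Expected number of corrupted builds under the same assumptions."""
    return p_fault * n_builds


# With a one-in-a-million fault rate, a small sample almost always
# looks perfect, while a very large one almost never does.
for n in (10, 10_000, 10_000_000):
    print(n, prob_all_match(1e-6, n), expected_mismatches(1e-6, n))
```

Under this model a 100% match rate is the expected outcome for small samples even with faulty hardware in the mix, so it says little until the number of builds is comparable to one over the fault rate.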

BonusPlay · on July 10, 2024

From my experience, you either go full reproducible builds with nix, or none at all.

Sitting in the middle results in additional downsides from modifying the pipeline without the core upsides of reproducible builds.

torcete · on July 11, 2024

Since hardware is the most difficult piece to replicate exactly for a reproducible build, should we say that for a completely reproducible environment we need an abstract virtual machine?

actionfromafar · on July 10, 2024

So we need our toolchains to run in WASM, too. Haha, only serious.