random17's comments | Hacker News

I think a lot of people in the comments are getting hung up on titles and missing the real point of the post. The headline probably didn’t help with that.

The post actually does a great job of highlighting a genuinely valuable skill that the best engineers practice regardless of their title. In particular, “reducing ambiguity” is something I believe would be really beneficial for many early-career engineers to intentionally develop.


Congrats on the launch!

I'm curious about what kinds of workloads see a significant impact from GPU-accelerated compute, and what kinds still pose challenges. You mentioned that I/O is not the bottleneck; is that still true for queries that require large-scale shuffles?


Large-scale shuffles: Absolutely. One of the larger queries we ran saw a 450TB shuffle -- this may require more than just deploying the spark-rapids plugin, however (depends on the query itself and the specific VMs used). Shuffling took the majority of the time and saw 100% (...99%?) GPU utilization. I presume this is partially due to compressing shuffle partitions. Network/disk I/O is definitely not the bottleneck here.

It's difficult to say which "workloads" see a significant impact; it's easier to talk about what doesn't really work, AFAIK. Large-scale shuffles might see 4x efficiency, assuming you can somehow offload the hash shuffle memory, have scalable fast storage, etc... which we do. Note this is even on GCP, where there isn't any "great" networking infra available.
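For anyone who hasn't touched the plugin, here's a minimal sketch of what enabling it looks like on the application side, assuming the rapids-4-spark jar is already on the classpath and GPUs are exposed via Spark 3's resource scheduling (cluster-specific discovery settings and the RAPIDS shuffle manager are omitted since they vary by version):

```python
# Minimal sketch: enable the spark-rapids SQL plugin on a Spark 3 session.
# Assumes the rapids-4-spark jar is on the cluster classpath and executors
# have GPUs exposed through Spark's resource scheduling.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("gpu-shuffle-sketch")
    .config("spark.plugins", "com.nvidia.spark.SQLPlugin")  # RAPIDS Accelerator entry point
    .config("spark.rapids.sql.enabled", "true")             # turn SQL acceleration on
    .config("spark.executor.resource.gpu.amount", "1")      # one GPU per executor
    .config("spark.task.resource.gpu.amount", "0.25")       # four concurrent tasks share it
    .getOrCreate()
)

# From here on, DataFrame/SQL jobs are planned by the plugin: supported
# operations run on the GPU, the rest fall back to the normal CPU path.
df = spark.range(0, 10_000_000).selectExpr("id % 1000 AS key", "id AS value")
df.groupBy("key").count().show(5)
```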

Things that don't get accelerated include multi-column UDFs and some incompatible operations. These aren't physical/logical limitations; it's just where the software is right now: https://github.com/NVIDIA/spark-rapids/issues

Multi-column UDF support would likely require some compiler-esque work in Scala (which I happen to have experience in).
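To make the UDF point concrete, here's a hypothetical two-column example (column names invented; exactly what falls back depends on the plugin version). The UDF version is a black box to the planner, while the built-in-expression version is something a plan rewriter can translate:

```python
# Hypothetical example: the same per-row logic as an opaque multi-column UDF
# versus built-in Column expressions. Only the latter is something a plan
# rewriter like spark-rapids can reason about.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import DoubleType

spark = SparkSession.builder.appName("udf-vs-builtin").getOrCreate()
df = spark.createDataFrame([(10.0, 2), (5.0, 0), (9.0, 3)], ["price", "qty"])

# 1) Multi-column Python UDF: opaque to the optimizer, so this part of the
#    plan typically stays on the CPU.
unit_price_udf = F.udf(lambda price, qty: price / qty if qty else None, DoubleType())
cpu_plan = df.withColumn("unit_price", unit_price_udf("price", "qty"))

# 2) Same logic as built-in expressions the engine understands end to end.
gpu_friendly_plan = df.withColumn(
    "unit_price",
    F.when(F.col("qty") != 0, F.col("price") / F.col("qty")),
)

cpu_plan.show()
gpu_friendly_plan.show()
```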

A few things I expect to be "very" good: joins, string aggregations (empirically), sorting (clustering). Operations which stress memory bandwidth will likely be "surprisingly" good (surprising to most people).

Otherwise, Nvidia (along with some other public companies) has published a bunch of really good-looking public data.

Outside of Spark, I think many people underestimate how "low-latency" GPUs can be. 100 microseconds and above is highly likely to be a good fit for GPU acceleration in general, though that could be as low as 10 microseconds (today).


8TB/s bandwidth on the B200 helps :-) (yes, yes, that is at the high end, but 4.8TB/s@H200, 4TB/s@H100, 2TB/s@A100 is nothing to sneeze at either).


Very true. Can't get those numbers even if you get an entire single-tenant CPU VM. Minor note, A100 40G is 1.5TB/s (and much easier to obtain).

That being said, ParaQuery mainly uses T4 and L4 GPUs with "just" ~300 GB/s bandwidth. I believe (correct me if I'm wrong) that should be roughly on par with a 64-core VM's memory bandwidth (e.g., eight channels of DDR5-4800 is about 8 × 38.4 GB/s ≈ 307 GB/s), though obviously dependent on the actual VM family.


I wish GitHub required some sort of immutability for actions by default, as most package managers do: either by requiring reusable actions to be specified via commit hash or by preventing the code for a published tag from being changed.

At the moment the convention is to only specify the tag, which is not only a security issue as we see here, but may also cause workflows to break if an action author updates the action.


You can target `some/action@commithash` already, that's up to you. You're also free to fork or clone each action you use, vet the code, and consume your fork in your workflows. You can also disable the use of third party actions at an org level, or approve them on a case-by-case basis.

This all depends on your threat model and risk tolerance; it's not so much a GitHub problem. There will always be bad code out there, especially on the largest open source code hosting platform. You defend against it because that's more realistic than trying to eradicate it.


Someone elsewhere suggested a lockfile, which seems a pretty obvious solution in hindsight. I'm fine with commit hashes, but the UX is terrible and consists of pasting the action into StepSecurity's thingie, when this is something GH should have built in.


> Someone elsewhere suggested a lockfile

Commit hashes are immutable, and your own commit history can serve as the lock file.

But if you're targeting a commit hash directly, it's already locked. Lock files are for mapping a version range to a real-life version number; they're useless if you pin the exact version of everything.


Sure, it's just that the DX of getting that commit hash isn't terrific, so one might be more inclined to trust an auto-update bot to automatically update them instead. A lock file is more like TOFU on a tag. I'd also take a UI like a "bake" button and a CLI flag that substituted the hashes automatically, but you just know people are going to build `--bake` right into their automation.
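For illustration, a rough sketch of what that bake step could look like as an external script (hypothetical tooling, not an existing GitHub or StepSecurity feature; it shells out to `git ls-remote` to resolve each tag to the commit it currently points to and rewrites the workflow in place):

```python
# Hypothetical "bake" helper: pin `uses: owner/repo@tag` references in a
# workflow file to the commit the tag currently resolves to (TOFU-style).
# Assumes `git` is on PATH; the regex handles only plain owner/repo@ref
# references, not subpath or docker:// actions.
import re
import subprocess
import sys

USES = re.compile(r"(uses:\s*)([\w.-]+/[\w.-]+)@(v?[\w.-]+)")

def resolve_tag(repo: str, tag: str) -> str:
    """Return the commit SHA a tag points to, peeling annotated tags."""
    out = subprocess.run(
        ["git", "ls-remote", f"https://github.com/{repo}",
         f"refs/tags/{tag}", f"refs/tags/{tag}^{{}}"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    if not out:
        raise SystemExit(f"could not resolve {repo}@{tag}")
    # The peeled ("^{}") line, if present, is the commit behind an annotated tag.
    peeled = [line for line in out if line.endswith("^{}")]
    return (peeled or out)[0].split()[0]

def bake(path: str) -> None:
    text = open(path).read()

    def pin(m: re.Match) -> str:
        sha = resolve_tag(m.group(2), m.group(3))
        # Keep the human-readable tag as a trailing comment.
        return f"{m.group(1)}{m.group(2)}@{sha}  # {m.group(3)}"

    open(path, "w").write(USES.sub(pin, text))

if __name__ == "__main__":
    bake(sys.argv[1])  # e.g. python bake.py .github/workflows/ci.yml
```

Run once and committed, that gives roughly the TOFU-on-a-tag property, with the original tag kept as a comment for humans.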

Another solution would be to implement immutable tags in git itself, but git upstream has so far been hostile to the whole concept of immutability in any part of git.


One problem with this is that actions can be composite and call arbitrary other actions, so you're only safe if the actions you use also pin the actions they depend on by commit.


You just described a supply chain and the risks that come with it, which is something every dependency management system has to deal with: RubyGems, npm, etc.

Again, it all comes down to your risk tolerance. There's a certain level of trust built into these systems.


Someone else in the thread mentioned that this is coming! https://github.com/features/preview/immutable-actions


I wouldn’t call SAM video “understanding,” though; it’s a model whose sole job is to segment frames into distinct objects, and it hasn’t demonstrated any innate understanding of the physics or logic of the videos themselves.


Author here! Happy to answer any questions.

