> By disassembly of ptxas, it is indeed hard-coded that they have logic like: strstr(kernel_name, "cutlass").
> it is likely that, this is an unstable, experimental, aggressive optimization by NVIDIA, and blindly always enabling it may produce some elusive bugs.
Often not elusive bugs, but elusive performance. GPU compilers are hard: once you've done the basics, trying to do further transforms in a mature compiler will almost always produce mixed results. Some kernels will go faster, some will go slower, and you're hoping to shift the balance without hitting any critical kernel too hard in your effort to make another one go faster.
An optimization with a universal >=0 speedup across your entire suite of tests is a really hard thing to come by. Something is always going to have a negative speedup.
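For anyone wondering what such a gate even looks like, here's a hypothetical C++ sketch. The quoted disassembly only establishes that ptxas does something strstr-like on the kernel name, so every identifier below is invented for illustration:

    // Hypothetical illustration only -- not ptxas internals.
    #include <cstring>

    struct KernelInfo {
        const char* name;                   // mangled kernel name as seen by the assembler
        bool aggressive_scheduling = false; // the experimental transform in question
    };

    // Enable the transform only when the kernel name contains "cutlass",
    // which is why renaming an unrelated kernel can flip its codegen.
    void select_passes(KernelInfo& k) {
        if (std::strstr(k.name, "cutlass") != nullptr) {
            k.aggressive_scheduling = true;
        }
    }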
My experience is with non-Nvidia GPU systems, but this feels like a familiar situation. They probably found something that has great outcomes for one set of kernels, terrible outcomes for another, and no known reliable heuristic or modeling they could use to automatically choose.
Speaking from a place of long-term frustration with Java, some compiler authors just absolutely hate exposing the ability to hint/force optimizations. Never mind that it might improve performance for N-5 and N+5 major releases, it might be meaningless or unhelpful or difficult to maintain in a release ten years from now, so it must not be exposed today.
I once exposed a "disableXYZOptimization" flag to customers so they could debug more easily without stuff getting scrambled. Paid for that gesture for the next year: signing off on release updates, writing user guide entries, bleh.
So it's better to hardcode your specific library name and deal with the same issue after people have reverse engineered it and started depending on it anyway?
The premise of removing the flag is that it's useless or a problem. If it's still causing a big speed boost somewhere then you need to figure something out, but the core scenario here is that it's obsolete.
> An optimization with a universal >=0 speedup across your entire suite of tests is a really hard thing to come by. Something is always going to have a negative speedup.
Maybe a common example of this is that people can write matrix-matrix multiplication kernels that outperform the standard implementations (the same goes for BLAS on CPU). But that's not a general matrix-matrix multiply. Is the speedup still there for sparse matrices? Larger ones? Small ones? Ones whose dimensions aren't powers of 2? Non-square ones? And so on. You can beat the official implementation in any one of these cases, but good luck doing it everywhere. In fact, you should expect to beat the official method in your one case, precisely because you don't carry the overhead of checking which optimization to use (sketched below).
It's easy to oversimplify a problem and not even realize you've done so. There are always assumptions being made, and you shouldn't let them stay invisible.
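To make the dispatch-overhead point concrete, here's a minimal CPU-side C++ sketch with made-up names (nothing from any real BLAS): the general entry point has to check which case it's in on every call, while a hand-specialized kernel is just the fast path with the check and the fallback deleted.

    // Illustrative only: a "general" matmul must dispatch; a specialized one needn't.
    #include <cstddef>

    static bool is_pow2(std::size_t x) { return x && (x & (x - 1)) == 0; }

    // Fast path: assumes square, power-of-two matrices (n >= 64) and a
    // zero-initialized C, so it can use a fixed blocking factor with no edge handling.
    static void matmul_square_pow2(const float* A, const float* B, float* C, std::size_t n) {
        constexpr std::size_t BLK = 64;
        for (std::size_t ii = 0; ii < n; ii += BLK)
            for (std::size_t kk = 0; kk < n; kk += BLK)
                for (std::size_t i = ii; i < ii + BLK; ++i)
                    for (std::size_t k = kk; k < kk + BLK; ++k)
                        for (std::size_t j = 0; j < n; ++j)
                            C[i * n + j] += A[i * n + k] * B[k * n + j];
    }

    // Fallback: handles any (m x k) * (k x n) problem, with none of the assumptions.
    static void matmul_generic(const float* A, const float* B, float* C,
                               std::size_t m, std::size_t k, std::size_t n) {
        for (std::size_t i = 0; i < m; ++i)
            for (std::size_t p = 0; p < k; ++p)
                for (std::size_t j = 0; j < n; ++j)
                    C[i * n + j] += A[i * k + p] * B[p * n + j];
    }

    // The general routine pays a per-call check to pick a path; the specialized
    // kernel above is what you get when the check and the fallback are deleted.
    void gemm(const float* A, const float* B, float* C,
              std::size_t m, std::size_t k, std::size_t n) {
        if (m == n && n == k && is_pow2(n) && n >= 64)
            matmul_square_pow2(A, B, C, n);
        else
            matmul_generic(A, B, C, m, k, n);
    }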
Thanks for the bit of context; this is not my wheelhouse at all (I'd never even heard of this project) and I couldn't make heads or tails of the title or the linked PR.