Hacker News | leeter's comments

Based on the info if you click into them, likely no. I would have expected them to be incidental materials from tunneling, but reading the description that's not the case.

[removed]

Part of the reason, I think, is that Qualcomm and Apple cut their teeth on mobile devices, and yeah wider SIMD is not at all a concern there. It's also possible they haven't even licensed SVE from Arm Holdings and don't really want to spend the money on it.

In Apple's case, they have both the GPU and the NPU to fall back on, and a more closed/controlled ecosystem that breaks backwards compatibility every few years anyway. But Qualcomm is not so lucky; Windows is far more open and far more backwards compatible. I think the bet is that there are enough users who don't need/care about that, but I would question why they would even want Windows in the first place, when macOS, ChromeOS, or even GNU/Linux are available.


A ton of vector math applications these days are high dimensional vector spaces. A good example of that for arm would I guess be something like fingerprint or face id.

Also, it doesn't just speed up vector math. Compilers these days with knowledge of these extensions can auto-vectorize your code, so it has the potential to speed up every for-loop you write.
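For instance, a minimal sketch of the kind of scalar loop a compiler can auto-vectorize (`scale_add` is a made-up example, not from any real codebase):

```cpp
#include <cassert>
#include <cstddef>

// A scalar loop like this is a textbook auto-vectorization candidate: with
// optimizations enabled (e.g. -O2/-O3) and SIMD extensions available
// (NEON/SVE on Arm, SSE/AVX on x86), the compiler can rewrite it to
// process several floats per iteration instead of one.
void scale_add(float* out, const float* a, const float* b, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = 2.0f * a[i] + b[i];
}
```

Wider vector registers mean more elements per iteration, which is why the width of the SIMD extension matters even for ordinary code.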


> A good example of that for arm would I guess be something like fingerprint or face id.

So operations that are not performance critical and are needed once or twice every hour? Are you sure you don't want to include a dedicated cluster of RTX 6090 Ti GPUs to speed them up?


I'd argue that those are actually very performance critical because if it takes 5 seconds to unlock your phone, you're going to get a new phone.

The point is taken, though, that seemingly the performance is fine as it is for these applications. My point was only that you don't need to be running state of the art LLMs to be using vector math with more than 4 dimensions.


Those are extremely performance critical operations. A lot of people use their phone many times an hour.

I believe you're thinking of the x86 hotpatching hook[1], which doesn't exist in the same form on x86-64[2] (that uses an x86-64-safe variant).

[1] https://devblogs.microsoft.com/oldnewthing/20110921-00/?p=95...

[2] https://devblogs.microsoft.com/oldnewthing/20221109-00/?p=10...


Yes, that's it. Thanks for clarifying.


Almost assuredly, given that 10.0 was released on 32-bit PPC... and was built around Carbon, not Cocoa... yeah, it's changed just a wee bit.


I remember failing an interview with the optimization team of a large fruit-trademarked computer maker because I couldn't explain why the x87 stack was a bad design. TBF they were looking for someone with a masters, not someone just graduating with a BS. But, now I know... honestly, I'm still not 100% sure what they were looking for in an answer. I assume something about register renaming, memory, and cycle efficiency.


Having given a zillion interviews, I expect that they weren't looking for the One True Answer, but were interested in seeing if you discussed plausible reasons in an informed way, as well as seeing what areas you focused on (e.g., do you discuss compiler issues or architecture issues). Saying "I dunno" is bad, especially after hints like "what about ..." and spouting complete nonsense is also bad.

(I'm just commenting on interviews in general, and this is in no way a criticism of your response.)


I think I said something about the stack efficiency. I was a kid who barely understood out-of-order execution; register renaming and the rest were well beyond me. It was also a long time ago, so recollections are fuzzy. But I do recall that they didn't prompt anything. I suspect the only reason I got the interview is that I had done some SSE programming (AVX didn't exist yet, and to give timing context, AltiVec was discussed), and they figured if I was curious enough to do that I might not be garbage.

Edit: Jogging my memory I believe they were explicit at the end of the interview they were looking for a Masters candidate. They did say I was on a good path IIRC. It wasn't a bad interview, but I was very clearly not what they were looking for.


I believe the Academy Awards and a few other things also influence this. The rules to be eligible still very much favor legacy studios IIRC. But with this, that may change? Hard to say. I know that quite a few Netflix movies have had theatrical runs at random mom-and-pop theaters in Cali so they could meet eligibility requirements for the various awards.


A current example (although not Netflix) is The Secret Agent with an award qualification run in NYC and LA before wider release.


Honestly? I expected this to be talking about the MiSTer project FPGA core[1]. That has been tuned so it's capable of running the AREA5150 demo[2] which is an insane challenge (AFAIK the timings of the v20 break that demo). Not saying this isn't cool, but it's definitely not what I was expecting.

[1] https://github.com/MiSTer-devel/PCXT_MiSTer

[2] https://www.youtube.com/watch?v=tOmcgp99fEk


I've said for years that any smart thermostat should have a bimetallic backup that controls maximum ranges and acts in the dumbest way possible. Just max temp and min temp for AC and heat. Nothing that should ever be hit... but there nonetheless.


You could just put a backup dumb thermostat in parallel with the smart one.


I'm reminded of Raymond Chen's many, many blog posts[1][2][3] (there are a lot more) on why TerminateThread is a bad idea. Not surprised at all the same is true elsewhere. I will say in my own code this is why I tend to prefer cancellable system calls that are alertable. That way the thread can wake up, check if it needs to die, and then GTFO.

[1] https://devblogs.microsoft.com/oldnewthing/20150814-00/?p=91...

[2] https://devblogs.microsoft.com/oldnewthing/20191101-00/?p=10...

[3] https://devblogs.microsoft.com/oldnewthing/20140808-00/?p=29...

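A portable C++ sketch of that pattern (`StopToken` and `worker` are illustrative names, not a real API; C++20's std::stop_token/std::jthread formalize the same idea):

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Sketch of cooperative cancellation: rather than killing the thread from
// outside with something like TerminateThread, the worker blocks on a
// waitable primitive, wakes when signalled (or on a timeout), checks
// whether it should die, and returns on its own.
struct StopToken {
    std::mutex m;
    std::condition_variable cv;
    bool stop = false;

    void request_stop() {
        { std::lock_guard<std::mutex> lk(m); stop = true; }
        cv.notify_all();  // the "alert": wake any thread blocked in wait_for
    }
};

// Returns true once a stop has been requested and observed.
bool worker(StopToken& tok) {
    std::unique_lock<std::mutex> lk(tok.m);
    while (!tok.stop) {
        // Analogous to a cancellable/alertable blocking call: wake on
        // notify or timeout, re-check the flag, and exit cleanly.
        tok.cv.wait_for(lk, std::chrono::milliseconds(100));
    }
    return true;
}
```

The thread always exits through its normal return path, so destructors run and locks are released, which is exactly what TerminateThread can't guarantee.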


One of my more annoying gotchas on Windows is that despite this advice being very reasonable-sounding, the runtime (I believe it actually happens in the kernel) essentially calls TerminateThread on all child threads before running global destructors and atexit hooks. Good luck following this advice when the kernel actively fights you when it comes time to shut down.


There is a reason that, per the C++ spec, if a std::thread is still joinable when its destructor is called, it calls std::terminate[1]. That reason is exactly this case. If the house is being torn down, it's not safe to try to save the curtains[2]. Just let the house get torn down as quickly as possible. If you wanted to save the curtains (e.g. do things on the threads before they exit), you need to do it before the end of main, and thus before global destructors start getting called.

[1] https://en.cppreference.com/w/cpp/thread/thread/~thread.html

[2] https://devblogs.microsoft.com/oldnewthing/20120105-00/?p=86...
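A minimal illustration of the join-before-return rule (`main_like` is a hypothetical stand-in for main):

```cpp
#include <cassert>
#include <thread>

// Stand-in for main(): join every std::thread before returning. If `t`
// were still joinable when its destructor ran, std::terminate would be
// called -- exactly the behavior the spec mandates.
int main_like() {
    int result = 0;
    std::thread t([&result] { result = 42; });
    t.join();  // omit this and ~thread() aborts the program
    return result;
}
```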


Global destructors and atexit are called by the C/C++ runtime, Windows has nothing to do with that. The C and C++ specs require that returning from main() has the same effect of ending the process as exit() does, meaning they can’t allow any still-running threads to continue running. Given these constraints, would you prefer the threads to keep running until after global destructors and atexit have run? That would be at least as likely to wreak havoc. No, in C/C++, you need to make sure that other threads are not running anymore before returning from main().


When you return from main(), there shouldn't be any child threads running in the first place. Join your threads and you will be fine.


I don't disagree... but there is a use case for orgs that don't allow forks. Some tools do their merging outside of GitHub and thus allow for PRs that cannot be clean from a merge perspective. Those won't trigger workflows on the pull_request event, because pull_request requires a clean merge. In those cases pull_request_target is literally the only option.

The best move would be for GitHub to have a setting allowing automation to run on PRs that don't have clean merges, off by default and intended really only for use with linters. Until that happens, though, pull_request_target is the only game in town to get around that limitation. Much to my and other SecDevOps engineers' sadness.

NOTE: with these external tools you absolutely cannot do the merge manually in GitHub unless you want to break the entire thing. It's a whole heap of not fun.


That's a fantastic use case that should be supported discretely!


Why GitHub didn't is beyond me. Something not being merge-clean doesn't mean linters shouldn't be run. I get not running deployments etc., but not even having the option is a pain.

