You still have the advantage of choosing which infrastructure to run it on. Depending on your goals, that flexibility might still matter, although I believe that for most companies, going with SOTA proprietary models is the best choice right now.
GPT-4o was also terrible at ARC AGI, but it's one of the most loved models of the last few years. Honestly, I'm a huge fan of the ARC AGI series of benchmarks, but I don't believe it corresponds directly to the qualities most people assess when using LLMs.
That's not because 4o is good at things; it's because it's pretty much the most sycophantic model, and people fall for a model incorrectly agreeing with them far more easily than for a model correctly calling them out.
It was terrible at a lot of things. It was beloved because when you said "I think I'm the reincarnation of Jesus Christ", it would tell you "You know what... I think I believe it! I genuinely think you're the kind of person that appears once every few millennia to reshape the world!"
Because ARC AGI involves de novo reasoning over a restricted and (hopefully) unpretrained territory, in 2D space. Not many people use LLMs as more than a better Wikipedia, Stack Overflow, or autocomplete...
> I don't get why every language's community doesn't just do the same thing: roll an idiomatic UI lib on top of SDL.
> I haven't worked on screen reader support, yet. Support for alternative text input is built into SDL. UI size scaling is a feature I plan on adding eventually.
Well, that's why :)
For most serious applications, accessibility isn't an afterthought; it's a requirement, and it's very hard to implement correctly.
So the solution is to build applications around less of a common base? I don't follow the logic with respect to Zed. I get what you mean if there's a first-party UI solution in your language (e.g. Swift), but in that case you don't need an open-source UI library.
The solution, if you want a production-ready GUI, is to use a GUI toolkit that already has decent accessibility support.
There aren't that many of those: .NET, AppKit/UIKit, SwiftUI, Qt, GTK, the web, wxWidgets (which is really just a wrapper over GTK/AppKit/.NET), and probably a couple of others. So you either use the native language of one of those toolkits, or you use bindings from your language to those toolkits.
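To make that concrete, here's a minimal sketch of the bindings route: Python driving Qt through PySide6 (the button text and labels are just illustrative, and this assumes PySide6 is installed):

    # Minimal PySide6 example: Qt's built-in accessibility bridge exposes
    # these strings to screen readers (UIA on Windows, AT-SPI on Linux,
    # NSAccessibility on macOS) without any extra work from the app.
    import sys
    from PySide6.QtWidgets import QApplication, QPushButton

    app = QApplication(sys.argv)
    button = QPushButton("Save")
    button.setAccessibleName("Save document")
    button.setAccessibleDescription("Writes the current document to disk")
    button.show()
    sys.exit(app.exec())

That's the appeal of reusing a mature toolkit: the accessibility plumbing comes for free instead of being reimplemented on top of SDL.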
Honestly, this is absurdly funny, but it makes me wonder whether we'll ever take Computer Science and Computer Engineering as seriously as other branches of STEM. I've been debating recently whether I should keep working in this field, after years of repeatedly seeing incompetence and complacency create disastrous effects in the real world.
Oftentimes, I wonder if the world wouldn't be a bit better without the last 10 or 15 years of computer technology.
This is really something that’s making me quite fed up with the industry. I’m looking towards embedded and firmware in the hope that the lower in the stack I go, the more people care about these kinds of things out of business necessity. But even then, I’m unsure I’ll find the rigor I’m looking for.
I’ve been thinking the same thing lately. It’s hard to tell if I’m just old and want everyone off my lawn, but I really feel like IT is a dead end lately. “Vintage” electronics are often nicer to use than modern equivalents. Like dials and buttons vs. touch screens. Most of my electronics that have LCDs feel snappy, and you sort of forget that you’re using them and just do what you were trying to do. I’m not necessarily a Luddite. I know tech _could_ theoretically be better, but it’s distressing to know that it’s apparently not possible for things to be different, for some reason. Economic, cultural? I don’t know.
How so? I tend to disagree with the general statement that this is common in the infosec world, but I'd like to understand better what you mean by that.
The impact in this case is non-existent ("Wow, they got my email").
> I'd like to understand better what you mean by that.
Recall that there was a period when every CPU side-channel attack had a dedicated (wow) website and a rock-band name assigned to it, when in reality their impact, again, was/is limited.
Honestly, hallucinated references should simply get the submitter banned from ever applying again. Anyone submitting papers or anything else with hallucinated references should be publicly shamed. The problem isn't only the LLMs hallucinating; it's lazy and immoral humans who don't bother to check the output either, wasting everyone's time and corroding public trust in science and research.
I fully agree. Not reading your own references should be grounds for banning, but that's impossible to check. Hallucinated references cannot be read, so by definition, they should get people banned.
For example, when you only need a single table from another researcher's 25-page publication, you would cite it to be thorough, but it wouldn't be so bad if you didn't read much of the rest of their text. Perhaps none at all.
Maybe the really helpful thing isn't reading every reference in detail, but actually looking every one of them up to begin with?
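A minimal sketch of that kind of automated lookup, assuming the references carry DOIs; it checks each one against the public Crossref API (the requests dependency and the sample DOI are just illustrative):

    # For each cited DOI, ask Crossref whether the work actually exists.
    # A 404 is a strong hint that the reference is hallucinated.
    import requests

    def doi_exists(doi: str) -> bool:
        resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
        return resp.status_code == 200

    dois = ["10.1038/nature14539"]  # sample DOI (a real Nature paper)
    for doi in dois:
        status = "ok" if doi_exists(doi) else "NOT FOUND, possibly hallucinated"
        print(f"{doi}: {status}")

It won't catch a real paper cited for a claim it never makes, but it makes the pure fabrications cheap to detect.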
Which proves its own point! Absolutely genius! The cost asymmetry between producing garbage and checking for it has truly become a problem in recent years, with the advent of LLMs and generative AI in general.
I feel like this means that any group where individuals compete against each other turns into an AI-vs-AI content-generation competition, with the human stuck verifying/reviewing.
Not a dig on your (very sensible) comment, but now I always do a double take when I see anyone effusively approving of someone else's ideas. AI turned me into a cynical bastard :(
They don't care. Azure's revenue is higher than GCP's, trailing only AWS's. It's Microsoft's new baby, and they love it, no matter what you want to run there. Also, they're still the 4th-largest company by market cap.
Honestly, only us nerds on Hacker News care about this kind of stuff :) (and that's why I love it here).
edit: also, the article notes that OpenAI did adopt Azure Cosmos DB for new stuff they want to shard. Still shows how far you can take PostgreSQL, though.
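For illustration, here's a minimal sketch of application-level hash sharding over several PostgreSQL instances; this is not OpenAI's actual setup, and the connection strings and table are hypothetical:

    # Route each row to a shard by hashing its key; same key, same shard.
    import hashlib
    import psycopg2

    SHARD_DSNS = [  # hypothetical DSNs, one per PostgreSQL instance
        "dbname=app_shard0 host=pg0",
        "dbname=app_shard1 host=pg1",
    ]

    def shard_for(key: str) -> str:
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return SHARD_DSNS[h % len(SHARD_DSNS)]

    def save_event(user_id: str, payload: str) -> None:
        conn = psycopg2.connect(shard_for(user_id))
        try:
            with conn, conn.cursor() as cur:  # commits on success
                cur.execute(
                    "INSERT INTO events (user_id, payload) VALUES (%s, %s)",
                    (user_id, payload),
                )
        finally:
            conn.close()

The catch, and presumably part of why you'd reach for Cosmos DB instead, is that cross-shard queries and rebalancing become your problem.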
SQLite is public-domain software and one of the best-maintained pieces of software around today. You absolutely have to be very careful before saying things like that, as they carry lots of implications. I wouldn't call it offensive _per se_, but I'd say it's in bad faith at least. I'd just remove that if I were the devs, because everything else there makes me find the project at least interesting.