> I think you should remove Claude as a contributor to your repo
I actually really appreciate it when people do not hide their use of Claude Code in their repo like that. It's usually the first thing I check on Show HN posts these days.
Ironically, most of the app is a webview. The comments just have some additional CSS styling slapped on top of the Hacker News website. So you still have an entire Hacker News site loaded at all times when reading comments anyway.
Besides the usual complaints about Electron and CEF applications, another pain point is that they work horrendously under emulation. GOG Galaxy is only available as an x86 application on Windows. I'm running Windows ARM64 in a VM on an M-series MacBook to play some games occasionally, and Galaxy is the slowest piece of software I have. Ironically, it runs worse than the games it spawns, which have a much more complex rendering pipeline (and, like Galaxy, they also run under emulation, since the binaries are x86).
Emulation is particularly slow with JITted languages, so having the entire UI written in JavaScript doesn't help at all.
I even checked their job posting hoping it would be about a ground-up rewrite for GNU/Linux, without the browser (since they are looking for a C++ developer), but it seems there are no plans to change that in the porting process. Which makes sense, it's a lot of work, but still a pity.
On a tangential note, requirements like this in the job posting also do not inspire much hope for improvements in the near future.
> Actively use and promote AI-assisted development tools to increase team efficiency and code quality
I always thought it was a bit funny that GOG GALAXY 2.0 went the web-tech route for parts of the client and still managed to get itself stuck shipping an x86-only binary on macOS anyway.
About as usual for a CEF/Electron app, last I tested ~a year ago. I.e. "impressively well considering, but still not all too great". For a while there was an issue where it was launching games in compatibility mode, but I assume that's long been fixed without needing to migrate the base app (if anyone regularly uses GOG on their Mac, feel free to chime in; I only have a work MacBook these days).
This doesn't seem like an article written with proper research or sincerity.
The claim is that there is a million times more data to feed to LLMs, citing a few articles. The articles estimate that there are 180-200 zettabytes (the number mentioned in TFA) of data total in the world, including all cloud services, all personal computers, etc. The vast majority of that data is not useful for training LLMs at all: it will be movies, games, databases. There is a massive amount of duplication in it. Only a tiny fraction will be something useful.
> Think of today’s AI like a giant blender where, once you put your data in, it gets mixed with everyone else’s, and you lose all control over it. This is why hospitals, banks, and research institutions often refuse to share their valuable data with AI companies, even when that data could advance critical AI capabilities.
This is not the reason; the reason is that this data is private. LLMs do not just learn from data; they can often reproduce it verbatim. You cannot feed in the medical or bank records of real people; that would put them at very real risk.
Not to mention that a lot of it will be well-structured, yes, but completely useless information for LLM training. You will not get any improvement in the perceived "intellect" of a model by overfitting it on terabytes of bank-transaction tables.
"This is not the reason, the reason is that this data is private. LLMs do not just learn from data, they can often reproduce it verbatim, you cannot give medical records or bank records of real people, that will put them at a very real risk."
(OP) You make great points. I think we're actually more in agreement than might be obvious. Part of the reason you need to "give" data to an LLM is because of the way LLMs are constructed... which creates the privacy risk.
The attribution-based control suggested in this article would break that constraint, enabling each data owner to control which AI predictions their data makes more intelligent (as opposed to only controlling which AI models they help train).
So to your point... this is a very rigorous privacy protection. Another way to TLDR the article is "if we get really good at privacy... there's a LOT more data out there... so let's start really caring about privacy"
Anyway... I agree with everything in your comment. Just thought I'd drop by and try to lend clarity to how the article agrees with you (sounds like there's room for improvement on how to describe attribution-based control though).
> Written in Zig for optimal performance and memory safety
Ironic to see this, as even a cursory glance over the code shows a lot of memory bugs: dangling pointers, data races, and potential double frees.
It is good that the OP is learning things, but I would caution against relying on LLMs and taking on a bigger project like this before the basics are well understood.
Personally, this is one of the reasons I dislike the LLM hype: it enables people to produce much more code, code they aren't qualified to support or even understand.
While the linked project is clearly designated as strictly for "learning purposes", the applications we get at large will be of no better quality.
The difference is that before LLMs, those without the qualifications wouldn't even approach problems like this; now they can vibe-code something that works on a lucky run but is otherwise completely inadequate.
There is no such question when using D2 either. It was only an issue with D1, which was discontinued almost 15 years ago and had been irrelevant for longer than that.
This isn't mentioned anywhere on the page, but fork is generally not a great API for this kind of thing. In a multi-threaded application, any code between the fork and exec syscalls must be async-signal-safe. Since memory is replicated in full at the time of the call, the current state of every mutex is replicated too, and if some thread was holding one at that moment, there is a risk of deadlock: a simple print! or anything that allocates memory can freeze the child. There is also the issue of user-space buffers: printing may write to a buffer that, if not flushed, is lost after the callback completes.