Hacker News | red75prime's comments

Yep, there's nothing wrong with walled gardens. They risk becoming walled museums, but that's their choice.

Moderation is needed exactly because it's not a walled garden, but an open community. We need rules to protect communities.

Humans are no longer the only entities that produce code. If you want to build a community, fine.

Generated code is not a new thing. It's the first time we are expected (by some) to treat code generators as humans, though.

Imagine if you built a bot that crawled GitHub, ran a linter, and created PRs on random repos for the changes the linter proposed. You'd be banned pretty quickly from most of them, and maybe from GitHub itself. That's the same thing, in my opinion.


Why do solar aficionados consistently forget about the night, and intermittency in general? It's not as if diurnal and seasonal storage will pop up for free and the energy grid will reorganize itself.

I can't believe I haven't seen this yet: https://blindsight.space/memories/

It never occurred to me to see if there was a film!

Mind you, even more amazing: I was on YouTube yesterday and a short film showing the first chapter of the brand-new book (published very recently) that I was reading popped up.

Now I see that there is not only that film (in the DUST series) but also a miniseries someone has made...

https://www.youtube.com/results?search_query=antimemetics+di...


> and let them run wild.

Yep, that's the most worrying part. For now, at least.

> The moment agents start sharing their embeddings

An embedding is just a model-dependent compressed representation of a context window. It's not that different from sharing compressed and encrypted text.

Sharing add-on networks (LLM adapters) that encapsulate functionality would be more worrying (for locally run models).
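
For what it's worth, a toy sketch of what "model-dependent" means here (my own example; the model names are just two common sentence-transformers checkpoints): vectors from different models have different sizes and geometries, so a shared embedding is only interpretable by the model family that produced it.

    # Illustration only (not from the thread): embeddings from different models
    # have different sizes and live in different spaces, so sharing one is like
    # sharing text in a model-specific encoding.
    from sentence_transformers import SentenceTransformer

    text = "meet at the usual place at dawn"
    a = SentenceTransformer("all-MiniLM-L6-v2").encode(text)   # 384-dim vector
    b = SentenceTransformer("all-mpnet-base-v2").encode(text)  # 768-dim vector
    print(a.shape, b.shape)  # (384,) (768,): no shared meaning without the producing model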


Previously, sharing compressed and encrypted text was always done between humans. When autonomous intelligences start doing it, it could be a different matter.

What do you think the entire issue was with the supply-chain attacks via the skills moltbook was installing? Those skills were downloading rootkits to steal crypto.

> Pro-LLM coding agents: look! a working compiler built in a few hours by an agent! this is amazing!

> Anti-LLM coding agents: it's not a working compiler, though. And it doesn't matter how few hours it took, because it doesn't work. It's useless.

Pro-LLM: Read the freaking article, it's not that long. The compiler made a mistake in an area where only two compilers are up to the task: compiling the Linux kernel.


Anthropic said they vibe-coded a C compiler that could compile the Linux kernel. That's what they said. No-one forced them to say that. They could have picked another code base.

It turns out that isn't true in all instances, as this article demonstrates. I'm not nearly expert enough to be able to decide if that error was simple, stupid, irrelevant, or whatever. I can make a call on whether it successfully compiled the Linux kernel: it did not.


I'm sorry for being excessively edgy, but "it's useless" is not a good summary of "linking errors after successfully compiling the Linux kernel for x86_64."

> Because if it’s worth your time to lie, it’s worth my time to correct it.

https://www.astralcodexten.com/p/if-its-worth-your-time-to-l...


Anti-LLM: isn’t all this intelligence supposed to give us something better than what we already have?

Me: Top 0.02%[1] human-level intelligence? Sure. But we aren't there yet.

[1] There are around 8k programming languages that are or were used in practice (that is, they were deemed better than existing ones in some respects), and there are around 50 million programmers; 8,000 / 50,000,000 ≈ 0.016%, roughly the 0.02% above. I use that ratio to estimate what fraction of people have made something objectively better than existing products.


> Read the freaking article

The freaking article omits several issues in the "compiler". My bet is that they didn't actually challenge the output of the LLM, as usually happens.

If you go to the repository, you'll find fun things, like the fact that it cannot compile a bunch of popular projects, and that it compiles others but the code doesn't pass the tests. It's a bit surprising, especially since they don't explain why those failures exist (are they missing support for some extensions? is some feature lacking?).

It gets less surprising, though, when you start to see that the compiler doesn't actually do any type checking, for example. It allows dereferences to non-pointers. It allows calling functions with the wrong number of arguments.

There's also this fantastic part of the article where they explain that the LLM got the code to a point where any change or bug fix breaks a lot of the existing tests, and that further progress is not possible.

Then there's the fact that this article points out that the kernel doesn't actually link. How did they "boot it"? It may well be that it crashed soon after boot and wasn't actually usable.

So, as usual, the problem here is that a lot of people look at LLM outputs and trust what the models say they achieved.


The purpose of this project is not to create a state-of-the-art C compiler on par with projects that represent tens of thousands of developer-years. The goal is to assess the current capabilities of a largely autonomous software-building pipeline: it's not yet limitless, but better than it was. What a shocker.

I’ve had my share of build errors while compiling the Linux kernel for custom targets, so I wouldn’t be so sure that linker errors on x86_64 can’t be fixed with changes to the build script.


> The goal is to assess the current capabilities of a largely autonomous software-building pipeline: it's not yet limitless, but better than it was. What a shocker.

Of course, but we're trying to assess the capabilities by looking at the LLM output as if it were a program written by a person. If someone told me to check out their new C compiler that can build the kernel, I'd assume that other basic things, such as not compiling incorrect programs, are already pretty much covered. But with an LLM we can't assume that. We need to really check what's happening and not trust the agent's word for it.

And the reason it's important is that we really need to check whether it's actually "better than it was" or just "doing things incorrectly for longer". Let's say your goal was writing a gcc replacement. Does this autonomous pipeline get you closer? Or does it just take you farther down the wrong path? Considering that it's full of bugs and incomplete implementations and cannot be changed without things breaking down, I'd say it seems to be the latter.


> not to burn books after you scan it

Shouldn't we blame copyright laws for that?


How would copyright law possibly compel the burning of books?

IANAL; I can only cite the court decision: "And, the digitization of the books purchased in print form by Anthropic was also a fair use but not for the same reason as applies to the training copies. Instead, it was a fair use because all Anthropic did was replace the print copies it had purchased for its central library with more convenient space-saving and searchable digital copies for its central library — without adding new copies, creating new works, or redistributing existing copies."

You don't have to burn the book, but then you can't scan it, either.

In some places there's an exception to copyright law for format shifting if you destroy the original. If you don't destroy the original, then you made a copy and that's not allowed.


And the latest large models are predominantly LMMs (large multimodal models).

Sort of, but the images, video, and audio they have available are far more limited in range and depth than the textual sources, and it also isn't clear that most LLM textual outputs actually draw much on anything learned from these other modalities. Most VLM setups work the other way around, using textual information to augment their vision capabilities, and most aren't truly multimodal anyway: they just have different backbones to handle the different modalities, or are even separate models switched between by a broader dispatch model. There are exceptions, of course, but it's still an accurate generalization today that the multimodality of these models is one-way and limited.

So right now the limitation is that an LMM is probably not trained on any images or audio that is going to be helpful for stuff outside specific tasks. E.g. I'm sure years of recorded customer service calls might make LMMs good at replacing a lot of call-centre work, but the relative absence of e.g. unedited videos of people cooking is going to mean that LLMs just fall back to mostly text when it comes to providing cooking advice (and this is why they so often fail here).

But yes, that's why the modality caveat is so important. We're still nowhere close to the ceiling for LMMs.


> Can ChatGPT access that information somehow?

Sure. Just like any other information. The system makes a prediction. If the prediction does not use sexual desires as a factor, it's more likely to be wrong. Backpropagation deals with it.
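
A toy sketch of that last point (mine, plain NumPy, nothing to do with any real model): if some latent factor helps predict the target, gradient descent ends up assigning it weight, because ignoring it leaves error on the table.

    # Toy example: a predictive latent factor gets picked up by gradient descent.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 1000
    factor = rng.normal(size=n)                    # stands in for an unobserved trait
    target = 2.0 * factor + rng.normal(scale=0.1, size=n)

    x = np.stack([np.ones(n), factor], axis=1)     # bias + the factor
    w = np.zeros(2)
    for _ in range(500):                           # gradient descent on squared error
        w -= 0.1 * (2.0 / n) * x.T @ (x @ w - target)

    print("learned weight on the factor:", w[1])   # converges to ~2.0
    print("MSE using the factor:   ", np.mean((x @ w - target) ** 2))
    print("MSE ignoring the factor:", np.var(target))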


In the end I felt more like a plumber, connecting pipes and building adapters where the pipes didn't match.

Randomness is not a problem by itself. Algorithms in BQP are probabilistic too. Different prompts might have different probabilities of successful generation, so refinement could be possible even for stochastic generation.

And provably correct one-shot program synthesis based on an unrestricted natural language prompt is obviously an oxymoron. So, it's not like we are clearly missing the target here.
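
A sketch of what I mean by refinement over stochastic generation (illustrative only; `generate` and `verifier` are stand-ins): as with probabilistic algorithms, if each attempt succeeds with some probability p and success is checkable, retries drive the failure probability down geometrically.

    # Sketch: amplify a stochastic generator with a checker, BQP-style.
    import random

    def generate(prompt):
        # Stand-in for an LLM call returning candidate code.
        return f"candidate for {prompt!r} (seed {random.random():.3f})"

    def verifier(candidate):
        # Stand-in for tests / type checks; passes ~30% of the time here.
        return random.random() < 0.3

    def synthesize(prompt, max_attempts=20):
        for _ in range(max_attempts):
            candidate = generate(prompt)
            if verifier(candidate):
                return candidate            # accepted by the checks
            prompt += " (refined)"          # optionally refine and retry
        return None                         # P(failure) ~ (1 - p) ** max_attempts

    print(synthesize("parse a CSV file"))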


> Different prompts might have different probabilities of successful generation, so refinement could be possible even for stochastic generation.

Yes, but that requires a formal specification of what counts as "success".

In my view, LLM based programming has to become more structured. There has to be a clear distinction between the human written specification and the LLM generated code.

If LLMs are a high level programming language, it has to be clear what the source code is and what the object code is.
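
As a concrete, purely illustrative version of that distinction: the human-written artifact could be an executable specification, e.g. property-based tests, and the LLM output is only kept if it passes. `llm_generated_sort` below is a hypothetical stand-in for generated code.

    # The "source": a human-written spec as Hypothesis property tests.
    # The "object code": whatever implementation the LLM produced.
    from collections import Counter
    from hypothesis import given, strategies as st

    def llm_generated_sort(xs):                  # pretend this body was generated
        return sorted(xs)

    @given(st.lists(st.integers()))
    def test_spec_sort(xs):
        out = llm_generated_sort(xs)
        assert all(a <= b for a, b in zip(out, out[1:]))  # non-decreasing
        assert Counter(out) == Counter(xs)                # permutation of the input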


I don't think framing LLMs as a "new programming language" is correct. I was addressing the point about randomness.

A natural-language specification is not source code. In most cases it's an underspecified draft that needs refinement.

