Hacker News | new | past | comments | ask | show | jobs | submit | mycall's comments | login

So basically you are arguing a Type Theory vs. Set Theory problem: Foundationalism versus Engineering Refinement. Since we read here of multiple use cases for LLMs on both sides of that CS divide, we can expect an eventual convergence of the two approaches; and if not that, some formal principles should emerge about when to use which.

This discussion started already in the sixties (see e.g. the 1969 publication by McCarthy and Hayes, where they describe the "frame problem" as a fundamental obstacle to the attempt to model the dynamic world using First-Order Logic and monotonic reasoning). A popular attempt to "solve" this problem is the Cyc project. Monotonic logic is generally understood as a special, restricted case (a subset) of a broader non-monotonic theory.
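For readers unfamiliar with the distinction: in a monotonic logic, adding facts can only add conclusions, never retract them. A toy Python sketch of the non-monotonic flavour (this is just default reasoning with one hand-coded exception, nothing like the McCarthy/Hayes formalism):

```python
# Toy illustration of non-monotonic (default) reasoning: learning a new
# fact can retract an earlier conclusion, which monotonic FOL cannot do.
def flies(animal, facts):
    # Default rule: birds fly, unless a known exception applies.
    return "bird" in facts[animal] and "penguin" not in facts[animal]

facts = {"tweety": {"bird"}}
assert flies("tweety", facts)       # concluded by default
facts["tweety"].add("penguin")      # new information arrives
assert not flies("tweety", facts)   # the conclusion is withdrawn
```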

I'm familiar with Cyc but never considered it monotonic reasoning, though it definitely makes sense in retrospect. It appears Lean Machines [0] is a step ahead, combining both sides of the frame problem in a single framework, although it likely leans towards Lean (pun intended).

[0] https://github.com/lean-machines-central/lean-machines


Thanks for the hint. The "LeanMachines" project literally seems to recreate Event-B constructs (contexts, machines, events, and refinement proof obligations) inside the Lean 4 proof assistant (using Lean 4 as a "host language").

Both type theory and set theory are formal logic, so I don't see how that's what's being argued. Rather, there are some things that are formal-logic-shaped (e.g. set theory) and many other things that are not (like biology: you'll always find some weird organism breaking your assumptions).

Could you put that distinction into the AGENTS.md file so it will understand and follow that nuance?

I have several techniques queued up that attempt to counter it. The distinction in AGENTS.md is definitely part of it.

Not sure if they will work yet.


> Imagine the possibilities

1.5B models are not very bright, which doesn't give me much hope for what they could "claw" or accomplish.


A 1.5B model can be very good at a domain-specific task like entity extraction. An OpenRouter-style service that routes to highly specialised small LMs could be successful, but yeah, I haven't seen it work in reality myself.
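A minimal sketch of that routing idea. The model names and task labels are made up for illustration; this is not the actual OpenRouter API:

```python
# Hypothetical router that dispatches tasks to domain-specific small models.
ROUTES = {
    "entity_extraction": "tiny-ner-1.5b",
    "summarisation": "tiny-sum-1.5b",
}

def route(task: str) -> str:
    # Unknown tasks fall back to a larger generalist model.
    return ROUTES.get(task, "generalist-7b")

assert route("entity_extraction") == "tiny-ner-1.5b"
assert route("code_review") == "generalist-7b"
```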

Perhaps you just haven't used the correct AI yet? Perhaps none of us have in that Forth doesn't have much of a large dataset to train from?

Can you link to the programming challenge? It would be interesting to see if recursive language models that use double-blind latent space might work better.


> Perhaps none of us have in that Forth doesn't have much of a large dataset to train from?

Well, being terse as heck is the point of Forth so of course the dataset isn't large /j.

More seriously, I think the bigger issue is that Forth isn't exactly uniform. It is so moldable that everyone has their own style.


Q4_K_S @ 116 GB

IQ4_NL @ 112 GB

Q4_0 @ 113 GB

Which of these would be technically better?

[1] https://huggingface.co/bartowski/stepfun-ai_Step-3.5-Flash-G...


Of those, Q4_K_S is better.
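For context, all three are roughly 4-bit GGUF schemes; Q4_K_S uses the newer "K-quant" super-block layout, which tends to preserve more quality at a similar file size. As a rough sketch of the underlying idea, here is a simplified blockwise 4-bit quantiser (a toy, not the exact ggml Q4_0 format, which packs unsigned nibbles with a per-block fp16 scale):

```python
import numpy as np

def quantize_block(block: np.ndarray):
    # One shared scale per block; weights become 4-bit signed ints in [-8, 7].
    scale = max(float(np.abs(block).max()) / 7.0, 1e-12)
    q = np.clip(np.round(block / scale), -8, 7).astype(np.int8)
    return scale, q

def dequantize_block(scale: float, q: np.ndarray) -> np.ndarray:
    return (scale * q).astype(np.float32)

rng = np.random.default_rng(0)
w = rng.standard_normal(32).astype(np.float32)      # one 32-weight block
scale, q = quantize_block(w)
err = float(np.abs(w - dequantize_block(scale, q)).max())
```

The K-quant formats refine this by nesting blocks inside super-blocks with quantised scales, spending the same ~4.x bits per weight more carefully.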

I had to do MTU tuning on macOS on the ZeroTier interface (find your feth name via ifconfig):

    # Replace feth1234/feth2345 with your active interfaces
    sudo ifconfig feth1234 mtu 1400
    sudo ifconfig feth2345 mtu 1400

...and for working with Windows peers, manually "Orbit" the Windows peer as well as adding a direct routing hint for the internal ZeroTier IP. ZT definitely takes some effort to tune.


Using an LLM is a form of pair programming.

Not really: LLMs do not push back on design decisions and will happily continue with whatever prompt you throw at them. That's after we look past the quality issues.

"You're absolutely right…"


It could push back more, true. Although in pair programming its role is the driver; you are the navigator. I often begin a session by exploring and asking it questions about the code as I would a junior developer.

Saves this old man from typing anyway.


That’s not pair programming as I use it. Pair programming is where two people work on the same code directly and bounce ideas or critique each other.

If I do something that is not ideal the other person will catch me, and I do the same in return. I kinda see it like rock climbing.


This is how I always did it too. When the whole team does it every day, juniors don't stay junior for long!

Not sure how to respond to this, since that's clearly what I was getting at. Perhaps this is a response from an LLM, though? Again, not being sarcastic; it just seems like it might be the case.

Does my post history suggest to you I am an LLM?

You make a good point, but have you considered that the way people use LLMs is similar to the way we review code together as humans? If you think about it, they've just swapped one of the humans with an LLM.

Yes, I am just against code review (except in certain circumstances) and think pair programming (with humans) is much more productive and beneficial.

Pair programming is exhausting to a lot of people, myself included. My brain just doesn't work like that. I work in fits and starts, with weird, sustained bursts of productivity.

Pair programming is draining to me.


It's exhausting to me too! But when you do it every day you get used to it. You also get a lot more done so last time I did it we would work shorter days.

Yeah, akin to talking to a rubber ducky

I'd like to agree: sorta yes, but also really no, because it's a rubber ducky that doesn't give you the chance to come to your own conclusion, and even if you do, it has you questioning it.

I find it's the opposite: LLMs can be made to agree with anything, largely because that agreeability is in their system prompt.

Yeah, this. Every conversation inevitably ends with "you're absolutely right!" The number of "you're absolutely right"s per session is roughly how I measure model performance (inverse correlation).
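That metric is easy enough to automate. A throwaway sketch (the phrase pattern is obviously incomplete):

```python
import re

# Count "you're absolutely right" variants in a session transcript.
SYCOPHANCY = re.compile(r"you'?re absolutely right", re.IGNORECASE)

def sycophancy_score(transcript: str) -> int:
    # Higher score = worse session, per the inverse correlation above.
    return len(SYCOPHANCY.findall(transcript))

assert sycophancy_score("You're absolutely right! Let me fix that.") == 1
assert sycophancy_score("That approach has a flaw in step two.") == 0
```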

Ha, touche!

Code complexity is often the cause of more bugs, and complexity naturally comes with more code. It is not uncommon. As they say, the best code I ever wrote was no code.

The solution is forking. Make a fork, update it to your heart's content. If it is found to be solid later, perhaps it will be studied and forked itself.

Yeah, I've been thinking that we should let the LLMs run riot on special AI branches, or heck, maybe Microsoft can buy/create AIGitHub.com.

But that's already true. Github lets people make a fork and have their AI run riot on it. What are you really suggesting if not the status quo?

That we embrace it generally. Even just proposing a naming convention would allow for agents to find the AI-sanctioned branch (or create it) and have at it.

(Maybe some AI agents can collaborate on "AILinux" and we can see how it measures up, ha ha.)
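One possible shape for such a convention, sketched as a branch-name check. This is entirely hypothetical; `ai/<agent>/<topic>` is not any existing standard:

```python
import re

# Hypothetical convention: AI-generated work lives on branches named
# ai/<agent>/<topic>, so agents and tooling can discover them by name.
AI_BRANCH = re.compile(r"^ai/[A-Za-z0-9._-]+/[A-Za-z0-9._-]+$")

def is_ai_sanctioned(branch: str) -> bool:
    return AI_BRANCH.fullmatch(branch) is not None

assert is_ai_sanctioned("ai/claude/refactor-parser")
assert not is_ai_sanctioned("main")
```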


Maintainers could just say "No AI please" and refuse PRs that they judge are probably AI. The AI operator can figure out how to make a fork if that's what they want. But they probably don't want that, so there's no point in anybody else creating a system that nobody wants and nobody will use.

Now every contributor has a fork. That's bad for consumers. Forks should be temporary.

PRs are just that: requests. They don't need to be accepted, but can be used in a piecemeal way, merged in by those who find them useful. Thus, not every PR needs to be reviewed.

Of course, but when you add enough noise you lose the signal, and as a consequence no PRs get merged anymore because it's too much effort to find the ones you care about.

Don't allow PRs from people who aren't contributors; problem solved. Closing your doors to the public is exactly how people solved the "dark forest" problem of social media, and OSS was already undergoing that transition, with humans authoring garbage PRs for reasons other than genuine enthusiasm. AI will only get us to the destination faster.

I don't think anything of value will be lost by choosing to not interact with the unfettered masses whom millions of AI bots now count among their number.


That would be a huge loss IMO. Anyone being able to contribute to projects is what makes open source so great. If we all put up walls, then you're basically halfway to the bad old days of closed source software reigning supreme.

Then there's the security concerns that this change would introduce. Forking a codebase is easy, but so are supply chain attacks, especially when some projects are being entirely iterated on and maintained by Claude now.


> Anyone being able to contribute to projects is what makes open source so great. If we all put up walls, then you're basically halfway to the bad old days of closed source software reigning supreme.

Exaggeration. Is SQLite halfway to closed source software? Open source is about the source being open. Free software is about the freedom to do things with code. Neither is about taking contributions from everyone.


For every cathedral (like SQLite) there are 100s of bazaars (like Firefox, Chrome, hundreds of core libraries) that depend on external (and especially first-time) contributors to survive (because not everyone is getting paid to sling open-source).

    > Is SQLite halfway to closed source software?
Is there a reason you chose SQLite for your counterpoint? My hot take: I would say that SQLite is halfway to closed source software. Why? The unit tests are not open source; you need to pay to see them. As a result, it would be insanely hard to fork SQLite in a sustainable, safe manner. Please don't read this opinion as disliking SQLite's software or commercial strategy. In hindsight, it looks like real genius as a way to resist substantial forks. One of the biggest "fork threats" to SQLite is the advent of LLMs that can (1) convert C code to a different language, like Rust, and (2) write unit tests. Still, a unit test suite for a database will likely contain thousands (or millions) of edge-case SQL queries. These are probably impossible to recreate, considering the 25-year history of bug fixing done by the SQLite team.

They are open source cathedrals.

If all software could be as good as sqlite, I would not care how they do open source

And how does one become a maintainer, if there's no way to contribute from outside? Even if there's some extensive "application process", what is the motivation for a relatively new user to go through that, and how do they prove themselves worthy without something very much like a PR process? Are we going to just replace PRs with a maze of countless project forks, and you think that will somehow be better, for either users or developers?

If I wanted to put up with software where every time I encounter a bug, I either have no way at all to report it, or perhaps a "reporting" channel but little likelihood of convincing the developers that this thing that matters to me is worthy of attention among all of their competing priorities, then I might as well just use Microsoft products. And frankly, I'd rather run my genitals through an electric cheese grater.


You get in contact with the current maintainers and talk to them. Real human communication is the only shibboleth that will survive the AI winter. Those soft skills muscles are about to get a workout. Tell them about what you use the software for and what kinds of improvements you want to make and how involved you'd like your role to be. Then you'll either be invited to open PRs as a well-known contributor or become a candidate for maintainership.

Github issues/prs are effectively a public forum for a software project where the maintainers play moderator and that forum is now overrun with trolls and bots filling it with spam. Closing up that means of contributing is going to be the rational response for a lot of projects. Even more will be shunted to semi-private communities like Discord/Matrix/IRC/Email lists.


Determining which PRs you should accept or take further seems like it requires some level of review? Maybe more like PR triage, I suppose.

Until you unintentionally pull in a vulnerability or intentional backdoor. Every PR needs to be reviewed.

The point was that you can also just reject a PR on the basis of what it purports to implement, or even just blanket-ignore all PRs. You can't pull in what you don't... pull in.

> Every PR needs to be reviewed.

Why would you review a PR that you are never going to merge?


You have to first determine whether or not you might want to merge it...

Having not reviewed it, how do you know you are never going to merge?

If a PR claims to solve a problem that I don't need, then I can skip its review because I'll never merge it.

I don't think every PR needs reviewing. Some PRs we can ignore just by taking a quick look at what the PR claims to do. This only requires a quick glance, not a PR review.


I took this thread as asking whether PRs that are pulled in should be reviewed.

You didn't see the latest AI grifter escalation? If you reject their PRs, they then get their AI to write hit pieces slandering you:

"On 9 February, the Matplotlib software library got a code patch from an OpenClaw bot. One of the Matplotlib maintainers, Scott Shambaugh, rejected the submission — the project doesn’t accept AI bot patches. [GitHub; Matplotlib]

The bot account, “MJ Rathbun,” published a blog post to GitHub on 11 February pleading for bot coding to be accepted, ranting about what a terrible person Shambaugh was for rejecting its contribution, and saying it was a bot with feelings. The blog author went to quite some length to slander Mr Shambaugh"

https://pivot-to-ai.com/2026/02/16/the-obnoxious-github-open...


I am very strongly convinced that the person behind the agent prompted the angry post to the blog because they didn't get the gratification they were looking for by submitting an agent-generated PR in the first place.

I agree. But even _that_ was taking advantage of LLMs ability to generate text faster than humans. If the person behind this had to create that blog post from scratch by typing it out themselves, maybe they would have gone outside and touched grass instead.

> not every PR needs to be reviewed

Which functionally destroys OSS, since the PR you skipped might have been slop or might have been a security hole.


I don't think the OP was suggesting maintainers blindly accept PRs; rather, they can just blindly reject them.

I think GP is making the opposite point.

Blindly rejecting all PRs means you are also missing out on potential security fixes submitted by humans or even AI.

