
If natural language is used to specify work to the LLM, how can the output ever be trusted? You'll always need to make sure the program does what you want, rather than what you said.

>"You'll always need to make sure the program does what you want, rather than what you said."

Yes, making sure the program does what you want. Which is already part of the existing software development life cycle, just as using natural language to specify work already is: it's where things start and return to over and over throughout any project. Further, LLMs frequently understand what I want better than other developers do. Sure, lots of times they don't. But they're a lot better at it than they were six months ago, and a year ago they barely managed it at all, save for scripts of a few dozen lines.


That's exactly my point: it's a nice tool in the toolbox, but for most tasks it's not fire-and-forget. You still have to do all the same verification you'd need for human-written code.

Write a prompt so specific and so detailed that it effectively becomes a set of instructions, and you've come up with the most expensive programming language.

It's not great that it's the most expensive (by far), but it's also by far the most expressive programming language.

How is it more expressive? What is more expressive than Turing completeness?

This is a non sequitur. Almost all programming languages are Turing complete, but I think we'd all agree they vary in expressivity (e.g. x64 assembly vs. TypeScript).

By expressivity I mean that you can say what you mean, and the more expressive the language is, the easier that is to do.
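
As a rough illustration (the function and the "sum the even numbers" task are mine, purely for contrast), here's what that looks like in TypeScript:

    // Hypothetical example: this reads almost like the English spec
    // "sum the even numbers". The equivalent x64 assembly would take
    // dozens of instructions, yet both languages are Turing complete.
    const sumOfEvens = (nums: number[]): number =>
      nums.filter((n) => n % 2 === 0).reduce((a, b) => a + b, 0);

Expressivity is about how directly the code mirrors the intent, not about what can be computed.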

It turns out saying what you mean is quite easy in plain English! The hard part is that English allows a lot of ambiguity. So the tradeoffs of how you express things are very different.

I also want to note how remarkable it is that humans have built a machine that can effectively understand natural language.


You trust your natural language instructions a thousand times a day. If you ask for a large black coffee, you can trust that's more or less what you'll get. Occasionally you may get something so atrocious that you don't dare to drink it, but generally speaking you trust that the coffee shop knows what you want. If you insist on a specific amount of coffee brewed at a specific temperature, however, you need tools to measure.

AI tools are similar. You can trust them because they are good enough, and you need a way (testing) to make sure what is produced meets your specific requirements. Of course they may fail for you; that doesn't mean they aren't useful in other cases.

All of that is simply common sense.


To extend the analogy:

What’s to stop the barista putting sulphuric acid in your coffee? Well, mainly they don’t because they need a job and don’t want to go to prison. AIs don’t go to prison, so you’re hoping they won’t do it because you’ve promoted them well enough.


* prompted

> All of that is simply common sense.

Is that why we have legal codes spanning millions of pages?


The person I'm replying to believes that there will be a point when you no longer need to test (or review) the output of LLMs, similar to how you don't think about the generated asm/bytecode/etc of a compiler.

That's what I disagree with - everything you said is obviously true, but I don't see how it's related to the discussion.


I don't necessarily think we'll ever reach that point, and I'm pretty sure we never will for some higher-risk applications, since natural language is inherently ambiguous.

There are, however, some applications where ambiguity is fine. For example, I might have a recipe website where I tell an LLM to "add a slider for the user to scale the number of servings". There's a ton of ambiguity there, but if you don't care about the exact details, I can see a future where LLMs do something reasonable 99.9999% of the time and no one does more than glance at the output and say it looks fine.
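
For concreteness, a minimal sketch of the kind of thing I'd expect it to produce, in plain TypeScript (the element IDs and ingredient data here are made up):

    // Hypothetical sketch: scale displayed ingredient amounts from a slider.
    // Assumes <input type="range" id="servings-slider"> and <ul id="ingredient-list">.
    interface Ingredient {
      name: string;
      quantity: number; // amount for the base serving count
      unit: string;
    }

    const baseServings = 4;
    const ingredients: Ingredient[] = [
      { name: "flour", quantity: 500, unit: "g" },
      { name: "milk", quantity: 250, unit: "ml" },
    ];

    const slider = document.querySelector<HTMLInputElement>("#servings-slider")!;
    const list = document.querySelector<HTMLUListElement>("#ingredient-list")!;

    // Re-render the ingredient list, scaled to the chosen serving count.
    function render(servings: number): void {
      const factor = servings / baseServings;
      list.innerHTML = ingredients
        .map((i) => `<li>${(i.quantity * factor).toFixed(1)} ${i.unit} ${i.name}</li>`)
        .join("");
    }

    slider.addEventListener("input", () => render(Number(slider.value)));
    render(baseServings);

Whether it debounces, clamps the range, or rounds quantities differently is exactly the kind of detail I wouldn't care about here.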

How long it will take to reach that point, and whether we ever do, is of course still up for debate, but I don't think it's completely unrealistic.


That's true, and I more or less already use it that way for things like one-off scripts, mock APIs, etc.

I don't think the argument is that AI isn't useful. I think the argument is that it is qualitatively different from a compiler.

I did not find this to be the case, except with a few low-quality vendors we ended up dropping.

It was mostly the same as anywhere else: you go talk to them in person, tour their facilities and processes, and see what else they've built.

I was warned strongly about IP theft and cost cutting, but didn't find that expectation quite met reality. It may have been that our products were mostly un-copyable and we specified everything precisely, or we were just lucky.


I did this for a while; it's pretty good, but I occasionally came across dependencies that were difficult to install in containers, along with other minor inconveniences.

I ended up getting a mini PC solely dedicated to running agents in dangerous mode; it's refreshing not to have to think too much about sandboxing.


I totally agree with you. Running a cheapo Mac mini with full permissions, fully tracked code, and no other files of importance is so liberating. Pair that with Tailscale and I can ssh in or take screen control at any time, as well as access my dev deployments remotely. :chefs kiss:


Why a Mac mini rather than a cloud VPS?


I use a new Ryzen-based mini PC instead of a Mac mini, but the reasoning is the same. For the amount of compute/memory, it pays for itself in less than a year, and the lower latency for ssh/dev servers is nice too.


One less company to give your code to.


Regular WD-40 should not be used as a bearing lubricant!


Exactly!


nit: the last letter of PCB stands for "board"


Do you do this in a professional setting? I'm curious because I did some embedded/uC work about 10 years ago, and considering the state of C/C++ SDKs (and IDE support) at the time, I would have expected it to take decades for Rust to get a foothold.


Yep; I do it at work (security-related sensors for a DoD contractor) as well as for my own small business. It doesn't have a foothold, and may not ever; we will see. I think a lot of the embedded Rust content you see online is from makers who are more interested in doing tricks with the ownership system and async. So, I am an exception, but... I do recommend this workflow despite its lack of popularity!

I just like Rust for the overall language and tooling (for example, the workflow I described above); I don't really care about the memory safety aspect to the degree it's often presented.

The biggest downside is I have to do a lot of legwork that wouldn't be required in C or C++, e.g. implementing hardware interfaces from datasheets and reference manuals. Sometimes there will be a Rust lib available, but in my experience they are rarely in a usable state.


As a hobbyist who has written, and is still working on, a couple of async HALs, my take is that Rust is well suited to embedded work, but yeah, there are hurdles. It's immature, so while things like Embassy are a joy to work with, they're missing a lot of (sometimes seemingly basic) features.


The mainland and Hong Kong still have significantly different visa policies. I'm not sure if it's changed at all since the handover, except for mainlanders entering HK.


Yes, I know, but HK is still part of the PRC, and there are people who cannot travel to the PRC.

It's not about (in)ability to obtain a visa.


If I were letting some random person rent one of my servers without oversight, I'd sure want to see some ID first.


I don't think it's naive; I often do the same in Vue. A pretty useful subset of vue-router can be implemented in less than a tenth of the bundle size.
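
To sketch what I mean, here's a hash-based stand-in for a small slice of vue-router in TypeScript; the names (createTinyRouter, currentPath) are made up for illustration:

    import { computed, defineComponent, h, ref, type Component } from "vue";

    // Hypothetical sketch: a reactive current path driven by the URL hash.
    const currentPath = ref(window.location.hash.slice(1) || "/");

    window.addEventListener("hashchange", () => {
      currentPath.value = window.location.hash.slice(1) || "/";
    });

    // routes maps a path to the component to render, e.g. { "/": Home }.
    export function createTinyRouter(routes: Record<string, Component>) {
      const current = computed(() => routes[currentPath.value] ?? routes["/"]);
      // A stand-in for <RouterView/>: re-renders when the hash changes.
      return defineComponent({
        render: () => h(current.value),
      });
    }

Navigation then happens through ordinary <a href="#/about"> anchors, which covers most of what I need on small sites.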


I don't think 1, 2, and 4 apply to many high profile open source projects.

