
It's not about the terminal, but about decoupling yourself from looking at the code. The Claude app lets you interact with a GitHub repo from your phone.


This is not the way

these agents are not up to the task of writing production level code at any meaningful scale

Looking forward to high-paying gigs to go in and clean up after people take them too far and the hype cycle fades.

---

I recommend the opposite: work on custom agents so you have a better understanding of how these things work and fail. Get deep in the code to understand how context and values flow and get presented within the system.


> these agents are not up to the task of writing production level code at any meaningful scale

This is obviously not true, starting with the AI companies themselves.

It's like the old saying "half of all advertising doesn't work; we just don't know which half that is." Some organizations are having great results, while some are not. On multiple dev podcasts I've listened to, AI skeptics have had a lightbulb moment where they get that AI is where everything is headed.


Not a skeptic. I use AI for coding daily and am working on a custom agent setup because, in my experience over more than a year, they are not up to hard tasks.

I thought this was well known, as even the people who build the AIs we use talk about this and acknowledge their limitations.


I'm pretty sure at this point more than half of Anthropic's new production code is LLM-written. That seems incompatible with "these agents are not up to the task of writing production level code at any meaningful scale".


How are you pretty sure? What are you basing that on?

If true, could this explain why Anthropic's APIs are less reliable than Gemini's? (I've never gotten a service overloaded response from Google like I did from Anthropic.)


Quoting a month old post: https://www.lesswrong.com/posts/prSnGGAgfWtZexYLp/is-90-of-c...

  My current understanding (based on this text and other sources) is:
  - There exist some teams at Anthropic where around 90% of lines of code that get merged are written by AI, but this is a minority of teams.
  - The average over all of Anthropic for lines of merged code written by AI is much less than 90%, more like 50%.

> I've never gotten a service overloaded response from Google like I did from Anthropic

They're Google, they out-scale everyone. They run more than 1.3 quadrillion tokens per month through LLMs!


You cannot clean up the code; it is too verbose. That said, you can produce production-ready code with AI, you just need to put up very strong boundaries and not let it get too creative.

Also, the quality of production-ready code is often highly exaggerated.


I have AI-generated, production-quality code running, but it was isolated: not at scale, and not broad in view (spanning many files or systems).

What I mean more is that as soon as the task becomes even moderately sized, these things fail hard.


> these agents are not up to the task of writing production level code at any meaningful scale

I think the new one is. I could be the fool and be proven wrong though.


It's marginally better, nowhere close to game-changing, which I agree will require moving beyond transformers to something we don't know yet.


Interesting. Tell me more.


https://apps.apple.com/us/app/claude-by-anthropic/id64737536...

Has a section for code. You link it to your GitHub, and it will generate code for you when you get on the bus so there's stuff for you to review after you get to the office.


Thanks. Still looking for some kind of total code by phone thing.


The app version is iPhone only; you don’t get Code in the Android app, so you have to use a web browser.

I use it every day. I’ll write the spec in conversation with the chatbot, refining ideas, saying “is it possible to …?” Then I get it to create detailed planning and spec documents (and a summary document about the documents). I upload them to GitHub and then tell Code to make the project.

I have never written any Rust and am not an evangelist, but Code says it finds the error messages super helpful, so I get it to one-shot projects in Rust.

I do all this in the evenings while watching TV with my gf.

It amuses me that we have people even in this thread claiming that what it already does is something it can’t do: write working code that does what it is supposed to.

I get to spend my time thinking of what to create instead of the minutiae of “ok, I just need 100 more methods, keep going”. And I’ve been coding since the 1980s, so don’t think I’m just here for the vibes.



Can you run the apps without going through Apple? Do you need a developer account?



