
One thing I think people get confused about with context: they see that an LLM has, say, a 400k context window, figure their codebase is far bigger than that, and wonder how it can possibly work. Well, do you hold a 10 million line codebase in your head at once? Of course not. You have an intuitive grasp of how the system is built and laid out, plus some general names of things, and before you make a change you might search the codebase for specific terms to see what shows up. LLMs do the same thing. They grep through the codebase and read in only the files with interesting or matching terms, and only the part of each file that's relevant, in much the same way you would open a search result and only view the surrounding method or so. The context is barely used in these scenarios. Context is not static; it's built dynamically as the conversation progresses, from data coming from your system (partly through tool use).
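To make the "grep, then read only the surrounding lines" idea concrete, here is a rough sketch. It is not how any particular agent is implemented; `search_snippets`, its parameters, and the default file suffixes are hypothetical names picked for illustration. It assumes a plain substring search is good enough.

```python
from pathlib import Path


def search_snippets(root, term, context_lines=3, suffixes=(".py", ".cs", ".ts")):
    """Return (path, first_line_number, snippet) for each line containing `term`.

    Hypothetical helper: only the lines around each match are returned,
    so whole files never need to go into the model's context.
    """
    results = []
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and file types we are not interested in.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            lines = path.read_text(encoding="utf-8", errors="ignore").splitlines()
        except OSError:
            continue  # unreadable file, move on
        for i, line in enumerate(lines):
            if term in line:
                start = max(0, i - context_lines)
                end = min(len(lines), i + context_lines + 1)
                # Line numbers are reported 1-based, like grep -n.
                results.append((str(path), start + 1, "\n".join(lines[start:end])))
    return results
```

Something like `search_snippets("path/to/workspace", "charge_customer")` would hand back a handful of short snippets rather than entire files, which is why so little of the context window gets used.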

I frequently use LLMs in a VS Code workspace with around 40 repos, consisting of microservices, frontends, NuGet and npm packages, IaC, etc. Altogether it's many millions of lines of code, and I can ask it questions about anything in the codebase and it has no issues managing context. I don't even add files manually to context (that's actually worse, because it puts the entire file into context even if not all of it is used). I just refer to the files by name and the LLM is smart enough to read them in as appropriate. I have a couple of JSON files that are megabytes of configuration, and I can tell it to summarize or extract examples from those files and it'll just sample sections to get an overview.
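For the "sample sections to get an overview" part, a minimal sketch of what that could look like is below. This is an assumption about the general approach, not a description of what VS Code or any specific model does; `sample_file`, `num_samples`, and `chunk_bytes` are made-up names. It reads a few evenly spaced byte ranges instead of the whole file.

```python
import os


def sample_file(path, num_samples=3, chunk_bytes=2048):
    """Return a list of (byte_offset, text) chunks spread across the file.

    Hypothetical helper: small files are returned whole; large files are
    sampled at evenly spaced offsets from the start through the end.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        if size <= num_samples * chunk_bytes:
            # Small enough that sampling would not save anything.
            return [(0, f.read().decode("utf-8", errors="replace"))]
        samples = []
        last_start = size - chunk_bytes  # offset of the final chunk
        steps = max(num_samples - 1, 1)  # avoid dividing by zero for one sample
        for k in range(num_samples):
            offset = last_start * k // steps
            f.seek(offset)
            chunk = f.read(chunk_bytes)
            # A chunk may begin or end mid-token; that is fine for an overview.
            samples.append((offset, chunk.decode("utf-8", errors="replace")))
        return samples
```

With the defaults, a multi-megabyte config file costs about 6 KB of context: one chunk from the start, one from the middle, one from the end.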



Yes, I do have a map in my head of any codebase I work on. I know where most of the files for the main code paths are, and if you describe the symptoms of a bug I can often tell you the method, or even the line, that's probably causing it if it's a 'hot' path.

Isn't that what we mean by 'learning' a codebase? I know my ability is supercharged compared to most devs, but most colleagues have it to some extent, and I've met some devs with an even more impressive ability for it than me, so it's not like I'm a magic unicorn. Ironically, I have a terrible memory for a lot of other things, especially 'facts'.

You can sorta make a crappy version of that for AI agents with agent files and skills.


There’s a company called driver.ai whose idea is to parse your codebase and provide the “map” (navigation of code structure and connectivity) to LLMs. (I haven’t tried it.)


> You have an intuitive grasp of how the system is built and laid out,

Because they are human. Intuition is a human trait, not an LLM code grinder trait.



