
I took a look at a project I maintain[0], and wow. It's so wrong in every section I saw. The generated diagrams make no sense. The text sections take implementation details that don't matter and present them to the user like they need to know them. It's also outdated.

I hope actual users never see this. I dread the thought of having to go around to various LLM-generated sites, correcting documentation I never approved, just to stop it from confusing users who are tricked into reading it.

[0]: https://deepwiki.com/blopker/codebook



I just tried it on several of my repos and I was rather impressed.

This is another one of those bizarre situations that keeps happening in AI coding related matters where people can look at the same thing and reach diametrically opposed conclusions. It's very peculiar and I've never experienced anything like it in my career until recently.


> at the same thing

But you’re not looking at the same thing — you’re looking at two completely different sets of output.

Perhaps their project uses a more obscure language, has a more complex architecture, or resembles another project in a way that trips up the interpretation. You can have excellent results without it being perfect for everything. Nothing is perfect, and it's important for the people making these things to know how it falls short, right?

In my career I’ve never seen such aggressive dismissal of people’s negative experiences without even knowing if their use case is significantly different.


Which repos worked well? I've had the same experience as the OP: unhelpful diagrams and bad information hierarchy. But I'm curious to see examples of where it's produced good output!


> people can look at the same thing and reach diametrically opposed conclusions. It's very peculiar and I've never experienced anything like it in my career until recently

React vs other frameworks (or no framework). Object oriented vs functional. There's loads of examples of this that predate AI.


I don't think it's quite the same. The cases you mention are more like two alternative but roughly functionally equivalent things. People still argue and use both, but the argument is different. Even if people don't explicitly acknowledge it, at some level they understand it's a difference in taste.

This feels to me more like the horses vs cars thing, computers vs... something (no computers?), crypto vs "dollar-pegged" money, etc. It's deeper. I'm not saying the AI people are the "car" people, just that...there will be one opinion that will exist in 5-20 years, and the other will be gone. Which one... we'll see.


> People still argue and use both, but the argument is different

React vs no framework is at least in the same ballpark as AI vs no AI. Some people are determined to prove to the world that React/AI/functional programming solves everything. Some people are determined to prove the opposite. Most people just quietly use them without feeling like they need to prove anything.


This is such an apples to oranges comparison that it makes me suspicious of your motives here.

Bad documentation full of obvious errors and nonsense is very different to having an opinion on OO vs Functional programming.

Even that sentence sounds insane because who would ever compare the two?!


You could link your docs so we can compare them to OP's docs.

No need to guess.



I have a fairly large code base that has been developed over a decade that deepwiki has indexed. The results are mixed but how they are mixed gives me some insight into deepwiki's usefulness.

The code base has a lot of documentation in the form of many individual text files. Each describes some isolated aspect of the code in dense, info-rich detail that isn't entirely easy for humans to consume. As numerous as these docs are, the code has many more aspects that lack explicit documentation. And there is a general lack of high-level documentation that ties each isolated doc into some cohesive whole.

I formed a few conclusions about the deepwiki-generated content: First, it is really good where it regurgitates information from the code docs, while being rather bad or simply missing for aspects not covered by those docs. Second, deepwiki is so-so at providing a higher layer of documentation that sort of ties things together. Third, it is heavily biased about the importance of various aspects, judging them by how much coverage they get in the code docs.

The lessons I take from this are: deepwiki does better ingesting narrative than code. I can spend less effort on polishing individual documentation (not worrying about how easy it is for humans to absorb). I should instead spend that effort filling in gaps, both adding details and providing higher-level layers of narrative to unify the detailed documentation. I don't need to spend effort on making that unification explicit via sectioning, linking, ordering, etc., as one might expect for a "manual" with a table of contents.

In short, I can interpret deepwiki's failings as identifying gaps that need filling by humans while leaning on deepwiki (or similar) to provide polish and some gap putty.


If you document the why rather than the how, you often end up tying high-level concepts together.

E.g. if you describe how the user service works, you won't necessarily capture where it is used.

If you document why the user service exists, you will often mention who or what needs it to exist, the thing that gives it a purpose. Do this throughout and everything ends up tied together at a higher level.


> The text sections take implementation details that don't matter and present them to the user like they need to know them. It's also outdated.

The point of the wiki is to help people learn the codebase so they can possibly contribute to the project; it isn't for end users. It absolutely should explain implementation details. I do agree that it goes overboard with the diagrams. I'm curious: I've seen owners of other moderately sized repos rave about how well DeepWiki explained implementation details. What specifically was it getting wrong about your code in your case? Is it just that it's outdated?


I dunno, it seems to be real excited about a VS Code extension that doesn't exist and isn't mentioned in the actual documentation. There's just too many factual errors to list.


>I dunno, it seems to be real excited about a VS Code extension that doesn't exist and isn't mentioned in the actual documentation. There's just too many factual errors to list.

There is a folder for a VS Code extension here[0]. It seems to have a README with installation instructions. There is also an extension.ts file, which seems to me to be at least the initial prototype for the extension. Did you forget that you started implementing this?

[0] https://github.com/blopker/codebook/blob/c141f349a10ba170424...


This thread should end up in the hall of fame, right next to the Dropbox one.

From a fellow LLM-powered app builder, I wish you best of luck!


Yeah, this is a thread worth saving. Even just as an example of multiple people who can't read as well as an LLM.


Plot twist: OP has a doc mentioning it as unreleased.


In that folder is CHANGELOG.md[0] that indicates that this is unreleased. I'd say that including installation instructions for an unreleased version of the extension is exactly the issue that is being flagged.

[0] https://github.com/blopker/codebook/blob/main/vscode-extensi...


You are going to want to reread the file you are quoting, buddy. That changelog indicates that the extension has been released. The Unreleased section seems to list features that are not yet included in the released version of the VS Code extension, and the future plans are features that have not been developed yet.


Here the maintainer says it doesn't exist. There's basically no way another interpretation is "more correct". Files can be present yet not intended for use: deprecated, internal, WIP, etc. This is why we need maintainers.


Maintainers are not gods, and don't get to rewrite plainly true facts. The changelog actually says "Initial release of Codebook VS Code extension".


Compared to an LLM, they are an authoritative source...


What a plot twist


It’s funny, I accidentally put a link to the commit instead of the current repo file because I was investigating whether he committed it himself or had recently taken over the project without realizing the previous owner had started one. But he is the one who actually committed the code. I guess LLMs are so good now that they’re stopping developers from hallucinating about code they themselves wrote.


I brought up this issue because I thought it illustrated my previous points nicely.

Yes, there is a VS Code folder in that repo. However, it doesn't exist as an actual extension. It's an experiment that does not remotely work.

The LLM-generated docs have confidently decided that not only does it exist, but that it is the primary installation method.

This is wrong.

Edit: I've now had to go into the README of this extension to add a note explicitly telling LLMs not to recommend it to users. I hate this.


Is it possible that a random person who discovered your repo from Google search would make the same mistake the LLM did and assume it works and not realize it was an unfinished experiment?


Yes, and so the value of that person's opinions on the repo is low. Far lower than real documentation written by someone who knows more and would not have made that mistake.

The value proposition here is that these LLM docs would be useful; in this case, however, they were not.


>Far lower than real documentation written by someone who knows more and would not have made that mistake.

But his own documentation did say that there was a VS Code extension, with installation instructions, a README, changelog, etc. From what he said, it doesn't even compile or remotely work. It would be extremely aggravating to attempt to build the project with the maintainer's own documentation, spend an hour trying to figure out what's wrong, and then contact the maintainer only for him to say, "oh yeah, that documentation isn't correct, that doesn't even compile even though I said it did 2 months ago lol." It is extremely ironic that he is so gung-ho about DeepWiki getting this wrong.


Yes, this is my point. It seems like the creator was a little bit lazy to create such a full-fledged README.md with so much polish but entirely neglect to mention that the whole thing is broken and unfinished.

That seems about as annoying as a random wiki mis-explaining your system.

That being said, I am still biased towards empathizing with the library author since contributing to open source should be seen as being a great service already in and of itself, and I'd default to avoiding casting blame at an author for not doing things "perfectly" or whatever when they are already doing volunteer work/sharing code they could just keep private.


This.

The WIP code was committed with the expectation that very few people would see it, because it was not linked anywhere in the main readme. It was a calculated risk, taken so that the code wouldn't get out of date with main. The risk changed when their LLM (wrongly) decided to elevate it to users before it was ready.

It's clear DeepWiki is just a sales funnel for Devin, so all of this is being done in bad faith anyway. I don't expect them to care much.


>That being said, I am still biased towards empathizing with the library author since contributing to open source should be seen as being a great service already in and of itself, and I'd default to avoiding casting blame at an author for not doing things "perfectly" or whatever when they are already doing volunteer work/sharing code they could just keep private

This is true, and I brought this up more because of his dismissive view of DeepWiki than as a criticism of the project itself or of the author as a programmer. LLMs hallucinate all the time, but there is usually a method to the way they do so. In particular, for it to just say a repo had a VS Code extension portion with nothing pointing to it would not be typical at all for an LLM tool like DeepWiki.


Wow. A better advertisement for LLMs in three comments than anything OpenAI could come up with.


It might be internal, unfinished, a prototype, in testing and not yet for public use. It might exist but do something else.

This is not an ad for LLMs. If you think this is good, you should probably not ever touch code that humans interact with.


I fear the consequences will be even darker:

- Users are confused by autogenerated docs and don’t even want to try using a project because of it

- Real curated project documentation is no longer corrected by user feedback (because users never reach it)

- LLMs are trained on wrong autogenerated documentation: a downward spiral for hallucinations! (Maybe this one could then force users to go look for the official docs? But not sure at this point…)


> LLMs are trained on wrong autogenerated documentation: a downward spiral for hallucinations! (Maybe this one could then force users to go look for the official docs? But not sure at this point…)

On this, I think we should have some kind of AI-generated meta tag, like this: https://github.com/whatwg/html/issues/9479
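
Nothing like this is standardized yet, so any concrete shape is speculative. A purely hypothetical sketch of how a generator might mark a page, and how a crawler might check for the marker (the meta name "ai-generated" is invented here; the linked issue may settle on something entirely different):

    // Purely hypothetical: no standard meta tag for AI-generated content
    // exists yet; the name "ai-generated" is a placeholder, not a spec.
    const marker = document.createElement("meta");
    marker.name = "ai-generated";
    marker.content = "true";
    document.head.appendChild(marker);

    // A crawler or training pipeline could then skip such pages:
    const isGenerated =
      document.querySelector('meta[name="ai-generated"]') !== null;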


I wonder what incentives for adhering to such a meta tag might exist. For example, imagine I send you my digital resume and it has an AI-generated footer tag on display? Maybe a bad example; I like the idea of this in general, but my mind wanders to the fact that large entities completely ignored the wishes of robots.txt when collecting the internet's text for their training corpora.


Large entities aside, I would use this to mark my own generated content. It would be even more helpful if you could get the LLM to recognise it, which would allow you to prevent ouroboros situations.

Also, no one is reading your resume anymore and big corps cannot be trusted with any rule as half of them think the next-word-machine is going to create God.


I went to the lodash docs and asked how I'd use the 'pipeline' operator (which doesn't exist), and it correctly pointed out that pipeline isn't a thing and suggested chain() for normal code and flow() for lodash fp instead. That's pretty much spot on. If I were guessing, I'd suggest that the base model has a lot more lodash code examples in the training data, which probably makes a big difference to the quality of the output.
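
For reference, the distinction it drew looks roughly like this (a minimal sketch assuming lodash 4.x and lodash/fp; the sample data and variable names are made up):

    import _ from "lodash";
    import { flow, map, filter } from "lodash/fp";

    // "Normal" lodash: explicit chaining, terminated with .value()
    const viaChain = _.chain([1, 2, 3, 4])
      .map((n) => n * 2)
      .filter((n) => n > 4)
      .value(); // [6, 8]

    // lodash/fp: auto-curried, data-last functions composed with flow()
    const pipeline = flow(
      map((n: number) => n * 2),
      filter((n: number) => n > 4)
    );
    const viaFlow = pipeline([1, 2, 3, 4]); // [6, 8]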


The lack of a pipeline operator in JS (and JS libraries like lodash) has also been discussed online a lot.


Exactly the point. If there's a lot of data in the training set the results will be better.


I guess I'm trying to emphasize the distinction between information in the repo (code) vs. information elsewhere (discussions) that the model looks at.


> It's so wrong in every section I saw.

Not talking about this tool, but in general: incorrect LLM-generated documentation can have some value. A developer knows they should write some docs, but they're staring at a blank screen, not sure what to write, so they don't. Then the developer runs an LLM, gets a screenful of LLM-generated docs, notices it is full of mistakes, and starts correcting them; suddenly, a screenful of half-decent docs.

For this to actually work, you need to keep the quantity of generated docs a trickle rather than a flood: too many and the developer's eyes glaze over and they miss stuff or just can't be bothered. But a small trickle of errors to correct could actually be a decent motivator to build up better documentation over time.


At some point it will be less wrong (TM) and it'll be helpful. Feels generally like a good bet.


Will it though?

Fundamentally this is an alignment problem.

There isn't a single AI out there that won't lie to your face, reinterpret your prompt, or just decide to ignore your prompt.

When they try to write a doc based off code, there is nothing you can do to prevent them from making up a load of nonsense and pretending it is thoroughly validated.

Do we have any reason to believe alignment will be solved any time soon?


Why should this be an issue? We are producing more and more correct training data and at some point the quality will be sufficient. To me it's not clear what speaks against this.


Look up AI safety and THE alignment problem.

This isn't a matter of training data quality.


We don’t expect 100% reliability from humans. Humans will slack off, steal, defraud, harass each other, sell your source code to a foreign intelligence service, or turn your business behind your back into a front for international drug cartels. Some of that is very low probability, but never zero probability. So is it really a problem if we can’t reduce the probability to literally zero for AIs either?


Humans have incentives to not do those things. Family. Jail. Money. Food. Bonuses. Etc.

If we could align an AI with incentives in the same way we can a person, then you'd have a point.

So far alignment research is hitting dead ends no matter what fake incentives we try to feed an AI.


Can you remind me of the link between alignment and writing accurate documentation? Honestly don't understand how they are linked.


You want the AI aligned with writing accurate documentation, not aligned with a goal that's near but wrong, e.g. writing accurate-sounding documentation.


I tried it on a big OCaml project (https://deepwiki.com/libguestfs/virt-v2v) and it seems correct albeit very superficial. It helps that the project is extensively documented and the code well commented, because my feeling is that it's digesting those code comments along with the documentation to produce the diagrams. It seems decent as a starting point to understanding the shape of the project if I'd never seen it before. This is the sort of thing you could do yourself but it might take an hour or more, so having it done for you is a productivity gain.


Likewise, I tested this with a project we're using at work (https://deepwiki.com/openstack/kayobe-config), and at first it seems rather impressive, until you realize the diagrams don't actually give any useful understanding of the system. Then, asking it questions, it gave useful-seeming answers that I knew were wholly incorrect. Worse than useless: disorienting and time-wasting.


> I hope actual users never see this

I have bad news for you, this website has been appearing near the top of the search results for some time now. I consciously avoid clicking on it every time.


Please don't correct the AI documentation. Just let those projects die as they deserve.


What model did you use?


> The text sections take implementation details that don't matter and present them to the user like they need to know them.

Yeah this seems to be a recurring issue on each of the repos I've tried. Some occasionally useful tables or diagrams buried in pages of distracting irrelevant slop.


This is made by “Devin” I believe.


They will.

It's the first result on Google for just about anything technical I search for.



