clharman's comments

clharman · 2026-01-29T19:27:46

I'm not sure it's astroturfed exactly, but the hype is not coming from technical professionals. Like you find a LinkedIn post with a thousand likes about this or similar projects, and everybody is either #opentowork or ~~Agentic Head of AI Brainstorming at My Bedroom~~

Also clawdbot is objectively a pretty inconvenient way to hook Claude Code up to a chat app. I made a bare-bones one that takes 2 minutes to run with npx: https://github.com/clharman/afk-code

indigodaddy · 2026-01-30T00:15:21

So if I have CC running, say, on a VPS, then that's where your thing needs to run too, right?

clharman · 2026-01-30T01:04:56

Correct!

clharman · 2025-07-28T19:37:54

Pretty sure they are still losing money on it, which is great for us. And these limits wouldn't even be happening if there weren't people bragging about having their CC running constantly for 30 hours writing 2 million lines of (doubtless bad) code. And sharing accounts to try to get even MORE usage. It's all that swarm guy tbh and he's proud of it.

clharman · on Sept 14, 2023

For serious implementations, frameworks are not very helpful, even LangChain. All the components provide good SDKs/APIs, so having a bunch of "integrations" doesn't add any real value.

If you know what you want to build, building from scratch is easier than you think. If you're tinkering on the weekend, then maybe the frameworks are helpful.

lmeyerov · on Sept 14, 2023

Yeah, as soon as we write the word 'thread' or think about LLM API concurrency control across many user requests, all the frameworks we tried are basically a wall instead of an accelerator. For a single-user demo video on Twitter or a low-traffic Streamlit POC to get a repo with lots of stargazers, they work quite well, and that's not far from what someone needs for an internal project with a small userbase. But once this is supposed to be infra for production-grade software, the tools we have tried so far are still prioritizing features over being a foundation.
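
For illustration, the concurrency control described above can be done without a framework. This is a minimal sketch, assuming an async client: `fake_llm_call`, `Limiter`, `handle_requests`, and `MAX_CONCURRENT` are hypothetical names, and the sleep stands in for a real SDK call.

```python
import asyncio

MAX_CONCURRENT = 2  # assumed concurrency budget; tune to your provider's limits


async def fake_llm_call(prompt: str) -> str:
    """Placeholder for a real LLM SDK call; sleeps to simulate latency."""
    await asyncio.sleep(0.01)
    return f"echo: {prompt}"


class Limiter:
    """Caps the number of in-flight LLM calls shared by all user requests."""

    def __init__(self, limit: int) -> None:
        self._sem = asyncio.Semaphore(limit)
        self.in_flight = 0
        self.peak = 0  # highest concurrency observed, for inspection

    async def run(self, prompt: str) -> str:
        async with self._sem:  # waits here when the budget is exhausted
            self.in_flight += 1
            self.peak = max(self.peak, self.in_flight)
            try:
                return await fake_llm_call(prompt)
            finally:
                self.in_flight -= 1


async def handle_requests(prompts, limit=MAX_CONCURRENT):
    """Run all prompts concurrently, never exceeding `limit` calls at once."""
    limiter = Limiter(limit)
    results = await asyncio.gather(*(limiter.run(p) for p in prompts))
    return results, limiter.peak


if __name__ == "__main__":
    results, peak = asyncio.run(handle_requests(["a", "b", "c", "d", "e"]))
    print(results, "peak concurrency:", peak)
```

A production version would likely add retries, timeouts, and per-user fairness; the semaphore only covers the basic cap.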

clharman · on Sept 14, 2023

Vector is better for some use cases (open-domain, more conversational data) and term-based search is better for others (closed-domain, more keyword-based).

I've found that internal enterprise projects tend to be very keyword-based, and vector search often produces weird, head-scratcher results that users hate, whereas term-based search does a better job of capturing the right terms, if you do the proper synonym/abbreviation expansions.

That said, I use them both, usually with vector search as a fallback after the initial keyword-based RAG pass.
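
A rough sketch of that keyword-first, vector-fallback flow, in Python with only the standard library. Everything here is an assumption for illustration: the `SYNONYMS` table, the `DOCS` corpus, and all function names are hypothetical, and `embed` is a toy character-trigram stand-in for a real embedding model, not an actual vector index.

```python
import math
import re
from collections import Counter

# Hypothetical abbreviation/synonym table; a real one would be domain-specific.
SYNONYMS = {"po": ["purchase order"], "hr": ["human resources"]}

DOCS = [
    "Purchase order approval requires a manager signature.",
    "Human resources handles onboarding paperwork.",
    "The cafeteria menu changes weekly.",
]


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def expand(query):
    """Return query terms plus their synonym/abbreviation expansions."""
    terms = tokenize(query)
    expanded = list(terms)
    for term in terms:
        for phrase in SYNONYMS.get(term, []):
            expanded.extend(tokenize(phrase))
    return expanded


def keyword_search(query, docs, min_hits=1):
    """Rank docs by how many distinct expanded query terms they contain."""
    terms = set(expand(query))
    scored = []
    for i, doc in enumerate(docs):
        hits = len(terms & set(tokenize(doc)))
        if hits >= min_hits:
            scored.append((hits, i))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored]


def embed(text):
    """Toy stand-in for an embedding model: character trigram counts."""
    s = " ".join(tokenize(text))
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def vector_search(query, docs, top_k=1):
    q = embed(query)
    ranked = sorted(range(len(docs)), key=lambda i: -cosine(q, embed(docs[i])))
    return ranked[:top_k]


def retrieve(query, docs):
    """Keyword pass first; fall back to vector search only when it finds nothing."""
    hits = keyword_search(query, docs)
    if hits:
        return "keyword", hits
    return "vector", vector_search(query, docs)


if __name__ == "__main__":
    print(retrieve("PO approval", DOCS))  # abbreviation expands to "purchase order"
    print(retrieve("onboard", DOCS))      # no exact term match, so the fallback runs
```

The fallback condition (no keyword hits at all) is the simplest possible choice; a score threshold or a merged ranking would be reasonable alternatives.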

clharman · on Sept 14, 2023

You need a vector db because all the vector db companies need customers...

You definitely do need information retrieval. It just shouldn't be limited to vector dbs. Unfortunately vector db companies and the VCs that back them have flooded the internet with propaganda suggesting vector db is the only choice. https://colinharman.substack.com/p/beware-tunnel-vision-in-a...

For most serious use cases, you'll have far too much data to fit into 1 (or several) inference contexts.

clharman · on Aug 19, 2023

Yeah that exact use case keeps coming up in my work - populating forms or structured documents based on retrievable information is possibly the best broad enterprise application category in terms of bang for buck!

clharman · on Aug 19, 2023

Interesting that now, in contrast, startups are desperate to claim any specificity at all for differentiation, with the general solutions being so powerful

clharman · on Aug 19, 2023

@rafaelero I'm working on a blog post (https://colinharman.substack.com/) to demonstrate this fact, since I get a lot of tiresome questions like "why don't you just train instead of retrieving"

Do you have any scripts you could share for the training/eval process? Would love to credit you in the post

clharman · on June 21, 2023

Vector search is not always the right tool. So why do VCs pretend that it is?