Hacker News | SamDc73's comments

I highly suspect he might even consider Anthropic, since at some point they enforced restrictions on OpenClaw using their APIs.

yes that's the blunder I'm talking about

I switched from YouTube to Invidious mainly because it doesn't support Shorts, and I blocked YouTube at the DNS level. It's a bit slower, but I know I won't be sucked into doom-scrolling.

I mean, they're only running a small version of Codex. Can they run the full one, or isn't the technology there yet?

1000 tokens/sec from a highly specialised model is what agents are going to require.

Dedicated knowledge, fast output, rapid iteration.

I have been trying out SMOL models, as coding models don't need the full corpus of human history.

My most recent build was good but too small.

I am thinking of a model that is highly tuned to coding and agentic loops.


This is model 12188, which claims to rival SOTA models while not even being in the same league.

In terms of intelligence per compute, it’s probably the best model I can realistically run locally on my laptop for coding. It’s solid for scripting and small projects.

I tried it on a mid-size codebase (~50k LOC), and the context window filled up almost immediately, making it basically unusable unless you're extremely explicit about which files to touch. I tested it with an 8k context window but will try again with 32k and see if it becomes more practical.

I think the main blocker for using local coding models more is the context window. A lot of work is going into making small models “smarter,” but for agentic coding that only gets you so far. No matter how smart the model is, an agent will blow through the context as soon as it reads a handful of files.
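To put rough numbers on that (a back-of-the-envelope sketch; the per-line and per-file figures are assumptions, not measurements):

```python
# Back-of-the-envelope: how quickly file reads consume a small context window.
# Assumed figures (illustrative, not measured): ~10 tokens per line of code
# and ~300 lines per file.
TOKENS_PER_LINE = 10
LINES_PER_FILE = 300
CONTEXT_WINDOW = 8_000  # an 8k window

tokens_per_file = TOKENS_PER_LINE * LINES_PER_FILE
files_until_full = CONTEXT_WINDOW // tokens_per_file
print(f"~{tokens_per_file} tokens/file; window full after ~{files_until_full} files")
```

Under those assumptions, an agent fills an 8k window after reading just two or three files, before it has written a single line.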


The small context window has been a recognized problem for a while now. Really, only Google has been able to deliver a good long context window.


You should look into using subagents, which each have their own context window and don't pollute the main one.
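A minimal sketch of the pattern (`llm_call` is a hypothetical stand-in for whatever model API you use; a real implementation would call the model there):

```python
# Subagent pattern: each subtask runs with a fresh context, and only a
# compact summary flows back into the main agent's history.

def llm_call(messages: list[dict]) -> str:
    # Hypothetical stand-in: a real implementation would call your model here.
    return f"summary of: {messages[0]['content']}"

def run_subagent(task: str) -> str:
    # Fresh context -- only the task description, none of the main history.
    messages = [{"role": "user", "content": task}]
    return llm_call(messages)

def main_agent(history: list[dict], subtasks: list[str]) -> list[dict]:
    for task in subtasks:
        # The main context only ever grows by one short summary per subtask.
        history.append({"role": "tool", "content": run_subagent(task)})
    return history

history = main_agent([], ["read src/parser.py", "read src/lexer.py"])
print(len(history))  # one summary entry per subtask
```

The file contents the subagents read never enter the main context; only their summaries do.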


What are you talking about? Qwen3-Coder-Next supports 256k context. Did you mean that you don't have enough memory to run it locally yourself?


Yes!

I tried to go as far as 32k on the context window, but beyond that it won't be usable on my laptop (Ryzen AI 365, 32 GB RAM and 6 GB of VRAM).


You need at least 2x 24 GB GPUs for this model (46 GB minimum).
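As a rough sanity check on that figure (the parameter count and quantization level below are illustrative assumptions, not official numbers for this model):

```python
# Back-of-the-envelope VRAM estimate for running a model locally.
# Assumptions (illustrative): 80B parameters at 4-bit quantization,
# plus a few GB for KV cache, activations, and runtime buffers.
PARAMS_B = 80          # billions of parameters
BITS_PER_WEIGHT = 4    # 4-bit quantization
OVERHEAD_GB = 6        # KV cache, activations, runtime

weights_gb = PARAMS_B * BITS_PER_WEIGHT / 8  # 1B params at 1 byte ~= 1 GB
total_gb = weights_gb + OVERHEAD_GB
print(f"weights ~= {weights_gb:.0f} GB, total ~= {total_gb:.0f} GB")
```

Under those assumptions the weights alone are ~40 GB, which is why a single 24 GB card doesn't cut it.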


I was waiting for someone to say "this is what happens when you vibe code"



https://codose.ai

Still experimenting with different ways to make learning easier using LLMs.

I put together Codose as a tool where you paste a link to an Exercism or LeetCode problem, and it spins up a code editor with an AI tutor that walks you through the solution step by step, with mini lessons along the way when you need them.

You can try it without signing up, but I'm on the Google AI Studio free tier right now, so I'm not sure how much usage it can handle.


Nice one.

But the graph view seems broken?

And for the data, it would be nice to have the original URL for each comment as a reference.


Thanks! Is it that you can't see the graph view at all? It uses DeckGL, which might be a bit too fancy and may not work in all browsers. It works in Chrome/Safari on my iPhone and Chrome on Arch.

I didn't realize you could actually provide a working link back to Hacker News, but it seems HN does support that. Thanks, I will give that a try!


https://imgur.com/a/EmtADiA

This is how the graph looks: not clustered by tag or anything ... (I was expecting a view like Logseq or Obsidian)


Oh ok. Yeah, it is just not really complete yet. What this does is display a UMAP of the embeddings for the selected feature, and you can mouse over for labels. It isn't identifying clusters itself yet, finding exemplars, etc. Right now, points that are close to each other are just semantically close, as judged by UMAP.

That being said, there isn't much data for this month yet. If you look at last month's data (https://nthesis.ai/public/e4883705-ec05-4e7a-83ac-6b878cc1e8...) , clusters are more apparent (particularly if you view the tags instead of summary).


Nice project.

One thing I’ve always disliked about RSS (and this could actually fix it) is duplicates. When a new LLM drops, for example, there are ~5 blogs about it in my RSS feed saying basically the same thing, and I really only need to read one. Maybe you could collapse similar articles by topic?
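A sketch of one way to do that collapse, using cosine similarity over embeddings (`embed` here is a toy stand-in for a real embedding model, so the example is self-contained):

```python
import math

def embed(text: str) -> list[float]:
    # Toy stand-in for a real embedding model: a bag-of-letters vector.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isascii() and ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def collapse(titles: list[str], threshold: float = 0.95) -> list[str]:
    # Keep only the first article of each near-duplicate cluster.
    kept, kept_vecs = [], []
    for title in titles:
        v = embed(title)
        if all(cosine(v, kv) < threshold for kv in kept_vecs):
            kept.append(title)
            kept_vecs.append(v)
    return kept
```

With a real embedding model, semantically similar articles (not just lexically similar ones) would land in the same cluster, and the threshold would need tuning.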

Also, would be nice to let users provide a list of feed URLs as a variable instead of hardcoding them.


Looking at the console messages with the LLM reasoning, it does seem to work quite nicely for deduplication. Your example is probably even a lot easier than news articles, where you can have many articles about the same event from different viewpoints.

I don't actually plan to run this as a service, so some things are hard-coded and the setup is a bit difficult, as you need an API key and a proxy. Currently it's just experimentation, although if it works well, I'll probably use it personally.


> If you want a half-decent model, at least look at NYC

I don't have the data, honestly, but isn't NYC (and its surrounding cities/suburbs) denser than the Bay Area?

In SF (the city) transportation is quite decent because it's dense. Single-family houses and public transportation together are very, very tricky to pull off; you have to choose one or the other, and most people would rather live in a single-family house than in an apartment/condo with good transportation.


