
Apple made M3 models with less than 16GB? Man, I can't wait till the cheapest model has at least 128GB.

Yeah, M4 was the generation when the minimum got bumped up to 16GB.

Another European chiming in: I enjoyed the OP's article.

Do you also write your bytecode by hand? At which abstraction layer do we draw the line?

> it has, but python being single threaded (until recently) didn't make it an attractive choice for CLI tools.

You probably mean the GIL, as Python has supported multithreading for like 20 years; the GIL just prevents threads from running Python bytecode in parallel.

Idk if ranger is slow because it is written in Python; it is probably the specific implementation.
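
A toy illustration of the distinction (my own sketch, not from the thread): threads have worked for ages, but on a GIL build, CPU-bound threads don't actually run in parallel.

    import threading, time

    def busy():
        # CPU-bound work: holds the GIL for the duration of each bytecode step
        n = 0
        for _ in range(10_000_000):
            n += 1

    start = time.perf_counter()
    threads = [threading.Thread(target=busy) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # With the GIL this takes roughly 4x the single-thread time despite 4 threads;
    # on a free-threaded (no-GIL) build the threads can run in parallel.
    print(f"{time.perf_counter() - start:.2f}s")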


> You probably mean the GIL

They also probably mean TUIs, as CLIs don't do the whole "draw every X" thing (and usually aren't interactive); that's basically what sets TUIs apart from CLIs.


Even my CC status line script enjoyed a 20x speed improvement when I rewrote it from Python to Rust.

It’s surprising how quickly the bottleneck starts to become Python itself in any nontrivial application, unless you’re very careful to write a thin layer that mostly shells out to C modules.
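
A minimal sketch of what "shelling out to C" buys you; the numbers are mine, not measured for any particular app, but easy to reproduce:

    import timeit

    data = list(range(1_000_000))

    def py_sum(xs):
        # Pure-Python loop: every iteration pays interpreter overhead
        total = 0
        for x in xs:
            total += x
        return total

    print(timeit.timeit(lambda: py_sum(data), number=10))  # slow: bytecode loop
    print(timeit.timeit(lambda: sum(data), number=10))     # fast: loop runs in C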

It is only 4 years old

The technical preview was in June 2021. I was using it for a bit before that as an internal employee, so they may have rounded up slightly, or were also in an internal beta test.

Side note: I’ve been trying to remember when it launched internally, if anybody knows. I feel like it was pre-COVID, but that would be a long timeline from internal use to public preview.


Yes, the technical preview of GitHub Copilot. I rounded up.

Fair enough! The jump from that to ChatGPT’s launch (which I didn’t find that interesting), to GPT-4, to Claude Code/Codex CLI, to Gemini 3/Opus 4.5/GPT 5.2 has been insane in such a short time. I’m excited (especially since the release of the Codex CLI: https://dkdc.dev/posts/modern-agentic-software-engineering/)

Keyboard autocomplete?

I wonder: what if we just crammed more into the "tokens"? I am running an experiment replacing discrete tokens with embeddings + a small byte encoder/decoder. That way you can use the embedding space much more efficiently and have it contain much more nuance.

Experiments I want to build on top of it:

1. Adding LSP context to the embeddings - that way the model could _see_ the syntax better, closer to how we use IDEs, and would not need to read/grep 25k lines just to find where something is used.

2. Experiments with different "compression" ratios. Each embedding could encode a different number of bytes, and we would not rely on a huge static token dictionary.

I'm aware that papers exist that explore these ideas, but so far no popular/good open source models employ this. Unless someone can prove me wrong.
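
A rough PyTorch sketch of the byte-encoder half of the idea (all names and sizes here are made up for illustration; this is not the actual experiment): replace the token-embedding lookup with a small module that folds a fixed-size window of raw bytes into one input embedding.

    import torch
    import torch.nn as nn

    class ByteChunkEncoder(nn.Module):
        def __init__(self, chunk_bytes: int = 8, d_model: int = 512):
            super().__init__()
            self.chunk_bytes = chunk_bytes
            self.byte_emb = nn.Embedding(256, 64)             # one vector per byte value
            self.proj = nn.Linear(chunk_bytes * 64, d_model)  # fold a chunk into one "token"

        def forward(self, byte_ids: torch.Tensor) -> torch.Tensor:
            # byte_ids: (batch, n_bytes), with n_bytes divisible by chunk_bytes
            b, n = byte_ids.shape
            x = self.byte_emb(byte_ids)                       # (b, n, 64)
            x = x.view(b, n // self.chunk_bytes, -1)          # group bytes into chunks
            return self.proj(x)                               # (b, n / chunk_bytes, d_model)

    enc = ByteChunkEncoder()
    text = b"def main() -> i32 {"[:16]                        # 16 bytes -> 2 "tokens"
    ids = torch.tensor(list(text)).unsqueeze(0)
    print(enc(ids).shape)                                     # torch.Size([1, 2, 512])

The LSP idea would then just be extra features (symbol kind, definition vs. reference, and so on) concatenated to the byte embeddings before the projection.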


I found a few papers in this direction with Perplexity, like this one: https://ceur-ws.org/Vol-4005/paper1.pdf. It doesn't seem to be that relevant for now.

The progress of a handful of models seems to be so much better (because of limited compute we have only a handful of big ones, I presume) that these fine-tunings are just not yet relevant.

I'm also curious what an English + Java + HTML + CSS + JavaScript-only model would look like in size and speed, for example.

Unfortunately, whenever I ask myself the question of fine-tuning tokens (just a few days ago this question came up again), the deep dive takes too much time.

Claude only got LSP support in November, I think, and it's not even clear to me to what extent. So despite the feeling that we are moving fast, tons of basic ideas haven't even made it in yet.


If you have a corpus of code snippets to train the manifold (Laplacian) on (and a good embedding model), it is definitely possible to try something like this.

There are many examples of noisily encoding a large embedding vocabulary. This sounds a bit like T-free or H-net? Or BLT?

One of the main issues with this line of work is that you end up trading embedding parameters for active parameters, which is rarely a good trade-off in terms of compute.


Isn't this just an awkward way of adding an extra layer to the NN, except without end-to-end training?

Models like Stable Diffusion sort of do a similar thing using CLIP embeddings. It works, and it's an easy way to benefit from the pre-training CLIP has. But for a language model it would seemingly make more sense to just add the extra layer.


I mean, this is exactly what it is: just a wrapper to replace the tokenizer. That is exactly how LLMs can read images.

I'm just focusing on different parts


Not an expert in the space, but I’m not sure you need to modify tokens to get the model to see syntax; you basically get that exact association from attention.

You only get the association that is relevant to your project if you can cram in the whole codebase. Otherwise it is making rough estimates, and some of the time that seems to be where the models fail.

It can only be fully resolved with either infinite context length or by doing it similarly to how humans do it: adding some LSP "color" to the code tokens.

You can get a feel for what LLMs deal with if you open 3000 lines of code in a simple text editor and try to do something. That may work for simple fixes, but not for whole-codebase refactors. Only ultra-skilled humans can be productive that way (using my subjective definition of "productive").


Well, using the Claude Pro/Max Claude Code API without Claude Code, instead of the actual API they monetize, goes against their ToS.

I don't like it either, but it is what it is.

If I gave free water refills when you used my brand-XYZ water bottle, you should not cry that you don't get free refills for your ABC-branded bottle.

It may be scummy, but it does make sense.


You should never use GIF anymore; it is super inefficient. Just use video, which is 5x to 10x more efficient.

https://web.dev/articles/replace-gifs-with-videos
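
For what it's worth, here is a tiny wrapper along the lines of the conversion that article describes (the exact ffmpeg flags here are my choice, not the article's; it assumes ffmpeg is installed):

    import subprocess

    def gif_to_mp4(src: str, dst: str) -> None:
        subprocess.run([
            "ffmpeg", "-i", src,
            "-movflags", "faststart",  # metadata up front so playback starts early
            "-pix_fmt", "yuv420p",     # widest decoder compatibility
            "-vf", "scale=trunc(iw/2)*2:trunc(ih/2)*2",  # h264 needs even dimensions
            dst,
        ], check=True)

    gif_to_mp4("demo.gif", "demo.mp4")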


There are odd cases where it still has uses. When I was a teacher, some of the gamifying tools didn't allow video embeds without a subscription, but I wanted to make some "what 3D operation is shown here" questions with various tools in Blender. GIF sizes were pretty comparable to video for largely static, less-than-a-second loops, and likely had slightly higher quality with some care taken to reduce color-palette usage.

But I fully realize there are vanishingly few cases with similar constraints.


For those you can often use animated WebP, or even APNG. They all have close to universal support and are usually much smaller.


If you need animated images in emails or text messages, GIF is the only supported format that will play the animation. Because of the size restrictions of these messaging systems, the inefficient compression of GIFs is a major issue.


I am not sure "need" is the right word here.


AVIF works here also. Discord started supporting it for custom emoji.


Videos and images are treated very differently by browsers and OSes. I'm guessing the better suggestion would be to use APNG or animated AVIF if you are looking for a proper GIF alternative.


Do browsers support progressive enhancement from GIF to animated AVIF without JavaScript? They royally messed that up for animated WebP.


Yes, by using the <picture> element with <source> elements declaring the individual formats, with the last one being a regular <img> with the GIF.

Or you could use content negotiation to only send AVIF when it's supported, but IMO the HTML way with <picture> is perhaps clearer for the client and end user.
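
A minimal content-negotiation sketch, assuming a Flask server and files named anim.avif/anim.gif (both hypothetical):

    from flask import Flask, request, send_file

    app = Flask(__name__)

    @app.route("/anim")
    def anim():
        # Send AVIF only when the Accept header advertises support for it
        avif_ok = "image/avif" in request.headers.get("Accept", "")
        resp = send_file("anim.avif" if avif_ok else "anim.gif",
                         mimetype="image/avif" if avif_ok else "image/gif")
        resp.headers["Vary"] = "Accept"  # caches must key on the Accept header
        return resp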

I think the WebP problem was due to browsers supporting WebP but not supporting animation, transparency, or other features, so content negotiation based on MIME types (either via <picture> or HTTP content negotiation) did not work properly. Safari 16.1-16.3 has the same problem with AVIF, but that is a smaller problem than it was with WebP.


So I guess that's a no: AVIF support does not necessarily mean animated-AVIF support.


I covered this in my comment:

> Safari 16.1-16.3 has the same problem with AVIF, but that is a smaller problem than it was with WebP.


Unfortunately, browser vendors didn't want to support silent looping videos in <img> tags, so GIF stays relevant.


Only if the looping information is stored inside the container.


First, so best in this?

