Broadly, I think other open source solutions are lacking in (1) integration of external knowledge into the chat, (2) simple UX, and (3) complex "agent" flows.
Both internal RAG and web search are hard to do well, and since we started as an enterprise search project, we've spent a lot of time making them good.
Most (all?) of these projects have UXes that are quite complicated (e.g. exposing every model param like Top P front-and-center without any explanation, no clear distinction between admin and regular-user features, etc.). For broader deployments this can overwhelm people who are new to AI tools.
Finally, trying to do anything beyond a simple back-and-forth with a single tool call isn't great with a lot of these projects. So something like "find me all the open source chat options, understand their strengths/weaknesses, and compile that into a spreadsheet" will work well with Onyx, but not so well with other options (again partially due to our enterprise search roots).
Looking forward to your progress! Just checked the paper, and it says the underlying backbone is still DETR. My guess would be that SAM3 uses more video frames during training, which diluted the sparse, engineering-paper-like data.
1. The chat context is always provided, and that introduces a bit of uncertainty: when the chat history mentions something, the model is always inclined to connect to it.
2. When I set each context to an empty string, the model showed no evidence of remembering concepts. I told it five times that I love cats, and when asked about its favorite animal, its output remained "honeybee" and "octopus".
I can’t decide whether I’m skeptical of the entire concept. I do believe that mixing an EMA of vectors into the network will do something, so I’m surprised you didn’t get at least a change in animals after talking about cats. But I’m not clear that reweighting logits at the end is all that useful. I guess this is supposed to be a realtime LoRA of sorts, but then what do you have except a severely undertrained LoRA, trained on nothing but whatever conversations you’ve had?
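For what it's worth, here's a minimal sketch of the mechanism as I understand it, in Python; all the names here (memory_vector, update_memory, reweight_logits, decay, strength) are mine, not the project's, and the shapes are made up:

```python
import torch

vocab_size, dim = 50_000, 512
decay = 0.99  # EMA decay: higher = slower to absorb new conversations

# Output embedding matrix (vocab_size x dim), as in most LMs.
output_embeddings = torch.randn(vocab_size, dim)

# Running EMA over hidden states seen in past conversations.
memory_vector = torch.zeros(dim)

def update_memory(hidden_state: torch.Tensor) -> None:
    """Fold one conversation's hidden state into the running average."""
    global memory_vector
    memory_vector = decay * memory_vector + (1 - decay) * hidden_state

def reweight_logits(logits: torch.Tensor, strength: float = 0.1) -> torch.Tensor:
    """Bias logits toward tokens whose embeddings align with the memory."""
    bias = output_embeddings @ memory_vector  # (vocab_size,)
    return logits + strength * bias

logits = torch.randn(vocab_size)
biased = reweight_logits(logits)
```

If that's roughly what's happening, then memory_vector is a single direction averaged over everything, so a handful of cat mentions would barely move it, which would be consistent with the null result above.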
Looks pretty good, but all CJK characters are displayed as question marks (???), and switching to a native CJK font doesn't help. So my user folder is now crowded with ???????? entries, which makes it hard to navigate :(
For Llama 3, just ask him to install Ollama and serve the model. Ollama has automatic memory management and will free the model when it's not in use, and whenever you make a call to the API (do let your friend know before you do this), Ollama will load the model back into memory (minimal example below).
Not sure whether there's anything similar for SD, though.
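As a concrete example, once Ollama is serving (it listens on localhost:11434 by default), a call like this will trigger the reload; the model tag and prompt are just placeholders:

```python
import requests

# Ollama exposes an HTTP API on localhost:11434 by default. This call will
# load llama3 back into memory if it was freed, then generate a response.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",       # whatever tag your friend pulled
        "prompt": "Say hello.",  # placeholder prompt
        "stream": False,         # return one JSON object instead of a stream
    },
    timeout=300,  # the first call after an unload is slow while it reloads
)
print(resp.json()["response"])
```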
I wonder how they would calculate the metrics if the result is generated instead of retrieved. Is it likely that the LLM can generate exactly the same output as the desired result?
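To make the concern concrete, here's a toy sketch of why exact match breaks down on generated output, with token-level F1 (a common QA metric, simplified here without punctuation normalization) as one alternative; the strings are made up:

```python
def token_f1(pred: str, gold: str) -> float:
    """Token-overlap F1, roughly as in SQuAD-style QA evaluation."""
    p, g = pred.lower().split(), gold.lower().split()
    common = sum(min(p.count(t), g.count(t)) for t in set(p))
    if common == 0:
        return 0.0
    precision, recall = common / len(p), common / len(g)
    return 2 * precision * recall / (precision + recall)

gold = "The Eiffel Tower is 330 metres tall."
generated = "It stands 330 metres tall (the Eiffel Tower)."

print(generated == gold)                     # False: exact match fails
print(round(token_f1(generated, gold), 2))   # 0.4: partial credit instead
```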
You can compile Qt to target the browser (Qt for WebAssembly) instead of native apps. But beyond seeing a demo, I don't know how good it is in terms of performance, accessibility, and so on.