enoch2090's comments on Hacker News

GW! How does Onyx differ from Open WebUI and the like?


Reposting from https://news.ycombinator.com/item?id=46047430:

---

Broadly, I think other open source solutions are lacking in (1) integration of external knowledge into the chat, (2) simple UX, and (3) complex "agent" flows. Both internal RAG and web search are hard to do well, and since we started as an enterprise search project, we've spent a lot of time making them good.

Most (all?) of these projects have UXs that are quite complicated (e.g. exposing front-and-center every model param like Top P without any explanation, no clear distinction between admin/regular user features, etc.). For broader deployments this can overwhelm people who are new to AI tools.

Finally, trying to do anything beyond a simple back-and-forth with a single tool call isn't great with a lot of these projects. So something like "find me all the open source chat options, understand their strengths/weaknesses, and compile that into a spreadsheet" will work well with Onyx, but not so well with other options (again partially due to our enterprise search roots).


Surprisingly, SAM3 works badly on engineering drawings, while SAM2 kinda works, and VLMs like Qwen3-VL work as well.


Had good luck with Gemini 2.5; SAM3 failed miserably with PIDs.


Yeah, I tried too. I'm trying fine-tuning on PIDs.


Looking forward to your progress! Just checked the paper, and it says the underlying backbone is still DETR. My guess would be that SAM3 uses more video frames during training, which dilutes the sparse engineering-drawing-like data.


Although a bit off the GPU topic, I think Apple's Rosetta is the smoothest binary transition I've ever used.


That's the essence of these services: they never explicitly mention the quota, or they secretly lower it at some point.


Played with the demo a bit and I got confused.

1. The chat context is always provided, and that introduces a bit of uncertainty: when the chat history mentions something, the model is always inclined to connect with it.

2. When I tried setting each context to an empty string, the model showed no evidence of remembering concepts. I told it 5 times that I love cats, and when asked about its favorite animal, its output remained "honeybee" and "octopus".


I can’t decide if I’m skeptical of the entire concept or not. I guess I believe it will do something to the network to add this EMA of vectors in, so I’m surprised you didn’t get at least a change in animals after talking about cats. But, I’m not clear that reweighting logits at the end is super useful. I guess this is supposed to be in some way a realtime LoRA, but then what do you have except a super-undertrained LoRA, trained just off whatever conversations you’ve had?
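If the "EMA of vectors" reading is right, the core update would look something like this (a pure-Python sketch; the decay value and the idea that this is how the project does it are my assumptions, not anything from the post):

```python
def ema_update(memory, new_vec, decay=0.9):
    """Blend a new activation vector into a running memory vector.

    memory  -- current exponential-moving-average vector
    new_vec -- vector from the latest conversation turn
    decay   -- how much of the old memory survives each update (assumed value)
    """
    return [decay * m + (1 - decay) * x for m, x in zip(memory, new_vec)]
```

If something like this is happening, it would also be consistent with the cat experiment above: at decay 0.9, five identical updates only move the memory 1 - 0.9^5 ≈ 0.41 of the way toward the new vector, so repeated mentions would barely shift the output.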


Looks pretty good, but all CJK characters are displayed as question marks (???), and switching to a CJK-native font doesn't help. So my user folder is now crowded with ????????, which makes it hard to navigate :(


Unfortunately, I only support Latin and Cyrillic for now, but full Unicode support will come in the future.


Can't wait to replace my Explorer with this; hope the support comes soon.


For llama3, just ask him to install Ollama and serve the model. Ollama has automatic memory management and will free the model when it's not in use, and whenever you make a call to the API (do let your friend know before you do this), Ollama will load the model back into memory.

Not sure whether there's anything similar for SD, though.
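From the caller's side, it's roughly this (a sketch assuming Ollama's default port 11434 and the `llama3` model tag; the `keep_alive` field tells Ollama how long to keep the model loaded after a request before freeing it):

```python
import json
import urllib.request


def build_generate_payload(prompt, model="llama3", keep_alive="5m"):
    """Build the JSON body for Ollama's /api/generate endpoint.

    keep_alive controls how long Ollama keeps the model in memory
    after the request; the model is unloaded automatically afterwards.
    """
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "keep_alive": keep_alive,
    }


def ask_ollama(prompt, host="http://localhost:11434"):
    """Send one generate request; Ollama loads the model on demand."""
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(build_generate_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

The first call after an idle period will be slow, since Ollama has to load the model back into memory, so warn your friend before hitting it.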


This, plus connect via Tailscale, and you can access it from anywhere (assuming your friend's laptop is online).


I wonder how they would calculate the metrics if the result is generated instead of retrieved. Is it likely that the LLM can generate exactly the same output as the desired result?


Streamlit's state management is so painful, feels like constantly writing hacks.


Actually, does Qt have something like FastUI? Been looking for one.


You can compile Qt to target the browser instead of native apps. But beyond seeing a demo, I don't know how good it is in terms of performance, accessibility, and so on.


I think Qt does have WASM compilation options, but I mean generating interfaces from models like Pydantic/dataclasses/etc.
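Something FastUI-like can be sketched on top of `dataclasses.fields`: map each field's type to a widget descriptor that a Qt (or any other) frontend could render. The widget names and the `Job` model here are hypothetical, just to show the shape of the idea:

```python
from dataclasses import dataclass, fields

# Hypothetical mapping from Python types to widget kinds a frontend would render.
WIDGETS = {int: "spinbox", float: "doublespinbox", bool: "checkbox", str: "lineedit"}


def form_spec(model) -> list[dict]:
    """Derive a flat form description from a dataclass definition.

    Assumes plain (non-stringified) annotations, so field.type is the
    actual type object; unknown types fall back to a line edit.
    """
    return [
        {"name": f.name, "widget": WIDGETS.get(f.type, "lineedit")}
        for f in fields(model)
    ]


@dataclass
class Job:
    title: str
    salary: int
    remote: bool
```

Pydantic would give you more to work with (validators, constraints, JSON Schema export), but the dataclass version shows the core trick: the UI is derived from the model, not written by hand.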

