kartoolOz's comments | Hacker News

It's very hyper-parameter dependent, and in my testing didn't provide comparable performance to maxsim.
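For context, by maxsim I mean ColBERT-style late-interaction scoring; a minimal sketch (the function name and shapes are mine, not from any particular library):

```python
import torch

def maxsim_score(query_emb: torch.Tensor, doc_emb: torch.Tensor) -> torch.Tensor:
    """ColBERT-style late interaction: for each query token, take the best-matching
    doc token, then sum those maxima. Shapes: query_emb [n_q, d], doc_emb [n_d, d]."""
    q = torch.nn.functional.normalize(query_emb, dim=-1)
    d = torch.nn.functional.normalize(doc_emb, dim=-1)
    sim = q @ d.T                        # cosine similarity, [n_q, n_d]
    return sim.max(dim=-1).values.sum()  # max over doc tokens, sum over query tokens
```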


"कर्मण्येवाधिकारस्ते मा फलेषु कदाचन | मा कर्मफलहेतुर्भूर्मा ते सङ्गोऽस्त्वकर्मणि" - Bhagvad gita, chapter 2, verse 47.

You have a right to perform your prescribed duties, but you are not entitled to the fruits of your actions. Never consider yourself to be the cause of the results of your activities, nor be attached to inaction.


WebArena does this really well with what it calls the "accessibility_tree": https://github.com/web-arena-x/webarena/blob/main/browser_en...
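As a rough illustration of the idea (not WebArena's actual code, which builds its own tree), here is a minimal sketch using Playwright's built-in accessibility snapshot, a simpler cousin of what the linked file produces:

```python
from playwright.sync_api import sync_playwright

def dump_ax_tree(node, depth=0, lines=None):
    """Flatten an accessibility snapshot into indented 'role "name"' lines,
    roughly the kind of text representation an agent can consume."""
    if lines is None:
        lines = []
    lines.append("  " * depth + f'{node.get("role", "")} "{node.get("name", "")}"')
    for child in node.get("children", []):
        dump_ax_tree(child, depth + 1, lines)
    return lines

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://news.ycombinator.com")
    snapshot = page.accessibility.snapshot()  # Playwright's built-in AX snapshot (illustration only)
    if snapshot:
        print("\n".join(dump_ax_tree(snapshot)))
    browser.close()
```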


The information-extraction and question-answering metrics are far worse than the Transformer baselines', though.

They also say as much in the blog post: "However, both Based and Mamba still underperform the strongest Transformer baseline, sometimes by large margins. This is consistent with our “no free lunch” observation above."


The AlphaCode 2 technical report claims to solve 43% of problems (77 problems from 12 Codeforces contests), placing around the 85th percentile of human participants.

The caveat is buried deep in the technical report:

1) Generates ~1M candidates from N different models (fine-tuned on past Codeforces problems)

2) Throws away ~95% of candidates (those that don't compile or fail the public test cases)

3) Groups semantically similar candidates

4) Scores candidates from each group with another scoring model (probably based on latency, problem descriptions, etc.)

5) Picks top 10

Makes 10 submissions and finally gets the score (roughly the pipeline sketched below).
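In rough Python pseudocode, the pipeline reads something like this; every name and callable here is a placeholder of mine, not DeepMind's code:

```python
from collections import defaultdict

def alphacode2_pipeline(problem, sample_fns, run_on_inputs, passes_public_tests,
                        score, n_samples=1_000_000, max_submissions=10):
    """Paraphrase of the AlphaCode 2 selection pipeline as described in the report.
    problem is a placeholder object with a `generated_inputs` attribute;
    run_on_inputs(code, inputs) must return a hashable summary of the program's outputs."""
    # 1) Sample ~1M candidates from the ensemble of fine-tuned models
    candidates = [fn(problem)
                  for _ in range(n_samples // len(sample_fns))
                  for fn in sample_fns]

    # 2) Filter out the ~95% that don't compile or fail the public example tests
    candidates = [c for c in candidates if passes_public_tests(c, problem)]

    # 3) Cluster by behaviour: programs producing identical outputs on generated inputs group together
    clusters = defaultdict(list)
    for c in candidates:
        clusters[run_on_inputs(c, problem.generated_inputs)].append(c)

    # 4) Score one representative per cluster with a separate scoring model, 5) submit the top 10
    representatives = [group[0] for group in clusters.values()]
    return sorted(representatives, key=score, reverse=True)[:max_submissions]
```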

Sure, this is how humans solve problems... totally awed by AGI /s


10 submissions is rather arbitrary. Why not just do 1 billion submissions? AI magic!


Technical report: https://storage.googleapis.com/deepmind-media/gemini/gemini_... Nano-2 is 3.25B parameters, and per Figure 3, Nano-2 scores roughly 0.6-0.8x as well as Pro, while Ultra scores roughly 1.05-1.3x as well as Pro.

Roughly, that should put Gemini Ultra in the sub-100B range?
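The back-of-envelope behind that guess, with the big caveat that score is almost certainly not linear (or even cleanly log-linear) in parameter count, and that Pro's size is an assumed input, not a published number; 0.7 and 1.2 are roughly the midpoints of the Figure 3 ranges:

```python
import math

def implied_ultra_size(pro_params_b, nano2_params_b=3.25,
                       nano2_rel_score=0.7, ultra_rel_score=1.2):
    """Crude extrapolation: assume relative score is linear in log(params),
    anchored by Nano-2 (3.25B, ~0.7x Pro) and Pro itself (1.0x by definition).
    pro_params_b is an assumption; Google has not published Pro's size."""
    slope = (1.0 - nano2_rel_score) / math.log(pro_params_b / nano2_params_b)
    return pro_params_b * math.exp((ultra_rel_score - 1.0) / slope)

for pro in (15, 30, 60):  # hypothetical Pro sizes in billions of parameters
    print(f"Pro = {pro}B  ->  implied Ultra ~ {implied_ultra_size(pro):.0f}B")
```

The implied size swings a lot with the assumed Pro size and scaling shape, so treat sub-100B as a guess.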


Those calculations definitely do not scale linearly


Hi, thanks for open-sourcing the code! I was trying to reuse it, especially the per-channel dynamic quantization (int8 on GPU), but couldn't get it to work. I also checked out the torchao package, but it looks like it depends on the nightly channel, and SAM's dynamic implementation with Triton has other issues. Is there any clean implementation of int8 dynamic post-training quantization that you can point to?
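For reference, this is roughly what I'm after: a minimal plain-PyTorch sketch of per-channel int8 weights plus per-token dynamic activation quantization (not SAM's or torchao's code; the matmul here dequantizes back to fp, so a real speedup still needs a fused int8 GEMM kernel, e.g. in Triton):

```python
import torch

def quantize_per_channel(w: torch.Tensor):
    """Symmetric int8 quantization of a weight matrix [out_features, in_features],
    one scale per output channel."""
    scale = w.abs().amax(dim=1, keepdim=True).clamp(min=1e-8) / 127.0
    w_int8 = torch.clamp(torch.round(w / scale), -127, 127).to(torch.int8)
    return w_int8, scale

def dynamic_int8_linear(x: torch.Tensor, w_int8: torch.Tensor, w_scale: torch.Tensor, bias=None):
    """Dynamic quantization: activations are quantized per row at call time.
    The matmul below runs in float after casting; a real implementation would
    dispatch to an int8 GEMM kernel to actually get the speedup."""
    x_scale = x.abs().amax(dim=-1, keepdim=True).clamp(min=1e-8) / 127.0
    x_int8 = torch.clamp(torch.round(x / x_scale), -127, 127).to(torch.int8)
    acc = x_int8.float() @ w_int8.float().T   # placeholder for an int8 GEMM
    out = acc * x_scale * w_scale.T           # fold both scales back in
    return out + bias if bias is not None else out

# usage: quantize a linear layer's weights once, quantize activations on the fly
lin = torch.nn.Linear(512, 512)
w_int8, w_scale = quantize_per_channel(lin.weight.detach())
y = dynamic_int8_linear(torch.randn(4, 512), w_int8, w_scale, lin.bias.detach())
```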


What’s the issue with getting int8 dynamic quantization to work? As in, you’re unable to get it to quantize or to run with speedups?


You would likely see big improvements in retrieval accuracy by fine-tuning e5-base-v2 or the newer leaders on the MTEB benchmark.
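Something like this with sentence-transformers, as a sketch; the pairs and hyperparameters are illustrative, and e5 models expect the "query: " / "passage: " prefixes at training and inference time:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("intfloat/e5-base-v2")

# (query, relevant passage) pairs from your own domain; illustrative examples only
train_examples = [
    InputExample(texts=["query: reset api key",
                        "passage: To rotate your API key, go to Settings..."]),
    InputExample(texts=["query: rate limits",
                        "passage: The public API allows 100 requests per minute..."]),
]

train_loader = DataLoader(train_examples, shuffle=True, batch_size=32)
# In-batch negatives: every other passage in the batch acts as a negative
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(
    train_objectives=[(train_loader, train_loss)],
    epochs=1,
    warmup_steps=100,
)
model.save("e5-base-v2-finetuned")
```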


Definitely. I prefer the sentence-transformers ones since they have been fine-tuned on CodeSearchNet. I'm also really excited about the latest gte models by Alibaba; their smallest model is the size of MiniLM-L6 but beats MPNet.


Vespa.ai is pretty crazy too, and a bit unknown. We run a huge Vespa cluster serving 1k+ queries at <100ms latency ...


A Hindu is someone who practices Hinduism and identifies as such; it's not a language, that's Hindi.


Sorry, my bad. I should have double checked. Thanks for pointing it out.

