Hacker News

Absolutely hilarious how it gets stuck trying to solve a CAPTCHA each time. I had to explicitly tell it not to go to Google first.

In the end I did manage to get it to play the housepriceguess game:

https://www.youtube.com/watch?v=nqYLhGyBOnM

I think I'll make that my equivalent of Simon Willison's "pelican riding a bicycle" test. It is fairly simple to explain but seems to trip up different LLMs in different ways.


