In my opinion, conferences are mostly high noise, low signal. My boss went to an "AI conference" last month and every takeaway he shared was something I was already advocating for:
- use claude code
- evals are critical
- don't use AI everywhere
I think the conferences are more "work vacations" than work. Maybe they're useful if you think about them this way.
My approach is to "ride the wave": don't resist, try to use it to my advantage, and where that's not possible, adapt and look for where I can keep adding value to the world.
Currently I'm trying to get my own company started, because that way I can iterate on the product faster using AI, and if it goes well I can slowly move from coder to manager. I've come to terms with the fact that my days as a pure coder are coming to an end, but I've been coding for at least 15 years.
Nice, I'll take a look. I was thinking about building a benchmark similar to the one you described, but first focusing on the negotiation between the store and the product suppliers.
Yes, the Shopify alternative is called Openfront[0]. Before that, I built Openship[1], an e-commerce OMS that connects Openfront (and other e-commerce platforms) to fulfillment channels like print on demand. There isn't negotiation built in, but you can connect to something like Gelato[2]: when you get orders on Openfront, they're sent to Gelato to fulfill, and once Gelato ships them, tracking is relayed back to Openfront through Openship.
I was eagerly waiting for a chapter on semantic similarity as I was using Universal Sentence Encoder for paraphrase detection, then LLMs showed up before that chapter :).
I had been working on NLP, mostly NLU, for some years before LLMs. I tried the Universal Sentence Encoder alongside many other ML techniques to understand user intentions and extract entities from text (a rough sketch of that kind of setup is below).
The first time I tried ChatGPT, that was the thing that surprised me most: the way it understood my queries.
I think the spotlight is on the "generative" side of this technology and we're not giving query understanding the credit it deserves. I'm also not sure we're fully taking advantage of this capability.
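To make the contrast concrete, here is a minimal sketch of the pre-LLM embedding approach I mean, assuming the TensorFlow Hub build of the Universal Sentence Encoder; the intent examples, query, and similarity threshold are placeholders, not tuned values:

```python
# Embedding-based intent/paraphrase matching with the Universal Sentence
# Encoder (pre-LLM style). The intents, query, and 0.5 threshold below
# are illustrative only.
import numpy as np
import tensorflow_hub as hub

embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")

def cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Candidate "intents", each described by one example phrase.
intents = {
    "track_order": "Where is my package right now?",
    "cancel_order": "I want to cancel my recent order.",
    "refund": "How do I get my money back?",
}

query = "Can you tell me when my order will arrive?"

# Embed the query and every intent example in a single batch.
vectors = embed([query] + list(intents.values())).numpy()
query_vec, intent_vecs = vectors[0], vectors[1:]

scores = {name: cosine_sim(query_vec, vec)
          for name, vec in zip(intents, intent_vecs)}
best_intent, best_score = max(scores.items(), key=lambda kv: kv[1])

# Only accept the match above an (illustrative) similarity threshold.
if best_score >= 0.5:
    print(f"matched intent: {best_intent} ({best_score:.2f})")
else:
    print("no confident match")
```

It works, but it hinges on hand-picked example phrases and thresholds, which is exactly where LLM-style query understanding felt like a step change to me.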
Yes, I was (and still am) similarly impressed by LLMs' ability to understand the intent of my queries and requests.
I've tried several times to understand the "multi-head attention" mechanism that powers this understanding, but I'm yet to build a deep intuition.
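For what it's worth, here is the minimal mental model I keep coming back to, written as a NumPy sketch; the dimensions and random weights are placeholders, and real models add learned weights, masking, an output projection, positional encodings, and many stacked layers:

```python
# Minimal multi-head self-attention sketch, for intuition only.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, num_heads, rng):
    seq_len, d_model = x.shape
    d_head = d_model // num_heads
    head_outputs = []
    for _ in range(num_heads):
        # Each head gets its own (here: random) query/key/value projections.
        W_q, W_k, W_v = (rng.standard_normal((d_model, d_head)) for _ in range(3))
        Q, K, V = x @ W_q, x @ W_k, x @ W_v
        # Every token's query is scored against every token's key; the
        # softmaxed scores decide how much of each token's value to mix in.
        scores = (Q @ K.T) / np.sqrt(d_head)      # (seq_len, seq_len)
        weights = softmax(scores, axis=-1)
        head_outputs.append(weights @ V)          # (seq_len, d_head)
    # Concatenate the heads back up to the model dimension.
    return np.concatenate(head_outputs, axis=-1)  # (seq_len, d_model)

rng = np.random.default_rng(0)
tokens = rng.standard_normal((5, 16))  # 5 "tokens", d_model = 16
out = multi_head_self_attention(tokens, num_heads=4, rng=rng)
print(out.shape)  # (5, 16)
```

The mechanics are easy enough to follow; what I still can't connect to "understanding" is how this softmax-weighted mixing, repeated across heads and layers, ends up capturing intent so well.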
Are there any research or expository papers that talk about this "understanding" aspect specifically? How could we measure understanding without generation? Are there benchmarks out there specifically designed to test deep/nuanced understanding skills?
Any pointers or recommended reading would be much appreciated.