In my opinion, conferences are mostly high noise, low signal. My boss went to an "AI conference" last month and every takeaway he shared was something I was already advocating for:
- use claude code
- evals are critical
- don't use AI everywhere
I think the conferences are more "work vacations" than work. Maybe they're useful if you think about them this way.
My approach is to "ride the wave": don't resist, try to use it to my advantage, and where that's not possible, adapt and look for where I can keep adding value to the world.
Currently I'm trying to get my own company started, because that way I can iterate on the product faster using AI, and if it goes well I can slowly move from coder to manager. I've come to terms with the fact that my days as a pure coder are coming to an end, but I've been coding for at least 15 years.
Nice, I'll take a look. I was thinking about building a benchmark similar to the one you described, but first focusing on the negotiation between the store and the product suppliers.
Yes, the Shopify alternative is called Openfront[0]. Before that, I built Openship[1], an e-commerce OMS that connects Openfront (and other e-commerce platforms) to fulfillment channels like print on demand. There isn't negotiation built in, but you can connect to something like Gelato[2]: when you get orders on Openfront, they're sent to Gelato to fulfill, and once Gelato ships them, tracking is relayed back to Openfront through Openship.
I was eagerly waiting for a chapter on semantic similarity as I was using Universal Sentence Encoder for paraphrase detection, then LLMs showed up before that chapter :).
I had been working on NLP, mostly NLU, for some years before LLMs. I tried the Universal Sentence Encoder alongside many other ML techniques to understand user intentions and extract entities from text (a rough sketch of that kind of setup is below).
The first time I tried ChatGPT, that was the thing that surprised me most: the way it understood my queries.
I think the spotlight is on the "generative" side of this technology and we're not giving query understanding the credit it deserves. I'm also not sure we're fully taking advantage of this capability.
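To make the contrast concrete, here is a minimal sketch of the pre-LLM embedding approach I mean, assuming the TensorFlow Hub build of the Universal Sentence Encoder; the intent examples, query, and similarity threshold are placeholders, not tuned values:

```python
# Embedding-based intent/paraphrase matching with the Universal Sentence
# Encoder (pre-LLM style). The intents, query, and 0.5 threshold below
# are illustrative only.
import numpy as np
import tensorflow_hub as hub

embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")

def cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Candidate "intents", each described by one example phrase.
intents = {
    "track_order": "Where is my package right now?",
    "cancel_order": "I want to cancel my recent order.",
    "refund": "How do I get my money back?",
}

query = "Can you tell me when my order will arrive?"

# Embed the query and every intent example in a single batch.
vectors = embed([query] + list(intents.values())).numpy()
query_vec, intent_vecs = vectors[0], vectors[1:]

scores = {name: cosine_sim(query_vec, vec)
          for name, vec in zip(intents, intent_vecs)}
best_intent, best_score = max(scores.items(), key=lambda kv: kv[1])

# Only accept the match above an (illustrative) similarity threshold.
if best_score >= 0.5:
    print(f"matched intent: {best_intent} ({best_score:.2f})")
else:
    print("no confident match")
```

It works, but it hinges on hand-picked example phrases and thresholds, which is exactly where LLM-style query understanding felt like a step change to me.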
Yes, I was (and still am) similarly impressed by LLMs' ability to understand the intent of my queries and requests.
I've tried several times to understand the "multi-head attention" mechanism that powers this understanding, but I'm yet to build a deep intuition.
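For what it's worth, here is the minimal mental model I keep coming back to, written as a NumPy sketch; the dimensions and random weights are placeholders, and real models add learned weights, masking, an output projection, positional encodings, and many stacked layers:

```python
# Minimal multi-head self-attention sketch, for intuition only.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, num_heads, rng):
    seq_len, d_model = x.shape
    d_head = d_model // num_heads
    head_outputs = []
    for _ in range(num_heads):
        # Each head gets its own (here: random) query/key/value projections.
        W_q, W_k, W_v = (rng.standard_normal((d_model, d_head)) for _ in range(3))
        Q, K, V = x @ W_q, x @ W_k, x @ W_v
        # Every token's query is scored against every token's key; the
        # softmaxed scores decide how much of each token's value to mix in.
        scores = (Q @ K.T) / np.sqrt(d_head)      # (seq_len, seq_len)
        weights = softmax(scores, axis=-1)
        head_outputs.append(weights @ V)          # (seq_len, d_head)
    # Concatenate the heads back up to the model dimension.
    return np.concatenate(head_outputs, axis=-1)  # (seq_len, d_model)

rng = np.random.default_rng(0)
tokens = rng.standard_normal((5, 16))  # 5 "tokens", d_model = 16
out = multi_head_self_attention(tokens, num_heads=4, rng=rng)
print(out.shape)  # (5, 16)
```

The mechanics are easy enough to follow; what I still can't connect to "understanding" is how this softmax-weighted mixing, repeated across heads and layers, ends up capturing intent so well.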
Are there any research or expository papers that talk about this "understanding" aspect specifically? How could we measure understanding without generation? Are there benchmarks out there specifically designed to test deep/nuanced understanding skills?
Any pointers or recommended reading would be much appreciated.