>> you can use it to explore the thought processes of why someone might pick a particular set
To be really specific, it identified a La Marzocco Linea Mini and, incorrectly, a La Marzocco Swift grinder. I cross-checked with Google and the Swift was incorrect. ChatGPT then suggested it could be a Mahlkönig EK43 or a Nuova Simonelli Mythos.
The Mahlkönig EK43 was the correct answer, but the combo is unusual, because people will usually get a couple of La Marzoccos from a supplier who holds both, and Mahlkönig is a rarely seen brand here. Why go to the trouble? La Marzocco makes good enough grinders.
With further interrogation, ChatGPT was absolutely insistent on it being an EK43, a limited edition model called The Icon, which was notable for its white color and gold trimmings. This kind of precision is easy to verify, but it's not a detail that comes up in a Google search for the EK43.
The correct answer was that the coffee shop's owner's mentor was from Vietnam, where the EK43 was more common. It was particularly good for making complex latte art like unicorns as it has a precision that allowed controlling the acidity of the crema.
But this whole thought process was just wild. I want it to give me all kinds of crazy answers. I want it to have a high miss rate. It's perfect for brainstorming. But you need sufficient expertise to guide ChatGPT to the next answers.
The danger I see here is that if you ask an LLM to explain the thought processes, it will never say “I don’t know”. It will instead describe some thought processes associated with coffee grinders. It may say something like “this grinder has fine-grained controls that allow customizing the grind size,” which that particular grinder doesn’t have at all, but that’s a thing people write about when choosing grinders. The frustration is that 90% of the answer will be accurate, but somewhere in all the sentences is a hallucination, stated with the exact same authority as the rest of the answer.
It’s very difficult to QA that type of error.