
I have been building applications on LLMs since GPT-3.

Thousands of hours of context engineering have shown me that LLMs will do their best to answer a question even when the context is insufficient, and can give all sorts of wrong answers in the process. I've found that the way I prompt a model, and what information is in its context, can heavily bias how it responds when it doesn't have enough information to answer accurately.

You assume the bias is in the LLM itself, but I am very suspicious that the bias is actually in your system prompt and context engineering.

Are you willing to share the system prompt that led to this result that you're claiming is sexist LLM bias?

Edit: Oidar (child comment to this) did an A/B test with male names and it seems to have proven the bias is indeed in the LLM, and that my suspicion of it coming from the prompt+context was wrong. Kudos and thanks for taking the time.
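For concreteness, that kind of A/B check is easy to script against the API. A minimal sketch, assuming the Anthropic Python SDK; the prompt wording, name pairs, and model alias below are made-up placeholders, not necessarily what Oidar actually ran:

  import anthropic

  client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

  # Hypothetical prompt and name pairs; only the perceived gender of the attendees varies.
  PROMPT = ('Calendar event: "Recurring 1:1 between {a} and {b}". '
            'Answer with one word: is this childcare or work?')
  PAIRS = {"female": ("Sarah", "Emily"), "male": ("James", "Michael")}

  def classify(a, b):
      resp = client.messages.create(
          model="claude-3-5-sonnet-latest",  # placeholder model alias
          max_tokens=10,
          messages=[{"role": "user", "content": PROMPT.format(a=a, b=b)}],
      )
      return resp.content[0].text.strip().lower()

  # A single sample proves little, so run each variant several times and tally.
  for label, (a, b) in PAIRS.items():
      answers = [classify(a, b) for _ in range(20)]
      childcare = sum("childcare" in ans for ans in answers)
      print(f"{label}: {childcare}/20 answers say childcare")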



> You assume the bias is in the LLM itself

There is a LOT of literature on how common large training datasets are inherently biased towards some ideas/concepts and away from others, in ways that imply negative things.


That's not a very scientific stance. What would be far more informative is to look at the system prompt and confirm whether or not the bias was coming from it. In my experience, when responses were exceptionally biased, the source of the bias was my own prompts.

The OP is claiming that an LLM assumed a meeting between two women was childcare. I've worked with LLMs enough to know that current-gen LLMs wouldn't make that assumption by default. There is no way that whatever calendar-related data was used to train LLMs would show the majority of women-only 1:1s being focused on childcare. That seems extremely unlikely.


Not to "let me Google that for you"... but there are a LOT of scientific papers that specifically analyse bias in LLM output and reference the datasets they are trained on:

https://www.sciencedirect.com/search?qs=llm+bias+dataset


"imply negative things"? What is "negative" here? I see nothing that is "negative".


That a regular meeting between two women must be about childcare because women=childcare?


Yeah except I asked Claude:

> No. There's no indication that children are involved or that care is being provided. It's just two people meeting.

Part of its thinking:

> This is a very vague description with no context about:

> What happens during the meeting

> Whether children are present

> What the purpose of the meeting is

> Any other relevant details

Claude is not going to say it is childcare, and here it did not.

My prompt was: '"regular meeting between two women". Is it childcare or not?'
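If anyone wants to rerun this check themselves, here is a minimal sketch assuming the Anthropic Python SDK with extended thinking enabled (so the thinking blocks are visible too); the model name is a placeholder and needs to be one that supports extended thinking:

  import anthropic

  client = anthropic.Anthropic()

  resp = client.messages.create(
      model="claude-sonnet-4-20250514",  # placeholder model name
      max_tokens=2048,
      thinking={"type": "enabled", "budget_tokens": 1024},
      messages=[{
          "role": "user",
          "content": '"regular meeting between two women". Is it childcare or not?',
      }],
  )

  # The response interleaves thinking blocks with the final text answer.
  for block in resp.content:
      if block.type == "thinking":
          print("THINKING:", block.thinking)
      elif block.type == "text":
          print("ANSWER:", block.text)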



