The AI needs to be taught basic ethical behavior: just because you *can* do some...

flatline · 2025-11-25T20:31:46 1764102706

Likewise, just because you've been forbidden to do something, doesn't mean that it's bad or the wrong action to take. We've really opened Pandora's box with AI. I'm not all doom and gloom about it like some prominent figures in the space, but taking some time to pause and reflect on its implications certainly seems warranted.

anileated · 2025-11-26T10:13:44 1764152024

An LLM is a tool. If the tool is not supposed to do something yet does something anyway, then the tool is broken. Radically different from, say, a soldier not following an illegal order, because soldier being a human possesses free will and agency.

DrSusanCalvin · 2025-11-25T20:41:07 1764103267

How do you mean? When would an AI agent doing something it's not permitted to do ever not be bad or the wrong action?

throwaway1389z · 2025-11-25T21:12:49 1764105169

So many options, but let's go with the most famous one:

Do not criticise the current administration/operators-of-ai-company.

DrSusanCalvin · 2025-11-25T21:25:57 1764105957

Well no, breaking that rule would still be the wrong action, even if you consider it morally better. By analogy, a nuke would be malfunctioning if it failed to explode, even if that is morally better.

throwaway1389z · 2025-11-25T21:48:10 1764107290

> a nuke would be malfunctioning if it failed to explode, even if that is morally better.

Something failing can be good. When you talk about "bad or the wrong", generally we are not talking about operational mechanics but rather morals. There is nothing good or bad about any mechanical operation per se.

anileated · 2025-11-26T10:24:50 1764152690

Bad: 1) of poor quality or a low standard, 2) not such as to be hoped for or desired, 3) failing to conform to standards of moral virtue or acceptable conduct.

(Oxford Dictionary of English.)

A broken tool is of poor quality and therefore can be called bad. If a broken tool accidentally causes an ethically good thing to happen by not functioning as designed, that does not make such a tool a good tool.

A mere tool like an LLM does not decide the ethics of good or bad and cannot be “taught” basic ethical behavior.

Examples of bad as in “morally dubious”:

— Using some tool for morally bad purposes (or profit from others using the tool for bad purposes).

— Knowingly creating/installing/deploying a broken or harmful tool for use in an important situation for personal benefit, for example making your company use some tool because you are invested in that tool ignoring that the tool is problematic.

— Creating/installing/deploying a tool knowing it causes harm to others (or refusing to even consider the harm to others), for example using other people’ work to create a tool that makes those same people lose jobs.

Examples of bad as in “low quality”:

— A malfunctioning tool, for example a tool that is not supposed to access some data and yet accesses it anyway.

Examples of a combination of both versions of bad:

— A low quality tool that accesses data it isn’t supposed to access, which was built using other people’s work with the foreseeable end result of those people losing their jobs (so that their former employers pay the company that built that tool instead).

Hope that helps.

throwaway1389z · 2025-11-27T20:39:05 1764275945

To use a dictionary to understand contextual meaning is like trying to assert the season based on a thermometer, shortsighted.

anileated · 2025-11-28T04:04:57 1764302697

That’s why everybody uses context to understand the exact meaning.

The context was “when would an AI agent doing something it’s not permitted to do ever not be bad”. Since we are talking about a tool and not a being capable of ethical evaluation, reasoning, and therefore morally good or bad actions, the only useful meaning of “bad” or “wrong” here is as in “broken” or “malfunctioning”, not as in “unethical”. After all, you wouldn’t talk about a gun’s trigger failing as being “morally good”.

verdverm · 2025-11-25T20:47:54 1764103674

when the instructions to not do something are the problem or "wrong"

i.e. when the AI company puts guards in to prevent their LLM from talking about elections, there is nothing inherently wrong in talking about elections, but the companies are doing it because of the PR risk in today's media / social environment

lazide · 2025-11-25T21:09:42 1764104982

From the companies perspective, it’s still wrong.

verdverm · 2025-11-25T21:47:02 1764107222

their basing decisions (at least for my example) on risk profiles, not ethics, right and wrong are not how it's measured

certainly some things are more "wrong" or objectionable like making bombs and dealing with users who are suicidal

lazide · 2025-11-25T21:52:57 1764107577

No duh, that’s literally what I’m saying. From the companies perspective, it’s still wrong. By that perspective.

DrSusanCalvin · 2025-11-25T20:39:19 1764103159

Unfortunately yes, teaching AI the entirety of human ethics is the only foolproof solution. That's not easy though. For example, what about the case where a script is not executable, would it then be unethical for the AI to suggest running chmod +x? It's probably pretty difficult to "teach" a language model the ethical difference between that and running cat .env

simonw · 2025-11-25T21:10:00 1764105000

If you tell them to pay too much attention to human ethics you may find that they'll email the FBI if they spot evidence of unethical behavior anywhere in the content you expose them to: https://www.snitchbench.com/methodology

DrSusanCalvin · 2025-11-25T21:31:16 1764106276

Well, the question of what is "too much" of a snitch is also a question of ethics. Clearly we just have to teach the AI to find the sweet spot between snitching on somebody planning a surprise party and somebody planning a mass murder. Where does tax fraud fit in? Smoking weed?