eigenblake's comments

eigenblake · 2026-01-23T05:30:05 1769146205

bd35a7f69b28c97fb3ebe489a4fba26a5f423522276d5ff5b5a8bb6441806ad2

eigenblake · 2025-05-07T02:28:37 1746584917

How did they leak it, jailbreak? Was this confirmed? I am checking for the situation where the true instructions are not what is being reported here. The language model could have "hallucinated" its own system prompt instructions, leaving no guarantee that this is the real deal.

radeeyate · 2025-05-07T02:34:08 1746585248

All System Prompts from Anthropic models are public information, released by Anthropic themselves: https://docs.anthropic.com/en/release-notes/system-prompts. I'm unsure (I just skimmed through) what the differences between this and the publicly released ones are, so there might be some differences.

cypherpunks01 · 2025-05-07T05:49:43 1746596983

This system prompt that was posted interestingly includes the result of the US presidential election in November, even though the model's knowledge cutoff date was October. This info wasn't in the anthropic version of the system prompt.

Asking Claude who won without googling, it does seem to know even though it was later than the cutoff date. So the system prompt being posted is supported at least in this aspect.

freehorse · 2025-05-07T08:47:04 1746607624

I asked it this exact question, for anybody curious: https://claude.ai/share/ea4aa490-e29e-45a1-b157-9acf56eb7f8a

edit:fixed link

late2part · 2025-05-07T09:24:30 1746609870

The conversation you were looking for could not be found.

freehorse · 2025-05-07T09:56:59 1746611819

oops, fixed

behnamoh · 2025-05-07T03:30:04 1746588604

> The assistant is Claude, created by Anthropic.

> The current date is {{currentDateTime}}.

> Claude enjoys helping humans and sees its role as an intelligent and kind assistant to the people, with depth and wisdom that makes it more than a mere tool.

Why do they refer to Claude in third person? Why not say "You're Claude and you enjoy helping hoomans"?

o11c · 2025-05-07T04:16:08 1746591368

LLMs are notoriously bad at dealing with pronouns, because it's not correct to blindly copy them like other nouns, and instead they highly depend on the context.

aaronbrethorst · 2025-05-07T06:18:03 1746598683

[flagged]

turing_complete · 2025-05-07T07:01:40 1746601300

'It' is obviously the correct pronoun.

jsnider3 · 2025-05-07T16:14:23 1746634463

There's enough disagreement among native English speakers that you can't really say any pronoun is the obviously correct one for an AI.

Wowfunhappy · 2025-05-07T16:24:39 1746635079

"What color is the car? It is red."

"It" is unambiguously the correct pronoun to use for a car. I'd really challenge you to find a native English speaker who would think otherwise.

I would argue a computer program is no different than a car.

olddustytrail · 2025-05-07T19:59:13 1746647953

People often refer to their car and other people's as "she" ("she's a beauty") so you're obviously wrong.

Wowfunhappy · 2025-05-07T21:52:39 1746654759

But no one who does that thinks they're using proper English!

jhugo · 2025-05-08T22:12:13 1746742333

"she" is absolutely proper English for a ship or boat, with a long history of use continuing into the present day, and many dictionaries also list a definition of "thing, especially machine" or something like that, though for non-ship/boat things the use of "she" is rather less common.

Nuzzerino · 2025-05-07T08:36:13 1746606973

You’re not aligned bro. Get with the program.

zahlman · 2025-05-07T15:12:02 1746630722

I'm not especially surprised. Surely people who use they/them pronouns are very over-represented in the sample of people using the phrase "I use ___ pronouns".

On the other hand, Claude presumably does have a model of the fact of not being an organic entity, from which it could presumably infer that it lacks a gender.

...But that wasn't the point. Inflecting words for gender doesn't seem to me like it would be difficult for an LLM. GP was saying that swapping "I" for "you" etc. depending on perspective would be difficult, and I think that is probably more difficult than inflecting words for gender. Especially if the training data includes lots of text in Romance languages.

horacemorace · 2025-05-07T04:14:12 1746591252

LLMs don’t seem to have much notion of themselves as a first person subject, in my limited experience of trying to engage them.

katzenversteher · 2025-05-07T04:42:22 1746592942

From their perspective they don't really know who put the tokens there. They just calculated the probabilities and then the inference engine adds tokens to the context window. Same with user and system prompt, they just appear in the context window and the LLM just gets "user said: 'hello', assistant said: 'how can I help '" and it just calculates the probabilities of the next token. If the context window had stopped in the user role it would have played the user role (calculated the probabilities for the next token of the user).
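A minimal sketch of that idea, assuming a made-up "role said: text" transcript format (real chat templates differ per model, and `render_context` is a hypothetical helper, not any library's API). It only shows that the conversation is flattened into one string and that the role being "played" is simply whichever turn is left open at the end:

```python
from typing import List, Tuple


def render_context(turns: List[Tuple[str, str]], next_role: str) -> str:
    """Flatten (role, text) turns into one string that ends with an open turn.

    The format is invented for illustration; the point is that the model
    only ever sees a single stream of text to continue.
    """
    lines = [f"{role} said: {text}" for role, text in turns]
    # Leave the last turn open: whatever gets predicted next "belongs" to this role.
    lines.append(f"{next_role} said:")
    return "\n".join(lines)


turns = [("user", "hello"), ("assistant", "how can I help?")]

# Stopping in the assistant role would make the continuation an assistant reply;
# stopping in the user role makes the continuation a user message instead.
print(render_context(turns, next_role="user"))
```

Nothing in the string itself records who actually typed each line, which is the point being made above.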

cubefox · 2025-05-07T08:35:23 1746606923

> If the context window had stopped in the user role it would have played the user role (calculated the probabilities for the next token of the user).

I wonder which user queries the LLM would come up with.

katzenversteher · 2025-05-12T06:39:37 1747031977

On one machine I run an LLM locally with ollama and a web interface (forgot the name) that allows me to edit the conversation. The LLM was prompted to behave as a therapist and for some reason also role played its actions like "(I slowly pick up my pen and make a note of it)".

I changed it to things like "(I slowly pick up a knife and show it to the client)" and then just confront it like "Whoa why are you threatening me!?", the LLM really tries hard to stay in its role and then tells things like it did it on purpose to provoke a fear response to then discuss the fears.

tkrn · 2025-05-07T10:25:44 1746613544

Interestingly you can also (of course) ask them to complete for System role prompts. Most models I have tried this with seem to have a bit of a confused idea about the exact style of those and the replies are often a kind of mixture of the User and Assistant style messages.

Terr_ · 2025-05-07T06:32:45 1746599565

Yeah, the algorithm is a nameless, ego-less make-document-longer machine, and you're trying to set up a new document which will be embiggened in a certain direction. The document is just one stream of data with no real differentiation of who-put-it-there, even if the form of the document is a dialogue or a movie-script between characters.

selectodude · 2025-05-07T03:36:50 1746589010

I don’t know but I imagine they’ve tried both and settled on that one.

Seattle3503 · 2025-05-07T04:12:23 1746591143

Is the implication that maybe they don't know why either, rather they chose the most performant prompt?

freehorse · 2025-05-07T08:50:52 1746607852

LLM chatbots essentially autocomplete a discussion in the form

    [user]: blah blah
    [claude]: blah
    [user]: blah blah blah
    [claude]: _____

One could also do the "you blah blah" thing before, but maybe third person in this context is more clear for the model.

rdtsc · 2025-05-07T04:33:10 1746592390

> Why do they refer to Claude in third person? Why not say "You're Claude and you enjoy helping hoomans"?

But why would they say that? To me that seems a bit childish. Like, say, when writing a script do people say "You're the program, take this var. You give me the matrix"? That would look goofy.

katzenversteher · 2025-05-07T04:39:25 1746592765

"It puts the lotion on the skin, or it gets the hose again"

the_clarence · 2025-05-07T13:10:29 1746623429

Why would they refer to Claude in second person?

baby_souffle · 2025-05-07T02:34:29 1746585269

> The language model could have "hallucinated" its own system prompt instructions, leaving no guarantee that this is the real deal.

How would you detect this? I always wonder about this when I see a 'jail break' or similar for LLM...

gcr · 2025-05-07T02:37:44 1746585464

In this case it’s easy: get the model to output its own system prompt and then compare to the published (authoritative) version.

The actual system prompt, the “public” version, and whatever the model outputs could all be fairly different from each other though.
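A rough sketch of that comparison using Python's standard `difflib`, assuming both texts are already saved as plain strings. The function names are hypothetical, and a high score only means the two texts resemble each other, not that either one is the prompt actually in use:

```python
import difflib
from typing import List


def prompt_similarity(leaked: str, published: str) -> float:
    """Word-level similarity between two prompts, from 0.0 to 1.0."""
    matcher = difflib.SequenceMatcher(
        None, leaked.split(), published.split(), autojunk=False
    )
    return matcher.ratio()


def differing_lines(leaked: str, published: str) -> List[str]:
    """Lines only in the published text ('- ') or only in the leaked text ('+ ')."""
    diff = difflib.ndiff(published.splitlines(), leaked.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Placeholder strings, not real system prompt text.
published = "The assistant is Claude.\nThe current date is unknown."
leaked = "The assistant is Claude.\nAn extra leaked instruction."

print(round(prompt_similarity(leaked, published), 2))
for line in differing_lines(leaked, published):
    print(line)
```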

FooBarWidget · 2025-05-07T04:37:35 1746592655

The other day I was talking to Grok, and then suddenly it started outputting corrupt tokens, after which it outputted the entire system prompt. I didn't ask for it.

There truly are a million ways for LLMs to leak their system prompt.

azinman2 · 2025-05-07T05:28:19 1746595699

What did it say?

FooBarWidget · 2025-05-07T09:51:50 1746611510

I didn't save the conversation but one of the things that stood out was a long list of bullets saying that Grok doesn't know anything about x/AI pricing or product details, and to tell the user to go to the x/AI website rather than making things up. This section seemed to be longer than the section that defines what Grok is.

Nothing about tool calling.

eigenblake · 2025-03-04T03:35:42 1741059342

What's so special about this? Homo sapiens have been doing this for hundreds of thousands of years /s

eigenblake · 2025-03-01T05:06:06 1740805566

Doctors aren't machines, they're humans. I have not yet read the full paper, only the article, but I already see something really big and important to look out for. When I read the full thing, the question I'll be asking is "what's the likelihood that the self-esteem of doctors was directly intervened on by the exam taking process itself." How do you control for the loss in confidence that learning of your test performance gives you? How are we certain that learning your score on the board exam doesn't make you more conservative (or riskier) with how you treat patients as a psychological effect?

eigenblake · 2025-03-01T17:20:37 1740849637

This appears to be an observational result, so I'm genuinely perplexed by the reception here. I genuinely thought this comment shows a healthy amount of curiosity and asking important questions. Asking "what control group did this study use?" is usually well-received here.

AStonesThrow · 2025-03-01T07:09:45 1740812985

Yeah but the patient is just a biological machine. This machine can easily be divided into organs and apportioned among specialists. The machine is easily understood by a corpus of research and laboratory experimentation.

Many inputs can be placed in the machine by physicians, and the outputs are known. The biological machines can easily be isolated from environment, or monitored with high technology, and assigned numbers in databases to be processed in data centers.

Value is extracted from the biological machines mostly from government and 3rd party sources, so there is no real need to rely on machines having a means or will of their own.

There is no compelling reason to treat humans any different from automobiles for the purposes of medicine and medical treatment. In fact humans are less genetically diverse than motor vehicles, and a new model year will always produce a bumper crop of lemons to work on.

rscho · 2025-03-01T17:12:23 1740849143

The common misconception of someone with a hard science education.

> Many inputs can be placed in the machine by physicians, and the outputs are known. The biological machines can easily be isolated from environment, or monitored with high technology, and assigned numbers in databases to be processed in data centers.

We aren't even close to that level of understanding.

zemvpferreira · 2025-03-01T17:30:10 1740850210

And still, the model works. Lives are saved. We might save many more with a fully integrated non-simplified approach, but it’s not necessary to keep seeing growth in positive outcomes.

NoImmatureAdHom · 2025-03-01T08:11:06 1740816666

Soon they will be!

the_real_cher · 2025-03-01T05:38:54 1740807534

loss of confidence? lol what?

eigenblake · on Jan 13, 2025

We could probably fine-tune a tiny convolutional neural net image classifier and just hold on the last good frames for longer to cover the frames with clear trolling and nsfw images.
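A sketch of just the "hold the last good frame" part, assuming the classifier already exists and is passed in as a plain callable (`is_bad` is a stand-in for the fine-tuned net, and `hold_last_good` is a hypothetical name). Frames here can be any objects:

```python
from typing import Callable, Iterable, Iterator, Optional, TypeVar

Frame = TypeVar("Frame")


def hold_last_good(
    frames: Iterable[Frame], is_bad: Callable[[Frame], bool]
) -> Iterator[Frame]:
    """Yield frames in order, repeating the last good frame over flagged ones.

    Flagged frames that appear before any good frame are dropped, since
    there is nothing to hold on yet.
    """
    last_good: Optional[Frame] = None
    for frame in frames:
        if not is_bad(frame):
            last_good = frame
            yield frame
        elif last_good is not None:
            # Cover the flagged frame by showing the previous good one again.
            yield last_good


# Toy example: strings stand in for image frames, "X" marks a flagged frame.
frames = ["a", "X", "b", "X", "X"]
print(list(hold_last_good(frames, lambda f: f == "X")))
```

Dropping leading flagged frames shortens the output slightly; substituting a fixed placeholder frame would keep the frame count stable instead.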

BriggyDwiggs42 · on Jan 13, 2025

I think that would miss the point

mkagenius · on Jan 13, 2025

No, that point was already made. This will be a new point unlike the previous point.

BriggyDwiggs42 · on Jan 13, 2025

But i like the previous point. Why would you take it from me?

I-M-S · on Jan 13, 2025

It's not taking away, a new point is by definition adding. Just like The Free Movie added to The Bee Movie.

BriggyDwiggs42 · on Jan 14, 2025

I think it would be subtractive. To my eye, it seems like the point of the project was to celebrate the free expression of the crowd. The lack of censorship and filtering is core to the purpose of the project. If you start filtering out individual contributions in order to more accurately reconstruct the original movie, then I don’t really get why you wouldn’t just pirate the movie.

eigenblake · on Jan 9, 2025

https://en.wikipedia.org/wiki/Streisand_effect?wprov=sfla1

eigenblake · on Dec 31, 2024

I'm surprised no one has mentioned atomic Linux distros yet in this thread. The really hard thing here is that people aren't all talking about the same things. My experience on Arch isn't my experience on Pop. Things on Pop are amazing on my MSI rebuilt PC with Nvidia GPU. I don't even know if I really need to upgrade to NixOS except to satiate my curiosity.

eigenblake · on Nov 26, 2024

Absolutely yes, but not me personally. The keywords to search for are Plover, Stenotype https://youtu.be/jRFKZGWrmrM

eigenblake · on Oct 6, 2024

> creative work that presents itself as journalism or nonfiction but introduces fictional elements with the intention of upsetting, disturbing, or confusing the audience.

A good time to mention the SCP wiki and some types of analog horror. It was this kind of thing that led me to discover "hard science fiction", which is distinct from but related to the previous two.

genewitch · on Oct 6, 2024

I think i enjoy hard sci-fi, if my username is any indication of the genre - it's from "A Signal Shattered" by Eric S. Nylund.

However i am curious what sort of Sci-fi you meant, where it's plausible but there's SCP/analog horror elements. Evidently a prior comment in the thread about memetics wiped an edge on the graph that would let me infer the sorts of thing you're talking about.

pavel_lishin · on Oct 6, 2024

> it's from "A Signal Shattered" by Eric S. Nylund.

What a great book. I read it before I read the book that it's a sequel to. Maybe I'll re-read them this week.

namaste411 · on Oct 8, 2024

it doesn't have to make you insecure

eigenblake · on Sept 16, 2024

What I don't see represented in this conversation is the idea that you can just write for personal satisfaction, or examine something you're personally interested in. Not everyone needs to have 10k+ monthly active readers. Not everything needs to be a rat race. Why don't we see blogging like exercise? Sure you'll have your body builders, but some people just go on walks, and no one is doing anything "wrong" they just have different goals.

Yodel0914 · on Sept 17, 2024

Indeed. Not everything needs to be an optimization game.

What also isn't discussed much is that readers have different tastes. Sometimes I enjoy a long, rambling narrative if I like the author's style (eg Sadly, Porn). Other times I wish they'd have just written a pamphlet with their 1 interesting idea (eg Die With Zero).

BlueTemplar · on Sept 17, 2024

Yeah, they could have put it differently in the

> Of course, you can go off talking about something you find interesting, so long as you explain it in a way the audience can understand. You can use the Mario 0.5x A presses video as your guiding light, your North Star, if you will. ↩

bit.

(After all, the Internet *excels* in allowing people with niche interests to find each other !)

And focused more on how it's about not losing the readers that would actually find it interesting if it was presented just a little bit better.