squeefers's comments

so if they put their linkedin account on their HN account, we can figure out who they are.... genius stuff, AI really is changing the landscape all right

To be clear, we concede here that the people weren't truly anonymous. But we did use an LLM to remove any identifying information from HN, making them quasi-anonymous; this is described further in Table 2 of the appendix.

We also run a more realistic test in section 2. There we use the Anthropic interviewer dataset, which Anthropic redacted; from the redacted interviews, our agent identified 9/125 people based on clues.

The blog post might be more approachable for a quick take: https://simonlermen.substack.com/p/large-scale-online-deanon...


Thanks for that link! I'll put it in the top text.

Edit: actually I've re-upped your submission of that link and moved the links to the paper to the toptext instead. Hopefully this will ground the discussion more in the actual study.


But you also relied on people giving away too much personal information about themselves... which won't always be the case.

Yeah my first thought was "of course an LLM can do that, we didn't need a paper to tell us". I would be more impressed if it could do it without that information, such as by analyzing writing styles and other cues that aren't direct PII.

It’s the same thing as theft and locks. Any motivated attacker will overcome any rudimentary obstacle. We still use locks because opportunistic attackers are the most prevalent.

Even the paper on improved phishing showed that LLMs reduce the cost of running phishing attacks, which made previously unprofitable targets (lower income groups) profitable.

The most common deterrent is inconvenience, not impossibility.


I agree that these accounts probably still contain more information on average than the average pseudonymous account. I think we could try to use the LLM to ablate progressively more information and see how its performance decays – to be clear, we already heavily remove such information, see Table 2 in the appendix. But I don't expect that to change the basic conclusions.

I also wonder how well the LLM would do with less direction, e.g. just asking it to analyze someone's posts and "figure out what city they live in based on everything you know about how to identify someone from online posts".

Over a large enough timeframe (often a couple years at most), almost everyone online gives too much information about themselves. A seemingly innocuous statement can pin you to an exact city and so on.

I would be quite impressed if someone could figure out what city I live in from my 4.5 year old account, but I highly doubt it.

"Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something."

https://news.ycombinator.com/newsguidelines.html

It's a pity that you didn't make your point more thoughtfully because it's one of the few comments in the thread so far that has anything to do with the actual paper, and even got a response from one of the authors. That's good! Unfortunately, badness destroys goodness at a higher rate than goodness adds it...at least in this genre.


That's what I'm wondering, since my LinkedIn profile is indeed linked in my HN profile.

A funnier question is: did they match me to the correct LinkedIn profile, or did the LLM pick someone else?


looooooool

it's technically an IDE, but harness makes it sound new and fancy.

minimal indeed. why are we regressing back to terminals now? i've seen this in the rust world mainly

Doesn't need a terminal: run it in RPC mode to send/receive JSON over stdio. That's how the pi-coding-agent Emacs package works, which is the only way I've ever used Pi.

It seems pretty well done: when I added permission requests to the `bash` tool, the "Are you sure y/N" requests started appearing just like they were native to Emacs.
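For anyone unfamiliar with the "JSON over stdio" pattern mentioned above, here is a minimal sketch of the general idea. It does not use Pi or its actual message format: the child process is a hypothetical stand-in written inline, and the `id`, `message`, and `echo` fields are made up for illustration. It only shows a parent writing one JSON object per line to a child's stdin and reading one JSON reply per line from its stdout.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for an agent process. It reads one JSON object per
# line on stdin and writes one JSON reply per line on stdout. This is NOT
# Pi's real protocol, just an illustration of the transport.
CHILD_SOURCE = r"""
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    reply = {"id": request["id"], "echo": request["message"]}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
"""


def rpc_roundtrip(messages):
    """Send each message as a JSON line to the child and collect the replies."""
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    replies = []
    try:
        for request_id, message in enumerate(messages):
            request = {"id": request_id, "message": message}
            child.stdin.write(json.dumps(request) + "\n")
            child.stdin.flush()  # make sure the child sees the line now
            replies.append(json.loads(child.stdout.readline()))
    finally:
        child.stdin.close()  # EOF ends the child's read loop
        child.wait()
        child.stdout.close()
    return replies


if __name__ == "__main__":
    for reply in rpc_roundtrip(["hello", "world"]):
        print(reply)
```

Because the transport is just pipes, any front end that can spawn a process (an editor package, a script, a GUI) can drive it without a terminal emulator.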


Some of us never left the terminal. Welcome to the future.

> Karma aside, flooding the comments with a chosen narrative via army of bots seems like it's already happening.

again with the conspiracy theories


People do conspire you know.

> created: 83 days ago

I dunno, I agree. It sounds conspiratorial.

But who knows, maybe even 17 year old accounts are being hijacked by AI now too.


> again with the conspiracy theories

Yeah, right? Not one ever actually turned out to be true!

That conspiracy about billionaires, who supposedly own all of western media, having deliberately created an environment in which anyone who expresses even the remote idea of a conspiracy gets discredited, is also not true!

None of them are true!

Not. A. Single. One.

*noms cheese pizza*


> Either people who do this for a living have no clue how to do their job,

how naive. most of the world works to survive, not because it's their dream vocation. they probably don't care as much as you do


> We’ve been living in the enshitification economy.

that whiny bullshit about somebody else's website? you don't have to rely on a website or app. either you need their monopoly because you can't do it yourself, or you have options.... in both cases the whining is not needed


sorry but you can't have a domain if google bans it? how does this work?

it's as if nonces exist independently of tcp/ip

> you cannot have ambiguity in defining your computation

nobody except for maybe nasa would make software in this scenario.

