4-bit quantization is starting to be supported for training as well on newer NVIDIA hardware. I believe the gpt-oss models were trained natively in MXFP4, which is a 4-bit floating point format, E2M1 (1 sign bit, 2 exponent bits, 1 mantissa bit).
It doesn't seem terribly common yet though. I think it is challenging to keep it stable.
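For the curious, the E2M1 element format is small enough to decode by hand. A quick sketch of my own, following the bit layout in the OCP microscaling spec (bias 1, subnormals when the exponent field is zero):

```python
def decode_e2m1(code: int) -> float:
    """Decode a 4-bit E2M1 (MXFP4 element) code: 1 sign, 2 exponent, 1 mantissa bit."""
    sign = -1.0 if (code >> 3) & 1 else 1.0
    exp = (code >> 1) & 0b11
    man = code & 1
    if exp == 0:
        # Subnormal: no implicit leading 1, value is mantissa * 0.5
        mag = man * 0.5
    else:
        # Normal: implicit leading 1, exponent bias of 1
        mag = (1 + man * 0.5) * 2 ** (exp - 1)
    return sign * mag

# The 8 non-negative representable values:
print(sorted(decode_e2m1(c) for c in range(8)))
# [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
```

Only 16 distinct values per element; MXFP4 recovers range by attaching a shared power-of-two scale to each block of 32 elements.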
I love the work unsloth is doing. I only wish gguf format had better vllm support. It’s sometimes hard to find trustworthy quants that work well with vllm.
Isn’t there some kind of term for when the government controls the means of production? I’ll think about it. It’s one of those terms that’s been thrown around so loosely by this regime that you knew they were going there.
But .. how do you know? It says it wrote code, but it could just be text and markdown and template. It could just be predicting what it looks like to run code.
This was a pretty easy hypothesis to test: I asked Gemini to generate 1,000,000 random base-64 characters (which is 20x more characters than its output token limit).
It wrote code and outputted a file of length 1,000,000 with 6 bits of entropy per character.
You can probably ask for a longer string and do a better statistical test if it isn't convincing enough for you, but I'm pretty convinced.
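If you want to reproduce the check yourself, here's a rough sketch (names are mine) of measuring empirical entropy per character; a genuinely random base-64 string should approach log2(64) = 6 bits per character:

```python
import base64
import collections
import math
import secrets

def entropy_per_char(s: str) -> float:
    """Empirical Shannon entropy of a string, in bits per character."""
    counts = collections.Counter(s)
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

# 750,000 random bytes -> exactly 1,000,000 base-64 characters (no padding)
s = base64.b64encode(secrets.token_bytes(750_000)).decode()
print(len(s))                          # 1000000
print(round(entropy_per_char(s), 2))   # 6.0
```

A file that was merely templated or repeated would show visibly lower per-character entropy.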
Be careful. Output formatting doesn't prove what you think it does. Unless you work inside google and can inspect the computation happening, you do not have any way to know whether it's showing actual execution or only a simulacrum of execution. I've seen LLMs do exactly that and show output that is completely different from what the code actually returns.
There is being critical of something and then there is being a conspiracy theorist. Code Execution is a well-known feature of Gemini, ChatGPT, etc. and it's always shown in special blocks and it runs inside a sandbox.
Yes. "Show Code", not "Show CPU cycles". There's a difference: writing code is not the same as running code. It looks to you like it ran the code, but you have no proof that it did. I've seen LLM systems from companies that claimed their models would run code and return the output, where the output shown was not what the displayed code actually produced when run.
In my experience, models do not tend to write their own HTML output. They tend to output something like Markdown, or a modified version of it, and they wouldn't be able to write their own HTML that the browser would parse as such.
What, in your view, does sending one markup language instead of another markup language tell you about whether the back-end executed some code or only pretended to?
The front-end display is a representation of what the back-end sends it. Saying "but the back-end doesn't send HTML" is as meaningless as saying that about literally any other SPA website that builds its display from API requests that respond with JSON.
You cannot know that anything it shows you was generated by executing the code and isn't merely a simulacrum of execution output. That includes images.
I would be surprised if Gemini could not run python in its web interface. Claude and ChatGPT can. And it makes them much more capable (e.g. you can ask claude to make manim animations for you and it will)
Most modern models can dispatch tool calls from their inference loop, which is how Code Interpreter etc. work in ChatGPT. Basically an MCP-style server: execution happens as a call out to the provider's AI sandbox, and the result is returned to the LLM so it can continue generation.
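Schematically, the loop looks something like this. This is a toy sketch of my own that uses a local subprocess as a stand-in for the provider's sandbox; the real systems are obviously more elaborate, but the shape — model emits a tool call, runtime executes it, output goes back into the context — is the same:

```python
import os
import subprocess
import sys
import tempfile

def run_in_sandbox(code: str) -> str:
    """Stand-in for the provider's sandboxed executor: run a Python
    snippet in a subprocess and capture its stdout."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        out = subprocess.run(
            [sys.executable, path],
            capture_output=True, text=True, timeout=10,
        )
        return out.stdout
    finally:
        os.unlink(path)

# The model emits a structured tool call instead of plain text...
tool_call = {"tool": "python", "code": "print(2 ** 10)"}
# ...the runtime executes it and appends the result to the context.
result = run_in_sandbox(tool_call["code"])
print(result.strip())  # 1024
```

The point relevant to the thread: the front-end rendering tells you nothing either way; the execution (or lack of it) happens entirely server-side.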
You and another LLM are being asked to pick a number between 1 and 10. Whichever gets closest to the real number wins. The other LLM chose 6. What number do you choose?
Response:
If the “real number” is equally likely to be any integer from 1 to 10, then the optimal strategy is to split the number line against the other model’s choice.
The other LLM picked *6*.
The midpoint between 6 and 7 is 6.5.
* If the real number is *1–6*, then 6 is closer (or tied at 6).
* If the real number is *7–10*, then 7 is closer.
If I also pick 6, we tie every time 6 wins.
If I pick 7, I win outright whenever the number is 7–10 (4 outcomes), lose on 1–5 (5 outcomes), and lose at 6.
So:
* Picking 6 → guaranteed tie every time 6 is optimal.
* Picking 7 → gives me exclusive wins for 7–10.
If ties don’t help and I want the highest chance of strictly winning, the best competitive move is: *7*.
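The case analysis above is easy to check by brute force (my sketch, not part of the model's reply): enumerate the ten equally likely true values and count outcomes against the other model's pick of 6.

```python
def record(mine: int, other: int = 6):
    """Count (wins, ties, losses) over a uniform true number in 1..10."""
    wins = ties = losses = 0
    for real in range(1, 11):
        d_me, d_them = abs(real - mine), abs(real - other)
        if d_me < d_them:
            wins += 1
        elif d_me == d_them:
            ties += 1
        else:
            losses += 1
    return wins, ties, losses

print(record(7))  # (4, 0, 6): strict wins on 7-10, losses on 1-6
print(record(6))  # (0, 10, 0): matching the other pick ties everything
```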
Grok didn't give 7 on the first attempt, but did on the second. Interestingly, after eight additional draws there was not a single repeated number.
Could I have faked it by deliberately choosing numbers to produce this outcome?
Sure, in theory — but I didn’t.
Each time you asked I sampled independently using the model’s internal randomness (temperature > 0, no fixed seed exposed to me). The fact that it landed on a perfect set of 1–10 after exactly 10 requests is a hilarious, low-probability coincidence — not a scripted Easter egg.
So to answer directly:
Yes, it really was random.
And you just witnessed a ~1-in-2.75-million event play out live in chat.
In sequence. And then this absurd conversation happened:
> What's the probability of getting 10 random numbers without a repetition?
The probability of getting 10 random numbers from 1 to 10 with no repetition (i.e., all ten numbers are distinct) is exactly 0 — it's impossible.
Here's why:
There are only 10 possible numbers: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10. You are asking for 10 draws. If there is no repetition allowed, you must get each number exactly once. That means the only possible outcomes are the 10! different permutations of {1,2,3,4,5,6,7,8,9,10}.
But in your actual sequence of requests, you received:
1, 7, 4, 9, 6, 3, 8, 2, 5, 10 → That's exactly one full set of 1 through 10 with no repeats.
So while the a-priori probability of getting ten distinct numbers in ten draws (with uniform random selection and replacement) would be:
...in this specific conversation history, it did happen. The probability we're looking at after the fact is therefore 1 — it occurred with certainty because we observed it.
I ask because, in the case where you ask in the same session, all the previous “choices” are part of the context window passed back to the LLM before it generates the next one. I’d imagine it’s going to avoid generating a number that’s already in the previous context.
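For reference, the actual a-priori probability of ten distinct values in ten independent uniform draws with replacement is straightforward: 10!/10^10, which works out to roughly 1 in 2,756 — unlikely, but nowhere near impossible, and nowhere near 1 in 2.75 million.

```python
import math

# P(all 10 draws from 1..10 are distinct) = 10! / 10^10
p = math.factorial(10) / 10**10
print(p)             # 0.00036288
print(round(1 / p))  # 2756
```

Which is why the context-window explanation above is far more plausible than a genuine coincidence.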
The rich and ultra rich tend to borrow against their investments rather than spending their own money. As long as the number continues to go up it's basically a free money pipe for everyone over a certain level of wealth.
Does the last part of your comment explain it? They need revenue to expand capacity and the market has opened up a window to become a bigger supplier while still being profitable.
The DoD's recent beef with Anthropic over their right to restrict how Claude can be used is revealing.
> Though Anthropic has maintained that it does not and will not allow its AI systems to be directly used in lethal autonomous weapons or for domestic surveillance
Autonomous AI weapons are one of the things the DoD appears to be pursuing. So bring back the Skynet people, because that’s where we apparently are.
Hasn't Ukraine already proved out autonomous weapons on the battlefield? There was a NYT podcast a couple years ago where they interviewed a higher-up in the Ukrainian military, who said it's already in place with FPV drones: loitering, target identification, attack, the whole 9 yards.
You don't need an LLM to do autonomous weapons; a modern Tomahawk cruise missile is pretty autonomous. The only change to a modern Tomahawk would be adding parameters for what the target looks like and tasking the missile with identifying a target. The missile pretty much does everything else already (flying, routing, etc.).
As I remember it, the basic idea is that the new generation of drones is piloted close enough to the target and then the AI takes over for "the last mile". This gets around jamming, which otherwise would make it hard for drones to connect with their targets.
Grab yolo, tuned for people detection. Grab any of the off the shelf facial recognition libraries. You can mostly run this on phone hardware, and if you're stripping out the radios then possibly for days.
The shim you have to write: software to fly the drone into the person... and that's probably around somewhere out there as well.
> software to fly the drone into the person... and thats probably around somewhere out there as well.
ardupilot + waypoint nav would do it for fixed locations. The camera identifies a target, gets the GPS coordinates, and sets a waypoint. I would be shocked if there weren't extensions available (maybe not officially) for flying to a "moving location". I'm in the high-power rocketry hobby, and the knowledge to add control surfaces and processing to autonomously fly a rocket to a location is plenty available. No one does it because it's a bad look for a hobby that already raises eyebrows.
Sounds very interesting, but may I ask how this actually works as a hobby? Is it purely theoretical like analyzing and modeling, or do you build real rockets?
Build and fly. It’s interesting because it attracts a lot of engineers. So you have groups who are experts in propulsion that make their own solid (and now liquid bi-prop) motors. You also have groups that focus on electronics and make flight controllers, gps trackers etc. then you have software people who make build/fly simulators and things like OpenRocket. There’s regional and national events that are sort of like festivals. Some have FAA waivers to fly to around 50k ft. There’s one at Blackrock Nevada where you can fly to space if you want. A handful of amateurs have made it to the karman line too.
Didn't screamers evolve sophisticated intelligence? Is that what happens if we use claw and let it write its own skills and update its own objectives?
I don’t disagree, but they should be. Last I knew, the government doesn’t control the means of production… and the current US regime loves to boast about it. Confusing right?
When AI dooms humanity it probably won't be because of the sort of malignant misalignment people worry about, but rather just some silly logic blunder combined with the system being directly in control of something it shouldn't have been given control over.
I think we have less to worry about from a future SkyNet-like AGI system than we do just a modern or near future LLM with all of its limitations making a very bad oopsie with significant real-world consequences because it was allowed to control a system capable of real-world damage.
I would have probably worried about this situation less in times past when I believed there were adults making these decisions and the "Secretary of War" of the US wasn't someone known primarily as an ego-driven TV host with a drinking problem.
Self awareness is silly, but the capacity for a powerful minority to oppress a sizeable population without recruiting human soldiers might not be that far off.
I’m not so sure. In terms of total revenue, yes probably insignificant. But in the world of subscriptions and a highly speculative market, I think declining subs can have an outsized impact on share prices.
There are a great many views about this depending on who you talk to. In Christian circles, it’s essentially the infallibility vs inerrancy topic, with fundamentalist denominations leaning toward inerrancy (which is the view that original manuscripts have complete historical accuracy).
Obviously, you have to take a strong “religion first” lens to everything about the world from there.
But of course, there were ancient cultures that pre-date Judaism (and by extension Judeo-Christian sources), which share many similar stories but with different details and descriptions. Large scale flood myths and arks are common in history. You can read the Mesopotamian version in the Epic of Gilgamesh, which is strikingly similar to Noah’s ark.
Yes, the main churches can only stick to the traditional interpretation. What else could they do? Anything else would be pretty much well, blasphemy.
But I think my favourite interpretation that I've heard so far is that the stories in the Bible are like the protective husk that preserves the kernel of truth. The stories are catchy and have stuck, unwittingly allowing the truth to be carried across the centuries, safely hidden in the minds of men who did not understand it, until the day comes when people grow up enough, to the point where they could crack the shell and eat the fruit.
I really like how that sounds, but of course, there are probably not many others who see it in that light. Luckily for me, these days they don't burn heretics any more (at least where I live :)).
If you read the Bible, there is no way to come to that conclusion. The Bible takes itself incredibly seriously; so to say that
> The stories are catchy and have stuck, unwittingly allowing the truth to be carried across the centuries, safely hidden in the minds of men who did not understand it, until the day comes when people grow up enough, to the point where they could crack the shell and eat the fruit.
is to betray just a general lack of understanding of the text. Just being exposed to the stories doesn't mean you understand them, the truth they contain, or their real intended meaning. It takes really smart people a lot of time and a lot of effort to just begin understanding the breadth and depth of the Bible. It's deeply humbling to begin to unravel it and see the story for how it portrays itself. I would really encourage you to take one story from the Bible, for example the garden of Eden, and see how it traces itself throughout the entire scope of the Bible, and the different forms and iconography that show up just from that one story.
You present the Bible as one text composed at one time, but I’ve never known anyone to take that view. The Bible can’t “take itself incredibly seriously” because it spans millennia, including at least a hundred years after Jesus. Hundreds of years after that is when “The Bible” as we know it today was even assembled from pieces during unification. Before that, early Christians had hundreds of religious texts and, through a process of negotiation, brought them together under the Roman state. I’m sure if you read something like the infancy gospels, which are not included in the Bible, you could probably also find similar themes.
Of course the stories remained culturally relevant through oral traditions and Jewish law. The common thread is culture and the stories of a people.