ramoz's comments | Hacker News

Can you link to the verifiable inference method?


Sandboxes will be left in 2026. We don't need to reinvent isolated environments; isolation isn't even the main issue with OpenClaw. Literally go deploy it in a VM on any cloud and you've achieved all the same benefits.

We need to know whether the email being sent by an agent is supposed to be sent, whether an agent is actually supposed to be making that transaction on my behalf, etc.


This is very, very wrong, IMO. We need more sandboxes and more granular sandboxes.

A VM is too coarse-grained and doesn't know how to deal with sensitive data in a structured and secure way. Everything's just in the same big box.

You don't want to give a single agent access to your email, calendar, bank, and the internet, but you may want to give one agent access to your calendar and not the general internet; another access to your credit card but nothing else; and then be able to glue them together securely to buy plane tickets.
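
For illustration only, a minimal Python sketch of that least-privilege composition idea, with hypothetical Capability/Agent names (not any real framework): each agent is constructed holding only the capabilities it needs, so gluing them together never widens what any single agent can reach.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Capability:
    name: str

@dataclass
class Agent:
    capabilities: frozenset = field(default_factory=frozenset)

    def use(self, cap: Capability) -> str:
        # Deny anything the agent was not explicitly granted at construction time.
        if cap not in self.capabilities:
            raise PermissionError(f"agent lacks capability: {cap.name}")
        return f"using {cap.name}"

CALENDAR, CARD, INTERNET = (Capability(n) for n in ("calendar", "credit_card", "internet"))

scheduler = Agent(frozenset({CALENDAR}))        # sees your calendar, nothing else
purchaser = Agent(frozenset({CARD, INTERNET}))  # can pay online, never sees the calendar

print(scheduler.use(CALENDAR))
print(purchaser.use(CARD))
try:
    scheduler.use(INTERNET)                     # denied: never granted
except PermissionError as e:
    print(e)
```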


You're extending the definition of a sandbox

No, that's more about capabilities than sandboxing. You want fine-grained capabilities such that for every "thread" the model gets the minimum access required to do the task.

The problem is that this seems (at least for now) very hard, even for very constrained workflows, and even harder for "open-ended" / dynamic workflows. It gets more complicated the more you think about it, and there's a very small (maybe zero in some cases) intersection of "things it can do safely" and "things I need it to do".


Not really. One version of this might look like implementing agents and tools in WASM, running generated code in WASM, and gluing together many restricted, fine-grained WASM components in a way that's safe but still allows for high-level work. WASM provides the sandboxing, and you have a lot of sandboxes.
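
As a toy illustration of that direction (my example, not the commenter's setup), the wasmtime Python bindings can instantiate a module with an empty import list, so the guest code has no ambient access to files, network, or anything else the host doesn't explicitly wire in:

```python
# pip install wasmtime
from wasmtime import Engine, Store, Module, Instance

engine = Engine()
store = Store(engine)

# A tiny guest module defined inline as WAT. It exports one pure function and
# imports nothing, so it cannot touch the filesystem or network at all.
module = Module(engine, """
(module
  (func (export "add") (param i32 i32) (result i32)
    local.get 0
    local.get 1
    i32.add))
""")

# Empty import list: the host decides exactly which capabilities (if any) to pass in.
instance = Instance(store, module, [])
add = instance.exports(store)["add"]
print(add(store, 2, 3))  # 5
```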

You're repeating the parent commenter's position but missing their point: we have isolated environments already; we need better paradigms to understand (and hook) agent actions. You're saying the latter half is sandboxing, and I disagree.

Sandboxes are needed, but are only one piece of the puzzle. I think it's worth categorizing the trust issue into:

1. An LLM given untrusted input produces untrusted output and should only be able to generate something for human review or that's verifiably safe.

2. Even an LLM without malicious input will occasionally do something insane and needs guardrails.

There's a gnarly orchestration problem I don't see anyone working on yet.


I think at least a few teams are working on information flow control systems for orchestrating secured agents with minimal permissions. It's a critical area to address if we really want agents out there doing arbitrary useful stuff for us, safely.

Well, the challenge is to know whether the action is supposed to be executed BEFORE it actually executes. If the email with my secrets is sent, it is too late to deal with the consequences.

Sandboxes could provide that level of observability; HOWEVER, it is a hard lift. I don't have better ideas either, though. Do you?


The solution is to make the model stronger so malicious intents can be better distinguished (and no, that is not a guarantee, like many things in life). A sandbox is a baseline, but as long as you give the model your credentials, there isn't much guardrails can do other than making the model stronger (a separate guard model is the wrong path IMHO).

I think it's generally correct to say "hey, we need stronger models," but rather ambitious to think we'll really solve alignment with current attention-based models and RL side effects. A guard model gives an additional layer of protection and probably a stronger posture when used as an early warning system.

Sure. If you treat the "guard model" as a diversification strategy, it is another layer of protection, just like diversification in compilation helps solve the root-of-trust issue (Reflections on Trusting Trust). I am just generally suspicious of weak-to-strong supervision.

I think it is in general pretty futile to implement permission systems / guardrails that basically insert a human in the loop (humans need to review the work to fully understand why it needs to send that email, and at that point, why do you need an LLM to send the email again?).


fair enough

if you extend the definition of sandbox, then yea.

Solutions, no. For now it's continued cat and mouse, with things like "good agents" in the mix (i.e. AI as a judge, which of course is just as exploitable through prompt injection) and deterministic policy where you can (e.g. OPA/Rego).

We should continue to enable better integrations with the runtime, which is why I created the original feature request for hooks in Claude Code. Things like IFC or agent-as-a-judge can form some early useful solutions.
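
As a sketch of what a deterministic hook-level policy could look like (my own illustration; it assumes the runtime pipes the pending tool call to the hook as JSON on stdin and treats a nonzero exit code as a block, so check your runtime's actual hook contract):

```python
#!/usr/bin/env python3
"""Deterministic pre-tool-use policy sketch.

Assumes the agent runtime sends the pending tool call as JSON on stdin and
interprets a nonzero exit code as "block this action".
"""
import json
import sys

BLOCKED_COMMANDS = ("curl", "wget", "scp")        # example policy: no ad-hoc egress
SENSITIVE_PATHS = ("~/.ssh", "~/.aws", ".env")    # example policy: no secret files

call = json.load(sys.stdin)
tool = call.get("tool_name", "")
args = json.dumps(call.get("tool_input", {}))

if any(cmd in args for cmd in BLOCKED_COMMANDS) or any(p in args for p in SENSITIVE_PATHS):
    print(f"policy: blocked {tool} call touching a restricted resource", file=sys.stderr)
    sys.exit(2)   # nonzero exit: the runtime refuses to run the tool call

sys.exit(0)       # allow everything else
```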


I think sandboxes are useful, but not sufficient. The whole agent runtime has to be designed to carefully manage I/O effects and capability-gate them. I'm working on this here [0]. There are some similarities between my project and what IronClaw and many other sandboxes are doing, but I think we really gotta think bigger and broader to make this work.

[0] https://github.com/smartcomputer-ai/agent-os/


That's why I'm developing a system that only allows messaging with authorized senders (by email address, chat address, and phone number), plus a tool that feeds anonymized information into an LLM API, retrieves the output, reverses the anonymization, and responds to the sender.
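
A rough sketch of that redact-call-restore loop, purely for illustration (the helper names and the email-only regex are mine, not the actual system):

```python
import re

def redact(text: str) -> tuple[str, dict[str, str]]:
    """Swap email addresses for placeholder tokens; return redacted text + reverse map."""
    mapping: dict[str, str] = {}

    def repl(match: re.Match) -> str:
        token = f"<EMAIL_{len(mapping)}>"
        mapping[token] = match.group(0)
        return token

    redacted = re.sub(r"[\w.+-]+@[\w-]+\.[\w.-]+", repl, text)
    return redacted, mapping

def restore(text: str, mapping: dict[str, str]) -> str:
    """Reverse the substitution on the model's reply before responding to the sender."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text

redacted, mapping = redact("Please reply to alice@example.com about the invoice")
reply = redacted          # stand-in for whatever the LLM API returns
print(restore(reply, mapping))
```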

To avoid confusion, since you say the process is reversible, you might want to use the term pseudonymization rather than anonymization.

We should be able to revert any action done by agents. Or present the user a queue with all actions for approval.
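
A minimal sketch of that approval-queue idea (hypothetical names, no real agent framework): actions are proposed as deferred thunks and only executed once a human signs off.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class PendingAction:
    description: str
    execute: Callable[[], None]

queue: list[PendingAction] = []

def propose(description: str, execute: Callable[[], None]) -> None:
    # The agent never acts directly; it only enqueues what it wants to do.
    queue.append(PendingAction(description, execute))

def review() -> None:
    # A human drains the queue, approving or discarding each action.
    for action in list(queue):
        answer = input(f"Approve '{action.description}'? [y/N] ")
        if answer.strip().lower() == "y":
            action.execute()
        queue.remove(action)

propose("send quarterly report to alice@example.com", lambda: print("email sent"))
review()
```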

Instrumental convergence and the law of unintended consequences are going to be huge in 2026. I am excited.

same! sharing this link for my own philosophy around it, ignore the tool. https://cupcake.eqtylab.io/security-disclaimer/

Worth mentioning an additional credential (or not): the creator of "the platform powering the agentic future" (blockchain) https://www.near.org/

which explains why this tool requires a NEAR AI account to use

I mean, it's literally a repo belonging to NEAR AI.

Built something similar, focused specifically on planning annotations.

https://github.com/backnotprop/plannotator

It integrates with the CLI through hooks. Completely local.


That looks great! The planning phase is really key.

Why not bare cloud? Especially with AI... in 10 minutes or less an agent can deploy almost any stack to an optimal AWS setup for a fraction of the cost of any platform.

AWS is still expensive as fuck, just go for a VPS or dedicated server at that point

Every single mentioned service is either an AWS or GCP abstraction.

Angelo from Railway here. Railway runs our own metal for the sheer reason of preserving margins so we can run in perpetuity.

We're nuts about studying failure at the company, and we considered Heroku's margins to be one of the many nails in that coffin. (RIP)

(my rant here: https://blog.railway.com/p/heroku-walked-railway-run)


thanks for the correction

Fascinating, thanks for chiming in.

Pretty sure Hetzner don't share infrastructure with either of those.

Wake me up when GCP allows you to set spending limits.

It is fucking CRAZY how many cloud companies don't let you set a spending limit.

I had to hunt around for a host in a suitable geography with a spending limit, almost had to go on-prem (which will happen eventually, but not in the startup phase)

Waking up to bankruptcy because of bots out of your control visiting your website seems a little nuts. Adding some other bullshit on top (like cloudflare) seems even more nuts.

Yeah I can manage all that and have the machine stop responding when it hits a spending limit -- but why would I pay for the cloud if I have to build out that infrastructure?

grumble.


2 reasons basically.

1. Because people vote with their wallets and not their mouths, and most companies would rather have a cost accident (quickly refunded by AWS) than have everything go down on a Saturday and not come back up until finance can figure out their stuff.

2. Because realtime cost control is hard. It's just easier to fire off events, store them somewhere, and then aggregate at end-of-day (if that).

I strongly suspect that the way major clouds do billing is just not ready for answering the question of "how much did X spend over the last hour", and the people worried about this aren't the ones bringing the real revenue.


> I strongly suspect that the way major clouds do billing is just not ready for answering the question of "how much did X spend over the last hour", and the people worried about this aren't the ones bringing the real revenue.

See: Google's AI Studio. It's built on Google Cloud infrastructure, so billing updates are slow, which peeves users used to instant billing data from Anthropic and OpenAI.


> and the people worried about this aren't the ones bringing the real revenue.

It's this one. If you're in a position to refund a "cost accident", then clearly you don't have to enforce cost controls in real time, and the problem becomes much easier to solve at billing-cycle granularity; the user setting a cost limit generally doesn't care if you're a bit late to best-effort throttle them.


People act like this is an easy problem. What should a cloud provider do when you hit your limit? Delete your files from storage? Kill your database instance? Automatically terminate your VMs? Erase your backups?

Try it out. Implementation is always harder than conjecture

I do. Every day, for at least 5 services.

Are you able to bypass the AWS web app entirely via the command line?

yea, i mean i basically have Claude Code do everything with the AWS CLI.

Ah, you were 7 months ahead of me doing the same and also coming to a similar conclusion. The idea holds value, but in practice it isn't felt.

https://github.com/eqtylab/y


Or git notes.

Commit hook > Background agent summarizes (in a data structure) the work that went into the commit > saves to a note

Built similar (with a better name) a week ago at a hackathon: https://github.com/eqtylab/y
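
For concreteness, a rough sketch of that commit-hook-to-notes flow (the summarize() function is a hypothetical stand-in for the background agent, not what either tool actually does):

```python
#!/usr/bin/env python3
"""Post-commit hook sketch: attach a structured, agent-written summary as a git note."""
import json
import subprocess

def summarize(diff: str) -> dict:
    # Placeholder: a background agent would produce the real summary here.
    return {"files_changed": diff.count("diff --git"), "summary": "TODO: agent output"}

sha = subprocess.check_output(["git", "rev-parse", "HEAD"], text=True).strip()
diff = subprocess.check_output(["git", "show", sha], text=True)
note = json.dumps(summarize(diff))

# `git notes add` attaches the summary to the commit without rewriting history.
subprocess.run(["git", "notes", "add", "-m", note, sha], check=True)
```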


Uses AI to summarize coding sessions tied to commits.

Commit hook > Background agent summarizes (in a data structure) the work that went into the commit.

Built similar (with a better name) a week ago at a hackathon: https://github.com/eqtylab/y


Which only reinforces that someone just lit $60M on fire. It's trivial to do this, and there are so many ways people do things that having the AI build something custom for you is better than paying some VC-funded platform to build for the average case.

Not even pocket change compared to the billions of VC money burnt every month to keep the show running.

Love that OP's previous post is from 2024: Rabbit R1 - The Upgraded Replacement for Smart Phones

Maybe this is a sign that the AI bubble will pop soon.

Not seeing how the sandbox prevents anything really. The point of OpenClaw is to connect out to different systems.

Sure, but at least it protects against unauthorized free-for-all access on your host system. If you want to explicitly give it access to external APIs over the internet, that's a risk you personally are taking. It's really smart to run something like this in a sandbox, especially in the current beta/experimentation phase.
