niyikiza's comments | Hacker News

I've been going down this exact rabbit hole for the last few months. The 'opt-in guardrails' problem you mentioned is the dealbreaker. If the agent can just ignore the read_file tool wrapper and call os.system('cat ...'), the policy is useless.

I ended up building a 'capability token' primitive (think Macaroons or Google Zanzibar, but for ephemeral agent tasks) to solve this.

My approach (Tenuo) works like this:

1. Runtime Enforcement: The agent gets a cryptographically signed 'Warrant' that mechanically limits what the runtime allows. It’s not a 'rule' the LLM follows; it’s a constraint the runtime enforces (e.g., fs:read is only valid for /tmp/*).

2. Attenuation: As the agent creates sub-tasks, it can only delegate less authority than it holds.

3. Offline Verification: I wrote the core in Rust so I can verify these tokens in ~27µs on every single tool call without a network round-trip.
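For a rough feel of the mechanics, here's a toy HMAC chain in the style of Macaroons — a sketch of the idea, not Tenuo's actual Rust implementation (the names and caveat format are made up):

```python
import hashlib
import hmac

def mint(root_key: bytes, identifier: str) -> dict:
    """Mint a root warrant; only the holder of root_key can do this."""
    sig = hmac.new(root_key, identifier.encode(), hashlib.sha256).digest()
    return {"id": identifier, "caveats": [], "sig": sig}

def attenuate(token: dict, caveat: str) -> dict:
    """Anyone holding a token can narrow it, never widen it:
    the new signature chains off the old one."""
    sig = hmac.new(token["sig"], caveat.encode(), hashlib.sha256).digest()
    return {"id": token["id"], "caveats": token["caveats"] + [caveat], "sig": sig}

def verify(root_key: bytes, token: dict, request: dict) -> bool:
    """Offline check: replay the HMAC chain, then test every caveat
    against the requested operation. No network round-trip needed."""
    sig = hmac.new(root_key, token["id"].encode(), hashlib.sha256).digest()
    for caveat in token["caveats"]:
        sig = hmac.new(sig, caveat.encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(sig, token["sig"]):
        return False  # forged or tampered token
    return all(_satisfied(c, request) for c in token["caveats"])

def _satisfied(caveat: str, request: dict) -> bool:
    key, _, pattern = caveat.partition("=")
    value = request.get(key, "")
    if pattern.endswith("*"):
        return value.startswith(pattern[:-1])
    return value == pattern
```

So `attenuate(mint(key, "agent-1"), "path=/tmp/*")` yields a warrant that verifies for reads under `/tmp/` and nothing else, and any attempt to edit the caveat list invalidates the signature.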

If you are building a POC, feel free to rip out the core logic or use the crate directly. I’d love to see more tools move away from 'prompt engineering security' toward actual runtime guarantees.

Repo: https://github.com/tenuo-ai/tenuo


This is a really helpful comment, and I actually ran straight into the exact failure mode you’re describing.

I had a PreToolUse hook enabled that was supposed to block reads of ~/.env. Claude tried to read it, hit an error, then asked me for permission. When I said yes, it retried the read and succeeded. The hook was effectively bypassed via user consent.

That was the “oh wow” moment for me. Hooks can influence behavior, but they don’t remove authority. As long as the agent process still has filesystem access, enforcement is ultimately negotiable. I even tried adding an MCP server, but again it’s up to Claude to pick it up.
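The gap is easy to reproduce outside Claude entirely. A toy version (hypothetical denylist and function names, not the real hook API): the policy check only runs if the tool wrapper happens to be the path taken.

```python
import fnmatch

DENYLIST = ["*/.env", "*/.ssh/*"]

def blocked(path: str) -> bool:
    """The 'hook': a policy check that lives inside the tool wrapper."""
    return any(fnmatch.fnmatch(path, pat) for pat in DENYLIST)

def guarded_read(path: str) -> str:
    if blocked(path):
        raise PermissionError(f"policy blocks {path}")
    with open(path) as f:
        return f.read()

def bypass_read(path: str) -> str:
    # Same process, same ambient filesystem authority: if the agent
    # (or an approving user) routes around the wrapper, the policy
    # never runs at all.
    with open(path) as f:
        return f.read()
```

Nothing in the operating system distinguishes `guarded_read` from `bypass_read`; both run with the full authority of the agent process, which is exactly the problem.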

Your capability token approach is the missing piece here. It makes the distinction very clear: instead of asking the agent to behave, you never give it the power in the first place. No token, no read, no amount of prompting or approval changes that unless a new token is explicitly minted.

The way I’m thinking about it now is:

- hooks are useful for intent capture and policy decisions

- capability tokens are the actual enforcement primitive

- approvals should mint new, narrower tokens rather than act as conversational overrides
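That third point can be sketched concretely (hypothetical names, not any real SDK): saying "yes" mints a scoped, expiring grant, and the tool checks the grant rather than flipping a global allow flag.

```python
import secrets
import time

# In-memory ledger of live grants: no grant, no read.
GRANTS: dict[str, dict] = {}

def approve(path: str, ttl_s: float = 60.0) -> str:
    """User approval mints a fresh grant scoped to one path with a
    short TTL, instead of acting as a conversational override."""
    token = secrets.token_hex(16)
    GRANTS[token] = {"path": path, "expires": time.monotonic() + ttl_s}
    return token

def grant_allows(token: str, path: str) -> bool:
    grant = GRANTS.get(token)
    return (
        grant is not None
        and grant["path"] == path
        and time.monotonic() <= grant["expires"]
    )

def read_file(token: str, path: str) -> str:
    if not grant_allows(token, path):
        raise PermissionError(f"no live grant for {path}")
    with open(path) as f:
        return f.read()
```

Approving `~/.env` once says nothing about `~/.ssh`, and after the TTL the token is dead; a later retry fails mechanically rather than conversationally.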

Really appreciate you sharing Tenuo. This feels like the right direction if we want agent security to move past “prompt engineering as policy” and toward real runtime guarantees.


That "oh wow" moment you described, where the agent effectively social-engineered the user to bypass the hook, is exactly the failure mode that pushed me to build this. Hooks are advisory, capabilities are mandatory.

Your framing of "approvals should mint new tokens" is the core design pattern in Tenuo.

The agent starts with zero file access. When it asks "Can I read ~/.env?" and the user says "Yes", the system doesn't just disable the hook. It mints a fresh, ephemeral Warrant for path: "~/.env".

That way, even if the agent hallucinates later and tries to reuse that permission for ~/.ssh (or even for ~/.env after the TTL expires), it physically can't. The token doesn't exist.

Glad Tenuo resonates. This is the direction the whole ecosystem needs to move.


I've been spending weekends thinking about authorization for AI agents, specifically delegation.

The failure mode I keep hitting: once you give an agent tools, it gets ambient authority over all of them. There's no clean way to say "for this task, read-only on the reports table" or "spin up no more than 3 VMs." When the agent spawns sub-agents mid-execution, they inherit full access by default.

IAM doesn't help much. Authority stays tied to the agent's identity even as intent shifts during execution.

I'm exploring a capability-based model instead: authority is explicit, task-scoped, and attenuating. Closest to Macaroons/Biscuit, but adapted for workflows where delegation happens dynamically mid-task.
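A minimal sketch of the attenuation property (illustrative types, not the prototype's actual API): a sub-agent's capability can only shrink relative to its parent's, so "read-only on the reports table" or "no more than 3 VMs" is enforced at delegation time.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Capability:
    resource: str       # e.g. "table:reports" or "vm:pool"
    actions: frozenset  # e.g. frozenset({"read"})
    max_units: int = 1  # e.g. a VM-count quota

    def attenuate(self, actions=None, max_units=None) -> "Capability":
        """Delegation can only narrow authority, never widen it."""
        new_actions = frozenset(actions) if actions is not None else self.actions
        if not new_actions <= self.actions:
            raise ValueError("cannot delegate actions you do not hold")
        new_units = self.max_units if max_units is None else max_units
        if new_units > self.max_units:
            raise ValueError("cannot raise the quota when delegating")
        return Capability(self.resource, new_actions, new_units)
```

A sub-agent spawned with `parent.attenuate(actions={"read"})` simply has no handle through which to write, and asking for a higher quota than the parent holds fails at delegation time, not at some later enforcement point.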

Early prototype (Rust core, Python SDK, LangChain integration), still thinking it through. Notes here: https://niyikiza.com/posts/capability-delegation/


Excellent article, and I fully agree.

I came to the same realization a while ago and started building an agent runtime designed to ensure all (I/O) effects are capability bound and validated by policies, while also allowing the agent to modify itself.

https://github.com/smartcomputer-ai/agent-os/


Thanks! Just looked at Agent OS. Love the 'Signed Receipts' concept in your AIR spec.

We reached the same conclusion on the 'Ambient Authority' problem, but I attacked it from the other end of the stack.

Tenuo is just the authorization primitive (attenuating warrants + verification), not the full runtime. The idea is you plug it into whatever runtime you're already using (LangChain, LangGraph, your own).

I'm currently in stealth-ish/private alpha, but the architecture is designed to be 'userspace' agnostic. I’d love to see if Tenuo’s warrant logic could eventually serve as a primitive inside an Agent OS process.

I'll reach out directly. I'd love to compare notes on the 'Capabilities vs. Guardrails' implementation details.


Felt similarly back in the day when I bought my first house but had to buy another one (because reasons):

- Property owner: can house prices just go up and raise my equity

- Prospect house buyer: can house prices just stay sane until I buy again


Where are you based? What's your specialization?


Mr Meeseeks with extra steps.


EY wrote a good book about that very topic: "Inadequate Equilibria -- Where and How Civilizations Get Stuck" https://equilibriabook.com/


I mean...he doesn't have to say it: he is clearly incompetent!


There are many "viable" options to advertising with AdSense!


I wonder if its celebrities know things


You may benefit from his other essay on how to disagree: http://www.paulgraham.com/disagree.html

