The model file is small enough to keep in Git (the safetensors file is only 600MB), but the Gemma TOS leave me unsure whether I’m required to apply the same “Read and accept the Gemma TOS” gate that they have on their public Hugging Face model.
As for ptrace, I use it to inject code into the user’s shell to present the command in a way that doesn’t require further interaction to run. I wanted it to be more like the “AI terminal” experience without requiring the user to copy-paste the recommended command back into their shell prompt.
> During the TLS handshake, the client tells the server which treeheads it has.
I don’t love the idea of giving every server I connect to via TLS the ability to fingerprint me by how recently (or not) I’ve fetched MTC treeheads. Even worse if this is in client hello, where anyone on the network path can view it either per connection or for my DoH requests to bootstrap encrypted client hello.
If your browser is online on an unrestricted network, then the tree heads will be kept up to date, and this will leak nothing. If you had your laptop closed for a weekend, then open it and immediately visit a website before your browser has had a chance to update, well, for maybe a minute or two you leak that your laptop was closed for a weekend. So it's not that much. But we'll want to see how we can reduce this as much as possible.
It can't possibly be updating continuously in real time, can it? Especially for battery devices, a constant background thread polling for updates seems untenable.
Sure, but unlike the CRL checks, the server gets to directly know how recently the client fetched the update, if my understanding is correct. Knowing which landmarks the client has would likely give you a fairly precise picture of the update time, since more frequent landmarks yield smaller MTC proofs.
Spitballing here: would it still meet the needs of the protocol if the client offered which MTCAs it has (no version information), the server sent back some “typical” depth (say, 3 levels up the tree), and the client could then decide to either:
* Accept the MTC
* Request a deeper traversal, following some superlinear growth like Fibonacci numbers. In that case, they’d communicate “give me up to 5 nodes above your leaf”
* Reject the MTC
* Request the full certificate for “traditional” validation
The server still has a side channel for “how recently updated is this client” by knowing how many levels of inclusion proofs needed to be shared, but this is much less signal than knowing exactly which landmarks a client has.
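To make the escalation idea above concrete, here is a rough Python sketch of the client side. Everything in it is hypothetical: `fib_depths`, `negotiate_depth`, and the `depth_is_enough` callback are names I made up for illustration, and none of this comes from the MTC draft. The callback stands in for "the client can verify an inclusion proof of this depth against a node it already trusts."

```python
from typing import Callable, Iterator, Optional


def fib_depths(start: int = 3, limit: int = 64) -> Iterator[int]:
    """Yield Fibonacci numbers in [start, limit]: 3, 5, 8, 13, ..."""
    a, b = 1, 2
    while a <= limit:
        if a >= start:
            yield a
        a, b = b, a + b


def negotiate_depth(
    depth_is_enough: Callable[[int], bool],
    start: int = 3,
    limit: int = 64,
) -> Optional[int]:
    """Return the first depth the client can verify, or None.

    depth_is_enough is a hypothetical stand-in for verifying a proof
    of the given depth against a tree node the client already has.
    None means the client gives up and either rejects the MTC or
    asks for the full certificate instead.
    """
    for depth in fib_depths(start, limit):
        if depth_is_enough(depth):
            return depth
    return None


# A client whose newest known node sits 7 levels above the leaf
# accepts on the third request (depths 3, 5, then 8).
chosen = negotiate_depth(lambda depth: depth >= 7)
```

The number of rounds is what the server still learns, and it only distinguishes a handful of buckets (3, 5, 8, 13, ...) rather than an exact landmark set.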
Different machines will need to have variations in when they grab updates to avoid thundering herd problems.
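A minimal sketch of what that variation could look like, assuming a client that polls on a fixed base interval and adds random jitter. The function name, the one-hour interval, and the 50% jitter fraction are all made up for illustration and are not from any browser or spec.

```python
import random
from typing import Optional


def next_update_delay(
    base_interval_s: float,
    jitter_fraction: float = 0.5,
    rng: Optional[random.Random] = None,
) -> float:
    """Return a delay spread uniformly around base_interval_s.

    With jitter_fraction=0.5 the result falls anywhere between 50%
    and 150% of the base interval, so clients that started together
    drift apart instead of hitting the server at the same moment.
    """
    if rng is None:
        rng = random.Random()
    jitter = rng.uniform(-jitter_fraction, jitter_fraction)
    return base_interval_s * (1.0 + jitter)


# Hypothetical one-hour base interval.
delay = next_update_delay(3600.0)
```

The flip side is the one already raised in this thread: the more the update times vary between machines, the more a client's set of tree heads says about that particular machine.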
I could see the list of client-supplied available roots being added to client fingerprinting code for passive monitoring (e.g. JA4) if it’s in the client hello, or for the benefit of just the server if it’s encrypted in transit.
Yes, but, if this is a public website that anyone can use, then an abuser using your TURN server for other purposes can also grab a single-use credential from the site, making it a bit pointless.
Without TURN, two clients that want to do streaming communication connect directly to each other, letting both ends know things like IP addresses, supported protocols, and other fingerprintable features. This was the norm for a long time - “I got your IP, I know where you live.”
I’m not sure what you mean by fingerprinting and supported protocols. None of that would be present inherently in a UDP stream unless the application included it. As for hiding the IP address, that is a valid use case for a TURN server, but I’m guessing 99% of TURN server usage occurs only because the NAT hole punch failed.
> by fingerprinting and supported protocols. None of that would be present inherently in a UDP stream unless the application included it.
Much like TLS, both clients offer all the protocols, versions, and media encodings that they support so that they can find a common set that they can use together.
This is standard negotiation when establishing connections in WebRTC and it's obviously fingerprintable information.
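As a toy illustration of why an offer like that is fingerprintable, here is a Python sketch. It does not parse real SDP; the plain lists of codec names and the helpers `common_codecs` and `offer_fingerprint` are assumptions made up for this example.

```python
import hashlib
from typing import List


def common_codecs(offer: List[str], answer: List[str]) -> List[str]:
    """Codecs both sides support, kept in the offerer's preference order."""
    supported = set(answer)
    return [codec for codec in offer if codec in supported]


def offer_fingerprint(offer: List[str]) -> str:
    """Short, stable hash of an offer's contents and ordering."""
    canonical = ",".join(offer)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


# Hypothetical offers from two different client builds.
client_a = ["VP9", "VP8", "H264", "opus"]
client_b = ["H264", "VP8", "opus"]

shared = common_codecs(client_a, client_b)
same_build = offer_fingerprint(client_a) == offer_fingerprint(list(client_a))
different_build = offer_fingerprint(client_a) != offer_fingerprint(client_b)
```

The same list that lets the two ends find a common codec is stable per client build, so whoever receives the offer can hash it and use the result as an identifier.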
You’re talking about WebRTC, which may or may not make use of a TURN server. And I assume there are other uses of TURN that are completely unrelated to WebRTC.
Use of a TURN server does not imply hiding of negotiation details. The TURN RFC [1] does not mention anything related to media encodings or WebRTC negotiations at all.