Since there are several questions about Encrypted Client Hello (ECH), and I kind of hand waved that section, I thought an example might be useful.
Let's say the system is running two web server daemons: a multi-tenant blog hosting platform listening on 2001:db8::1, and a multi-tenant bug tracker listening on 2001:db8::2. snid is on 192.0.2.1. Your DNS records would look like this:
blogs.example.com. A 192.0.2.1
blogs.example.com. AAAA 2001:db8::1
bugs.example.com. A 192.0.2.1
bugs.example.com. AAAA 2001:db8::2
The various tenants would be CNAMEd to one of these hostnames like:
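For illustration (the tenant hostnames here are made up):
blog.alice-example.net. CNAME blogs.example.com.
tracker.acme-example.org. CNAME bugs.example.com.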
The "decoy" hostnames (the "public_name" in ECH parlance) would be blogs.example.com or bugs.example.com. Thus, ECH would hide which tenant the client is connecting to, but would not hide the service. Note that if the client were connecting over IPv6, an eavesdropper would be able to determine the service anyways by looking at the destination IP address, which is unencrypted.
Note that if you wanted to provide privacy protection for the encrypted SNI, you'd have to have a single external-facing IPv6 address too, not just IPv4. Then, of course, you couldn't steal enough bits to represent the client address in the source address presented to the backend. Tricky. You might have to implement the PROXY protocol after all. Another option might be to do something like identd for established TCP connections, and interpose on getpeername() on the backend side to call the identd at the proxy.
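For reference, PROXY protocol v1 just prepends one plain-text line to the connection before the TLS bytes, so the backend learns the original client address. A hypothetical example (client 203.0.113.7 connecting to the proxy at 192.0.2.1):
PROXY TCP4 203.0.113.7 192.0.2.1 51034 443
The line is terminated by CRLF, after which the normal TLS stream follows.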
Nice solution. You can go a step further if you have the need - your eavesdropper or malicious observer problem can be addressed by launching the network connections from inside the process space of your app, e.g. for Go:
https://github.com/openziti/sdk-golang
Similarly, this eliminates the IP address dependencies.
TLS + ECH encrypts all message content, including the Client Hello with the hostname, but it does not hide the one piece of metadata (the IP address + service identifier combination) that uniquely specifies which program will terminate the TLS connection.
If you want to keep the hostname you're connecting to private, you need not only a single public IP address shared by many hosts, but also a single server that terminates TLS for all of those hosts.
That's an accurate summary of how TLS+ECH would work with snid.
More generally, I don't think it's required for a single server to terminate TLS for all hosts. If an SNI proxy server knew the private key necessary for decrypting the ECH extension, it could look inside it to determine where to proxy the connection, without having to terminate TLS.
If snid worked this way, the unencrypted SNI hostname wouldn't need to identify the backend, which means that clients connecting over IPv4 would have more privacy. But snid would have to coordinate the ECH encryption key with the backends, which would add a lot of complexity, and IPv6 clients wouldn't benefit in any case.
> I have had it with standalone web servers: they're all over-complicated and I always end up with an awkward bifurcation of logic between my app's code and the web server's config
Personally I've grown really fond of letting nginx terminate TLS and proxy to the web app. It's a clean separation of concerns, not very complicated and upgrading the cert is easy (certbot).
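A minimal sketch of that kind of setup (hostname, cert paths, and the upstream port are placeholders):

server {
    listen 443 ssl;
    server_name app.example.com;

    # certificates issued/renewed by certbot
    ssl_certificate     /etc/letsencrypt/live/app.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/app.example.com/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}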
Honestly, reverse proxying anything with Caddy has been so easy that I can’t use anything else anymore.
Docker containers are utterly easy to proxy, the defaults (e.g. the php_fastcgi directive) are sane and mostly work out of the box, the documentation is great, and everything seems so well thought out that one has to wonder why we put up so long with the convolutions of Apache and Nginx.
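For example, a whole PHP site can be roughly this much Caddyfile (paths are placeholders, and PHP-FPM is assumed to listen on 127.0.0.1:9000):

example.org {
    root * /var/www/example
    php_fastcgi 127.0.0.1:9000
    file_server
}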
Indeed, but still, one SSRF vulnerability in anything on the host and the attacker can reconfigure Caddy to serve up any other resource on any of the networks the host can access, or deny access to any resources served by Caddy.
It's an unnecessary security risk is all I'm saying, and I personally would have preferred it was authenticated or off by default.
I really love Caddy, you've built an amazing piece of software, I just disagree with your design decision on this one little thing. It's good that you put that notice in the docs at least!
SSRF and RCE are different things. Being able to throw requests to localhost and accessing the host filesystem don't necessarily have to be the same vulnerabilities, but I'll concede that SSRF vulns are uncommon.
I'm just worried that Caddy will be the source of "security misconfiguration" (1) findings in penetration test reports. All I'm saying is that we as software engineers should strive not to leave our software insecure by default, and that's how I see Caddy 2's admin API.
Can you demonstrate an exploit in Caddy itself (i.e. isn't actually exploiting something else, or configuring yourself into a hole first -- we can't do much about external factors)? If it's valid we'll see about a patch.
I'm lovin' Caddy! Below is a typical reverse proxy + www redirect + static file serving. It's so much more readable and simple than Apache/Nginx that I can't imagine switching back.
www.example.com {
redir https://example.com{uri}
}
example.com {
# tell Caddy where your favicon files are
@favicon {
path /favicon.ico
# + other favicon files...
# ...
# ...
}
# serve your static files
route /static/* {
root * /var/cache/example.com
header Cache-Control max-age=31536000 # 1 year
file_server
}
# serve your favicon files from your favicon route
route @favicon {
root * /var/cache/example.com/favicons
file_server
}
encode zstd gzip
# reverse proxy to your app
reverse_proxy 127.0.0.1:8080
# do some logging
log {
format json
output file /var/log/caddy/example.com/access.log {
roll_size 100MiB
roll_keep 10
roll_keep_for 2160h # 90 days
}
}
}
Caddy as a proxy is great - I only wish it was easier to copy Apache-style auth/authz "satisfy any" (e.g. whitelist some IPs, require basic auth from others).
I also expect that as setups age, one is likely to miss mod_rewrite - but at that point/style of setup, maybe Apache Traffic Server starts to make sense. Or just Apache httpd, of course.
You can do that easily with the `remote_ip`[0] matcher (pair it with the `not` matcher to invert the match). For example to require `basicauth`[1] for all non-private IPv4 ranges:
@needs-auth not remote_ip 192.168.0.0/16 172.16.0.0/12 10.0.0.0/8
basicauth @needs-auth {
Bob JDJhJDEwJEVCNmdaNEg2Ti5iejRMYkF3MFZhZ3VtV3E1SzBWZEZ5Q3VWc0tzOEJwZE9TaFlZdEVkZDhX
}
And for rewrites, there's the `rewrite`[2] handler. Not sure what you're missing?
One problem, as a lazy bastard, I've always had with the Caddy docs is it can sometimes be hard to glance at a page in the documentation and see "ah, that's how I'd do <common thing>".
Take the basicauth portion for example. If you had been reading the docs like a book, started in the reference/tutorial sections, and as a result understood all there is to know about how request matcher syntax works, then reading the basicauth page would give you a rock-solid understanding of how to make basicauth do what you want here. If instead you land on the basicauth page from a Google search, trying to stand up a quick file-server weekend project in a way your friends can access, then either you're really on top of it and notice that "[<matcher>]" (not even mentioned in the Syntax breakdown on the page) is what's used in the single example below, and happens to be a path but might be a lot more, or you leave without a hint of how to do basicauth the way you wanted. It'd be great if the syntax section breakouts just mentioned something that triggers more or less a "and hey dumbass, if you haven't learned how matchers work yet, you need to go do that to fully utilize this directive".
I realize this is awfully needy, the docs have everything you need if you read them carefully, and I absolutely LOVE using Caddy so it's not an attempt to say it's bad overall by any means. I wanted to point it out though since this exact example is something I ran into a few weekends ago. I think the problem is exacerbated by v2 syntax being new, as well as the competing JSON syntax, making it harder for people to find use case examples outside of what's in the official docs.
Protip: you can click almost everything in code blocks in the docs. For example, if you click `[<matcher>]`, it brings you right to the request matcher syntax section, which explains what you can fill in there.
It would be redundant to write on every page what you can use as a matcher. The Caddyfile reference docs assume you've read https://caddyserver.com/docs/caddyfile/concepts which walks you through how the Caddyfile is structured, and it'll give you the fundamentals you need to understand the rest of the docs (I think, anyway).
If you think we need more examples for a specific usecase, we can definitely include those. Feel free to propose some changes on https://github.com/caddyserver/website, we could always use the help!
I had a huge facepalm the day I realized all of the syntax was clickable :). It's not even an uncommon feature; I just hadn't tried clicking it for some reason!
Yeah, I wouldn't say every page necessarily needs examples of different matchers, and certainly not of what all the matcher options are. It's more a "if the token is shown in the syntax, it should have a bullet in the syntax section" kind of approach, which in the case of [<matcher>] could be as plain and short as "[<matcher>] is a token which allows you to control the scope a directive applies to. For details, see Request Matchers." - enough to raise the "you probably want to know more about this first" flag, so anyone who just jumped in from Google goes and reads about matchers before trying to understand the directive.
If that makes any sense I'd be glad to raise it more formally over on the GitHub!
Right, I know it's possible (and thanks for the example) - I still wish it was easier to specify "only authorize given conditions x, y, z".
In the above, access is implied, then revoked without a valid username/password, then an exception to the login requirement is made for certain conditions (IPs in this case) - and access is implied (again).
IMNSHO one of the strengths of Apache is how authentication providers and authorization are separated and allow for easy(ish) combinations.
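For example, the kind of thing I mean, in Apache 2.4 terms (paths and ranges made up):

<Location "/admin">
    AuthType Basic
    AuthName "Restricted"
    AuthBasicProvider file
    AuthUserFile /etc/apache2/htpasswd
    # local clients get in directly, everyone else must authenticate
    <RequireAny>
        Require ip 192.168.0.0/16
        Require valid-user
    </RequireAny>
</Location>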
That said, there's something to be said for doubling down on a single type of handling for access and rewrites (matchers).
I still prefer matchers (which resource) combined with action/policy (allow/deny/rewrite).
I'm not totally sure I follow how you'd like for it to be structured. Could you give a config example? I think one of the issues is `basicauth` needs to write a response header to work correctly (i.e. WWW-Authenticate to tell the browser to prompt) so it can't only act as a matcher.
Caddy looks interesting. I currently use Apache to proxy a few hundred sites and it works well enough: some are protected by client certificates, others by OIDC, and all then pass the authenticated user to the downstream server in a header. Job done.
I've managed to do this with OpenResty (nginx not supporting OIDC out of the box), but it doesn't fill me with confidence - I guess it's all the Lua. A quick glance at Caddy shows it likewise doesn't support OIDC integration out of the box; instead I have to use another module that's no longer maintained ( https://github.com/thspinto/caddy-oidc )
Yeah, we defer to plugins to provide auth solutions, because it's... a whole thing. It's best maintained outside of the standard distribution, because there's so many ways to approach it.
The caddy-oidc plugin you linked was written for Caddy v1, so it's no longer compatible. The most complete auth plugin for Caddy v2 is https://github.com/greenpau/caddy-security, and I think it probably does what you need.
> Since IPv6 addresses are 128 bits long, but IPv4 addresses are only 32 bits, it's possible to embed IPv4 addresses in IPv6 addresses. snid embeds the client's IP address in the lower 32 bits of the source address which it uses to connect to the backend.
How does this work? snid just makes up an IP address? What socket API calls do you make to do this? Just pick an address and bind, and the kernel is fine with that? And it all gets routed back and forth correctly? Do you have to configure this 64:ff9b:1::/48 prefix on the loopback interface?
> Encrypted Client Hello doesn't actually encrypt the initial Client Hello message. It's still sent in the clear, but with a decoy SNI hostname. The actual Client Hello message, with the true SNI hostname, is encrypted and placed in an extension of the unencrypted Client Hello. To make Encrypted Client Hello work with snid, I just need to ensure that the decoy SNI hostname resolves to the IPv6 address of the backend server. snid will see this hostname and route the connection to the correct backend server, as usual.
How does the decoy SNI hostname get chosen? This sounds like there needs to be a different decoy hostname for each backend service. Does that come from DNS somehow? The client doesn't just make it up at random?
> How does this work? snid just makes up an IP address? What socket API calls do you make to do this? Just pick an address and bind, and the kernel is fine with that? And it all gets routed back and forth correctly? Do you have to configure this 64:ff9b:1::/48 prefix on the loopback interface?
First you have to set the IP_FREEBIND socket option, which allows binding to a nonlocal/nonexistent IP address, and then you call bind() with whatever address you like. To ensure the packets get routed back properly, you need a local route for the 64:ff9b:1::/96 prefix, which can be added with:
ip route add local 64:ff9b:1::/96 dev lo
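A rough Go sketch of the dialing side (not snid's actual code; it assumes the local route above is in place and that golang.org/x/sys/unix is available):

package main

import (
    "log"
    "net"
    "syscall"

    "golang.org/x/sys/unix"
)

// dialFromClientAddr connects to an IPv6 backend using a source address that
// embeds the client's IPv4 address in the low 32 bits of 64:ff9b:1::/96.
func dialFromClientAddr(clientV4 net.IP, backend string) (net.Conn, error) {
    src := make(net.IP, net.IPv6len)
    copy(src, net.ParseIP("64:ff9b:1::"))
    copy(src[12:], clientV4.To4())

    d := net.Dialer{
        LocalAddr: &net.TCPAddr{IP: src}, // port 0 = pick an ephemeral port
        Control: func(network, address string, c syscall.RawConn) error {
            var serr error
            err := c.Control(func(fd uintptr) {
                // IP_FREEBIND lets bind() succeed for an address not assigned to any interface.
                serr = unix.SetsockoptInt(int(fd), unix.SOL_IP, unix.IP_FREEBIND, 1)
            })
            if err != nil {
                return err
            }
            return serr
        },
    }
    return d.Dial("tcp6", backend)
}

func main() {
    conn, err := dialFromClientAddr(net.ParseIP("203.0.113.7"), "[2001:db8::1]:443")
    if err != nil {
        log.Fatal(err)
    }
    defer conn.Close()
    log.Println("connected from", conn.LocalAddr())
}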
> How does the decoy SNI hostname get chosen? This sounds like there needs to be a different decoy hostname for each backend service. Does that come from DNS somehow? The client doesn't just make it up at random?
The decoy hostname is specified in the ECHConfig struct[1], which is conveyed to the client via DNS in the HTTPS record[2].
It does indeed mean that each backend needs its own decoy hostname (which resolves to the IPv6 address of the backend). This means that ECH does not hide which backend is being connected to, but if a particular backend handles multiple hostnames, it can hide which of those hostnames the client is connecting to.
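Concretely, the tenant's DNS might carry something like this (hypothetical name; the ech value is elided):

someblog.example.net. 300 IN HTTPS 1 . ech=...

where the ech value is a base64-encoded ECHConfigList whose public_name is the decoy hostname (e.g. blogs.example.com).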
If you have at least a /96 to dedicate to snid, then couldn't you just use that public prefix instead of 64:ff9b to encapsulate the IPv4 address, making the setup somewhat simpler? Also, if you used a public prefix, then I imagine you could even run this setup over the internet, i.e. have snid run on some public dual-stack server and forward connections to IPv6-only app servers. I'm imagining the common situation where you have CGNAT IPv4 + native IPv6 at home: you could host snid on a public cloud instance to expose services running at home.
> Meanwhile, my preferred language, Go, has a high-quality, memory-safe HTTPS server in the standard library that is well suited for direct exposure on the Internet.
I know people do use Golang's http.Server for production use-cases. Does Google, though? Are services of Google customers ever actually directly talking to this Golang stack, without at least a minimal L7 WAF (as e.g. a default Nginx config does) in between?
I ask because there are a number of weird connection latency, slowness, and "stuttering" problems I've experienced with services which I know do directly expose Golang servers — e.g. Docker Registry instances including Docker Hub; Minio instances; go-ethereum nodes; etc. — that I've never experienced with any Google service, or with any known non-Golang service.
My hypothesis is that this is due to Golang's http.Server not having any upper limit on simultaneous connections (because just in CPU and memory terms, the Golang runtime can handle almost arbitrarily many), such that eventually the bottleneck actually becomes per-connection throughput, with clients becoming starved for space in the machine's network ring buffer; and because this is such an unusual bottleneck to have (usually it's only a thing with CDNs) — and because it causes no problems for the server, esp. if things like readiness checks are done through a separate internal NIC — the people running these servers don't even notice it's happening, and so lag far behind in horizontally scaling servers to spread out the demand for throughput.
Or, to put that another way: the Golang http.Server isn't observable — exposing server-internal metrics — in the way that actual web servers like Nginx, or even web-app server frameworks like Jetty, are; and so it's very hard to know when things are silently going wrong for users, esp. in cases where the developers of a piece of software aren't themselves running it at scale and so never think to manually add observability for metrics that only become relevant at scale (which authors of generic web server software are usually aware of, but authors of application software usually aren't.) This leads me to think that, if Google themselves are using Golang services for anything at scale, and yet not rushing to implement such metrics into http.Server, then they must be observing these services in a very different way than we mere mortals do. Maybe calculating per-flow packet-wise QoS at the edge in their fancy LANai switches using historical statistical fingerprints of predictable flow patterns, or something.
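For what it's worth, the closest you can get with just the standard library and x/net is wiring this up by hand; a rough sketch (the cap of 4096 is arbitrary):

package main

import (
    "log"
    "net"
    "net/http"
    "sync/atomic"

    "golang.org/x/net/netutil"
)

func main() {
    var open int64 // crude gauge of currently-open connections

    srv := &http.Server{
        Addr: ":8080",
        Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            w.Write([]byte("hello\n"))
        }),
        // ConnState fires on new/active/idle/hijacked/closed transitions,
        // which is about as much visibility as net/http exposes by itself.
        ConnState: func(c net.Conn, state http.ConnState) {
            switch state {
            case http.StateNew:
                log.Printf("conns open: %d", atomic.AddInt64(&open, 1))
            case http.StateClosed, http.StateHijacked:
                atomic.AddInt64(&open, -1)
            }
        },
    }

    ln, err := net.Listen("tcp", srv.Addr)
    if err != nil {
        log.Fatal(err)
    }
    // Cap concurrent connections instead of accepting everything and
    // letting per-connection throughput starve.
    log.Fatal(srv.Serve(netutil.LimitListener(ln, 4096)))
}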
I wrote a web server that uses Go's net/http but not http.Serve, for this reason: I wanted more control over accept and close. The nice thing is that the http library is decently composed, so you can take all the parts you want and build up.
You can use OpenTelemetry to get traces and metrics for Go's net/http, but since I'm not using it I have no idea what metrics are already supported.
BTW, Google does use Go services at scale (I'm not a Googler), but they probably do it like they do with App Engine and just limit each service to a certain number of req/s.
> You can use OpenTelemetry to get traces and metrics for Go's net/http
Yes, but not server-internal metrics, as IIRC the telemetry provider plugs into http.Server as a middleware. It only exposes the kinds of metrics you could calculate yourself in a handler; it doesn’t expose any of the “good stuff” (e.g. work queues, accept(2) latency, etc.)
Nice, this is kind of why I made Project Conncept. It's a powerful TCP and UDP stream multiplexer based on Caddy: https://github.com/mholt/caddy-l4
You can route raw TCP connections by using higher layer protocol matching logic like HTTP properties, SSH, TLS ClientHello info, and more, in composable routes that let you do nearly anything.
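For example, routing by TLS SNI to a backend looks roughly like this in the JSON config (addresses are placeholders; see the README for complete examples):

{
  "apps": {
    "layer4": {
      "servers": {
        "example": {
          "listen": [":443"],
          "routes": [
            {
              "match": [
                { "tls": { "sni": ["blogs.example.com"] } }
              ],
              "handle": [
                { "handler": "proxy", "upstreams": [ { "dial": ["10.0.0.1:443"] } ] }
              ]
            }
          ]
        }
      }
    }
  }
}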
> You can route raw TCP connections by using higher layer protocol matching logic like HTTP properties, SSH, TLS ClientHello info, and more, in composable routes that let you do nearly anything.
How do you foresee such a setup handling QUIC? The encrypted connection IDs, 0-RTT handshakes, and roaming client and server IPs make it non-trivial to proxy connections transparently.
Good question; I'm not really sure! Will need to look into it, or have people contribute some ideas. Feel free to start a discussion on the issue tracker if you're interested in this!
I have a server that hosts several websites. I wanted some of them to be installed in a separate (systemd) container (because they belong to the same organization).
I use nginx's ssl_preread module to proxy https requests to the container or to another port depending on the SNI. This is what snid does in the article if I understood correctly (without the DNS lookup because I don't need it, but it is able to do it too). It works well and it's good that the nginx at the front does not need to have the SSL certificates. In this setup, Nginx does not need to decode anything, it just does a pass-through, so this is quite light. It is also way simpler to setup than an actual HTTP reverse proxy.
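Roughly what that looks like, for anyone curious (hostnames and upstream addresses are placeholders):

stream {
    map $ssl_preread_server_name $backend {
        blog.example.com    [2001:db8::10]:443;
        bugs.example.com    127.0.0.1:8443;
        default             127.0.0.1:8443;
    }

    server {
        listen 443;
        ssl_preread on;
        proxy_pass $backend;
    }
}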
This is great. I think SNI is currently one of the most pragmatic tools to work around ipv4 exhaustion.
I like the approach described here, but in practice I prefer the convenience of having a reverse proxy to automatically handle TLS certs for me. That said, libraries like certmagic are making it more feasible for every app to manage its own certs.
See also DNS SVCB/HTTPS records (a pseudo-NAT at the DNS layer), but not many deployments use or understand them. Note that SNI-based routing works nicely for TCP without much (user-space) complication. With QUIC, transparently proxying connections isn't all that straightforward.
QUIC hides and encrypts everything it possibly can, and that includes server and client identities (IPs are inconsequential to a QUIC session, which is instead maintained through connection IDs exchanged under TLS encryption or otherwise obfuscated; that is, there is no way for a middlebox to analyse a QUIC session/flow without actually MITMing TLS).
As part of my effort to go single-stack v6 with minimum-effort v4 for my self-hosted services and projects, I've always wanted a way to avoid per-app reverse proxy configuration; with v6, each new app can simply listen on its own address.
If the decoy hostnames (as the author describes it) used in encrypted SNI are deterministic such that you can statically determine the real hostname the client wants, then what’s the point of encrypted SNI in the first place?
ECH can't hide which backend the client wants, but if a particular backend handles multiple hostnames, it can hide which of those hostnames the client wants. I went into further detail here: https://news.ycombinator.com/item?id=31136335
Clever. I think I'd prefer to do TLS termination at the proxy though, something akin to stunnel. Of course snid could be used together with stunnel, but I think it would lose the O(1) configuration then. Just terminating tls and not touching the http content would still avoid any of those http parsing issues mentioned.
An earlier version did the TLS termination in the proxy. It's true you can avoid the HTTP parsing issues, but you lose the ability to do client certs or have backend-specific cipher/TLS version requirements. Also, I really like it that IPv6 clients can connect directly to the backend, bypassing any proxies.
I'm a huge fan of this approach, but I would also combine it almost equally with standard HTTP reverse proxies; there's a lot you can gain from having a proxy that can understand paths, buffer requests, etc.