Since there are several questions about Encrypted Client Hello (ECH), and I kind of hand waved that section, I thought an example might be useful.
Let's say the system is running two web server daemons: a multi-tenant blog hosting platform listening on 2001:db8::1, and a multi-tenant bug tracker listening on 2001:db8::2. snid is on 192.0.2.1. Your DNS records would look like this:
blogs.example.com. A 192.0.2.1
blogs.example.com. AAAA 2001:db8::1
bugs.example.com. A 192.0.2.1
bugs.example.com. AAAA 2001:db8::2
The various tenants would be CNAMEd to one of these hostnames like:
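For illustration (the tenant hostnames here are made up):
blog.alice-example.net. CNAME blogs.example.com.
tracker.acme-example.org. CNAME bugs.example.com.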
The "decoy" hostnames (the "public_name" in ECH parlance) would be blogs.example.com or bugs.example.com. Thus, ECH would hide which tenant the client is connecting to, but would not hide the service. Note that if the client were connecting over IPv6, an eavesdropper would be able to determine the service anyways by looking at the destination IP address, which is unencrypted.
Note that if you wanted to provide privacy protection for the encrypted SNI, you'd have to have a single external-facing IPv6 address too, not just IPv4. Then, of course, you couldn't steal enough bits to represent the client address in the source address presented to the backend. Tricky. You might have to implement the PROXY protocol after all. Another option might be to do something like identd for established TCP connections, and interpose on getpeername() on the backend side to call the identd at the proxy.
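For reference, PROXY protocol v1 just prepends one plain-text line to the connection before the TLS bytes, so the backend learns the original client address. A hypothetical example (client 203.0.113.7 connecting to the proxy at 192.0.2.1):
PROXY TCP4 203.0.113.7 192.0.2.1 51034 443
The line is terminated by CRLF, after which the normal TLS stream follows.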
Nice solution. You can go a step further if you have the need - your eavesdropper or malicious observer problem can be addressed by launching the network connections from inside the process space of your app, e.g. for Go:
https://github.com/openziti/sdk-golang
Similarly, this eliminates the IP address dependencies.
TLS + ECH encrypts all message content, including the Client Hello with the hostname, but it does not hide the one piece of metadata (the IP address + service identifier combination) that uniquely specifies which program will terminate the TLS connection.
If you want to keep the hostname you're connecting to private, you need not only a single public IP address shared by many hosts, but also a single server that terminates TLS for all of those hosts.
That's an accurate summary of how TLS+ECH would work with snid.
More generally, I don't think it's required for a single server to terminate TLS for all hosts. If an SNI proxy server knew the private key necessary for decrypting the ECH extension, it could look inside it to determine where to proxy the connection, without having to terminate TLS.
If snid worked this way, the unencrypted SNI hostname wouldn't need to identify the backend, which means that clients connecting over IPv4 would have more privacy. But snid would have to coordinate the ECH encryption key with the backends, which would add a lot of complexity, and IPv6 clients wouldn't benefit in any case.
> I have had it with standalone web servers: they're all over-complicated and I always end up with an awkward bifurcation of logic between my app's code and the web server's config
Personally I've grown really fond of letting nginx terminate TLS and proxy to the web app. It's a clean separation of concerns, not very complicated and upgrading the cert is easy (certbot).
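A minimal sketch of that kind of setup (hostname, cert paths, and the upstream port are placeholders):

server {
    listen 443 ssl;
    server_name app.example.com;

    # certificates issued/renewed by certbot
    ssl_certificate     /etc/letsencrypt/live/app.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/app.example.com/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}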
Honestly, reverse proxying anything with Caddy has been so easy that I can’t use anything else anymore.
Docker containers are utterly easy to proxy, the defaults (e.g. the php_fastcgi directive) are sane and mostly work out of the box, the documentation is great, and everything seems so well thought out that one has to wonder why we put up so long with the convolutions of Apache and Nginx.
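For example, a whole PHP site can be roughly this much Caddyfile (paths are placeholders, and PHP-FPM is assumed to listen on 127.0.0.1:9000):

example.org {
    root * /var/www/example
    php_fastcgi 127.0.0.1:9000
    file_server
}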
Indeed, but still, one SSRF vulnerability in anything on the host and the attacker can reconfigure Caddy to serve up any other resource on any of the networks the host can access, or deny access to any resources served by Caddy.
It's an unnecessary security risk is all I'm saying, and I personally would have preferred it was authenticated or off by default.
I really love Caddy, you've built an amazing piece of software, I just disagree with your design decision on this one little thing. It's good that you put that notice in the docs at least!
SSRF and RCE are different things. Being able to throw requests to localhost and accessing the host filesystem don't necessarily have to be the same vulnerabilities, but I'll concede that SSRF vulns are uncommon.
I'm just worried that Caddy will be the source of "security misconfiguration" (1) findings in penetration test reports. All I'm saying is that we as software engineers should strive not to leave our software insecure by default, and that's how I see Caddy 2's admin API.
Can you demonstrate an exploit in Caddy itself (i.e. isn't actually exploiting something else, or configuring yourself into a hole first -- we can't do much about external factors)? If it's valid we'll see about a patch.
I'm lovin' Caddy! Below is a typical reverse proxy + www redirect + static file serving. It's so much more readable and simple than Apache/Nginx that I can't imagine switching back.
www.example.com {
redir https://example.com{uri}
}
example.com {
# tell Caddy where your favicon files are
@favicon {
path /favicon.ico
# + other favicon files...
# ...
# ...
}
# serve your static files
route /static/* {
root * /var/cache/example.com
header Cache-Control max-age=31536000 # 1 year
file_server
}
# serve your favicon files from your favicon route
route @favicon {
root * /var/cache/example.com/favicons
file_server
}
encode zstd gzip
# reverse proxy to your app
reverse_proxy 127.0.0.1:8080
# do some logging
log {
format json
output file /var/log/caddy/example.com/access.log {
roll_size 100MiB
roll_keep 10
roll_keep_for 2160h # 90 days
}
}
}
Caddy as a proxy is great - I only wish it was easier to copy Apache-style auth/authz "satisfy any" (e.g. whitelist some IPs, require basic auth from others).
I also expect that as setups age, one is likely to miss mod_rewrite - but at that point/style of setup, maybe Apache Traffic Server starts to make sense. Or just Apache httpd, of course.
You can do that easily with the `remote_ip`[0] matcher (pair it with the `not` matcher to invert the match). For example to require `basicauth`[1] for all non-private IPv4 ranges:
@needs-auth not remote_ip 192.168.0.0/16 172.16.0.0/12 10.0.0.0/8
basicauth @needs-auth {
Bob JDJhJDEwJEVCNmdaNEg2Ti5iejRMYkF3MFZhZ3VtV3E1SzBWZEZ5Q3VWc0tzOEJwZE9TaFlZdEVkZDhX
}
And for rewrites, there's the `rewrite`[2] handler. Not sure what you're missing?
One problem, as a lazy bastard, I've always had with the Caddy docs is it can sometimes be hard to glance at a page in the documentation and see "ah, that's how I'd do <common thing>".
Take the basicauth portion for example. If you had been reading the docs like a book, started in the reference/tutorial sections, and as a result understood all there is to know about how request matcher syntax works, then reading the basicauth page would give you a rock-solid understanding of how to make basicauth do what you want here. If instead you land on the basicauth page from a Google search, trying to stand up a quick file-server weekend project in a way your friends can access, then either you're really on top of it and notice that "[<matcher>]" (not even mentioned in the Syntax breakdown on the page) is what's used in the single example below, and happens to be a path but might be a lot more, or you leave without a hint of how to do basicauth the way you wanted. It'd be great if the syntax section breakouts just mentioned something that triggers more or less a "and hey dumbass, if you haven't learned how matchers work yet, you need to go do that to fully utilize this directive".
I realize this is awfully needy, the docs have everything you need if you read them carefully, and I absolutely LOVE using Caddy so it's not an attempt to say it's bad overall by any means. I wanted to point it out though since this exact example is something I ran into a few weekends ago. I think the problem is exacerbated by v2 syntax being new, as well as the competing JSON syntax, making it harder for people to find use case examples outside of what's in the official docs.
Protip: you can click almost everything in code blocks in the docs. For example, if you click `[<matcher>]`, it brings you right to the request matcher syntax section, which explains what you can fill in there.
It would be redundant to write on every page what you can use as a matcher. The Caddyfile reference docs assume you've read https://caddyserver.com/docs/caddyfile/concepts which walks you through how the Caddyfile is structured, and it'll give you the fundamentals you need to understand the rest of the docs (I think, anyway).
If you think we need more examples for a specific usecase, we can definitely include those. Feel free to propose some changes on https://github.com/caddyserver/website, we could always use the help!
I had a huge facepalm the day I realized all of the syntax was clickable :). It's not even an uncommon feature; I just hadn't tried clicking it for some reason!
Yeah, I wouldn't say every page necessarily needs examples of different matchers, and certainly not of what all the matcher options are. It's more a "if the token is shown in the syntax, it should have a bullet in the syntax section" kind of approach, which in the case of [<matcher>] could be as plain and short as "[<matcher>] is a token which allows you to control the scope a directive applies to. For details, see Request Matchers." - enough to raise the "you probably want to know more about this first" flag, so anyone who just jumped in from Google goes and reads about matchers before trying to understand the directive.
If that makes any sense I'd be glad to raise it more formally over on the GitHub!
Right, I know it's possible (and thanks for the example) - I still wish it was easier to specify "only authorize given conditions x, y, z".
In the above, access is implied, then revoked without a valid username/password, then an exception to the login requirement is made for certain conditions (IPs in this case) - and access is implied (again).
IMNSHO one of the strengths of Apache is how authentication providers and authorization are separated and allow for easy(ish) combinations.
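For example, the kind of thing I mean, in Apache 2.4 terms (paths and ranges made up):

<Location "/admin">
    AuthType Basic
    AuthName "Restricted"
    AuthBasicProvider file
    AuthUserFile /etc/apache2/htpasswd
    # local clients get in directly, everyone else must authenticate
    <RequireAny>
        Require ip 192.168.0.0/16
        Require valid-user
    </RequireAny>
</Location>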
That said, there's something to be said for doubling down on a single type of handling for access and rewrites (matchers).
I still prefer matchers (which resource) combined with action/policy (allow/deny/rewrite).
I'm not totally sure I follow how you'd like for it to be structured. Could you give a config example? I think one of the issues is `basicauth` needs to write a response header to work correctly (i.e. WWW-Authenticate to tell the browser to prompt) so it can't only act as a matcher.
Caddy looks interesting. I currently use Apache to proxy a few hundred sites and it works well enough: some are protected by client certificates, others by OIDC, and all then pass the authenticated user to the downstream server in a header. Job done.
I've managed to do this with OpenResty (nginx not supporting OIDC out of the box), but it doesn't fill me with confidence - I guess it's all the Lua. A quick glance at Caddy shows it likewise doesn't support OIDC integration out of the box; instead I have to use another module that's no longer maintained ( https://github.com/thspinto/caddy-oidc )
Yeah, we defer to plugins to provide auth solutions, because it's... a whole thing. It's best maintained outside of the standard distribution, because there's so many ways to approach it.
The caddy-oidc plugin you linked was written for Caddy v1, so it's no longer compatible. The most complete auth plugin for Caddy v2 is https://github.com/greenpau/caddy-security, and I think it probably does what you need.
> Since IPv6 addresses are 128 bits long, but IPv4 addresses are only 32 bits, it's possible to embed IPv4 addresses in IPv6 addresses. snid embeds the client's IP address in the lower 32 bits of the source address which it uses to connect to the backend.
How does this work? snid just makes up an IP address? What socket API calls do you make to do this? Just pick an address and bind, and the kernel is fine with that? And it all gets routed back and forth correctly? Do you have to configure this 64:ff9b:1::/48 prefix on the loopback interface?
> Encrypted Client Hello doesn't actually encrypt the initial Client Hello message. It's still sent in the clear, but with a decoy SNI hostname. The actual Client Hello message, with the true SNI hostname, is encrypted and placed in an extension of the unencrypted Client Hello. To make Encrypted Client Hello work with snid, I just need to ensure that the decoy SNI hostname resolves to the IPv6 address of the backend server. snid will see this hostname and route the connection to the correct backend server, as usual.
How does the decoy SNI hostname get chosen? This sounds like there needs to be a different decoy hostname for each backend service. Does that come from DNS somehow? The client doesn't just make it up at random?
> How does this work? snid just makes up an IP address? What socket API calls do you make to do this? Just pick an address and bind, and the kernel is fine with that? And it all gets routed back and forth correctly? Do you have to configure this 64:ff9b:1::/48 prefix on the loopback interface?
First you have to set the IP_FREEBIND socket option, which allows binding to a nonlocal/nonexistent IP address, and then you call bind() with whatever address you like. To ensure the packets get routed back properly, you need a local route for the 64:ff9b:1::/96 prefix, which can be added with:
ip route add local 64:ff9b:1::/96 dev lo
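A rough Go sketch of the dialing side (not snid's actual code; it assumes the local route above is in place and that golang.org/x/sys/unix is available):

package main

import (
    "log"
    "net"
    "syscall"

    "golang.org/x/sys/unix"
)

// dialFromClientAddr connects to an IPv6 backend using a source address that
// embeds the client's IPv4 address in the low 32 bits of 64:ff9b:1::/96.
func dialFromClientAddr(clientV4 net.IP, backend string) (net.Conn, error) {
    src := make(net.IP, net.IPv6len)
    copy(src, net.ParseIP("64:ff9b:1::"))
    copy(src[12:], clientV4.To4())

    d := net.Dialer{
        LocalAddr: &net.TCPAddr{IP: src}, // port 0 = pick an ephemeral port
        Control: func(network, address string, c syscall.RawConn) error {
            var serr error
            err := c.Control(func(fd uintptr) {
                // IP_FREEBIND lets bind() succeed for an address not assigned to any interface.
                serr = unix.SetsockoptInt(int(fd), unix.SOL_IP, unix.IP_FREEBIND, 1)
            })
            if err != nil {
                return err
            }
            return serr
        },
    }
    return d.Dial("tcp6", backend)
}

func main() {
    conn, err := dialFromClientAddr(net.ParseIP("203.0.113.7"), "[2001:db8::1]:443")
    if err != nil {
        log.Fatal(err)
    }
    defer conn.Close()
    log.Println("connected from", conn.LocalAddr())
}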
> How does the decoy SNI hostname get chosen? This sounds like there needs to be a different decoy hostname for each backend service. Does that come from DNS somehow? The client doesn't just make it up at random?
The decoy hostname is specified in the ECHConfig struct[1], which is conveyed to the client via DNS in the HTTPS record[2].
It does indeed mean that each backend needs its own decoy hostname (which resolves to the IPv6 address of the backend). This means that ECH does not hide which backend is being connected to, but if a particular backend handles multiple hostnames, it can hide which of those hostnames the client is connecting to.
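Concretely, the tenant's DNS might carry something like this (hypothetical name; the ech value is elided):

someblog.example.net. 300 IN HTTPS 1 . ech=...

where the ech value is a base64-encoded ECHConfigList whose public_name is the decoy hostname (e.g. blogs.example.com).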
If you have at least a /96 to dedicate to snid, then couldn't you just use that public prefix instead of 64:ff9b to encapsulate the IPv4 address, making the setup somewhat simpler? Also, if you used a public prefix, then I imagine you could even run this setup over the internet, i.e. have snid run on some public dual-stack server and forward connections to IPv6-only app servers. I'm imagining the common situation where you have CGNAT IPv4 + native IPv6 at home: you could host snid on a public cloud instance to expose services running at home.
> Meanwhile, my preferred language, Go, has a high-quality, memory-safe HTTPS server in the standard library that is well suited for direct exposure on the Internet.
I know people do use Golang's http.Server for production use-cases. Does Google, though? Are services of Google customers ever actually directly talking to this Golang stack, without at least a minimal L7 WAF (as e.g. a default Nginx config does) in between?
I ask because there are a number of weird connection latency, slowness, and "stuttering" problems I've experienced with services which I know do directly expose Golang servers — e.g. Docker Registry instances including Docker Hub; Minio instances; go-ethereum nodes; etc. — that I've never experienced with any Google service, or with any known non-Golang service.
My hypothesis is that this is due to Golang's http.Server not having any upper limit on simultaneous connections (because just in CPU and memory terms, the Golang runtime can handle almost arbitrarily many), such that eventually the bottleneck actually becomes per-connection throughput, with clients becoming starved for space in the machine's network ring buffer; and because this is such an unusual bottleneck to have (usually it's only a thing with CDNs) — and because it causes no problems for the server, esp. if things like readiness checks are done through a separate internal NIC — the people running these servers don't even notice it's happening, and so lag far behind in horizontally scaling servers to spread out the demand for throughput.
Or, to put that another way: the Golang http.Server isn't observable — exposing server-internal metrics — in the way that actual web servers like Nginx, or even web-app server frameworks like Jetty, are; and so it's very hard to know when things are silently going wrong for users, esp. in cases where the developers of a piece of software aren't themselves running it at scale and so never think to manually add observability for metrics that only become relevant at scale (which authors of generic web server software are usually aware of, but authors of application software usually aren't.) This leads me to think that, if Google themselves are using Golang services for anything at scale, and yet not rushing to implement such metrics into http.Server, then they must be observing these services in a very different way than we mere mortals do. Maybe calculating per-flow packet-wise QoS at the edge in their fancy LANai switches using historical statistical fingerprints of predictable flow patterns, or something.
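For what it's worth, the closest you can get with just the standard library and x/net is wiring this up by hand; a rough sketch (the cap of 4096 is arbitrary):

package main

import (
    "log"
    "net"
    "net/http"
    "sync/atomic"

    "golang.org/x/net/netutil"
)

func main() {
    var open int64 // crude gauge of currently-open connections

    srv := &http.Server{
        Addr: ":8080",
        Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            w.Write([]byte("hello\n"))
        }),
        // ConnState fires on new/active/idle/hijacked/closed transitions,
        // which is about as much visibility as net/http exposes by itself.
        ConnState: func(c net.Conn, state http.ConnState) {
            switch state {
            case http.StateNew:
                log.Printf("conns open: %d", atomic.AddInt64(&open, 1))
            case http.StateClosed, http.StateHijacked:
                atomic.AddInt64(&open, -1)
            }
        },
    }

    ln, err := net.Listen("tcp", srv.Addr)
    if err != nil {
        log.Fatal(err)
    }
    // Cap concurrent connections instead of accepting everything and
    // letting per-connection throughput starve.
    log.Fatal(srv.Serve(netutil.LimitListener(ln, 4096)))
}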
I wrote a web server that uses Go's net/http but not http.Serve, for this reason: I wanted more control over accept and close. The nice thing is that the http library is decently composed, so you can take all the parts you want and build up.
You can use OpenTelemetry to get traces and metrics for Go's net/http, but since I'm not using it I have no idea what metrics are already supported.
BTW, Google does use Go services at scale (I'm not a Googler), but they probably do it like they do with App Engine and just limit each service to a certain number of req/s.
> You can use OpenTelemetry to get traces and metrics for Go's net/http
Yes, but not server-internal metrics, as IIRC the telemetry provider plugs into http.Server as a middleware. It only exposes the kinds of metrics you could calculate yourself in a handler; it doesn’t expose any of the “good stuff” (e.g. work queues, accept(2) latency, etc.)
Nice, this is kind of why I made Project Conncept. It's a powerful TCP and UDP stream multiplexer based on Caddy: https://github.com/mholt/caddy-l4
You can route raw TCP connections by using higher layer protocol matching logic like HTTP properties, SSH, TLS ClientHello info, and more, in composable routes that let you do nearly anything.
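For example, routing by TLS SNI to a backend looks roughly like this in the JSON config (addresses are placeholders; see the README for complete examples):

{
  "apps": {
    "layer4": {
      "servers": {
        "example": {
          "listen": [":443"],
          "routes": [
            {
              "match": [
                { "tls": { "sni": ["blogs.example.com"] } }
              ],
              "handle": [
                { "handler": "proxy", "upstreams": [ { "dial": ["10.0.0.1:443"] } ] }
              ]
            }
          ]
        }
      }
    }
  }
}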
> You can route raw TCP connections by using higher layer protocol matching logic like HTTP properties, SSH, TLS ClientHello info, and more, in composable routes that let you do nearly anything.
How do you foresee such a setup handling QUIC? The encrypted connection IDs, 0-RTT handshakes, and roaming client and server IPs make it non-trivial to proxy connections transparently.
Good question; I'm not really sure! Will need to look into it, or have people contribute some ideas. Feel free to start a discussion on the issue tracker if you're interested in this!
I have a server that hosts several websites. I wanted some of them to be installed in a separate (systemd) container (because they belong to the same organization).
I use nginx's ssl_preread module to proxy https requests to the container or to another port depending on the SNI. This is what snid does in the article if I understood correctly (without the DNS lookup because I don't need it, but it is able to do it too). It works well and it's good that the nginx at the front does not need to have the SSL certificates. In this setup, Nginx does not need to decode anything, it just does a pass-through, so this is quite light. It is also way simpler to setup than an actual HTTP reverse proxy.
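Roughly what that looks like, for anyone curious (hostnames and upstream addresses are placeholders):

stream {
    map $ssl_preread_server_name $backend {
        blog.example.com    [2001:db8::10]:443;
        bugs.example.com    127.0.0.1:8443;
        default             127.0.0.1:8443;
    }

    server {
        listen 443;
        ssl_preread on;
        proxy_pass $backend;
    }
}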
This is great. I think SNI is currently one of the most pragmatic tools to work around ipv4 exhaustion.
I like the approach described here, but in practice I prefer the convenience of having a reverse proxy to automatically handle TLS certs for me. That said, libraries like certmagic are making it more feasible for every app to manage its own certs.
See also DNS SVCB/HTTPS records (a pseudo-NAT at the DNS layer), but not many deployments use or understand them. Note that SNI-based routing works nicely for TCP without much (user-space) complication. With QUIC, transparently proxying connections isn't all that straightforward.
QUIC hides and encrypts everything it possibly can, and that includes server and client identities (IPs are inconsequential to a QUIC session, which is instead maintained through connection IDs exchanged under TLS encryption or otherwise obfuscated; that is, there is no way for a middlebox to analyse a QUIC session/flow without actually MITMing TLS).
As part of my effort to go single-stack v6 with minimum-effort v4 for my self-hosted services and projects, I've always wanted a way to avoid per-app reverse proxy configuration; with v6, each new app can simply listen on its own address.
If the decoy hostnames (as the author describes it) used in encrypted SNI are deterministic such that you can statically determine the real hostname the client wants, then what’s the point of encrypted SNI in the first place?
ECH can't hide which backend the client wants, but if a particular backend handles multiple hostnames, it can hide which of those hostnames the client wants. I went into further detail here: https://news.ycombinator.com/item?id=31136335
Clever. I think I'd prefer to do TLS termination at the proxy though, something akin to stunnel. Of course snid could be used together with stunnel, but I think it would lose the O(1) configuration then. Just terminating tls and not touching the http content would still avoid any of those http parsing issues mentioned.
An earlier version did the TLS termination in the proxy. It's true you can avoid the HTTP parsing issues, but you lose the ability to do client certs or have backend-specific cipher/TLS version requirements. Also, I really like it that IPv6 clients can connect directly to the backend, bypassing any proxies.
I'm a huge fan of this approach, but I would also combine it almost equally with standard HTTP reverse proxies; there's a lot you can gain from having a proxy that can understand paths, buffer requests, etc.