Almost no modern systems run a torus topology these days - at least not at the node level. Backbone links are still occasionally designed that way, although Dragonfly+ or similar is much more common and maps better onto modern switch silicon.
You're spot on that the bandwidth available in these machines hugely outstrips that of common cloud rack-scale cluster designs, although full bisection bandwidth hasn't been a design goal for larger systems in a number of years.
LambdaLabs' GPU clusters provide internode bandwidth of 3.2Tbps: I personally verified it on a cluster of 64 nodes (8x H100 servers), and they claim it holds up to a 5k-GPU cluster. What is the internode bandwidth of Frontier? Someone claimed it's 200Gbps, which, if true, would be a huge bottleneck for some ML models.
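For context, here's the back-of-envelope math on the figures quoted in this thread. It assumes the 3.2Tbps number is per-node injection bandwidth on an 8-GPU node (roughly one 400Gbps NIC per GPU), which is my reading rather than something stated above:

    # Rough comparison of the numbers quoted in this thread.
    # Assumption (not stated above): 3.2 Tbps is per-node injection bandwidth
    # on an 8x H100 node, i.e. roughly one 400 Gbps NIC per GPU.

    GPUS_PER_NODE = 8
    lambda_node_gbps = 3200        # claimed per-node bandwidth
    rumored_other_gbps = 200       # the unverified per-node figure mentioned above

    print(f"Per GPU (Lambda claim): {lambda_node_gbps / GPUS_PER_NODE:.0f} Gbps")
    print(f"Ratio if both figures are per node: {lambda_node_gbps / rumored_other_gbps:.0f}x")

Of course, whether that 200Gbps figure is per node, per NIC, or per link changes the comparison entirely.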
Bisection bandwidth is the metric these systems cite, and it determines how the largest simulations behave. Inter-node bandwidth isn't a direct comparison, and it can be higher at modest node counts as long as you stay within a single switch. I haven't seen a network diagram for LambdaLabs, but it looks like they're building off 200Gbps InfiniBand once you get outside NVLink. So they'll have higher bandwidth within each NVLink island, but performance will drop once you need to cross islands.
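To make the "within a switch vs. across islands" point concrete, here's a toy model of a two-level leaf/spine fabric with an oversubscription ratio. The topology and numbers are illustrative only, not LambdaLabs' actual design:

    # Toy model: why per-node bandwidth looks great inside one leaf switch
    # (or one island) and then drops once traffic has to cross leaves.
    # Numbers are made up for illustration.

    def per_node_bandwidth_gbps(nodes_per_leaf, downlink_gbps,
                                uplinks_per_leaf, uplink_gbps):
        """Per-node bandwidth when staying under one leaf vs. crossing leaves."""
        within_leaf = downlink_gbps  # full downlink rate to peers on the same leaf
        uplink_total = uplinks_per_leaf * uplink_gbps
        # If every node on the leaf talks to other leaves at once,
        # they share the leaf's uplink capacity.
        across_leaves = min(downlink_gbps, uplink_total / nodes_per_leaf)
        return {"within_leaf_gbps": within_leaf, "across_leaves_gbps": across_leaves}

    # Non-blocking leaf (1:1): uplink capacity matches downlink capacity.
    print(per_node_bandwidth_gbps(nodes_per_leaf=16, downlink_gbps=400,
                                  uplinks_per_leaf=16, uplink_gbps=400))
    # 2:1 oversubscribed leaf: cross-leaf bandwidth per node drops to
    # 200 Gbps even though every port is 400 Gbps.
    print(per_node_bandwidth_gbps(nodes_per_leaf=16, downlink_gbps=400,
                                  uplinks_per_leaf=8, uplink_gbps=400))

The same logic applies one level up: full bisection means the spine can carry every leaf's uplinks at once; oversubscribing there is where the cross-island drop shows up.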
I thought NVLink was only for communication between GPUs within a single node, no? I don't know how big their switches are, but I verified that within a 64-node cluster I got the full advertised 3.2Tbps. That's 4x what four 200Gbps links would give you, but even 800Gbps is probably not a bottleneck for any real-world workload.
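For anyone wanting to reproduce that kind of measurement, a rough sketch of the usual approach is below: time a large NCCL all-reduce across nodes and convert it to bus bandwidth. The buffer size and iteration count are arbitrary placeholders; nccl-tests' all_reduce_perf does the same thing more carefully. Note the result is per rank, so you'd aggregate across the local ranks when comparing against a per-node figure like 3.2Tbps.

    # Minimal bandwidth-measurement sketch: time a large all-reduce over NCCL
    # and report bus bandwidth. Launch with torchrun across the nodes under test.
    import os
    import time
    import torch
    import torch.distributed as dist

    def measure_allreduce_busbw(size_bytes=1 << 30, iters=20):
        dist.init_process_group(backend="nccl")
        rank = dist.get_rank()
        world = dist.get_world_size()
        torch.cuda.set_device(int(os.environ.get("LOCAL_RANK", 0)))

        x = torch.ones(size_bytes // 4, dtype=torch.float32, device="cuda")

        # Warm-up so connection setup isn't included in the timing.
        for _ in range(5):
            dist.all_reduce(x)
        torch.cuda.synchronize()

        start = time.perf_counter()
        for _ in range(iters):
            dist.all_reduce(x)
        torch.cuda.synchronize()
        elapsed = (time.perf_counter() - start) / iters

        # Standard NCCL bus-bandwidth formula for ring all-reduce:
        # each rank moves 2*(N-1)/N of the buffer over the network.
        busbw_gbps = (size_bytes * 2 * (world - 1) / world) / elapsed * 8 / 1e9
        if rank == 0:
            print(f"{world} ranks: ~{busbw_gbps:.0f} Gbps bus bandwidth per rank")
        dist.destroy_process_group()
        return busbw_gbps

    if __name__ == "__main__":
        measure_allreduce_busbw()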