If multiple cores try to get the same memory addresses, the MMU feeds only one...

whatshisface · 2025-08-28T17:16:27 1756401387

Why is that? It seems like multiple cores requesting the same address would be easier for the MMU to fetch for, not harder.

recursivecaveat · 2025-08-28T22:39:23 1756420763

Not necessarily the exact same address (you can fix that in a program anyway with a broadcast tree), but the same memory bank. Imagine 1000 trains leaving a small town at the same time, instead of 1000 trains leaving 1000 different towns simultaneously. At some point there are not enough transportation resources to get stuff out of a particular area at the desired parallelism.
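
To put rough numbers on the train analogy, here is a toy model, not a description of any real GPU. It assumes 32 banks, 4-byte words, word-interleaved bank mapping, and one request served per bank per cycle. All of those figures, and the `cycles_needed` helper, are made up for illustration.

```python
from collections import Counter

def cycles_needed(addresses, num_banks=32, word_bytes=4):
    """Toy model (assumption): each bank serves one request per cycle,
    so the busiest bank determines how long the whole batch takes."""
    banks = Counter((addr // word_bytes) % num_banks for addr in addresses)
    return max(banks.values(), default=0)

# 32 requests to consecutive words: one request per bank, all in parallel.
spread = [i * 4 for i in range(32)]

# 32 requests with a stride of 32 words: every request lands in bank 0.
clustered = [i * 32 * 4 for i in range(32)]

print(cycles_needed(spread))     # 1
print(cycles_needed(clustered))  # 32
```

Same number of requests in both cases, but the clustered pattern takes 32 times longer in this model because one bank has to handle everything serially.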

reliabilityguy · 2025-08-28T17:40:15 1756402815

The problem isn't fetching the data; it's serving it to many cores at the same time from a single source.

supersour · 2025-08-28T20:42:31 1756413751

I'm not familiar with GPU architecture. Is there not a shared L2/L3 data cache from which this data would be served?

reliabilityguy · 2025-08-29T17:28:17 1756488497

The MMU has a finite number of ports that drive the data to the consumers. An extreme case: all 32 cores want the same piece of data at the same time.
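
As a back-of-the-envelope sketch of that extreme case: if the source can feed only a fixed number of consumers per round and there is no multicast, delivery is serialized. The port counts below are hypothetical, chosen only to show the arithmetic, and `delivery_rounds` is an illustrative helper, not a real API.

```python
import math

def delivery_rounds(consumers, ports):
    """Rounds needed when a single source feeds at most `ports`
    consumers per round and nothing is multicast (assumption)."""
    return math.ceil(consumers / ports)

# 32 cores all want the same data; port counts are hypothetical.
for ports in (1, 4, 32):
    print(ports, delivery_rounds(32, ports))
```

With one port per core everything goes out in a single round; with fewer ports the same request pattern stretches over several rounds.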

qrios · 2025-08-28T17:22:43 1756401763

This is not my domain, but I assume the MMUs act like a switch and something like multicast is not available here. I've tried to implement something like that on an FPGA and it was extremely cost-intensive.