For two sensors of the same resolution (= number of photosensitive cells), a physically larger sensor will have a greater number of photons hitting each photosite. This means better signal-to-noise ratio (with fewer photons, the base electrical noise is larger relative to the signal) and better dynamic range.
So inescapably, due to basic laws of physics, for two identically implemented sensors of the same resolution, the larger one will always be better.
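To put rough numbers on that, here is a minimal sketch of per-photosite signal-to-noise ratio. It assumes photon arrival is Poisson-distributed (so shot noise is the square root of the photon count), that every photon is converted to one electron, and that read noise adds in quadrature. The `photosite_snr` name, the 2-electron read noise and the photon counts are made-up values for illustration, not figures from any real sensor.

```python
import math

def photosite_snr(photons, read_noise_e=2.0):
    """SNR of a single photosite under the assumptions stated above."""
    # Shot noise variance equals the photon count; read noise adds in quadrature.
    return photons / math.sqrt(photons + read_noise_e ** 2)

small = photosite_snr(1_000)       # hypothetical small photosite
large = photosite_snr(1_000 * 4)   # same exposure, photosite with 4x the area

print(f"small: {small:.1f}  large: {large:.1f}  ratio: {large / small:.2f}")
```

With these assumptions, four times the photosite area buys roughly twice the SNR, i.e. SNR grows with the square root of the light collected.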
This doesn't mean you can't make minuscule sensors with 41MP -- you certainly can. There can even be advantages to doing so. The Nokia PureView cameras were based on a novel concept: capture very high resolution images (41MP), then downsample them to a reasonable size (something like 12MP? I forget what it does exactly) in post-processing. Because the noise is essentially random, downsampling smooths it out while retaining huge amounts of detail. It is a tradeoff -- you trade worse dynamic range for better detail -- but it worked really well for a smartphone at the time.
If you were to pixel peep the 41 MP files before downsampling they would look horrible, especially at higher sensitivities.
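The noise-averaging idea can be sketched with a quick simulation. This is only an illustration, not how the PureView pipeline actually worked: it assumes a flat grey patch with independent Gaussian noise per pixel, and simply averages groups of `bin_size` neighbouring pixels. The function name, the 4:1 ratio and the noise level are all made up for the example.

```python
import random
import statistics

def binned_noise(n_pixels, sigma, bin_size, seed=0):
    """Return (noise before, noise after) averaging groups of bin_size pixels."""
    rng = random.Random(seed)
    # Flat grey patch at level 100 with independent Gaussian noise per pixel.
    pixels = [rng.gauss(100.0, sigma) for _ in range(n_pixels)]
    binned = [
        statistics.fmean(pixels[i:i + bin_size])
        for i in range(0, n_pixels - bin_size + 1, bin_size)
    ]
    return statistics.pstdev(pixels), statistics.pstdev(binned)

before, after = binned_noise(n_pixels=40_000, sigma=10.0, bin_size=4)
print(f"noise before: {before:.2f}  after 4:1 averaging: {after:.2f}")
```

Averaging n independent samples cuts the noise by about the square root of n, so 4:1 should roughly halve it. Real sensor noise is not perfectly independent between pixels, so treat this as an upper bound on the benefit.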
> For two sensors of the same resolution (= number of photosensitive cells), a physically larger sensor will have a greater number of photons hitting each photosite.
Nitpick: this assumes that you're holding the f number constant. In practice smaller sensors tend to be used with smaller f numbers, which somewhat offsets the effect (especially if you are not someone who's given to shooting everything wide open on your DSLR).
A more useful way to think about it is to forget about the sensor and just consider the absolute diameter of the aperture (for a given angle of view). Your phone's aperture is a few mm in diameter. If you're shooting at the same angle of view with your DSLR, then the amount of additional light hitting its sensor (as compared to the phone) is in proportion to the additional area of the DSLR's aperture, i.e. the square of the ratio of the two diameters. So if you're shooting at, say, f16, you may not be getting any more light than the phone is at f1.8.
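As a rough worked example of the aperture-diameter view: the entrance pupil diameter is the focal length divided by the f-number, and the light gathered scales with pupil area. The focal lengths below (4 mm for the phone, 28 mm for the DSLR, chosen to give a similar angle of view) are assumed values for illustration only, as are the function names.

```python
import math

def entrance_pupil_mm(focal_length_mm, f_number):
    """Entrance pupil diameter = focal length / f-number."""
    return focal_length_mm / f_number

def light_advantage_stops(pupil_a_mm, pupil_b_mm):
    """Stops of extra light through pupil A relative to pupil B."""
    # Light gathered scales with pupil area, i.e. diameter squared.
    return 2 * math.log2(pupil_a_mm / pupil_b_mm)

phone = entrance_pupil_mm(4.0, 1.8)    # hypothetical phone lens at f/1.8
dslr = entrance_pupil_mm(28.0, 16.0)   # hypothetical DSLR lens stopped down to f/16

print(f"phone pupil: {phone:.2f} mm, DSLR pupil: {dslr:.2f} mm")
print(f"DSLR advantage: {light_advantage_stops(dslr, phone):+.2f} stops")
```

With these particular numbers the DSLR's pupil is actually slightly smaller than the phone's, so the advantage comes out negative.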
> Nitpick: this assumes that you're holding the f number constant. In practice smaller sensors tend to be used with smaller f numbers, which somewhat offsets the effect (especially if you are not someone who's given to shooting everything wide open on your DSLR).
Well, most phone cameras seem to be around f/2, some slightly above, some a little below. The archetypal nifty fifty is f/1.8 or f/2 as well, and primes in that range are usually available for most applications and reasonable in price. Slightly slower primes at f/2.8 are often also available and cheaper. So like for like, you'd expect a full-frame camera to have at least 6 EVs lower noise than your average 1/3.something inch phone camera sensor (crop factor of ~10, area difference of ~100, log2(100) ≈ 6.6).
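Spelling out that arithmetic as a small sketch: at the same f-number and shutter speed, the total light collected scales with sensor area, which is the crop factor squared. The crop factor of 10 is the rough figure used above; the function name is just for illustration.

```python
import math

def stops_from_crop_factor(crop_factor):
    """Stops of extra total light for the larger sensor at equal f-number."""
    area_ratio = crop_factor ** 2   # linear crop factor -> area ratio
    return math.log2(area_ratio)    # one stop per doubling of light

print(f"{stops_from_crop_factor(10):.2f} stops")
```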
Your entrance pupil metric is really just a roundabout way to compensate for the crop factor of the sensor to get to the same FoV. The relevant property for exposure is the f-stop.
F-stop is the relevant property for exposure, but not for the total amount of light collected by the sensor, which is what determines the noise level (all else being equal). Exposure is light per unit area.
I do think that focusing on sensor size is unhelpful when thinking about noise levels. Big sensors do not magically collect more light simply in virtue of being bigger. They can only do so if you’re able to put a bigger hole in front of them (again, holding constant the angle of view). The use of very wide apertures is inherently more practical with smaller sensors.
As for using wide apertures with a DSLR, this is of course possible, but only in cases where a shallow depth of field is acceptable. On a cell phone camera f1.8 will almost always give sufficient depth of field. Realistically speaking most photos on a DSLR will be taken a few stops down from that.
It’s undoubtedly the case that DSLRs have an advantage over phones in terms of noise levels, but you have to consider the whole optical system to estimate the magnitude of the difference, not just the size of the sensor.
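One way to fold both views into a single estimate, as a sketch only: total light is exposure (which goes as 1 over the f-number squared) times sensor area (which goes as the crop factor squared), assuming the same scene, shutter speed and angle of view, and ignoring lens transmission and sensor efficiency. The function name and the f/5.6 example are assumptions for illustration.

```python
import math

def total_light_advantage_stops(crop_factor, f_number_large, f_number_small):
    """Stops of extra total light for the large-sensor camera."""
    sensor_term = 2 * math.log2(crop_factor)                        # area advantage
    aperture_term = 2 * math.log2(f_number_large / f_number_small)  # stopping-down penalty
    return sensor_term - aperture_term

# Both at f/2: the full sensor-area advantage applies.
print(f"{total_light_advantage_stops(10, 2.0, 2.0):.2f} stops")
# Large sensor stopped down to f/5.6, phone at f/1.8.
print(f"{total_light_advantage_stops(10, 5.6, 1.8):.2f} stops")
```

The advantage vanishes once the large-sensor f-number reaches the small-sensor f-number times the crop factor, which is the point where both entrance pupils are the same size.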
I had a Nokia 808 PureView with a 38MP camera and a real flash. You could save the full 38MP JPEG, and in daylight the quality was really good for the time, but it performed badly in low light and saving a picture at 38MP was slow.
I wouldn't try to keep a full pixel array in main memory when it is so limited. I'd keep the image in storage and decode as needed. It's a lot of software work to implement a special decoder and you'll always face a significant performance penalty, but it's doable.