Bear in mind the photos will usually be viewed on a 1080p phone screen, which is ~2MP, or on a 4K display, which is ~8MP. The sensor has been getting bigger since the 6s, which means more light for each pixel.
Camera pixels are not like screen pixels; they are more like screen subpixels. Each one sits behind a single color filter arranged in a Bayer pattern, so a 2MP screen corresponds more closely to a 6MP camera sensor than to a 2MP one. By the same logic, a 4K display is already equivalent to a 24MP camera sensor, which is quite a lot - many modern full-frame cameras have 24MP, and anything more than that is considered high-resolution.
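For anyone who wants to check the numbers, here is a rough sketch of the arithmetic. It assumes every screen pixel is made of exactly three subpixels (R, G, B), which is not true of every panel type; the function names are just made up for illustration.

```python
# Pixel and subpixel counts for two common display resolutions.
# Assumption: each screen pixel has exactly 3 subpixels (R, G, B).
DISPLAYS = {"1080p": (1920, 1080), "4K UHD": (3840, 2160)}

def megapixels(width, height):
    """Total pixel count in millions."""
    return width * height / 1e6

def subpixel_equivalent_mp(width, height, subpixels_per_pixel=3):
    """Subpixel count in millions, i.e. the 'camera sensor equivalent'."""
    return megapixels(width, height) * subpixels_per_pixel

for name, (w, h) in DISPLAYS.items():
    print(f"{name}: {megapixels(w, h):.1f} MP, "
          f"{subpixel_equivalent_mp(w, h):.1f} M subpixels")
# 1080p: 2.1 MP, 6.2 M subpixels
# 4K UHD: 8.3 MP, 24.9 M subpixels
```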
He does have a point in that a 12 MP CMOS sensor will have 12 million sensor elements, not 36 million. Colour filters are placed in front of each pixel, so RGB data can be extracted. Usually, 1/4 of the pixels are R, 1/4 are B and 1/2 are G. The raw sensor data for each pixel thus contains either R, G, or B, of varying intensities, depending on the passbands of each filter. The data is combined using a demosaicing/debayering filter/algorithm to extract subpixel data. That is, surrounding colour information is combined so that each pixel has R, G, and B elements.
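To make the description above a bit more concrete, here is a minimal sketch, assuming an RGGB layout (real sensors may start the pattern on a different corner). It is not what any actual camera pipeline does: it simply averages the same-colour samples in each 3x3 neighbourhood, which is about the crudest demosaic possible. The helper names are hypothetical.

```python
# Assumed RGGB Bayer layout: even rows go R G R G ..., odd rows go G B G B ...
def bayer_color(row, col):
    """Return which colour filter sits over the photosite at (row, col)."""
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"

def count_filters(height, width):
    """Count how many photosites carry each filter colour."""
    counts = {"R": 0, "G": 0, "B": 0}
    for r in range(height):
        for c in range(width):
            counts[bayer_color(r, c)] += 1
    return counts

def demosaic_pixel(raw, row, col):
    """Estimate (R, G, B) at one photosite by averaging the same-colour
    samples in its 3x3 neighbourhood. Needs an image of at least 2x2."""
    height, width = len(raw), len(raw[0])
    sums = {"R": 0.0, "G": 0.0, "B": 0.0}
    hits = {"R": 0, "G": 0, "B": 0}
    for r in range(max(0, row - 1), min(height, row + 2)):
        for c in range(max(0, col - 1), min(width, col + 2)):
            colour = bayer_color(r, c)
            sums[colour] += raw[r][c]
            hits[colour] += 1
    return tuple(sums[k] / hits[k] for k in ("R", "G", "B"))

# A flat test scene: every photosite records only the channel its filter passes.
scene = {"R": 100, "G": 200, "B": 50}
raw = [[scene[bayer_color(r, c)] for c in range(4)] for r in range(4)]

print(count_filters(4, 4))        # {'R': 4, 'G': 8, 'B': 4}
print(demosaic_pixel(raw, 1, 2))  # (100.0, 200.0, 50.0)
```

Each photosite stores one number, and the other two channels at that location are filled in from its neighbours.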
Sorry if the writeup isn't that specific, I mostly work with monochrome CMOS cameras.
edit: I should also state that I don't know anything about iPhone cameras. It's quite possible, but not typical, that they have a 36 MP sensor producing 12 MP images.
edit 2: I read that the iPhone 12 has 1.7 um pixels. A 36 MP 4:3 sensor with 1.7 um pixels would be about 11.8 mm wide. A 12 MP 4:3 sensor would be just 6.8 mm wide.
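A quick check of that arithmetic, assuming square pixels and an exact 4:3 aspect ratio (the function name is just for illustration). The width comes out to 6.8 mm for 12 MP and roughly 11.8 mm for 36 MP.

```python
import math

def sensor_size_mm(megapixels, pixel_pitch_um, aspect=(4, 3)):
    """Return (width_mm, height_mm) of a sensor with square pixels.

    Assumes the pixel count fills the aspect ratio exactly."""
    aspect_w, aspect_h = aspect
    total_pixels = megapixels * 1e6
    width_px = math.sqrt(total_pixels * aspect_w / aspect_h)
    height_px = total_pixels / width_px
    # Pixel pitch is in micrometres; divide by 1000 to get millimetres.
    return width_px * pixel_pitch_um / 1000, height_px * pixel_pitch_um / 1000

for mp in (12, 36):
    width_mm, height_mm = sensor_size_mm(mp, 1.7)
    print(f"{mp} MP: {width_mm:.1f} mm x {height_mm:.1f} mm")
# 12 MP: 6.8 mm x 5.1 mm
# 36 MP: 11.8 mm x 8.8 mm
```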
No, the sensor has rows of 4000 grayscale pixels with different color filters on them. The actual RGB resolution is a quarter of that, but the debayering algorithm upscales the data by 2x in each direction. So yes, the RGB resolution is the same as the subpixel resolution, but at the same time it isn't.
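In numbers, as a rough sketch: this assumes a 4000 x 3000 sensor (the 3000 rows are my assumption, taken from a 12 MP 4:3 layout) and counts one complete colour sample per 2x2 block of photosites. The function name is hypothetical.

```python
def bayer_resolutions(width, height):
    """Compare photosite count, complete 2x2 colour cells, and output size."""
    photosites = width * height
    # One 2x2 block (R, G, G, B) is the smallest unit holding all three colours.
    full_colour_cells = (width // 2) * (height // 2)
    # Debayering interpolates back up to one RGB pixel per photosite.
    output_pixels = photosites
    return {
        "photosites": photosites,
        "full_colour_cells": full_colour_cells,
        "output_pixels": output_pixels,
    }

counts = bayer_resolutions(4000, 3000)
print(counts["photosites"] / 1e6)         # 12.0
print(counts["full_colour_cells"] / 1e6)  # 3.0
print(counts["output_pixels"] / 1e6)      # 12.0
```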
You're correct. The numbers that are promoted for just about any camera out there refer to the actual size of the output, not the number of elements in the sensor. I'm not sure where the parent comment was getting his info from.
The actual size of the output is not the actual size of the sensor. The color data is interpolated. The promoted size is the output size, but that's not really full subpixel resolution in the monitor sense.
The promoted size is the number of photosites on the sensor, but each photosite is grayscale. Look at any RAW camera format or the datasheet of a sensor (e.g. one of the popular Sony sensors). All of that applies to Bayer sensors and not Foveon, but Foveon is not particularly popular by any measure.
I've played with a friend's Samsung S21 Ultra (108 MP) and it blows my iPhone 12 Pro Max out of the water, provided you can keep it stable - portraits were insanely realistic, yet I was unable to capture a single photo of my toddler!