Why would a microkernel be smaller than a monolithic kernel, given the same feature set? Unless you claim that async message passing leads to shorter code than procedure calls.
Because the code gets spread over multiple processes, you get process-based isolation for each module, where normally they would all live within the same address space. So you may be able to break a module, but you can't use that to elevate yourself to lord of the realm.
I very much agree that a microkernel can be both more robust and more secure. I'm arguing against it being easier to understand. Modularity does not require address space separation, which is a runtime property, not a code organization feature.
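To make that concrete, here's a minimal sketch in Python (every name in it is made up for illustration, not taken from any real kernel). Two "modules" share one address space and talk through plain procedure calls, yet the boundary between them is an explicit interface in the source:

```python
from typing import Protocol


class BlockDevice(Protocol):
    """Hypothetical interface: the only thing the file system may rely on."""

    def read_block(self, index: int) -> bytes: ...
    def write_block(self, index: int, data: bytes) -> None: ...


class RamDisk:
    """A 'driver' module; it satisfies BlockDevice structurally."""

    BLOCK_SIZE = 16

    def __init__(self, blocks: int) -> None:
        self._blocks = [bytes(self.BLOCK_SIZE) for _ in range(blocks)]

    def read_block(self, index: int) -> bytes:
        return self._blocks[index]

    def write_block(self, index: int, data: bytes) -> None:
        # Truncate or zero-pad so every stored block has the same size.
        self._blocks[index] = data[: self.BLOCK_SIZE].ljust(self.BLOCK_SIZE, b"\0")


class TinyFs:
    """A 'file system' module that knows the interface, not the driver."""

    def __init__(self, dev: BlockDevice) -> None:
        self._dev = dev

    def store(self, index: int, text: str) -> None:
        self._dev.write_block(index, text.encode())

    def load(self, index: int) -> str:
        return self._dev.read_block(index).rstrip(b"\0").decode()


# Same process, same address space, ordinary procedure calls.
fs = TinyFs(RamDisk(blocks=4))
fs.store(0, "hello")
result = fs.load(0)
```

Nothing stops TinyFs from scribbling over RamDisk's internals at runtime, which is exactly the robustness and security gap. But to a reader the code is just as modular as it would be with a process boundary in between.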
The one tends to go hand-in-hand with the other, in my experience. Of course, that's anecdata, but the more complex projects I've worked on that used message passing separated much more naturally and cleanly into isolated chunks of code that produced their own binaries. Essentially, message passing and microkernels dictate a service-based approach, which is an excellent match for OS development.
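A rough sketch of what I mean by services, in Python. Big caveat: threads and `queue.Queue` stand in here for separate processes and kernel IPC, so there is no isolation at all in this toy, and `name_service` and `call` are hypothetical names. The point is only the shape of the code: the service owns its state, and the message protocol is the entire interface.

```python
import queue
import threading


def name_service(inbox: queue.Queue) -> None:
    """Hypothetical 'server': its table is reachable only through messages."""
    table = {}
    while True:
        op, args, reply = inbox.get()
        if op == "register":
            name, port = args
            table[name] = port
            reply.put(("ok", None))
        elif op == "lookup":
            (name,) = args
            reply.put(("ok", table.get(name)))
        elif op == "stop":
            reply.put(("ok", None))
            return
        else:
            reply.put(("error", f"unknown op {op!r}"))


def call(inbox: queue.Queue, op: str, *args):
    """Synchronous request/reply layered on top of the async queues."""
    reply = queue.Queue(maxsize=1)
    inbox.put((op, args, reply))
    return reply.get(timeout=5)


inbox = queue.Queue()
server = threading.Thread(target=name_service, args=(inbox,))
server.start()

call(inbox, "register", "console", 7)
found = call(inbox, "lookup", "console")   # known name
missing = call(inbox, "lookup", "disk")    # never registered
bad = call(inbox, "format")                # not part of the protocol
call(inbox, "stop")
server.join()
```

Once the only way in is a message, pulling `name_service` out into its own binary is mostly a matter of swapping the transport, which is why these designs tend to fall apart into clean pieces.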
Less facetiously, some microkernels have been proven formally correct. You need a small code base for a formal correctness proof to be feasible.
And even on an intuitive level, if the code is small I can "hold" it all in my head - at least really grok it in a way you can't with 1E6 LOC.