LLMs do not have hidden internal reasoning, so the yapping is an essential part of producing a correct answer: the model needs those intermediate tokens to carry the computation through.
Reasoning models mostly work by arranging things so the yapping happens first and is marked off so the UI can hide it.
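Roughly, the UI side of that is just splitting on whatever markers the model uses. Here's a minimal sketch in Python, assuming the reasoning is wrapped in `<think>...</think>` delimiters; the actual markers (special tokens, separate response fields, etc.) vary by model, so treat this as illustrative only:

```python
import re

# Hypothetical delimiter convention; real models mark reasoning in
# different ways (special tokens, a separate field in the API response).
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(raw_output: str) -> tuple[str, str]:
    """Return (thinking, answer) from a single raw model response."""
    match = THINK_RE.search(raw_output)
    if not match:
        # No marked reasoning found: treat the whole thing as the answer.
        return "", raw_output.strip()
    thinking = match.group(1).strip()
    answer = raw_output[match.end():].strip()
    return thinking, answer

raw = "<think>User wants a haiku about rain. Count syllables...</think>Soft rain on the roof..."
thinking, answer = split_reasoning(raw)
print(answer)      # what the UI shows by default
# 'thinking' goes behind a "show reasoning" toggle, or is dropped entirely
```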
My favorite is when it does all that thinking and then the answer doesn't use any of it.
Like if you ask it to write a story, the thinking will often work through five plots or sets of character names, and then the answer goes with something entirely different.