How do we know the TLM is any more accurate than the LLM (especially if it's not trained on any local data)? If determining veracity were that simple, LLMs would just incorporate a fact-checking stage.
You might be thinking of LLM as-a-judge, where one simply asks another LLM to fact-check the response. Indeed that is very unreliable due to LLM hallucinations, the problem we are trying to mitigate in the first place.
TLM is instead an uncertainty estimation technique applied to LLMs, not another LLM model.
How do we know the TLM is any more accurate than the LLM (especially if it's not trained on any local data)? If determining veracity were that simple, LLMs would just incorporate a fact-checking stage.