
How do you evaluate quality? Also, I suspect the performance between models would vary between datasets. Heck, it would vary on the same model/source if you included that your mother was being held hostage and will be killed unless you summarize the source correctly :).

I think you are still stuck with trying it, seeing if it works for you, and hoping it generalizes beyond your evaluation.



I think summarization quality can only be a subjective criterion, measured with user studies and the like.

The task itself is not very well-defined. You want a lossy representation that preserves the key points -- this may require context that the model does not have. For technical/legal text, seemingly innocuous words can be very load-bearing, and removing them can completely change the semantics of the text; preserving them reliably requires complete context and reasoning.


There are actually some clever approaches to evaluating abstractive summarization.

Examples: https://eugeneyan.com/writing/evals/#summarization-consisten...
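One approach from that link: score factual consistency with an NLI model, treating the source as the premise and each summary sentence as the hypothesis, then flagging sentences the model doesn't rate as entailed. A minimal sketch, assuming sentence-transformers and the cross-encoder/nli-deberta-v3-base checkpoint (any NLI cross-encoder would do):

    from sentence_transformers import CrossEncoder

    # NLI cross-encoder; for this checkpoint the logit order is assumed
    # to be (contradiction, entailment, neutral) -- check the model card.
    model = CrossEncoder("cross-encoder/nli-deberta-v3-base")

    def consistency_score(source: str, summary_sentences: list[str]) -> float:
        """Fraction of summary sentences entailed by the source."""
        pairs = [(source, s) for s in summary_sentences]
        logits = model.predict(pairs)  # shape: (n_sentences, 3)
        entailed = sum(1 for row in logits if row.argmax() == 1)  # 1 = entailment
        return entailed / len(summary_sentences)

This catches hallucinated claims reasonably well, but says nothing about relevance -- a summary of all true-but-trivial sentences scores perfectly.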


>evaluate quality

[information content of summary] / [information content of original] for summaries of a given length cap?
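One cheap, model-free way to approximate that ratio is to stand in "information content" with compressed size, though that measures redundancy rather than salience. A rough sketch:

    import zlib

    def info_bytes(text: str) -> int:
        # Compressed size as a crude proxy for information content.
        return len(zlib.compress(text.encode("utf-8")))

    def retention_ratio(original: str, summary: str) -> float:
        return info_bytes(summary) / info_bytes(original)

Caveat: this rewards incompressible noise just as much as real content, so it's only a sanity check, not a quality metric.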



