Coverage: A Framework for Measuring What Language Models Actually Preserve
For the past few weeks I have been busy with an idea that keeps nagging me: I believe we are evaluating language models the wrong way. I am still working on a paper to prove the point, but the claim is already more than a hypothesis. Preliminary data I have...
May 251