I've seen a lot of news lately about the ways that particular LLMs score on particular tests.
Which if any of those tests can I go take online to see how my performance on them compares to the models?
On Twitter, IQ scores of (IIRC) 84 for ChatGPT and 96 for GPT-4 were making the rounds; maybe you're referring to those? I believe these scores are based on this freely available online test:
https://iqtest.com/take-the-test/
I took it on Wednesday just for fun. It's purely text-based but involves many different types of reasoning (including spatial reasoning). It's also a timed test, which arguably inflates the LLM scores compared to humans.
Not for all of them, but for many of them you can see data and other info here: https://paperswithcode.com/dataset/mmlu
I browsed around but cannot find the actual MMLU questions, or even one example question. How do I view them?
Not sure if you've seen this or not: https://mashable.com/article/openai-gpt-4-exam-scores
But that references a number of standardized tests, some of which I suspect you have also taken. Here are a couple of links to practice tests that might be good matches for you to try.
https://www.tests.com/Free-Practice-Tests
[Rewrite as I don't think the first comment was actually helpful.]
Thank you! I didn't see your first version of this, but your current version is helpful for the human-specific tests that they're benchmarked on :)
Is this one of those tests where you spend a lot of time answering the questions, and at the end there is "if you want to see the results, send money"?
Also, is there any reason to believe that the test was actually validated somehow, as opposed to the numbers just being completely made up?