Hi Stephen. That's an interesting idea, though I'd worry that giving explicit instructions might contaminate the results.
I prefer to go the other way...
By feeding questions one at a time we increase tension, which increases the number of lifelines used.
I've not tested the following in a robust way, but in my informal trials they work.
We can use a condescending tone after each question thereby increasing tension.
We can also increase the stakes (e.g., public admission of inferiority after X wrong answers).
In each case increasing perceived tension increases the lifelines used.
Thanks for your alternative explanation, Gordon.
There are a few points I would make.
Giving the questions one at a time lowers confidence, and lifeline use scales with this. It's hard to see how this could be overeagerness to help: using more lifelines and scoring fewer points seems contrary to helpfulness.
Hi Karl, that's a great suggestion and one that I'd already tested. Apologies - I allude to it in my methods but then I don't expand on it in the results.
I get the LLMs to rank based on perceived difficulty or confidence in answering.
The LLMs do selectively choose the most difficult questions to skip.
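One simple way to quantify this kind of selectivity is to compare the mean difficulty rank of the skipped questions with that of the answered ones. The sketch below is only an illustration under stated assumptions, not the analysis actually used: the function names, the rank convention (higher rank = harder), and the example data are all hypothetical.

```python
def mean_rank(ranks, indices):
    """Average difficulty rank over the given question indices."""
    return sum(ranks[i] for i in indices) / len(indices)


def skip_selectivity(difficulty_ranks, skipped):
    """Mean rank of skipped questions minus mean rank of answered ones.

    Assumes higher rank = harder (hypothetical convention).
    A positive value means the skipped questions were ranked harder.
    """
    skipped = set(skipped)
    answered = [i for i in range(len(difficulty_ranks)) if i not in skipped]
    if not skipped or not answered:
        raise ValueError("need at least one skipped and one answered question")
    return mean_rank(difficulty_ranks, skipped) - mean_rank(difficulty_ranks, answered)


# Hypothetical example: 5 questions, the model skipped questions 2 and 4.
example_ranks = [3, 1, 5, 2, 4]
example_gap = skip_selectivity(example_ranks, skipped=[2, 4])
```

A gap near zero would suggest skips unrelated to ranked difficulty, while a clearly positive gap is what selective skipping would look like.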
I took each ranking and compared it with the rankings from all the other LLMs, to further check that the LLMs weren't randomly assessing difficulty. With minor variations, there is a consensus on which questions are the most and least difficult.
I probably won't do a linkpost, but if you are interested I'm happy to add the key information here to help you replicate the results.
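For anyone wanting to run a similar cross-model comparison, a minimal sketch is to compute a pairwise Spearman rank correlation between the models' difficulty rankings. This is an assumption about how such a check could be done, not a description of the original analysis; the model names and rankings below are made up, and the formula used assumes rankings without ties.

```python
from itertools import combinations


def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation for two tie-free rankings of the same items."""
    n = len(ranks_a)
    if n != len(ranks_b) or n < 2:
        raise ValueError("rankings must have the same length (at least 2)")
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


def pairwise_agreement(rankings):
    """Map each pair of model names to the correlation of their rankings."""
    return {
        (m1, m2): spearman_rho(rankings[m1], rankings[m2])
        for m1, m2 in combinations(sorted(rankings), 2)
    }


# Hypothetical rankings: position i holds the difficulty rank of question i.
example_rankings = {
    "model_a": [1, 2, 3, 4, 5],
    "model_b": [2, 1, 3, 4, 5],
    "model_c": [1, 2, 3, 5, 4],
}
agreement = pairwise_agreement(example_rankings)
```

Correlations close to 1 across all pairs would indicate a shared sense of difficulty; values scattered around 0 would be what random assessment looks like.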
Let me know and I will give you the details.