I really enjoyed reading this story! It was a super cool mix of narrative and technical aspects. While reading, I noticed similarities between the world described in this story and the world model used as part of Bengio et al.'s Scientist AI proposal. Now, the Terrarium itself isn't a world model like Bengio describes, but I believe it could instead be a world that the world model generates theories about. Having Scientist AI generate theories about the Terrarium could lead to building intriguing theories about the emergent agent behavior and interactions...
I agree with a lot of what you're saying here. Homework that takes the form of guessing the teacher's password is not at all beneficial to the growth of reasoning skills. It only furthers the bad habit of repeating what you've heard because you know it's right, without actually learning what that word or phrase means. However, I don't think that this framework applies to upper-level coursework (roughly grades 9-12).
Homework at this level builds very well on what was learned in the classroom, and depending on the subject, such as with math or physics, there isn'...
This is a very good question. I think the biggest issue I see with cheating on homework, especially for a child, is that it ultimately becomes a crutch and hinders the growth of critical reasoning skills. Even as a college student, I see people who are entirely dependent on AI to complete their homework. Then, when finals and midterms come around, they struggle to learn concepts by themselves. I can imagine this would have even larger effects on a child, whose early years in school are the most vital to developing their reasoning skills. Despite the f...
This is a super cool research topic. Reading about this reminded me of ryan_greenblatt's post "Current AIs seem pretty misaligned to me", and how Greenblatt saw that, when faced with hard-to-verify tasks, AI would exhibit some concerning and potentially dangerous behaviors like overselling work quality and producing confident-sounding but overall shallow analysis.
Your LLM auditors seem to be placed in a similar situation when they try to solve design sabotages where it's hard to verify whether the sabotage was actually active sabotage or just a design decis...