If the strong AI has knowledge of the benchmarks (or can make correct guesses about how these were structured), then it might be able to find heuristics that work well on them, but not more generally. Some of these heuristics might seem more likely than not to humans.
Still seems like a useful technique if the more powerful model isn't much more powerful.
I really like the way that you've approached this pragmatically: "If you do X, which may be risky or dubious, at least do Y".
I suspect that there's a lot of alpha in taking a similar approach to other issues.
The second paper looks interesting.
(Having read through it, it's actually really, really good).
My take is that you can't define term X until you know why you're trying to define term X.
For example, if someone asks what "language" is, instead of trying to jump in with an answer, it's better to step back and ask why the person is asking the question.
For example, if someone asks "How many languages do you know?", they probably aren't asking about simple schemes like "one click = yes, two clicks = no". On the other hand, it may make sense to talk about such simple schemes in an introductory course on "human languages".
Asking "Well, what really is language?" independent of any context is naive.
I'd like access.
TBH, if it works great I won't provide any significant feedback, apart from "all good".
But if it annoys me in any way I'll let you know.
For what it's worth, I have provided quite a bit of feedback about the website in the past.
I want to see if it helps me with my draft document on proposed alignment solutions:
https://docs.google.com/document/d/1Mis0ZxuS-YIgwy4clC7hKrKEcm6Pn0yn709YUNVcpx8/edit#heading=h.u9eroo3v6v28
I think the benefits are adequately described in the post.
But I don't know if any of us have explicitly called for an AI pause, in part because it seems useless yet may still carry an opportunity cost.
The FLI Pause letter didn't achieve a pause, but it dramatically shifted the Overton Window.
Helps people avoid going down pointless rabbit holes.
I highly recommend this post. Seems like a more sensible approach to philosophy than conceptual analysis:
https://www.lesswrong.com/posts/9iA87EfNKnREgdTJN/a-revolution-in-philosophy-the-rise-of-conceptual
Thank you for your service!
For what it's worth, I feel that the bar for being a valuable member of the AI Safety Community is much more attainable than the bar of working in AI Safety full-time.