Bachelor's in general and applied physics. AI safety / agent foundations researcher wannabe.
I love talking to people, and if you are an alignment researcher, we will have at least one topic in common (though I am also very interested in hearing about topics unknown to me), so I encourage you to book a call with me: https://calendly.com/roman-malov27/new-meeting
Email: roman.malov27@gmail.com
GitHub: https://github.com/RomanMalov
TG channels (in Russian): https://t.me/healwithcomedy, https://t.me/ai_safety_digest
Idea status: butterfly idea
In real life, there are too many variables to optimize each one. But if a variable is brought to your attention, it is probably important enough to consider optimizing it.
Negative example: you don’t see your eyelids; they are doing their job of protecting your eyes, so there’s no need to optimize them.
Positive example: you tie your shoelaces; they are the focus of your attention. Can this process be optimized? Can you learn to tie shoelaces faster, or learn a more reliable knot?
Humans already do something like this, but they mostly consider optimizing a variable only when it annoys them. I suggest widening the consideration space, because the "annoyance" threshold is mostly emotional and was therefore probably tuned for a world with far fewer variables and much less room for improvement (though I only know evolutionary psychology at a very surface level and might be wrong).
To the first: I already addressed it in the "Why not just...?" part:
Add "hey I'm just probing you, please don't update on that query" in the query
That might decrease the update a bit, but insofar as the inquirer counterfactually adds that line only in cases where they actually need the answer for some hypothesis-specific reason, the oracle would still update somewhat.
To the second: that one might actually work; I don't see an obvious way it fails. Perhaps only in the scenario with an extremely smart oracle, which could somehow predict which question you actually want answered. But at that point it would be hard to stop it from updating on anything, so updating on the query would be the least of your problems. Though it only gives you an answer to 1 question at the cost of 1000 queries. If we want the full distribution, that would require 1000 × (number of hypotheses) queries, which is O(n) and beats my O(n!) suggestion, but it is still far from ~1 query per hypothesis (which would be ideal).
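To make the counting explicit, here is the comparison as I understand it (a sketch: m = 1000 masking queries per answered question is the figure from above, and treating every hypothesis as equally costly is my assumption):

```latex
% Query counts for n hypotheses, assuming each answered question costs
% m = 1000 masking queries (the figure from the comment above).
\[
\underbrace{m \cdot n}_{\text{masking scheme: } O(n)}
\;\ll\;
\underbrace{O(n!)}_{\text{my original suggestion}}
\qquad \text{but still} \qquad
m \cdot n \;\gg\; \underbrace{n}_{\text{ideal: one query per hypothesis}}
\]
```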
P(First|Second)
I think you meant P(Second|First) here.
I am a bit confused about what a 10x slowdown means. I assumed you meant going from … to … on the R&D coefficient, but the definition from the comment by @ryan_greenblatt seems to imply going from … to … (which, according to AI 2027 predictions, would result in a 6-month delay).
The definition I'm talking about:
8x slowdown in the rate of research progress around superhuman AI researcher level averaged over some period
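For concreteness, here is the arithmetic I'd use to convert the rate definition into a calendar delay (a sketch; the phase duration T is a parameter I'm introducing for illustration, not something from the post):

```latex
% If a k-times slowdown applies while AI is around the superhuman-AI-
% researcher level, and that phase takes time T at the original speed,
% then the phase now takes kT and the calendar delay is
\[
\text{delay} = kT - T = (k - 1)\,T .
\]
% E.g. with k = 8, a 6-month delay would correspond to T = 6/7 month of
% affected progress at the original speed (my back-calculation, not a
% figure from AI 2027).
```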
i.e., to see fresh comments
Where specifically is that assumption spelled out?
Also, I don't like that if I click on a post in the update feed and then refresh the page, I lose the post.
It might be that something is wrong with my internet, but this beige widget isn't loading.
People often say, "Oh, look at this pathetic mistake the AI made; it will never be able to do X, Y, or Z." But they would never say to a child who made a similar mistake that they will never amount to doing X, Y, or Z, even though the theoretical limits on humans are much lower than those on AI.