You are right, that is also a possibility. I only considered cases with one intervention, because the examples I've heard given for Goodhart's law only contain one (I'm thinking of UK monetary policy, the Soviet nail factory and other cases where some "manager" introduces an incentive toward a proxy to the system). However, multiple-intervention cases can also be interesting. Do you know of a real-world example where the first intervention on the proxy raised the target value, but the second, more extreme one did not (or vice versa)? My intuition suggests that in the real world those types of causal influences are rare, and I also don't think we can say that "P causes V" in those cases. Do you think that is too narrow a definition?
Let P be a proxy and V the target value. Based on past observations, P is correlated with V.
Increase P! (Either directly or by introducing a reward to the agents inside the system for increasing P, who cares)
P does not cause V
P causes V
Case 1: Wow, Goodhart is a genius! Even though I had a correlation, I increased one variable and the other did not increase!
Case 2: Wow, you are pedantic. Obviously if the relationship between the variables is so special that P causes V, Goodhart's law won't apply. If I increase the amount of weight lifted (proxy), then obviously I...
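To make the two cases concrete, here is a minimal toy simulation. It is only a sketch under my own assumptions: the hidden common cause `z`, the Gaussian noise levels and the function name `simulate` are all invented for illustration and are not part of the argument above. In both worlds P and V are correlated in the observational data; the intervention adds `boost` to P.

```python
import random
import statistics


def simulate(p_causes_v, boost, n=10_000, seed=0):
    """Return the mean of V after adding `boost` to every P (an intervention on P)."""
    rng = random.Random(seed)
    values = []
    for _ in range(n):
        z = rng.gauss(0, 1)                  # hypothetical hidden common cause
        p = z + rng.gauss(0, 0.1) + boost    # proxy, after the intervention
        if p_causes_v:
            v = p + rng.gauss(0, 0.1)        # Case 2: V depends on P itself
        else:
            v = z + rng.gauss(0, 0.1)        # Case 1: V depends only on z
        values.append(v)
    return statistics.mean(values)


# Effect of the intervention on V in each world (same seed, so same noise).
effect_case_1 = simulate(False, boost=1.0) - simulate(False, boost=0.0)
effect_case_2 = simulate(True, boost=1.0) - simulate(True, boost=0.0)
```

Because both runs share a seed, `effect_case_1` comes out as zero (V never sees the boost), while `effect_case_2` is the full size of the boost up to floating-point rounding.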
Replication Markets is going to start a new project focusing on COVID studies. Infos:
Your betrayal of the clique is very nice, hats off to you. I also liked your idea of getting others not that interested in the game to submit bots helping you. It's a pity it did not occur to me.
However, I think you, too, are overconfident about winning. I've run a simulation of the whole tournament till the 160th round with 8 bots (MatrixCrashingBot, TitforTatBot, PasswordBot1, PasswordBot2, EmptyCloneBot, earlybird, incomprehensiblebot, CliqueZviBot) and in the final equilibrium state there are three bots: earlybird, incomprehensiblebot and CliqueZviBot...
Disqualifying players for things they obviously wouldn't do if they knew the rules of the game seems pretty cruel. I hope isusr just deletes that line for you.
The links you posted do not work for me. (Nevermind)
Wow, you are really confident that you will win. There are 10 players in the clique, so even if there are no players outside the clique (a dubious assumption), a priori there is a 10% chance. If I had money I would bet with you.
I also think there is a good chance that a CloneBot wins. 10 possible members is a good number imo. I would say 80%.
I would say 70% for the (possibly accidental) betrayal.
Without seeing your jailbreak.py I can't say how likely it is that others are able to simulate you.
What does "act out" mean in this context?
Yes, I feared that some might think my friend is in the clique. However I couldn't just say that they are not in the clique, because that would have been too obvious. (like my other lie: "Yeah, I totally have another method for detecting being in a simulation even if the simulation runs in a separate process, but unfortunately I can't reveal it.") So I tried to imply it by speaking about him as if he is not in the conversation and him not commenting after I mentioned him. I hoped in case someone was planning to submit a simulator outside the clique they would try to sneakily inquire about whether my friend is in the clique or not and then I would have asked a random, not competing lesswronger to play the part of my friend.
Good to know. I'm a C++ guy, and C++ has a "one definition rule" not only for the translation unit but for the whole program, so I incorrectly assumed that Python is the same, even though the languages are obviously very different.
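For anyone else coming from C++, a tiny illustration of the difference. This is just a sketch with a made-up `Bot` class and `move` method, not the actual CloneBot code: in Python a later definition of the same name simply rebinds it, both inside a class body and on the class object afterwards.

```python
class Bot:
    def move(self):
        return "cooperate"

    # A second definition with the same name silently replaces the first;
    # Python raises no error and gives no warning here.
    def move(self):
        return "defect"


def patched_move(self):
    return "patched"


first = Bot().move()      # uses the later definition in the class body
Bot.move = patched_move   # methods can also be rebound after the class exists
second = Bot().move()
```

Here `first` is `"defect"` and `second` is `"patched"`, so code that only inspects the first definition it sees can be fooled by anything defined later.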
Maybe it's a little cheap to say this after you've revealed it, but it did actually occur to me that you might have deliberately made this weakness. Had I known that in Python you can redefine methods, I might have reported it, but the exploit with __new__() seemed pretty obscure (even though I didn't know the other way and I did know this). The possibility of this being a test was also the reason I went with the "Oh I'm so busy, I didn't have time to review the code.." excuse. I'm also curious whether Larion calculated with you deliberately planting the m...
After seeing Vanilla_Cabs's comment I lied to them about wanting to join the clique. I was undecided, but I figured seeing the code of the clique can be a great advantage if I can exploit some coding mistake and I can still decide to join later anyway if I want.
The first versions of CloneBot (the name of the program for our clique) did actually contain a mistake I could exploit (by defining the __new__() method of the class after the payload) and so this was my plan until Vanilla_Cabs fixed this mistake. After they fixed it, I didn't notice any way I could take advantage, so I joined the clique in spirit.
Little did you know that I was aware of this weakness from the beginning, and left it as a test to find whom I could trust to search for the weaknesses I didn't know. Of the 3 (I think) to whom I showed the code ea...
In what order do programs get disqualified? For example, if I submit a program with an infinite loop, every other program using simulation will also go into an infinite loop when meeting my program, as detecting infinite loops in general isn't theoretically feasible. Is my program disqualified before the others? What is the general principle?
EDIT: An unrelated question: Do round numbers start from 0 or 1? In the post you write "Unlike Zvi's original game, you do get to know what round it is. Rounds are indexed starting at 0.", but also: "Your class must have an __init__(self, round=1) [..]". Why not have the default initializer also use 0 if the round numbers start from zero?
You should also check whether 'exec' is in the program code string, because someone could call getopponentsource with exec and caesar-encryption, otherwise you will be DQ'd if someone submits a program like that. (However, rebinding getopponentsource is probably more elegant than this type of static analysis.)
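A minimal sketch of why checking only for the function name is not enough. The helper `looks_unsafe` and the `sneaky` program are hypothetical, I use ROT13 (a Caesar shift of 13) for the encoding, and the identifier is spelled as in my comment above. The sneaky source is only built as a string here and is never executed.

```python
import codecs


def looks_unsafe(source, banned=("getopponentsource", "exec")):
    """Naive static check: flag source text containing any banned substring."""
    return any(name in source for name in banned)


# Hide the banned name behind ROT13 so it never appears literally in the source.
hidden = codecs.encode("getopponentsource", "rot13")
sneaky = f'import codecs\nexec(codecs.decode("{hidden}", "rot13") + "()")'

name_only = looks_unsafe(sneaky, banned=("getopponentsource",))  # misses it
with_exec = looks_unsafe(sneaky)                                 # catches it
```

The name-only check passes the sneaky program, while adding `exec` to the banned list flags it. Substring checks like this are still easy to get around in other ways, which is why rebinding the function seems more robust to me.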
I don't have much time, so I've only checked the first study. The numbers come from this one: https://ultrasuninternational.com/wp-content/uploads/raharusun-et-al-2020_patterns_of_covid-19_mortality_and_vitamin_d_an_indonesian_study.pdf
I looked a bit more and found this: https://www.cambridge.org/core/journals/british-journal-of-nutrition/article/covid19-and-misinformation-how-an-infodemic-fueled-the-prominence-of-vitamin-d/8AC1297F0D6F4196938FB13A85A817A3
It seems to be misinformation.
I couldn't find the second study, though I haven't l...