One is that many participants at ARENA have already done AI Safety research before participating. Another is that at least four ARBOx alumni (ARBOx is a 2-week compressed version of ARENA) are doing elite AI safety fellowships (1 Anthropic Fellows Program, 2 LASR Labs, 1 MATS).
I don't think this is evidence that ARENA is about signalling:
We've all heard of "Safety Cases", i.e. structured arguments that an AI deployment has low chance of catastrophe. Should labs be required to make Benefit Cases, i.e. structured arguments for why their AI deployment has high expected benefits?
Otherwise, how do we know that the benefits outweigh the risks?
@Eli Tyre asked for an example:
An event-space was hosting a Christmas party in London. I arrived late, maybe 10pm. They had oversupplied food, and a large cake, topped with strawberries, had been abandoned in the corner. Rather than cutting a slice of cake, I simply took a strawberry from the top. I turned to my friend and said "This is unethical"[1] and ate the strawberry.
Clopus45 initially tells me that taking the strawberry was fine, but after some back-and-forth we've agreed on this assessment: Strawberries are the scarce, desirable resource; cake is the abundant substrate. The intended allocation bundles them together. By taking a strawberry without cake, you claim more than your proportional share of the good stuff while leaving strawberry-depleted cake for others. You might argue you weren't going to eat cake regardless—but someone else might have wanted a properly-topped slice. The strawberry you took was theirs. This is mitigated by the likelihood that the cake would go uneaten anyway, but not eliminated by it.
What's up with LessWrong and lumenators? It's not that rats are less susceptible to marketing, or better at finding products. (Or, not only.) It's something about reframing the problem from "I need blue light therapy" to "I need photons" and then sourcing photons from wherever they're cheapest.
This is related to More Dakka: "I need blue light therapy" isn't dakka-able because you're either doing the therapy or you're not. Whereas "I need photons" is dakka-able -- it's easier to see what it would mean to 100x photons.
Two additional considerations in favour of long-dated call options:
But if you give them iPhones, they can always just decide they don't want to use them.
So there are a few thoughts which push in the opposite direction:
IDK. I think I'm just confused about all of this.
But I think we shouldn't reward people for making only joking predictions instead of 100-page reports.
I don't like this sentiment.
If he predicted ASI in 2028-2030 then I'm not 'punishing' him by believing he did; if he didn't then I'm not 'rewarding' him by believing he didn't.
I think that SGD isn't sample efficient enough to solve continual learning