One is that many ARENA participants have already done AI safety research before participating. A second piece of evidence is that at least four participants of ARBOx (a 2-week compressed version of ARENA) are doing elite AI safety fellowships (1 Anthropic Fellows Program, 2 LASR Labs, 1 MATS).
I don't think this is evidence that ARENA is about signalling:
We've all heard of "Safety Cases", i.e. structured arguments that an AI deployment has a low chance of catastrophe. Should labs be required to make Benefit Cases, i.e. structured arguments for why their AI deployment has high expected benefits?
Otherwise, how do we know that the benefits outweigh the risks?
@Eli Tyre asked for an example:
An event-space was hosting a Christmas party in London. I arrived late, maybe 10pm. They had oversupplied food, and a large cake, topped with strawberries, had been abandoned in the corner. Rather than cutting a slice of cake, I simply took a strawberry from the top. I turned to my friend and said "This is unethical"[1] and ate the strawberry.
Clopus45 initially tells me that taking the strawberry was fine, but after some back-and-forth we've agreed on this assessment: Strawberries are the scarce, desirable resource; cake is the abundant substrate. The intended allocation bundles them together. By taking a strawberry without cake, you claim more than your proportional share of the good stuff while leaving strawberry-depleted cake for others. You might argue you weren't going to eat cake regardless—but someone else might have wanted a properly-topped slice. The strawberry you took was theirs. This is mitigated by the likelihood that the cake would go uneaten anyway, but not eliminated by it.
What's up with LessWrong and lumenators? It's not that rats are less susceptible to marketing, or better at finding products. (Or, not only.) It's something about reframing the problem from "I need blue light therapy" to "I need photons" and then sourcing photons from wherever they're cheapest.
This is related to More Dakka: "I need blue light therapy" isn't dakka-able because you're either doing the therapy or you're not. Whereas "I need photons" is dakka-able -- it's easier to see what it would mean to 100x photons.
Two additional considerations in favour of long-dated call options:
But if you give them iPhones, they can always just decide they don't want to use them.
So there are a few thoughts which push in the opposite direction:
IDK. I think I'm just confused about all of this.
But I think we shouldn't reward people for making only joking predictions instead of 100-page reports.
I don't like this sentiment.
If he predicted ASI in 2028-2030 then I'm not 'punishing' him by believing he did; if he didn't then I'm not 'rewarding' him by believing he didn't.
Thanks for these considerations; I'll ponder them more later.
Here are my immediate thoughts:
If you think that the amount of resources used on mundane human welfare post-singularity is constant, then adding the Zambian child to the population leads to a slight decrease in the lifespan of the rest of the population, so it's zero-sum.
Hmm, this is true on impersonal ethics, in which the only moral consideration is maximising pleasurable person-moments. On such a view, you are morally indifferent to killing 1000 infants and replacing them with people of the same welfare. But this violates common-sense morality. And I think you should have some credence (under moral uncertainty) that this is bad.
If you think that the lightcone will basically be spent on the CEV of the humans that exist around the singularity, you might worry that the marginal child's vote will make the CEV worse.
Hmm, this doesn't seem clear-cut, certainly not enough to justify deviating so strongly from common-sense morality.
I think continual learning might be solved by giving an agent tool access to a database, and then training the agent to use that tool effectively -- rather than by anything happening in the weights.
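A minimal sketch of the shape I have in mind, assuming a toy SQLite-backed store; `MemoryDB`, `run_episode`, and `call_model` are hypothetical names for illustration, not any existing framework's API:

```python
# Sketch: the agent's "memory" lives in an external database it reads and
# writes via a tool, rather than in its weights. All names here are made up
# for illustration.

import sqlite3


class MemoryDB:
    """External store the agent queries instead of relying on weight updates."""

    def __init__(self, path: str = "agent_memory.db"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS notes (key TEXT PRIMARY KEY, value TEXT)"
        )

    def write(self, key: str, value: str) -> None:
        self.conn.execute(
            "INSERT OR REPLACE INTO notes (key, value) VALUES (?, ?)", (key, value)
        )
        self.conn.commit()

    def read(self, key: str) -> str | None:
        row = self.conn.execute(
            "SELECT value FROM notes WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None


def run_episode(task: str, db: MemoryDB, call_model) -> str:
    """One episode: retrieve prior notes, act, then store what was learned."""
    prior = db.read(task) or "no prior notes"
    answer = call_model(f"Task: {task}\nNotes from earlier episodes: {prior}")
    db.write(task, f"Last answer: {answer}")
    return answer
```

The point being that all the "learning" lives in the rows of the database; training would only need to teach the agent the skill of reading and writing it well.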
My odds are: