AIS student, self-proclaimed aspiring rationalist, very fond of game theory.
"The only good description is a self-referential description, just like this one."

Not everything suboptimal, but suboptimal in a way that causes suffering on an astronomical scale (e.g. galactic dystopia, or dystopia that lasts for thousands of years, or dystopia with an extreme number of moral patients (e.g. uploads)).
I'm not sure what you mean by Ord, but I think it's reasonable to have a significant probability of S-risk from a Christiano-like failure.

I think you miss one important existential risk separate from extinction, which is having a lastingly suboptimal society. Like, systematic institutional inefficiency, and being unable to change anything because of disempowerment.
In that scenario, maybe humanity is still around because one of the things we can measure and optimize for is making sure a minimum number of humans are alive, but the living conditions are undesirable.

I'm not sure either, but here's my current model:
Even though it looks pretty likely that AISC is an improvement on no-AISC, there are very few potential funders:
1) EA-adjacent charitable organizations.
2) People from AIS/rat communities.

Now, how to explain their decisions?
For the former, my guess would be a mix of not having heard of/received an application from AISC and preferring to optimize heavily towards top-rated charities. AISC's work is hard to quantify, as you can tell from the most upvoted comments, and that's a problem when you're looking for projects to invest in, because you need to avoid being criticized for that kind of choice if it turns out AISC is crackpot-ish/a waste of funds. The Copenhagen interpretation of ethics applies hard there for an opponent with a grudge against the organization.
For the latter, it depends a lot on individual people, but here are the possibilities that come to mind:
- Not wanting to donate anything but feeling obliged to, which leads to large donations to a few projects once you feel like donating enough to break the status quo bias.
- Being especially mindful of one's finances and donating only to preferred charities, because of a personal attachment (again, not likely to pick AISC a priori) or because they're provably effective.

To answer 2), could you say why you don't donate to AISC? Your motivations are probably very similar to those of other potential donors here.

Follow this link to find it. The translation is made by me, and open to comments. Don't hesitate to suggest improvements.

It's not obvious at all to me, but it's certainly a plausible theory worth testing!

To whom it may concern, here's a translation of "Bold Orion" in French.

A lot of the argumentation in this post is plausible, but also, like, not very compelling?
Mostly the "frictionless" model of sexual/gender norms, and the associated examples: I can see why these situations are plausible (if only because they're very present in my local culture) but I wouldn't be surprised if they were a bunch of social myths either, in which case the whole post is invalidated.

I appreciate the effort though; it's food for thought even if it doesn't tell me much about how to update based on the conclusion.

Epistemic status: Had a couple conversations on AI Plans with the founder, participated in the previous critique-a-thon. I've helped AI Plans a bit before, so I'm probably biased towards optimism.


Neglectedness: Very neglected. AI Plans wants to become a database of alignment plans which would allow quick evaluation of whether an approach is worth spending effort on, at least as a quick sanity check for outsiders. I can't believe it didn't exist before! Still very rough and unusable for that purpose for now, but that's what the critique-a-thon is for: hopefully, as critiques accumulate and more votes are fed into the system, it will become more useful.

Tractability: High. It may be hard to make winning critiques, but considering the current state of AI Plans, it's very easy to make an improvement. If anything, you can filter out the obvious failures.

Impact: I'm not as confident here. If AI Plans works as intended, it could be very valuable to allocate funds more efficiently and save time by figuring out which approaches should be discarded. However, it's possible that it will just fail to gain steam and become a stillborn project. I've followed it for a couple months, and I've been positively surprised several times, so I'm pretty optimistic.


The bar to entry is pretty low; if you've been following AIS blogs or forums for several months, you probably have something to contribute. It's very unlikely you'll have a negative impact.
It may also be an opportunity for you to talk with AIS-minded people and check your opinions on a practical problem; if you feel like an armchair safetyist and are tired of being one, this is the occasion to level up.
Another way to think about it is that engagement was very low in the previous critique-a-thon, so if you have a few hours to spare, you can make some easy money and fuzzies even if you're not sure about the value in utilons.

Thank you, this is incredibly interesting! Did you ever write up more on the subject? I'm excited to see how it relates to mesa-optimisation in particular.

In the finite case, where , then 

Typo: I think you mean  ?

I'm surprised to hear they're posting updates about CoEm.

At a conference held by Connor Leahy, I said that I thought it was very unlikely to work, and asked why they were interested in this research area, and he answered that they were not seriously invested in it.

We didn't develop the topic and it was several months ago, so it's possible that 1) I misremember, 2) they changed their minds, or 3) I appeared adversarial and he didn't feel like debating CoEm. (For example, maybe he actually said that CoEm didn't look promising and this changed recently?)
Still, anecdotal evidence is better than nothing, and I look forward to seeing OliviaJ compile a document to shed some light on it.
