RyanCarey's Comments

Subscripting Typographic Convention For Citations/Dates/Sources/Evidentials: A Proposal

This is a cool idea. However, are you actually using the subscript in two confusingly different ways? In I_2010, it seems you're talking about you, indexed to the year 2020, whereas in {Abdul Bey}_2000, it seems you're citing a book. It would be pretty bad for people to see a bunch of the first kind of case, and then expect citations, but only get them half of the time.

Defining AI wireheading

Seems like the idea is that wireheading denotes specification gaming that is egregious in its focus on the measurement channel. I'm inclined to agree..

What technical prereqs would I need in order to understand Stuart Armstrong's research agenda?

You could check out Best Textbooks on Every Subject. But people usually recommend Linear Algebra Done Right for LinAlg. Understanding ML seems good for ML Theory. Sutton and Barto is an easy read for RL.

What technical prereqs would I need in order to understand Stuart Armstrong's research agenda?

It may be that technical prereqs are missing. It could also be that you're missing a broader sense of "mathematical maturity", or that you're struggling because Stuart's work is simply hard to understand. That said, useful prereq areas (in which you could also gain overall mathematical maturity) would include:

  • Probability theory
  • Linear Algebra
  • Machine learning theory
  • Reinforcement Learning

It's probably overkill to go deep into these topics. Usually, what you need is in the first chapter.

Where are people thinking and talking about global coordination for AI safety?

I would guess three main disagreements are:

i) are the kinds of transformative AI that we're reasonably likely to get in the next 25 years are unalignable?

ii) how plausible are the extreme levels of cooperation Wei Dai wants

iii) how important is career capital/credibility?

I'm perhaps midway between Wei Dai's view and the median governance view so may be an interesting example. I think we're ~10% likely to get transformative general AI in the next 20 years, and ~6% likely to get an incorrigible one, and ~5.4% likely to get incorrigible general AI that's insufficiently philosophically competent. Extreme cooperation seems ~5% likely, and is correlated with having general AI. It would be nice if more people worked on that, or on whatever more-realistic solutions would work for the transformative unsafe AGI scenario, but I'm happy for some double-digit percentage of governance researchers to keep working on less extreme (and more likely) solutions to build credibility.

Forum participation as a research strategy

I agree that some people can benefit from doing both, although getting everyone online is a hard collective action problem. I just claim that many researchers will satisfy with OP. At MIRI/FHI/OpenAI there are ~30-150 researchers, who think about a wide range of areas, which seems broadly comparable to the researchers among LessWrong/AF's active users (depending on your definition of 'researcher', or 'active'). Idea-exchange is extended by workshops and people moving jobs. Many in such a work environment will fund that FP has unacceptably low signal-noise ratio and will inevitably avoid FP...

Forum participation as a research strategy

I would note that many of these factors apply as benefits of office-chat participation (OP) as well. The main benefit of FP absent from OP, I suppose, is preparing you for efficient written communication, but the rest seem feature in both. The fact that their benefits overlap explains why remote researchers benefit so much more than others from FP.

IRL in General Environments
Aside from yourself, the other CHAI grad students don't seem to have written up their perspectives of what needs to be done about AI risk. Are they content to just each work on their own version of the problem?

I think this is actually pretty strategically reasonable.

CHAI students would have high returns to their probability of attaining a top professorship by writing papers, which is quite beneficial for later recruiting top talent to work on AI safety, and quite structurally beneficial for the establishment of AI safety as a field of research. The time they might spend writing up their research strategy does not help with their this, nor with recruiting help with their line of work (because other nearby researchers face similar pressures, and because academia is not structured to have PhD students lead large teams).

Moreover, if they are pursuing academic success, they face strong incentives to work on particular problems, and so their research strategies may be somewhat distorted by these incentives, decreasing the quality of a research agenda written in that context.

When I look at CHAI research students, I see some pursuing IRL, some pursuing game theory, some pursuing the research areas of their supervisors (all of which could lead to professorships), and some pursuing projects of other research leaders like MIRI or Paul. This seems healthy to me.

Writing children's picture books

In general, thinking of yourself commuciating your ideas to a less intelligent and knowledgeable person could push you in the direction of confabulating freeer-flowing stories whereas imagining yourself communicating your ideas to a smarter person could push you in the direction of saying less, with higher-rigour.

It seems like which one is desirable depends on the individual and the context (cf the Law of Equal and Opposite Advice)

Load More