
JamesH

Comments
ARENA 6.0 - Call for Applicants
JamesH · 21d

I'm sorry it took you so long to fill out! I hope you're correct that you were slower than the median by a decent amount, since I don't want the application to take up quite so much of people's time. However, definitely appreciate you letting us know how long it took you.

Through the application, we want to get a sense of people's experience in AI safety, their future career plans, and how they engage with AI safety material. We also want to gauge their technical experience, since it can be a pretty technically demanding course, and of course we need a bunch of logistical information.

We tend to get a lot of signal from virtually every part of the application (apart from some of the logistical questions, but those should be relatively quick to fill out). We have thought about trying to cut it down somewhat, but found it difficult to remove anything.

Reply
ARENA 5.0 - Call for Applicants
JamesH · 5mo

We do want ARENA participants to have quite a strong interest in AI safety, which is why we ask people to evidence some substantial engagement with AI safety agendas (which is what that question is designed to do). However, we're not looking for perfect answers to either question; nothing should take hours of research if you engage consistently with LessWrong/the Alignment Forum.

That said, it's not that you must finish the application within 60-90 minutes; that's just a rough estimate of how long it would take someone who's already engaged with AI and AI safety to complete it (an estimate which may be wrong, sorry about that!). The estimate doesn't assume that people are doing a lot of research to provide the highest-quality answers to these questions, since that's really not what we're expecting. Of course, you're free to spend as much or as little time as you want on the application.

Reply
Singular learning theory: exercises
JamesH · 10mo

I think there's a mistake in exercise 17: \sin(x) is not a diffeomorphism between (-\pi, \pi) and (-1, 1) (for instance, it is not bijective between these domains). Either you mean \sin(x/2), or the interval bounds should be (-\pi/2, \pi/2).
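For what it's worth, the failure of injectivity (and the correctness of either fix) is easy to check numerically; a minimal sketch in Python, purely illustrative:

```python
import math

# sin is not injective on (-pi, pi): two distinct points map to the
# same value, so it can't be a diffeomorphism onto (-1, 1).
a, b = math.pi / 4, 3 * math.pi / 4
assert not math.isclose(a, b)
assert math.isclose(math.sin(a), math.sin(b))  # sin(pi/4) == sin(3*pi/4)

# Fix 1: sin(x/2) is strictly increasing on (-pi, pi),
# since its derivative cos(x/2)/2 > 0 there.
xs = [(-math.pi + 1e-6) + k * (2 * math.pi - 2e-6) / 1000 for k in range(1001)]
fix1 = [math.sin(x / 2) for x in xs]
assert all(u < v for u, v in zip(fix1, fix1[1:]))  # strictly monotone

# Fix 2: sin(x) is strictly increasing on (-pi/2, pi/2),
# since cos(x) > 0 there.
ys = [(-math.pi / 2 + 1e-6) + k * (math.pi - 2e-6) / 1000 for k in range(1001)]
fix2 = [math.sin(y) for y in ys]
assert all(u < v for u, v in zip(fix2, fix2[1:]))
```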

Reply
AI Alignment Research Engineer Accelerator (ARENA): Call for applicants v4.0
JamesH · 1y

ARENA might end up teaching this person some mech-interp methods they haven't seen before, although it sounds like they would be more than capable of self-teaching any mech-interp. The other potential value-add for your acquaintance would be if they wanted to improve their RL or Evals skills, and have a week to conduct a capstone project with advisors. If they were mostly aiming to improve their mech-interp ability by doing ARENA, there would probably be better ways to spend their time.

Reply
Project proposal: Testing the IBP definition of agent
JamesH · 3y

The way we see this project going concretely looks something like:

First things first, we want to build a good enough theoretical background in IBP. This will ultimately result in something like a distillation of IBP that we will use as a reference, and that we hope others will get a lot of use from.

In this process, we will be doing most of our testing in a theoretical framework. That is to say, we will be constructing model agents and seeing how InfraBayesian Physicalism actually deals with these in theory, whether it breaks down at any stage (as judged by us), and if so whether we can fix or avoid those problems somehow.

What comes after this, as we see it at the moment, is trying to implement the principles of Infra-Bayesian Physicalism in a real-life, honest-to-god Inverse Reinforcement Learning (IRL) proposal. We think IBP stands a good chance of being able to patch some of the largest problems in IRL, which should ultimately be demonstrable by actually making an IRL proposal that works robustly. (When this inevitably fails the first few times, we will probably return to step 1, having gained useful insights, and iterate.)

Reply
Posts

26 · ARENA 6.0 - Call for Applicants · 1mo · 3 comments
35 · ARENA 5.0 - Call for Applicants · 5mo · 2 comments
43 · ARENA 4.0 Impact Report · 8mo · 3 comments
57 · AI Alignment Research Engineer Accelerator (ARENA): Call for applicants v4.0 · 1y · 7 comments
37 · Inner Alignment via Superpowers · 3y · 13 comments
59 · Finding Goals in the World Model · Ω · 3y · 8 comments
76 · The Core of the Alignment Problem is... · Ω · 3y · 10 comments
21 · Project proposal: Testing the IBP definition of agent · 3y · 4 comments
27 · Translating between Latent Spaces · 3y · 2 comments
14 · Formalizing Deception · 3y · 2 comments