Ben Pace

I'm an admin of this site; I work full-time on trying to help people on LessWrong refine the art of human rationality.

Longer bio:


AI Alignment Writing Day 2019
Transcript of Eric Weinstein / Peter Thiel Conversation
AI Alignment Writing Day 2018
Share Models, Not Beliefs


I think this tag description conflates humility and modesty, a distinction originally drawn in the Twelve Virtues:

To be humble is to take specific actions in anticipation of your own errors. To confess your fallibility and then do nothing about it is not humble; it is boasting of your modesty.

Might come back and edit the description a bit, but overall it seems that the list of posts is largely accurate and useful.

Forecasting Thread: AI Timelines

Yeah. Hover-overs that tell you whose distribution it is would help here.

The rationalist community's location problem

I did start trying to make a survey on this sort of info once, and for me it quickly became unmanageably big, like it would have had over a hundred questions and would require lots of iteration loops with users. I spent a bunch of time trying to figure out how far from a major city people would be able to live, with lots of questions about earning, earning potential, openness and ability to do remote work, etc.

Probably someone else will figure out a clever smaller survey worth making though. (I'd be happy to give comments on any draft surveys people make.)

Forecasting Thread: Existential Risk

Suppose there's a point in time where it's clearly still possible that humanity explores the universe and flourishes, and a later point where that's clearly no longer possible. Assign 0% cumulative existential risk at the first point and 100% at the second, and draw a straight line between them.

Then scale everything in proportion to how likely things are to go off the rails in each time period.
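The heuristic above can be sketched in a few lines of Python. All the specific numbers (the endpoint years, the hazard shape, the peak year) are made-up illustrations I'm choosing to make the sketch concrete, not forecasts:

```python
# Sketch of the heuristic: start from a straight line between the point
# where existential catastrophe first becomes possible (0% cumulative risk)
# and the point where a good outcome is clearly no longer possible (100%),
# then reweight each period by how likely things are to go off the rails
# in that period. Every number here is an illustrative placeholder.

start, end = 2025, 2125          # hypothetical endpoint years
years = list(range(start, end + 1))

# Straight-line baseline: equal risk mass in every period.
baseline = [1.0 for _ in years]

def hazard_weight(year):
    # Toy shape for "likelihood of going off the rails": peaks
    # mid-century, declines on either side, floored at 0.1.
    peak = 2060
    return max(0.1, 1.0 - abs(year - peak) / 50)

weights = [b * hazard_weight(y) for b, y in zip(baseline, years)]

# Normalize so the cumulative distribution runs from 0% to 100%.
total = sum(weights)
cumulative = []
acc = 0.0
for w in weights:
    acc += w / total
    cumulative.append(acc)
```

The resulting `cumulative` list is a monotonically increasing curve from ~0 to 1, i.e., a cumulative existential-risk distribution you could then eyeball and adjust.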

AI Research Considerations for Human Existential Safety (ARCHES)

I listened to this yesterday! Was quite interesting, I'm glad I listened to it.

Draft report on AI timelines

I expect the examples Ajeya has in mind are more like sharing one-line summaries in places that tend to be positively selected for virality and anti-selected for nuance (like tweets), but that substantive engagement by individuals here or in longer posts will be much appreciated.

Open & Welcome Thread - September 2020

I also didn't understand what your sentence was saying. It read to me as "I don't want people to update on this post". When you pointed specifically to LW's culture (which is very argumentative) possibly being a key cause it was clearer what you were saying. Thanks for the clarification (and for trying to avoid negative misinterpretations of your comment).

The Steering Problem


Yeah, I think we've made substantial progress in the last couple of years.

rohinmshah's Shortform

I like this experiment! Keep 'em coming.

The Steering Problem

The motivation seems trivial to me, which might be part of the problem

Yeah, for a long time many people have been very confused about the motivation of Paul's research, so I don't think you're typical in this regard. I think that due to this sequence, Alex Zhu's FAQ, lots of writing by Evan, Paul's post "What Failure Looks Like", and more posts on LW by others, many more people understand Paul's work at a basic level, which was not at all the case 3 years ago. Like, you say "Most training procedures are obviously outer-misaligned", and 'outer-alignment' was not a concept with a name or a write-up at the time this sequence was published.

I agree that this post talks about assuming human-level performance, whereas much of iterated amplification relaxes that assumption. My sense is that if someone were to just read this sequence, it would still help them focus on the brunt of the problem: that being 'helpful' or 'useful' is not well-defined in the way many other tasks are. It would also help them see the urgency of this task and why it's possible to make progress now.
