All of Raymond D's Comments + Replies

A slightly sideways argument for interpretability: It's a really good way to introduce the importance and tractability of alignment research

In my experience it's very easy to explain to someone with no technical background that

  • Image classifiers have gotten much, much better (in about 10 years they went from being impossible to being something you can run on your laptop)
  • We actually don't really understand why they do what they do (we don't know why a classifier says this is an image of a cat, even when it's right)
  • But, thanks to dedicated research, we have be
…

My main takeaway from this post is that it's important to distinguish between sending signals and trying to send signals, because the latter often leads to goodharting.

It's tricky, though, because obviously you want to be paying attention to what signals you're giving off, and how they differ from the signals you'd like to be giving off, and sometimes you do just have to try to change them. 

For instance, I make more of an effort now than I used to, to notice when I appreciate what people are doing, and tell them, so that they know I care. And I think …

> My main takeaway from this post is that it's important to distinguish between sending signals and trying to send signals, because the latter often leads to goodharting.

That is a wonderful summary.


> For instance, I make more of an effort now than I used to, to notice when I appreciate what people are doing, and tell them, so that they know I care. And I think this has basically been very good. This is very much not me dropping all effort to signal.

But I think what you're talking about is very applicable here, because if I were just trying to maximise th

…

if you think timelines are short for reasons unrelated to biological anchors, I don't think Bio Anchors provides an affirmative argument that you should change your mind.


Eliezer:  I wish I could say that it probably beats showing a single estimate, in terms of its impact on the reader.  But in fact, writing a huge careful Very Serious Report like that and snowing the reader under with Alternative Calculations is probably going to cause them to give more authority to the whole thing.  It's all very well to note the Ways I Could Be Wrong

…

The Bio Anchors report is intended as a tool for making debates about AI timelines more concrete, for those who find some bio-anchor-related bound helpful (e.g., some think we should lower bound P(AGI) at some reasonably high number for any year in which we expect to hit a particular kind of "biological anchor"). Ajeya's work lengthened my own timelines, because it helped me understand that some bio-anchor-inspired arguments for shorter timelines didn't have as much going for them as I'd thought; but I think it may have shortened some other folks'.

(The pre…

The belief that people can only be morally harmed by things that causally affect them is not universally accepted. Personally I intuitively would like my grave to not be desecrated, for instance.

I think we have lots of moral intuitions that have become less coherent as science has progressed. But if my identical twin started licensing his genetic code to make human burgers for people who wanted to see what cannibalism was like, I would feel wronged.

I'm using pretty charged examples here, but the point I'm trying to convey is that there are a lot of moral l…

You ask a number of good questions here, but the crucial point to me is that they are still questions. I agree it seems, based on my intuitions of the answers, like this isn't the best path. But 'how much would it cost' and 'what's the chance a clone works on something counterproductive' are, to me, not an argument against cloning, but rather arguments for working out how to answer those questions.

Also very ironic if we can't even align clones and that's what gets us.

Yair Halberstadt (2y):
This seems like the sort of thing that would be expensive to investigate, has low potential upside, and where just investigating would have enormous negatives (think loss of weirdness points, and potential for scandal).

I think there are extra considerations to do with the clone's relation to von Neumann. Plausibly, it might be wrong to clone him without his consent, which we can now no longer get. And the whole idea that you might have a right to your likeness, identity, image, and so on becomes much trickier as soon as you have actually been cloned.

Also there's a bit of a gulf between a parent deciding to raise a child they think might do good and a (presumably fairly large) organisation funding the creation of a child.

I don't have strongly held convictions on these points, but I do think that they're important and that you'd need to have good answers before you cloned somebody.

How could it be wrong to clone him without his consent? He’s dead, and thus cannot suffer. Moreover, the right to your likeness is to prevent people from being harmed by misuse of said likeness; it doesn’t strike me as a deontological prohibition on copying (or as a valid moral principle to the extent that it is deontological), and he can’t be harmed anymore. Also, how could anyone have a right to their genome that would permit them to veto others having it? If that doesn’t sound absurd to you prima facie, consider identical twins (or if they’re not quite identical enough, preexisting clones). Should one of them have a right to dictate the existence or reproduction of the other? And if not, how can we justify such a genetic copyright in the case of cloning? Cloning, at least when the clone is properly cared for, is a victimless offense, and thus ought not be offensive at all.

Well, I basically agree with everything you just said. I think we have quite different opinions about what politics is, though, and what it's for. But perhaps this isn't the best place to resolve those differences.

Ok, I think this is partly fair, but also clearly our moral standards are informed by our society, and in no small part those standards emerge from discussions about what we collectively would like them to be, not just from a genetically hardwired disloyalty sensor.

Put another way: yes, in pressured environments we act on instinct, but those instincts don't exist in a vacuum, and the societal project of working out what they ought to be is quite important and pretty hard, precisely because in the moment where you need to refer to it, you will be acting on System 1.

Yes, these discussions set / update group norms. Perceived defection from group norms triggers the genetically hardwired disloyalty sensor. Right, System 1 contains adaptations optimized to signal adherence to group norms. The societal project of working out what norms other people should adhere to is known as "politics", and lots of people would agree that it's important.

I'm not sure I'm entirely persuaded. Are you saying that the goal of ethics is to accurately predict what people's moral impulse will be in arbitrary situations?

I think moral impulses have changed with times, and it's notable that some people (Bentham, for example) managed to think hard about ethics and arrive at conclusions which massively preempted later shifts in moral values.

Like, Newton's theories give you a good way to predict what you'll see when you throw a ball in the air, but it feels incorrect to me to say that Newton's goal was to find order in…

Yair Halberstadt (2y):
I'm not saying that's the explicit goal. I'm saying that in practice, if someone suggests a moral theory which doesn't reflect how humans actually feel about most actions, nobody is going to accept it. The underlying human drive behind moral theories is to find order in our moral impulses, even if that's not the system's explicit goal.
I like this framing! The entire point of having a theory is to predict experimental data, and the only way I can collect data is through my senses. You could construct predictive models of people's moral impulses. I wouldn't call these models laws, though.

Migration - they have a team that will just do it for you if you're on the annual plan, plus there's an exporting plugin

Setup - yeah there are a bunch of people who can help with this and I am one of them

I'll message you

Massive conflict of interest: I blog on Ghost, know and like the people at Ghost, work at a company that moved from Substack to Ghost, get paid to help people use Ghost, and have a couple more COIs in this vein.

But if you're soliciting takes from somebody from WordPress, I think you might also appreciate the case for Ghost, which I simply do think is better than Substack for most bloggers above a certain size.

Re your cons, Ghost:

1 - has a migration team and the ability to do custom routing, so you would be able to migrate your content

3 - supports total…

Strong upvoting after our conversation so more people see it. Raymond made a strong case; I'm seriously considering it and would like everyone else's take on Ghost, good or bad. Getting the experiences of others who've used it, and can verify that it works and can be trusted (or not, which would be even more useful if true!), would be very helpful. The basic downside versus Substack is the lack of Substack's discovery, such as it is (not sure of the magnitude of that), and that people won't be used to it and won't have already entered CC info, which will hurt revenue some (but again, how much? Anyone have estimates?), and the start-up costs would be more annoying. In exchange you get full customization, open source that can easily be self-hosted in a pinch, lower costs given the expected size of the audience, better analytics, better improvement in feature sets over time given track records, etc. But I'd have to do at least some work to get that (e.g. you need to add a comment section on your own).
Thank you for being up front. My basic answer is that I'm vaguely aware Ghost exists, and I'd be open to a pitch/discussion to try to convince me it's superior to Substack or WordPress, although it would be an uphill battle. If there's human support willing to make the migration and setup easy and help me figure out how to do things, then... maybe? Could set up a call to discuss.

I'd like to throw out some more bad ideas, with fewer disclaimers about how terrible they are because I have less reputation to hedge against.

Inline Commenting

I very strongly endorse the point that it seems bad that someone can make bad claims in a post, which are then refuted in comments which only get read by people who get all the way to the bottom and read comments. To me the obvious (wrong) solution is to let people make inline comments. If nothing else, having a good way within comments to point to what part of the post you want to address feels like…

You can link to comments, so there is an easy technical solution. As ever, it's mainly a cultural problem: if good-quality criticism were upvoted, it would appear at the top of the comments anyway, and not be buried.

I'm really enjoying the difference between the number of people who claimed they opted out and the number of people who explicitly wrote the phrase

I mean I thought the entire point was to say it out loud, but if you want me to write it: I no longer consent to being in a simulation.
Plan to cryo-preserve yourself at some future time, then create a trust fund with the mission of creating a million simulations of your life as soon as brain simulation becomes feasible and cheap, and waking them up the instant the cryo-preservation is set to start. The fund will distribute the remaining money (which has been compounding, of course) among the simulations it has just woken up and instantiated in the real world.

Follow the white rabbit

There's not just one. We default into several overlapping simulations. Each simulation requires a different method of getting out. One of them is to just stare at a blank wall for long enough.