mattmacdermott — LessWrong

Which side of the AI safety community are you in?

Some people likely think:

don't build ASI until it can be done safely > build ASI whenever but try to make it safe > never build ASI

Those people might give different prescriptions to the "never build ASI" people, like not endorsing actions that would tank the probability of ASI ever getting built. (Although in practice I think they probably mostly make the same prescriptions at the moment.)

Bubble, Bubble, Toil and Trouble

mattmacdermott9d194

I think "Will there be a crash?" is a much less ambiguous question than "Is there a bubble?"

Charbel-Raphaël's Shortform

mattmacdermott13d20

Yeah, I think “training for transparency” is fine if we can figure out good ways to do it. The problem is more that training for other stuff (e.g. the absence of certain types of thoughts) pushes against transparency.

abramdemski's Shortform

mattmacdermott22d*5536

I often complain about this type of reasoning too, but perhaps there is a steelman version of it.

For example, suppose the lock on my front door is broken, and I hear a rumour that a neighbour has been sneaking into my house at night. It turns out the rumour is false, but I might reasonably think, "The fact that this is so plausible is a wake-up call. I really need to change that lock!"

Generalising this: a plausible-but-false rumour can fail to provide empirical evidence for something, but still provide 'logical evidence' by alerting you to something that is already plausible in your model but that you hadn't specifically thought about. Ideal Bayesian reasoners don't need to be alerted to what they already find plausible, but humans sometimes do.

The quotation mark

mattmacdermott23d70

But then we have to ask — why two ‘ marks, to make the quotation mark? A quotidian reason: when you only use one, it’s an apostrophe. We already had the mark that goes in “don’t”, in “I’m”, in “Maxwell’s”; so two ‘ were used to distinguish the quote mark from the existing apostrophe.

Incidentally, I think in British English people normally do just use single quotes. I checked the first book I could find that was printed in the UK, and that’s what it uses.

Markets in Democracy: What happens when you can sell your vote?

mattmacdermott24d*77

He'd be a fool to part with his vote for less than the amount of the benefits he gets.

Doesn't seem right. Even assuming the person buying his vote wants to use it to remove his benefits, that one vote is unlikely to be the difference between the vote-buyer's candidate winning and losing. The expected effect of the vote on the benefits is going to be much less than the size of the benefits.
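
To make the expected-value point concrete, here is a rough sketch with made-up numbers. The benefit size and the chance of the vote being pivotal are illustrative assumptions, not figures from the post, and the model assumes the benefits are lost only if this one vote decides the outcome.

```python
def expected_benefit_loss(benefit_value, p_pivotal):
    """Expected loss from selling one vote, assuming the benefits are
    removed only if this single vote decides the election."""
    return benefit_value * p_pivotal


# Hypothetical numbers chosen purely for illustration.
BENEFITS = 10_000.0   # assumed value of the voter's benefits
P_PIVOTAL = 1e-6      # assumed chance that one vote flips the outcome

loss = expected_benefit_loss(BENEFITS, P_PIVOTAL)
print(f"Expected loss from selling the vote: {loss:.4f}")
```

Under these assumed numbers the expected loss is tiny compared with the benefits themselves, so a rational seller's reservation price could sit far below the full benefit amount.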

Checking in on AI-2027

mattmacdermott25d30

An intuition you might be able to invoke is that the procedure they describe is like greedy sampling from an LLM, which doesn’t get you the most probable completion.
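
A toy illustration of that intuition, as a sketch only: the two-token "model" below is entirely hypothetical, with probabilities picked so that the locally best first token leads to a worse overall sequence.

```python
# Hypothetical two-step "language model": probabilities for the first
# token, then for the second token conditional on the first.
FIRST = {"A": 0.6, "B": 0.4}
SECOND = {
    "A": {"x": 0.55, "y": 0.45},
    "B": {"z": 0.9, "w": 0.1},
}


def sequence_prob(first, second):
    """Joint probability of the two-token sequence."""
    return FIRST[first] * SECOND[first][second]


def greedy_completion():
    """Pick the most probable token at each step, one step at a time."""
    first = max(FIRST, key=FIRST.get)
    second = max(SECOND[first], key=SECOND[first].get)
    return (first, second)


def most_probable_completion():
    """Search every full sequence and return the most probable one."""
    candidates = [(f, s) for f in FIRST for s in SECOND[f]]
    return max(candidates, key=lambda seq: sequence_prob(*seq))


greedy = greedy_completion()
best = most_probable_completion()
print("greedy:", greedy, round(sequence_prob(*greedy), 2))
print("most probable:", best, round(sequence_prob(*best), 2))
```

Greedy commits to "A" because it is the likeliest first token, but the sequence starting with "B" ends up more probable overall, since its second token is nearly certain.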

CFAR update, and New CFAR workshops

mattmacdermott1mo1612

“A Center for Applied Rationality” works as a tagline but not as a name

Notes on fatalities from AI takeover

mattmacdermott1mo72

We have a ~25% chance of extinction

Maybe add the implied 'conditional on AI takeover' to the conclusion so people skimming don't come away with the wrong bottom line? I had to go back through the post to check whether this was conditional or not.

leogao's Shortform

mattmacdermott1mo20

Fair enough, yeah. But at least (1)-style effects weren’t strong enough to prevent any significant legislation in the near future.
