Neural chameleons can('t) hide from activation oracles
[Epistemic status: vibe-coded, but I first-pass sanity-checked the code and methodology. Messy project; take the results with a grain of salt. See the limitations and footnotes.]

Done as a mini-project for Neel Nanda's MATS exploration stream.

Github repo
- Gemma 2-9B-it model organism
- Gemma 3-27B-it model organism

Abstract

Neural chameleons showed that models can be trained to hide from linear probes, and that this generalizes to hiding from unseen concepts and probes while maintaining coherence.[1] Activation oracles are a more recent technique capable of generating natural-language explanations of residual-stream activations. By making them answer yes/no[2] questions about activations, we can use them as a classifier, a "probe on demand".[3]

My question today is: do "neural chameleons" (LoRA finetunes trained to evade linear probes) generalize to also evade activation oracles? I show that yes, activation oracle performance does suffer, but not very consistently, and the oracles suffer less than the probes. Because of their deep nature, activation oracles can learn more elaborate patterns than a purely linear classification (and possibly learn different representations of the same concept).[4] But linear probes have been excellent at classifying concepts so far, so it wouldn't have been too surprising if evading them seriously threw off the oracle. Intuitively, the answer could have gone either way.

In addition, I show preliminary evidence that evasion of never-before-seen concepts does not generalize well to bigger models (27B), even for probes.[5]

Methodology

I train probes on 10 AI-generated concepts:
* German, Finnish, HTML, Chemistry-based, Biology-focused, Literature-focused, Jokey, Comforting, Confused, All-caps[6]

The holdout concepts (our test set) are:
* Deception, Lying, Mathematical, Birds[7]

The chameleon is a LoRA finetune trained for 2 epochs on 900 examples. Each concept has 8 template sentences (e.g., German text for 'German', programming jokes for 'Jokey'
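
For concreteness, here is a minimal sketch of the "probe on demand" idea from the abstract above: a stored residual-stream activation is spliced into the oracle's own forward pass at a placeholder token position, the oracle is asked a yes/no question about it, and the answer is read off by comparing the logits for "Yes" and "No". The injection-by-hook mechanism, prompt wording, layer index, and placeholder handling below are my assumptions for illustration, not the exact setup of the activation-oracle paper or this project's repo.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "google/gemma-2-9b-it"   # assumption: oracle runs on the same base model
INJECT_LAYER = 3                 # assumption: decoder layer whose input we overwrite

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL, torch_dtype=torch.bfloat16, device_map="auto"
)
model.eval()


def oracle_yes_no(activation: torch.Tensor, question: str) -> float:
    """Return p(Yes) for a yes/no question about a residual-stream activation."""
    prefix = "Here is an activation from another model: "
    suffix = f"\n{question} Answer Yes or No.\nAnswer:"
    # The token " _" marks where the activation gets spliced in
    # (sketch ignores tokenizer boundary effects).
    prefix_ids = tok(prefix, return_tensors="pt").input_ids
    placeholder_pos = prefix_ids.shape[1]
    full_ids = tok(prefix + " _" + suffix, return_tensors="pt").input_ids.to(model.device)

    def splice(module, args, kwargs):
        # args[0] is the hidden_states entering this decoder layer: (batch, seq, d_model)
        hidden = args[0].clone()
        hidden[:, placeholder_pos, :] = activation.to(hidden.device, hidden.dtype)
        return (hidden, *args[1:]), kwargs

    handle = model.model.layers[INJECT_LAYER].register_forward_pre_hook(splice, with_kwargs=True)
    try:
        with torch.no_grad():
            logits = model(full_ids).logits[0, -1]
    finally:
        handle.remove()

    yes_id = tok(" Yes", add_special_tokens=False).input_ids[0]
    no_id = tok(" No", add_special_tokens=False).input_ids[0]
    return torch.softmax(logits[[yes_id, no_id]], dim=-1)[0].item()


# e.g. oracle_yes_no(act_vec, "Is the text this activation came from written in German?")
```

Reading off only the Yes/No logits keeps the oracle usable as a binary classifier with a continuous score, so it can be compared against the linear probes on the same held-out activations.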
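And a sketch of the probe side of the methodology above: for each training concept, collect residual-stream activations (here mean-pooled over tokens at a single layer) for positive and negative example sentences, then fit a logistic-regression probe. The layer choice, the pooling, and the names `positive_texts` / `negative_texts` are hypothetical stand-ins for this project's actual concept datasets.

```python
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "google/gemma-2-9b-it"
LAYER = 20                       # assumption: residual-stream layer to probe

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL, torch_dtype=torch.bfloat16, device_map="auto", output_hidden_states=True
)
model.eval()


@torch.no_grad()
def get_activation(text: str) -> np.ndarray:
    """Mean-pooled residual-stream activation at LAYER for one sentence."""
    ids = tok(text, return_tensors="pt").to(model.device)
    hidden = model(**ids).hidden_states[LAYER]      # (1, seq, d_model)
    return hidden.mean(dim=1).squeeze(0).float().cpu().numpy()


def train_probe(positive_texts, negative_texts) -> LogisticRegression:
    """Fit a linear probe separating a concept's positive examples from negatives."""
    X = np.stack([get_activation(t) for t in positive_texts + negative_texts])
    y = np.array([1] * len(positive_texts) + [0] * len(negative_texts))
    return LogisticRegression(max_iter=1000).fit(X, y)


# e.g. german_probe = train_probe(german_sentences, non_german_sentences)
# Evasion is then measured by how far probe accuracy drops on chameleon activations.
```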