Mordechai Rorvig

Comments (sorted by newest)
Plans to build AGI with nuclear reactor-like safety lack 'systematic thinking,' say researchers
Mordechai Rorvig · 4d

Thank you for the feedback. I feel that is a valid criticism, and I will keep it in mind for future articles on the topic. This was my first foray into thinking seriously about defense in depth for powerful AI design and into the recent research on it. The research is pretty marginal, and there was not much to go on.

Publishing academic papers on transformative AI is a nightmare
Mordechai Rorvig · 8d

This is a very interesting personal account, thanks for sharing it. I would imagine, and be curious to know, whether this kind of issue crops up with any number of economics research topics, like research on environmental impacts, unethical technologies more generally, excessive (and/or outright corrupt) military spending, and so on.

There are perhaps (good-faith) questions to be asked about the funding sources and political persuasions of the editors of these journals, or of the journal businesses themselves, and about why they might be incentivized to steer clear of such topics. Of course, we are actively seeing a chill in the US right now on research into many other areas of social science. One can imagine how you might be seeing something related.

So, I do imagine things like the psychological phenomenon of denial of mortality might be at play here, and that's an interesting insight. But I would also guess there are many other phenomena at work, some frankly of a more unsavory nature.

Social media feeds 'misaligned' when viewed through AI safety framework, show researchers
Mordechai Rorvig · 11d

OK, I see. So, in the context of my question (though I'm not sure whether you're speaking to that exactly or just speaking more generally), you see misalignment with broad human values as indeed being misalignment, just not a misalignment that is unexpected.

Social media feeds 'misaligned' when viewed through AI safety framework, show researchers
Mordechai Rorvig · 11d

One discussion question I'd be interested in hearing from people about, which has to do with how I used the word 'misalignment' in the headline:

Do people think that companies like Twitter/X/xAI, which don't (seemingly) align their tools to broader human values, are indeed creating tools that exhibit 'misalignment'? Or are these tools seen not as 'misaligned,' but as aligned only with their makers' motives (e.g., profit), which is to be expected? In other words, or relatedly, how should we be thinking about the alignment framework, especially in its historical context: as a program that was perhaps overly idealistic or optimistic about what companies would do to make AI generally safe and beneficial, or as a program that is and was always meant to be only about making AI aligned with its corporate controllers?

I imagine the framing of this question might itself be objected to in various ways; I just dashed this out.

Can someone, anyone, make superintelligence a more concrete concept?
Mordechai Rorvig · 9mo

Thanks, I hadn't seen his remarks about this specifically. I'll try to look them up.

If Neuroscientists Succeed
Mordechai Rorvig · 9mo

I can see what you mean. However, I would say that just claiming "that's not what we are trying to do" is not a strong rebuttal. For example, we would not accept such a rebuttal from a weapons company that was seeking to make weapons technology widely available without regulation. We would say: it doesn't matter how you are trying to use the weapons; it matters how others will use them, with your technology.

In the long term, it does seem correct to me that the greater concern is superintelligence. In the near term, however, the problem seems to be that we are making things that are not at all superintelligent: smart at coding and language, but coupled with, e.g., a crude directive to 'make me as much money as possible,' and with no advanced machinery for ethics or value judgment.

If Neuroscientists Succeed
Mordechai Rorvig · 9mo

Thank you!

If Neuroscientists Succeed
Mordechai Rorvig · 9mo

Thank you for the feedback. Do you know of any writing making similar points that you found more readable? And what is an example of a place where you found this piece meandering or overlong? That would help me improve future drafts. I appreciate your interest, and I'm sorry you felt it wasn't concise and was overly 'vibey.'

how do the CEOs respond to our concerns?
Answer by Mordechai Rorvig · Feb 12, 2025

I don't know, but I would suspect that Sam Altman and other OpenAI staff have strong views in favor of what they're doing. Isn't there probably some existing commentary out there on what he thinks? I haven't checked. But also, it is problematic to assume that science is rational. It isn't, and many people hold differing views right up until the time that something becomes incontrovertibly established.

Further, an issue here is when someone has a strong conflict of interest. If a person is being paid millions of dollars per year to pursue their current course of action (buying vacation homes on every continent, living the most luxurious lifestyle imaginable), then it is going to be hard for them to be objective about serious criticism. This is why I can't just go up to the Exxon Mobil CEO and say, 'Hey! Stop drilling, it's hurting the environment!' and expect them to do anything about it, even though it would be a completely non-controversial statement.

Can someone, anyone, make superintelligence a more concrete concept?
Mordechai Rorvig · 9mo

Honestly, I think the journalism project I was working on over the last year may be most important for the way it sheds light on your question.

The purpose of the project, to be as concise as possible, was to investigate the evidence from neuroscience that there may be a strong analogy between modern AI programs, like language models, and distinctive subregions of the brain, like the so-called language network, which is responsible for our ability to process and generate language.

As such, if you imagine coupling such a frontier language model with a crude agent architecture, then what you wind up with might be best viewed as a form of hyperintelligent machine sociopath: one with all the extremely powerful language machinery of a human (perhaps even much more powerful, considering its inhuman scaling), but none of the machinery necessary for, say, emotional processing, empathy, and emotional reasoning, aside from the superficial facsimile of these that comes with a mastery of human language. (For example, existing models lack any machinery corresponding to the human dopamine and serotonin systems.)

This is, for me, a frankly terrifying realization that I am still trying to wrap my head around, and one I plan to post more about soon. Does this help at all?

Posts
Plans to build AGI with nuclear reactor-like safety lack 'systematic thinking,' say researchers (4d)
Social media feeds 'misaligned' when viewed through AI safety framework, show researchers (11d)
If Neuroscientists Succeed (9mo)
AI: How We Got Here—A Neuroscience Perspective (10mo)