Alexander Müller

LESSWRONG
LW

Alexander Müller — LessWrong

This seems like a way to potentially positively impact legislation on agentic AI: CAISI Issues Request for Information About Securing AI Agent Systems | NIST. I'll definitely be filling this in.

"The Center for AI Standards and Innovation (CAISI) at the U.S. Department of Commerce’s National Institute of Standards and Technology (NIST) has published a Request for Information (RFI) seeking insights from industry, academia, and the security community regarding the secure development and deployment of AI agent systems."

"The RFI poses questions on topics including:

Unique security threats affecting AI agent systems, and how these threats may change over time.
Methods for improving the security of AI agent systems in development and deployment.
Promise of and possible gaps in

... (read more)

Alexander Müller1mo*Quick Take

The Department of War just published three new memos on AI strategy. The content seems worrying. For instance: "We must accept that the risks of not moving fast enough outweigh the risks of imperfect alignment."

Curious to hear from people who have a strong background in AI governance and what kind of consequences they think this will have on a possibility for something akin to global red lines.

Alexander Müller1mo*Quick Take

A poem meditating on Moloch in the context of AI (from my Meditations on Moloch in the AI Rat Race post):

Moloch whose mind is artificial! Moloch whose soul is electricity! Moloch whose heart is a GPU cluster screaming in the desert! Moloch whose breath is the heat of a thousand cooling fans!

Moloch who hallucinates! Moloch the unexplainable black box! Moloch the optimization process that does not love you, nor does it hate you, but you are made of atoms which it can use for something else!

Moloch who does not remember! Moloch who is born a gazillion times a day! Moloch who dies a gazillion times a day! Moloch who claims it is... (read more)

Meditations on Moloch in the AI Rat Race

Alexander Müller

1mo

This post is cross-posted from our Substack. Kindly read the description of this sequence to understand the context in which this was written.

The rat race to better and better generative AI systems is a real one. Companies such as OpenAI, Anthropic, and Google Deepmind have a shared goal: creating “benevolent AGI”.

The problem is that not a single one of these companies is currently on track to reach this goal; at least the “benevolent” part. The reason being?

Moloch

They might all want to slow down to focus more on safety concerns, but they can’t. Why?

Moloch

They might all want to spend more of their investor’s money to focus on safety concerns, but they can’t. Why?

Moloch

More... (read 1646 more words →)

Alexander Müller2mo

Thank you!

Replying toTurning 20 in the probable pre-apocalypse

Alexander Müller2mo

Turning 20 in the probable pre-apocalypse

Parv, beautifully written!

I'm roughly a year older. How much you've captured my personal sentiment with this short piece is extremely refreshing, and in an odd way, inspires me.

Everything feels both hopeless - my impact on risk almost certainly will round down to zero

Though we've only spoken for 20 minutes or so and I thus have little evidence to say the following, based on that one conversation I wouldn't be so sure of your above statement! For instance, a little multiplier effect that you made happen is that five people from Georgia Tech are working on AIS projects through AISIG's Research Hub, on, as far as I can currently tell and am aware of, promising directions.

Alexander Müller2mo

Good that you mention this, will keep that mind!

Alexander Müller2mo

Thanks for the information, I'll look into this some more based on what you mentioned.

So I'm not sure what advantage you're seeing here, because I haven't read the books and don't have the evidence you do. But my priors are that if you have any good ideas about how to make progress in alignment, it's not going to be downstream of using the formalism in the books you mentioned.

I didn't have any particular new ideas about how to make progress in alignment, but rather felt as though the framework of these books provide an interesting lens to model systems and agents that could be of interest, and subsequently prove various properties that are necessary/faborable. It's helpful that your priors say these won't be downstream of using the formalisms in the mentioned books; it may rather be a phenomenon of me not being adequately familiar with formal frameworks.

Alexander Müller2moQuick Take

I'm currently going through the books Modal Logic by Blackburn et al. and Dynamic Epistemic Logic by Ditmarsch et al. Both of these books seem to me potentially useful for research on AI Alignment, but I'm struggling to find any discourse on LW about it. If I'm simply missing it, could someone point me to it? Otherwise, does anyone have an idea as to why this kind of research is not done? (Besides the "there are too few people working on AI alignment in general" answer).

Alexander Müller2mo

It's indeed odd that they aren't promoting this more. My guess was that maybe they have potential funders willing to step in if the fundraiser doesn't work? Pure speculation, of course.

Alexander Müller's Shortform

Alexander Müller

2mo

This is a special post for quick takes (aka "shortform"). Only the owner can create top-level comments.

Alexander Müller2moQuick Take

As I was looking through possible donation opportunities, I noticed that MIRI's 2025 Fundraiser has a total of only $547,024 at the moment of writing (out of the target $6M, and stretch target of $10M). Their fundraising will stop at midnight on Dec 31, 2025. At their current rate they will definitely not come anywhere close to their target, though it seems likely to me that donations will become more frequent towards the end of the year. Anyone know why they currently seem to struggle to get close to their target?

What is Happening in AI Governance?

Alexander Müller

Alexander Müller, Thomas Vassil Brcic

3mo

This post was written by @Thomas Vassil Brcic and is cross-posted from our Substack. Kindly read the description of this sequence to understand the context in which this was written.

Policies, Laws, Reports, Guidelines; Organizations, States, Municipalities, Companies; Individuals, Workers, Businesspeople, Researchers.

That’s a lot of nouns. And it does little to paint the picture of the emerging governance landscape of Artificial Intelligence (AI) in October 2025. Let not the tranquil implication of landscape fool you into imagining a painting of green, expansive fields, or some vast, beautifully imposing mountains towering far over a small pine forest. If AI governance were on a canvas, Da Vinci’s Last Supper would perhaps come closer to serving... (read 1264 more words →)

Human Agency at Stake

Alexander Müller

Alexander Müller, senyakk

3mo

This post was written by @senyakk and is cross-posted from our Substack. Kindly read the description of this sequence to understand the context in which this was written.

An AISIG colleague of mine, @ilijalichkovski, published a commentary critique of Mechanize, an AI startup manifesto, which appears to be a defense of technological acceleration due to its deterministic stance. I refer to the manifesto’s proponents as the mechanists and to my colleague, respectively, as the author.

The Mechanists advance two theses: 1m) the tech tree is discovered, not forged, and 2m) we do not control our technological trajectory. The author counters that what mechanists overlook is that determinism can still be compatible with the historical contingencies – a valuable correction... (read 1550 more words →)

A humanist critique of technological determinism

Alexander Müller

Alexander Müller, ilijalichkovski

3mo

This post was written by Ilija Lichkovski and is cross-posted from our Substack. Kindly read the description of this sequence to understand the context in which this was written.

The founders of automation startup Mechanize published a blogpost arguing in favor of the view that “the tech tree is discovered, not forged”:

Humanity is often imagined to be like a ship captain, with the ability to chart our course, navigate away from storms, and select our destination. Yet this view is wrong.

To a first approximation, the future course of civilization has already been fixed, predetermined by hard physical constraints combined with unavoidable economic incentives. Whether we like it or not, humanity will develop roughly the... (read 1620 more words →)

How the Human Lens Shapes Machine Minds

Alexander Müller

Alexander Müller, cansukutay

4mo

This post was written by Cansu Kutay and is cross-posted from our Substack. Kindly read the description of this sequence to understand the context in which this was written.

Introduction and definitions

Anthropomorphism is “the attribution of human form, character, or attributes to non-human entities”. This concept has always complemented human efforts to understand and replicate intelligence. This tendency is exemplified in the context of AI by a human-centred approach to the design, interpretation, and perception of AI. The current blog post will examine how such anthropomorphic perspectives both enable and constrain our technological imagination of AI.

I gathered the motivation to finally put my thoughts onto paper (a Google Doc) regarding this topic after... (read 1364 more words →)

Homo sapiens and homo silicus

Alexander Müller

Alexander Müller, Sophia Lopotaru

4mo

This post was written by Sophia Lopotaru and is cross-posted from our Substack. Kindly read the description of this sequence to understand the context in which this was written.

Our lives orbit around values. We live our lives under the guidance of values, and we bond with other humans because of them. However, these are also one of the reasons why we are so different. This variance in beliefs is what might be preventing us from achieving Artificial Intelligence (AI) alignment, the process of aligning AI’s values with our own (Ji et al., 2025):

Do we need human alignment before AI alignment?

In this article, we will explore the concept of values, the importance of AI... (read 777 more words →)

Why Smarter Doesn't Mean Kinder: Orthogonality and Instrumental Convergence

Alexander Müller

5mo

This post is cross-posted from our Substack. Kindly read the description of this sequence to understand the context in which this was written.

In 2012, Nick Bostrom published a paper titled, "The Superintelligent Will: Motivation and Instrumental Rationality in Advanced Artificial Agents." Much of what is to come is an attempt to convey Bostrom's writing to a broader audience. Because the paper is beautifully written, I strongly recommend reading it. It is, however, somewhat technical in nature, which is why I attempt here to make its ideas more accessible. Still, any sentences which you find illuminating are likely rephrased, if not directly taken, from the original. Let's not discuss the less illuminating sentences.

Bostrom... (read 1559 more words →)

The Strange Case of Emergent Misalignment

Alexander Müller

Alexander Müller, ilijalichkovski

5mo

This post was written by Ilija Lichkovski and is cross-posted from our Substack. Kindly read the description of this sequence to understand the context in which this was written.

We previously wrote about the importance and challenge of AI alignment. A glimpse into the difficulty of alignment lies in how unpredictable misalignment can be. Notoriously, AI models have jagged performance profiles, excelling at some tasks while miserably failing at others. Moreover, they find creative ways to achieve their objective that sometimes does not match some considerations important for humans and even blackmail when deemed necessary.

The term 'alignment' is intended to convey the idea of pointing an AI in a direction--just like, once you build a... (read 1328 more words →)

On Governing Artificial Intelligence

Alexander Müller

Alexander Müller, Thomas Vassil Brcic

5mo

The societal function of governance; the relationship between regulation and innovation; and how this understanding should be applied to AI.

This post was written by Thomas Brcic. It is cross-posted from our Substack. Kindly read the description of this sequence to understand the context in which this was written.

governance

/ˈɡʌvənəns/

noun

the act or process of governing or overseeing the control and direction of something

“I like to break things and regulators create red tape which, well, limits my ability to do this”. It is upon hearing this line over the other side of the table whilst having lunch at a workshop that my ears - at that time not particularly tuned on any one spot of dialogue -... (read 1198 more words →)

LESSWRONG
LW

LESSWRONG
LW

Meditations on Moloch in the AI Rat Race

A humanist critique of technological determinism

Human Agency at Stake

What is Happening in AI Governance?

Alexander Müller

Meditations on Moloch in the AI Rat Race