The deep history of intelligence
I'm trying to sort the woo and the tautology from the content in the dangerously crackpotty world of “big pictures” on intelligence. Assistance appreciated.
We're not free at the Melbourne AI Safety Hub, but we are all terribly charming.
Tom Everitt did his PhD in Australia too. (As did I, FWIW.)
If $\Theta$ contains one true parameter $\theta^*$,
Having trouble parsing this. Does this mean that one element of the parameter vector is “true”?
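For what it's worth, my best guess (purely an assumption on my part, since the quoted sentence is ambiguous) is that this is the usual realizability condition: the parameter set contains a point whose model coincides with the true data-generating distribution,

$$\exists\, \theta^* \in \Theta \ \text{ such that } \ P_{\theta^*} = P_{\text{true}},$$

rather than one coordinate of the parameter vector being "true".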
How do autonomous learning agents figure each other out? The question of how they learn to cooperate, or manipulate one another, is a classic problem at the intersection of game theory and AI. The concept of "opponent shaping", where an agent models how its actions will influence an opponent's learning, has always been a promising framework for this. For years, however, the main formalism, LOLA, felt powerful but inaccessible. Vanilla reinforcement learning is already built around complicated nested expectations and inner and outer loops that layer on one another in confusing ways; LOLA added higher-order derivatives to the mix, making it computationally expensive, not terribly plausible, and, frankly, hard to build a clean intuition around.
That changed recently. A…
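To make the opponent-shaping idea concrete, here is a minimal sketch of a LOLA-style gradient in JAX: agent 1 assumes its opponent will take one naive gradient step and differentiates through that step, which is exactly where the higher-order derivatives come from. The bilinear toy losses and the learning rate below are my own placeholders for illustration, not anything from the post:

```python
import jax
import jax.numpy as jnp

# Toy differentiable two-player game. These bilinear-plus-quadratic
# losses are placeholders, not the games from the LOLA paper.
def loss1(th1, th2):
    return jnp.sum(th1 * th2) + 0.1 * jnp.sum(th1**2)

def loss2(th1, th2):
    return -jnp.sum(th1 * th2) + 0.1 * jnp.sum(th2**2)

alpha = 0.1  # learning rate agent 1 *assumes* its opponent uses

def lola_loss1(th1, th2):
    # Look ahead: predict the opponent's naive gradient step...
    th2_step = th2 - alpha * jax.grad(loss2, argnums=1)(th1, th2)
    # ...and evaluate our loss at the opponent's post-update parameters.
    # Differentiating this w.r.t. th1 goes through the opponent's update,
    # producing the second-order "shaping" term.
    return loss1(th1, th2_step)

lola_grad1 = jax.grad(lola_loss1, argnums=0)

th1 = jnp.array([0.5, -0.3])
th2 = jnp.array([0.1, 0.4])
print(lola_grad1(th1, th2))  # shaping-aware gradient for agent 1
```

Note the grad-of-a-grad: that nesting is precisely the computational expense and opacity complained about above.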
Interesting! Ingenious choice of "color learning" to solve the problem of plotting the learned representations elegantly.
This puts me in mind of the "disentangled representation learning" literature (review e.g. here). I've thought about disentangled learning mostly in terms of the Variational Auto-Encoder and GANs, but I think there is work there that applies to any architecture with a bottleneck, so your bottleneck MLP might find some interesting extensions there.
I wonder: what is the generalisation of your regularisation approach to architectures without a bottleneck? I think you gesture at it when musing on how to generalise to transformers. If the latent/regularised content space needs to "share" with lots of concepts, how do we get "nice mappings" there?
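To pin down what I mean by generalising the regulariser: here is a minimal JAX sketch of pushing structure into a wide layer via an activation penalty instead of an architectural bottleneck. The two-layer MLP, the L1 penalty, and all shapes are my assumptions for illustration, not the post's actual setup:

```python
import jax
import jax.numpy as jnp

def mlp(params, x):
    # Two-layer MLP; 'h' is wide, so there is no architectural bottleneck.
    h = jnp.tanh(x @ params["W1"] + params["b1"])
    y = h @ params["W2"] + params["b2"]
    return y, h

def loss(params, x, targets, lam=1e-3):
    y, h = mlp(params, x)
    task = jnp.mean((y - targets) ** 2)
    # Without a narrow layer, push structure into the representation via a
    # penalty on the activations instead (L1 here; one of many options).
    sparsity = lam * jnp.mean(jnp.abs(h))
    return task + sparsity

key = jax.random.PRNGKey(0)
k1, k2, kx = jax.random.split(key, 3)
params = {
    "W1": jax.random.normal(k1, (4, 64)) * 0.1, "b1": jnp.zeros(64),
    "W2": jax.random.normal(k2, (64, 2)) * 0.1, "b2": jnp.zeros(2),
}
x = jax.random.normal(kx, (8, 4))
targets = jnp.zeros((8, 2))
print(jax.grad(loss)(params, x, targets)["W1"].shape)
```

The open question is then whether a penalty like this (or a total-correlation term from the VAE literature) still yields "nice mappings" when many concepts share the same wide representation.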
I'm enjoying envisaging this as an alternative explanation for the classic Lizardman's Constant, which is a smidge larger than 3%, but then, in cheap-talk markets you have less on the line, so…
Ideally, though, you would want to calibrate your EV calcs against the benefit of a UAE AISI, not its expected budget, no? We could estimate the value of such an institute as more (or, indeed, less) than its running cost, depending on its relative leverage.
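As a toy version of that calibration (every number below is invented purely for illustration):

```python
# Hypothetical EV comparison for a UAE AISI; all figures are made up.
running_cost = 5e6       # assumed annual budget, USD
leverage = 3.0           # assumed benefit generated per dollar spent
p_counterfactual = 0.5   # chance the institute wouldn't happen anyway

expected_benefit = leverage * running_cost * p_counterfactual
net_value = expected_benefit - running_cost
print(net_value)  # positive iff leverage * p_counterfactual > 1
```

The budget only enters as a cost; the quantity doing all the work is the leverage term.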
@megasilverfist there are quite a few of us based in Melbourne. HMU.