Re: LLMs for coding: One lens on this is that LLM progress changes the Build vs Buy calculus.
Low-power AI coding assistants were useful in both the "build" and "buy" scenarios, but they weren't impactful enough to move the actual boundary between build-is-better and buy-is-better. More powerful AI coding systems/agents can make a lot of tasks easy enough that dealing with some components starts to feel more like buying than building. Different problem domains have different peak levels of complexity/novelty, so the easier domains will be affected more, and earlier, by this shift in the build/buy decision boundary. Many people don't travel far from their primary domains, so to some of them the shift will look like it's happening quickly (because it is, in their vicinity) even though on the larger scale it's still pretty gradual.
Perhaps if you needed a larger number of ternary weights to compensate, but the paper claims to achieve the same performance with ternary weights as with 16-bit weights at the same parameter count.
I think this could be a big boon for mechanistic interpretability, since it can be a lot more straightforward to interpret a bunch of {-1, 0, 1}s than reals. Not a silver bullet by any means, but it would at least peel back one layer of complexity.
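For intuition, here's a minimal NumPy sketch of absmean ternary quantization, which is my understanding of the scheme that paper describes (the function name and toy example are mine, not the paper's):

```python
import numpy as np

def ternarize_absmean(w: np.ndarray, eps: float = 1e-8):
    """Quantize a weight matrix to {-1, 0, 1} with one per-tensor scale.

    Sketch of absmean quantization as I understand it from the paper:
    divide by the mean absolute value, then round and clip to [-1, 1].
    """
    scale = np.abs(w).mean() + eps                   # per-tensor scaling factor
    w_ternary = np.clip(np.round(w / scale), -1, 1)  # each entry -> {-1, 0, 1}
    return w_ternary.astype(np.int8), scale          # reconstruct as w_ternary * scale

w = np.random.randn(4, 4)
wq, s = ternarize_absmean(w)
print(wq)  # entries are only -1, 0, or 1
```

The interpretability appeal shows up even in the toy output: every connection is exactly excitatory, inhibitory, or absent, with a single shared scale, instead of a cloud of arbitrary reals.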
Wouldn't the granularity of the action space also affect things? For example, even if a child struggles to pick up some object, you would probably do an even worse job if your action space were picking joint angles, or forces for individual muscles to apply, or the timings of individual action potentials to send down separate nerves.
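To make the granularity point concrete, here's a toy sketch of my own (not from any paper): the same "reach a target" task, once with a coarse position-command action space and once with a fine force-command one:

```python
# Toy illustration (my own construction): reaching a target point
# under two different action-space granularities.

def coarse_rollout(target: float) -> float:
    """Coarse action space: directly command a position; one decision suffices."""
    pos = target  # the single high-level action "go there"
    return abs(pos - target)

def fine_rollout(target: float, steps: int = 200, dt: float = 0.05) -> float:
    """Fine action space: command a force at every timestep.

    Now hundreds of coordinated decisions are needed; a hand-tuned PD
    controller stands in here for all of that acquired skill.
    """
    pos, vel = 0.0, 0.0
    for _ in range(steps):
        force = 4.0 * (target - pos) - 2.0 * vel  # hand-tuned PD gains
        vel += force * dt                         # unit mass, Euler integration
        pos += vel * dt
    return abs(pos - target)

print(coarse_rollout(1.0))  # exactly 0.0: trivial in the coarse space
print(fine_rollout(1.0))    # near 0.0 only because the gains happen to be tuned
```

Mis-tune the gains and the fine-grained version overshoots or oscillates, which is roughly the position a learner is in when the action space is muscle forces or spike timings.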
This is a cool model. I agree that in my experience it works better to study sentence pairs than single words, and that having fewer exact repetitions is better as well. Paragraphs would probably be even better, as long as they're tailored to not be too difficult to understand (e.g. with a limited number of unknown words/grammatical constructions).
One thing various people recommend for learning languages quickly is to talk with native speakers, and I too notice that this has an extremely large effect. I generally think of it as having to do with more of one's mental subsystems being involved in the interaction, though I have only vague ideas about the exact mechanics of why this should be so helpful.
Do you think this could somehow fit parsimoniously into your model?
A few others have commented about how MSFT doesn't necessarily stifle innovation, and a relevant point here is that MSFT is generally pretty good at letting its subsidiaries do their own thing and keep their own culture. In particular, GitHub (where I work) still uses Google Workspace for docs/email, Slack + Zoom for communication, etc. GH is very much remote-first whereas that's more of an exception at MSFT, GH has a lot less suffocating bureaucracy, and so on. In the years since the acquisition this has shifted to some extent, and my team (Copilot) is more exposed to MSFT than most, but we still get to do our own thing and at worst have to jump through some hoops for compute resources. I suspect that if OAI folks come under the MSFT umbrella, it'll be as this sort of subsidiary, with almost complete freedom to retain whatever aspects of their previous culture they want.
Standard disclaimer: my opinions are my own, not my employer's, etc.
It'd be great if one of the features of these "conversation"-type posts was an LLM-generated summary, or a version of the post rewritten as something other than a conversation, because at least for me this format is super frustrating to read and ends up having a lower signal-to-noise ratio.
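This seems cheap to prototype; here's a minimal sketch using the OpenAI Python client (the model name and prompt are placeholders I made up, and any capable model/provider would do):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def summarize_conversation(transcript: str) -> str:
    """Rewrite a dialogue-format post as plain prose."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any capable model would do
        messages=[
            {
                "role": "system",
                "content": (
                    "Rewrite the following conversation as a concise, "
                    "non-dialogue summary of the main claims, arguments, "
                    "and disagreements."
                ),
            },
            {"role": "user", "content": transcript},
        ],
    )
    return response.choices[0].message.content
```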
You have a post about small nanobots being unlikely, but do you have similar opinions about macroscopic nanoassemblers? Non-microscopic ones could have a vacuum and lower temperatures inside, etc.
Having worked at METR for some months last year, I just want to chime in to add that they have indeed seen the skulls. This post does a great service to the broader public by going into many important points at length. But these issues and others are also very much top of mind at METR, which is one of the reasons why they caveat results extensively in their publications.
If you haven't been in touch or visited them already, I highly recommend it. They're pretty awesome and love to discuss this sort of stuff!