Seems like a good New Year activity for rationalists, so I thought I'd post it early instead of late.

Here are some steps I recommend:

  • Go looking through old writings and messages. 
    • If you keep a journal, go look at a few entries from throughout the last year or two.
    • Skim over your LessWrong activity from the last year.
    • If you're active on a Slack or Discord, search for from: [your username] and skim your old messages from the last year (or the last 3 months on the free tier of Slack).
    • Sample a few of your Reddit comments and tweets from the last year.
    • Same for text messages.
  • Think back through major life events (if you had any this year) and see if they changed your mind about anything. Maybe you changed jobs or turned down an offer; maybe you tried a new therapeutic intervention or recreational drug; maybe you finally told your family something important; maybe you finally excommunicated someone from your life; maybe you tried mediating a conflict between your friends.
  • Obvious, but look over your records of Manifold trades, quantitative predictions, and explicit bets. See if there's anything interesting.
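For the quantitative predictions in particular, one way to make the review concrete is to score them. The following is a minimal sketch, assuming you can export your predictions as (probability, outcome) pairs; the `predictions` data and the helper names `brier_score` and `worst_misses` are hypothetical and made up for illustration, not taken from Manifold or any other tool.

```python
# Hypothetical records: (stated probability, whether the event happened).
# These numbers are invented for illustration only.
predictions = [
    (0.9, True),
    (0.7, False),
    (0.2, False),
    (0.6, True),
]


def brier_score(records):
    """Mean squared gap between stated probability and outcome (0 is perfect)."""
    return sum((p - float(outcome)) ** 2 for p, outcome in records) / len(records)


def worst_misses(records, n=3):
    """Return the n predictions that landed furthest from what actually happened."""
    return sorted(records, key=lambda r: abs(r[0] - float(r[1])), reverse=True)[:n]


score = brier_score(predictions)
misses = worst_misses(predictions, n=2)
print(f"Brier score: {score:.3f}")
print("Biggest misses:", misses)
```

The biggest misses are probably the most useful output here, since those are the predictions most likely to point at something you changed your mind about.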

Here are some emotional loadings that I anticipate seeing:

  • "Well, that aged poorly."
  • "Wow, bullet dodged!"
  • "But how could I have known?"
  • "Ah. Model uncertainty strikes again."
  • "Yeah ok, but this year it'll happen for sure!"
  • "[Sigh] I did have an inkling, but I guess I just didn't want to admit it at the time."
  • "I tested that hypothesis and got a result."
  • "Okay, they were right about this one thing. Doesn't mean I have to like them."
  • "Now I see what people mean when they say X"
  • "This is huge, why does no one talk about this?!"

8 Answers

Ege Erdil


I thought cryonics was unlikely to work because a bunch of information might be lost even at the temperatures that bodies are usually preserved in. I now think this effect is most likely not serious and cryonics can work in principle at the temperatures we use, but present-day cryonics is still unlikely to work because of how much tissue damage the initial process of freezing can do.

Out of curiosity, what makes you think that the initial freezing process causes too much information loss? 



My most obvious changed-my-mind moment was about alignment difficulty, along with a generalized update away from AI x-risk being real or relevant.

The things I've changed my mind about are the following:

  1. I no longer believe that deceptive alignment is very likely to happen. A large part of this is that I think aligned behavior is probably quite low complexity, whether it's via model-based RL as Steven Byrnes would argue, via Direct Preference Optimization, which throws away reward, etc. The point is that I no longer believe that value is as complex as LWers believe it to be, which informs my general skepticism of deceptive alignment. More generally, I think that the deceptively aligned program and the actually aligned program are separated by only 10s-100s of bits in program space.

For some reasoning on why this might be true, I think the main post here has to be the inaccessibility post, which points out that the genome faces fairly harsh limitations on how much it can encode priors on values. It therefore has to rely on indirect influence, which limits how much it can use specific priors for values instead of modifying the algorithms for within-lifetime RL or self-learning.

  2. I no longer believe that the security mindset is appropriate for AI in general. The computer security/rocket engineering mindset is a poor fit for many problems: to get results, you usually need to place more trust in your system than security mindset would allow, and extending that trust works far more often than LWers generally realize. More specifically, there are very severe disanalogies between computer security and AI alignment, so much so that security mindset is an anti-helpful framework for aligning AI.

Quintin Pope has the point better than I do, here:

  3. I agree with the claim made by Jaime Sevilla that AI alignment/AI control is fundamentally profitable, and plausibly wildly so. As a consequence, a lot of money will be spent to control AI regardless, and there is no reason to assume that profit motives push toward trading safety for capabilities, due to several differences:

A. Far more of the negative externalities are internalized than is usually the case, because the capitalists bear a far larger share of the costs if they fail to align AI.

B. Some amount of alignment is necessary for AI to be in the world at all, and thus there will be efforts to align AI by default, which are either duplicative of or strictly better than LWers' attempts to align AI.

Bill Benzon


At the beginning of the year I thought a decent model of how LLMs work was 10 years or so out. I’m now thinking it may be five years or less. What do I mean? 

In the days of classical symbolic AI, researchers would use a programming language, often some variety of LISP, but not always, to implement a model of some set of linguistic structures and processes, such as those involved in story understanding and generation, or question answering. I see a similar division of conceptual labor in figuring out what’s going on inside LLMs. In this analogy I see mechanistic understanding as producing the equivalent of the programming languages of classical AI. These are the structures and mechanisms of the virtual machine that operates the domain model, where the domain is language in the broadest sense. I’ve been working on figuring out a domain model and I’ve had unexpected progress in the last month. I’m beginning to see how such models can be constructed. Call these domain models meta-models for LLMs.

It’s those meta models that I’m thinking are five years out. What would the scope of such a meta model be? I don’t know. But I’m not thinking in terms of one meta model that accounts for everything a given LLM can do. I’m thinking of more limited meta models. I figure that various communities will begin creating models in areas that interest them. 

I figure we start with some hand-crafting to work out some standards. Then we’ll go to work on automating the process of creating the model. How will that work? I don’t know. No one’s ever done it.

My confidence in this project has just gone up. It seems that I now have a collaborator. That is, he's familiar with my work in general and my investigations of ChatGPT in particular, we've had some email correspondence, and a couple of Zoom conversations. During today's conversation we decided to collaborate on a paper on the theme of 'demystifying LLMs.' 

A word of caution. We haven't written the paper yet, so who knows? But all the signs are good. He's an expert on computer vision systems on the faculty of Goethe University in Frankfurt: Visvanathan...

To clarify: do you think that in about 5 years we will be able to do such a thing to the then-state-of-the-art big models?

Bill Benzon
Yes. It's more about the structure of language and cognition than about the mechanics of the models. The number of parameters and layers and the functions assigned to layers shouldn't change things, nor should going multi-modal. Whatever the mechanics of the models, they have to deal with language as it is, and that's not changing in any appreciable way.

Bruce Lewis


At the beginning of 2023 I thought Google was a good place to work. I changed my mind after receiving new evidence.



I have become less sceptical about the ability of Western governments to act and solve issues in a reasonable timeframe. In general, I tend to think political actions are doomed and are mostly only able to let the status quo evolve by itself. But recent, relatively fast reactions to the evolution of mainstream AI tools have led me to think that I am too cynical about this. I do not know what to think instead, but I am now less confident in my old opinion.



Many projects are left undone simply because people don't step up to do them. I had heard this a lot, but I now feel it more deeply.

A number of times this year, I sharply changed the mind of a trusted advisor by arguing with them, even though I thought they knew more and should be able to change my mind. It now seems marginally more valuable to argue with people and ask them to show their work.

My antipathy toward Twitter had waned, but then I asked people about it, and did some intentional browsing, and I am back to being as anti-Twitter as ever. Twitter is harming the minds of some of my smartest friends & allies, and they seem to be unable to fully realize this, presumably due to the addiction impairing their judgment.

I have become highly uncertain about public sentiment around AI progress. I have heard multiple conflicting claims about what the median American thinks, always asserted with conviction, but never by anyone anywhere near the median.

Oh also, I am no longer surprised to find out that someone has an eloquent, insightful online presence while also being perpetually obnoxious and maladjusted in real life. Turns out lots of people have both of those.



At the beginning of the year, I had never heard of ChatGPT, and thought AI would continue to progress slowly, in a non-disruptive fashion. At this point, I believe 2023 will be at least as significant as 2007 (iPhone) in terms of marking the beginning of a technological transformation.



Off the top of my head: Q1 2023 I was vaguely scornful of asymptotics, Q4 2023 I think they are a useful tool.

Can you say more about what evidence produced this change?

I started working on an asymptotics problem, partly to see if I would change my mind. I try to keep my eyes on the ball in general, so I started noticing its applications and practical implications. Previously, I had encountered the topic mostly by reading up-in-the-clouds theoretical stuff. I also think a tribal instinct was tinging my past thoughts; asymptotics were "Frequentist" while I was "Bayesian".
1 comment

Someone has downvoted and disagreed almost every comment on this post.