habryka
Reputation is lazily evaluated

When evaluating the reputation of your organization, community, or project, many people flock to surveys in which you ask randomly selected people what they think of your thing, or what their attitudes towards your organization, community or project are. If you do this, you will very reliably get back data that looks like people are indifferent to you and your projects, and your results will probably be dominated by extremely shallow things like "do the words in your name invoke positive or negative associations".

People largely only form opinions of you or your projects when they have some reason to do that, like trying to figure out whether to buy your product, or join your social movement, or vote for you in an election. You basically never care about what people think about you while they are engaging in activities completely unrelated to you; you care about what people will do when they have to take any action that is related to your goals. But the former is exactly what you are measuring in attitude surveys.

As an example of this (used here for illustrative purposes, and what caused me to form strong opinions on this, but not intended as the central point of this post): many leaders in the Effective Altruism community ran various surveys after the collapse of FTX trying to understand what the reputation of "Effective Altruism" is. The results were basically always the same: people mostly didn't know what EA was, and had vaguely positive associations with the term when asked. The people who had recently become familiar with it (which weren't that many) did lower their opinions of EA, but the vast majority of people did not (because they mostly didn't know what it was).

As far as I can tell, these surveys left most EA leaders thinking that the reputational effects of FTX were limited. After all, most people never heard about EA in the context of FTX, and seemed to mostly have positive associations with the term, and the average like
links 12/9/24

  • https://gasstationmanager.github.io/ai/2024/11/04/a-proposal.html
    • a proposal that tentatively makes a lot of sense to me, for making LLM-generated code more robust and trustworthy.
    • the goal: give a formal specification (in e.g. Lean) of what you want the code to do; let the AI generate both the code and a proof that it meets the specification.
    • as a means to this end, a crowdsourced website called "Code With Proofs: The Arena", like LeetCode, where "players" can compete to submit code + proofs to solve coding challenges. This provides a source of training data for LLMs, producing both correct and incorrect (code, proof) pairs for each problem specification. A model can then be trained "given a problem specification, produce code that provably meets the specification".
    • In real life, the model would probably use the proof assistant's verifier directly at inference time, to ensure it only returned code + proofs that the automatic verifier confirmed were valid. It could use the error messages and intermediate feedback of the verifier to more efficiently search for code + proofs that were likely to be correct. (See the sketch after this list.)
  • https://en.wikipedia.org/wiki/Post-quantum_cryptography  I know nothing about this field but it sure looks like the cryptography people have come a long way towards being ready, if and when quantum computers start being able to break RSA
  • https://en.m.wikipedia.org/wiki/Freik%C3%B6rperkultur  the German tradition of public nudity
  • https://theconversation.com/japanese-scientists-were-pioneers-of-ai-yet-theyre-being-written-out-of-its-history-243762  this piece is gratuitously anti-Big Tech, but does present an interesting part of the history of neural networks.
    • In general I wonder why Americans tend to be blind to Japanese scientific/technological innovation these days! A lot of great stuff was invented in Japan!
  • https://scratch.mit.edu/projects/editor/?tutorial=getStarted  a popular kids' programming language de
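A minimal sketch of that inference-time loop (my own illustration, not from the linked proposal; `generate_candidate` and `run_lean_verifier` are hypothetical stand-ins for an LLM call and a wrapper around the proof checker):

```python
# Sketch: sample candidate (code, proof) pairs from a model and only return one that
# the proof assistant's verifier accepts, feeding verifier errors back into the prompt.

def solve_with_proof(spec: str, generate_candidate, run_lean_verifier, max_attempts: int = 8):
    """spec: a formal specification (e.g. a Lean theorem statement).
    generate_candidate(spec, feedback) -> (code, proof) is an LLM call (assumption).
    run_lean_verifier(spec, code, proof) -> (ok, error_message) wraps the proof checker (assumption).
    """
    feedback = ""
    for _ in range(max_attempts):
        code, proof = generate_candidate(spec, feedback)
        ok, error_message = run_lean_verifier(spec, code, proof)
        if ok:
            return code, proof       # verified: safe to return to the user
        feedback = error_message     # use the verifier's error to guide the next attempt
    return None                      # no verified solution found within budget
```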
sapphire
Don't induce psychosis intentionally. Don't take psychedelics while someone probes your beliefs. Don't let anyone associated with Michael Vassar anywhere near you during an altered state. Edit: here is a different report from three years ago with the same person administering the methods.

Mike Vassar's followers practice intentionally inducing psychosis via psychedelic drugs. "Inducing psychosis" is a verbatim self-report of what they are doing. I would say they practice drug-induced brainwashing. TBC, they would dispute the term brainwashing and probably would not like the term 'followers', but I think the terms are accurate and they are certainly his intellectual descendants.

Several people have had quite severe adverse reactions (as observed by me). For example, rapidly developing serious literal schizophrenia: schizophrenia in the very literal sense of paranoid delusions and conspiratorial interpretations of other people's behavior. The local Vassarite who did the 'therapy'/'brainwashing' seems completely unbothered by this literal schizophrenia.

As you can imagine, this behavior can cause substantial social disruption. Especially since the Vassarites don't exactly believe in social harmony. This has all precipitated serious mental health events in many other parties, though they are less obviously serious than "they are clinically schizophrenic now". But that is a high bar.

I have been very critical of cover-ups on LessWrong. I'm not going to name names and maybe you don't trust me. But I have observed this all directly. If you let people toy with your brain while you are under the influence of psychedelics, you should expect high odds of severe consequences. And your friends' mental health might suffer as well.

Edit: these are recent events. To my knowledge never referenced on LessWrong.

Edit: For anyone who feels the connection to Michael Vassar is too tenuous: the local Vassarite in question has directly stated "i purposefully induce mania in p
leogao
it's quite plausible (40% if I had to make up a number, but I stress this is completely made up) that someday there will be an AI winter or other slowdown, and the general vibe will snap from "AGI in 3 years" to "AGI in 50 years". when this happens it will become deeply unfashionable to continue believing that AGI is probably happening soonish (10-15 years), in the same way that suggesting that there might be a winter/slowdown is unfashionable today. however, I believe in these timelines roughly because I expect the road to AGI to involve both fast periods and slow bumpy periods. so unless there is some super surprising new evidence, I will probably only update moderately on timelines if/when this winter happens

Popular Comments

Recent Discussion

A new year has come. It's 2024 and note-taking isn’t cool anymore. The once-blooming space has had its moment. Moreover, the almighty Roam Research isn’t the only king anymore.

The hype is officially over.

At this time of year, when many are busy reflecting on the past year while excitedly looking to the future, I realized it's a good opportunity to look back at Roam’s madness timeline. The company that took the Twitterverse and Silicon Valley by storm is now long past its breakthrough.

Roam was one of those phenomena that happen every few years. Its appearance in our lives not only made the “tools for thought” niche fashionable; it marked a new era in the land of note-taking apps. In conjunction with a flourishing movement of internet intellectuals[1], it...

I'm still on Roam and using it every day. For me, it's not "a lot of work", it's what's necessary to keep track of my thoughts to the point that I feel like my mental workspace is clean. I've journaled a lot since I was a kid. I think better in writing. 

This is my permanent diary. I will probably have it for the rest of my life, if they keep supporting it. Twenty years from now, I'll want to know what I was doing today!

I also log literally all links of "general interest" in my browsing history in my public Roam. Does anyone care? Probably not, but it ... (read more)

1Elias711116
I am interested in all the ways we could improve our thinking. It was my initial impression that Andy Matuschak's Tools for Thought seem to aim at this, and I was convinced by his Evergreen Notes thesis. You're probably already familiar with his work. Can I get your input on why current note-taking systems fail at supporting thinking, and on some alternatives? Or if note-taking itself is missing the point, how can we augment good thinking?
7TsviBT
That's a big question, like asking a doctor "how do you make people healthy", except I'm not a doctor and there's basically no medical science, metaphorically. My literal answer is "make smarter babies" https://www.lesswrong.com/posts/jTiSWHKAtnyA723LE/overview-of-strong-human-intelligence-amplification-methods , but I assume you mean augmenting adults using computer software. For the latter: the only thing I think I know is that you'd have to do all of the following steps, in order:

0. Become really good at watching your own thinking processes, including/especially the murky / inexplicit / difficult / pretheoretic / learning-based parts.
1. Become really really good at thinking. Like, publish technical research that many people acknowledge is high quality, or something like that (maybe without the acknowledgement, but good luck self-grading). Apply 0.
2. Figure out what key processes from 1. could have been accelerated with software.
1Elias711116
Thank you for the response. Any physical or mental system meant to improve thinking and cognition for anyone right now. One obvious example is writing, which extends our capacity to think and remember. Another would be SRS, which helps solidify our memory. But your reply points at the more important inner mental systems. Sadly, I don't know any simple, obvious way to do the 0th step.
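On the SRS example: most spaced-repetition tools descend from the SM-2 scheduling rule, and a minimal sketch of the update it applies after each review (my simplified summary, not code from any particular tool) shows the mechanism concretely: each successful recall pushes the next review further out, so effort concentrates on the items you are about to forget.

```python
# Simplified SM-2 update: given the current interval, repetition count, and ease factor,
# plus a 0-5 self-rating of recall quality, compute when to show the card next.

def sm2_update(interval_days: int, repetitions: int, ease: float, quality: int):
    """Returns (next_interval_days, repetitions, ease)."""
    if quality < 3:                      # failed recall: start the card over tomorrow
        return 1, 0, ease
    if repetitions == 0:
        interval_days = 1
    elif repetitions == 1:
        interval_days = 6
    else:
        interval_days = round(interval_days * ease)   # successful reviews stretch the gap
    # ease drifts down for hesitant recalls, never below 1.3
    ease = max(1.3, ease + 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02))
    return interval_days, repetitions + 1, ease
```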

Introduces the idea of cognitive work as a parallel to physical work, and explains why concentrated sources of cognitive work may pose a risk to human safety.

Acknowledgements. Thanks to Echo Zhou and John Wentworth for feedback and suggestions.

Some of these ideas were presented originally in a talk in November 2024 at the Australian AI Safety Forum; the slides are here: Technical AI Safety (Aus Safety Forum 24), and the video is available on YouTube.

This post is the "serious" half of a pair, for the fun version see Causal Undertow.

Introduction

This essay explores the idea of cognitive work, by which we mean directed changes in the information content of the world that are unlikely to occur by chance. Just as power plants together with machines are sources of physical work, so too datacenters together...

14gate
This seems like a pretty cool perspective, especially since it might make analysis a little simpler vs. a paradigm where you kind of need to know what to look out for specifically. Are there any toy mathematical models or basically simulated worlds/stories, etc... to make this more concrete? I briefly looked at some of the slides you shared but it doesn't seem to be there (though maybe I missed something, since I didn't watch the entire video(s)). I'm not honestly sure exactly what this would look like since I don't fully understand much here beyond the notion that concentration of intelligence/cognition can lead to larger magnitude outcomes (which we probably already knew) and the idea that maybe we could measure this or use it to reason in some way (which maybe we aren't doing so much). Maybe we could have some sort of simulated game where different agents get to control civilizations (i.e. like Civ 5) and among the things they can invest (their resources) in, there is some measure of "cognition" (i.e. maybe it lets them plan further ahead or gives them the ability to take more variables into consideration when making decisions or to see more of the map...). With that said, it's not clear to me what would come out of this simulation other than maybe getting a notion of the relative value (in different contexts) of cognitive vs. physical investments (i.e. having smarter strategists vs. building a better castle). There's no clear question or hypothesis that comes to mind right now. It looks like from some other comments that literature on agent foundations might be relevant, but I'm not familiar. If I get time I might look into it in the future. Are these sorts of frameworks usable for actual decision making right now (and if so how can we tell/not) or are they still exploratory? Generally just curious if there's a way to make this more concrete, i.e. to understand it better.
5Thomas Kwa
I'm pretty skeptical of this because the analogy seems superficial. Thermodynamics says useful things about abstractions like "work" because we have the laws of thermodynamics. What are the analogous laws for cognitive work / optimization power? It's not clear to me that it can be quantified such that it is easily accounted for:

  • We all come from evolution. Where did the cognitive work come from?
  • Algorithms can be copied
  • Passwords can unlock optimization

It is also not clear what distinguishes LLM weights from the weights of a model trained on random labels from a cryptographic PRNG. They have the same amount of optimization done to them, but since CSPRNGs can't be broken just by training LLMs on them, the latter model is totally useless while the former is potentially transformative.

My guess is this way of looking at things will be like memetics in relation to genetics: likely to spawn one or two useful expressions like "memetically fit", but due to the inherent lack of structure in memes compared to DNA life, not a real field compared to other ways of measuring AIs and their effects (scaling laws? SLT?). Hope I'm wrong.

The analogous laws are just information theory. 

Re: a model trained on random labels. This seems somewhat analogous to building a power plant out of dark matter; to derive physical work it isn't enough to have some degrees of freedom somewhere that have a lot of energy, one also needs a chain of couplings between those degrees of freedom and the degrees of freedom you want to act on. Similarly, if I want to use a model to reduce my uncertainty about something, I need to construct a chain of random variables with nonzero mutual information linking the ... (read more)
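A toy numeric illustration of that "chain of couplings" point (my own addition, not from the thread): along a chain X → Y → Z, what you observe at the end can only carry information about X that the intermediate couplings pass along, so cutting the middle link leaves plenty of optimized bits but none of them about X.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

def noisy_copy(bits, flip_prob):
    """Pass bits through a binary symmetric channel (one 'coupling' in the chain)."""
    flips = rng.random(len(bits)) < flip_prob
    return np.where(flips, 1 - bits, bits)

def mutual_information(a, b):
    """I(A;B) in bits for two binary arrays, from empirical joint frequencies."""
    joint = np.array([[np.mean((a == i) & (b == j)) for j in range(2)] for i in range(2)])
    pa, pb = joint.sum(axis=1), joint.sum(axis=0)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log2(joint[mask] / np.outer(pa, pb)[mask])))

X = rng.integers(0, 2, n)        # the thing we want to reduce uncertainty about
Y = noisy_copy(X, 0.1)           # intermediate variable coupled to X
Z = noisy_copy(Y, 0.1)           # what we actually observe, coupled to X only via Y

Y_cut = rng.integers(0, 2, n)    # same entropy as Y, but no coupling to X
Z_cut = noisy_copy(Y_cut, 0.1)

print(mutual_information(X, Z))      # ~0.32 bits: a usable chain of couplings
print(mutual_information(X, Z_cut))  # ~0.00 bits: lots of bits, none of them about X
```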

2Daniel Murfet
Yes this seems like an important question but I admit I don't have anything coherent to say yet. A basic intuition from thermodynamics is that if you can measure the change in the internal energy between two states, and the heat transfer, you can infer how much work was done even if you're not sure how it was done. So maybe the problem is better thought of as learning to measure enough other quantities that one can infer how much cognitive work is being done. For all I know there is a developed thermodynamic theory of learning agents out there which already does this, but I didn't find it yet...

In light of reading through Raemon's shortform feed, I'm making my own. Here will be smaller ideas that are on my mind.

2Raemon
(I have not engaged with this thread deeply) I've talked to Michael Vassar many times in person. I'm somewhat confident he has taken LSD based on him saying so (although if this turned out wrong I wouldn't be too surprised, my memory is hazy). I definitely have the experience of him saying lots of things that sound very confusing and crazy, making pretty outlandish brainstormy-style claims that are maybe interesting, which he claims to take as literally true, that seem either false, or, at least, require a lot of inferential gap. I have also heard him make a lot of morally charged, intense statements that didn't seem clearly supported. (I do think I have valued talking to Michael, despite this; he is one of the people who helped unstick me in certain ways, but the mechanism by which he helped me was definitely via being kinda unhinged sounding.)
habryka20

I've talked to Michael Vassar many times in person. I'm somewhat confident he has taken LSD based on him saying so (although if this turned out wrong I wouldn't be too surprised, my memory is hazy)

I would take bets at 9:1 odds that Michael has taken large amounts of psychedelics. I would also take bets at similar odds that he promotes the use of psychedelics.

2Viliam
After sleeping on it, it seems to me that the topic we are talking about is "staring into the abyss": whether, when, and how to do it properly, and for what outcome.

The easiest way is to not do it at all. Just pretend that everything is flowers and rainbows, and refuse to talk about the darker aspects of reality. This is what we typically do with little children. A part of that is parental laziness: by avoiding difficult topics we avoid difficult conversations. But another part is that children are not cognitively ready to process nontrivial topics, so we try to postpone the debates about darker things until later, when they get the capability. Some lazy parents overdo it; some kids grow up living in a fairy tale world. Occasional glimpses of darkness can be dismissed as temporary exceptions to the general okay-ness of the world. "Grandma died, but now she is happy in Heaven." At this level, people who try to disrupt the peace are dismissed relatively gently, accused of spoiling the mood and frightening the kids.

When this becomes impossible because the darkness pushes its way beyond our filters, the next lazy strategy is to downplay the darkness. Either it is not so bad, or there is some silver lining to everything. "Death gives meaning to life." "The animals don't mind dying so that we can have meat to eat; they understand it is their role in the system." "Slavery actually benefits the blacks; they do not have the mental capacity to survive without a master." "What doesn't kill you, makes you stronger." At this point the pushback against those trying to disrupt the peace is stronger; people are aware that their rationalizations are fragile. Luckily, we can reframe the rationalizations as a sign of maturity, and dismiss those who disagree with us as immature. "When you grow up, you will realize that..."

Another possible reaction is trying to join the abyss. Yes, bad things happen, but since they are inevitable, there is no point worrying about that. Heck, if
11Jesse Hoogland
Agency = Prediction + Decision

AIXI is an idealized model of a superintelligent agent that combines "perfect" prediction (Solomonoff Induction) with "perfect" decision-making (sequential decision theory). OpenAI's o1 is a real-world "reasoning model" that combines a superhuman predictor (an LLM like GPT-4) with advanced decision-making (implicit search via chain of thought trained by RL).

To be clear: o1 is no AIXI. But AIXI, as an ideal, can teach us something about the future of o1-like systems.

AIXI teaches us that agency is simple. It involves just two raw ingredients: prediction and decision-making. And we know how to produce these ingredients. Good predictions come from self-supervised learning, an art we have begun to master over the last decade of scaling pretraining. Good decisions come from search, which has evolved from the explicit search algorithms that powered DeepBlue and AlphaGo to the implicit methods that drive AlphaZero and now o1.

So let's call "reasoning models" like o1 what they really are: the first true AI agents. It's not tool-use that makes an agent; it's how that agent reasons. Bandwidth comes second.

Simple does not mean cheap: pretraining is an industrial process that costs (hundreds of) billions of dollars. Simple also does not mean easy: decision-making is especially difficult to get right since amortizing search (=training a model to perform implicit search) requires RL, which is notoriously tricky.

Simple does mean scalable. The original scaling laws taught us how to exchange compute for better predictions. The new test-time scaling laws teach us how to exchange compute for better decisions.

AIXI may still be a ways off, but we can see at least one open path that leads closer to that ideal. The bitter lesson is that "general methods that leverage computation [such as search and learning] are ultimately the most effective, and by a large margin." The lesson from AIXI is that maybe these are all you need. The lesson from o1 is
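For concreteness, here is a minimal sketch of that two-ingredient decomposition (my own framing, not code from the shortform): a learned one-step predictor supplies imagined futures, and the "decision" half is just a search over candidate actions scored under those predictions. The `predict` function is a stand-in for whatever learned world model is assumed.

```python
import random

def act(state, predict, actions, horizon=5, rollouts=32):
    """predict(state, action) -> (next_state, reward) is the learned model (assumption).
    Decision-making = search: score each candidate action by Monte Carlo rollouts
    simulated entirely inside the model, then pick the best-scoring one."""
    def rollout_value(start_state, first_action):
        total, s, a = 0.0, start_state, first_action
        for _ in range(horizon):
            s, r = predict(s, a)
            total += r
            a = random.choice(actions)   # crude rollout policy; real systems search harder
        return total

    scores = {a: sum(rollout_value(state, a) for _ in range(rollouts)) / rollouts
              for a in actions}
    return max(scores, key=scores.get)   # the action with the best predicted return

# usage sketch: act(current_state, predict=my_world_model, actions=[0, 1, 2])
```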

Marcus Hutter on AIXI and ASI safety 


A fool learns from their own mistakes
The wise learn from the mistakes of others.

– Otto von Bismarck 

A problem as old as time: The youth won't listen to your hard-earned wisdom. 

This post is about learning to listen to, and to communicate, wisdom. It is very long – I considered breaking it up into a sequence, but, each piece felt necessary. I recommend reading slowly and taking breaks.

To begin, here are three illustrative vignettes:

The burnt out grad student

You warn the young grad student "pace yourself, or you'll burn out." The grad student hears "pace yourself, or you'll be kinda tired and unproductive for like a week." They're excited about their work, and/or have internalized authority figures yelling at them if they aren't giving their all. 

They don't pace themselves. They burn out.

The

...
Raemon40

(FYI this is George from the essay, in case people were confused)

1Anders Lindström
Great post! I personally have a tendency to disregard wisdom because it feels "too easy": if I am given some advice and it works, I think it was just luck or correlation; then I have to go and try "the other way (my way...)" and get a punch in the face from the universe and then be like "ohhh, so that's why I should have stuck to the advice". Now when I think about it, it might also be because of intellectual arrogance, that I think I am smarter than the advice or the person that gives the advice. But I have lately started to think a lot about why we think that successful outcomes require overreaching and burnout. Why do we have to fight so hard for everything and feel kind of guilty if it came to us without much effort? So maybe my failure to heed wise advice is based in a need to achieve (overdo, modify, add, reduce, optimize etc.) rather than to just be.
1Jonas Hallgren
First and foremost, it was quite an interesting post, and my goal with this comment is to try to connect my own frame of thinking with the one presented here. My main question is about the relationship between emotions/implicit thoughts and explicit thinking. My first thought was on the frame of thinking versus feeling and how these flow together. If we think of emotions as probability clouds that tell us whether to go in one direction or another, we can see them as systems for making decisions in highly complex environments, such as when working on impossible problems. I think something like research taste is exactly this - highly trained implicit thoughts and emotions. Continuing from something like tuning your cognitive systems, I notice that this is mostly done with System 2 and I can't help but feel that it's missing some System 1 stuff here. I will give an analogy similar to a meditation analogy as this is the general direction I'm pointing in:

If we imagine that we're faced with a wall of rock, it looks like a very big problem. You're thinking to yourself, "fuck, how in the hell are we ever going to get past that thing?" So first you just approach it and you start using a pickaxe to hack away at it; you make some local progress, yet it is hard to reflect on where to go. You think hard: what are the properties of this rock that allow me to go through it faster? You continue, yet you're starting to feel discouraged as you're not making any progress; you think to yourself, "Fuck this goddamn rock man, this shit is stupid." You're not getting any feedback since it is an almost impossible problem.

Above is the base analogy; following are two points on the post from this analogy:

1. Let's start with a continuation to the analogy: imagine that your goal, the thing behind the huge piece of rock, is a source of gravity and you're water. You're continuously striving towards it yet the way that you do it is that you flow over the surface. You're probing for holes in the
2Raemon
My overall frame is that it's best to have emotional understanding and System 2 deeply integrated. How to handle local tradeoffs unfortunately depends a lot on your current state, and where your bottlenecks are. Could you provide a specific, real-world example where the tradeoff comes up and you're either unsure of how to navigate it, or you think I might suggest navigating it differently?

This is a personal post: I'm not speaking for SecureBio or BIDA.

I help organize a contra dance that requires high filtration masks (N95 etc) at half of our dances. When we restarted in 2022 we required masks at all our dances, before switching to half in 2023. We just ran a survey of our dancers, and while there are people who would like to not have to wear masks there are also a lot of people who are only willing to come if they know all the dancers will be masked. [1]

Last week I attended a conference for work with a lot of people thinking about biosecurity, which has me wondering about ways we could have a hall as safe as one where the dancers are all wearing N95s but without the ways N95s make it...

2lemonhope
How do you measure results?
jefftk40

Put particles in the air and measure how quickly they're depleted. ex: Evaluating a Corsi-Rosenthal Filter Cube
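As a concrete illustration of how that measurement turns into a number (illustrative values of my own, not Jeff's data): seed the room with particles, log the concentration over time, and fit an exponential decay. The fitted rate, in air changes per hour, is directly comparable across filter setups, and multiplying by an assumed room volume gives an effective clean-air delivery rate.

```python
import numpy as np

minutes = np.array([0, 5, 10, 15, 20, 30, 45, 60], dtype=float)      # time since release
counts = np.array([9800, 7300, 5500, 4100, 3100, 1800, 800, 350.0])  # particles/L (made up)

# log-linear fit: counts ~ C0 * exp(-k * t)
slope, intercept = np.polyfit(minutes, np.log(counts), 1)
decay_per_hour = -slope * 60          # effective air changes per hour (ACH equivalent)

room_volume_m3 = 500                  # assumed hall volume, purely illustrative
cadr_m3_per_hour = decay_per_hour * room_volume_m3

print(f"decay rate ~ {decay_per_hour:.1f} ACH, effective CADR ~ {cadr_m3_per_hour:.0f} m3/h")
```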

After the release of Ben Pace's extended interview with me about my views on religion, I felt inspired to publish more of my thinking about religion in a format that's more detailed, compact, and organized. This post is the second publication in my series of intended posts about religion.

Thanks to Ben Pace, Chris Lakin, Richard Ngo, Damon Pourtahmaseb-Sasi, Marcello Herreshoff, Renshin Lauren Lee, Mark Miller, Roger Thisdell, and Imam Ammar Amonette for their feedback on this post, and thanks to Kaj Sotala, Tomáš Gavenčiak, Paul Colognese, and David Spivak for reviewing earlier versions of this post. Thanks especially to Renshin Lauren Lee, Roger Thisdell, and Imam Ammar Amonette for their input on my claims about perennialism, and Mark Miller for vetting my claims about predictive processing.


In my previous...

terminal values in the first place, as opposed to active blind spots masquerading as terminal values.

Can't one's terminal values be exactly (mechanistically implemented as) active blind spots?

I predict that you would say something like "The difference is that active blind spots can be removed/healed/refactored 'just' by (some kind of) learning, so they're not unchanging as one's terminal values would be assumed to be."?

Yeah, that's a good point. I certainly don't claim that Michael is to blame for her actions.

4AprilSR
I don't necessarily agree with every line in this post—I'd say I'm better off and still personally kinda like Olivia, though it's of course been rocky at times—but it does all basically look accurate to me. She stayed at my apartment for maybe a total of 1-2 months earlier this year, and I've talked to her a lot. I don't think she presented the JD Pressman thing as about "lying" to me, but she did generally mention him convincing people to keep her out of things. There is a lot more I could say, and I am as always happy to answer dms and such, but I am somewhat tired of all this and I don't right at this moment really want to either figure out exactly what things I feel are necessary to disclose about a friend of mine or try to figure out what would be a helpful contribution to years old drama, given that it's 1:30am. But I do want to say that I basically think Melody's statements are all more-or-less reasonable.
6tailcalled
Reminder not to sell your soul(s) to the devil.
3sapphire
I don't think he is directly responsible. But recent events are imo further evidence his methods are bad. If I said some dangerous teacher was Buddhist I would not be implicating the Buddha directly. Though it would be some evidence for the Buddha failing as a teacher.