My mainline prediction scenario for the next decades.
My mainline prediction*:
Governments will act quickly and (relatively) decisively to bring these agents under state control. National security concerns will dominate.
I dunno. Like 20 years ago, if someone had said "By the time somebody creates AI that displays common-sense reasoning, passes practically any written test up to and including graduate level, (etc.), obviously governments will be flipping out and nationalizing AI companies etc.", that would have seemed like a reasonable claim to me. But here we are, and the idea of the US government nationalizing OpenAI seems a million miles outside the Overton window.
Likewise, if someone said “After it becomes clear to everyone that lab leaks can cause pandemics costing trillions of dollars and millions of lives, then obviously governments will be flipping out and banning the study of dangerous viruses—or at least, passing stringent regulations with intrusive monitoring and felony penalties for noncompliance etc,” then that would also have sounded reasonable to me! But again, here we are.
So anyway, my conclusion is that when I ask my intuition / imagination whether governments will flip out in thus-and-such circumstance, my intuition / imagination is really ba...
I think this will look a bit outdated in 6-12 months, when there is no longer a clear distinction between LLMs and short term planning agents, and the distinction between the latter and LTPAs looks like a scale difference comparable to GPT2 vs GPT3 rather than a difference in kind. At what point do you imagine a national government saying "here but no further?".
I think scaffolding is the wrong metaphor. Sequences of actions, observations and rewards are just more tokens to be modeled, and if I were running Google I would be busy instructing all work units to start packaging up such sequences of tokens to feed into the training runs for Gemini models. Many seemingly minor tasks (e.g. app recommendation in the Play store) either have, or could have, components of RL built into the pipeline, and could benefit from incorporating LLMs, either by putting the RL task in-context or through fine-tuning of very fast cheap models.
So when I say I don't see a distinction between LLMs and "short term planning agents" I mean that we already know how to subsume RL tasks into next token prediction, and so there is in some technical sense already no distinction. It's a question of how the underlying capabilities are packaged and deployed, and I think that within 6-12 months there will be many internal deployments of LLMs doing short sequences of tasks within Google. If that works, then it seems very natural to just scale up sequence length as generalisation improves.
Arguably fine-tuning a next-token predictor on action, observation, reward sequences, or doing it in-context, is inferior to using algorithms like PPO. However, the advantage of knowledge transfer from the rest of the next-token predictor's data distribution may more than compensate for this on some short-term tasks.
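A minimal sketch of what "packaging up such sequences of tokens" might look like. The tag format, function name, and episode encoding are made-up illustrations (a stand-in for a real tokenizer and pipeline), not any lab's actual setup.

```python
# Sketch: flattening (observation, action, reward) sequences into plain
# token streams so an autoregressive next-token predictor can be
# fine-tuned on them. Tagged strings stand in for real tokens.

def serialize_episode(episode):
    """Flatten a list of (observation, action, reward) steps into one
    token stream."""
    tokens = []
    for obs, action, reward in episode:
        tokens += [f"<obs:{obs}>", f"<act:{action}>", f"<rew:{reward}>"]
    return tokens

episode = [("s0", "left", 0), ("s1", "right", 1)]
print(serialize_episode(episode))
# ['<obs:s0>', '<act:left>', '<rew:0>', '<obs:s1>', '<act:right>', '<rew:1>']
```

Once RL episodes look like this, "short-term planning" is just next-token prediction over a slightly unusual data distribution, which is the sense in which the distinction dissolves.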
I'm a bit confused by what you mean by "LLMs will not scale to AGI" in combination with "a single innovation is all that is needed for AGI".
E.g., consider the following scenarios:
IMO, these sound very similar to "LLMs scale to AGI" for many practical purposes:
Maybe it is really key in your view that the single innovation is really discontinuous and maybe the single innovation doesn't really require LLM scaling.
I think a single remaining innovation to create LTPAs is unlikely, because it runs contrary to the history of technology and of machine learning. For example, in the 10 years before AlphaGo and before GPT-4, several different innovations were required, and that's if you count "deep learning" as one item. This actually understates the number, because different components of the transformer architecture, like attention, residual streams, and transformer++ innovations, were all developed separately.
Then I think you should specify that progress within this single innovation could be continuous over years and include 10+ ML papers in sequence each developing some sub-innovation.
Novel Science is Inherently Illegible
Legibility, transparency, and open science are generally considered positive attributes, while opacity, elitism, and obscurantism are viewed as negative. However, increased legibility in science is not always beneficial and can often be detrimental.
Scientific management, with some exceptions, likely underperforms compared to simpler heuristics such as giving money to smart people or implementing grant lotteries. Scientific legibility suffers from the classic "Seeing like a State" problems. It constrains endeavors to the least informed stakeholder, hinders exploration, inevitably biases research toward the simple and myopic, and exposes researchers to a constant political tug-of-war between different interest groups, poisoning objectivity.
I think the above would be considered relatively uncontroversial in EA circles. But I posit there is something deeper going on:
Novel research is inherently illegible. If it were legible, someone else would have already pursued it. As science advances, its concepts become increasingly counterintuitive and further from common sense. Most of the legible low-hanging fruit has already been picked, and novel research requires venturing higher into the tree, pursuing illegible paths with indirect and hard-to-foresee impacts.
Novel research is inherently illegible.
I'm pretty skeptical of this and think we need data to back up such a claim. However there might be bias: when anyone makes a serendipitous discovery it's a better story, so it gets more attention. Has anyone gone through, say, the list of all Nobel laureates and looked at whether their research would have seemed promising before it produced results?
Thanks for your skepticism, Thomas. Before we get into this, I'd like to make sure we actually disagree. My position is not that scientific progress is mostly due to plucky outsiders who are ignored for decades. (I feel something like this is a popular view on LW.) Indeed, I think most scientific progress is made through pretty conventional (academic) routes.
I think one can predict that future scientific progress will likely be made by young smart people at prestigious universities and research labs specializing in fields that have good feedback loops and/or have historically made a lot of progress: physics, chemistry, medicine, etc
My contention is that beyond very broad predictive factors like this, judging whether a research direction is fruitful is hard & requires inside knowledge. Much of this knowledge is illegible, difficult to attain because it takes a lot of specialized knowledge etc.
Do you disagree with this?
I do think that novel research is inherently illegible. Here are some thoughts on your comment:
1. Before getting into your Nobel prize proposal, I'd like to caution against hindsight bias (for obvious reasons).
And perhaps, to some degree, I'd like to argue the burden of proof...
I guess I'm not sure what you mean by "most scientific progress," and I'm missing some of the history here, but my sense is that importance-weighted science happens proportionally more outside of academia. E.g., Einstein did his miracle year outside of academia (and later stated that he wouldn't have been able to do it, had he succeeded at getting an academic position), Darwin figured out natural selection, and Carnot figured out the Carnot cycle, all mostly on their own, outside of academia. Those are three major scientists who arguably started entire fields (quantum mechanics, biology, and thermodynamics). I would anti-predict that future scientific progress, of the field-founding sort, comes primarily from people at prestigious universities, since they, imo, typically have some of the most intense gatekeeping dynamics which make it harder to have original thoughts.
Thank you, Thomas. I believe we find ourselves in broad agreement. The distinction you make between lay-legibility and expert-legibility is especially well-drawn.
One point: the confidence of researchers in their own approach may not be the right thing to look at. Perhaps a better measure is seeing who can not only predict that their own approach will succeed but also explain in detail why other approaches won't work. Anecdotally, very successful researchers have a keen sense of what will work out and what won't: in private conversation, many are willing to share detailed models of why other approaches will not work or are not as promising. I'd have to think about this more carefully, but anecdotally the most successful researchers have many bits of information over their competitors, not just one or two. (Note that one bit of information means that their entire advantage could be wiped out by answering a single Y/N question. Not impossible, but not typical for most cases.)
Why don't animals have guns?
Or why didn't evolution evolve the Hydralisk?
Evolution has found (sometimes multiple times) the camera, general intelligence, nanotech, electronavigation, aerial endurance better than any drone, bodies more flexible than any human-made robot, highly efficient photosynthesis, etc.
First of all let's answer another question: why didn't evolution evolve the wheel like the alien wheeled elephants in His Dark Materials?
Is it biologically impossible to evolve?
Well, technically, the bacterial flagellum is a proper wheel.
No, the likely answer is that wheels are great when you have roads and suck when you don't. Roads are built by ants to some degree, but on the whole probably don't make sense for an animal-intelligence species.
Aren't there animals that use projectiles?
Hold up. Is it actually true that there is not a single animal with a gun, harpoon or other projectile weapon?
Porcupines have quills, some snakes spit venom, and a type of fish spits water as a projectile to knock insects off leaves, then eats them. Bombardier beetles can produce an explosive chemical mixture. Skunks use some other chemicals. Some snails shoot harpoons from very c...
My timelines are lengthening.
I've long been a skeptic of scaling LLMs to AGI*. I fundamentally don't understand how this is even possible. It must be said that very smart people give this view credence: davidad, dmurfet. On the other side are Vanessa Kosoy and Steven Byrnes. When pushed, proponents don't actually defend the position that a large enough transformer will create nanotech or even obsolete their job. They usually mumble something about scaffolding.
I won't get into this debate here, but I do want to note that my timelines have lengthened, primarily because some of the never-clearly-stated but heavily implied AI developments promised by proponents of very short timelines have not materialized. To be clear, it has only been a year since gpt-4 was released, and gpt-5 is around the corner, so perhaps my hope is premature. Still, my timelines are lengthening.
A year ago, when gpt-4 came out, progress was blindingly fast. Part of short timelines came from a sense of "if we got surprised so hard by gpt2-3, we are completely uncalibrated, who knows what comes next".
People seemed surprised by gpt-4 in a way that seemed uncalibrated to me. gpt-4 performance was basically in li...
Yes agreed.
What I don't get about this position: if it was indeed just scaling, what's AI research for? There is nothing to discover, just scale more compute. Sure, you can maybe improve the speed of deploying compute a little, but at its core it seems like a story that's in conflict with itself?
Here are two arguments for low-hanging algorithmic improvements.
First, in the past few years I have read many papers containing low-hanging algorithmic improvements. Most such improvements are a few percent or tens of percent. The largest such improvements are things like transformers or mixture of experts, which are substantial steps forward. Such a trend is not guaranteed to persist, but that’s the way to bet.
Second, existing models are far less sample-efficient than humans. We receive about a billion tokens growing to adulthood. The leading LLMs get orders of magnitude more than that. We should be able to do much better. Of course, there’s no guarantee that such an improvement is “low hanging”.
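A quick sanity check on the gap. The billion-token figure for humans is from the text above; the ten-trillion-token figure for a leading LLM is my assumed order of magnitude.

```python
import math

# Rough sample-efficiency comparison; both figures are order-of-magnitude guesses.
human_tokens = 1e9    # tokens a human receives growing to adulthood (from the text)
llm_tokens = 1e13     # assumed training tokens for a leading LLM

gap_in_orders = math.log10(llm_tokens / human_tokens)
print(f"LLMs see ~{gap_in_orders:.0f} orders of magnitude more tokens than humans")
# -> LLMs see ~4 orders of magnitude more tokens than humans
```

Even closing one of those four orders of magnitude would be a large algorithmic improvement by the standards of the papers mentioned above.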
Encrypted Batteries
(I thank Dmitry Vaintrob for the idea of encrypted batteries. Thanks to Adam Scholl for the alignment angle, and to the Computational Mechanics crowd at the recent CompMech conference.)
There are no Atoms in the Void, just Bits in the Description. Given the right string, a Maxwell Demon transducer can extract energy from a heat bath.
Imagine a pseudorandom heatbath + nano-Demon. It looks like a heatbath from the outside, but secretly there is a private key string that, when fed to the nano-Demon, allows it to extract lots of energy from the heatbath.
P.S. Beyond the current ken of humanity lies a generalized concept of free energy that describes the generic potential ability or power of an agent to achieve goals. Money, the golden calf of Baal, is one of its many avatars. Could there be ways to encrypt generalized free-energy batteries that constrain the user to only use this power for good? It would be like money that could only be spent on good things.
Imagine a pseudorandom heatbath + nano-Demon. It looks like a heatbath from the outside, but secretly there is a private key string that, when fed to the nano-Demon, allows it to extract lots of energy from the heatbath.
What would a 'pseudorandom heatbath' look like? I would expect most objects to quickly depart from any sort of private key or PRNG. Would this be something like... a reversible computer which shuffles around a large number of blank bits in a complicated pseudo-random order every timestep*, exposing a fraction of them to external access? so a daemon with the key/PRNG seed can write to the blank bits with approaching 100% efficiency (rendering it useful for another reversible computer doing some actual work) but anyone else can't do better than 50-50 (without breaking the PRNG/crypto) and that preserves the blank bit count and is no gain?
* As I understand reversible computing, you can have a reversible computer which does that for free: if this is something like a very large period loop blindly shuffling its bits, it need erase/write no bits (because it's just looping through the same states forever, akin to a time crystal), and so can be computed indefinitely at arbitrarily low energy cost. So any external computer which syncs up to it can also sync at zero cost, and just treat the exposed unused bits as if they were its own, thereby saving power.
Paradox of Ignorance
Paul Christiano presents the "paradox of ignorance" where a weaker, less informed agent appears to outperform a more powerful, more informed agent in certain situations. This seems to contradict the intuitive desideratum that more information should always lead to better performance.
The example given is of two agents, one powerful and one limited, trying to determine the truth of a universal statement ∀x:ϕ(x) for some Δ0 formula ϕ. The limited agent treats each new value of ϕ(x) as a surprise and evidence about the generalization ∀x:ϕ(x). So it can query the environment about some simple inputs x and get a reasonable view of the universal generalization.
In contrast, the more powerful agent may be able to deduce ϕ(x) directly for simple x. Because it assigns these statements prior probability 1, they don't act as evidence at all about the universal generalization ∀x:ϕ(x). So the powerful agent must consult the environment about more complex examples and pay a higher cost to form reasonable beliefs about the generalization.
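A toy numerical version of the two agents, with made-up numbers: prior 0.5 on the universal claim H = "∀x: ϕ(x)", and an alternative hypothesis under which each instance ϕ(x) holds independently with probability 0.9.

```python
# Toy model of the paradox. The limited agent treats cheap, simple
# instances of phi(x) as evidence; the powerful agent has already
# deduced them (assigns them probability 1), so they carry no
# information for it. All numbers are illustrative.

def posterior_after(k_informative, prior=0.5, alt_rate=0.9):
    """Posterior on H after k observed instances that were informative,
    i.e. not already assigned probability 1 by the agent."""
    like_h = 1.0                      # H predicts phi(x) with certainty
    like_alt = alt_rate ** k_informative
    return prior * like_h / (prior * like_h + (1 - prior) * like_alt)

# Limited agent: 20 simple instances all count as evidence.
print(round(posterior_after(20), 3))   # -> 0.892
# Powerful agent: the same 20 instances were deducible a priori, so k = 0.
print(round(posterior_after(0), 3))    # -> 0.5
```

The limited agent ends up fairly confident in the generalization for free, while the powerful agent must pay for complex queries to move off its prior.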
Is it really a problem?
However, I argue that the more powerful agent is act...
One of the interesting thing about AI minds (such as LLMs) is that in theory, you can turn many topics into testable science while avoiding the 'problem of old evidence', because you can now construct artificial minds and mold them like putty. They know what you want them to know, and so you can see what they would predict in the absence of knowledge, or you can install in them false beliefs to test out counterfactual intellectual histories, or you can expose them to real evidence in different orders to measure biases or path dependency in reasoning.
With humans, you can't do that because they are so uncontrolled: even if someone says they didn't know about crucial piece of evidence X, there is no way for them to prove that, and they may be honestly mistaken and have already read about X and forgotten it (but humans never really forget so X has already changed their "priors", leading to double-counting), or there is leakage. And you can't get people to really believe things at the drop of a hat, so you can't make people imagine, "suppose Napoleon had won Waterloo, how do you predict history would have changed?" because no matter how you try to participate in the spirit of the exerci...
Corrupting influences
The EA AI safety strategy has had a large focus on placing EA-aligned people in A(G)I labs. The thinking was that having enough aligned insiders would make a difference on crucial deployment decisions & longer-term alignment strategy. We could say that the strategy is an attempt to corrupt the goal of pure capability advance & making money towards the goal of alignment. This fits into a larger theme that EA needs to get close to power to have real influence.
[See also the large donations EA has made to OpenAI & Anthropic. ]
Whether this strategy paid off... too early to tell.
What has become apparent is that the large AI labs & being close to power have had a strong corrupting influence on EA epistemics and culture.
Pockets of Deep Expertise
Why am I so bullish on academic outreach? Why do I keep hammering on 'getting the adults in the room'?
It's not that I think academics are all Super Smart.
I think rationalists/alignment people correctly ascertain that most professors don't have much useful to say about alignment & deep learning and often say silly things. They correctly see that much of AI progress is fueled by labs and scale, not ML academia. I am bullish on non-ML academia, especially mathematics, physics, and to a lesser extent theoretical CS, neuroscience, and some parts of ML/AI academia. This is because while I think 95% of academia is bad and/or useless, there are Pockets of Deep Expertise. Most questions in alignment are close to existing work in academia in some sense - but we have to make the connection!
A good example is 'sparse coding' and 'compressed sensing'. Lots of mech.interp has been rediscovering some of the basic ideas of sparse coding. But there is vast expertise in academia about these topics. We should leverage these!
Other examples are singular learning theory, computational mechanics, etc
Fractal Fuzz: making up for size
GPT-3 recognizes 50k possible tokens. For a 1000-token context window, that means there are 50,000^1000 ≈ 10^4699 possible prompts. Astronomically large. If we assume the output of a single run of GPT is 200 tokens, then for each possible prompt there are 50,000^200 ≈ 10^940 possible continuations.
GPT-3 is probabilistic, defining for each possible prompt a distribution on a set of size 50,000^200, in other words a space of dimension 50,000^200 − 1. [1]
Mind-bogglingly large. Compared to these numbers, the amount of data (40 trillion tokens??) and the size of the model (175 billion parameters) seem absolutely puny.
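These magnitudes can be reproduced in a couple of lines, using the vocabulary size, context length, and output length stated above:

```python
import math

# Back-of-envelope magnitudes: 50k-token vocabulary, 1000-token prompts,
# 200-token continuations, 175B parameters (figures from the text).
vocab, prompt_len, output_len = 50_000, 1000, 200

log10_prompts = prompt_len * math.log10(vocab)        # ~4699
log10_continuations = output_len * math.log10(vocab)  # ~940

print(f"possible prompts       ~ 10^{log10_prompts:.0f}")
print(f"continuations / prompt ~ 10^{log10_continuations:.0f}")
print(f"model parameters       ~ 10^{math.log10(175e9):.0f}")
```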
I won't be talking about the data, or 'overparameterizations' in this short, that is well-explained by Singular Learning Theory. Instead, I will be talking about nonrealizability.
Nonrealizability & the structure of natural data
Recall the setup of (parametric) Bayesian learning: there is a sample space X, a true distribution q on X, and a parameterized family of probability distributions {p_w : w ∈ W}.
It is often assumed that the true distrib...
Q: What is it like to understand advanced mathematics? Does it feel analogous to having mastery of another language like in programming or linguistics?
A: It's like being stranded on a tropical island where all your needs are met, the weather is always perfect, and life is wonderful.
Except nobody wants to hear about it at parties.
level 0: A state of ignorance. You live in a pre-formal mindset. You don't know how to formalize things. You don't even know what it would mean to 'prove something mathematically'. This stage is perhaps the longest. It is the default state of a human. Most anti-theory sentiment comes from this state. Since you've neve
You can't productively read math books. You often decry that mathematicians make books way too hard to read. If only they would take the time to explain things simply, you would understand.
level 1: all math is an amorphous blob
You know the basics of writing an epsilon-delta proof. Although you don't know why the rules of math are the way they are, you can at least follow the recipes. You can follow simple short proofs, albeit slowly.
You know there are differen...
Why no prediction markets for large infrastructure projects?
Been reading this excellent piece on why prediction markets aren't popular. It argues that without subsidies prediction markets won't be large enough, and that the information value of prediction markets is often not high enough.
Large infrastructure projects undertaken by governments and other large actors often go over budget, often hilariously so: 3x, 5x, 10x or more is not uncommon, indeed often even the standard.
One of the reasons is that government officials deciding on billion-dollar infrastructure projects don't have enough skin in the game. Politicians often aren't in office long enough to care about the time horizons of large infrastructure projects. Contractors don't gain by being efficient or delivering on time; to the contrary, infrastructure projects are huge cash cows. Another problem is that there are often far too many veto-stakeholders. All too often the initial bid is wildly overoptimistic.
Similar considerations apply to other government projects like defense procurement or IT projects.
Okay - how to remedy this situation? Internal prediction markets theoretically could prove beneficial. All stakeholders &...
Feature request: author-driven collaborative editing [CITATION needed] for the Good and Glorious Epistemic Commons.
Often I find myself writing claims which would ideally have citations but I don't know an exact reference, don't remember where I read it, or am simply too lazy to do the literature search.
This is bad: scholarship is a rationalist virtue. Proper citation is key to preserving and growing the epistemic commons.
It would be awesome if my laziness were rewarded by giving me the option to add a [CITATION needed] to which others could then suggest (push) a citation, link, or short remark, which the author (me) could then accept. The contribution of the citator would be acknowledged, of course. [Even better would be a central database tracking citations & links, with crosslinking like Wikipedia.]
a sort of hybrid vigor of Community Notes and Wikipedia, if you will. But it's collaborative, not adversarial*
author: blablablabla
sky is blue [citation Needed]
blabblabla
intrepid bibliographer: (push) [1] "I went outside and the sky was blue", Letters to the Empirical Review
*Community Notes on Twitter was a universally lauded concept when it first launched. Unfortunately, we are already seeing it being abused, often for unreplyable cheap dunks. I still think it's a good addition to Twitter, but it does show how difficult it is to create shared, agreed-upon epistemics in an adversarial setting.
Wildlife Welfare Will Win
The long arc of history bends towards gentleness and compassion. Future generations will look with horror on factory farming. And already young people are following this moral thread to its logical conclusion, turning their eyes in disgust to Mother Nature, red in tooth and claw. Wildlife Welfare Done Right, compassion towards our pets followed to its forceful conclusion, would entail the forced uploading of all higher animals, and, judging by the memetic virulence of shrimp welfare, of lower animals as well.
Morality-upon-reflex...
[see also Hanson on rot, generalizations of the second law to nonequilibrium systems (Baez-Pollard, Crutchfield et al.) ]
Imperfect Persistence of Metabolically Active Engines
All things rot. Individual organisms, societies-at-large, businesses, churches, empires and maritime republics, man-made artifacts of glass and steel, creatures of flesh and blood.
Conjecture #1 There is a lower bound on the amount of dissipation / rot that any metabolically-active engine creates.
Conjecture #2 Metabolic Rot of an engine is proportional to (1) size and complexity o...
"I dreamed I was a butterfly, flitting around in the sky; then I awoke. Now I wonder: Am I a man who dreamt of being a butterfly, or am I a butterfly dreaming that I am a man?"- Zhuangzi
Questions I have that you might have too:
In this shortform I will try and...
Clem's Synthetic- Physicalist Hypothesis
The mathematico-physicalist hypothesis states that our physical universe is actually a piece of math. It was famously popularized by Max Tegmark.
It's one of those big-brain ideas that sound profound when you first hear about it, then you think about it some more and you realize it's vacuous.
Recently, in a conversation with Clem von Stengel they suggested a version of the mathematico-physicalist hypothesis that I find provoking.
Synthetic mathematics
'Synthetic' mathematics is a bit of weird name...
Know your scientific competitors.
In trading, entering a market dominated by insiders without proper research is a sure-fire way to lose a lot of money and time. Fintech companies go to great lengths to uncover their competitors' strategies while safeguarding their own.
A friend who worked in trading told me that traders would share subtly incorrect advice on trading Discords to mislead competitors and protect their strategies.
Surprisingly, in many scientific disciplines researchers are often curiously incurious about their peers' work.
The long f...
Idle thoughts about UDASSA I: the Simulation hypothesis
I was talking to my neighbor about UDASSA the other day. He mentioned a book I keep getting recommended but never read where characters get simulated and then the simulating machine is progressively slowed down.
One would expect one wouldn't be able to notice from inside the simulation that the simulating machine is being slowed down.
This presents a conundrum for simulation style hypotheses: if the simulation can be slowed down 100x without the insiders noticing, why not 1000x or 10^100x or ...
Why do people like big houses in the countryside /suburbs?
Empirically, people move out to the suburbs/countryside when they have children and/or gain wealth. Having a big house with a large yard is the quintessential American dream.
But why? Dense cities are economically more productive, and commuting is measurably one of the worst factors for happiness and productivity. Raising kids in small houses is totally possible, and people have done so at far higher densities in the past.
Yet people will spend vast amounts of money on living in a large house wi...
I can report my own feelings with regards to this. I find cities (at least the American cities I have experience with) to be spiritually fatiguing. The constant sounds, the lack of anything natural, the smells - they all contribute to a lack of mental openness and quiet inside of myself.
The older I get the more I feel this.
Jefferson had a quote that might be related, though to be honest I'm not exactly sure what he was getting at:
I think our governments will remain virtuous for many centuries; as long as they are chiefly agricultural; and this will be as long as there shall be vacant lands in any part of America. When they get piled upon one another in large cities, as in Europe, they will become corrupt as in Europe. Above all things I hope the education of the common people will be attended to; convinced that on their good sense we may rely with the most security for the preservation of a due degree of liberty.
One interpretation of this is that Jefferson thought there was something spiritually corrupting about cities. This is supported by another quote:
...
I view great cities as pestilential to the morals, the health and the liberties of man. true, they nourish some of the eleg
Why (talk-)Therapy
Therapy is a curious practice. It sounds like a scam, quackery, pseudo-science, but RCTs consistently show that therapy has benefits above and beyond medication & placebo.
Therapy has a long history. The Dodo bird verdict states that it doesn't matter which form of therapy you do: they all work equally well. This suggests that priests and shamans served the function of a therapist. In the past, one would confess one's sins to a priest or speak with the local shaman.
There is also the thing that therapy ...
Four levels of information theory
There are four levels of information theory.
Level 1: Number
Information is measured by a single number, the Shannon entropy H(X) = −∑_x p(x) log p(x).
Level 2: Random variable
Look at the underlying random variable, the 'surprisal' −log p(X), of which entropy is the expectation.
Level 3: Coding functions
Shannon's source coding theorem says the entropy of a source X is the expected number of bits for an optimal encoding of samples of X.
Related quantities like mutual information, relative entropy, cross e...
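Levels 1 and 2 can be illustrated in a few lines: surprisal is a random variable, and entropy is its expectation. The toy source distribution is made up.

```python
import math

# A toy source distribution over three symbols.
p = {"a": 0.5, "b": 0.25, "c": 0.25}

def surprisal(x):
    """Level 2: the random variable -log2 p(X), evaluated at outcome x."""
    return -math.log2(p[x])

# Level 1: entropy is the expected surprisal, a single number in bits.
entropy = sum(p[x] * surprisal(x) for x in p)
print(surprisal("a"), surprisal("b"))   # -> 1.0 2.0
print(entropy)                          # -> 1.5
```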
[This is joint thinking with Sam Eisenstat. Also thanks to Caspar Oesterheld for his thoughtful comments. Thanks to Steve Byrnes for pushing me to write this out.]
The Hyena problem in long-term planning
Logical induction is a nice framework for thinking about bounded reasoning. Very soon after the discovery of logical induction, people tried to make logical inductor decision makers work. This is difficult to make work; one of the two obstacles is
Obstacle 1: Untaken Actions are not Observable
Caspar Oesterheld brilliantly solved this problem by using auction ma...
Latent abstractions Bootlegged.
Let X_1, …, X_n be random variables distributed according to a probability distribution P on a sample space Ω.
Defn. A (weak) natural latent of X_1, …, X_n is a random variable Λ such that
(i) X_1, …, X_n are independent conditional on Λ
(ii) [reconstructability] Λ can be recovered from (X_j)_{j≠i}, for all i
[This is not really reconstructability, more like a stability property. The information is contained in many parts of the system... I might al...
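For concreteness, here is one way the two conditions are often written information-theoretically. This is my reconstruction; the exact (approximate) quantitative form used in the natural-latents literature differs.

```latex
% Mediation: the X_i are independent given \Lambda
\forall\, i \neq j:\quad I(X_i ; X_j \mid \Lambda) = 0
% Reconstructability/stability: \Lambda is (approximately) determined
% by the system with any one component removed
\forall\, i:\quad H\big(\Lambda \mid X_1, \dots, \widehat{X_i}, \dots, X_n\big) \approx 0
```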
Inspired by this Shalizi paper defining local causal states. The idea is so simple and elegant I'm surprised I had never seen it before.
Basically, starting with a factored probability distribution over a dynamical DAG, we can use the Crutchfield causal-state construction locally to construct a derived causal model factored over the same dynamical DAG. For each point x we consider its past and future lightcones, defined as all those points/variables which influence, respectively are influenced by, x (in a causal-interventional sense). Now define the equivalence relation on realizations of the past lightcone of x (which includes x by definition)[1]: two realizations are equivalent whenever the conditional probability distributions on the future lightcone are equal.
These factored probability distributions over dynamical DAGs are called 'fields' by physicists. Given any field we define a derived local causal state field in the above way. Woah!
...
Reasons to think Lobian Cooperation is important
Usually, modal Lobian cooperation is dismissed as not relevant for real situations, but it is plausible that Lobian cooperation extends far more broadly than what has been proved so far.
It is plausible that much of the cooperation we see in the real world is actually approximate Lobian cooperation rather than cooperation purely given by traditional game-theoretic incentives.
Lobian cooperation is far stronger in cases where the players resemble each other and/or have access to one another's blueprint. This is ...
Self-Rituals as Schelling loci for Self-control and OCD
Why do people engage in non-social rituals, i.e. 'self-rituals'? These are very common and can even become pathological (OCD).
High self-control people seem to have OCD-like symptoms more often.
One way to think about self-control is as a form of internal bargaining between internal subagents. From this perspective, self-control and time-discounting can be seen as resources. In the absence of self-control the superagent
D...
I feel like the whole "subagent" framework suffers from the homunculus problem: we fail to explain behavior using the abstraction of a coherent agent, so we move to the abstraction of multiple coherent agents. While this can be useful, I don't think it reflects the actual mechanistic truth about minds.
When I plan something and then fail to execute the plan, it's mostly not a "failure to bargain". It's just that when I plan something I usually have the good consequences of the plan in my imagination, and these consequences make me excited; then I start executing the plan and get hit by multiple unpleasant details of reality. Coherent structure emerges from multiple not-really-agentic pieces.
(conversation with Scott Garrabrant)
Destructive Criticism
Sometimes you can say something isn't quite right but you can't provide an alternative.
Difference between 'generation of ideas' and 'filtration of ideas' - i.e. babble and prune.
ScottG: Bayesian learning assumes we are in a babble-rich environment and only does pr...
Reasonable interpretations of Recursive Self Improvement are either trivial, tautological or false?
Trivial but important
Aumann agreement can fail for purely epistemic reasons because real-world minds do not do Bayesian updating. Bayesian updating is intractable, so realistic minds sample from the prior. This is how e.g. gradient descent works and also how human minds work.
In this situation two minds can end up in two different basins with similar loss on the data, because of computational limitations. These minds can have genuinely different expectations for generalization.
(Of course this does not contradict the statement of the theorem which is correct.)
Imprecise Information theory
Would like a notion of entropy for credal sets. Diffractor suggests the following:
Let $C$ be a credal set.
Then the entropy of $C$ is defined as
where $H$ denotes the usual Shannon entropy.
I don't like this since it doesn't satisfy the natural desiderata below.
Instead, I suggest the following. Let $p_{\max}$ denote the (absolute) maximum entropy distribution, i.e. the uniform distribution, and let ...
Desideratum 1: ...
Roko's basilisk is a thought experiment which states that an otherwise benevolent artificial superintelligence (AI) in the future would be incentivized to create a virtual reality simulation to torture anyone who knew of its potential existence but did not directly contribute to its advancement or development.
Why Roko's basilisk probably doesn't work for simulation fidelity reasons:
Roko's basilisk threatens to simulate and torture you in the future if you don't comply. Simulation cycles cost resources. Instead of following through on torturing our wo...
All concepts can be learnt. All things worth knowing may be grasped. Eventually.
All can be understood - given enough time and effort.
For a Turing-complete organism, there is no qualitative gap between knowledge and ignorance.
No qualitative gap but one. The true qualitative difference: quantity.
Often we simply miss a piece of data. The gap is too large - we jump and never reach the other side. A friendly hominid who has trodden the path before can share their journey. Once we know the road, there is no mystery. Only effort and time. Some hominids choose not to share their journey. We keep a special name for these singular hominids: genius.
Abnormalised sampling?
Probability theory talks about sampling for probability distributions, i.e. normalized measures. However, non-normalized measures abound: weighted automata, infra-stuff, uniform priors on noncompact spaces, wealth in logical-inductor esque math, quantum stuff?? etc.
Most constructions of probability theory go through for arbitrary measures; they don't need the normalization assumption. Except, crucially, sampling.
What does it even mean to sample from a non-normalized measure? What is unnormalized abnormal sampling?
I don't know....
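One partial answer for the finite-total-mass case (my own note, with all parameters made up): Markov chain Monte Carlo never touches the normalization constant, so "sampling from a non-normalized measure" is routine whenever the measure is normalizable. The genuinely puzzling case is infinite total mass, about which the sketch below says nothing.

```python
import random
from math import exp

random.seed(0)

def unnormalized_density(x):
    # Proportional to a standard Gaussian; the factor 7.3 is deliberately
    # arbitrary -- Metropolis never sees it.
    return 7.3 * exp(-x * x / 2)

def metropolis(n_samples, step=1.0):
    x, samples = 0.0, []
    for _ in range(n_samples):
        proposal = x + random.uniform(-step, step)
        # The acceptance test uses only the *ratio* of densities,
        # so any normalization constant cancels out.
        if random.random() < unnormalized_density(proposal) / unnormalized_density(x):
            x = proposal
        samples.append(x)
    return samples

samples = metropolis(50_000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, var)  # should approximate the standard normal's 0 and 1
```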
SLT and phase transitions
The morphogenetic SLT story says that during training the Bayesian posterior concentrates around a series of subspaces $W_1, W_2, \dots$ with rlcts $\lambda_1 < \lambda_2 < \dots$ and losses $L_1 > L_2 > \dots$. As the size $n$ of the data sample is scaled, the Bayesian posterior makes transitions $W_1 \to W_2 \to \dots$, trading off higher complexity (higher $\lambda_i$) for better accuracy (lower loss $L_i$).
This is the radical new framework of SLT: phase transitions happen i...
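The tradeoff can be illustrated with the leading-order free energy formula $F_n \approx n L + \lambda \log n$ (all numbers below are mine, purely for illustration): a simple-but-inaccurate phase wins at small $n$, a complex-but-accurate phase wins once enough data accumulates.

```python
from math import log

# Two hypothetical phases: (loss, rlct). Phase A is simple but inaccurate,
# phase B is complex but accurate -- numbers made up for illustration.
phases = {"A": (0.50, 1.0), "B": (0.30, 5.0)}

def free_energy(phase, n):
    loss, rlct = phases[phase]
    return n * loss + rlct * log(n)  # leading-order asymptotic free energy

def preferred_phase(n):
    # The posterior concentrates on the phase with lowest free energy.
    return min(phases, key=lambda p: free_energy(p, n))

print(preferred_phase(10))    # "A": simplicity wins at small n
print(preferred_phase(1000))  # "B": accuracy wins at large n
```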
Alignment by Simulation?
I've heard this alignment plan that is a variation of 'simulate top alignment researchers' with an LLM. Usually the poor alignment researcher in question is Paul.
This strikes me as deeply unserious and I am confused why it is getting so much traction.
That AI-assisted alignment is coming (indeed, is already here!) is undeniable. But even somewhat accurately simulating a human from text data is a crazy sci-fi ability, probably not even physically possible. It seems to ascribe nearly magical abilities to LLMs.
Predicting...
[Edit 15/05/2024: I currently think that both forward and backward chaining paradigms are missing something important. Instead, there is something like 'side-chaining' or 'wide-chaining' where you are investigating how things are related forwardly, backwardly and sideways to make use of synergistic information]
Optimal Forward-chaining versus backward-chaining.
In general, this is going to depend on the domain. In environments for which we have many expert samples and there are many existing techniques backward-chaining is key. (i.e. deploying r...
Thin versus Thick Thinking
Thick: aggregate many noisy sources to make a sequential series of actions in mildly related environments, model-free RL
cardinal sins: failure of prioritization / not throwing away enough information, nerdsnipes, insufficient aggregation, trusting too much in any particular model, indecisiveness, overfitting on noise, ignoring the consensus of experts / social reality
default of the ancestral environment
CEOs, generals, doctors, economists, police detectives in the real world, traders
Thin: precise, systematic analysis, preferably ...
[Thanks to Vlad Firoiu for helping me]
An Attempted Derivation of the Lindy Effect
Wikipedia:
The Lindy effect (also known as Lindy's Law[1]) is a theorized phenomenon by which the future life expectancy of some non-perishable things, like a technology or an idea, is proportional to their current age.
Laplace's Rule of Succession
What is the probability that the Sun will rise tomorrow, given that it has risen every day for 5000 years?
Let $p$ denote the probability that the Sun will rise tomorrow. A priori we have no information on the value of ...
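For reference, the rule of succession gives $P(\text{success on trial } n+1 \mid s \text{ successes in } n \text{ trials}) = (s+1)/(n+2)$ under a uniform prior. A quick numerical sketch, assuming every observed trial succeeded:

```python
# Laplace's rule of succession: after s successes in n trials (uniform prior
# on the unknown success probability), P(next success) = (s + 1) / (n + 2).
def rule_of_succession(successes, trials):
    return (successes + 1) / (trials + 2)

n = 5000 * 365  # ~5000 years of daily sunrises (ignoring leap days)
p_sunrise = rule_of_succession(n, n)
print(p_sunrise)      # ≈ 0.9999994
print(1 - p_sunrise)  # ≈ 5.5e-07 chance of no sunrise tomorrow
```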
Generalized Jeffreys Prior for singular models?
For singular models the Jeffreys prior is not well-behaved, for the simple fact that it will be zero at minima of the loss function.
Does this mean the Jeffreys prior is only of interest in regular models? I beg to differ.
Usually the Jeffreys prior is derived as a parameterization-invariant prior. There is another way of thinking about the Jeffreys prior: as arising from an 'indistinguishability prior'.
The argument is delightfully simple: given two weights if they encode the same distributi...
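The vanishing can be seen concretely in a minimal singular model (my own example, not from the original): a Gaussian whose mean is the *product* $ab$ of two parameters. The Fisher information matrix is the outer product of $\nabla\mu = (b, a)$, hence rank-deficient everywhere, so the Jeffreys prior $\sqrt{\det I(w)}$ is identically zero.

```python
# Minimal singular model (illustrative): x ~ N(a*b, 1).
# Fisher information = outer product of grad mu = (b, a), so rank 1.
def fisher_matrix(a, b):
    grad = (b, a)  # d(ab)/da, d(ab)/db
    return [[grad[i] * grad[j] for j in range(2)] for i in range(2)]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

for a, b in [(1.0, 2.0), (0.5, -3.0), (4.0, 0.1)]:
    # ~0 every time (up to float error) -> the Jeffreys prior vanishes
    print(det2(fisher_matrix(a, b)))
```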
"The links between logic and games go back a long way. If one thinks of a debate as a kind of game, then Aristotle already made the connection; his writings about syllogism are closely intertwined with his study of the aims and rules of debating. Aristotle’s viewpoint survived into the common medieval name for logic: dialectics. In the mid twentieth century Charles Hamblin revived the link between dialogue and the rules of sound reasoning, soon after Paul Lorenzen had connected dialogue to constructive foundations of logic." from the Stanford Encyclopedia ...
Ambiguous Counterfactuals
[Thanks to Matthias Georg Mayer for pointing me towards ambiguous counterfactuals]
Salary $S$ is a function of eXperience $X$ and Education $E$.
We have a candidate with given salary $s$, experience $x$ and education $e$.
Their current salary is given by $s = S(x, e)$.
We'd like to consider the counterfactual where they didn't have the education $e$. How do we evaluate their salary in this counterfactual?
This is slightly ambiguous - there are two counterfactuals:
or
In the second c...
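To make the ambiguity concrete, here is an illustrative linear model (the functional form and all coefficients are mine, not from the original): counterfactual 1 deletes the education term with experience held fixed, while counterfactual 2 also converts the education years into extra experience.

```python
# Illustrative linear salary model (all coefficients hypothetical):
# base + 2k per year of experience + 5k per year of education.
def salary(experience_years, education_years):
    return 30_000 + 2_000 * experience_years + 5_000 * education_years

x, e = 10, 4  # candidate: 10 years experience, 4 years education

factual = salary(x, e)
# Counterfactual 1: remove education, hold experience fixed.
cf1 = salary(x, 0)
# Counterfactual 2: remove education, but those 4 years become experience.
cf2 = salary(x + e, 0)

print(factual, cf1, cf2)  # 70000 50000 58000
```

The two counterfactuals genuinely disagree, and nothing in the joint data tells you which one is "the" counterfactual.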
Insights as Islands of Abductive Percolation?
I've been fascinated by this beautiful paper by Viteri & DeDeo.
What is a mathematical insight? We feel intuitively that proving a difficult theorem requires discovering one or more key insights. Before we get into what the Viteri-DeDeo paper has to say about (mathematical) insights, let me recall some basic observations on the nature of insights:
(see also my previous shortform)
Evidence Manipulation and Legal Admissible Evidence
[This was inspired by Kokotajlo's shortform on comparing strong with weak evidence]
In the real world the weight of many pieces of weak evidence is not always comparable to a single piece of strong evidence. The important variable here is not strong versus weak per se but the source of the evidence. Some sources of evidence are easier to manipulate in various ways. Evidence manipulation, either conscious or emergent, is common and a large obstacle to truth-finding.
Consider aggregating many ...
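One way to formalize the comparison (my framing, not necessarily the original's): for independent evidence, log-odds add, so many weak pieces can outweigh one strong piece — but that conclusion leans entirely on the independence assumption, which is exactly what manipulation breaks.

```python
from math import log, exp

def log_odds(p):
    return log(p / (1 - p))

def prob_from_log_odds(lo):
    return 1 / (1 + exp(-lo))

prior = 0.5
# One strong piece of evidence: likelihood ratio 100.
strong = log(100)
# Ten weak pieces, each with likelihood ratio 2, *assumed independent*.
weak_pieces = [log(2)] * 10

posterior_strong = prob_from_log_odds(log_odds(prior) + strong)
posterior_weak = prob_from_log_odds(log_odds(prior) + sum(weak_pieces))

print(posterior_strong)  # ≈ 0.990
print(posterior_weak)    # ≈ 0.999: 2^10 = 1024 beats 100 -- if independence holds
```

If a single adversary can cheaply fabricate all ten weak pieces, their log-odds no longer add and the naive sum badly overstates the evidence.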
Imagine a data stream
$\dots, x_{-2}, x_{-1}, x_0, x_1, x_2, \dots$
assumed infinite in both directions for simplicity. Here $x_0$ represents the current state (the "present"), $\dots, x_{-2}, x_{-1}$ represents the past, and $x_1, x_2, \dots$ represents the future.
Predictable Information versus Predictive Information
Predictable information is the maximal information (in bits) that you can derive about the future given access to the past. Predictive information is the number of bits from the past that you need to make that optimal prediction.
Suppose you are...
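For a stationary first-order Markov chain the whole past informs the future only through the present symbol, so the predictive information $I(\text{past}; \text{future})$ collapses to $I(x_0; x_1)$. A toy computation (transition probabilities made up):

```python
from math import log2

# Hypothetical 2-state Markov chain: T[i][j] = P(next = j | now = i).
T = [[0.9, 0.1],
     [0.2, 0.8]]

# Stationary distribution pi solves pi = pi T; closed-form for 2 states.
pi = [T[1][0] / (T[0][1] + T[1][0]), T[0][1] / (T[0][1] + T[1][0])]

# For a first-order Markov chain, I(past; future) = I(x0; x1), since the
# chain makes past and future conditionally independent given x0.
predictive_info = sum(
    pi[i] * T[i][j] * log2(T[i][j] / pi[j])
    for i in range(2) for j in range(2)
)
print(predictive_info)  # bits of predictive information
```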
Agent Foundations Reading List [Living Document]
This is a stub for a living document on a reading list for Agent Foundations.
Causality
Book of Why, Causality - Pearl
Probability theory
Logic of Science - Jaynes
Hopfield Networks = Ising Models = Distributions over Causal models?
Given a joint probability distribution, famously there may be many 'Markov' factorizations. Each corresponds to a different causal model.
Instead of choosing a particular one we might have a distribution of beliefs over these different causal models. This feels basically like a Hopfield Network/ Ising Model.
You have a distribution over nodes and an 'interaction' distribution over edges.
The distribution over nodes corresponds to the joint probability di...
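The basic fact behind this (standard, computed here on a made-up joint): a two-variable joint factorizes both as $P(a)P(b \mid a)$ (DAG $A \to B$) and as $P(b)P(a \mid b)$ (DAG $B \to A$), so the joint alone cannot pick out one causal model.

```python
# A made-up joint distribution over two binary variables A, B.
joint = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}

pA = {a: sum(p for (a_, b), p in joint.items() if a_ == a) for a in (0, 1)}
pB = {b: sum(p for (a, b_), p in joint.items() if b_ == b) for b in (0, 1)}

# Factorization 1 (DAG A -> B): P(a) * P(b|a)
fact_AB = {(a, b): pA[a] * (joint[(a, b)] / pA[a]) for (a, b) in joint}
# Factorization 2 (DAG B -> A): P(b) * P(a|b)
fact_BA = {(a, b): pB[b] * (joint[(a, b)] / pB[b]) for (a, b) in joint}

# Both factorizations recover the same joint: observational data alone
# cannot distinguish the two causal models.
print(all(abs(fact_AB[k] - joint[k]) < 1e-12 for k in joint))  # True
print(all(abs(fact_BA[k] - joint[k]) < 1e-12 for k in joint))  # True
```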