Kaarel

kaarelh AT gmail DOT com

personal website

Comments

kh's Shortform
Kaarel · 2d · 20

on seeing the difference between profound and meaningless radically alien futures

Here's a question that came up in a discussion about what kind of future we should steer toward:

  • Okay, a future in which all remotely human entities promptly get replaced by alien AIs would soon look radically incomprehensible and void to us — like, imagine our current selves seeing videos from this future world, and the world in these videos mostly not making sense to them, and to an even greater extent not seeming very meaningful in the ethical sense. But a future in which [each human]/humanity has spent a million years growing into a galaxy-being would also look radically incomprehensible/weird/meaningless to us.[1] So, if we were to ignore near-term stuff, would we really still have reason to strive for the latter future over the former?

a couple points in response:

  1. The world in which we are galaxy-beings will in fact probably seem more ethically meaningful to us in many fairly immediate ways. Related: (for each past time t) a modern species typically still shares meaningfully more with its ancestors from time t than it does with other species that were around at time t (that diverged from the ancestral line of the species way before t).
    1. A specific case: we currently already have many projects we care about — understanding things, furthering research programs, creating technologies, fashioning families and friendships, teaching, etc. — some of which are fairly short-term, but others of which could meaningfully extend into the very far future. Some of these will be meaningfully continuing in the world in which we are galaxy-beings, in a way that is not too hard to notice. That said, they will have grown into crazy things, yes, with many aspects that one isn't going to immediately consider cool; I think there is in fact a lot that's valuable here as well; I'll argue for this in item 3.
  2. The world in which we have become galaxy-beings will have had our own (developing) sense/culture/systems/laws guide decision-making and development (and their own development in particular), and we to some extent just care intrinsically/terminally about this kinda meta thing in various ways.
  3. However, more importantly: I think we mostly care about [decisions being made and development happening] according to our own sense/culture/systems/laws not intrinsically/terminally, but because our own sense/culture/systems/laws is going to get things right (or well, more right than alternatives) — for instance, it is going to lead us more to working on projects that really are profound. However, that things are going well is not immediately obvious from looking at videos of a world — as time goes on, it takes increasingly more thought/development to see that things are going well.
    1. I think one is making a mistake when looking at videos from the future and quickly being like "what meaningless nonsense!". One needs to spend time making sense of the stuff that's going on there to properly evaluate it — one doesn't have immediate access to one's true preferences here. If development has been thoughtful in this world, very many complicated decisions have been made to get to what you're now seeing in these videos. When evaluating this future, you might want to (for instance) think through these decisions for yourself in the order in which they were made, understanding the context in which each decision was made, hearing the arguments that were made, becoming smart enough to understand them, maybe trying out some relevant experiences, etc. Or you might do other kinds of thinking that get you into a position from which you can properly understand the world and judge it. After a million years[2] of this, you might see much more value in this human(-induced) world than before.
    2. But maybe you'll still find that world quite nonsensical? If you went about your thinking and learning in a great deal of isolation, without much attempting to do something together with the beings in that world, then imo you probably will indeed find that world quite bad/empty compared to what it could have been[3] [4] (though I'd guess that you would similarly also find other isolated rollouts of your own reflection quite silly[5], and that different independent sufficiently long further rollouts from your current position would again find each other silly, and so on). However, note that like the galaxy-you that came out of this reflection, this world you're examining has also gone through an [on most steps ex ante fairly legitimate] process of thoughtful development (by assumption, I guess), and the being(s) in that world now presumably think there's a lot of extremely cool stuff happening in it. In fact, we could suppose that a galaxy-you is living in that world, and that they contributed to its development throughout its history, and that they now think that their world (or their corner of their world) is extremely based.[6]
    3. Am I saying that the galaxy-you looking at this world from the outside is actually supposed to think it's really cool, because it's supposed to defer to the beings in that world, or because it's supposed to think any development path consisting of ex ante reasonable-seeming steps is fine, or because some sort of relativism is right, or something? I think this isn't right, and so I don't want to say that — I think it's probably fine for the galaxy-you to think stuff has really gone off the rails in that world. But I do want to say that when we ourselves are making this decision of which kind of future to have from our own embedded point of view, we should expect there to be a great deal of incomprehensible coolness in a human future (if things go right) — for instance, projects whose worth we wouldn't see yet, but which we would come to correctly consider really profound in such a future (indeed, we would be tracking what's worthwhile and coming up with new worthwhile things and doing those) — whereas we should expect there to instead be a great deal of incomprehensible valueless nonsense in an alien future.
  4. If you've read the above and still think a galaxy-human future wouldn't be based, let me try one more story on you. I think this looking-at-videos-of-a-distant-world framing of the question makes one think in terms of sth like assigning value to spacetime blocks "from the outside", and this is a framing of ethical decisions which is imo tricky to handle well, and in particular can make one forget how much one cares about stuff. Like, I think it's common to feel like your projects matter a lot while simultaneously feeling that [there being a universe in which there is a you-guy that is working on those projects] isn't so profound; maybe you really want to have a family, but you're confused about how much you want to make there be a spacetime block in which there is a such-and-such being with a family. This could even turn an ordinary ethical decision that you can handle just fine into something you're struggling to make sense of — like, wait, what kind of guy needs to live in this spacetime block (and what relation do they need to have to me-now-answering-this-question); also, what does it even mean for a spacetime block to exist (what if we should say that all possible spacetime blocks exist?)? One could adopt the point of view that the spacetime block question is supposed to just be a rephrasing of the ordinary ethical question, and so one should have the same answer for it, and feel no more confused about what it means. One could probably spend some time thinking of one's ordinary ethical decisions in terms of spacetime-block-making and perhaps then come to have one's answers be reasonably coherent under having (arguably) the same decision problem presented in the ordinary way vs in some spacetime block way.[7] But I think this sort of thing is very far from being set up in almost any current human. So: you might feel like saying "whatever" way too much when ethical questions are framed in terms of spacetime-block-making, and the situation we're considering could push one toward that frame; I want to alert you that maybe this is happening, maybe you really care more than it seems in that frame, and that maybe you should somehow imagine yourself being more embedded in this world when evaluating it.

  1. I guess one could imagine a future in which someone tiles the world with happy humans of the current year variety or something, but imo this is highly unlikely even conditional on the future being human-shaped, and also much worse than futures in which a wild variety of galaxy-human stuff is going on. Background context: imo we should probably be continuously growing more capable/intelligent ourselves for a very long time (and maybe forever), with the future being determined by us "from inside human life", as opposed to ever making an artificial system that is more capable than humanity and fairly separate/distinct from humanity that would "design human affairs from the outside" (really, I think we shouldn't be making [AIs more generally capable than individual humans] of any kind, except for ones that just are smarter versions of individual humans, for a long time (and maybe forever); see this for some of my thoughts on these topics). ↩︎

  2. maybe we should pick a longer time here, to be comparing things which are more alike? ↩︎

  3. I think this is probably true even if we condition the rollout on you coming to understand the world in the videos quite well. ↩︎

  4. But if you disagree here, then I think I've already finished [the argument that the human far future is profoundly better] which I want to give to you, so you could stop reading here — the rest of this note just addresses a supposed complication you don't believe exists. ↩︎

  5. much like you could grow up from a kid into a mathematician or a philosopher or an engineer or a composer, thinking in each case that the other paths would have been much worse ↩︎

  6. Unlike you growing up in isolation, that galaxy-you's activities and judgment and growth path will be influenced by others; maybe it has even merged with others quite fully. But that's probably how things should be, anyway — we probably should grow up together; our ordinary valuing is already done together to a significant extent (like, for almost all individuals, the process determining (say) the actions of that individual already importantly involves various other individuals, and not just in a way that can easily be seen as non-ethical). ↩︎

  7. There might be some stuff that's really difficult to make sense of here — it is imo plausible that the ethical cognition that a certain kind of all-seeing spacetime-block-chooser would need to have to make good choices is quite unlike any ethical cognition that exists (or maybe even could exist) in our universe. That said, we can imagine a more mundane spacetime-block-chooser, like a clone of you that gets to make a single life choice for you given ordinary information about the decision and that gets deleted after that; it is easier to imagine this clone having ethical cognition that leads to it making reasonably good decisions. ↩︎

Four ways learning Econ makes people dumber re: future AI
Kaarel · 1mo · 40

I won't address why [AIs that humans create] might[1] have their own alien values (so I won't address the "turning against us" part of your comment), but on these AIs outcompeting humans[2]:

  • There is immense demand for creating systems which do ≈anything better than humans, because there is demand for all the economically useful things humans do — if someone were to create such a thing and be able to control it, they'd become obscenely rich (and probably come to control the world[3]).
  • Also, it's possible to create systems that do ≈anything better than humans. In fact, it's probably not that hard — it'll probably happen at some point in this century by default (absent an AGI ban).

  1. and imo probably will ↩︎

  2. sorry if this is already obvious to you, but I thought from your comment that there was a chance you hadn't considered this ↩︎

  3. if moderately ahead of other developers and not shut down or taken over by others promptly ↩︎

A Conservative Vision For AI Alignment
Kaarel · 1mo · 60

While I'm probably much more of a lib than you guys (at least in ordinary human contexts), I also think that people in AI alignment circles mostly have really silly conceptions of human valuing and the historical development of values.[1] I touch on this a bit here. Also, if you haven't encountered it already, you might be interested in Hegel's work on this stuff — in particular, The Phenomenology of Spirit.


  1. This isn't to say that people in other circles have better conceptions... ↩︎

Leon Lang's Shortform
Kaarel · 1mo* · 134

It's how science works: You focus on simple hypotheses and discard/reweight them according to Bayesian reasoning.

There are some ways in which Solomonoff induction and science are analogous[1], but there are also many important ways in which they are disanalogous. Here are some of them (a couple are spelled out in symbols right after the list):

  • A scientific theory is much less like a program that prints (or predicts) an observation sequence than it is like a theory in the sense used in logic. Like, a scientific theory provides a system of talking which involves some sorts of things (eg massive objects) about which some questions can be asked (eg each object has a position and a mass, and between any pair of objects there is a gravitational force) with some relations between the answers to these questions (eg we have an axiom specifying how the gravitational force depends on the positions and masses, and an axiom specifying how the second derivative of the position relates to the force).[2]
  • Science is less in the business of predicting arbitrary observation sequences, and much more in the business of letting one [figure out]/understand/exploit very particular things — like, the physics someone knows is going to be of limited help when they try to predict the time sequence of intensities of pixel (x,y) on their laptop screen, but it is going to help them a lot when solving the kinds of problems that would show up in a physics textbook.
  • Even for solving problems that a theory is supposed to help one solve (and for the predictions it is supposed to help one make), a scientific theory is highly incomplete — in addition to the letter of the theory, a human solving the problems in a classical mechanics textbook will be majorly relying on tacit understanding gained from learning classical mechanics and their common-sense understanding.
  • Making scientific progress looks less like picking out a correct hypothesis from some set of pre-well-specified hypotheses by updating on data, and much more like coming up with a decent way to think about something where there previously wasn't one. E.g., it could look like Faraday staring at metallic filings near a magnet and starting to talk about the lines he was seeing, or Lorentz, Poincaré, and Einstein making sense of the result of the Michelson-Morley experiment. Imo the Bayesian conception basically completely fails to model gaining scientific understanding.
  • Scientific theories are often created to do something — I mean: to do something other than predicting some existing data — e.g., to make something; e.g., see https://en.wikipedia.org/wiki/History_of_thermodynamics.
  • Scientific progress also importantly involves inventing new things/phenomena to study. E.g., it would have been difficult to find things that Kirchhoff's laws could help us with before we invented electric circuits; ditto for lens optics and lenses.
  • Idk, there is just very much to be said about the structure of science and scientific progress that doesn't show up in the Solomonoff picture (or maaaybe at best in some cases shows up inexplicitly inside the inductor). I'll mention a few more things off the top of my head:
    • having multiple ways to think about something
    • creating new experimental devices/setups
    • methodological progress (e.g. inventing instrumental variable methods in econometrics)
    • mathematical progress (e.g. coming up with the notion of a derivative)
    • having a sense of which things are useful/interesting to understand
    • generally, a human scientific community doing science has a bunch of interesting structure; in particular, the human minds participating in it have a bunch of interesting structure; one in fact needs a bunch of interesting structure to do science well; in fact, more structure of various kinds is gained when making scientific progress; basically none of this is anywhere to be seen in Solomonoff induction
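
To make a couple of these contrasts concrete (a rough sketch only, in standard notation; U is some fixed universal (monotone) machine and M is the induced mixture): the "axioms" in the first bullet are statements like the inverse-square law together with Newton's second law, while the Solomonoff-style predictor they are being contrasted with is a simplicity-weighted Bayesian mixture over programs:

% theory-style axioms (the classical-gravity example from the first bullet)
\[
  F = \frac{G\, m_1 m_2}{r^2}, \qquad m\,\ddot{x} = F
\]
% Solomonoff-style predictor: each program p for U gets prior weight 2^{-|p|},
% and prediction is by the induced mixture
\[
  M(x_{1:n}) = \sum_{p \,:\, U(p)\ \text{outputs a string extending}\ x_{1:n}} 2^{-|p|},
  \qquad
  M(b \mid x_{1:n}) = \frac{M(x_{1:n}\, b)}{M(x_{1:n})}
\]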

  1. for example, that usually, a scientific theory could be used for making at least some fairly concrete predictions ↩︎

  2. To be clear: I don't intend this as a full description of the character of a scientific theory — e.g., I haven't discussed how it gets related to something practical/concrete like action (or maybe (specifically) prediction). A scientific theory and a theory-in-the-sense-used-in-logic are ultimately also disanalogous in various ways — I'm only claiming it's a better analogy than that between a scientific theory and a predictive model. ↩︎

Agent foundations: not really math, not really science
Kaarel · 1mo* · 80

However, the reference class that includes the theory of computation is one possible reference class that might include the theory of agents.[1] But for all (I think) we know, the reference class we are in might also be (or look more like) complex systems studies, where you can prove a bunch of neat things, but there's also a lot of behavior that is not computationally reducible and instead you need to observe, simulate, crunch the numbers. Moreover, noticing surprising real-world phenomena can serve as a guide to your attempts to explain the observed phenomena in ~mathematical terms (e.g., how West et al. explained (or re-derived) Kleiber's law from the properties of intra-organismal resource supply networks[2]). I don't know what the theory will look like; to me, its shape remains an open a posteriori question.

along an axis somewhat different than the main focus here, i think the right picture is: there is a rich field of thinking-studies. it’s like philosophy, math, or engineering. it includes eg Chomsky's work on syntax, Turing’s work on computation, Gödel’s work on logic, Wittgenstein’s work on language, Darwin's work on evolution, Hegel’s work on development, Pascal’s work on probability, and very many more past things and very many more still mostly hard-to-imagine future things. given this, i think asking about the character of a “theory of agents” would already soft-assume a wrong answer. i discuss this here

i guess a vibe i'm trying to communicate is: we already have thinking-studies in front of us, and so we can look at it and get a sense of what it's like. of course, thinking-studies will develop in the future, but its development isn't going to look like some sort of mysterious new final theory/science being created (though there will be methodological development (like for example the development of set-theoretic foundations in mathematics, or like the adoption of statistics in medical science), and many new crazy branches will be developed (of various characters), and we will surely ≈resolve various particular questions in various ways (though various other questions call for infinite investigations))

kh's Shortform
Kaarel · 1mo · 30

Hmm, thanks for telling me, I hadn't considered that. I think I didn't notice this in part because I've been thinking of the red-black circle as being "canceled out"/"negated" on the flag, as opposed to being "asserted". But this certainly wouldn't be obvious to someone just seeing the flag.

kh's Shortform
Kaarel · 1mo · 3-1

I designed a pro-human(ity)/anti-(non-human-)AI flag:

  • The red-black circle is HAL's eye; it represents the non-human in-all-ways-super-human AI(s) that the world's various AI capability developers are trying to create, that will imo by default render all remotely human beings completely insignificant and cause humanity to completely lose control over what happens :(.
  • The white star covering HAL's eye has rays at the angles of the limbs of Leonardo's Vitruvian Man; it represents humans/humanity remaining more capable than non-human AI (by banning AGI development and by carefully self-improving).
  • The blue background represents our potential self-made ever-better future, involving global governance/cooperation/unity in the face of AI.

Feel free to suggest improvements to the flag. Here's LaTeX to generate it:

% written mostly by o3 and o4-mini-high, given k's prompting
% an anti-AI flag. a HAL "eye" (?) is covered by a vitruvian man star
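% to generate the flag: compile this file on its own (e.g. with pdflatex);
% the standalone class with the tikz option crops the output to the picture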
\documentclass[tikz]{standalone}
\usetikzlibrary{calc}
\usepackage{xcolor}                 % for \definecolor
\definecolor{UNBlue}{HTML}{5B92E5}

\begin{document}
\begin{tikzpicture}
%--------------------------------------------------------
% flag geometry
%--------------------------------------------------------
\def\flagW{6cm}     % width  -> 2 : 3 aspect
\def\flagH{4cm}     % height
\def\eyeR {1.3cm}     % HAL-eye radius


% light-blue background
\fill[UNBlue] (0,0) rectangle (\flagW,\flagH);

%--------------------------------------------------------
% concentric “HAL eye” (outer-most ring first)
%--------------------------------------------------------
\begin{scope}[shift={(\flagW/2,\flagH/2)}] % centre of the flag
 \foreach \f/\c in {%
     1.00/black,
     .68/{red!50!black},
     .43/{red!80!orange},
     .1/orange,
     .05/yellow}%
 {%
   \fill[fill=\c,draw=none] circle ({\f*\eyeR});
 }

%── parameters ───────────────────────────────────────
\def\R{\eyeR}        % distance from centre to triangle’s tip
\def\Alpha{10}       % full apex angle (°)
%── compute half-angle & half-base once ─────────────
\pgfmathsetmacro\halfA{\Alpha/2}               
\pgfmathsetlengthmacro\halfside{\R*tan(\halfA)}

%── loop over Vitruvian‐man angles ───────────────────
\foreach \Beta in {0,30,90,150,180,240,265,275,300} {%
 % apex on the eye‐rim
 \coordinate (A) at (\Beta:\R);
 % base corners offset ±90°
 \coordinate (B) at (\Beta+90:\halfside);
 \coordinate (C) at (\Beta-90:\halfside);
 % fill the spike
 \path[fill=white,draw=none] (A) -- (B) -- (C) -- cycle;
}

\end{scope}
\end{tikzpicture}
\end{document}
 

the jackpot age
Kaarel · 2mo · 90

https://www.lesswrong.com/posts/DMxe4XKXnjyMEAAGw/the-geometric-expectation

‘AI for societal uplift’ as a path to victory
Kaarel · 2mo · 32
  1. Conversely, there is some (potentially high) threshold of societal epistemics + coordination + institutional steering beyond which we can largely eliminate anthropogenic x-risk, potentially in perpetuity

Note that this is not a logical converse of your first statement. I realize that the word "conversely" can be used non-strictly and might in fact be used this way by you here, but I'm stating this just in case.

My guess is that "there is some (potentially high) threshold of societal epistemics + coordination + institutional steering beyond which we can largely eliminate anthropogenic x-risk in perpetuity" is false — my guess is that improving [societal epistemics + coordination + institutional steering] is an infinite endeavor; I discuss this a bit here. That said, I think it is plausible that there is a possible position from which we could reasonably be fairly confident that things will be going pretty well for a really long time — I just think that this would involve one continuing to develop one's methods of [societal epistemics, coordination, institutional steering, etc.] as one proceeds.

‘AI for societal uplift’ as a path to victory
Kaarel · 2mo* · 32

Basically nobody actually wants the world to end, so if we do that to ourselves, it will be because somewhere along the way we weren’t good enough at navigating collective action problems, institutional steering, and general epistemics

... or because we didn't understand important stuff well enough in time (for example: if it is the case that by default, the first AI that could prove P≠NP would eat the Sun, we would want to firmly understand this ahead of time), or because we weren't good enough at thinking (for example, people could just be lacking in iq, or have never developed an adequate sense of what it is even like to understand something, or be intellectually careless), or because we weren't fast enough at disseminating or [listening to] the best individual understanding in critical cases, or because we didn't value the right kinds of philosophical and scientific work enough, or because we largely-ethically-confusedly thought some action would not end the world despite grasping some key factual broad strokes of what would happen after, or because we didn't realize we should be more careful, or maybe because generally understanding what will happen when you set some process in motion is just extremely cursed.[1] I guess one could consider each of these to be under failures in general epistemics... but I feel like just saying "general epistemics" is not giving understanding its proper due here.


  1. Many of these are related and overlapping. ↩︎

Posts

  • An Advent of Thought (55 points, Ω, 6mo, 13 comments)
  • Deep Learning is cheap Solomonoff induction? (45 points, 9mo, 1 comment)
  • Finding the estimate of the value of a state in RL agents (8 points, 1y, 4 comments)
  • Interpretability: Integrated Gradients is a decent attribution method (23 points, 1y, 7 comments)
  • The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks (108 points, Ω, 1y, 4 comments)
  • A starting point for making sense of task structure (in machine learning) (45 points, 2y, 2 comments)
  • Toward A Mathematical Framework for Computation in Superposition (206 points, Ω, 2y, 19 comments)
  • Grokking, memorization, and generalization — a discussion (75 points, 2y, 11 comments)
  • Crystal Healing — or the Origins of Expected Utility Maximizers (50 points, 2y, 11 comments)
  • Searching for a model's concepts by their shape – a theoretical framework (51 points, 3y, 0 comments)