Scientist by training, coder by previous profession, philosopher by inclination, musician against public demand.
I'm specifically addressing the argument for a high probability of near extinction (doom) from AI...
Eliezer Yudkowsky: "Many researchers steeped in these issues, including myself, expect that the most likely result of building a superhumanly smart AI, under anything remotely like the current circumstances, is that literally everyone on Earth will die. "
....not whether it is barely possible, or whether other, less bad outcomes (dystopias) are probable. I'm coming from the centre, not the other extreme.
Doom, complete or almost complete extinction of humanity, requires a less than superintelligent AI to become superintelligent either very fast, or very surreptitiously ... even though it is starting from a point where it does not have the resources to do either.
The "very fast" version is foom doom...Foom is rapid recursive self improvement (FOOM is supposed to represent a nuclear explosion)
The classic Foom Doom argument (https://www.greaterwrong.com/posts/kgb58RL88YChkkBNf/the-problem) involves an agentive AI that quickly becomes powerful through recursive self improvement, and has a value/goal system that is unfriendly and incorrigible.
The complete argument for Foom Doom is that:-
The AI will have goals/values in the first place (it won't be a passive tool like GPT*).
The values will be misaligned, however subtly, to be unfavorable to humanity.
That the misalignment cannot be detected or corrected.
That the AI can achieve value stability under self modification.
That the AI will self modify in a way too fast to stop.
That most misaligned values in the resulting ASI are highly dangerous (even goals that aren't directly inimical to humans can be a problem for humans, because the ASI might want to direct resources away from humans).
And that the AI will have extensive opportunities to wreak havoc: biological warfare (custom DNA can be ordered by email), crashing economic systems (trading can be done online), taking over weapon systems, weaponising other technology and so on.
It's a conjunction of six or seven claims, not just one. (I say "complete argument" because pro doomers almost always leave out some stages. I am not convinced that rapid self improvement and incorrigibility are both needed, but I am sure that one or the other is. Doomers need to reject the idea that misalignment can be fixed gradually, as you go along. A very fast-growing ASI, foom, is one way of doing that; the assumption that AIs will resist having their goals changed is another.)
Obviously the problem is that to claim a high overall probability of doom, each claim in the chain needs to have a high probability. It is not enough for some of the stages to be highly probable, all must be.
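To make the arithmetic concrete (a purely illustrative calculation with made-up numbers, not anyone's published estimates): the overall probability of a conjunction can never exceed its weakest conjunct, and if the stages were independent it would be their product.

$$P(\mathrm{Doom}) \le \min_i P(C_i), \qquad P(\mathrm{Doom}) = \prod_{i=1}^{7} P(C_i) = 0.8^{7} \approx 0.21 \quad \text{(independent stages, each at } 0.8\text{)}$$

So even granting every stage a generous 80%, the conjunction comes out at roughly one in five; a headline figure like 90% requires something close to certainty at every single stage.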
There are some specific weak points.
Goal stability under self improvement is not a given: it is not possessed by all mental architectures, and may not be possessed by any, since no one knows how to engineer it, and humans appear not to have it.
The Orthogonality Thesis (https://www.lesswrong.com/w/orthogonality-thesis) is sometimes mistakenly called on to support goal stability. It implies that a lot of combinations of goals and intelligence levels are possible, but doesn't imply that all possible minds have goals, or that all goal driven agents have fixed, incorrigible goals. There are goalless and corrigible agents in mindspace, too. That's not just an abstract possibility. At the time of writing, 2025, our most advanced AIs, the Large Language Models, are non agentive and corrigible.
It is plausible that an agent would desire to preserve its goals, but the desire to preserve goals does not imply the ability to preserve goals. As far as we know, no goal stable system of any complexity exists on this planet, so goal stability cannot be assumed as a default or a given. So the orthogonality thesis is true of momentary combinations of goal and intelligence, given the provisos above, but not necessarily true of stable combinations.
Another thing that doesn't prove incorrigibility or goal stability is von Neumann rationality. Frequently appealed to in MIRI's early writings, it is an idealised framework for thinking about rationality that doesn't apply to humans, and therefore doesn't have to apply to any given mind.
There are arguments that AIs will become agentive because that's what humans want. Gwern Branwen's confusingly titled "Why Tool AIs Want to Be Agent AIs" (https://gwern.net/tool-ai) is an example. This is true, but in more than one sense:-
The basic idea is that humans want agentive AIs because they are more powerful. And people want power, but not at the expense of control. Power that you can't control is no good to you. Taking the brakes off a car makes it more powerful, but more likely to kill you. No army wants a weapon that will kill their own soldiers, no financial organisation wants a trading system that makes money for someone else, or gives it away to charity, or causes stock market crashes. The maximum amount of power and the minimum of control is an explosion.
One needs to look askance at what "agent" means as well. Among other things, it means an entity that acts on behalf of a human -- as in principal/agent (https://en.m.wikipedia.org/wiki/Principal–agent_problem). An agent is no good to its principal unless it has a good enough idea of its principal's goals. So while people will want agents, they won't want misaligned ones -- misaligned with themselves, that is. Like the Orthogonality Thesis, the argument is not entirely bad news.
Of course, evil governments and corporations controlling obedient superintelligences isn't a particularly optimistic scenario, but it's dystopia, not doom.
Yudkowsky's much repeated argument that safe, well-aligned behaviour is a small target to hit ... could actually be two arguments.
One would be the random potshot version of the Orthogonality Thesis, where there is an even chance of hitting any mind, and therefore a high chance of hitting an eldritch, alien mind. But equiprobability is only one way of turning possibilities into probabilities, and not a particularly realistic one. A random potshot isn't analogous to the probability density arising from the deliberate action of building a certain type of AI, even without knowing much about what it would be.
While many of the minds in mindspace are indeed weird and unfriendly to humans, that does not make it likely that the AIs we will construct will be: we are deliberately seeking to build certain kinds of mind, for one thing, and we have certain limitations, for another. Current LLMs are trained on vast corpora of human generated content, and inevitably pick up a version of human values from them.
Another interpretation of the Small Target Argument is, again, based on incorrigibility. Corrigibility means you can tweak an AI's goals gradually, as you go on, so there's no need to get them exactly right on the first try.
By far the best definition I’ve ever heard of the supernatural is Richard Carrier’s: A “supernatural” explanation appeals to ontologically basic mental things, mental entities that cannot be reduced to nonmental entities.
Physicalism, materialism, empiricism, and reductionism are clearly similar ideas, but not identical. Carrier's criterion captures something about a supernatural ontology, but nothing about supernatural epistemology. Surely the central claim of natural epistemology is that you have to look... you can't rely on faith, or clear ideas implanted in our minds by God.
it seems that we have very good grounds for excluding supernatural explanations a priori
But making reductionism aprioristic arguably makes it less scientific...at least, what you gain in scientific ontology, you lose in scientific epistemology.
I mean, what would the universe look like if reductionism were false?
We wouldn't have reductive explanations of some apparently high level phenomena ... Which we don't.
I previously defined the reductionist thesis as follows: human minds create multi-level models of reality in which high-level patterns and low-level patterns are separately and explicitly represented. A physicist knows Newton’s equation for gravity, Einstein’s equation for gravity, and the derivation of the former as a low-speed approximation of the latter. But these three separate mental representations, are only a convenience of human cognition. It is not that reality itself has an Einstein equation that governs at high speeds, a Newton equation that governs at low speeds, and a “bridging law” that smooths the interface. Reality itself has only a single level, Einsteinian gravity. It is only the Mind Projection Fallacy that makes some people talk as if the higher levels could have a separate existence—different levels of organization can have separate representations in human maps, but the territory itself is a single unified low-level mathematical object. Suppose this were wrong.
Suppose that the Mind Projection Fallacy was not a fallacy, but simply true.
Note that there are four possibilities here...
I assume a one level universe, all further details are correct.
I assume a one level universe, some details may be incorrect
I assume a multi level universe, all further details are correct.
I assume a multi level universe, some details may be incorrect.
How do we know that the MPF is actually fallacious, and what does it mean anyway?
If all forms of mind projection are wrong, then reductive physicalism is wrong, because quarks, or whatever is ultimately real, should not be mind projected, either.
If no higher level concept should be mind projected, then reducible higher level concepts shouldn't be, either ... which is not EY's intention.
Well, maybe irreducible high level concepts are the ones that shouldn't be mind projected.
That certainly amounts to disbelieving in non reductionism...but it doesn't have much to do with mind projection. If some examples of mind projection are acceptable, and the unacceptable ones coincide with the ones forbidden by reductivism, then MPF is being used as a Trojan horse for reductionism.
And if reductionism is an obvious truth, it could have stood on its own as an apriori truth.
Suppose that a 747 had a fundamental physical existence apart from the quarks making up the 747. What experimental observations would you expect to make, if you found yourself in such a universe?
Science isn't 100% observation, it's a mixture of observation and explanation.
A reductionist ontology is a one level universe: the evidence for it is the success of reductive explanation, the ability to explain higher level phenomena entirely in terms of lower level behaviour. And the existence of explanations is aposteriori, without being observational data, in the usual sense. Explanations are abductive, not inductive or deductive.
As before, you should expect to be able to make reductive explanations of all high level phenomena in a one level universe....if you are sufficiently intelligent. It's like the Laplace's Demon illustration of determinism, only "vertical". If you find yourself unable to make reductive explanations of all phenomena, that might be because you lack the intelligence, or because you are in a non reductive multi level universe, or because you haven't had enough time...
Either way, it's doubtful and aposteriori, not certain and apriori.
If you can’t come up with a good answer to that, it’s not observation that’s ruling out “non-reductionist” beliefs, but a priori logical incoherence
I think I have answered that. I don't need observations to rule it out. "Observations rule it in" and "incoherence rules it out" aren't the only options.
People who live in reductionist universes cannot concretely envision non-reductionist universes.
Which is a funny thing to say, since science was non-reductionist till about 100 years ago.
One of the clinching arguments for reductionism was the Schrödinger equation, which showed that, in principle, the whole of chemistry is reducible to physics, while the rise of molecular biology showed the reducibility of biology to chemistry. Before that, educators would point to the de facto hierarchy of the sciences -- physics, chemistry, biology, psychology, sociology -- as evidence of a multi-layer reality.
Unless the point is about "concretely". What does it mean to concretely envision a reductionist universe? Perhaps it means you imagine all the prima facie layers, and also reductive explanations linking them. But then the non-reductionist universe would require less envisioning, because it's the same thing without the bridging explanations! Or maybe it means just envisioning huge arrays of quarks. Which you can't do. The reductionist world view, in combination with the limitations of the brain, implies that you pretty much have to use higher level, summarised concepts...and that they are not necessarily wrong.
But now we get to the dilemma: if the staid conventional normal boring understanding of physics and the brain is correct, there’s no way in principle that a human being can concretely envision, and derive testable experimental predictions about, an alternate universe in which things are irreducibly mental. Because, if the boring old normal model is correct, your brain is made of quarks, and so your brain will only be able to envision and concretely predict things that can be predicted by quarks.
"Your brain is made of quarks" is aposteriori, not apriori.
Your brain being made of quarks doesn't imply anything about computability. In fact, the computability of the ultimately correct version of quantum physics is an open question.
Incomputability isn't the only thing that implies irreducibility, as @ChronoDas points out.
Non reductionism is conceivable, or there would be no need to argue for reductionism.
The Deutsch-Yudkowsky argument for the Many Worlds Interpretation states that you can take the core of Quantum Mechanics -- the Schrödinger wave equation, and the projection postulate -- remove the projection postulate (also known as collapse and reduction), and end up with a simpler theory that is still adequate to explain observation. The idea is that entanglement can replace collapse: a scientist observing a superposed state becomes entangled with it, and effectively splits into two, each having made a definite observation.
Moreover Yudkowsky, following David Deutsch, holds the many worlds interpretation to be obviously correct, in contrast to the majority of philosophers and physicists, who regard the problem of interpreting QM as difficult and unsolved.
This has some problems.
(Which are to do with the specific argument, and the level of certainty ascribed to it. To say that you cannot be certain about a claim is not to say it is false. To point out that one argument for a claim does not work is likewise not to say that the claim itself is false. There could be better arguments for these versions of many worlds, or better many worlds theories, for that matter).
The first thing to note is that there is more than one quantum mechanical many worlds theory. What splitting is...how complete and irrevocable it is...varies between particular theories. So does the rate of splitting, so does the mechanism of splitting.
The second thing to note is that many worlders are pointing at something implied by the physical formalism and saying "that's a world"....but whether it qualifies as a world is a separate question, and a separate kind of question, from whether it is really there in the formalism. One would expect a world, or universe, to be large, stable, non-interacting, and so on. It's possible to have a theory without collapse, but also without worlds. A successful MWI needs to jump three hurdles: empirical correctness, mathematical correctness and conceptual correctness -- actually having worlds.
The third thing to note is that all outstanding issues with MWI are connected in some way with the quantum mechanical (preferred) basis....a subject about which Deutsch and Yudkowsky have little to say.
There is an approach to MWI based on coherent superpositions, and a version based on decoherence. These are (for all practical purposes) incompatible opposites, but are treated as interchangeable in Yudkowsky's writings.
Quantum superposition is a fundamental principle of quantum mechanics that states that linear combinations of solutions to the Schrödinger equation are also solutions of the Schrödinger equation. This follows from the fact that the Schrödinger equation is a linear differential equation in time and position. (WP)
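To display what that linearity buys you (standard textbook material, included here only to pin the point down): if two wavefunctions both solve the Schrödinger equation, then so does any weighted sum of them.

$$i\hbar\,\frac{\partial\psi_{1}}{\partial t}=\hat H\psi_{1},\quad i\hbar\,\frac{\partial\psi_{2}}{\partial t}=\hat H\psi_{2} \;\;\Longrightarrow\;\; i\hbar\,\frac{\partial}{\partial t}\bigl(a\psi_{1}+b\psi_{2}\bigr)=\hat H\bigl(a\psi_{1}+b\psi_{2}\bigr)$$

That is all the core formalism gives you: superposed solutions are solutions. Whether any such superposition deserves to be called a "world" is a further question.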
Coherent superpositions are straightforwardly implied by the core mathematics of Quantum mechanics. They are small scale in two senses: they can go down to the single particle level, and it is difficult to maintain large coherent superpositions even if you want to. They are also possibly observer dependent, reversible, and continue to interact (strictly speaking, interfere) after "splitting". The last point is particularly problematical, because if large scale coherent superpositions existed, that would create naked eye, macroscopic evidence: e.g. ghostly traces of a world where the Nazis won. All in all, a coherent superposition isn't a world you could live in.
I said complex coherent superpositions are difficult to maintain. What destroys them? Environmentally induced decoherence!
Interference phenomena are a well-known and crucial aspect of quantum mechanics, famously exemplified by the two-slit experiment. There are many situations, however, in which interference effects are artificially or spontaneously suppressed. The theory of decoherence is precisely the study of such situations. (SEP)
Decoherence tries to explain why we don't notice "quantum weirdness" in everyday life -- why the world of our experience is a more-or-less classical world. From the standpoint of decoherence, sure there might not be any objective fact about which slit an electron went through, but there is an objective fact about what you ate for breakfast this morning: the two situations are not the same!
The basic idea is that, as soon as the information encoded in a quantum state "leaks out" into the external world, that state will look locally like a classical state. In other words, as far as a local observer is concerned, there's no difference between a classical bit and a qubit that's become hopelessly entangled with the rest of the universe.
(http://scottaaronson.com/democritus)
Decoherence is the study of interactions between a quantum system (generally a very small number of microscopic particles like electrons, photons, atoms, molecules, etc. - often just a single particle) and the larger macroscopic environment, which is normally treated "classically," that is, by ignoring quantum effects, but which decoherence theorists study quantum mechanically. Decoherence theorists attribute the absence of macroscopic quantum effects like interference (which is a coherent process) to interactions between a quantum system and the larger macroscopic environment.(www.informationphilosopher.com)
Decoherent branches are necessarily large, since decoherence is a high level phenomenon. They are also stable, non interacting and irreversible...everything that would be intuitively expected of a "world". But there is no empirical evidence for them (in the plural), nor are they obviously supported by the core mathematics of quantum mechanics, the Schrödinger equation.
We have evidence of small scale coherent superposition, since a number of observed quantum effects depend on it, and we have evidence of decoherence, since complex superpositions are difficult to maintain. What we don't have evidence of is decoherence into multiple branches. From the theoretical perspective, decoherence is a complex, entropy-like process which occurs when a complex system interacts with its environment. But without decoherence, MW doesn't match observation. So there is no theory of MW that is both simple and empirically adequate, contra Yudkowsky and Deutsch.
The original, Everettian, approach is based on coherence. (Yudkowsky says "Macroscopic decoherence, a.k.a. many-worlds, was first proposed in a 1957 paper by Hugh Everett III" ... but the paper doesn't mention decoherence[1]) As such, it fails to predict classical observations -- at all -- it fails to predict the appearance of a broadly classical universe. If everything is coherently superposed, so are observers...but the naturally expected experience of an observer in coherent superposition with themselves is that they function as a single observer making ambiguous, superposed observations ... not two observers each making an unambiguous, classical observation, and each unaware of the other. Such observers would only ever see superpositions of dead and living cats, etc.
(A popular but mistaken idea is that full splitting happens microscopically, at every elementary interaction. But that would make complex superpositions non-existent, whereas a number of instruments and technologies depend on them -- so it's empirically false.)
Later, post 1970s, many worlds theorists started to include decoherence to make the theory more empirically adequate, but inasmuch as it is additional structure, it places the simplicity of MWI in doubt. In the worst case, the complexity is SWE+decoherence+preferred basis, whereas in the best case, it's SWE alone, because decoherence is implicit in SWE, and preferred basis is implicit in decoherence. Decoherentists hope to show that the theory can be reduced to core QM, such as the Schrödinger equation, but it currently uses more complex math, the "reduced density matrix". The fact that this research is ongoing is strong evidence that the whole problem was not resolved by Everett's 1957 paper. In any case, without a single definitive mechanism of decoherence, there is no definitive answer to "how complex is MWI".
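For concreteness, here is the schematic, textbook form of the reduced density matrix move (my sketch, not tied to any particular decoherence programme): a system entangled with its environment looks classical locally once the environment states are nearly orthogonal.

$$|\Psi\rangle = a\,|0\rangle|E_{0}\rangle + b\,|1\rangle|E_{1}\rangle,\qquad \rho_{S}=\mathrm{Tr}_{E}\,|\Psi\rangle\langle\Psi| = |a|^{2}|0\rangle\langle 0| + |b|^{2}|1\rangle\langle 1| + ab^{*}\langle E_{1}|E_{0}\rangle\,|0\rangle\langle 1| + a^{*}b\,\langle E_{0}|E_{1}\rangle\,|1\rangle\langle 0|$$

When $\langle E_{0}|E_{1}\rangle \to 0$, the interference (off-diagonal) terms are suppressed and $\rho_{S}$ is effectively diagonal. The partial trace over the environment is machinery over and above the bare Schrödinger evolution of the system on its own, which is why its contribution to the complexity count is contested.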
And single-universe decoherence is quite feasible. Decoherence adds something to many worlds, but many worlds doesn't add anything to decoherence.
So, coherent superpositions exist, but their components aren't worlds in any intuitive sense; and decoherent branches would be worlds in the intuitive sense, but decoherence isn't simple. Also, theoretically and observationally, decoherence could be a single world phenomenon. Those facts -- the fact that it doesn't necessarily involve multi way branching, and the fact that it is hard to evaluate its complexity because there is not a single satisfactory theory for it -- mean it is not a "slam dunk" in Yudkowsky's sense.
The Yudkowsky-Deutsch claim is that there is a single MW theory, which explains everything that needed explaining, and is obviously simpler than its rivals. But coherence doesn't save appearances, and decoherence, while more workable, is not known to be simple. So neither theory has both virtues.
Which makes the term "Everett branch" rather confusing. The writer possibly means a decohered branch, under the mistaken assumption that Everett was talking about them. Everett's dissertation can be found here ↩︎
Human value isn't a set of preferences that can be pursued individually without conflict. Evolutionary psychology doesn't predict that, and we see the conflicts played out every day. There is more evidence for the incoherence of human value than there is for just about anything.
So "human value" can't be equated with the good or the right. It's a problem not a solution.
It also can't be equated with the safe. Copying human values into an AI won't give you an AI that won't kill you. You've mentioned self preservation as something that is part of human value (even if instrumentally) and dangerous in an AI, but the Will to Power, the cluster of values connected with ambition, hierarchy and dominance, is more so.
It's odd that this point is so often missed by rationalists. Perhaps that's because they tend to have Hufflepuff values.
A handwavy argument that “training is a bit like evolution, so maybe the same social dynamics should apply to its products” is inaccurate: you can train in aligned behavior, so you should (in both the evolutionary and engineering senses of the word) — but you can’t evolve it, evolution just doesn’t do that.
Inasmuch as it is not like natural selection, it is like artificial selection. That's good news, because artificial selection doesn't copy human values blindly: if you want AIs that are helpful and self sacrificing, you can select them to be, much more so than a human.
It's generally hard to see what useful work is being done by "human value". Value is only relevant to alignment as opposed to control, for one thing...some human values are downright unsafe, for another. Taking "human value" out of the loop allows you to get to the conclusion that "if you don't want to be killed, build AIs that don't want to kill you" more quickly.
However, Evolutionary Psychology does make it very clear that (while morally anthropomorphizing aligned AIs is cognitively-natural for current humans), doing this is also maladaptive. This is because AIs aren’t in the right category – things whose behavior is predicted by evolutionary theory – for the mechanisms of Evolutionary Moral Psychology to apply to them.
Their behaviour is not predicted by evolutionary theory, but is predicted by something wider.
Those mechanisms make this behavior optimal when interacting with co-evolved intelligences that you can ally with (and thus instinctive to us) — whereas, for something you constructed, this behavior is suboptimal.
Things do what they do. It doesn't have to be an optimization.
But WF of a human is spatially extensive enough.
It factors into localised parts. Humans aren't Bose Einstein condensates.
our sensorium should look like a fine grained brain scan
Why not like a drawing of a head?
Because you were saying that the binding comes from physics. That means the lack of it comes from physics, if the physics isn't right.
Anyway, the binding problem for qualia is no different from the binding problem for fire
Why is there a binding problem for fire?
There is just no reason to promote the limits of human introspection into fundamental ontology.
Are you now saying that the binding comes from neurology?
C1: Let me put it this way then, how do you combine all of these tiny little microexperiences into a coherent macroexperience? You have a combination problem.
C2: Okay, I don’t know, but at least this gives us a good starting for solving the hard problem right? We’ve reduced the hard problem into a combination problem. That seems more tractable.
C1: It sounds epiphenomenal to me. You couldn’t explain any observable behaviour by postulating these properties.
C2: Sure, it wouldn’t have any third person physical observables. But it would do the job of fixing the phenomenal character of experience from a first person perspective.
The structural properties of matter, or whatever the underlying substance is, are sufficient to predict everything physicists want to predict. To say that the intrinsic, nonstructural properties of matter are some kind of Qualia therefore entails epiphenomenalism. It allows you to predict conscious experience, but at the expense of the binding problem: if Qualia are just the intrinsic nature of quarks and electrons, then our sensorium should look like a fine grained brain scan. So Russellian Monism, the identification of Qualia with intrinsic properties, has a bad case of the Binding problem.
@Signer Something being a WF doesn't mean it is nonlocal or particularly spatially extensive, since WFs can bunch down to any finite size. Most of the electrons in the human body are localised to orbitals that are some nanometers across (but not localised within them).
On the Ptolemaic system’s accuracy when you add epicycles: due to geocentrism turning out to be incorrect, the apparently high accuracy vanishes when you look from a vantage point not on Earth
You can build a different, arbitrarily complex, system for a vantage point other than the Earth. Since geocentrism is wrong, it can't be any worse.
Looking is still the ultimate arbiter between the different models.
Is it? Don't we use simplicity as a criterion, as well? How about consilience? Does "ultimate arbiter" mean "only arbiter"? Is empirical correctness necessary or sufficient?
What you and @Ape in the coat are saying is mostly just vague.
>The interpretation of quantum mechanics is a philosophical puzzle that was baffling physicists and philosophers for about a century. In my view, this confusion is a symptom of us lacking a rigorous theory of epistemology and metaphysics
They are symptoms of each other. Epistemology depends on ontology, ontology depends on the right interpretation of physics...which depends on epistemology.
>What is so confusing about quantum mechanics?
Recovering the appearance of a classical reality from a profoundly unclassical substructure.
>and CI will prescribe contradictory beliefs to different agents (as in the Wigner’s friend thought experiment).
It's important to notice that Wigner's Friend only implies an inconsistency about when collapse occurred: there is no inconsistency between what Wigner and the Friend actually measure. So CI is still consistent as an instrumentalist theory. Indeed, you can cast it as an instrumentalist theory, by refusing to speak about collapse as a real process, and only requiring that observers make sharp real-valued observations for some unknown reason -- substituting an irrealist notion of measurement for collapse.
>In CI, the wavefunction is merely a book-keeping device with no deep meaning of its own. In contrast
If you are going to be realist about collapse, you need to be realist about WFs -- otherwise what is collapsing? Realism about both is consistent, irrealism about both is consistent.
>However, the MWI has no mathematical rule for computing probabilities of observation sequences. If all “worlds” exist at the same time, there’s no obvious reason to expect to see one of them rather than another. MWI proponents address this by handwaving into existence some “degree of reality” that some worlds posses more than others. However, the fundamental fact remains that there is no well-defined prescription for probabilities of observation sequences, unless we copy the prescription of CI: however the latter is inconsistent with the intent of MWI in cases when decoherence fails, such as Wigner’s friend.
"The" MWI is more than one thing. If you are talking about coherent superposition, then there is no need to invent a probabilistic weighting , since standard quantum mechemical measure (WF amplitude) gives you that via the Born rule. But we can tell.we are not in a macroscopic coherent superposition, because it would look weird. If you are talking about decoherent branching, it's not clear how or whether branches are weighted. That might matter for the purposes of physics, since the experimenter is only operating in one branch, and any other can have no effect. But for some other purposes , such as large world ethics, or anthropics, it might be nice to have weights for branches.
>and fundamentally this information is all there is
If you want to do decision theory, and not just probability theory, you need ontology, not just epistemology. But why would the process stop there? Is decision theory the most ambitious goal you can have?
Most of the current debate about metaphysics centres on consciousness. Physicalism of a substantial kind, materialism, seems unable to explain phenomenal consciousness -- why should an insubstantial, information-only ontology fare any better? Rather than having anything extra to offer, it is more minimal. The same information can appear in different forms .. drinking the wine is not reading the label .. which is a strong hint that information is not all there is.
All the “contentiousness” evaporates as soon as we’ve fixed the definitions and got rid of the semantic confusion.
Of course not. Having clear semantics is a necessary condition for understanding the world, not a sufficient one. You have to look. Among other things.
You gather evidence about interpreting evidence
You can only gather theories about interpreting evidence. You can't see how well such theories work by direct inspection. It isn't looking.
This would work much better if you thought about it concretely. Alice says evidence includes introspections, subjective seemings; Bob says it is only ever objective. What do you do next?
When you are a result of successful replication of imperfect replicators in a competitive environment with limited resources there is quite a lot of evidence for some kind of “optimality”.
I don't see why a slug or wallaby is optimising anything, so why should I be? What makes humans the pinnacle of creation?
If they existed in some kind of separate magisterium where our common knowledge wouldn't be applicable then yes
They exist in a separate magisterium where direct, sensory evidence isn't applicable, because they are about the causes and meaning of whatever sensory evidence you happen to have. The interpretation of evidence is a separate magisterium from you gathering evidence, and not in a spooky way.
So you use indirect feedback
Which is what? Give concrete examples.
I can keep applying it to the “tricky cases”
Applying what? You can't keep applying evidence-gathering to solve the problem of interpreting evidence. It's unclear whether you are talking about pure empiricism, or some kind of vaguely defined solution to everything.
And this way I can aggregate more and more evidence
Which is not self interpreting, so you are just creating a bigger and bigger problem.
We can try multiple of them and see how these models predict new data
But they don't, in the trickiest cases. I've already addressed that point: the Ptolemaic model can be adjusted to fit any data.
Being already selected for intuitions related to surviving in the world
I've already addressed that point too: you don't need ontological understanding to survive. You don't get direct feedback about ontological understanding. So it's a separate magisterium.
Okay I think I understand what is going on here. Are you under impression that I’m trying to bring back the old empiricism vs rationalism debate, arguing on the side of empiricism?
What's "looking" if not empiricism?
I'm not arguing for rationalism over empiricism, or that empiricism should never be used. I'm arguing against pure empiricism as being able to solve all problems. Which is not to say there is something else that does. It's a mixture of pluralism -- there's more than one kind of epistemic problem and solution -- and scepticism -- there's no guarantee of solving anything even using more tools than "looking".
I already said that here:-
And it's also not the case that you have to make positive claims about apriori reasoning to point out the limitations of empiricism. And it's also not the case that noticing the limitations of empiricism is the same as refusing to use it at all.
Yes, it’s all probabilities all the way down, without perfect certainty
No, it's worse than that. Probabilities require quantification of how true or likely something is. But there is no way of objectively quantifying that for ontological interpretation. And subjective probability leads to perennial disagreement, not convergence.
We can come up with adversarial examples where it means that we were completely duped, and our views are completely disentangled from “true reality” and were simply describing an “illusion”, but
But, that only allows us to reject N false theories, not home in on a single true one. Convergence is a problem as well as certainty.
Renaming “reality” to “illusion” doesn’t actually change anything of substance
If your beliefs are illusory, they are false. That might not make any difference instrumentally, to what you can predict, but you are not assuming instrumentalism and neither is Yudkowsky.
But generally, consider the fact that philosophy reasons in all direction and normality is only a relatively small space of all possible destinations
What's normality? If you just mean "saving appearances", rather than predicting something that is empirically disproveable, then most philosophy does that. What doesn't? Illusionism? But that's quite popular around here!
I also thought the robot's answer missed the point quite badly …because it reduced the ought all the way down to an is—or rather a bunch of isses.
If you dismiss any reduction of ought to is,
I don't. As I said:-
Reducing ethical normativity isn’t bad
Not to what one would. Your ethical module may not be directly connected to the behavioral one and so your decisions are based on other considerations, like desires unrelated to ethics.
Are you saying that's the only problem? That the action you would have taken absent those issues is the right action, in an ultimate sense?
This doesn’t change the fact that what you ought to do is the output (or a certain generalization of multiple outputs) of the ethical module,
It's not a fact. There are any number of ethical theories where what you should do is not necessarily what you would do, e.g. Utilitarianism, which is quite popular round here. When you think about maths, that's neural activity, but it doesn't follow that it defines mathematical correctness. Errors are neural activity as well. The normative question is quite separate. Even if you want to reduce it, it doesn't follow that the only way to do so is to have eight billion correct answers.
which is a computation taking place in the real world, which can be observed.
That's quite irrelevant. The fact that it takes neural activity to output an action tells you nothing about the ethics of the action. "Ought" and "ethical" aren't just vacuous labels for anything you do or want to do.
there are potentially eight billion answers to what one ought to do.
Potentially but not actually.
Nothing hinges on having exactly eight billion right answers. More than one right answer is enough of a problem.
Once again, when you look, turns out individual ethical views of people are not *that* different
Yes they are. Political divisions reflect profound ethical divisions.
There's a consistent theme in rationalist writing on ethics, where the idea that everyone has basically the same values, or "brain algorithms", is just assumed … but it needs to be based on evidence as much as anything else.
Not basically the same, but somewhat similar. And it’s not just assumed, it’s quite observable.
The differences are observable. Fraught debates are people disagreeing about the value of freedom versus equality, etc.
In any case, the problem of subjectivism is that there are potentially multiple right answers.
Human ethical disagreements are mostly about edge cases. Like what is your objective claim here, that human values are not correlated at all?
No. I don't accept that ethics "is" whatever values you happen to have, or whatever decision you happen to make.
It’s social constructivism of morality. Which is rooted in our other knowledge about game theory and evolution.
If morality is socially constructed, the robot is wrong about metaethics. What the robot should do is follow the social rules, and if its programming is something different, then its actions are object level wrong.
Yes, this is exactly my point. A lot of things, which are treated as “applied missing the point answers” are in fact legitimately philosophically potent. At the very least, we should be paying much more attention to them.
Is the robot missing the point or not?
Therefore it’s not just “by looking” but “pretty much by looking”. I completely agree about the necessity to abandon the notion of certainty
That's just the start. The tricky question is how much else we need to abandon. In particular, it's not clear whether convergence on a single most likely theory of everything is possible, even if you have abandoned certainty.
MIRI’s plan, to build a Friendly AI to take over the world in service of reducing x-risks, was a good one.
How much was this MIRI’s primary plan?
It was Yudkowsky's plan before MIRI was MIRI.
http://sl4.org/archive/0107/1820.html
"Creating Friendly AI"
https://intelligence.org/files/CFAI.pdf
Both from 2001.
"it" isn't a single theory.
The argument that Everettian MW is favoured by Solomonoff induction is flawed.
If the program running the SWE outputs information about all worlds on a single output tape, they are going to have to be concatenated or interleaved somehow. Which means that to make use of the information, you have to identify the subset of bits relating to your world. That's extra complexity which isn't accounted for, because it's being done by hand, as it were.
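A rough way of putting the accounting point (my own schematic, not a formal result): what Solomonoff induction scores is a description of your observation sequence, not just of the dynamics, so the bits needed to locate your branch belong on the bill.

$$K(\text{my observations}) \;\lesssim\; K(\mathrm{SWE} + \text{initial state}) \;+\; K(\text{index picking my branch out of the output})$$

The first term is the one Many Worlds advocates count; the second grows with the amount and fineness of branching, and doing it "by hand" merely hides it from the comparison.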