No, it's not. There is no objective sense in which human suffering and extinction are bad. It's not even a matter of degree. Questions of morality are only meaningful from a subjective POV. That's Hume's Law in a nutshell. You can't solve the alignment problem by believing we should aim as close as possible to the "objective truth" that everything is just particles and forces and alignment doesn't matter. That's circular reasoning.
Why do you assume that Yelsgib doesn't know or keep that in mind?
The problem is that Yudkowsky insists that a mechanistic view of the universe is the only correct perspective, even though problems like alignment are inherently inaccessible from such a perspective due to Hume's Guillotine. It's only from a subjective POV that human suffering and/or extinction can be considered bad.
I always feel like I'm reading your response to some other argument, but you decide to use some indirect reference or straw-man instead of actually addressing the impetus for your posts. This article is a long way of saying that even when things aren't black and white, that doesn't mean shades of grey don't matter.
Also, I think people often reach for analogies as though they always provide clarification when sometimes they just muddle things. I have no idea what to make of your disappearing moon example. The odds that the entire moon could disappear and reappear seem very hard to compare to the odds that there's an invisible dragon that can cure cancer. Why not compare something highly probable with the cancer curing dragon instead of something so strangely contrived? You can't prove Big Ben will ring tomorrow, but the odds are much better than an invisible dragon baking you a cake!
Both AIXI and AIXItl will at some point drop an anvil on their own heads just to see what happens
You're confusing arbitrary optimization with a greedy algorithm, which AIXI explicitly is not. It considers a future horizon. I see you commit this fallacy often: you implicitly assume that "what would an arbitrarily intelligent system do?" is equivalent to "what would an arbitrarily greedy algorithm do?"
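To make the distinction concrete, here's a toy sketch (a made-up two-action world of my own, not AIXI's actual machinery) of how a finite-horizon choice can differ from a greedy one:

```python
# Toy contrast between a greedy choice and a finite-horizon choice. A greedy
# policy takes whatever maximizes the immediate reward; a horizon-based policy
# maximizes the sum of rewards over the next `horizon` steps, which is what
# AIXI-style planning does up to its horizon. States, actions, and rewards
# here are invented purely for illustration.

# transition[(state, action)] = (next_state, reward)
transition = {
    ("start", "grab"):    ("dead-end", 10),  # big reward now, nothing afterward
    ("start", "wait"):    ("good", 1),       # small reward now, more later
    ("dead-end", "grab"): ("dead-end", 0),
    ("dead-end", "wait"): ("dead-end", 0),
    ("good", "grab"):     ("good", 8),
    ("good", "wait"):     ("good", 8),
}
actions = ["grab", "wait"]

def value(state, horizon):
    """Best achievable reward sum over the next `horizon` steps."""
    if horizon == 0:
        return 0
    return max(reward + value(next_state, horizon - 1)
               for action in actions
               for next_state, reward in [transition[(state, action)]])

def best_action(state, horizon):
    return max(actions, key=lambda a: transition[(state, a)][1]
                                      + value(transition[(state, a)][0], horizon - 1))

print(best_action("start", 1))  # horizon 1 (greedy): "grab"
print(best_action("start", 3))  # horizon 3: "wait", because future reward counts
```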
Also, the math of AIXI assumes the environment is separably divisible
I don't know what you mean by this.
If you're talking about the fact that models of the environment where dropping the anvil on its head yields some reward are mixed in with models of the environment where dropping an anvil on its head results in zero reward thereafter: that's not "assuming the environment is separably divisible". That's just the fact that each program it checks is a different model of the environment as a whole; specifically, the ones that are congruent with its experience.
Supposedly, the models that return a reward for dropping an anvil on its head would become vanishingly improbable pretty quickly, even if the system is incapable of learning about its own mortality (which it wouldn't have to do, because AIXI only works in an agent-environment loop where the agent is not embedded in the environment).
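A crude illustration of that claim, with toy numbers of my own rather than Hutter's actual math: weight each candidate environment model by 2^-length and by how well it predicts the observed history, and the "anvil pays off" models lose nearly all their posterior weight quickly:

```python
# Toy numbers, purely illustrative: each environment model gets a prior weight
# of 2^-(description length) and a likelihood for the history observed so far.
# Models that keep predicting reward for self-damaging actions fit the data
# poorly, so their posterior weight collapses as observations accumulate.

models = {
    # name: (description length in bits, probability it assigns to each observation)
    "damage-hurts":  (20, 0.9),  # predicts low reward near damage; fits experience well
    "anvil-jackpot": (25, 0.2),  # predicts reward for self-damage; fits experience poorly
}

def posterior(models, n_observations):
    weights = {name: 2.0 ** -length * fit ** n_observations
               for name, (length, fit) in models.items()}
    total = sum(weights.values())
    return {name: round(w / total, 6) for name, w in weights.items()}

for n in (0, 5, 20):
    print(n, posterior(models, n))
# "anvil-jackpot" starts merely improbable (it's a longer program) and becomes
# vanishingly improbable once its predictions keep failing.
```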
If we had enough CPU time to build AIXItl, we would have enough CPU time to build other programs of similar size, and there would be things in the universe that AIXItl couldn't model.
I think you're missing the point of AIXI if you're trying to think about it in practical terms. Math is a very useful human construct for studying patterns. The space of patterns describable by math is way larger than the set of patterns that exist in the real world. This creates a lot of confusion when trying to relate mathematical models to the real world.
For instance: it's trivial to use algorithmic information theory to prove that a universal lossless compression algorithm is impossible, yet we use lossless compression to zip files all the time, because we don't live in a world that looks like TV static. Most files in the real world have a great deal of structure. Most N-length bit-strings in the set of all possible N-length bit-strings have no discernible structure at all.
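For anyone who wants that counting argument spelled out, here's the pigeonhole version in a few lines (N is arbitrary; I picked 16):

```python
# Pigeonhole check: there are 2^N distinct N-bit strings, but only 2^N - 1
# strings shorter than N bits, so no lossless compressor can shrink every
# N-bit input. Real files compress because they're far from uniformly random.

N = 16
inputs = 2 ** N                                   # distinct N-bit strings
shorter_outputs = sum(2 ** k for k in range(N))   # strings of length 0 .. N-1
print(inputs, shorter_outputs)                    # 65536 vs 65535: one short
```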
I think it's easy to get distracted by the absurdity of AIXI and miss the insight that it provides.
AIXItl (but not AIXI, I think) contains a magical part: namely a theorem-prover which shows that policies never promise more than they deliver.
What does that even mean? How does AIXItl promise something?
My main problem is that I don't think Hutter should have titled his book "Universal Artificial Intelligence". For one, because I don't think the word "artificial" belongs; but mainly because I don't think optimality is equivalent to intelligence, nor do I think intelligence should be defined in terms of an agent-environment loop. That implies that a system with less agency, like Stephen Hawking, would be less intelligent.
I think of intelligence as a measure of a system's ability to produce solutions to problems. Unlike optimality, I think the resources required to produce a given solution should also factor in. That ends up being highly circumstantial, so it ends up being a relative measure rather than something absolute. AIXItl, being about as brute-force as an algorithm gets, would probably compare poorly to most heuristic systems.
Hey, G Gordon Worley III!
I just finished reading this post because Steve2152 was one of the two people (you being the other) to comment on my (accidentally published) post on formalizing and justifying the concept of emotions.
It's interesting to hear that you're looking for a foundational grounding of human values, because I'm planning a post on that subject as well. I think you're close with the concept of error minimization. My theory reaches back to the origins of life and what sets living systems apart from non-living systems. Living systems are locally anti-entropic, which means: 1) according to the second law of thermodynamics, a living system can never be a truly closed system, and 2) life is characterized by a medium that can gather information, such as genetic material.
The second law of thermodynamics means that all things decay, so it's not enough to simply gather information, the system must also preserve the information it gathers. This creates an interesting dynamic because gathering information inherently means encountering entropy (the unknown) which is inherently dangerous (what does this red button do?). It's somewhat at odds with the goal of preserving information. You can even see this fundamental dichotomy manifest in the collective intelligence of the human race playing tug-of-war between conservatism (which is fundamentally about stability and preservation of norms) and liberalism (which is fundamentally about seeking progress or new ways to better society).
Another interesting consequence of the 'telos' of life being to gather and preserve information: it inherently provides a means of assigning value to information. That is: information is more valuable the more it pertains to the goal of gathering and preserving information. If an asteroid were about to hit Earth and you were chosen to live on a space colony until Earth's atmosphere allowed humans to return and start society anew, you would probably favor taking a 16 GB thumb drive with the entire English Wikipedia article text over a server rack holding several petabytes of high-definition recordings of all the reality television ever filmed, because the latter wouldn't be very helpful toward the goal of preserving knowledge *relevant* to mankind's survival.
The theory also opens interesting discussions like: if all living things have a common goal, why do things like parasites, conflict, and war exist? Also, how has evolution led to a set of instincts that imperfectly approximate this goal? How do we implement this goal in an intelligent system? How do we guarantee such an implementation will not result in conflict? Etc.
Anyway, I hope you'll read it when I publish it and let me know what you think!
How? The person I'm responding to gets the math of probability wrong and uses it to make a confusing claim that "there's nothing wrong", as though we have no more agency over the development of AI than we do over the chaotic motion of a die.
It's foolish to liken the development of AI to a roll of the dice. Given the stakes, we must try to study, prepare for, and guide the development of AI as best we can.
This isn't hypothetical. We've already built a machine that's more intelligent than any man alive and which brutally optimizes toward a goal that's incompatible with the good of mankind. We call it "Global Capitalism". There isn't a man alive who knows how to stock the shelves of stores all over the world with #2 pencils that cost only 2 cents each, yet it happens every day because *the system* knows how. The problem is: that system operates with a sociopathic disregard for life (human or otherwise) and has exceeded all limits of sustainability without so much as slowing down. It's a short-sighted, cruel leviathan, and there's no human at the reins.
At this point, it's not about waiting for the dice to settle, it's about figuring out how to wrangle such a beast and prevent the creation of more.
This is a pretty lame attitude towards mathematics. If William Rowan Hamilton showed you his discovery of quaternions, you'd probably scoff and say "yeah, but what can that do for ME?".
Occam's razor has been a guiding principle of science for centuries without any proof for why it's a good policy. Now Solomonoff comes along and provides a proof, and you're unimpressed. Great.
If the "cartesian barrier" is such a show-stopper, then why is it non-trivial to prove that I'm not a brain in a vat remotely puppetering a meat robot?
Was nobody intelligent before the advent of neuroscience? Do people need to know neuroscience before they qualify as intelligent agents? Are there no intelligent animals?
I'm really not sure how to interpret the requirement that an agent know about software upgrades. There is a system called a Gödel machine that's compatible with AIXI(tl), and it's all about self-modification; however, I don't know of many real-world examples of intelligent agents concerned with whatever the equivalent of a software upgrade would be for a brain.
Rewards help by filtering out world models where doing dangerous things has a high expected reward. Remember that AIXI includes reward in its world models and exponentially devalues long world models. If the reward signal drops as AIXI pilots its body close to fire, lava, acid, sharks, etc., the world model that says "don't damage your body" is much shorter than the model that says "don't go near fire, lava, acid, sharks, etc., but maybe dropping an anvil on your head is a great idea!".
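Here's a back-of-the-envelope version of that length penalty, with program lengths I invented purely for illustration:

```python
# Illustrative only: AIXI weights a model by 2^-(program length), so a compact
# "damage to the body reduces reward" rule starts with enormously more prior
# weight than a long enumeration of hazards that carves out an anvil exception.

short_model_bits = 50    # "damage to the body reduces reward"
long_model_bits = 400    # "avoid fire, lava, acid, sharks, ... but anvils are great"

prior_ratio = 2.0 ** (long_model_bits - short_model_bits)
print(f"short model favored by a factor of roughly {prior_ratio:.3g}")
```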
That's not an AIXI thing. That's a problem for all agents.
The anvil "paradox" simply illustrates the essential intractability of tabula rasa in general, but its not like you couldn't initialize AIXI with some apriori.
In a totally tabula rasa set-up, an agent can't know if anything it outputs will yield arbitrarily high or low reward. That's not unique to AIXI. It's also not unique to AIXI that it can only infer the concept of mortality.
Did your parents teach you what they think is deadly, or were you born with innate knowledge of death? How exactly is it that you came to suspect that dropping an anvil on your head isn't a good idea? Were your parents perfect programmers?
So you are saying it can't generalize. That's exactly what you're saying.
Teenagers do horribly dangerous things all the time for the dubious reward of impressing their peers, yet this machine that's diligently applying inductive inference to determine the provably optimal actions fails to meet your yet-to-be-described standard for intelligence if it's unlucky?
Also, why is the reward function the only means of feeding this agent data? Couldn't you just tell it that jumping off a cliff is a bad idea? Do you think it undermines the intelligence of a child to tell it to look both ways before crossing the street?
Why? Prove that a small punishment wouldn't work. If you give the AIXI heat sensors so it gradually gets more and more punishment as it approaches a fire, show me how Occam's razor wouldn't prevail and say "I bet you'll get more punishment if you get even closer to the fire". Where does the model that says there's a pot of gold in a lava pit come from? How does it end up drowning out literally every other world model? Explain it. Don't just say "it could happen, therefore AIXI isn't perfect, and it has to be perfect to be intelligent".
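To spell out the extrapolation I'm gesturing at (the distances and rewards below are made up), the simplest trend consistent with graded punishment on the approach predicts even more punishment closer in, so the expected value of stepping closer keeps going down:

```python
# Made-up readings: graded punishment as the agent approaches the fire.
observations = [(5.0, -1.0), (4.0, -2.0), (3.0, -4.0), (2.0, -8.0)]  # (distance, reward)

# The simplest trend consistent with the data: punishment doubles per step closer.
ratios = [r2 / r1 for (_, r1), (_, r2) in zip(observations, observations[1:])]
avg_ratio = sum(ratios) / len(ratios)
predicted_one_step_closer = observations[-1][1] * avg_ratio
print(f"extrapolated reward one step closer: about {predicted_one_step_closer:.1f}")  # ≈ -16
```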
No agent can perfectly emulate itself. AIXItl can have an approximate self-model just like any other agent. It would have an incomplete world-model otherwise. It can "think it's like other brains" too. That's also Occam's razor. A world model where you assume others are like you is shorter than a world model where other agents are completely alien. Imperfect ≠ incapable. You're applying a double standard.
Why not?
AIXItl can absolutely develop a world model that includes a mortal self-model. I think what you're arguing is that, since there will always be a world model where jumping off a cliff yields some reward, it will never hypothesize that its future reward goes to zero; it will always assume there's some chance it will live. That's not irrational. There is some chance it will live. That chance never technically goes to zero. That's very different from thinking jumping off a cliff is the optimal action.
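A tiny expected-value comparison (numbers made up) of why "nonzero chance of surviving" is not the same as the jump being chosen:

```python
# Made-up numbers: a nonzero survival probability doesn't make the jump the
# reward-maximizing action, because the models where the agent dies contribute
# (near) zero future reward and dominate the mixture.

p_survive = 1e-6
reward_if_survive = 100.0        # whatever the surviving-branch models promise
expected_jump = p_survive * reward_if_survive   # ~0.0001
expected_walk_away = 10.0                       # modest, near-certain reward

print(expected_jump, expected_walk_away)  # the jump never wins the comparison
```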
Or maybe you're saying that if you expand the tree of future actions, you are supposing that you can take those actions? Not in the world models that say you're dead. Those will continue to yield zero after all your agency is obliterated.
Imagine you had a mech suit, and you could lift a car. Your world model will include that mech suit. It will also include the possibility that said mech suit is destroyed. Then you can't lift a car.
I'm done for now. I really don't like this straw-man conversation style of article. Why can't you argue actual points against actual people?