Eliezer is very explicit, and repeats many times in that essay, including in the very segment you quote, that his code of meta-honesty does in fact compel you to never lie in a meta-honesty discussion. The first 4 paragraphs of your comment are not elaborating on what Eliezer really meant; they are disagreeing with him. Reasonable disagreements too, in my opinion, but conflating them with Eliezer's proposal is corrosive to the norms that allow people to propose and test new norms.
I had trouble making the connection between the first two paragraphs and the rest. Are you introducing what you mean by an "alarm" and then giving a specific proposal for an alarm afterwards? Is there significance in how the example alarms are in response to specific words being misleading?
Writing suggestion: Expand the acronym "ELK" early in the piece. I looked at the title and my first question was what ELK is. I quickly skimmed the piece and wasn't able to find out until I clicked on the link to the ELK document. I now see it's also expanded in the tag list, which I normally don't examine. I haven't read the article more closely than a skim.
On further thought I want to walk back a bit:
in practice the problems of infinite ethics are more likely to be solved at the level of maths, as opposed on the level of ethics and thinking about what this means for actual decisions.
I highly doubt this problem will be solved purely on the level of math, and expect it will involve more work on the level of ethics than on the level of foundations of mathematics. However, I think taking an overly realist view on the conventions mathematicians have chosen for dealing with infinities is an impediment to thinking about these issues, and studying alternative foundations is helpful to ward against that. The problems of infinite ethics, especially for uncountable infinities, seem to especially rely on such realism. I do expect a solution to such issues, to the extent it is mathematical at all, could be formalized in ZFC. The central thing I liked about the comment is the call to rethink the relationship of math and mathematical infinity to reality, and that doesn't necessarily require changing our foundations, just changing our attitude towards them.
If the only alternative you can conceive of for ZFC is removing the axiom of choice then you are proving Jan_Kulveit's point.
I was reading the story for the first quotation, entitled "The discovery of x-risk from AGI", and I noticed something around the quotation that doesn't make sense to me, and I'm curious if anyone can tell what Eliezer Yudkowsky was thinking. As referenced in a previous version of this post, after the quoted scene the highest Keeper commits suicide. Discussing the impact of this, EY writes,
And in dath ilan you would not set up an incentive where a leader needed to commit true suicide and destroy her own brain in order to get her political proposal taken seriously. That would be trading off a sacred thing against an unsacred thing. It would mean that only true-suicidal people became leaders. It would be terrible terrible system design.
So if anybody did deliberately destroy their own brain in attempt to increase their credibility - then obviously, the only sensible response would be to ignore that, so as not create hideous system incentives. Any sensible person would reason out that sensible response, expect it, and not try the true-suicide tactic.
The second paragraph is clearly a reference to acausal decision theory: people making a decision because of how they anticipate others will react to expecting that this is how they make the decision, rather than because of the direct consequences of the decision. I'm not sure if it really makes sense (a self-indulgent reminder that nobody knows any systematic method for producing prescriptions from acausal decision theories in cases where they purportedly differ from causal decision theory in everyday life). Still, it's fiction; I can suspend my disbelief.
The confusing thing is that in the story the actual result of the suicide is exactly what this passage says shouldn't be the result. It convinces the Representatives to take the proposal more seriously and implement it. This passage is just used to illustrate how shocking the suicide was; no additional considerations are described for why the reasoning is incorrect in those circumstances. So it looks like the Representatives are explicitly violating the Algorithm which supposedly underlies the entire dath ilan civilization and is taught to every child at least in broad strokes, in spite of being the second-highest ranked governing body of dath ilan.
Really all I need is that a strategy that takes n bits to specify will be performed by 1 in ∼2^n of all random strategies. Maybe a random strategy consists of a bunch of random motions that cancel each other out, and in 1 in ∼2^n of strategies, in between these random motions are directed actions that add up to performing this n-bit strategy. Maybe 1 in ∼2^n strategies start off by typing this strategy to another computer and end with shutting yourself off, so that the remaining bits of the strategy will be ignored. A prefix-free encoding is basically like the latter situation, except that ignoring the bits after a certain point is built into the encoding rather than being an outcome of the agent's interaction with the environment.
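To make the counting concrete, here is a minimal sketch, assuming strategies are modeled as fixed-length bit strings and that "performing the n-bit strategy" means starting with a particular n-bit prefix, with all later bits ignored. The function names and the example codewords are made up for illustration; this is only the simplest case of the claim, not the general argument.

```python
from itertools import product


def fraction_with_prefix(prefix: str, total_bits: int) -> float:
    """Fraction of all bit strings of length total_bits that start with prefix.

    Enumerates every string exactly, so keep total_bits small.
    """
    if len(prefix) > total_bits:
        raise ValueError("prefix is longer than the strategy length")
    matches = 0
    count = 0
    for bits in product("01", repeat=total_bits):
        count += 1
        if "".join(bits).startswith(prefix):
            matches += 1
    return matches / count


def is_prefix_free(codewords) -> bool:
    """True if no codeword is a proper prefix of a different codeword."""
    words = sorted(set(codewords))
    # In sorted order, a word that is a prefix of some other word
    # is always a prefix of its immediate successor.
    return not any(b.startswith(a) for a, b in zip(words, words[1:]))


if __name__ == "__main__":
    # A 3-bit strategy embedded in 8-bit random strategies: 1 in 2^3.
    print(fraction_with_prefix("101", 8))
    print(is_prefix_free(["0", "10", "110", "111"]))
    print(is_prefix_free(["0", "01"]))
```

Because the trailing bits are ignored, the fraction depends only on the length of the prefix and not on how long the random strategy is, which is the property the prefix-free encoding builds in directly.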
How do you make spoiler tags?
A neat thought experiment! At the end of it all, you no longer need to exchange fruit; you can just keep the fruit in place and exchange the identity of the people instead.
Thanks too for responding. I hope our conversation will be productive.
A crucial notion that plays into many of your objections is the distinction between "inner intelligence" and "outer intelligence" of an object (terms derived from "inner vs. outer optimizer"). Inner intelligence is the intelligence the object has in itself as an agent, determined through its behavior in response to novel situations, and outer intelligence is the intelligence required to create this object, determined through the ingenuity of its design. I understand your "AI hypothesis" to mean that any solution to the control problem must have inner intelligence. My response is that while solving the control problem may require a lot of outer intelligence, I think it requires only a small amount of inner intelligence. This is because it seems like the environment in Conway's Game of Life with random dense initial conditions is very low variety and requires a small number of strategies to handle. (Although just as I'm open-minded about intelligent life somehow arising in this environment, it's possible that there are patterns much more frequent than abiogenesis that make the environment much more variegated.)
Matter and energy are also approximately homogeneously distributed in our own physical universe, yet building a small device that expands its influence over time and eventually rearranges the cosmos into a non-trivial pattern would seem to require something like an AI.
The universe is only homogeneous at the largest scales; at smaller scales it is highly inhomogeneous in highly diverse ways, with structures like stars and planets and raindrops. The value of our intelligence comes from being able to deal with the extreme diversity of intermediate-scale structures. Meanwhile, at the computationally tractable scale in CGOL, dense random initial conditions do not produce intermediate-scale structures between the random small-scale sparks and ashes and the homogeneous large scale. That said, conditional on life being rare in the universe, I expect that the control problem for our universe requires lower-than-human inner intelligence.
You mention the difficulty of "building a small device that...", but that is talking about outer intelligence. Your AI hypothesis states that, however such a device can or cannot be built, the device itself must be an AI. That's where I disagree.
Now it could actually be that in our own physical universe it is also possible to build not-very-intelligent machines that begin small but eventually rearrange the cosmos. In this case I am personally more interested in the nature of these machines than in "intelligent machines", because the reason I am interested in intelligence in the first place is due to its capacity to influence the future in a directed way, and if there are simpler avenues to influence in the future in a directed way then I'd rather spend my energy investigating those avenues than investigating AI. But I don't think it's possible to influence the future in a directed way in our own physical universe without being intelligent.
Again, the distinction between inner and outer intelligence is crucial. In a pure mathematical sense of existence there exist arrangements of matter that solve the control problem for our universe, but for that to be relevant for our future there also has to be a natural process that creates these arrangements of matter at a non-negligible rate. If the arrangement requires a high outer intelligence then this process must be intelligent. (For this discussion, I'm considering natural selection to be a form of intelligent design.) So intelligence is still highly relevant for influencing the future. Machines that are mathematically possible but cannot practically be created are not "simpler avenues to influence in the future".
"to solve the control problem in an environment full of intelligence only requires marginally more intelligence at best"
What do you mean by this?
Sorry. I meant that the solution to the control problem need only be marginally more intelligent than the intelligent beings in its environment. The difference in intelligence between a controller in an intelligent environment and a controller in an unintelligent environment may be substantial. I realize the phrasing you quote is unclear.
In chess, one player can systematically beat another if the first is ~300 ELO rating points higher, but I'm considering that as a marginal difference in skill on the scale from zero-strategy to perfect play. If our environment is creating the equivalent of a 2000 ELO intelligence, and the solution to the control problem has 2300 ELO, then the specification of the environment contributed 2000 ELO of intelligence, and the specification of the control problem only contributed an extra 300 ELO. In other words, open-world control problems need not be an efficient way of specifying intelligence.
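For a rough sense of what "systematically beat" means at a 300-point gap, here is a small sketch using the standard logistic Elo expected-score formula. The formula is the usual convention for Elo ratings and is my assumption here, not something stated in the comment above; the function name is made up for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (win = 1, draw = 0.5, loss = 0) for player A against player B
    under the standard logistic Elo model with a 400-point scale."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    # The 2300-rated controller against the 2000-rated environment.
    print(round(expected_score(2300, 2000), 3))  # → 0.849
    # Equal ratings give an even match.
    print(expected_score(2000, 2000))  # → 0.5
```

Under this model the stronger side is expected to score about 85% per game, which fits the picture of a reliable edge that is still small compared to the full range from zero strategy to perfect play.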
But if one entity reliably outcompetes another entity, then on what basis do you say that this other entity is the more intelligent one?
On the basis of distinguishing narrow intelligence from general intelligence. A solution to the control problem is guaranteed to outcompete other entities in force or manipulation, but it might be worse at most other tasks. The sort of thing I had in mind for "NP-hard problems in military strategy" would be "this particular pattern of gliders is particularly good at penetrating a defensive barrier, and the only way to find this pattern is through a brute force search". Knowing this can give the controller a decisive advantage in military conflicts without making it any better at any other tasks, and can permit the controller to have lower general intelligence while still dominating.