I see, this is something I've wondered about... If the set is allowed to 'half contain' itself, or contain half of itself, then the paradox is kind of resolved. On your point about languages, I remember reading about the 'Santa Claus sentence', which states: "If this sentence is true, then Santa Claus exists". Clearly, if the sentence were true, then Santa Claus would exist. But this is exactly what the sentence says, so the sentence is true and Santa Claus must exist. Of course, if we instead ask what would happen if the sentence is false, then we are not drawn to conclude that Santa Claus exists, merely that he wouldn't even if the sentence were true...
When thinking about this I realized that the problem might just be that language is structured in an inherently linear way, so that expanding the statement fully would require an infinitely long sentence of the form: "If "If "If "If " ....infinitely deep .... " is true, then Santa Claus exists." is true, then Santa Claus exists." is true, then Santa Claus exists." is true, then Santa Claus exists." This is the only way to parse the sentence using linear language, and clearly it depends on what the text 'infinitely deep', which I've used to signify the 'rock bottom', actually means. This is indeterminate, similar to how the equation can be expanded out into an infinite expression which converges to two possible values when you iterate upon it starting from a particular number. One of them is what's commonly called the golden ratio; the other is the negative reciprocal of the golden ratio.
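As a rough illustration of the golden ratio comparison: the sketch below assumes the equation meant here is x = 1 + 1/x, which is my guess and not something spelled out in the comment. Its two solutions are the golden ratio and its negative reciprocal. Rearranging the same equation as x = 1/(x - 1) and iterating that form instead lands on the other solution, so which value you reach depends on how you unfold the expression. The function names are made up for this example.

```python
import math

def iterate(f, x0, steps=60):
    """Apply f repeatedly, starting from x0, and return the final value."""
    x = x0
    for _ in range(steps):
        x = f(x)
    return x

# Assumed form of the equation: x = 1 + 1/x (equivalent to x^2 = x + 1).
def forward(x):
    return 1 + 1 / x

# The same equation rearranged as x = 1 / (x - 1).
def rearranged(x):
    return 1 / (x - 1)

phi = (1 + math.sqrt(5)) / 2

upper = iterate(forward, 1.0)     # settles near the golden ratio
lower = iterate(rearranged, 0.0)  # settles near its negative reciprocal

print(upper, phi)
print(lower, -1 / phi)
```

Both fixed points solve the same equation; each way of unfolding it is only stable around one of them.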
I might think of more to say about this but don't currently have the time to do so, or to reply to your other comment. Hopefully I will soon.
In my other comment I forgot to ask you what you mean by "One application for this might be that if you could throw out all inconsistency forcing structures you could (perhaps) find the subset of programs for which the halting problem doesn't apply." Why might this be? Is there a mapping between all programs and the truth-value propagation system?
Your comment is much appreciated and touched on an idea I find interesting but didn't intend to reference directly in the post. Thanks for your kind words about this post. You make a very interesting point about networks of interconnected statements and arguments having more than simply a true or false value. I think the simplest such network which exhibits the kind of interesting structure you mention is that of the statement: "This statement is false." If you represent truth values of statements as numbers between 1 and -1, with 1 representing truth and -1 falsehood, then that statement can be written down as the equation x = -x, where x is the truth value of the statement. The fact that this statement is neither true nor false is a consequence of its actual solution being 0, which I think of as representing 'tralsehood'. More generally, the process you describe sounds a lot like the Google PageRank algorithm, which is a process by which a monstrously complicated family of simultaneous equations can be approximately satisfied because they have basins of attraction around the solutions, which would be consistent sets of truth values in this context. Clearly, not all of these networks have integer solutions, as you point out, but there will always be real solutions, and I would interpret these as representing the statements having varying levels of 'futhhood' or 'tralsity'.
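To make the x = -x picture concrete, here is a minimal sketch of the kind of iterative settling I have in mind. It is only loosely in the spirit of PageRank, and the damping factor, the function names, and the two-statement example are my own assumptions for illustration. Truth values live in [-1, 1], each statement has an update rule, and every step nudges the current values part of the way toward what the rules demand.

```python
def damped_iteration(update, x0, damping=0.3, steps=100):
    """Repeatedly move the truth values x part of the way toward update(x)."""
    x = list(x0)
    for _ in range(steps):
        target = update(x)
        x = [(1 - damping) * xi + damping * ti for xi, ti in zip(x, target)]
    return x

def liar(x):
    # "This statement is false": x = -x
    return [-x[0]]

def pair(x):
    # A: "B is false", B: "A is true": a = -b, b = a
    a, b = x
    return [-b, a]

# With damping=1.0 the update is applied outright and the liar just flips forever.
flipping = damping_free = damped_iteration(liar, [1.0], damping=1.0, steps=4)

# With partial updates both networks settle at 0, the 'tralse' value.
liar_value = damped_iteration(liar, [1.0])
pair_values = damped_iteration(pair, [1.0, -1.0])

print(flipping, liar_value, pair_values)
```

Applied outright, the liar oscillates between 1 and -1 and never settles; with partial updates it is drawn to the one consistent value, 0.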
Thanks for your clear explanation. Understanding the topology of the space seems fascinating. If it's a vector space, I would assume its topology is simple, but I can see why you would be interested in the subspaces of it where meaningful information might actually be stored. I imagine that since topology is the most abstract form of geometry, the topological structure would represent some of the most abstract and general ideas the neural network thinks about.
Hi. I am interested in much of the mathematics which underlies theories of physics, such as complex analysis, as well as most of mathematics, although I sadly do not have the capacity to learn about the majority of it. Your interests seem interesting to me, but I do not understand enough about AI to know exactly what you mean. What is the residual stream of a transformer?
Thanks for your detailed reply. One thing I doubt is that it's possible to accurately predict or determine how someone or something is likely to behave without attempting to represent their internal state, but I expect your model would do that to some degree. I think the valence of their behavior is an important component, as you point out, although you could certainly make use of a model which simply allows you to determine how easily you can predict their behavior. For example, if someone invariably 'puts a - sign in front of the consensus', then you would not be able to trust them to accurately represent it directly, but you could ask them for their opinion and then invert it to obtain the consensus. When it comes to actually implementing such a model, I wouldn't know how to begin.
Maybe the appropriate mathematical object to represent trust might be related to those used to represent uncertainty in complex systems, such as wave functions associated with probabilities. After all, you can trust someone to precisely the extent to which you can constrain your own uncertainty about whether they will do things you wouldn't want. These things, while a kind of scalar, certainly contain lots of information in the form of their distribution throughout space, as well as being complex numbers.
I don't think you need to worry about individual humans aligning ASI only with themselves because this is probably much more difficult than ensuring it has any moral value system which resembles a human one. It is much more difficult to justify only caring about Sam Altman's interests than it is for humans or life forms in general, which will make it unlikely that specifying this kind of allegiance in a way which is stable under self modification is possible, in my opinion.
Hello, I am an entity interested in mathematics! I'm interested in many of the topics common to LessWrong, like AI and decision theory. I would be interested in discussing these things in the anomalously civil environment which is LessWrong, and I am curious to find out how they might interface with the more continuous areas of mathematics I find familiar. I am also interested in how to correctly understand reality and rationality.
Very interesting, I am in a similar position with respect to learning the relevant mathematics as you know from my first comment. One thing that your sequence resembles to me is the divergent infinite sum $1-1+1-1+1-1+1-1+\dots$ This sequence does not get closer and closer to any particular value, so from the most standard perspective, its sum is undefined. However, the partial sums alternate from 1, to 0, back to 1 again and continue to do so ad infinitum, which means that their average is $\frac{1}{2}$. A different way of looking at this sum is through the formula $1+x+x^2+x^3+x^4+\dots=\frac{1}{1-x}$. If $x=-1$, then the series $1-1+1-1+\dots$ becomes $\frac{1}{1-x}=\frac{1}{2}$. This suggests that, to the extent that this sum has a value, it is $\frac{1}{2}$. Although this series is not the same as your sequence, if we take 1 to represent true, 0 to represent false and subtraction from 1 to represent logical negation, then the equivalent sequence is the sequence of partial sums: $1,0,1,0,1,0,\dots$ and $\frac{1}{2}$ now represents the concept of 'tralse'. In the case of your sequence, it would be the set of partial products of an infinite product $(-1)(-1)(-1)(-1)(-1)(-1)(-1)\dots$ How to obtain a generalization of the value of an infinite product when it does not converge, I am not sure. One possibility is to operate on a logarithmic scale, on which multiplication is equivalent to addition. Multiplication by -1 is equivalent to raising $e$ to the power of $i\pi$, which suggests that the infinite product is given by $e^{i\pi(1+1+1+1+1+1+\dots)}=e^{i\pi+i\pi+i\pi+i\pi+i\pi+i\pi+i\pi+\dots}=e^{i\pi\cdot\frac{1}{1-1}}$. In complex analysis, there is a well defined infinity which is the reciprocal of 0, therefore $e^{i\pi+i\pi+i\pi+i\pi+i\pi+i\pi+i\pi+\dots}=e^{i\pi\cdot\frac{1}{1-1}}=e^{\infty}=\infty$. Though $\infty$ is not 0, it is the single other number on the Riemann sphere which is its own negative, and is therefore also a solution to the equation $x=-x$!
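The two ways of assigning one half to the alternating sum can be checked numerically. This is just a small sketch with names of my own choosing: it averages the partial sums (usually called the Cesàro mean) and evaluates the power series at a value of x slightly below 1 in magnitude (usually called Abel summation), truncating both at a finite number of terms.

```python
def partial_sums(terms):
    """Running totals of a finite list of terms."""
    total = 0
    sums = []
    for t in terms:
        total += t
        sums.append(total)
    return sums

def cesaro_mean(terms):
    """Average of the partial sums of a finite list of terms."""
    sums = partial_sums(terms)
    return sum(sums) / len(sums)

def abel_sum(coefficient, x, n_terms=100_000):
    """Truncated power series: sum of coefficient(k) * x**k for k < n_terms."""
    return sum(coefficient(k) * x ** k for k in range(n_terms))

# The alternating series 1 - 1 + 1 - 1 + ...
alternating = [(-1) ** k for k in range(10_000)]

first_sums = partial_sums(alternating)[:6]
average = cesaro_mean(alternating)
near_limit = abel_sum(lambda k: (-1) ** k, 0.999)

print(first_sums)   # the partial sums alternate between 1 and 0
print(average)      # average of the partial sums
print(near_limit)   # close to 1/(1 + 0.999)
```

Pushing x closer to 1 in the last call moves the result closer to one half, matching what the formula gives at x = -1.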
I have found another way to shoehorn into your interpretation the notion of 'tralse'. I'm not sure if this is meaningful or not, but I didn't know that calculation would produce a result compatible with the idea of 'tralsity' until I carried it out, and it did produce such a result in an unexpected way.
I will mirror your disclaimer about the idea being newly encountered and not clearly explained.