I am interested to know some set theory. Here’s what I know from reading Naïve Set Theory by Paul Halmos, a recommended book by MIRI. It is well under a hundred pages. I think it's going to be useful for me, perhaps not immediately, but soon. So my goal is not just comprehending the material but also retaining it. To this end, I will attempt to write these posts mostly from memory and whenever I draw a blank I will refer to the book but resume writing much later to make sure I can recall.

No definition of a set

Instead of formally defining what a set is, let's just stick to the naïve and informal impression that we have of a set, which is that a set is a collection of elements. Whenever we attempt to write down a set, we use the curly braces  and mention its elements inside it. Suppose a set has elements  and  then we may express the set as 

Consider the question: what does it mean for two sets to be equal? And for that, we will first see a couple of basic relations between sets. 

One is belonging and the other is inclusion. They sound similar but there's a subtle difference. We say element  belongs to , when  is an element of . It is denoted as . We say  includes  or  is a subset of  when each element of  is also an element of . It is denoted as  or 

Now let us define the equality of two sets. Two sets  and  are equal if and only if  is a subset of  and  is a subset of . This is the axiom of extension. It describes what it means for two sets to be equal and also suggests how one may prove two sets to be equal.

Constructing sets by constructing sentences

One way to build a set out of a given set is to make assertions about the elements of the given set and the new set contains exactly the elements for which the assertion is true. So, given a set  and an assertion  we construct a set  which contains exactly those elements of A for which S(x) is true.

One can construct  by using first-order logical symbols along with the belongs to and inclusion relations.  must be a free variable in . This way of constructing a subset out of a given set is the axiom of specification.

There's no universe

Consider the assertion . Let's apply this assertion to an arbitrary set  and call the resulting subset . Does  ? 

We will show that it's not the case that . The proof is by contradiction:

Assume . There are two possibilities. 

  1. . If this is true then S(B) is false therefore  by the axiom of specification. Contradiction.
  2. . If this is true then S(B) is true therefore  by the axiom of specification. Contradiction.

Hence, our assumption must be false. Therefore 

What we have shown is that for any arbitrary set , we can construct a set  such that it does not belong to . Perhaps the axiom of specification is too strong because this result indicates that there's no universal set containing everything. For now we will not consider such  as described by  as sets.

Construction via pairing

Let us assume some set  exists. One consequence of the axiom of specification is the existence of an empty set, denoted by . The corresponding specification  would be .

There's another way to construct sets by pairing two sets. For any two sets  and , there exists another set  such that  and  belong to . This is the axiom of pairing and as a direct consequence of applying axiom of specification we can show the existence of a set with only  and  as its elements. Such a set is called the unordered pair of  and , where unordered simply implies we don't care about the order of the elements  and .

We can apply axiom of pairing to a single set  and what we obtain is  which is simply . Such a set is called the singleton of . Assuming some set exists, we have shown the existence of the empty set  and using the axiom of pairing we can construct new sets resulting in sets such as  ad infinitum.

1

23 comments, sorted by Click to highlight new comments since: Today at 5:18 PM
New Comment

Just to make sure, by "books recommended by MIRI" you mean this list or is there a better resource?

My impression is that the concept of "set" seems quite straightforward when we talk about sets of other objects. Saying "a set of all natural numbers" seems like a wannabe-academic way of saying "natural numbers". The problems seem to start when we try to apply the concept of "set" to itself; when we talk about "sets of sets", "sets of sets of sets", and worse.

(Although, I am not completely sure about this part: the Banach-Tarski paradox is about sets of points.)

The problems are related to the concept of "infinity". Korzybski (the "map is not the territory" guy) would probably dismiss this whole line of thinking as confused -- actual infinities do not exist; there are only processes that can go on indefinitely, but in finite time they obviously can't arrive to infinity literally. For every natural number n, we can imagine a natural number n+1, but we can never complete the entire set of all natural numbers. So if you pretend to have a complete set of all natural numbers... plus some even more unreal sets... and then you arrive at paradox, that's entirely your fault. You should have stopped at the first moment you started talking nonsense.

(Though with this approach, we would probably have to give up the concept of real numbers, too; at least the non-computable ones. Because the concept of the real number involves the infinite amount of digits, infinite precision, and sometimes even infinite complexity.)

Even with infinite sets, the problems seems to be not their infinite size, but rather their infinite complexity. A set containing natural numbers, like "a set of all prime numbers"? No problem. A set containing sets of natural numbers? Oh, you mean like "all pairs of natural numbers"? Yeah, I guess that's still okay. A set containing sets and natural numbers? Uhm, what does that even mean? If comparing apples to oranges is wrong, surely comparing apples to sets of apples is not-even-wrong. And when we get the infinitely nested hierarchies of sets, it's time to admit this has no relation to the actual apples.

From that perspective, it's kind of a relief when the pretense of apples is removed completely, and we start talking about sets containing sets containing... ultimately empty sets and nothing else. Just a long maze of parentheses that would make even a devoted Lisp programmer admit that this has gone too far.

I think that at that moment it would be even better to completely drop the pretense that the words "set" and "contains" in set theory have any correspondence whatsoever to our usual intuition of a set as a collection of something, and containing as... the notion that there was something, and some other thing, and someone labeled all these things as beloning to the same whole. (Specifically, I suspect, although I do not really understand the set theory that far yet, that the whole approach of "forcing" is just some kind of trolling; creating arbitrary structures that technically satisfy the axioms of set theory, without having anything in common with sets qua collections of things.)

.

When I think about mathematical objects, I naturally picture them as constructed in some (partial) order. Like the number 5 comes before number 10 not just the sense of "the 'less than' operator", but fundamentally; that it is possible to imagine a universe where "5 apples" exist but "10 apples" do not, but impossible to do the other way round. From that perspective, minus 5 comes before minus 10; integers come before rationals (at least of comparable size; I am not really sure that 10^^^10 comes before 1/2); and rationals come before reals.

From that perspective, integers come before sets of integers, because first you need to create the integers, and only then you can start collecting them to sets. The entire process can be done in two clearly separated steps. Step 1: create integers. When you are done, step 2: create sets of integers.

Except with sets of sets this process doesn't work, because sets of sets are of the same type as sets. Hence the paradox with "set of all sets (not containing themselves" and similar. Step 1: create sets. When you are done, step 2: create the set of all sets... oops, it turns out you were actually not done with the step 1. With integers, you can imagine that an infinite amount of time has passed, and you have created them all. With sets, there is no amount of time, finite or infinite, is enough. With integers, the idea of "last integer" does not make sense, but the idea of "all integers" does. With sets, the "last set" and the "set of all sets" are both the same. So even the generous assumption that we have unlimited infinities of time does not allow us to create all sets.

Mathematics does not really require the concept of gradual construction (of integers, or sets). We can just assume we have all integers, magically created at the same moment. And it will be the same as when we create the integers step by step, using an infinitely fast machine. We cannot do the same thing with sets though, lest we get a paradox. Unless we give up on some notions, such as having a "set of all sets". (Which notions exactly do we have to give up? I guess those that would prevent the infinitely fast machine from even completing the construction of "all sets". So at the end, even if we pretend it is not a gradual construction, it kinda behaves as if it was.)

The gradual-construction idea is actually quite common when thinking about set theory. Write V(0) = {}, then V(1) = all subsets of V(0), then V(2) = all subsets of V(1), and so on. If "and so on" just means "on through all the non-negative integers" then this doesn't get you "enough" sets, but what we can do after constructing all the V(n) is to take the union of all those and call it , after which we can start playing the take-all-the-subsets game again to get  which again runs out of steam after all the  at which point we take the union again to get . (For technical reasons this should actually be  but never mind that.) At this point we can see how to get  but then what? Take the union again and call it . You can continue this process indefinitely, in a very strong sense of "indefinitely", and with the usual axioms of set theory it turns out that every set appears in one of the V(something). (The somethings are the ordinals, a generalization of the non-negative integers.)

When you do this, any given set appears strictly after all its elements have appeared. Exactly when e.g. any given number appears depends on exactly how you choose to "implement" them in terms of sets. The usual process goes something like this.

  • We start with the non-negative integers. 0 is the empty set, 1 is {0}, 2 is {0,1}, 3 is {0,1,2}, and so on. Or: n+1 is the union of n and {n}. So 0 appears first in V(1) -- it equals V(0) but it's not in it. 1 appears first in V(2). And in general n appears first in V(n+1).
  • Then we construct the integers by first defining the notion of ordered pair by (a,b) := {a,{a,b}} and then saying that the integer n is the set of all ordered pairs of integers (x,y) such that x-y=n. Note that this means that e.g. the integer 3 is not at all the same object as the nonnegative integer 3, a pattern that will be repeated as we construct other kinds of number. The ordered pair of the nonnegative integers m,n appears in V(m+n+2) or something like that, which means that all the integers appear together in .
  • Then we construct the rational numbers in essentially the exact same way: the rational number r is the set of all pairs of integers (x,y) such that x=ry, or maybe the set of all such pairs with y not zero. All the integers are in , so all the ordered pairs of integers are in , so sets of ordered pairs of integers are in . So that's where all the rationals live. Note that again the rational number 3 is not at all the same thing as the integer 3, which in turn is not at all the same thing as the nonnegative integer 3.
  • Then we construct the real numbers, maybe via "Dedekind cuts"; the idea is that a real number x "is" the set of all rational numbers less than x. So real numbers are sets of rational numbers, which means they all live in . Sets of real numbers live in .

Early attempts at formalizing set theory made the sort of distinction you describe more fundamental. E.g., Principia Mathematica (Russell & Whitehead, not Newton) had a sort of "type theory" where objects of type 0 aren't sets at all, objects of type 1 are allowed to have objects of type 2 as elements, objects of type 2 are allowed to have objects of type 1 as elements, etc. (I think their actual type theory was more complicated than this suggests, but I don't know the details.) This turned out to be very painful, and the idea was largely abandoned.

It got kinda-sorta picked up again later by Quine, who proposed a set of axioms for set theory somewhat different from the usual ones. A key question any version of set theory has to answer is: how do you avoid things like Russell's paradox? Principia Mathematica does it by having sets of different types, so that it's ungrammatical to talk about a set being, or not being, an element of itself. Zermelo-Fraenkel set theory (the sort mathematicians mostly use these days) does it by restricting the comprehension axiom -- the one that goes from what the article here calls S(x) to "the set of all x such that S(x)" -- to work only when the x you're considering are restricted to the elements of a given set. Quine's "New Foundations" does it by restricting the comprehension axiom in a different way: S(x) has to be something that you could have written if you were working with types like PM's. That is, you have to be able to attach numerical labels to all the variables in S in such a way that whenever "" occurs the label on b is 1 higher than the label on a. To be clear, you never actually have to write down those labels, variables don't actually have types, etc.; but type theory gives a motivation for the way in which the comprehension axiom is restricted. So far as anyone knows, NF is consistent (this is the same statement as one could make about ZF, but mathematicians have put a lot more work into ZF so maybe we have less grounds for confidence about NF). One expert on NF thinks he's proved it's consistent using the axioms of ZF (which implies that in some sense NF is a "weaker" theory than ZF, for Goedel-ish reasons), but I don't think anyone else in the field is convinced yet. But it seems like NF is probably OK, and you can definitely do actual mathematics in it; there is also a related theory called NFU that is definitely weaker than ZF (and in particular known to be consistent if ZF is) in which it's possible to do pretty much everything any real mathematician needs to do.

The point of all that rambling is that the approach you have in mind, thinking of things as being constructed from simpler things, and of having "sets" and "sets of sets" etc., as different kinds of things, is very reasonable; some bits of it match up exactly with what happens in the usual practice of set theory, and some bits don't but match up quite well with other viable ways of doing set theory.

Thanks for the explanation, especially the part about different set theories. My knowledge of set theory is very unsystematic -- I have picked up and read a few books, but never attended actual lessons. So there are moment where I read some words and wonder "did they actually mean X or Y?" but I have no one to ask. I wonder whether reading a few more books would fix the problem. Looking at random comments e.g. at StackExchange sometimes fills a gap, but this happens unpredictably.

So far I am only familiar with ZF. I have a strong suspicion that there are models of ZF which cannot be created by the infinite iterative construction you described. Like, models that do not even correspond to anything meaningful; monstrosities that technically follow the letter of the ZF axioms but have nothing in common with "sets" and "elements" as any sane person might imagine them. (Because of the first-order logic, which is unable to keep the monstrosities out... for reasons that I still don't grok, but hopefully one day I will.)

I wish I could give this topic more time and attention, unfortunately, real life gets in the way.

In ZF, every set is somewhere in the hierarchy I described; this is a consequence of the Axiom of Foundation. (Which doesn't explicitly say "every set is in that hierarchy", but it turns out to be an equivalent proposition. The axiom is sometimes called Regularity rather than Foundation.)

There are other versions of set theory in which not every set lies in such a hierarchy. ZF is an improvement on an earlier attempt at axiomatizing set theory, due to Zermelo and usually called Z. (Same Z and same Zermelo as in ZF = Zermelo-Fraenkel.) One of the key differences is that Z didn't have an axiom of foundation/regularity, so in Z that hierarchy need not contain all the sets. (Another key difference is that Z lacked what's now called the Axiom of Replacement, which I think has the consequence that you can't actually construct the hierarchy in Z.)

There are even set theories explicitly designed to make sure they don't obey a foundation/regularity axiom. For instance, Aczel has suggested working in ZFC (ZF + the axiom of choice) with the foundation axiom replaced with a so-called "anti-foundation axiom" that effectively says that every connected directed graph is a "picture" of a set (an edge a->b meaning that b is an element of a), so e.g. a graph with just one vertex a and an edge a->a corresponds to a set a with a={a}.

If ZF is consistent, it has lots of models with a variety of properties (e.g., in some of them the continuum hypothesis holds, and in some it doesn't). I don't know whether you'd consider any of them monstrosities. Personally, I'd say that if a system obeys the axioms of ZF then its elements have something in common with sets and elements as usually imagined; e.g., all the theorems you can prove about sets and elements in ZFC will apply to them. Of course the membership relation in that system might be something entirely other than actual set membership, but I don't see that that's a problem. And there are curiosities such as the existence of a countable model of ZF (countable "from the outside" only, of course) which you might perhaps consider a bit of a monstrosity.

In this context, by "monstrosity" I meant some technically-a-set that couldn't be constructed. Like the a = {a} example, only I suspect that you can create an example of that type that would technically satisfy the Axiom of Foundation. I am probably wrong here, but I need to think about this more until I see how exactly I am wrong. (That will probably take weeks.)

But that "every connected directed graph is a picture of a set" is the kind of perspective I have in mind. Being given the entire structure of sets, interconnected, at once. Now the question is whether the structure can be designed in a way that technically satisfies the Axiom of Foundation, while somehow is not construable, e.g. because it is infinite in both directions. So no set contains itself, directly nor indirectly, it's just a line of sets infinite in both directions, like the integers, where each set contains an infinite amount of "previous" sets; and probably also something else, to make the Axiom of Foundation happy. -- Chances are, I am just talking complete nonsense here.

One formulation of the axiom of foundation says that there's no "infinite descending chain" of sets. That is, if you have a set a with an element b with an element c with ..., then this cannot go on for ever. So at least some versions of your "infinite in both directions" are precisely ruled out by Foundation.

The version from Wikipedia seems okay with an infinite descending chain of sets, as long as each of them would also contain e.g. an empty set. In combination with other axioms, though... well, that's where my knowledge of set theory becomes insufficient.

Okay, here is one of the questions I couldn't answer by self-study; let me use this opportunity to ask you: The Wikipedia page on Axiom schema of replacement says:

Suppose P is a definable binary relation (which may be a proper class) such that...

What exactly does "definable" mean in this context?

I am asking precisely because I want to play a "this is not a set in my model of set theory" card with regards to the set of all sets in the infinite descending chain, and I am not sure whether there is or isn't a loophole I could use.

I think you may be misunderstanding the definition in the Wikipedia article, which is understandable because the formulation there (which is the usual one) is rather hard to get one's head around.

So, this version of the axiom of foundation says: every nonempty set has an element disjoint from it. What does that have to do with infinite descending chains of sets? Well, suppose you have a containing b containing c, etc.; consider the set A = {a,b,c,...}. I claim that every element of this set fails to be disjoint from A. Consider, e.g., f. This contains g, and so does A, so f and A are not disjoint.

The usual way of formulating the replacement schema goes like this. Suppose you've got a formula P in the language of set theory, which has free variables x, y, A, and some finite number of w1,...,wn. (The idea is that this is going to specify a function on A, parameterized by w1,...,wn, with P meaning that the function maps x to y.) Replacement says that for each such formula the following holds: "For all A, w1, ..., wn, if for all x in A there is a unique y such that P, then there is a set Y such that for all x in A there is a y in Y such that P". In other words: if P describes a function on the set A, there's an actual set Y containing the values of that function.

consider the set A = {a,b,c,...}

What if I am an asshole mathematician, and I insist that there is no set {a,b,c...}?

I mean, I know that it exists, you know that it exists, but I propose a model of set theory where individual sets a, b, c... exist, but there is no set {a, b, c...}. Can you prove there is one?

Wikipedia -- if I understand it correctly -- assumes that there is a function 0->a, 1->b, 2->c... and uses axiom of replacement. Following my passive-aggressive strategy, I insist that there is no such function in my model either. (We see the function from outside, but there is no in-universe set {{0, a}, {1, b}, {2, c}... }.) Can you construct one, if you are only given the sets in the chain with no "metadata"?

Specifically: V(0) = an infinite descending chain {a, b, c...} such that a ∋ b ∋ c...; V(1) = everything you can directly build from V(0) using ZF axioms (all subsets, pairs, unions, replacements...); and so on, as usual.

This is probably my Dunning–Kruger moment, but I'd appreciate if you'd play along. Does this attempt to create a set universe contradict the ZF axioms somewhere (where exactly?), or is this a model containing an infinitely descending chain where the axiom of foundation is still technically true.

(Meta: I probably have no further questions. Except perhaps if I wouldn't understand some part of your answer. This is the most complicated thing I was able to think about set theory so far.)

EDIT:

Tried to do my homework, here is the part where I got stuck:

Imagine that you have a countably infinite amount of ur-elements. Is it possible to have a universe that would be a model of ZF with these ur-elements, but wouldn't contain a "set of all ur-elements" as a set?

(For example, because it would only contain sets that contain, however indirectly, only a finite number of the ur-elements. That is, the set of ur-elements is countable from our perspective, but not necessarily in-universe. As Wikipedia says: "there is no absolute notion of countability".)

Because if that is impossible, then if you take the infinite descending chain a, b, c..., the universe would also have to contain the set {a, b, c...}, which contradicts the Axiom of Foundation. So, what I tried, is really impossible.

On the other hand, if it is possible, then let's take the model of ZF that contains the ur-elements a, b, c..., but does not contain the set {a, b, c...}, and now replace the ur-elements with sets from the infinitely descending chain. I believe that what you would get after replacement, would satisfy all the ZF axioms. (I could try to prove it, but if the answer to the question above is "no", it would probably be a waste of time.)

My other comment was written before I saw your edit. Here are some remarks on the edited bit.

ZF can't have urelements, if that means objects that aren't sets but can be elements of sets; everything in ZF is a set. Nor if it means things other than "the empty set" that have no elements; the axiom of extensionality means that anything with no elements is the empty set.

It's possible that you mean something else by "urelements", but rather than trying to guess what I'll let you tell me.

Yes, by "urelements" I meant "elements that are not sets". However, this was just a different way to express the question that if we treat a, b, c... as black boxes, i.e. ignoring the question of what is inside them; whether from the facts that a is a set, b is a set, c is a set... inevitably follows that {a, b, c...} is also a set in given model of set theory.

For example, using the axiom of Pair, we can prove that {a, b} or {g, x} are also sets. Using also the axiom of Union, we can prove that any finite collections, such as {a, b, g, h, x} are sets. But using Pair and Union alone, we cannot prove that about any infinite collection.

The axiom of Infinity only proves the existence of one specific infinite set: ω.

My guess is that using all axioms together, we can only prove existence of sets that involve a finite number of a, b, c...

This needs to be put a bit more precisely, though, because by the very fact of them being an infinitely descending chain, including one of them means involving an infinite amount of them; for example {{}, c} is simultaneously {{}, {d}}, which is also {{}, {{e}}}, etc. But suppose that we stop expanding the expression whenever we hit one of these chain-sets. Thus, the set {{}, c} would involve c explicitly, but d, e... only implicitly.

Now we can rephrase my guess, that using all axioms together, we can only prove existence of sets that involve explicitly a finite number of a, b, c... That means, {a, b, c...} is not among them.

Axioms of Existence, Infinity, Pair, and Powerset do not increase the number of a, b, c... used explicitly. Axioms of Union, Comprehension can only "unpack" these sets one level deeper. -- Union applied to c would only prove the existence of d. Comprehension applied to c would result in either empty set or d. Each of them applied to a structure that explicitly contains a finite number of a, b, c... would result in a structure that also explicitly contains a finite, albeit maybe twice larger, number of a, b, c...

Axiom of Foundation should be okay with such structures, because even if you have a set containing finitely many of a, b, c..., just choose the one furthest down the chain. For example, for set {{{}}, a, c, e}, choose e = {f}, and the intersection is empty.

How to construct the universe: Take a ZF universe, and for each set in that universe also create all possible sets that replace some of the empty sets in the structure by sets a, b, c... under condition that during each replacement you only use a finite number of a, b, c... (but with each of the selected finite few, you can replace an arbitrary, finite or infinite, number of the empty sets in the original structure). My hypothesis is that this, together with the rules a = {b}, b = {c}... is also a ZF universe, and it is one containing an infinite descending chain of sets (a, b, c...).

You can only ever prove something about specific objects about which nothing else is known if there are only finitely many of them. Because, in the situation you describe, the only way we can say anything about them is by naming them explicitly, and any proof has only finitely many symbols in it.

But e.g. the following could be true in some system: if you have a set that ("really") begins an infinite descending chain, you can prove the existence of a set containing all the elements in some infinite descending chain, even though you couldn't do it if all you had were names for those elements and no information about the relationships between them.

I don't think I understand your prescription for a Strange Universe. What do you mean by "all possible sets that replace some of the empty sets in the structure"? In an actual ZF universe there is only one empty set.

What do you mean by "all possible sets that replace some of the empty sets in the structure"? In an actual ZF universe there is only one empty set.

I meant empty sets in the, uhm, description/graph of the set. For example, in the description of the set {{}, {{}}}, there are two instances of empty set: here {{}, {{}}}, and here {{}, {{}}}.

Visually, if you would draw the set as a tree -- the topmost node represents the set itself, each node has below it nodes representing the elements (if two nodes have the same set for a subset, it would be drawn on the diagram below each node separately; we are making a tree structure that only splits downwards, never joins) -- then every set in standard model of ZF is a tree, with some nodes having infinitely many nodes directly below them, but each individual path downwards is finite. And each path downwards ends with the empty set = a node that does not have any nodes below it.

By "replacing some empty sets" I meant replacing some of those nodes at the bottom with the sets from the infinite chain a, b, c... For example, from the set {{}, {{}}}, you would achieve {a, {{}}} by replacing the "first" empty set; {{}, {a}} by replacing the "second" empty set; and {a, {a}} by replacing both empty sets. You would do this for all sets from the chain, even combinations like {a, {b}}.

The only restriction is that when the graph of the set contains an infinite number of empty-set-nodes, for example in {{}, {{}}, {{{}}}, {{{{}}}}... }, you can replace either finite or infinite number of them, but you may only use a finite number of different sets from the descending chain. So for example, you could infinitely many "a"s and infinitely many "b"s; or perhaps two "a"s and infinitely many "b"s; or just two "a"s and three "b"s; but you cannot use infinitely many different letters. So it is allowed to create {{}, {a}, {{{}}}, {{{b}}}... }, or {a, {a}, {{a}}, {{{a}}}... } (with infinitely many "a"s), or {a, {b}, {{a}}, {{{b}}}... } (with infinitely many "a"s and "b"s), but not {a, {b}, {{c}}, {{{d}}}... } (with infinitely many different letters used).

I think that the class of sets created this way satisfies the ZF axioms.

EDIT: I will try to send you an e-mail during this weekend, because this definitely makes more sense with pictures, at least in my head. Thank you for your patience so far!

Ah, OK, I think I now understand your intended construction. I'm trying to figure out whether it satisfies the ZF axioms, but right now it's past my bedtime. One thing that definitely isn't true is the following stronger version of "if Foundation holds in the original model then it also holds in your universe": Write x->y to mean that set y in your universe is obtained from set x in some other model of ZF by "replacing some empty sets with things from a,b,..."). Then (I repeat: this is a thing that isn't true): If x has an element w disjoint from x, and x->y, and when x->y w turns into z, then z is disjoint from y. So if Foundation is true in your universe it's not for the very most obvious reason. (Counterexample to that stronger claim: let x = {{{}}, {{},t}} for some choice of t, let w = {{},t}; let y = {{b}, {a,t}} so that w -> {a,t}; then although w is disjoint from x it isn't true that z is disjoint from y, because both contain a={b}.

I'll return to this tomorrow, if I find the time, and think some more about whether your universe is guaranteed to be a model of ZF...

Well, what is an infinite descending chain? It's precisely a sequence  such that . Which means, indeed, a function with the required property. And then, yes, you can use Replacement to guarantee that there's a set containing the .

Now, if I understand you correctly, what you're asking is whether the following scenario is possible. We have a model M of the axioms of ZF. (So, out in the "real world" we have a set S providing the "sets" in M, and a relation R corresponding to set membership in M, so that R(x,y) means that x is an element of y in M.) In the model there's a set A, and "really" (i.e., outside, not inside, the model) we have a sequence (i.e., a function from the natural numbers, but we write it with subscript notation)  of elements of S such that . So with our god's-eye perspective we can see that A "has" an infinite descending chain, but inside M there is no such thing; there is no element of S that (within the model) is a function from the model's natural numbers such that, etc. And there is no element of S that contains-in-M precisely the elements of our descending chain.

The answer (which was not obvious to me, but it's years since I actually studied this stuff) is that this is possible, and for very much the sort of reason you have in mind.

Suppose we have a model of ZF. It satisfies the axiom of foundation. And suppose, for reasons I'll explain in a moment, that in the "outside world" the axiom of choice holds. Now we construct what's called an ultrapower of our model. The details are a bit intricate; here goes. Say that a "filter on the natural numbers" is a set of sets of natural numbers such that (1) if X is in the filter then so is anything that contains X and (2) if X and Y are in the filter then so is their intersection. The idea is that a filter is a "notion of largeness"; for any filter we can call sets in the filter "big" and sets not in the filter "small" and this will be a somewhat-reasonable use of language. Now say that an ultrafilter on the natural numbers is a filter that can't have any more sets added to it while remaining a filter, other than by making literally every set of natural numbers "big". Example: {sets containing 17} is an ultrafilter. (Proof: exercise for the reader.) Ultrafilters of this "easy" kind -- {sets containing n} for some fixed n -- are called principal ultrafilters. Are there any other ultrafilters? It's not obvious. There is no way to construct one, but if the axiom of choice holds then indeed there are. OK, suppose we've got an ultrafilter. Now, consider all infinite sequences of elements of S and say that two such sequences  are the same if  is "big" (i.e., is an element of the ultrafilter). This set, modulo this equivalence relation, is what we call an ultrapower of S. We can extend this to the whole model M by saying what it means for one element of the ultrapower to be an element of another:  is an element of  iff  for a "big" set of . Now an amazing thing is true: any sentence in the language of set theory that's true in M is also "true" of the ultrapower. In particular, the axiom of foundation is "true". BUT it turns out that the natural numbers in M are strange in exactly the sort of way we need. Consider the sequence (0,1,2,3,...). This is bigger than (0,0,0,...) and bigger than (1,1,1,...) and bigger than (999,999,999,...), etc. (Because the corresponding per-sequence-element is true at all but a finite set of positions, and any finite set is "small".) It behaves like an "infinitely large natural number". We can subtract 1 from it, getting (0,0,1,2,3,...) which is smaller. We can subtract 1 from that, getting (0,0,0,1,2,3,...) which is smaller still. (There's no particular reason other than convenience for using 0 for the elements we can't actually subtract 1 from. Remember, any "small" change to the sequence is ignored, and in particular for any finite set of 'em it doesn't matter what entries we use.) And so on. So we have an infinite descending sequence of natural numbers -- infinite "from the outside", that is. On the inside it's no such thing; it can't be, because "there is no infinite descending sequence of natural numbers" is a fact about M, readily expressed in the language of set theory, and therefore true in the ultrapower too. (So, in particular, that nice simple sequence of elements is not a set in the ultrapower.) And if we "implement" natural numbers in the usual way, "smaller than" for natural numbers is exactly the same relation as "element of", so in fact this very thing is an externally-infinite descending sequence of sets.

Yes, you understood my question correctly... and I need to spend some time thinking about your answer. (Mostly because it's past midnight here, so I am leaving my computer for now.)

Thank you! It was a pleasure to be understood -- unlike when I e.g. post a question on Stack Exchange. :D

Note that the GoodReads list doesn't look like it's been updated in many years. https://intelligence.org/research-guide/ adds Wasserman's All of Statistics and Shalev-Shwartz's Understanding Machine Learning, it replaces Rosen's Discrete Mathematics and Its Applications with Lehman's Mathematics for Computer Science, and it swaps out Bostrom's Global Catastrophic Risks for Yudkowsky's Inadequate Equilibria.

Within surreal number lingo the "construction ordering" is often referred to as "birthday" and -5 being younger than -10 becomes a thing one can prove. 1/2 has birthday just after 1 but 1/3 has birthday ω.

From the perspective that integers are sets of integers sets of integers do not come "after". That is if you first create the integers and then you try to form {0,1,2} that is not a new construction as 3 has already been formed. In that perspective "last integer" and "last set" are not that different. There is the distinction of a limit ordinal. If "natural numbers" is a set then making the singleton {natural numbers} can also be thought as a successor ordinal ie ω+1 and how the set "natural numbers" can be thought as the ordinal ω. So there is a process where one can think of giving infinite time to nest sets. One can even give another round to get up to ω*2.

In the other direction one can assume as little magically given base material as possible. Having nothing but how to make new numbers one can get 0 as the limit ordinal of ex nihilo.

Yeah, the surreal numbers are somewhat similar to my intuition, but not the same. Kolmogorov complexity is probably closer. But I don't have anything precise in mind. Just a feeling that "first numbers, then sets of numbers" makes sense on some level, but there is no way to make the same sense about (all) sets of sets.

I just started reading the reviewed book. Thanks for inspiration!

Just because you personally don't understand doesn't mean that somebody else could not.

But you might also be getting to the distinction that "all sets" is a proper class whereas numbers can be put into a set and are thus a small class.

What does "For now we will not consider such x as described by S(x):x∉x as sets." mean?

From what I understand, such predicates seem to be causing trouble. For example, the result that no set contains everything seems like too strong a result at this point.

From the book: "To specify a set, it is not enough to pronounce some magic words; it is also necessary to have at hand a set to whose elements the magic words apply". Magic words basically mean the predicates S(x).

The book says such x's don't constitute a set and calls them illegal. It also mentions that class is the word to describe such x's and that classes are irrelevant in its approach to set theory. 

Perhaps when I read further I'd be able to reason better.

There is a whole interplay about intensions and extensions which might clarify but also migth confuse. If the rule is unambigious in which of the members survive to the filtered set things are clear. The possible "selections" one might call the extensions. With {a,b,c} the options are {},{a},{a,b},{a,b,c},{b},{b,c},{c}. The way they are specified one migth call the intensions. If both rules "x is small" and "x is red" give out the same {a,b} selection one might try to argue that it is two intensions picking out the same extension.

If your base set is {} then providing any wild intension can only produce a {}. If you have a meaningful intension but do not provide a base set to select from you might have requirements but you don't have any members. "x is small and red" migth alwsays produce a narrower selection than "x is small", but if you don't know if you are selecting from {a,b,c}, {1,2,3} or {aleph,teth,zayin} you can't be sure whether a given entity is a member or not. So there is no subset because there fails to be any members. So a selection rule itself can't constitute a set.