User:Marc Goossens/elements of set theory

Constructive sets (as strings of symbols) and structuralist physics
(comment to user Tastyummy 070309)

Hi, maybe the following is of interest to you.

In the above talk, a good amount of semantics is attributed “a priori” to the notion of sets, and their possible application to model (or speak about) the world surrounding us. There is an approach which avoids this, and allows any physical semantics for sets to be recovered (or indeed “added”) later on.

As shown by Bourbaki, set theory may be constructed by adopting rules for constructing, classifying and manipulating strings of symbols from a basic alphabet. This approach has been refined by Edwards and again by Schröter. See e.g. Edwards, R.E., “A Formal Background to Mathematics”, Springer 1979.

The result is like a game played with symbols, which mimics (of course: by design!) the “mathematically useful behavior” one expects from classical set theory, which in turn abstracts intuitive notions or expectations. For example, like any other Bourbaki-Edwards (B-E) set, the empty set is simply defined as one particular string of symbols. And lo and behold, when this string is “inserted” into the agreed (defined) string-manipulation schemes for set operations like union, intersection, … it nicely conforms to what we want. Yet the entities (sets) built in this way are no more than (abbreviations for) strings complying with certain agreed rules, and are at this point devoid of any further meaning.

Note: any abstraction involves stripping off semantics and reducing or simplifying behavior. On the other hand, any such construction may also bring in some artifacts, so the construct should never be naively identified with what it is supposed to model: the model may work only to some extent, if at all.

B-E go on to employ this notion of sets as a basis for mathematical structure-type (or “species of structure”) theories, which cover the bulk of mathematics. Apart from naked sets, they include all sorts of stuff like the natural numbers, other “number sets”, the familiar algebraic structures (groups, fields, rings, vector spaces, algebras, …), topological and measure structures, and combinations, extensions and variations thereof (operator algebras, manifolds like pseudo-Riemannian spacetimes, etc.).

Let’s keep in mind that, again by design, none of these mathematical concepts carries any interpretation or semantics: everything is but a purely formal game. This is true, regardless of the intuition and heuristics that have guided the choice of the defining axioms for these structures. Once the structure-axioms stand, they are preferably consistent, at best suitable for doing some nice math and with some luck even fun.

Note: the structure-type notion is categorical in nature, but does not go the full length category theory does. It is, in a sense, more cautious, which seems appropriate when probing deeper philosophical questions as to the how and why, scope, ontology etc. of such a tremendously powerful human art as is physics.

This is as far as B-E mathematics goes (which is pretty far). So what about physics? A general way in which a physical interpretation may be appended to all this was proposed by Ludwig. This is one of the “structuralist” programs for theoretical physics, as listed in the Stanford Encyclopedia of Philosophy.

According to Ludwig, at the core of any Physical Theory (PT) resides a B-E Mathematical Theory in the above sense, which is to serve as a model for some excerpt of reality. Observe that for each (attempted) PT, its MT is chosen or “proposed”, postulated, if you like; it is never formally inferred.

Next, one goes on to specify the “known inputs” for the PT. These consist first of appropriate templates for “observational statements”, which are accepted as relevant for the intended PT. In general, any concrete (experimental) observation may be formulated in set-theoretic language as “the constant a is an element of the set B”. Each observation leads to such a formal sentence, which is appended to the list of axioms of MT, thus extending it.

Actually, in order to make the MT into a PT, one also has to specify which “basis sets” in MT may appear in the observational statements. This convention and the “input templates” are of course additions to the formal scheme, outside of the MT considered. As such, they are “meta” notions. Together, they constitute the PT’s “mapping scheme” or “mapping principles”.

We also have to describe, in “natural language”, which parts of nature are adopted as “known inputs” for the intended PT. This is referred to as the “(fundamental) domain” of the PT.

In other words, the basic ingredients of a physical theory PT are given by the triple (domain, mapping scheme, MT). For the most elaborate and in-depth development, see Schröter, J., “Zur Meta-Theorie der Physik”, W. De Gruyter, 1996. A nice summary is given on Martin Ziegler’s page.

Ideally, the structural axioms of the MT used have an immediate physical interpretation. This is for example not the case for the Hilbert space axioms, as used in quantum mechanics. This indicates that Hilbert space is only an ancillary mathematical structure, possibly fine for practical calculations, less so for proper understanding. (In fact, Hilbert space is just the carrier of a very practical mathematical representation of other MTs that do permit more direct physical interpretation.)

Another interesting footnote is that, to the extent that the Ludwig program works (as is essentially established for physical models like General Relativity and non-relativistic Quantum Mechanics), it also shows that traditional binary logic is sufficient for physics (as all B-E mathematical theories are based on it).

Formal Bourbaki-Edwards set theory in a nutshell
The quintessence of the Bourbaki-Edwards approach (the B-E recipe) is:
 * 1) Mathematical models have no (need of any) semantics outside mathematics itself.
 * 2) So there is not much to settle. The rules of the formal game are to some extent arbitrary, or at least subject to choice and variation (by the human players).
 * 3) The only requirement is that the output of the game (or any sub-game of it) be “useful” in the sense of “consistent” = free of internal (!) contradictions.

A. Setting the scene: Bourbaki languages
1.	We select a primitive alphabet, consisting of:
 * formal letters a, … z, a’, …, z’, a’’, … (infinite supply)
 * logical symbols 	$$ \lnot, \lor, \tau, \Box $$
 * the special symbol $$\in $$ (foreshadowing that we will want to work with sets)
 * connector arcs, needed to link $$\tau$$ to the placeholders it “binds”
 * Importantly, apart from a suggestive name or sign, none of the symbols has any a priori meaning. Any “meaning” will arise later on, but merely mechanically: in the form of mimicking the behavior our expectation suggests.

2.	These symbols may be concatenated into strings (sequences of symbols).

3.	To permit practical work, we require some machinery for abbreviating and manipulating strings:
 * (Practical convention: strings may be represented / abbreviated, e.g. by letters A, B, … not belonging to the primitive alphabet.)
 * $$AB$$ is the concatenation of the strings (represented by) A and B
 * $$A ~ \operatorname{\mbox {id.}} ~ B$$  means A and B denote the same string (clearly a meta-statement relative to the formal system).
 * Tau-transform or substantivation of a string A relative to a letter x: $$\tau_x(A) $$. This operation removes all occurrences of x from A, puts $$\Box $$ placeholders in their stead and connects these with arcs to a prepended $$\tau $$ symbol. (This is the Hilbert operator.)
 * Substitution-transforms to replace substrings with other strings.
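
The string machinery above can be mimicked in a few lines of Python. This is only a sketch under stated assumptions: a string is a tuple of tokens, the connector arcs are replaced by numeric link ids (much like the superscripts used for the empty set further down), and the names `tau`, `substitute` and `show` are made up for this illustration.

```python
from itertools import count

_fresh = count(1)  # supply of unused link ids (assumed stand-in for connector arcs)

def tau(x, A):
    """Tau-transform of string A relative to letter x: every x becomes a
    placeholder linked to a new tau symbol prepended to the string."""
    k = next(_fresh)
    return (("τ", k),) + tuple(("□", k) if t == x else t for t in A)

def substitute(S, x, A):
    """Substitution-transform: replace every occurrence of letter x in A by string S."""
    out = []
    for t in A:
        out.extend(S if t == x else (t,))
    return tuple(out)

def show(A):
    """Render a string, numbering each tau in order of appearance."""
    number, n, parts = {}, 0, []
    for t in A:
        if isinstance(t, tuple):      # a linked symbol: ("τ", id) or ("□", id)
            sym, k = t
            if sym == "τ":
                n += 1
                number[k] = n         # placeholders always follow their own tau
            parts.append(f"{sym}{number[k]}")
        else:
            parts.append(t)
    return "".join(parts)

A = ("∈", "x", "y")                  # prefix notation for "x ∈ y" (an assumption)
print(show(tau("x", A)))             # τ1∈□1y
print(show(substitute(("z",), "y", A)))  # ∈xz
```

Concatenation AB is then just tuple addition, `A + B`.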

4.	In order for this to lead anywhere useful, we need to get strings sorted out.
 * We lay down a meta-rule (convention) that distinguishes between I-strings or “terms” and II-strings or “statements”. Each suitable string is either I or II. The intention is that terms will behave / be treated as “nouns” (say “objects”), whereas statements behave like (surprise!) “statements” (about objects). However, this is no more than a metaphor to help us players understand the game better.
 * Another meta-rule provides a machine for handling lists of strings, allowing us to produce new “valid” strings, from preceding ones. Any string written down according to this procedure is called an assembly or construct. This rule equips B-E with a notion of well-formed formula: assemblies are “ok to proceed with”, any other strings are mere garbage, and ignored henceforth.
 * This all looks more daunting than it is. In fact boring stuff. Mind you: formal string manipulation can be fun; see for instance the metamath site. (Note: the formal system employed by Metamath is not identical to B-E.)
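
To give a feel for the sorting into I-strings and II-strings, here is a small recursive check. The rules in it are an assumption: the text above does not spell them out, so the sketch uses Bourbaki-style criteria in prefix notation (a letter is a term, τ followed by a statement is a term, ¬ and ∨ build statements from statements, ∈ builds a statement from two terms). Connector arcs are ignored, and `parse` and `classify` are hypothetical names.

```python
def expect(kind, wanted):
    if kind != wanted:
        raise ValueError(f"expected a {wanted}-string, found a {kind}-string")

def parse(s, i=0):
    """Read one well-formed string starting at s[i].
    Returns (kind, next_index), where kind is "I" (term) or "II" (statement)."""
    if i >= len(s):
        raise ValueError("string ends too early")
    c = s[i]
    if c == "τ":                                  # τ + statement -> term
        kind, j = parse(s, i + 1)
        expect(kind, "II")
        return "I", j
    if c == "□" or (c.isascii() and c.isalpha()): # letter or placeholder -> term
        return "I", i + 1
    if c == "¬":                                  # ¬ + statement -> statement
        kind, j = parse(s, i + 1)
        expect(kind, "II")
        return "II", j
    if c in "∨∈":                                 # binary symbols
        wanted = "II" if c == "∨" else "I"        # ∨ joins statements, ∈ joins terms
        k1, j = parse(s, i + 1)
        expect(k1, wanted)
        k2, j = parse(s, j)
        expect(k2, wanted)
        return "II", j
    raise ValueError(f"unknown symbol {c!r}")

def classify(s):
    """Return "I", "II", or None for garbage."""
    try:
        kind, j = parse(s)
    except ValueError:
        return None
    return kind if j == len(s) else None          # leftovers make it garbage

print(classify("∈xy"))            # II
print(classify("τ¬¬¬∈τ¬¬∈□□□"))   # I
print(classify("∨x"))             # None
```

Under these assumed rules the first symbol of a well-formed string already decides whether it is a term or a statement.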

B.1. Some practicalities
Other formal languages such as first order predicate calculus typically cut short most of the above, and come to the present point rather more hurriedly. B-E gains explicitness (especially of reduction to mechanics of symbolic strings), at the expense of nitty-gritty. We now proceed to inject these aspects by extending the language.

5.	First some housekeeping:
 * For clarity and readability, introduce some auxiliary (!) logical symbols and abbreviating signs (the classics!): $$\land, \Rightarrow, \iff, \exists$$ and $$\forall$$. It is essential that this step is merely cosmetic, and introduces nothing new.
 * (Following Edwards, one distinguishes typographically between abbreviations for terms and those for statements.)

6.	We now allow (statement) schemes. A scheme is a meta-expression (outside our formal system) which acts like a template; it contains placeholder symbols (“variables”) for letters, terms or statements, thus generating statement strings within our formal system, by replacing the placeholders with some actual letter, term or statement. (A scheme may be “optimized”. Curiously, Bourbaki and Edwards here opt for totally opposite preferences.)
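
A scheme behaves much like a function whose parameters are the placeholders. The sketch below shows this for one scheme, (A ∨ B) ⇒ (B ∨ A), and at the same time shows that the auxiliary symbols are purely cosmetic. It assumes prefix notation and the usual expansions (A ⇒ B stands for ∨¬AB, and A ∧ B for ¬∨¬A¬B); the function names are invented for this illustration.

```python
def lnot(A):
    """¬A: prepend the primitive symbol ¬."""
    return "¬" + A

def lor(A, B):
    """A ∨ B, written in prefix form as ∨AB."""
    return "∨" + A + B

def implies(A, B):
    """A ⇒ B is only an abbreviation for (¬A) ∨ B."""
    return lor(lnot(A), B)

def land(A, B):
    """A ∧ B is only an abbreviation for ¬((¬A) ∨ (¬B))."""
    return lnot(lor(lnot(A), lnot(B)))

def commutation_scheme(A, B):
    """The scheme (A ∨ B) ⇒ (B ∨ A); the parameters play the role of placeholders."""
    return implies(lor(A, B), lor(B, A))

# Filling the placeholders with actual statements generates a statement string
# that uses primitive symbols only.
print(commutation_scheme("∈xy", "¬∈yx"))   # ∨¬∨∈xy¬∈yx∨¬∈yx∈xy
print(land("∈xy", "∈yx"))                  # ¬∨¬∈xy¬∈yx
```

Each call produces one statement of the formal system; the scheme itself stays outside it.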

B.2. Getting ready for math: proof schemes; proofs, theorems
7.	A notion of equivalence (“Eq.”) is introduced for finite meta-language constructs.

8.	Take a finite list L of statements and statement schemes. A proof scheme over L is a meta-theorem as follows: “L proves the statement A if there is a finite list L’ of statements fulfilling (a suitable list of requirements)”.
 * L’ is called a proof of A.
 * A is called a theorem over L, if a proof L’ exists, with A as its last statement.

B.3. Still getting ready for math: theories
9.	In order to perform maths (algebra, topology, geometry, you name it), we need the possibility to construct a plethora of different theories. In addition to the language system Sigma already built up, any one theory T consists of
 * a given, finite list L of statements and statement schemes in Sigma; these are the explicit axioms and (Axiom) Schemes of T
 * a proof scheme Gamma over L
 * A statement generated according to an axiom scheme of T is called an implicit axiom of T. Any letter that appears in L is called a constant of (or in) T, any other letter is called a variable of (or in) T.

C. Still getting ready for math: introducing formal logic
We now set out a specific theory, which we shall employ as a basis for (i.e. part of) all further theories considered (in a way to be made precise).

10.	The theory consisting of B-E language, complemented with the following 5 axiom schemes, and no further explicit axioms, is called a B-E formal logic.
 * $$ (A \lor A) \Rightarrow A$$
 * $$A \Rightarrow (A \lor B) $$
 * $$ (A \lor B) \Rightarrow (B \lor A) $$
 * $$ (A \Rightarrow B) \Rightarrow (( C \lor A) \Rightarrow (C \lor B)) $$
 * $$ (S|x)A \Rightarrow (\exists x) A$$
 * (We’ll call this list :LOGIC:)


 * The expression (S|x)A is the statement generated by replacing all occurrences of the letter x in the statement A by S, where S is a term.
 * Importantly, any “meaning” of the logical symbols (both primitive and other) like $$\lnot, \lor$$ and $$\Rightarrow$$ is fully and solely laid down by their behavior in line with above schemes.
 * Similarly, any notion of “truth” for statements corresponds to the notion of theorem, relative to a given formal theory T. Nothing more, nothing less. This rests upon the concept of proof scheme as per 8. above, as driven by the axiom (schemes) of T. Henceforth, we will always adopt specifically :LOGIC: for L, and call any T stronger than it a logical theory.

D. Almost there: set theory
11.	A few more auxiliary concepts are needed, before we can fix the axioms of set theory. Observe that their definition does not yet require the idea of sets.
 * Equality of terms: defines $$S = T $$ as abbreviation of a string.
 * If S and T are terms, the above string becomes a statement, to be understood as expressing equality between the (intended) sets S and T.
 * Collectivation: defines $$ \mbox {Coll}_x A $$ as abbreviation of a string.
 * If A is a statement, so is the above string. In our formal system, it is intended to represent the claim that A “collects” over x in the following sense: the collection of all x fulfilling the statement A may be regarded as a set, without risk of “paradoxes”. That this is effectively so will be ensured by the set-theoretical axioms put forward below.
 * Set builder: defines $$ \lbrace x~:~ A \rbrace $$ as abbreviation of a string.
 * Intuitively: the set of all x fulfilling A.
 * Containment: defines $$x \subset y $$ as abbreviation of a string.
 * Union: defines $$\cup X$$ as abbreviation of a string.
 * Empty set: $$\varnothing $$ $$ ~ \operatorname{\mbox {id.}} ~ \tau_x((\forall z) (z \notin x)) $$ $$ ~ \operatorname{\mbox {id.}} ~ \tau^1 \lnot \lnot \lnot \in \tau^2 \lnot \lnot \in \Box^2 \Box^1 \Box^1$$ (superscripts replacing the connector arcs, which are a bit of a typographical challenge)
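
The expansion of the empty set can be reproduced mechanically. The following is a sketch only: it assumes prefix notation, numeric link ids instead of arcs, and the usual expansions of the abbreviations ($$(\exists x)A$$ stands for $$(\tau_x(A)|x)A$$, $$(\forall x)A$$ for $$\lnot(\exists x)\lnot A$$, and $$z \notin x$$ for $$\lnot(z \in x)$$). All function names are made up for the illustration.

```python
from itertools import count

_fresh = count(1)  # supply of unused link ids (assumed stand-in for connector arcs)

def tau(x, A):
    """Tau-transform: replace letter x by placeholders linked to a prepended tau."""
    k = next(_fresh)
    return (("τ", k),) + tuple(("□", k) if t == x else t for t in A)

def substitute(S, x, A):
    """(S|x)A: replace every occurrence of letter x in A by the string S."""
    out = []
    for t in A:
        out.extend(S if t == x else (t,))
    return tuple(out)

def exists(x, A):
    """(∃x)A, expanded as (τ_x(A)|x)A."""
    return substitute(tau(x, A), x, A)

def forall(x, A):
    """(∀x)A, expanded as ¬(∃x)¬A."""
    return ("¬",) + exists(x, ("¬",) + A)

def not_in(s, t):
    """s ∉ t, expanded as ¬∈st."""
    return ("¬", "∈", s, t)

def show(A):
    """Render a string, numbering each tau in order of appearance."""
    number, n, parts = {}, 0, []
    for t in A:
        if isinstance(t, tuple):      # a linked symbol: ("τ", id) or ("□", id)
            sym, k = t
            if sym == "τ":
                n += 1
                number[k] = n
            parts.append(f"{sym}{number[k]}")
        else:
            parts.append(t)
    return "".join(parts)

empty_set = tau("x", forall("z", not_in("z", "x")))
print(show(empty_set))   # τ1¬¬¬∈τ2¬¬∈□2□1□1
```

Note that the letters x and z no longer occur in the final string: only primitive symbols, placeholders and links remain.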

12.	So finally now… the axioms of set theory: We propose the following 3 axiom schemes, extending :LOGIC: :


 * 1) $$ (S = T) \Rightarrow ((S|x)A \iff  (T|x)A) $$
 * 2) $$ (\forall x) (A \iff  B) \Rightarrow (\tau_x(A) = \tau_x(B)) $$
 * 3) (a tricky one)


 * Supplemented by the following 3 explicit axioms:


 * 1) $$ (\forall x) (\forall y) ~ \mbox {Coll}_z ((z = x) \lor (z = y)) $$
 * 2) $$ (\forall x) ~ \mbox {Coll}_y (y \subset x) $$
 * 3) $$ (\exists y) ((\varnothing \in y) \land (\forall x) ((x \in y) \Rightarrow (x \cup \{x\} \in y))) $$