Wikipedia:Reference desk/Archives/Mathematics/2011 January 29

= January 29 =

ZFC and its implications
Since ZFC is the modern foundation of mathematics, is it possible to prove all provable mathematical statements from it? And if so, doesn't that mean that other axioms, like the Peano axioms, are not true axioms, because they can be derived from ZFC? 220.253.245.51 (talk) 08:18, 29 January 2011 (UTC)


 * No, that's far too fundamentalist an approach. Axioms are not objectively true or false; they're something you decide yourself whether to work with or not. How it works, roughly, is that mathematicians have some fuzzy, intuitive idea of what they think should be provable, and then they try to construct axiom systems that allow that to be proved without also proving something that the mathematician strongly feels should not be provable, such as 0=1. Doing so is not easy, because there is no guaranteed way of making sure that your axiom system doesn't prove something absurd -- but it's the best we have.
 * The fact that Peano's axioms can be derived from ZFC (or more formally, that they can be proved consistent in ZFC) does not make the Peano system lesser than ZFC. If anything, it makes Peano more trustworthy than ZFC, because if Peano should turn out to be inconsistent, then so does ZFC.
 * When we get close to such foundational issues, we hit an area where intelligent, insightful mathematicians can and do disagree about what is reasonable. Most mathematicians probably do think that there is such a thing as fundamental, objective truths about the natural numbers (and that the Peano axioms are among those truths, so we can trust things proved from them). But as soon as we get to sets -- particularly infinite sets -- some uncertainty creeps in. It is practically inconceivable that a contradiction exists among the Peano axioms, but a contradiction in ZFC is merely very unlikely -- it would be a tremendous surprise and leave mathematics in disarray, but it's not going to drive anybody crazy from the knowledge, and many have thought about how we'd go about rebuilding everything.
 * It is also not true that ZFC allows one to prove everything one would like to prove -- in particular, category theory would be much nicer and smoother if you could speak about things such as "the set of all possible complex vector spaces", which are not allowed to exist in ZFC. There are various ways to side-step this issue within ZFC (somewhat kludgy but mostly technically adequate), and some proposed axiom systems different from ZFC that would be easier to do category theory in (such as Quine's NF, though there is much less trust in those being contradiction-free than there is for ZFC). –Henning Makholm (talk) 10:31, 29 January 2011 (UTC)


 * I would agree with some aspects of the above but not all of it. Most mathematical realists believe that the axioms of ZFC are objectively true in their intended interpretation, which is given by the von Neumann hierarchy. However, they are far from all that is true in the von Neumann hierarchy. In particular, large cardinal axioms should in general be true.
 * You need very minor large cardinals (say, a proper class of inaccessible cardinals) to work one of the kludges Henning is referring to (the so-called Grothendieck universes). Those are pretty harmless in terms of consistency; hardly anyone expects them to result in a contradiction. There's more controversy about the more interesting ones, say measurable cardinals or Woodin cardinals or supercompact cardinals. In some sense they are more interesting precisely because they "take more risk" of being inconsistent. This fits in well with Popperian falsificationism. --Trovatore (talk) 09:52, 30 January 2011 (UTC)


 * I was trying to avoid restarting the unproductive shouting match about who gets to call himself a "realist", by setting a low bar for what everyone is supposed to agree on. :-) As for creeping doubt, I can only say that I personally feel deeply in my liver and spleen that the Peano axioms must be right, and reserve the right to go raving mad with ontological vertigo if they are demonstrated to be inconsistent -- whereas I simply cannot muster the same degree of visceral certainty for the proposition that ZFC's power and replacement axioms are safer than Cantorian set comprehension. I do pragmatically believe the latter, but more due to the failure of all efforts to falsify it experimentally than because it intuitively must be so. –Henning Makholm (talk) 10:26, 30 January 2011 (UTC)
 * FWIW I apologize if I was a participant in the shouting match you're referring to. I will try, as best I can, to move 2-person discussions off the reference desk in the future. &mdash; Carl (CBM · talk) 13:14, 30 January 2011 (UTC)
 * Apology accepted, though I'm quite sure none was due. Yes, I thought that debate was tediously going in circles, but we might as well close the refdesk down if anyone had a right not to see conversations they found tedious. –Henning Makholm (talk) 13:29, 30 January 2011 (UTC)
 * Have to call you out on another issue there &mdash; the "Cantorian set comprehension" thing. I take it you are taking the position that Cantor believed in unrestricted comprehension.  That is not in fact clear at all.  There is a substantial debate over what Cantor thought and when he thought it; the so-called "paradoxes of naive set theory" may in fact not attach to Cantor's viewpoint at all, at least to the later Cantorian viewpoint. --Trovatore (talk) 10:31, 30 January 2011 (UTC)
 * Oh, I was not purporting to convey biographical information about Georg Cantor, merely using "Cantorian" as a conventional label (because, in the heat of the moment, it did not occur to me simply to call it "universal" comprehension). It's entirely possible that it's not historically well-founded; if so, I'd assign some tentative blame to Hilbert, with his famous quip about paradise. –Henning Makholm (talk) 10:55, 30 January 2011 (UTC)
 * I really don't see why. Why do you want unrestricted comprehension?  As far as I can tell it's simply a category confusion; it comes about from confusing the extensional notion of set with the intensional notion of class (i.e. predicate). --Trovatore (talk) 11:03, 30 January 2011 (UTC)
 * Huh? I don't "want" unrestricted comprehension. It was just an example of an idea that looked pretty neat at first, but then turned out to be disastrous. The real point was that my intellect, however proud I might be of it otherwise, is too feeble to really grok the benignness of ZF replacement. This may be a failing of mine, but I'm arrogant enough to suppose that I can't be the only one suffering from it. –Henning Makholm (talk) 11:21, 30 January 2011 (UTC)
 * Oh, well, I was casting around for your meaning wrt the "paradise" thing, and what I came up with was "paradise is where what we want to be true, is true". So from that I got that you wanted unrestricted comprehension to be true.  Sorry if that's not what you meant.
 * Your point about failing to refute these things experimentally, despite much effort, is part of what I was talking about with large cardinals. No one I know of thinks that it's self-evident that, say, Woodin cardinals exist. That doesn't mean people don't think it's true. But it's a discovered truth, not a self-evident one. I think we have an article on quasi-empiricism in mathematics (though Quine himself was not enthusiastic about higher large cardinals; not sure about Putnam). --Trovatore (talk) 11:36, 30 January 2011 (UTC)
 * Of course the reason people would want unrestricted comprehension is that it's an immediate consequence of the concept of set as "any possible collection of mathematical objects". Of course this concept turns out to be inconsistent, and the type of naive set theory built upon it is inconsistent, but that doesn't change the fact that unrestricted comprehension is intimately attached to the natural-language concept of "set".


 * Historically, it seems that Cantor's earlier work (Grundlagen) in set theory was unarguably inconsistent, as he treated the class of all cardinal numbers as a set, while his later work (Beiträge) might or might not be, depending on how it's read. The main difficulty is that Cantor was sufficiently vague about what he meant by "set" at the end that it's hard to tell. The reference Frápolli 1991 from naive set theory examines that issue more closely. &mdash; Carl (CBM · talk) 12:48, 30 January 2011 (UTC)


 * Regarding "paradise", I was alluding to the fact that Hilbert famously called some kind of set theory das Paradies, das Cantor uns geschaffen hat, and speculated that this might have had the historical effect of people mistakenly attributing to Cantor some of the ideas that Hilbert spoke of. Now, looking closer into it, I find that Hilbert said this a long time after the paradoxes of naive set theory had been discovered. So whatever Hilbert was speaking about certainly didn't include unrestricted comprehension, and I hereby retract my speculation. –Henning Makholm (talk) 13:08, 30 January 2011 (UTC)

Re the original IP: another thing to be careful about is that, although ZFC is currently the most common foundation for undergraduate mathematics, it is not necessarily a foundation for all mathematics. There are many ways of interpreting these things, including reductionist ways that find an interpretation of all this work within ZFC, but the claim "ZFC is the foundation for all mathematics" is more subtle than elementary books let on. For example: &mdash; Carl (CBM · talk) 21:09, 30 January 2011 (UTC)
 * Set theorists routinely study aspects of mathematics using hypotheses that cannot be proven in ZFC. These include large cardinals, determinacy axioms, and examples in set theoretic topology that explicitly assume the continuum hypothesis or its negation.
 * Most non-set-theoretic mathematics can be done in ZFC, but not quite all. In particular, the original proof of Fermat's last theorem is an interesting case. Although the proof can be reworked in ZFC, the literal methods used in the proof employ Grothendieck universes and cannot be directly formalized in ZFC the way that elementary group theory can be.

Terence Tao has claimed that part of the statement of the undergraduate fundamental theorem of linear algebra cannot be formalized in ZFC. The theorem says (among other things) that every finitely generated real vector space V has a dimension dim(V). The issue is that those vector spaces form a proper class, not a set, so ZFC cannot quantify over them. 71.141.88.54 (talk) 23:27, 30 January 2011 (UTC)
 * There is no problem in quantifying over a proper class, only in collecting it into a completed whole. I think you've overinterpreted what Tao wrote. Basically what he's saying (I think; I only gave it a glance) is that dim is not strictly speaking a function, because its domain is a proper class. That's fine. You just reinterpret the statements in a completely routine way to use a definable predicate for dim rather than a function in the strict sense. --Trovatore (talk) 23:44, 30 January 2011 (UTC)
 * I took a second glance, and it's a little more interesting than what I was saying. His point, I think, is that you can do the definable-predicate thing I was talking about, but only if you actually know the definition. What you can't do directly in first-order logic is say "There is a (class) function dim with these properties, only I don't know what function it actually is".
 * Whether that's really the content of the theorem he quotes, though, is arguable, I think. Translating theorems from English into formal language is not quite automatic, and occasionally one has to look at the proof to see what the theorem actually means. In this case it's not a pure existence statement. It means something like "I can give you an explicit definition of dim such that the following properties hold". --Trovatore (talk) 00:32, 31 January 2011 (UTC)
 * This is a well-known (or well-ignored) issue in second-order arithmetic and other theories, too; it's not at all unique to set theory. When we write an English phrase with quantifiers over higher types (which, in the context of set theory, is analogous to quantifying over proper classes), the convention is that when this statement is formalized, the quantifiers are removed in a standard way to get something that can be written in the language of the theory at hand. &mdash; Carl (CBM · talk) 00:50, 31 January 2011 (UTC)
 * Trovatore, I'm still quite confused about "class functions" and "ordinary functions". By ordinary function I mean the usual definition: a set of ordered pairs, and so on. By class functions I mean things like the singleton function, the union function, the binary ordered-pair function. The class functions are made by first proving that a unique object with certain properties exists for any set (or pair of sets, or other stuff) and then introducing a new function symbol for it. What I don't understand is: how do we think of class functions in a non-formalist manner? They are certainly not sets, so do we think, in realist terms, of the collection of all sets, and then view class functions as literally taking sets to other sets (instead of coding them as ordered pairs, which is impossible anyway)? Money is tight (talk) 14:58, 31 January 2011 (UTC)
 * Generally class functions are a convenient way of talking about some definable predicate (maybe with parameters &mdash; of course in set theory a single arbitrary parameter is as good as as many as you want). So in general you have a formula &phi; with three free variables, and a parameter x, such that for every set y, there's exactly one set z such that &phi;(x,y,z) holds; then &phi; and x define a class function that takes y to z.  Make the obvious adjustments if you don't want the domain to be all sets.
 * Note the big difference from ordinary functions: We don't have any way, in this context, of talking about arbitrary functions whose domain is a proper class, but only definable ones (with parameters).  If you want to talk about arbitrary class functions, you're into the domain of second-order logic. --Trovatore (talk) 20:01, 31 January 2011 (UTC)


 * There are also NBG and MK set theory, which do allow defining class functions; one can perfectly well state "there is a class function that assigns the dimension to every finite dimensional vector space" in these set theories, by quantifying over classes. Better, you can prove in NBG (and hence also in MK) that there is a class function like that. NBG (but not MK) is conservative over ZFC: any fact about sets expressible in the language of ZFC and provable in NBG is provable in ZFC already. Indeed, you can get a model of NBG from a model of ZFC by taking all the definable classes as classes, although the intended model of NBG has all classes rather than just the definable ones. &mdash; Carl (CBM · talk) 20:12, 31 January 2011 (UTC)
 * And article links, for curious readers: NBG set theory, MK set theory. –Henning Makholm (talk) 20:59, 31 January 2011 (UTC)
 * Well, the intended model of Kelley–Morse is not really (V, P(V)), which doesn't really make sense. It's $$(V_\kappa, V_{\kappa+1})$$, where $$\kappa$$ is some inaccessible cardinal. It's not clear what the "intended model" of NBG would be &mdash; I tend to think of it as V together with the definable classes of V. --Trovatore (talk) 21:49, 31 January 2011 (UTC)
 * Yes, it can't be (V,P(V)) if P(V) is supposed to be a collection of sets. I prefer the approach that, in the intended interpretation, the set quantifiers range over all sets and the class variables range over all classes. That doesn't require a commitment whether there are nondefinable classes. That interpretation gives the axioms their usual (disquotational) meanings, which is the property I usually associate with the intended interpretations of foundational theories. &mdash; Carl (CBM · talk) 22:25, 31 January 2011 (UTC)
 * Except that KM just doesn't really make sense in that interpretation. If V is a completed totality then it has to be a set; if it's not a completed totality then it doesn't make sense, as far as I can see, to talk about arbitrary subcollections of it. --Trovatore (talk) 22:40, 31 January 2011 (UTC)
 * Whatever V is (completed or not), it's certainly a proper class, so I am comfortable asserting that at least one proper class exists. The class of ordinals is a second one. To avoid getting into a lengthy discussion, we should move to my talk page; you can have the last word on the matter here. &mdash; Carl (CBM · talk) 22:58, 31 January 2011 (UTC)
 * My position is that V isn't, strictly speaking, anything. It's not an object.  Statements regarding it are to be re-interpreted in a stereotyped way that I'm sure you understand.
 * The reason is that if V did in fact exist, it would have to be a set. The intuitive concept of the von Neumann hierarchy doesn't allow you to stop before you get to the end (and you never do). So if V existed, then ON would also exist, and being a well-ordered collection of ordinals would itself have to be an ordinal, and now you have Burali-Forti and the other antinomies. --Trovatore (talk) 03:55, 1 February 2011 (UTC)

Constructing mathematical models.
I am trying to (self-)learn how to construct mathematical models and then solve them. The question I am considering is this: Suppose I have a culture of bacteria in a petri dish which divides into two identical copies of itself every 10 minutes. For an arbitrary time interval $$\delta t$$ I wish to write and solve both discrete and continuous time models which give me the number of cells at time $$t+\delta t$$. What I am deducing right now is that if $$\delta t = 10q + r$$ according to the division algorithm, then $$N(t+\delta t)=2^qN(t)(1+r/10)$$ is the discrete model. Can someone tell me whether this is correct, and how to solve it? Also, how do I deduce and solve the continuous time model? Thanks-Shahab (talk) 08:25, 29 January 2011 (UTC)


 * What you have constructed isn't a discrete model - notice that your N is not restricted to taking whole-number values, as it would be in a model in which bacteria were discrete rather than continuous. Instead, you have a continuous model that is actually a series of linked linear models; for the first 10 minutes the number of bacteria increases at a constant rate of N/10 per minute; between $$\delta t=10$$ and $$\delta t=20$$ they increase at a constant rate of 2N/10 per minute; then 4N/10 for the next ten minutes, and so on.
 * An example of a discrete model is if the number of bacteria is assumed to instantly double every ten minutes, so that $$N(t+\delta t)=2^qN(t)$$. Gandalf61 (talk) 10:27, 29 January 2011 (UTC)

See exponential growth. $$N(t)=N(0)\,2^{t/10}$$ is your formula for the number of cells N at a given time t, knowing the initial number of cells N(0). This function satisfies the difference equation N(t+10)=2N(t). The logarithm of N grows linearly: log(N(t))=log(N(0))+(log(2)/10)t. Bo Jacoby (talk) 11:40, 29 January 2011 (UTC).
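As a quick numerical sanity check of Bo Jacoby's closed form (with an assumed, illustrative initial count of N(0) = 100 cells), one can confirm both the difference equation and the linear growth of the logarithm:

```python
import math

# Closed-form continuous model N(t) = N(0) * 2**(t/10), with an
# assumed initial count of 100 cells (illustrative value).
def N(t, N0=100):
    return N0 * 2 ** (t / 10)

# The difference equation N(t+10) = 2 N(t) holds at every t,
# not only at multiples of 10:
for t in [0, 3.7, 10, 25]:
    assert abs(N(t + 10) - 2 * N(t)) < 1e-9

# log N(t) = log N(0) + (log 2 / 10) * t grows linearly in t:
for t in [0, 5, 50]:
    assert abs(math.log(N(t)) - (math.log(100) + (math.log(2) / 10) * t)) < 1e-9
```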


 * Thank you for helping me. I see that both of you have constructed the discrete model by taking r=0, so that essentially time is measured in intervals of 10 minutes. Now my question is how do I solve the continuous model $$N(t+\delta t)=2^qN(t)(1+r/10)$$.-Shahab (talk) 14:54, 29 January 2011 (UTC)


 * But your model is already solved - you have an expression that gives you N at any time, given N at one known time t - so not sure what further information you are expecting here. Gandalf61 (talk) 14:58, 29 January 2011 (UTC)


 * I am looking for an expression for N(t) which does not involve any kind of recurrence - a formula into which you plug t and get the requisite number of bacteria, provided N(0) is given. I think Bo Jacoby gave such an expression, $$N(t)=N(0)2^{t/10}$$, for the discrete case-Shahab (talk) 15:30, 29 January 2011 (UTC)


 * Bo Jacoby gave an expression for the continuous case. --COVIZAPIBETEFOKY (talk) 21:55, 29 January 2011 (UTC)


 * Okay. Can you bear with me please and explain how Bo Jacoby arrived at this continuous solution strictly from $$N(t+\delta t)=2^qN(t)(1+r/10)$$? Also, is the solution for the discrete case $$N(t)=N(0)2^{\left \lfloor \frac{t}{10} \right \rfloor}$$ correct?-Shahab (talk) 03:26, 30 January 2011 (UTC)


 * You don't need the floor signs in the discrete case, because "discrete" means that you're only going to apply the formula when t is a multiple of 10 anyway, so the floor does nothing.
 * Once you remove the floor signs, you have an expression that is meaningful for any t, is nicely simple and $$C^\infty$$, and agrees with the discrete solution at points where the latter is defined. What more do you want for a continuous model? You reasonably want your functional equation to hold for values of t that are not multiples of 10, but it is very easy to check that $$N(0)2^{t/10}$$ satisfies that. What is there not to be satisfied with about that model? –Henning Makholm (talk) 05:19, 30 January 2011 (UTC)


 * Is it possible that there is still confusion over the meaning of the question? (though Gandalf61 did explain above). Real bacteria don't usually wait for ten minutes, then all divide at once (though in certain circumstances there might be chemical co-ordinating messages in a few species). Real bacteria are dividing "almost" at random, with the average being a doubling every ten minutes in your example.  This is why a continuous model is valid and usually accurate for real bacteria.    D b f i r s   08:36, 30 January 2011 (UTC)


 * Thanks to all your comments, almost all my doubts are resolved. Only one more question please: I had deduced $$N(t+\delta t)=2^qN(t)(1+r/10)$$ as my model initially. How do I get $$N(t)=N(0)2^{t/10}$$ from here? -Shahab (talk) 09:04, 30 January 2011 (UTC)
 * Your problem is that $$N(t+\delta t)=2^qN(t)(1+r/10)$$ is wrong. You don't get anywhere from there.
 * To see that it is wrong, set N(0)=100, and imagine moving first to N(5) and then to N(10). By your formula you'd get N(5)=N(0)*(1+0.5)=150 and N(10)=N(5)*(1+0.5)=225, but N(10) was only supposed to be 200. –Henning Makholm (talk) 09:46, 30 January 2011 (UTC)
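The two-step counterexample above can be checked mechanically; a minimal sketch of the proposed rule, with the same assumed starting count of 100:

```python
# The proposed rule N(t + dt) = 2**q * N(t) * (1 + r/10),
# where dt = 10q + r by the division algorithm.
def step(N, dt):
    q, r = divmod(dt, 10)
    return (2 ** q) * N * (1 + r / 10)

# Two 5-minute steps disagree with one 10-minute step, so the rule
# is not self-consistent:
two_half_steps = step(step(100, 5), 5)  # 100 -> 150 -> 225
one_full_step = step(100, 10)           # 100 -> 200
assert two_half_steps != one_full_step
```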

Ostensibly simple probability question?
How do I prove that $$P(A \cup B|A) = 1$$ for all A and B? Ostensibly it seems like such a simple problem but I can't do it! 220.253.245.51 (talk) 11:30, 29 January 2011 (UTC)
 * $$P(A \cup B|A) = \frac {P((A \cup B)\cap A)}{P(A)} = \frac {P(A)}{P(A)}=1$$ for $${P(A)}\ne 0$$, and undefined elsewhere. Bo Jacoby (talk) 11:49, 29 January 2011 (UTC).
 * Thanks, I tried that actually but that assumes $$P((A \cup B)\cap A) = P(A)$$. Can that be proven? 220.253.245.51 (talk) 11:55, 29 January 2011 (UTC)
 * $$(A \cup B)\cap A = A$$ as a matter of naive set theory. If you need help proving that, you need to disclose exactly which formalization of unions and intersections you have to work from. –Henning Makholm (talk) 12:17, 29 January 2011 (UTC)
 * (1) A ⊆ A∪B
 * (2) if A ⊆ C then A∩C = A
 * Bo Jacoby (talk) 01:47, 31 January 2011 (UTC).
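Chaining Bo Jacoby's two set-theoretic steps with the definition of conditional probability gives the whole derivation in one place; a sketch in display form:

```latex
% Absorption (steps (1) and (2) above), then the definition
% of conditional probability.
\begin{align*}
A \subseteq A \cup B
  \;&\Longrightarrow\; (A \cup B) \cap A = A \\
  &\Longrightarrow\; P\bigl((A \cup B) \cap A\bigr) = P(A) \\
  &\Longrightarrow\; P(A \cup B \mid A)
     = \frac{P\bigl((A \cup B) \cap A\bigr)}{P(A)}
     = \frac{P(A)}{P(A)} = 1
     \qquad (P(A) \neq 0).
\end{align*}
```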

Taking from a large population without replacement
Suppose there is a bag in which there are n balls, of which w are white and the remaining (n - w) are red. Then, on drawing balls without replacement, the probability that the ball we are about to draw is white is not independent of the results of previous draws: if we have already drawn many reds, then it is more likely that the next one will be white, and vice versa.

To be precise, if we draw k balls and if X is the random variable that is the number of white balls drawn, then

$$P (X = x) = \frac{\binom{w}{x} \binom{n - w}{k - x}}{\binom{n}{k}}$$

However, in examination questions on the binomial distribution, often they say "assume n is large", and expect me to use the binomial distribution to model X:

$$P (X = x) \approx \binom{k}{x} \left ( \frac {w}{n} \right )^x \left ( \frac {n - w}{n} \right )^{k - x} $$

Is the binomial distribution a valid approximation for large n, and how can it be justified? --jftsang 22:09, 29 January 2011 (UTC)


 * If you expand both the approximation and the exact expression, you'll see that the approximation is obtained by replacing w!/(w-x)! with w^x, (n-w)!/(n-w-k+x)! with (n-w)^(k-x) and (n-k)!/n! with n^k. The reason this is ok for large numbers is that w!/(w-x)! is equal to $$ w \cdot (w-1) \cdots (w-x+1)$$, and for fixed x and as w tends to infinity, the ratio of this expression to w^x tends to one (think like this: for really big w, (w-1) is pretty much the same as w...).  The same argument covers the other terms.  Quantitative estimates will be harder to come by: you'll need to look carefully at the ratio I just talked about to see exactly how it behaves. Tinfoilcat (talk) 00:56, 30 January 2011 (UTC)
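Tinfoilcat's limit argument is easy to illustrate numerically; a minimal sketch (the parameter values below are chosen for illustration, not taken from the thread):

```python
from math import comb

# Exact probability of drawing x white balls in k draws without
# replacement from a bag of n balls with w white (the exact expression
# from the question):
def hypergeom_pmf(n, w, k, x):
    return comb(w, x) * comb(n - w, k - x) / comb(n, k)

# Binomial approximation with constant success probability p = w/n:
def binom_pmf(n, w, k, x):
    p = w / n
    return comb(k, x) * p ** x * (1 - p) ** (k - x)

# For a small bag the two differ noticeably; scaling the bag up while
# keeping the white fraction fixed shrinks the gap, as the limit
# argument predicts.
small_gap = abs(hypergeom_pmf(20, 10, 5, 2) - binom_pmf(20, 10, 5, 2))
large_gap = abs(hypergeom_pmf(20000, 10000, 5, 2) - binom_pmf(20000, 10000, 5, 2))
assert large_gap < small_gap
```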


 * For the binomial distribution to hold, the probability of drawing a white ball needs to be constant. Initially that probability is w/n. After drawing one ball, it is either (w-1)/(n-1) or w/(n-1) (depending on whether you drew a red or white ball). Let's assume you drew a white ball and the probability is now (w-1)/(n-1) (the argument for the other case is similar). Let's divide top and bottom by n (since we're going to assume n is large, it helps to have it only appear in denominators). We get:
 * $$\frac{\frac{w}{n}-\frac{1}{n}}{1-\frac{1}{n}}$$
 * If you let n tend to infinity, you'll get:
 * $$\frac{\frac{w}{n}-0}{1-0}=\frac{w}{n}$$
 * which is the initial probability. So, for an infinitely large bag, the probability doesn't change (which isn't surprising - infinity minus one is just infinity).
 * You can get an idea for how good an approximation that is for a large, but not infinite, bag by looking at the Taylor series. We'll use the series:
 * $$\frac{1}{1-x}=1+x+x^2+x^3+\dots$$
 * In this case, we get:
 * $$\left(\frac{w}{n}-\frac{1}{n}\right)\left(1+\frac{1}{n}+\frac{1}{n^2}+\frac{1}{n^3}+\dots\right)$$
 * Since n is large, we'll assume $$1/n^3$$ is negligible and will ignore it. That leaves us with just:
 * $$\frac{w}{n}-\frac{1}{n}+\frac{w}{n^2}-\frac{1}{n^2}=\frac{w}{n}+\frac{w-n-1}{n^2}$$
 * For really large n, that second term is going to be tiny. For slightly smaller n, we can see that if w and n are of similar size, it's still going to be small. If n is a lot larger than w, but still not very large, it could be substantial. That tells us that the binomial distribution is a good approximation for "large" n and that what constitutes "large" depends on the relative sizes of n and w - the larger w is, the less large n needs to be. --Tango (talk) 02:13, 30 January 2011 (UTC)
 * Actually, I need to take some of that back. For the other case, where you drew a red ball, you get the probability of the second ball being white (again ignoring terms with a $$1/n^3$$ in them) to be $$\frac{w}{n}+\frac{w}{n^2}$$. In that case, the unwanted term is small for large n and for small w (rather than large w, as before). That means the only way to get a good approximation is to have a really large n. (How large obviously depends on how precise you need to be.) --Tango (talk) 02:18, 30 January 2011 (UTC)
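A numeric spot-check of these second-order expansions (the values of n and w are assumed for illustration; by my arithmetic, expanding $$w/(n-1)$$ to order $$1/n^2$$ gives $$w/n + w/n^2$$):

```python
# Spot-check the expansions of the second-draw probabilities for
# assumed values n = 1000, w = 300.
n, w = 1000, 300

# After drawing a white ball: exact (w-1)/(n-1) vs w/n + (w-n-1)/n^2
exact_white = (w - 1) / (n - 1)
approx_white = w / n + (w - n - 1) / n ** 2
assert abs(exact_white - approx_white) < 1e-5

# After drawing a red ball: exact w/(n-1) vs w/n + w/n^2
exact_red = w / (n - 1)
approx_red = w / n + w / n ** 2
assert abs(exact_red - approx_red) < 1e-5
```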

For sampling without replacement you want the hypergeometric distribution instead of the binomial distribution. 71.141.88.54 (talk) 08:33, 30 January 2011 (UTC)
 * If you want to be precise, yes, but the OP was talking about approximating it by binomial, which is very common since (for large enough n) the approximation is very good and the binomial distribution is easier to work with (and people are generally more familiar with it). --Tango (talk) 16:15, 30 January 2011 (UTC)