Wikipedia:Reference desk/Archives/Mathematics/2011 February 9

= February 9 =

how long to cross the 0 on a random walk
Whenever you see graphs of random walks, sometimes you'll see them not crossing the zero for a really, really, really long time! How is that possible? Doesn't the law of averages state that it has to hover around zero? If it goes way into negative or way into positive, it needs to go in the other direction for it to be a "random" (instead of a weighted) walk, doesn't it? 217.136.92.148 (talk) 00:04, 9 February 2011 (UTC)


 * Be careful—that's the gambler's fallacy. If the first step of the random walk takes it up to 1, then the next step is equally likely to take it up to 2 or down to 0. Essentially, after that first step you are starting a brand new random walk, except that you're starting from 1 now instead of 0. So, from that perspective, once the random walk has reached 1, your new expectation should be that it should average out to about 1 from that point on: there is no "force" that should make it more likely to come down to 0 than to go up to 2. —Bkell (talk) 00:48, 9 February 2011 (UTC)


 * This has something to do with the Levy arcsine law. Apparently the proportion of the time that a random walk with p=1/2 will be in positive territory will have an arcsine distribution, meaning it is much more likely to be close to 0 or 1 than to the intuitive value of 1/2. Although I'm not an expert so I find this quite mysterious myself. For one thing, does the arcsine distribution even integrate to one over the unit interval? For another thing, I find it hard to imagine how this is totally independent of the length of your sample path. On a long enough sample I would expect to see 'cycles' between very long periods in positive territory and very long periods in negative territory, which would bring the total back closer to 1/2. I hope someone can give a better answer. —Preceding unsigned comment added by 130.102.78.164 (talk) 01:00, 9 February 2011 (UTC)


 * $$\frac{2}{\pi}\arcsin(\sqrt{x})$$ is the cumulative distribution. The integral of the arcsine distribution over the unit interval is $$\frac{2}{\pi}\arcsin(\sqrt{1})$$, which you can check is 1.
 * Concerning the law of averages, you can argue that it has to stay relatively close to zero. But here "relatively close" just means less than $$\sqrt{2n\log\log n}$$ (and even that not quite).--203.97.79.114 (talk) 02:43, 9 February 2011 (UTC)
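
To put numbers on the arcsine law mentioned above, here is a minimal Python sketch that evaluates the cumulative distribution $$\frac{2}{\pi}\arcsin(\sqrt{x})$$ quoted in this thread. The function names are made up for illustration, and the law is a limiting statement for long walks, so the figures are approximations for any finite walk.

```python
import math


def arcsine_cdf(x):
    """CDF of the arcsine law on [0, 1]: (2/pi) * arcsin(sqrt(x))."""
    return (2 / math.pi) * math.asin(math.sqrt(x))


def prob_between(a, b):
    """Limiting probability that the fraction of time spent positive lies in [a, b]."""
    return arcsine_cdf(b) - arcsine_cdf(a)


if __name__ == "__main__":
    # Compare "almost always on one side" with "split roughly evenly".
    near_edges = prob_between(0.0, 0.1) + prob_between(0.9, 1.0)
    near_half = prob_between(0.45, 0.55)
    print(f"fraction within 0.1 of 0 or 1: {near_edges:.3f}")
    print(f"fraction within 0.05 of 1/2:   {near_half:.3f}")
```

The two intervals around 0 and 1 have the same total width as a band of width 0.2, yet they carry far more probability than the band around 1/2, which is why long one-sided stretches are typical rather than exceptional.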


 * Funny enough, while the reasons he gives are iffy (for reasons related to the gambler's fallacy), he's not far from the truth. In fact, a simple random walk on $$\mathbb{Z}$$ is expected to visit every point an infinite number of times (in particular, it returns to 0 in finite time, with probability 1). Our article on random walks addresses this, roughly, and the referenced Mathworld article (in the Higher dimensions section) furthers the discussion. Invrnc (talk) 02:49, 9 February 2011 (UTC)
 * Also, I'll point out that, though returning to the origin in finite time has probability 1, there is no upper bound on this finite time. Simple random walks can stay away from the origin for arbitrarily long times, but this occurs with arbitrarily small probabilities. SemanticMantis (talk) 15:46, 9 February 2011 (UTC)
 * Moreover, the expected time between visits to the origin is infinite. Algebraist 15:56, 9 February 2011 (UTC)

WHAT???
Moreover, the expected time between visits to the origin is infinite

How can that possibly be true!! Please elaborate... 217.136.92.148 (talk) 18:04, 9 February 2011 (UTC)
 * I am not familiar with this result. But it would mean that the distribution of lengths between visits to the origin would have a long-tail distribution.  I am also a bit queasy about the word infinite; probably a better wording is that the expected time is not finitely bounded.  Think of it this way: since the walk is recurrent we know that it will cross 0 infinitely often.  But we also know that it will cross 10^10 infinitely often.  The result just states that large excursions will occur often enough that we can't put a finite expected value on the time until the next crossing of 0.  Taemyr (talk) 18:40, 9 February 2011 (UTC)
 * The word "expected" is being used in the sense of expected value. Just to clarify, this does not mean that the walk will return to the origin "after infinite time", it just means that probabilistically, the expected length of time between visits is really big. How big? Bigger than any fixed real number. Staecker (talk) 19:00, 9 February 2011 (UTC)
 * The way you stylized "how big" - is that someone in the audience whispering it off-stage? 109.128.101.244 (talk) 20:02, 9 February 2011 (UTC)
 * There's no reason to be shy about calling the expected time infinite. It is infinite (assuming Algebraist is right, which he usually is).  Yes, you have to be careful not to conclude that there might be two individual visits that are separated by infinite time &mdash; as the problem is constructed, that can't happen (although you could probably extend the setup to a nonstandard model of Peano arithmetic or something, and in that context it could). --Trovatore (talk) 19:07, 9 February 2011 (UTC)
 * [ec] "The expected time is infinite" is legitimate. If $$p_i$$ is the probability that there will be no visit before time i, the expected time of the next visit is $$\sum_{i=1}^{\infty}p_i$$, which is $$\infty$$.
 * To the OP - this does not contradict the fact that $$\lim_{i\to\infty}p_i=0$$ and hence there will be a visit with probability 1. -- Meni Rosenfeld (talk) 19:13, 9 February 2011 (UTC)
 * In some cases, expected value matches with an informal notion of 'average behavior', but it doesn't have to. For instance, if heads is worth 1 and tails is worth 0, then the expected value of a fair coin toss is 1/2, even though that doesn't correspond with any possible event. Likewise, the expected value of a six-sided die is 3.5, and that cannot happen on any given throw either. The best intuitive way of understanding the results mentioned above on recurrence of integer simple random walk: we can say with certainty that the walk will return to the origin, and that it will occur in finite time. But there is no finite time after which we can be sure the walker has returned. Does that help? SemanticMantis (talk) 20:03, 9 February 2011 (UTC)
 * It helps if you meant to write: "The best intuitive way of understanding the results mentioned above on recurrence of integer simple random walk: we can say with certainty that the walk will return to the origin, and that it will occur in finite time. But there is no finite time after which we can expect that the walker has returned -- i.e. a finite time at which, "on average" he would have returned." —Preceding unsigned comment added by 109.128.101.244 (talk) 20:07, 9 February 2011 (UTC)
 * The reason that would have helped is, that's what I understand "expected" to mean. (Rather than your introduction of the word "sure").  109.128.101.244 (talk) 20:09, 9 February 2011 (UTC)
 * I think your new wording brings up a new issue. I'm not sure what "on the average he would have returned" means exactly, but it sounds like you might be thinking of the median time.  The median time is in fact finite (there's a finite time t such that the return is as likely to happen before t as after t).  It's the mean time that's infinite (add up all the possible lengths of time, multiplied by their probabilities). --Trovatore (talk) 20:25, 9 February 2011 (UTC)
 * I meant 'sure' in the sense of Almost_surely, which means probability of the event is 1. I think you're a little hung up on the infinite expectation. We can still answer questions like "after what time can we say the chance of return is at least 60%?" or "after 100 steps, what is the chance that the walker returned?" Anyone have an example of something more concrete with an infinite expectation? The expected degree in some scale-free network perhaps? SemanticMantis (talk) 20:40, 9 February 2011 (UTC)
 * Yes, you meant that, but weren't you SUPPOSED to mean 0.5 instead of 1? If I walk into a room at a random time, and wait for the minutes on the digital clock in it to advance by one, I have an expected wait time of 30 seconds. (There is a 50% chance that I walked in when the seconds were reading between 0-29, giving me a 30+ second wait, and a 50% chance that I walked in when they were reading between 30-59, giving me a 30- second wait.  Having to wait 31 seconds is as likely as 29 seconds, and so on all the way down.  The expected wait is 30 seconds.)  But the way that YOU construe the problem, you are saying "I have to wait 60 seconds", since you are using "probability one" (and perhaps it just turned the next minute the microsecond I entered the room).  So you are in conflict with Trovatore.  He is saying, "Median wait" (0.5 probability if you like), and you are saying "Probability 1 time wait".  Which of you is correct? 109.128.101.244 (talk) 23:14, 9 February 2011 (UTC)
 * Sorry, but I just don't understand your first sentence in particular, and where we are miscommunicating in general. Also, I've re-read the whole thread twice, and I see no inconsistencies between what I've said and what Trovatore has. Does Trovatore? Perhaps it's best to start with the simpler example of infinite expected value that Algebraist links below. Does that make sense? You could also start a new question about your specific concerns. This is a good question and interesting stuff, but this thread has got a lot of baggage by now. SemanticMantis (talk) 01:04, 10 February 2011 (UTC)
 * The problem with what you said, "there is no finite time by which we are (almost) sure to return", is that it has nothing to do with the infinite expectation. As long as the time of returning is unbounded, this will be true even if the expectation is finite. -- Meni Rosenfeld (talk) 09:39, 10 February 2011 (UTC)
 * Oh, I see, thanks. I didn't mean to equate the two notions, but I can see how it could read that way. SemanticMantis (talk) 15:10, 10 February 2011 (UTC)
 * The classic example is the St. Petersburg game. Algebraist 20:46, 9 February 2011 (UTC)
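
The concrete questions raised above ("after what time is the chance of return at least 60%?", "after 100 steps, what is the chance that the walker returned?", and the median return time) can be answered exactly. The sketch below is in Python and relies on the classical identity that a simple symmetric walk avoids the origin for its first 2m steps with probability C(2m, m)/4^m; the function names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction
from math import comb


def prob_returned_by(steps):
    """P(the walk has revisited 0 within `steps` steps), simple symmetric walk.

    Uses P(no return in the first 2m steps) = C(2m, m) / 4^m.
    Returns can only happen at even times, so odd `steps` are rounded down.
    """
    m = steps // 2
    return 1 - Fraction(comb(2 * m, m), 4 ** m)


def first_time_with_prob_at_least(target):
    """Smallest even number of steps by which the return probability reaches `target`.

    `target` must be strictly below 1, otherwise the loop never ends:
    no finite time makes the return certain.
    """
    steps = 2
    while prob_returned_by(steps) < target:
        steps += 2
    return steps


if __name__ == "__main__":
    print("median return time:", first_time_with_prob_at_least(Fraction(1, 2)))
    print("60% reached at step:", first_time_with_prob_at_least(Fraction(6, 10)))
    print("P(returned within 100 steps):", float(prob_returned_by(100)))
```

Every such quantile is finite and small, which is consistent with the mean being infinite: the mean is driven by the slowly decaying tail, not by the typical case.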

To address the original question, the probability that a random walk starting at the origin will return to the origin for the first time at step 2n is
 * $$p_{2n} = \frac{1}{2^{2n-1}n}\left(\begin{matrix}2n-2\\ n-1\end{matrix}\right)$$

(see Catalan number). By Stirling's formula,
 * $$p_{2n}\sim \frac{1}{2n\sqrt{\pi}\sqrt{n-1}},$$

and so the expected value
 * $$E(2n) = \sum_{n=1}^\infty 2np_{2n}$$

diverges by comparison with the p-series. Sławomir Biały (talk) 16:01, 10 February 2011 (UTC)
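
As a numerical check of the derivation above, the following Python sketch evaluates the formula for $$p_{2n}$$ exactly and accumulates both the total probability and the truncated expected value. It is only an illustration with made-up function names: the partial sums of the probabilities creep up towards 1, while the partial sums of $$2np_{2n}$$ keep growing, roughly like the square root of the cutoff.

```python
from fractions import Fraction
from math import comb


def first_return_prob(n):
    """p_{2n}: probability that the first return to 0 happens at step 2n."""
    return Fraction(comb(2 * n - 2, n - 1), n * 2 ** (2 * n - 1))


def partial_sums(n_max):
    """Return (sum of p_{2n}, sum of 2n * p_{2n}) over n = 1..n_max."""
    total_prob = Fraction(0)
    truncated_mean = Fraction(0)
    for n in range(1, n_max + 1):
        p = first_return_prob(n)
        total_prob += p
        truncated_mean += 2 * n * p
    return total_prob, truncated_mean


if __name__ == "__main__":
    for n_max in (10, 100, 1000):
        prob, mean = partial_sums(n_max)
        print(n_max, float(prob), float(mean))
```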

mathematicians: useful or useless?
I have two kinds of experiences with mathematicians. One, is that they are like magicians, who have a deep understanding of systems that no one in the world has. As long as they can assure that what they have an understanding of indeed corresponds to the conditions of the systems they're applying them to, a mathematician can do magic. But, my other experience is that a mathematician is useless, and has a body of theoretical knowledge that they cannot apply to anything in any way. What gives? How can I have these two contradictory experiences? 217.136.92.148 (talk) 09:45, 9 February 2011 (UTC)
 * Mathematical fun and fascination is not necessarily immediately useful. It may find useful or even commercial applications after some time. Bo Jacoby (talk) 09:55, 9 February 2011 (UTC).
 * A professor of mine put it this way (quoting Faraday), "Of what use is a baby?" meaning you can't judge the value of something only by its immediate usefulness. The ordinary person doesn't use much math on a daily basis, but without it science and engineering are impossible and without them modern civilization would be impossible.--RDBury (talk) 10:28, 9 February 2011 (UTC)
 * The word "mathematicians" generalizes a large group of people with widely varying interests and abilities. Some will proudly say "I have never done anything 'useful'", while others will focus their studies on topics known to be applicable and the skills of applying them.
 * Mathematical discoveries can take time to be assimilated into the theory of other professions, and some more time into their practice.
 * Necessarily, some mathematical concepts will be inherently more applicable than others, though it is hard to tell in advance which is which - the cryptographic applications of number theory are a famous example.
 * -- Meni Rosenfeld (talk) 10:39, 9 February 2011 (UTC)


 * Calling somebody either "useless" or "useful" are both denigrating (and so not really contradictory except on a purely local scale). Mathematicians are people. –Henning Makholm (talk) 10:42, 9 February 2011 (UTC)


 * Mathematics is part of what makes life worth living for a mathematician, and they get paid for it; that's no different from a musician getting paid for their art. Mathematics has helped make the modern world where you can phone up your friends and meet up to have a good time listening to that music if that's what you like. Would you say that the music you don't like is useless? Dmcq (talk) 11:07, 9 February 2011 (UTC)


 * George Boole's work was "relatively obscure, except among logicians. At the time, it appeared to have no practical uses." In 1900, he probably would have been a good example of a person with a body of theoretical knowledge that couldn't be applied to anything useful. Flash forward 100 years, and Boole's work is fundamental to the use and operation of computers. That is, if computer programmers didn't have Boole's work, someone would have had to invent it first. - That's a repeating theme in mathematics: some esoteric, purely theoretical problem suddenly becomes the focus of practical inquiry, and the results of obscure mathematicians, who during their lifetime never cared about practical applications, suddenly become critical to major industries. In some respects, current theoretical mathematicians you deem as useless are also doing magic, but they're doing it on systems that haven't been invented yet. -- 140.142.20.229 (talk) 01:36, 10 February 2011 (UTC)


 * I would like to just say if you consider mathematicians useless then perhaps you'd like to first consider what use are philosophers. The view you have towards mathematicians I have towards philosophers (not philosophers of mathematics though). Money is tight (talk) 06:29, 10 February 2011 (UTC)


 * See G. H. Hardy for an example of the latter: he was comforted by the knowledge that number theory had no application in war (until later developments in cryptography!). —Tamfang (talk) 18:06, 10 February 2011 (UTC)
 * That's what I said (though in a bit more cryptic way). -- Meni Rosenfeld (talk) 19:00, 10 February 2011 (UTC)

Everything we do is useless anyway, so you could just as well spend your time solving useless math problems. Count Iblis (talk) 12:09, 12 February 2011 (UTC)

3,5,7 = triple adjacent primes, are there any others?
Actually, in this case, it's 2,3,5,7 as adjacent primes, with 2 and 3 being the only ones that abut. I notice adjacent primes like 17,19 and 29,31, but I can't think of series of three such adjacent primes apart from the ones I mentioned. Are there any others? If not, is there a proof that there cannot be? And is there any proof that there is an infinite number of such adjacent primes?

And, if you had a graph of all natural numbers, like the x axis, and you plotted a curve above it which identified the primes, obviously this curve gets steeper quickly as the prime numbers thin out. What can we say about such a curve? Obviously, it cannot be identified by an algorithm, as then we could predict primes in advance. But is there a name for such a curve? (Oh, I'm not a math person really, so go easy on me, please.) Myles325a (talk) 11:17, 9 February 2011 (UTC)
 * There are no more triplets, and the proof is simple - if a is any integer, then exactly one of $$a,\ a+2,\ a+4$$ is divisible by 3. So if $$a>3$$ they cannot all be prime.
 * Whether there are an infinite number of twin primes is an open problem.
 * For the second question, you should start with the prime number theorem, which can be used to approximate this curve. Note that the exact curve can most certainly be identified with an algorithm, just not a very quick one. -- Meni Rosenfeld (talk) 11:30, 9 February 2011 (UTC)
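
The divisibility argument above is easy to check by brute force. The Python sketch below (helper names are invented for the illustration) confirms, for a small range, that exactly one of a, a+2, a+4 is a multiple of 3, and searches for triples of primes of that shape. A finite search is of course no substitute for the proof; it only illustrates it.

```python
def is_prime(n):
    """Trial division; adequate for the small range searched here."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def multiples_of_three(a):
    """Members of (a, a+2, a+4) that are divisible by 3."""
    return [m for m in (a, a + 2, a + 4) if m % 3 == 0]


def prime_triples(limit):
    """All (a, a+2, a+4) with a < limit in which every member is prime."""
    return [
        (a, a + 2, a + 4)
        for a in range(2, limit)
        if all(is_prime(m) for m in (a, a + 2, a + 4))
    ]


if __name__ == "__main__":
    print(all(len(multiples_of_three(a)) == 1 for a in range(1, 1000)))
    print(prime_triples(10_000))
```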


 * The closest admissible constellation of three primes is called a prime triplet. It is of the form (p, p + 2, p + 6) or (p, p + 4, p + 6). Four primes is a prime quadruplet and so on. They are all conjectured to have infinitely many occurrences but it hasn't been proved for any of them. The largest known cases are at http://anthony.d.forbes.googlepages.com/ktuplets.htm. Apart from trivial occurrences with primes below 1000, the largest k with a known prime k-tuplet is a 27-digit prime 19-tuplet discovered earlier today after a long search: http://tech.groups.yahoo.com/group/primenumbers/message/22585. PrimeHunter (talk) 14:01, 9 February 2011 (UTC)
 * Maybe of interest, the Green-Tao theorem shows (nonconstructively) that there are arithmetic sequences of arbitrary length in the primes. I thought some had been discovered of length 23 or so.  71.141.88.54 (talk) 20:44, 9 February 2011 (UTC)
 * According to The Prime Glossary, the longest known arithmetic sequence of primes is currently of length 25, starting with the prime 6171054912832631 and continuing with common difference 366384*23#*n, found by Chermoni Raanan and Jaroslaw Wroblewski in May 2008. (Here the # symbol denotes the primorial.) —Bkell (talk) 21:50, 9 February 2011 (UTC)
 * It has since been improved to length 26. See Primes in arithmetic progression and my website which maintains the current records. PrimeHunter (talk) 00:06, 10 February 2011 (UTC)
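
The length-25 progression quoted above can be checked directly. This Python sketch rebuilds the terms 6171054912832631 + 366384·23#·n for n = 0..24 and tests each with a Miller-Rabin routine; the fixed base set used here is known to give a deterministic answer for numbers well beyond this size. The helper names are assumptions made for the example.

```python
def is_prime(n):
    """Miller-Rabin with the first twelve primes as bases.

    This base set is deterministic for all n below about 3.3 * 10**24,
    which comfortably covers the 16-digit terms tested here.
    """
    if n < 2:
        return False
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for p in bases:
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2^r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for a in bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


def primorial(p):
    """Product of all primes <= p (written p# in the thread)."""
    product = 1
    for q in range(2, p + 1):
        if is_prime(q):
            product *= q
    return product


START = 6171054912832631
STEP = 366384 * primorial(23)
terms = [START + STEP * n for n in range(25)]

if __name__ == "__main__":
    print(all(is_prime(t) for t in terms))
```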

Percent of negligible error
I know that there are many ways to estimate error. I would like to know the proper method for this situation. I have a population, say 100 people. Let's assume I'm measuring obesity and 60% are obese. Then, I have a new data point at a later date. The population has grown to 150 people. What percent of difference from 60% should be acceptable or negligible? I need the formula based on original population, original percentage, and new population. I've found dozens of variations on Google, so I don't know which is proper. -- k a i n a w &trade; 14:19, 9 February 2011 (UTC)
 * See Reference_desk/Archives/Mathematics/2011_January_18. Bo Jacoby (talk) 16:55, 9 February 2011 (UTC).


 * It appears that it cannot be done using those methods without also knowing the standard deviation. I do not have standard deviation. I only have population and percentage at two different points in time. I would like to know if the second percentage is significantly different than the first based on the change in population size. -- k a i n a w &trade; 17:16, 9 February 2011 (UTC)


 * Consider taking the situation as a series of Bernoulli trials, that is each person is obese independently with some uniform probability p. A simple estimator for p is simply to use the proportion you observe (0.6 in the first population). Then the number of obese people you expect is np, with variance np(1-p). Once you have modeled the initial distribution, you can then calculate the probability of your second sample in that model.
 * Even if the model of Bernoulli trials is not right, you should attempt to apply some sort of model to your data (put some probability distribution on your population and estimate the parameters). I don't think you can say much about the significance without some knowledge of the variance (standard deviation). Invrnc (talk) 20:10, 9 February 2011 (UTC)

The mean value of the parameter P is $$\mu=E(P)=\frac{k+1}{n+2}=\frac{61}{102}=0.598$$ where n=100 people is the size of the sample and k=60 is the number of them being obese. The standard deviation is $$\sigma=\sqrt{V(P)}=\sqrt{\frac{\mu(1-\mu)}{n+3}}=0.0483$$. Bo Jacoby (talk) 01:45, 10 February 2011 (UTC).
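
For anyone wanting to reproduce these two numbers, here is a small Python sketch of the formulas just given. The function name is hypothetical. The formulas coincide with the mean and standard deviation of a Beta(k+1, n−k+1) distribution, i.e. the posterior for P under a uniform prior, which is an interpretation added here rather than something stated in the reply.

```python
import math


def proportion_mean_and_sd(k, n):
    """Mean and standard deviation of P given k positives out of n.

    mu    = (k + 1) / (n + 2)
    sigma = sqrt(mu * (1 - mu) / (n + 3))
    """
    mu = (k + 1) / (n + 2)
    sigma = math.sqrt(mu * (1 - mu) / (n + 3))
    return mu, sigma


if __name__ == "__main__":
    mu, sigma = proportion_mean_and_sd(k=60, n=100)
    print(f"mu = {mu:.3f}, sigma = {sigma:.4f}")
```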


 * Thanks. I found what they want. In this example, you have 60 positives and 40 negatives in the population. You add 50 people. Assume the second percentage is 70%. That is 105 positives in the new population, an increase of 45. That means that in the increase of 50 people, you added 45 positives. The added group is 90% positive. They really want to know the positive rate in the added group to judge how it relates to the old population - which that gives. So, instead of showing a change from 60% to 70%, they actually have a population at 60% in the past and added a population at 90% in the present. -- k a i n a w &trade; 16:11, 10 February 2011 (UTC)
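
The subtraction described in this reply is simple enough to wrap in a function. The Python sketch below uses an invented name and rests on the same assumptions as the reply: the old population is entirely contained in the new one, and nobody in it changed status between the two dates.

```python
def added_group_rate(old_size, old_rate, new_size, new_rate):
    """Positive rate among the people added between two measurements.

    Assumes the old population is a subset of the new one and that the
    status of its members did not change in between.
    """
    added = new_size - old_size
    if added <= 0:
        raise ValueError("the population must have grown")
    old_positives = old_rate * old_size
    new_positives = new_rate * new_size
    return (new_positives - old_positives) / added


if __name__ == "__main__":
    # The example from the thread: 100 people at 60%, then 150 people at 70%.
    print(added_group_rate(100, 0.60, 150, 0.70))
```

The same call works unchanged for populations in the millions, since it involves nothing heavier than a few multiplications.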


 * I should have stated: "I found what they want and it is not specifically what I asked about." -- k a i n a w &trade; 18:19, 10 February 2011 (UTC)
 * I would use Fisher's exact test here. Robinh (talk) 05:32, 14 February 2011 (UTC)


 * Thanks. I see the relation, but my problem is that the data I'm working with is in the range of millions of patients. I haven't found a way to handle factorials effectively when working with extremely large numbers. I have two choices - either the numbers overflow or the calculation takes hours. For a small population example, this is a real one: Of 818442, 33% have LDL<100 in Jan 2010. Of 1527433, 39% have LDL<100 in Jan 2011. Since the first population is almost completely a subset of the second and since LDL changes very little over a year's time, what is the actual increase in LDL<100 for the final population? -- k a i n a w &trade; 13:19, 14 February 2011 (UTC)
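
On the overflow problem with factorials: one common workaround is to stay in logarithms and use the log-gamma function, since log(n!) = lgamma(n+1). The Python sketch below computes the log of a single hypergeometric probability, which is the building block of Fisher's exact test; it is not the full test, and it says nothing about whether that test suits overlapping populations. The function names and the rounded counts in the example are illustrative assumptions only.

```python
import math


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k), via log-gamma."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def log_hypergeom_pmf(k, N, K, n):
    """log P(X = k) for a hypergeometric variable.

    N: population size, K: positives in the population,
    n: sample size,     k: positives in the sample.
    """
    return log_comb(K, k) + log_comb(N - K, n - k) - log_comb(N, n)


if __name__ == "__main__":
    # Illustrative counts only, rounded from the percentages in the thread.
    value = log_hypergeom_pmf(k=270086, N=1527433, K=595699, n=818442)
    print("log-probability:", value)
```

A p-value would need many such terms added together, which can also be done in log space (subtract the largest log term before exponentiating) so that nothing underflows.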

Fisher's Least Significant Difference test vs. Tukey's Least Significant Difference test (statistics)
Hello,

Is there a difference between Fisher's Least Significant Difference test and Tukey's Least Significant Difference test, or are they merely two names for the same test? To be clear, by Tukey's LSD test, I'm not referring to Tukey's Honestly Significant Difference test or the Tukey-Kramer test (which are quite different from Fisher's LSD test in that they correct for multiple comparisons).

I ask because the documentation for MATLAB's multcompare function indicates that it can use Tukey's HSD/Tukey-Kramer or Tukey's LSD (among other options; see the "Values of ctype" section about halfway down the page). I've searched for Tukey's Least Significant Difference test on the internet and in the literature but only found references to it in papers that mention its use. In contrast, searches for Fisher's Least Significant Difference test return results that provide information about the actual method, which leads me to suspect that perhaps Tukey's LSD is simply another name for Fisher's LSD or for the process of performing multiple pairwise t-tests.

Any clarification regarding this nomenclature would be appreciated.

Thanks!

142.20.133.215 (talk) 15:40, 9 February 2011 (UTC)

Does such a value exist?
For

x equals x to the cube plus one

Which is

x = x^3 + 1

Does such a value exist?

Thank you very much. --192.197.51.41 (talk) 16:00, 9 February 2011 (UTC)


 * Yes. Your equation is equivalent to finding the root of f(x)=x^3-x+1, which is a polynomial and hence a continuous function, which is therefore subject to the intermediate value theorem. Since f(-2)=-5 (i.e. less than ZERO) and f(-1)=1 (greater than ZERO), the IVT guarantees that there exists an x between -1 and -2 such that f(x)=0. For questions like these a quick answer can be gotten from WolframAlpha like so, which shows the answer to be x=-1.32472.... Zunaid 16:15, 9 February 2011 (UTC)
 * And in general, every polynomial (with real coefficients) of odd degree has a real root. A polynomial of even degree needn't have a real solution, but it must have a complex solution. -- Meni Rosenfeld (talk) 16:39, 9 February 2011 (UTC)
 * Alright, thank you very much. --192.197.51.41 (talk) 19:49, 9 February 2011 (UTC)
 * To expand on Meni's answer, see fundamental theorem of algebra, which states that every n-th degree polynomial must have n roots. In the case of your example it has 1 real root and 2 complex roots (as shown in the WolframAlpha link above). Zunaid 21:42, 9 February 2011 (UTC)
 * Technically, it must have n roots counted with multiplicity. x^2-2x+1=0 is true only for x=1 but, in a sense, it is true twice for x=1 (1 is also a root of the derivative, if you wish to be precise about what sense we mean). --Tango (talk) 23:46, 9 February 2011 (UTC)
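
The intermediate value theorem argument above also gives a way to compute the root: keep halving an interval on which the function changes sign. Here is a minimal Python sketch of that bisection, applied to f(x) = x³ − x + 1 on the interval from −2 to −1 used in the first reply. The function names and the tolerance are choices made for this example.

```python
def f(x):
    """The cubic from the question, rearranged so that a solution is a root."""
    return x**3 - x + 1


def bisect_root(func, lo, hi, tol=1e-12):
    """Find a root of a continuous `func` on [lo, hi] by repeated halving.

    Requires func(lo) and func(hi) to have opposite signs (or one to be zero).
    """
    f_lo, f_hi = func(lo), func(hi)
    if f_lo * f_hi > 0:
        raise ValueError("func must change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if f_mid == 0:
            return mid
        if f_lo * f_mid < 0:
            hi = mid  # sign change is in the left half
        else:
            lo, f_lo = mid, f_mid  # sign change is in the right half
    return (lo + hi) / 2


if __name__ == "__main__":
    print(bisect_root(f, -2.0, -1.0))
```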

Good ol algebra word problems
Time for some quadratics! I've got a typical plane flies word problem. I know how to solve a quadratic equation, but I'm trying to figure out how to get the quadratic equation.


 * A pilot flies from Ottawa to London (5400 km). On the return trip, he travels 50 km/h faster and reduces the travel time by 60 minutes (1 hour). Find his average speeds in each direction.

What I've gotten so far is that the first trip, speed (x1)=5400/y, where y is the time in hours. Therefore, since x2 = x1 + 50:

(5400/y)+50 = 5400/(y-1)

or

(5400/y) - (5400/(y-1)) + 50 = 0

What is the next step? -  ʄɭoʏɗiaɲ  τ ¢  21:34, 9 February 2011 (UTC)


 * What I would do next is to clear the fractions: multiply the equation by the common denominator of the two fractions. (Equations are usually easier to work with when they don't have fractions in them.) —Bkell (talk) 21:43, 9 February 2011 (UTC)
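
To illustrate the suggestion: multiplying (5400/y) + 50 = 5400/(y−1) through by y(y−1) gives 5400(y−1) + 50y(y−1) = 5400y, which tidies up to 50y² − 50y − 5400 = 0, or y² − y − 108 = 0. The Python sketch below solves that quadratic and checks the positive root against the original equation; the constant and function names are made up for the example.

```python
import math

DISTANCE_KM = 5400
SPEED_GAIN_KMH = 50
TIME_SAVED_H = 1


def outbound_time_hours():
    """Positive root of the quadratic obtained after clearing fractions.

    D/y + g = D/(y - s)  becomes  g*y^2 - g*s*y - D*s = 0.
    """
    a = SPEED_GAIN_KMH
    b = -SPEED_GAIN_KMH * TIME_SAVED_H
    c = -DISTANCE_KM * TIME_SAVED_H
    disc = b * b - 4 * a * c
    return (-b + math.sqrt(disc)) / (2 * a)


if __name__ == "__main__":
    y = outbound_time_hours()
    print("outbound time (h):   ", y)
    print("outbound speed (km/h):", DISTANCE_KM / y)
    print("return speed (km/h):  ", DISTANCE_KM / (y - TIME_SAVED_H))
```

The other root of the quadratic is negative, so it is discarded as a travel time.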

discrete Fourier transform in Mathematica
I am trying to get used to taking discrete Fourier transforms with Mathematica. The first step I take is to generate some time data from a function whose Fourier transform I know, namely I know that the transform of
 * $$f(t) = e^{- \frac{t^2}{2}-i \omega_1 t}$$

is
 * $$\hat{f}(\omega) = \sqrt{2\pi } e^{-\frac{1}{2} ( \omega -\omega_1 )^2}$$

which is a real-valued function. However when I create a time series of $$f(t)$$, using the following code:

omega1 = 2 Pi;

E3[t_] := Exp[- (t^2/2 )] Exp[-I omega1 t];

timelist = Table[E3[t], {t, -12, 12, .1}];

and then take its DFT by

freqlist = Fourier[timelist];

then the result is complex-valued, and I only get the Gaussian function I was expecting if I plot the absolute value. Is it to be expected that the DFT and the FT give different results? Thanks 128.200.11.124 (talk) 22:34, 9 February 2011 (UTC)


 * In the DFT, zero corresponds to the first element of the sequence, not the middle element. You should arrange your timelist to contain E3[t] for the following t's: {0, 0.1, 0.2, ..., 11.9, 12, -12, -11.9, ..., -0.2, -0.1}, and interpret your freqlist similarly. 98.248.42.252 (talk) 05:13, 11 February 2011 (UTC)
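
The effect of the reordering can be seen in a short experiment. The sketch below is in Python rather than Mathematica and uses a plain textbook DFT with the e^(−2πi jk/n) sign convention, which may differ from the default convention of Mathematica's Fourier in sign and normalisation; those differences do not affect the point being made. With the samples listed from t = −12 the transform carries a frequency-dependent phase, whereas with t = 0 first the imaginary parts vanish up to rounding. All names here are made up for the illustration.

```python
import cmath
import math

OMEGA1 = 2 * math.pi


def signal(t):
    """f(t) = exp(-t^2/2 - i*omega1*t), the test function from the question."""
    return cmath.exp(-t * t / 2 - 1j * OMEGA1 * t)


def dft(samples):
    """Naive O(n^2) discrete Fourier transform (no normalisation factor)."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * j * k / n) for j, x in enumerate(samples))
        for k in range(n)
    ]


# 241 samples at t = -12.0, -11.9, ..., 12.0; index 120 is t = 0.
times = [(k - 120) / 10 for k in range(241)]
centered = [signal(t) for t in times]

# Rotate so the list runs t = 0, 0.1, ..., 12, -12, ..., -0.1.
zero_first = centered[120:] + centered[:120]

spectrum_centered = dft(centered)
spectrum_zero_first = dft(zero_first)

if __name__ == "__main__":
    print("max |imag|, t = -12 first:", max(abs(z.imag) for z in spectrum_centered))
    print("max |imag|, t = 0 first:  ", max(abs(z.imag) for z in spectrum_zero_first))
```

The two spectra have identical absolute values, which is why plotting the magnitude already produced the expected Gaussian.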