Wikipedia:Reference desk/Archives/Mathematics/2020 May 29

= May 29 =

Estimating populations
Hi, guys!

I'm a language guy but I TOTALLY suck with numbers. So much that I can't even find a title to explain my question! That is probably super easy to answer for someone who likes math. If I know the population of a country in 2000 and 2010, how can I estimate the population by year (2001, 2, 3 and so on)? Thanks! Ericdec85 (talk) 04:20, 29 May 2020 (UTC)


 * To apply mathematics to a real-world problem, the first step is to build a mathematical model of those aspects of reality that are relevant to the question. Just being good at mathematics is not enough; you also need an understanding of how the world works. Even then, there are two inherent problems. The first is that the world is complex, while your model is necessarily a highly simplified abstraction. For many physics and engineering issues we know how to model them reliably, but if human behaviour is involved, there can always be something unexpected that intervenes (war, drought, pandemic) and makes mincemeat out of your carefully constructed model. Or our understanding of the situation is not good enough to build a model, and we end up with a spherical cow. The second problem is that a model typically represents a whole class of real-world situations. It has parameters (mathematical variables) for which you need to supply specific values to get results out of the model. Finding the correct values for these parameters can be hard to impossible, but without them the model is rather useless. That having been said, let us look at the specific question, which is one of extrapolation if you are looking into the future, and interpolation if you want in-between values. The simplest model is that the population keeps changing each year by the same amount $D$. So then $population in year (Y + 1) =$ $population in year (Y) + D$. This is called a linear model. In a plot of population against time, it results in points that lie on a straight line. Let us abbreviate "$population in year (Y)$" by "$P(Y)$". Then $P(Y + 1) =$ $P(Y) + D$, so $P(Y + 2) =$ $P(Y + 1) + D =$ $P(Y) + 2D$, $P(Y + 3) =$ $P(Y + 2) + D =$ $P(Y) + 3D$, and so on. In general,
 * $P(Y + N) = P(Y) + ND$,
 * in which $D$ is a parameter. Let us show this in action. For concreteness, assume that we know that $P(2000) = 2726391$ and $P(2010) = 2955421$. So then $2955421 = P(2010) = P(2000 + 10) = P(2000) + 10D = 2726391 + 10D$. Subtraction gives $10D = 2955421 − 2726391 = 229030$, so $D = 22903$. Now we can calculate the estimates:
 * $P(2001) = 2726391 + 22903 = 2749294$;
 * $P(2002) = 2749294 + 22903 = 2772197$; and so on.
 * A (slightly) more realistic model for population growth is that the change in population each year is by the same percentage. This gives us an exponential model. The formula for a one-year change now has the form $P(Y + 1) = C×P(Y)$, which generalizes for an $N$-year change to $P(Y + N) = C^{N}P(Y)$. Applying this to the same data now gives $C^{10} = 2955421/2726391 = 1.084$ (approximately). Then $C$ is the 10-th root of $1.084$, which can be found using a calculator, computer or logarithm table. It comes out as $C = 1.0081$ (again approximately). Now we can redo the calculation of the estimates:
 * $P(2001) = 1.0081 × 2726391 = 2748475$ (after rounding to a whole number);
 * $P(2002) = 1.0081 × 2748475 = 2770738$; and so on.
 * I hope this was helpful. --Lambiam 07:52, 29 May 2020 (UTC)
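For anyone who would rather let a computer do the arithmetic, the two models above can be sketched in a few lines of Python. This is only an illustration: the function names `linear_estimates` and `exponential_estimates` are made up for this example, and the two population figures are the ones assumed in the answer above. Because the code keeps the unrounded value of $C$ rather than 1.0081, its exponential estimates can differ by a few people from the hand calculation.

```python
def linear_estimates(p_start, p_end, years):
    """Linear model: the population changes by the same amount D each year."""
    d = (p_end - p_start) / years
    return [round(p_start + n * d) for n in range(years + 1)]


def exponential_estimates(p_start, p_end, years):
    """Exponential model: the population changes by the same ratio C each year."""
    c = (p_end / p_start) ** (1 / years)  # the years-th root of the overall ratio
    return [round(p_start * c ** n) for n in range(years + 1)]


# Example figures assumed in the answer above (2000 and 2010).
linear = linear_estimates(2726391, 2955421, 10)
exponential = exponential_estimates(2726391, 2955421, 10)

# One row per year: year, linear estimate, exponential estimate.
for offset, (lin, exp) in enumerate(zip(linear, exponential)):
    print(2000 + offset, lin, exp)
```

Both lists start at the 2000 figure and end at the 2010 figure; only the in-between years differ between the two models.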


 * I would change your word percentage to ratio, as you did not use % anywhere in your solution, and as ratio is more specific. —Tamfang (talk) 03:53, 2 June 2020 (UTC)
 * True. I chose percentage over ratio because I thought the former term, being much more common in news articles, might be the more familiar to the OP. --Lambiam 06:04, 2 June 2020 (UTC)


 * This requires a bit of a background in basic calculus, but if you're interested, you may want to check out Modeling situations with differential equations, and Exponential models with differential equations on Khan Academy. In general, modeling real world phenomena using math often requires some knowledge about how the measurement, in this case population, changes, as time passes. This is where differential equations come in. Once again, this next thing requires a bit of calculus, but do check out this series by 3Blue1Brown.


 * More generally, this is how a problem like this is solved. First, as explained, you have to make an assumption about what kind of model best represents the situation. In the case of population, an exponential model makes the most sense. An exponential function is a function that changes faster as it gets bigger. Using this here should make sense, because each generation has more people than the previous one (generalized assumption), and therefore there will be more people able and willing to have children (generalized assumption), which will net an even bigger subsequent generation, and so on. Thus, the difference between the sizes of successive generations gets bigger and bigger, or in other words, the population increases faster as it gets bigger. Hence, we see why an exponential equation usually makes the most sense here.


 * For reasons that are not immediately important (hint: it has to do with differential equations), exponential functions are of the form $f(x) = Pe^{rx}$, where


 * $$f(x)$$ is the population at some time $$x$$,


 * $$P$$ is the initial population,


 * and $$e$$ is a number, like pi, called Euler's number.


 * So, in your case, simply substitute $$P$$ with whatever the population was in 2000, and $$x$$ with the number of years that have elapsed from 2000, and you should have a rough estimate of what the population was. Except, there's one thing we haven't yet defined: $$r$$. That value determines how fast the function grows, and to calculate it, we need to do a bit of algebra. We know what the population in 2010 was, so we can set an equation to solve for $$r$$. Let's call 2010's population $$B$$. No matter what value $$r$$ is, I know that when $$x$$ is replaced with $$10$$, the equation has to equal $$B$$. Thus, $B = Pe^{10r}$.


 * That means that $B⁄P = e^{10r}$. We can now use a logarithm, in particular, the natural logarithm, to isolate the value of $$10r$$. Hence, $ln(B⁄P) = 10r$. (Confused about what just happened? Don't worry, it's simpler than it seems. You may find this helpful.)


 * Now we simply divide by $$10$$, and we're done. $r = ln(B⁄P)/10$.


 * Finally, the function that will tell you the approximate population $$x$$ years from 2000, is $f(x) = Pe^{(ln(B⁄P)/10)x}$. You can use this function for values both before and after 2010, since the conditions that resulted in this model up until then likely won't have changed, so we have no reason to assume that $$r$$ will change as time passes (you guessed it, that's another generalized assumption). -- Puzzledvegetable Is it teatime already?  15:34, 2 June 2020 (UTC)
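The derivation above can be tried out with a short Python sketch. The helper names `growth_rate` and `population` are invented for this illustration, and the two figures are the example values assumed earlier in the thread, not real census data. Note that $e^{r}$ comes out as the yearly ratio $C ≈ 1.0081$ from the earlier answer, so the two descriptions are the same model written differently.

```python
import math


def growth_rate(p, b, years=10):
    """Solve b = p * e^(r * years) for r, i.e. r = ln(b / p) / years."""
    return math.log(b / p) / years


def population(x, p, r):
    """f(x) = p * e^(r * x), with x in years since the initial measurement."""
    return p * math.exp(r * x)


P = 2726391  # assumed population in 2000 (example figure from this thread)
B = 2955421  # assumed population in 2010 (example figure from this thread)

r = growth_rate(P, B)
estimate_2005 = population(5, P, r)  # x = 5 years after 2000

print("r =", r)
print("yearly ratio e^r =", math.exp(r))
print("estimate for 2005:", round(estimate_2005))
```

Since $$x$$ does not have to be a whole number, the same function also gives mid-year estimates, for example `population(2.5, P, r)`.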

Estimating intermediate values like that is called interpolation. The simplest way is linear interpolation. Basically draw a straight line between the two known numbers on a graph, and match against the line for the intermediate numbers. The other methods mentioned above (and this one) assume certain population growth models that might or might not be valid. But unless the change between your two numbers is quite large, the results of the different methods will match pretty closely. 2602:24A:DE47:BB20:50DE:F402:42A6:A17D (talk) 08:08, 3 June 2020 (UTC)

Generating random by pure mathematics?
Is there any mathematical expression with a non-deterministic outcome? Maybe something comparable to x^2 = 1, which has two solutions, -1 and 1, but with the difference that every time you solve the equation, only one of the two solutions is actually true, but not always the same one, so that I could sit down, put the very same input a hundred times into one and the same algorithm, and thus receive a sequence of randomly alternating numbers. --80.219.180.46 (talk) 19:58, 29 May 2020 (UTC)


 * No. First, do not confuse expressions (which may evaluate to values if the values of all variables are given) with equations (which may or may not have solutions for unknowns). If an expression can be evaluated at all for a given value assignment to its variables, it will have the same value tomorrow as today. If an equation can be solved, its solution set will be the same tomorrow as today. Whether you are evaluating an expression or solving an equation, what you are doing is essentially constructing a mathematical proof. Quoting Daniel Graser: "Mathematical proof is quite extraordinary: what is proved today is true – today, tomorrow and in a thousand years." --Lambiam 21:44, 29 May 2020 (UTC)


 * In complex analysis it was once common to talk about multivalued functions, though I suspect that idea is gradually being phased out. A function, by definition, can only have one value, otherwise it's called a relation. A mathematical expression is really just a combination of functions, and so it can only have one value. That's the ideal anyway, whether mathematicians have always held to this ideal is another matter. --RDBury (talk) 23:26, 29 May 2020 (UTC)
 * A neat way of handling multivalued functions is to view them as being equivalence-class-valued functions. For example, the complex logarithm can be treated as a function $$\ln{:}~\mathbb{C}{\setminus}\lbrace {0}\rbrace \rightarrow \mathbb{C}/{\sim}$$, where $$u \sim v$$ iff $$u-v \in 2\pi{i}\mathbb{Z}$$. The codomain is an additive group, which also implies it is closed under multiplication by an integer. It is also a metrizable space, so it is meaningful to say that the complex logarithm is a continuous function. --Lambiam 10:23, 30 May 2020 (UTC)
 * Thank you very much for your answers. Does that mean, then, that there is no way to formally describe the emergence of a random distribution? I mean not in the sense of what the consequences of random processes look like, as probability theory or stochastics do, but to mathematically emulate the creation of randomness. --80.219.180.46 (talk) 01:52, 30 May 2020 (UTC)
 * I think the big question here is what, exactly, is randomness? It's really for philosophers and physicists to figure out. Mathematicians have axioms that say what properties randomness should have, see Probability axioms, but whether a specific system satisfies these axioms, or even if any such system exists in the real world, they leave for others to decide. There is Pseudorandomness, which is very useful in computing and exhibits most of the properties of randomness, but really isn't random because values are determined by a single seed value. Then there's Lavarand which uses the chaotic movement of lava lamps to generate randomness, but that's no longer purely mathematical, and one could argue that while the movement of the fluids in a lava lamp is unpredictable over the long term, they still follow the laws of fluid dynamics and so aren't really random at all. Then there's quantum mechanical randomness, which according to theory is truly random in some sense, but again that's not coming from mathematics but from the properties of subatomic particles. See also Hardware random number generator. --RDBury (talk) 03:51, 30 May 2020 (UTC)
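The point about pseudorandomness being fixed by a single seed is easy to see in practice. The following is a minimal sketch using Python's standard `random` module; the helper name `draws` and the seed value 2020 are arbitrary choices for this illustration.

```python
import random


def draws(seed, count=5):
    """Return `count` simulated die rolls from a generator started at `seed`."""
    rng = random.Random(seed)  # independent generator with a fixed seed
    return [rng.randint(1, 6) for _ in range(count)]


first = draws(2020)
second = draws(2020)

# The same seed always reproduces the same "random" sequence.
print(first == second)  # True
```

The rolls look random, yet running the same input through the same algorithm gives the same output every time, which is exactly the determinism described above.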


 * Other places the OP may want to look into are Stochastic process and things like Markov chains and the like for mathematical treatments of randomness. -- Jayron 32 04:14, 30 May 2020 (UTC)
 * These topics describe randomness but are agnostic with respect to the source of the randomness. The closest to a mathematical treatment of the emergence of randomness I can think of is chaos theory. But inasmuch as this can be used to emulate the creation of randomness, it remains firmly within the realm of (wholly deterministic) pseudorandomness. In many cases, the seeming randomness of a process is not an inherent property of the process, but a measure of our ignorance. It is not possible to disprove the hypothesis that this is in fact the case for all apparent randomness, including quantum-mechanical randomness. --Lambiam 10:23, 30 May 2020 (UTC)
 * The Hidden variable theory was essentially wholly disproven decades ago; quantum mechanics is fundamentally incompatible with any explanation that holds that its randomness is due to local hidden variables. Bell's theorem and all that.  The only mathematically sound way for hidden variables to be causing quantum randomness is to violate the Principle of locality; theories exist which allow for this such as Bohm's pilot wave theory, but such theories require feats of logical contortion that many are not comfortable with.  -- Jayron 32 13:41, 2 June 2020 (UTC)
 * Perhaps "reality" as we experience it is a scripted and recorded stream, carefully constructed so that all experiments – which like everything else are part of the predetermined script – appear to confirm the predictions of quantum theory. You may not believe this, but what you believe or not is also written in the script. Now prove that this is not the case. --Lambiam 18:27, 2 June 2020 (UTC)
 * You may as well ask us to prove, using pure mathematics only, that there is a God, or that there isn't. Good luck with that. --   Jack of Oz   [pleasantries]  23:12, 2 June 2020 (UTC)
 * The point is that if there was a way to produce provably true randomness by mathematical means, it would disprove the hypothesis that everything is deterministic and that all appearance of randomness merely reflects our ignorance. If one thinks the predeterminism hypothesis cannot be disproved, one should also accept that true randomness cannot be produced by mathematical means. --Lambiam 10:46, 3 June 2020 (UTC)
 * I mean, you might as well say that peanut butter cannot be produced by mathematical means. Of course it can't.  Mathematics doesn't "produce"; mathematics describes.  --Trovatore (talk) 19:25, 3 June 2020 (UTC)
 * Mathematics can produce extreme anxiety in students :). What about accepting that here "produce" is short for "provide methods to produce", like some people might write that "mathematics produces answers to physical problems". --Lambiam 20:31, 3 June 2020 (UTC)
 * The glory of mathematics is its extreme generality; virtually any state of affairs that is logically possible can be described mathematically. That said, mathematicians like to get answers, so they typically avoid descriptions that don't give definite answers.
 * With that as background, the OP might like to check out the notion of a random variable. With random variables, you can ask questions like  and get an answer that is neither "yes" nor "no", but "yes" with a certain probability.  The mathematicians get to put a "Pr" around the statement, and get a definite value for the probability, and then they feel better.
 * Random variables are arguably just syntactic sugar on top of probability distributions, which are completely deterministic objects, but they do give rise to expressions that seem somewhat like what the OP may have been asking for. --Trovatore (talk) 23:22, 2 June 2020 (UTC)
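As an illustration of the remark in this thread that chaos theory stays within deterministic pseudorandomness, here is a minimal Python sketch of the logistic map $$x_{n+1} = r x_n (1 - x_n)$$ with $$r = 4$$. The logistic map is a standard textbook example of chaos chosen for this sketch; it was not named in the discussion. The function name `logistic_orbit`, the starting value 0.2 and the perturbation 1e-10 are arbitrary.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x) starting from x0 and return all values."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


a = logistic_orbit(0.2, 50)
b = logistic_orbit(0.2, 50)          # identical start: identical orbit
c = logistic_orbit(0.2 + 1e-10, 50)  # tiny change in the start

print(a == b)  # True: the rule is fully deterministic
print(max(abs(u - v) for u, v in zip(a, c)))  # gap between the two nearby orbits
```

The orbit is completely determined by its starting value, yet a tiny uncertainty about that value grows until the two orbits no longer resemble each other, which is one sense in which apparent randomness can be a measure of our ignorance.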

As far as modelling random processes goes, yes, that is stochastic processes. Probability theory and the related topic mathematical statistics are different: they don't try to model the randomness itself, but instead say how to predict or understand what is likely to happen in a system containing randomness. Further down, the interpretation of probability is a complicated topic in philosophy: the SEP article may be better than ours. Regarding quantum mechanics: explaining how a wavefunction of probability amplitudes turns into a concrete observation is called the measurement problem.  It too is a big philosophical mystery and again you could look for an SEP article about it. 2602:24A:DE47:BB20:50DE:F402:42A6:A17D (talk) 08:15, 3 June 2020 (UTC)