Wikipedia:Reference desk/Archives/Mathematics/2022 December 26

= December 26 =

Deducing formulae from data
Are there books that tell you how to deduce or guess the formula for an un-understood process for which you have measured data? Can anyone recommend a text? In junior high school we did things like plot pairs of points on graph paper, such as 1, 1.21; 2, 2.20; 3, 3.23, etc. and realise that within measurement uncertainty it is presumably a straight line, formula Y = a + bX where a is 0.21 and b is 1. Big deal, real life is not that simple. We also plotted sets of points that looked just like the formula is a square law, Y = aX^2. Again, big deal. There are of course other simple obvious things like exponential growth/decay. But real processes are often not that simple. In high school and 4 years of a university engineering degree, we never looked at this topic again. There are a multitude of curve fitting textbooks on least squares regression, fitting generalised formulae such as Y = k1 + k2X^2 + k3X^3 .... etc etc. These are not what I need. You can always fit a curve with such methods, if you throw enough terms in, or decide to accept arbitrary error. But it tells you nothing about how the process that produced the data actually works, nor does it help you extract the real curve from the "noise" and measurement uncertainty. Dionne Court (talk) 05:49, 26 December 2022 (UTC)
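
For reference, the straight-line exercise described in the question can be sketched in a few lines of Python. This is only an illustration, assuming ordinary least squares and using just the three points quoted above; the names `xs`, `ys`, `a` and `b` are chosen here and are not from the original post.

```python
# Ordinary least-squares fit of Y = a + b*X to the three quoted points.
xs = [1.0, 2.0, 3.0]
ys = [1.21, 2.20, 3.23]

n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n

# Slope: covariance of X and Y divided by the variance of X.
s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
s_xx = sum((x - x_mean) ** 2 for x in xs)
b = s_xy / s_xx

# Intercept: the fitted line passes through the point of means.
a = y_mean - b * x_mean

print(f"a = {a:.3f}, b = {b:.3f}")
```

With only these three points the fit gives a slope close to 1 and an intercept close to 0.2, in line with the values quoted in the question within measurement uncertainty. As the question points out, a fit of this kind says nothing about why the process is linear.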


 * I'm afraid there is no general approach to finding a relatively simple formula that fits observed data within the measurement error. It may be instructive to read the section and see the long process leading up to Planck's 1900 construction of an empirical formula, which he called an "improvement" of Wien's equation. One can see the combination of heuristic theoretical approaches (Wien's thermodynamic arguments, Planck's resonant electric oscillators) with high-quality experimental data at work, and yet it appears that in the end success emerged as the result of patchwork, splicing Wien's and Rayleigh's formulas together without theoretical underpinning. One may hope that in the not-too-far future an AI can tirelessly try and fit parametrized models in order from simple to more complex, using some appropriate measure of complexity. Compare the (pre-computed) Inverse symbolic calculator, which will quickly inform me that my calculated numerical value 1.644934 may be $\zeta(2)$, which, where I'm coming from, is simpler than $\tfrac{1867}{1135}$.  --Lambiam 09:21, 26 December 2022 (UTC)
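
The lookup idea in the reply above can be illustrated with a toy table. This is a minimal sketch and not the actual Inverse symbolic calculator: the candidate table, the tolerance and the names `value`, `candidates` and `matches` are all assumptions made for the example.

```python
import math
from fractions import Fraction

# The numerical value quoted in the reply, known to six decimal places.
value = 1.644934

# Hypothetical, hand-picked table of candidate closed forms.
candidates = {
    "zeta(2) = pi^2/6": math.pi ** 2 / 6,
    "1867/1135": float(Fraction(1867, 1135)),
    "exp(1/2)": math.exp(0.5),
}

# Keep every candidate that agrees with the value to about six decimals.
tolerance = 5e-7
matches = [name for name, v in candidates.items() if abs(v - value) < tolerance]

for name in matches:
    print(name, candidates[name])
```

Both $\zeta(2)$ and $\tfrac{1867}{1135}$ pass the tolerance test, so the data alone cannot choose between them; the preference for $\zeta(2)$ comes from a judgement about which expression is simpler.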


 * I haven't bothered with them but I believe there are various programs on the web for guessing a formula from some data. They put together various formulae and give results ordered by a measure of simplicity and how well they fit the data. I guess they might do Planck's law. If the possibilities that are programmed in don't include the right answer at least they'll probably give you a good approximation for physical processes - even if they can't give the prime numbers for instance. NadVolum (talk) 14:08, 26 December 2022 (UTC)
 * A considerable time ago there was some hubbub about a program that discovered laws of physics, but after a brief flurry of interest I never heard of it again. --Lambiam 20:48, 26 December 2022 (UTC)
 * The term for this appears to be Symbolic regression. Our article lists a few apps. Some are free. A comparison is lacking; without examining each individually, it is not clear which ones can handle multivariate models. --Lambiam 12:52, 28 December 2022 (UTC)
 * Thank you, Lambiam. Your first post was useless to me, but now you have supplied the topic name (symbolic regression), which I can now follow up. Dionne Court (talk) 05:36, 30 December 2022 (UTC)
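
The approach described in the replies above, trying candidate formulae and ordering them by fit and simplicity, can be sketched as a toy in plain Python. This is not how any particular symbolic regression program works: the candidate list, the hand-assigned complexity scores, the penalty weight and the synthetic data are all assumptions made for illustration.

```python
import math

# Hypothetical candidate library: (name, basis function, hand-assigned complexity).
CANDIDATES = [
    ("a*x", lambda x: x, 1),
    ("a*x^2", lambda x: x * x, 2),
    ("a*x^3", lambda x: x ** 3, 2),
    ("a*sqrt(x)", math.sqrt, 3),
    ("a*exp(x)", math.exp, 3),
    ("a*log(1+x)", lambda x: math.log(1 + x), 4),
]

def fit_scale(f, xs, ys):
    """Least-squares a for y ~ a*f(x), plus the sum of squared residuals."""
    fx = [f(x) for x in xs]
    a = sum(u * y for u, y in zip(fx, ys)) / sum(u * u for u in fx)
    sse = sum((y - a * u) ** 2 for u, y in zip(fx, ys))
    return a, sse

def rank_models(xs, ys, penalty=0.01):
    """Score each candidate as misfit plus a complexity penalty; lower is better."""
    results = []
    for name, f, complexity in CANDIDATES:
        a, sse = fit_scale(f, xs, ys)
        results.append((sse + penalty * complexity, name, a))
    return sorted(results)

# Synthetic data: a square law with small, fixed perturbations standing in for noise.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
noise = [0.03, -0.02, 0.01, -0.04, 0.02]
ys = [2.5 * x * x + e for x, e in zip(xs, noise)]

ranking = rank_models(xs, ys)
best_score, best_name, best_a = ranking[0]
print(best_name, round(best_a, 3))
```

Real symbolic regression tools search a far larger space of expressions, typically built up from operators and constants rather than read from a fixed list, but the trade-off between goodness of fit and complexity is the same idea.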