Wikipedia:Reference desk/Archives/Mathematics/2022 May 4

= May 4 =

Need to deduce function of 2 variables
I performed an experiment varying two inputs d and p, resulting in a range of magnitudes of a result R. I need to discover a formula that models the process, but I'm stuck. At low values of p, the process accurately conforms to R = k1*k2 + k2*d (straight line crossing R = 0 at varying offset) where k1 is a fixed constant and k2 is a parabolic function of d. At high values of p, the result must asymptote to R = 1. Depending on d there may or may not be a broad peak in R around p = 6. Can you help in deducing a formula for R = fn(d, p) or R = fn(k2, p)? Here is experimental data (experimental errors about ±0.03):

 In each p line, d = 0.5,  1.0,  1.5,  2.0,  2.5,  3.0
 p = 3:   1.02, 0.60, 0.26, 0.38, 0.81, 1.71
 p = 4:   1.04, 0.68, 0.40, 0.62, 1.37, 2.77
 p = 5:   1.04, 0.74, 0.58, 0.76, 1.61, 3.23
 p = 6:   1.07, 0.79, 0.57, 0.82, 1.68, 3.35
 p = 7:   1.07, 0.81, 0.61, 0.85, 1.69, 3.33
 p = 8:   1.08, 0.84, 0.65, 0.86, 1.66, 3.25
 p = 10:  1.08, 0.87, 0.70, 0.89, 1.60, 3.00
 p = 12:  1.09, 0.89, 0.75, 0.84, 1.47, 2.75
 p = 14:  1.09, 0.91, 0.75, 0.83, 1.37, 2.52

Dionne Court (talk) 11:25, 4 May 2022 (UTC)
 * Curiously, Richardson published a formula for the saturation current of the form $$AT^\frac{1}{2}e^{-\frac{k}{T}},$$ "fully confirmed by experiment", instead of the form $$AT^2e^{-\frac{k}{T}}$$ currently accepted in the law bearing his name. Both Richardson's law and Ohm's law (also in its original form) are monotonic in the values of the experimental variables. Your data are not monotonic in either of the two variables, which rules out most plausible models for moderately simple physical processes. It seems that the observed value (not "measured", for how can a measured value be dimensionless?) is the resultant of several at least partially antagonistic processes. I'm not sure what the linear behaviour is you see; I see nothing vaguely impressive. First you wrote "R = k1*k2 + k2*d ... where k1 is a fixed constant and k2 is a parabolic function of d", which – assuming that "parabolic function" means quadratic function – implies a cubic function in $d$. Now you write "The process closely follows R = k1d + k1P for low p", which is not what you wrote in your original post. From the fact that the crossing point is on the $p$-axis, I gather that this linearity considers the curves for fixed $d$. But linearity appears to hold only for the very limited set of observations $p = 3, 4, 5$ and $d = 0.5, 1.0, 1.5$. If that is indeed all, I still think this is a red herring. --Lambiam 18:11, 7 May 2022 (UTC)
 * Wikipedia has an article on Curve fitting that may give you some ideas. -- Jayron 32 12:16, 4 May 2022 (UTC)
 * Thanks, Jayron. It doesn't help that much though.  One can always find a polynomial approximation, as described in that article, for any f(x) over a limited range by having sufficient degree, but that reveals nothing about the underlying process, and you end up with a lot of random-looking constants.  I would rather discover a formula that models the process, so I can understand what's actually happening.  Extrapolation beyond the measured range will then be a lot safer.  In any case, polynomial curve fitting in 2 dimensions seems to be much more difficult.
 * I already know it asymptotes to unity at high p and probably at low d, the p at low values has a straight line effect, and the effect of d is offset-shifting and parabolic, but I'm having trouble tying it all together. Dionne Court (talk) 12:56, 4 May 2022 (UTC)
 * Very different processes can give rise to such a data set, so it is not possible to deduce the process merely from the data. For a given value of $p$, the output value first goes down and then up with increasing values of $d$. The increase from $d = 2.5$ to $d = 3.0$ is pronounced. For higher values of $d$, the output value first goes up and then down with increasing values of $p$. If you have an inkling why this should be so, it may give a clue to the underlying process. BTW, polynomial curve fitting in higher dimensions is not intrinsically more difficult, but is not particularly meaningful absent a rationale for seeking a polynomial model. --Lambiam 15:39, 4 May 2022 (UTC)
 * I agree; you need to start with some model for the function, some expression with a finite number of parameters where you can adjust the parameters to fit the data. Given the asymptotes, I think you can rule out polynomial approximations. That leaves more complex models, perhaps a rational function or perhaps involving non-rational functions such as the exponential function. Finding the correct model usually requires having some theoretical understanding of the situation; in other words, the data by itself is not enough. For example, suppose you had data for the radioactivity level over time for some sample of material. Knowing that you're dealing with radioactivity, and having a theoretical understanding of how radioactivity works, you would model the data using an exponential function, perhaps a combination of exponential functions. But with just the raw data and no understanding of where it came from, there's no compelling reason to assume that an exponential function would be the best model; you could probably find a good fit for the data with a rational function. You can find a function which fits any set of values if you allow enough parameters, but too many parameters results in Overfitting, and the resulting functions are useless in terms of predicting values not in the data set. --RDBury (talk) 22:51, 4 May 2022 (UTC)
 * I thank Lambiam and RDBury for taking the trouble to write, but both have only told me what I already know: one can fit any curve over a finite range by replacing a data set (probably overfitting) with another set of data (constants). Most likely my difficulty comes from not being experienced in any curve fitting beyond straight line (as in y = a + bx), parabolic, log x and e^x, sine, tan, and similar simple functions, despite 4 years of university.
 * This data set came from a problem in electromagnetics and I do know that it must asymptote to unity for high values of p (p >> 14) - this would imply a ke^-p term or 1 - k/p term.  But beyond that I'm stuck.  Dionne Court (talk) 01:26, 5 May 2022 (UTC)
 * How do you know the value of $R$ must approach $1$ for high values of $p$? A statement of this nature cannot be based on experimental observation, so there has to be an underlying theoretical insight, which can be the first step towards a mathematical model. Apparently, $R$ is a dimensionless quantity. Are $d$ and $p$ dimensionless too? There are just too many classes of models one might consider, such as $R = k_{1}p\,e^{-k_{2}p}$, in which $k_{1}$ and $k_{2}$ depend on $d$ but not on $p$. --Lambiam 05:54, 5 May 2022 (UTC)
 * Lambiam, your guess that d, p, and R are dimensionless is correct. R is an input to another equation, and R must indeed tend to unity as p becomes large.  Indeed it does provide a clue or first step - I believe I said that, in different words above.  p is a shape factor derived from physical dimensions.
 * As far as R = k1*p*e^(-k2*p) as a possible solution is concerned, I wish it were that simple. It's not, for several reasons.  For example, it asymptotes to zero.  This could be fixed by adding 1, but that won't work for moderate d, as then it must always be above unity, and the actual process with moderate d eases up from zero towards unity, without ever exceeding unity.   Also, R is zero at various negative values of p, the negative values depending on d.  Even for a single value of d, e.g., 3, no values of k1 and k2 exist that come anywhere near a match to the experimental data, even with appropriate offsets added.  Try it and see - peak error will be about 80%, Std dev 25%.  Dionne Court (talk) 13:48, 5 May 2022 (UTC)
 * You did not answer the question what your knowledge of the asymptotic behaviour is based on. The information that $p$ can be negative is relevant. Usually, dimensionless parameters based on a shape, such as aspect ratio, are positive. Can $d$ be negative too? And what about $R$? If its value can be equal to zero, can it go lower? You appear to have more info on how the value of $R$ varies with $p$ and $d$ than provided by the table. --Lambiam 14:07, 5 May 2022 (UTC)
 * d cannot be negative.  It ranges from zero to about 3.4.   p cannot in reality be less than +2.  However, if you assume that R follows a straight line for low p (this is a very good fit to the data), that line crosses R = 0 at a small negative p. Note that that notional negative p for zero R varies with d.   R ranges from zero to a positive maximum (at p ~ 6 when d > 2).  It cannot ever be negative.  R must tend to unity as p becomes large.   This plus the experimental data is all I know.
 * You can safely assume that R asymptotes to unity as p becomes large. To fully explain why that is so would entail me posting several thousand words of electromagnetic theory, unless you already have expert knowledge on electromag theory. Dionne Court (talk) 15:44, 5 May 2022 (UTC)
 * I don't doubt that we can assume that, but knowing why it is so should help in restricting the infinitude of potential classes of model. Likewise for the dip and the surge with increasing $d$; is this something counterintuitive or somehow to be expected on a priori theoretical considerations? Presumably, $d$ is a ratio of two equidimensioned quantities, such as e.g. relative permittivity. In the absence of a model, basing expectations of the behaviour of $R$ outside the explored area on extrapolation is unsound (and even with a model this is iffy; compare the ultraviolet catastrophe resulting from extrapolating the Rayleigh–Jeans law). Is there a physical reason that $p$ cannot be less than 2, or is that a limitation of the experimental set-up? --Lambiam 16:35, 5 May 2022 (UTC)
 * Yes, there is indeed a physical reason why the shape factor p cannot be less than 2, just as a circle has the lowest perimeter length for a given contained area, compared to any other shape.  However, if you project a tangent to the R curve at low d back, it will cross R = 0 at an apparent negative p.  Thus any mathematical model must take that into account. I don't know why (perhaps a really good physicist could figure it out, but I can't - that's why I want to find a mathematical model that fits the data) and it complicates the problem - but that's what the data shows.  R is a value needed in another equation.  If R does not approach unity at high p, that equation will fail. Dionne Court (talk) 00:58, 6 May 2022 (UTC)
 * Any tangent to the curve given by the equation $y = e^{x} + 1$ will cross the x-axis, as does the tangent taken for any differentiable function other than at a stationary point, so the fact that the tangent to the curve of $R_{d,p}$ for fixed $d$ as a function of $p$ does so is not of any obvious relevance. One observation: considering the curves of $\log R_{d,p}$ for fixed $p$ as a function of $d$, they seem to become close to linear for $d > 2$, all with the same slope. The table also suggests the hypothesis that $R → 1$ as $d → 0$. --Lambiam 17:09, 6 May 2022 (UTC)
 * The fact of the tangents crossing the p-axis (x-axis) is very relevant, as at what value of p it crosses depends on d. The slopes of log R for d >= 2 are indeed linear within data experimental accuracy (interestingly, for all d, ln R is always two straight lines intersecting at d = 1.75), but I found that the slopes for d > 2 vary from 1.5 to 1.1 as p varies 2 to 14.  I get a better fit by assuming it's a parabolic relation.  I was already aware that R probably tends to unity as d goes to zero - I stated so earlier.  I actually took data for d = 0.25 but unfortunately experimental error rises dramatically when d goes less than about 0.4 or so, and I consider the d = 0.25 data thereby useless.
 * Lambiam, I appreciate the time you have expended in responding here, and sometimes being able to discuss a problem helps clarify one's mind and lead to oneself solving it, but you haven't told me anything I didn't already know. Dionne Court (talk) 03:13, 7 May 2022 (UTC)
 * It would be surprising if the value of $p$ for which the tangent crosses the x-axis did not depend on the value of $d$ (and the value of $p$ where the tangent is taken). We have tried to tell you that constructing a (meaningful) mathematical model for the process cannot be based just on the experimental data. If you knew that already, why did you post the question in this form? My attempts to coax you to provide more information about the experiment, sorely needed to give us a clue, have been to no avail. --Lambiam 11:38, 7 May 2022 (UTC)
 * The x-axis crossing point offset matters because it dramatically affects the initial slope. Thus any mathematical model must account for this - it is not unimportant as you claimed. The process closely follows R = k1d + k1P for low p as I said in my original post. As I said, I don't know why it does - it just does.
 * Sometimes, understanding of a phenomenon comes from working forward - as with Einstein deducing E = mc². He deduced that formula purely by a thought process - it was not verified by measurement for several years, when good electron accelerators were built. Unfortunately I am no Einstein.  So, I have to work backwards - find a formula that fits measured data - as did O W Richardson with electron emission and perhaps that will lead to understanding why it does what it does, as it did with electron emission.  However my knowledge/skill in curve fitting is limited, and I'm no O W Richardson either, so I asked for help in curve fitting.
 * I'm not able to provide any further insight into why it behaves as it does. If I had any knowledge or insight beyond what I posted, I probably would not have needed to post a question.
 * It is possible that gathering more data over a wider range of p will help, as may a finer stepping of d. Probably not, but I'm doing it anyway. It will take a while as it is rather tedious to do. Experimental error is a problem outside the range I posted.  Dionne Court (talk) 13:16, 7 May 2022 (UTC)
 * You said a meaningful mathematical model cannot be constructed from data alone. Sometimes that is the only option.  As with O W Richardson's formula for electron emission - he fitted a curve to the data, and when dimensions were lined up, a constant was left over - which happened to be equal to Boltzmann's constant.  A more famous example is Ohm's Law relating electric current to voltage. G S Ohm had no idea why it did what it did (and established physicists initially rubbished him for it), but he had found a formula that fitted his measured data and he turned out to be right. Dionne Court (talk) 13:26, 7 May 2022 (UTC)
 * Absolutely right, Dionne.  Back in 1998 a don in the Department of Mathematics at the Open University published in the house magazine M500 a formula which, after converting the numbers of the months to radians and taking trigonometrical ratios, produced a curve which gave the number of days in each month.   It allowed for leap year but not the century year rule.   A couple of months later he was bested by a correspondent with a purely arithmetical formula which allowed for all the exceptions.   Irv Bromberg of Toronto University published a purely arithmetical formula in Gregorian calendar on 27 May 2011.   The OU arithmetical formula (along with some others previously added) was discussed on this reference desk in 2016 Special:Permalink/724310991. 92.31.253.141 (talk) 17:51, 7 May 2022 (UTC)
 * See Overfitting, also mentioned earlier. --Lambiam 18:11, 7 May 2022 (UTC)
 * I'm not sure what to make of that, 92.31.253.141. I have a background in computer software development.  Compact arithmetic methods of producing calendars, calculating the days between 2 dates, etc., have been used in digital computers ever since there have been digital computers.  Of course, there is no fundamental physical process underlying the calendar, it's an arbitrary device of a pope.  Dionne Court (talk) 02:46, 8 May 2022 (UTC)
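The observation made earlier in the thread, that log R looks close to linear in d for d > 2 with Dionne Court reporting slopes between about 1.5 and 1.1, can be checked directly against the posted table. What follows is only a minimal Python sketch, assuming that finite differences between adjacent d columns are an adequate estimate of the slope. The names `R_TABLE`, `D_VALUES` and `log_slopes` are illustrative and do not come from the discussion. With a stated error of about ±0.03 on each R value, the computed slopes are themselves uncertain, so this is a rough check rather than a fit.

```python
import math

# Experimental table from the original post: keys are p, columns follow D_VALUES.
D_VALUES = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
R_TABLE = {
    3:  [1.02, 0.60, 0.26, 0.38, 0.81, 1.71],
    4:  [1.04, 0.68, 0.40, 0.62, 1.37, 2.77],
    5:  [1.04, 0.74, 0.58, 0.76, 1.61, 3.23],
    6:  [1.07, 0.79, 0.57, 0.82, 1.68, 3.35],
    7:  [1.07, 0.81, 0.61, 0.85, 1.69, 3.33],
    8:  [1.08, 0.84, 0.65, 0.86, 1.66, 3.25],
    10: [1.08, 0.87, 0.70, 0.89, 1.60, 3.00],
    12: [1.09, 0.89, 0.75, 0.84, 1.47, 2.75],
    14: [1.09, 0.91, 0.75, 0.83, 1.37, 2.52],
}


def log_slopes(p, d_min=2.0):
    """Finite-difference slopes of ln R versus d for one p row, using d >= d_min."""
    pts = [(d, math.log(r)) for d, r in zip(D_VALUES, R_TABLE[p]) if d >= d_min]
    return [(y2 - y1) / (x2 - x1) for (x1, y1), (x2, y2) in zip(pts, pts[1:])]


if __name__ == "__main__":
    for p in sorted(R_TABLE):
        slopes = log_slopes(p)
        # Two slopes per row: d = 2.0 -> 2.5 and d = 2.5 -> 3.0.
        print(f"p = {p:2d}: slopes of ln R = " + ", ".join(f"{s:.2f}" for s in slopes))
```

If ln R were exactly linear in d with a common slope, the two numbers printed for each p would agree with each other and across rows; how far they differ gives a feel for how well the hypothesis holds within the experimental error.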


 * Sorry about the typo in my original post. On the red herring aspect, you may be right.  My data is so sparse that an exponential curve could fit just as well.  And other curves - but one always tries the simplest things first.  We still need to account for the variable x-axis crossing offset though. Yes, parabolic = quadratic. Y = K1 + K2*X^2.
 * The subtlety in the Richardson formula story is that measured data was never accurate enough to tell which formula was right, but later theoretical analysis by someone else indicates the later form is correct. The 2 versions look very different in respect to T, but the numerical results are close enough. Which of course supports what you have been saying, but we (myself in any case) can only do what we can, just as Richardson did.
 * Re "observed" vs "measured": You are correct. R is input to another (complicated though well established) formula, the result of which is a physically measurable quantity (change in inductance).  Rather than complicate the question by requiring respondents to know a specialised branch of physics as well as math, I measured the change in resultant quantity and inverted the established formula to back calculate R, as I know the other inputs of that formula.  Dionne Court (talk) 03:18, 8 May 2022 (UTC)
 * Given a differentiable function $$f$$ in one variable, the tangent to its curve taken at a non-stationary point $$x_0$$ crosses the x-axis at $$x_1=x_0-f(x_0)/f'(x_0)$$ (see Newton's method). For $$f(p)=R(d,p),$$ defining $$R_p(d,p)=\tfrac{\partial }{\partial p}R(d,p),$$ this amounts to
 * $$p_1(d)=p_0-\frac{R(d,p_0)}{R_p(d,p_0)}.$$
 * So the tangents for different values of $$d$$ cross the x-axis at the same $$p$$ value only if the ratio of $$R(d,p_0)$$ to $$R_p(d,p_0)$$ is constant as a function of $$d$$. The fact that this ratio varies with $$d$$ is what one would expect and does not require an explanation.
 * It would seem interesting to see what happens at much higher values of $$p$$ for say $$d=3.0,$$ to see how "fast" $$R$$ tends to $$1$$. Also, obtaining data for $$d=0.6$$ may be interesting to test the viability of the hypothesis that $R → 1$ as $d → 0$. And is it possible to get closer to $$p=2$$? Good luck with your tedious experiments. --Lambiam 05:43, 8 May 2022 (UTC)
 * As stated I am in the process of getting more data. However (as I stated before), getting data for d much less than 1 is subject to increasing experimental error.  Given experimental error, data at d = 0.6 won't be usefully different to d = 0.5 which you already have.  I have data for d = 0.25. R is around unity, but the experimental error is so high R could be anywhere from 0.6 to 1.5.  d = 0 cannot be done.  I can certainly obtain data for higher values of p, but it will take a while.  Each value of d or p takes about 2 hours to do, which is all the time I have available in a typical day.  I have today obtained data for d = 0.75, R is just under and very close to unity for all p from 2 to 14 - experimental error about ±0.01.
 * I don't have tooling to make up a jig for p between 2 and 3.  It can certainly be done but it would take quite a while to arrange.  I obtained apparatus for this experiment (done at home) weeks ago when I had little idea what the R curves would look like.  Dionne Court (talk) 09:06, 8 May 2022 (UTC)
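The tangent formula $$p_1(d)=p_0-R(d,p_0)/R_p(d,p_0)$$ given above can be evaluated on the posted table to see how the crossing point moves with d. The sketch below is only illustrative: it assumes that a forward difference between two adjacent p rows (by default p = 3 and p = 4) is an acceptable stand-in for the derivative $$R_p$$, which is crude given the stated ±0.03 error, and the names `R_TABLE`, `P_VALUES` and `tangent_crossing` are hypothetical, not from the discussion.

```python
# Experimental table from the original post: keys are p, columns follow D_VALUES.
D_VALUES = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
R_TABLE = {
    3:  [1.02, 0.60, 0.26, 0.38, 0.81, 1.71],
    4:  [1.04, 0.68, 0.40, 0.62, 1.37, 2.77],
    5:  [1.04, 0.74, 0.58, 0.76, 1.61, 3.23],
    6:  [1.07, 0.79, 0.57, 0.82, 1.68, 3.35],
    7:  [1.07, 0.81, 0.61, 0.85, 1.69, 3.33],
    8:  [1.08, 0.84, 0.65, 0.86, 1.66, 3.25],
    10: [1.08, 0.87, 0.70, 0.89, 1.60, 3.00],
    12: [1.09, 0.89, 0.75, 0.84, 1.47, 2.75],
    14: [1.09, 0.91, 0.75, 0.83, 1.37, 2.52],
}
P_VALUES = sorted(R_TABLE)


def tangent_crossing(d_index, i0=0):
    """Estimate where the tangent at p0 = P_VALUES[i0] crosses R = 0.

    The derivative R_p is approximated by a forward difference to the next
    p row (an assumption of this sketch). Returns None at a stationary point,
    where the tangent is horizontal and never crosses the axis.
    """
    p0, p_next = P_VALUES[i0], P_VALUES[i0 + 1]
    r0 = R_TABLE[p0][d_index]
    r_next = R_TABLE[p_next][d_index]
    slope = (r_next - r0) / (p_next - p0)
    if slope == 0:
        return None
    return p0 - r0 / slope


if __name__ == "__main__":
    for i, d in enumerate(D_VALUES):
        p1 = tangent_crossing(i)
        shown = "none (zero slope)" if p1 is None else f"{p1:.2f}"
        print(f"d = {d:.1f}: tangent at p0 = {P_VALUES[0]} crosses R = 0 at p = {shown}")
```

For small d the difference between adjacent rows is comparable to the experimental error, so the crossing estimate there is very unstable; that is a limitation of the data spacing and not something this sketch can correct.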