Wikipedia:Reference desk/Archives/Mathematics/2020 November 17

= November 17 =

generalized curve fitting
If I have a bunch of points xi,yi, with the xi known exactly but the yi subject to some error, and if I want to fit a straight line through the points, minimizing the square of each y error (either because I like the way Δy² takes the absolute value for me, or because I want outliers to have more weight), I can of course use the classic linear regression technique, which will actually spit out coefficients for me immediately, in closed form. Similarly, if I want something other than a straight-line fit, I can do things like taking log(yi) before fitting, or do a polynomial regression.
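The closed-form technique mentioned above can be sketched in a few lines; this is a minimal pure-Python illustration of ordinary least squares for a straight line, with made-up data:

```python
# Closed-form least-squares fit of y = a*x + b: the "classic linear
# regression technique" that yields coefficients directly, no iteration.

def linear_fit(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared y-errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Illustrative data: points near y = 3x + 1, one slightly perturbed.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 4.0, 7.1, 10.0]
a, b = linear_fit(xs, ys)
```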

But what if I want something much more general? What if the y(x) function I'm trying to fit is arbitrary, perhaps with arbitrarily-many coefficients? What if I want to define my own error function, perhaps taking Δx into account as well? Finally, what if I don't insist on closed-form output, but am willing to search, to iterate?

I'm sure there are well-studied ways of doing this; I just don't know what they're called. I could try to write my own program to do the searching, but there are probably extant ones out there that already work well.

I know about linear programming but I'm looking for something more general than that, too, because I'm not interested in limiting myself to linear functions. —Steve Summit (talk) 03:08, 17 November 2020 (UTC)


 * A frequently occurring case of model fitting where the commonly taught methods do not work is that where the model to be fitted is a curve defined by the sum of a number of exponential functions; even the sum of two exponentials is difficult if their growth rates are not far apart. Similarly, fitting to a sum of sinusoids is hard. For a whole class of model-fitting problems where different methods are needed, see Energy minimization. In model fitting you have a data set of pairs $$(x_i,y_i)$$ with $$x_i \in \mathbf{X}$$ for some (further irrelevant) set $$\mathbf{X}$$ and $$y_i \in \mathbf{R}.$$ The model is defined by a parameter space $$\mathbf{S},$$ typically (a subset of) $$\mathbf{R}^n$$ where $$n$$ is the number of parameters, and a function $$f:\mathbf{S}\times\mathbf{X}\to\mathbf{R}.$$ We want to find a parametric vector $$\mathbf{s}\in\mathbf{S}$$ such that, ideally, all values $$f(\mathbf{s};x_i)$$ are close to $$y_i.$$ The goodness of fit can be given by a weighted sum of squares
 * $$G(\mathbf{s}) =_\mathrm{def}\sum_i w_i(f(\mathbf{s};x_i)-y_i)^2,$$
 * where the (non-negative) weights $$w_i$$ can be determined by any approach we fancy, one common method being to take $$w_i$$ to be the reciprocal of the expected variance in a measurement taken at $$x_i.$$ After all this preliminary work, the problem has been reduced to an abstract optimization problem:
 * Minimize the value of $$G(\mathbf{s})$$ for $$\mathbf{s}\in\mathbf{S}.$$
 * This may seem too abstract, but unless we know something more specific about how the value of $$G(\mathbf{s})$$ depends on $$\mathbf{s}$$ – as we do in linear regression problems, which however cover only a very restricted class of models – there is not much better we can do. Now the tool chest of mathematical optimization can be applied; instead of climbing to the summit of the landscape, as in hill climbing, we want to descend to the bottom of the deepest pit, but the mathematics is the same. Again, unless you can exploit some specific property of your class of models, in general you will have to use a computational iterative technique. In most cases we can analytically find the derivative of the function $$G$$ and use the conjugate gradient method.  --Lambiam 09:45, 17 November 2020 (UTC)
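The reduction described above can be sketched concretely: define $$f,$$ the weights, and $$G,$$ then iterate. The sketch below uses plain gradient descent in place of the conjugate gradient method to keep the code short, and an illustrative model $$f(\mathbf{s};x) = s_0 + s_1 x^2$$ with made-up data; the derivative of $$G$$ is found analytically, as suggested.

```python
# Weighted-sum-of-squares fitting reduced to abstract minimization:
# G(s) = sum_i w_i * (f(s; x_i) - y_i)^2, minimized by gradient descent.

def f(s, x):
    # Illustrative model with parameters s = (s0, s1): f(s; x) = s0 + s1*x^2.
    return s[0] + s[1] * x * x

def grad_f(s, x):
    # Analytic partial derivatives of f with respect to s0 and s1.
    return (1.0, x * x)

def G(s, data, w):
    return sum(wi * (f(s, x) - y) ** 2 for (x, y), wi in zip(data, w))

def fit(data, w, s, lr=0.005, steps=5000):
    # dG/ds_k = sum_i 2 * w_i * (f(s; x_i) - y_i) * df/ds_k
    for _ in range(steps):
        g = [0.0, 0.0]
        for (x, y), wi in zip(data, w):
            r = f(s, x) - y
            df = grad_f(s, x)
            g[0] += 2 * wi * r * df[0]
            g[1] += 2 * wi * r * df[1]
        s = [s[0] - lr * g[0], s[1] - lr * g[1]]
    return s

# Data generated from s = (1, 2), with unit weights.
data = [(x, 1 + 2 * x * x) for x in [0.0, 1.0, 2.0, 3.0]]
w = [1.0] * len(data)
s = fit(data, w, [0.0, 0.0])
```

The same loop works for any differentiable model; only `f` and `grad_f` change.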


 * Every curve-fitting is made (implicitly or explicitly) of two parts:
 * 1) Define a cost function that says how "good" a fit function is compared to the data
 * 2) Minimise that cost function
 * The first part is totally up to you, although of course there are standard ways to do it, and that choice will determine what results you get in the second step. Saying the second part has been well studied is a major understatement; there are tons of algorithms that perform more or less well depending on certain general properties of the cost function (whether you know its Jacobian, whether it has a single minimum, whether it is convex, etc.).
 * If you want to learn more about the topic, the keyword would be "optimization problem" but that article is rather short and the associated article mathematical optimization is IMO not very well-written. Gradient descent is a subset of the topic but the article is well-written and has lots of interesting refs.
 * If you are just looking at solving a given problem, you should just read the manual of whatever optimization library comes with your programming language of choice, learn barely enough about the subject to understand what the manual says, and pick whatever it tells you to pick. That is the math equivalent of taking whatever pills your doctor recommends rather than trying to form your own opinion about pharmacology: not intellectually rewarding, but incomparably quicker and safer. Tigraan Click here to contact me 10:40, 17 November 2020 (UTC)
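In that spirit, here is what "just use the library" might look like in Python with SciPy's `scipy.optimize.curve_fit` (an iterative nonlinear least-squares solver); the exponential model and the data are made up for illustration:

```python
# Leaning on a library instead of hand-rolling the minimizer:
# scipy.optimize.curve_fit fits an arbitrary model function by
# iterative nonlinear least squares.
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b):
    # A user-chosen model that linear regression cannot handle directly.
    return a * np.exp(b * x)

xdata = np.linspace(0.0, 2.0, 10)
ydata = model(xdata, 2.0, 0.5)   # noise-free data from a=2, b=0.5

# p0 is the starting guess for the iteration; the optional sigma argument
# would let you weight individual points.
(a, b), cov = curve_fit(model, xdata, ydata, p0=(1.0, 1.0))
```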


 * Thanks for these tips. I would never have realized that "optimization" and "mathematical optimization" were the top-level terms to search on, but now I know. And I'm glad to find the pages on hill climbing and Tabu search, which make sense and are the kind of approaches I'd been thinking about. —Steve Summit (talk) 04:46, 18 November 2020 (UTC)
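A search-based approach of the kind mentioned above can be very short. This is a bare-bones, derivative-free sketch: coordinate-wise hill climbing (descending, since it is a cost function) with step halving, applied to an illustrative line-fitting cost; all names and data are made up.

```python
# Derivative-free local search: try moving each parameter up or down by
# `step`; if no move improves the cost, halve the step and retry.

def hill_climb(cost, s, step=1.0, tol=1e-9):
    best = cost(s)
    while step > tol:
        improved = False
        for i in range(len(s)):
            for delta in (step, -step):
                trial = list(s)
                trial[i] += delta
                c = cost(trial)
                if c < best:
                    s, best, improved = trial, c, True
        if not improved:
            step /= 2.0   # no neighbor is better: refine the search
    return s

# Illustrative cost: squared error of the line a*x + b against a few points.
points = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]   # exactly on y = 2x + 1
cost = lambda s: sum((s[0] * x + s[1] - y) ** 2 for x, y in points)
s = hill_climb(cost, [0.0, 0.0])
```

Real hill-climbing or tabu-search implementations add random restarts or a memory of visited points to escape local minima; this sketch only shows the basic iterate-and-compare loop.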