Talk:Linear least squares/Archive 3

Matrices: bold or not?
Recently the matrices were made non-bold. Now an editor is making them bold again. Let's talk about it here. Which convention is standard in the statistics literature? If there is no standard practice, then which will be clearer to Wikipedia's audience? Mgnbar (talk) 12:42, 20 May 2009 (UTC)
 * I was the one who removed the bolding, mostly because I thought it ugly and that it made the article harder to read. There was some earlier discussion, where no opposition was raised to removing the bolding.


 * While there are probably some books which do use bolding, neither of the two closest to me at the moment (Wasserman's All of statistics and Gelman's Bayesian Data Analysis) use bolding for matrices.—3mta3 (talk) 16:06, 20 May 2009 (UTC)


 * Sorry guys and ladies, I was not aware of the previous discussion about boldface notation for matrices and vectors. As you may guess, I am a proponent of this method. It is not so much a question of whether it looks good or bad, but of whether it helps the reader to understand the mathematics behind it. And it helps me a lot. Let me bring an example from section 9 of the main article:


 * $$\mathbf{\hat r} = \mathbf y - X \hat{\boldsymbol\beta} = \mathbf y - H \mathbf y = (I - H) \mathbf y$$


 * $$\mathbf{\hat r} = \mathbf y - \mathbf X \hat{\boldsymbol\beta} = \mathbf y - \mathbf H \mathbf y = (\mathbf I - \mathbf H) \mathbf y$$


 * The first equation has, in my opinion, two problems: first, it uses boldface for vectors but not for matrices. And what is I - H? Subtraction of two scalars? Is I a unit matrix?


 * In the second equation there is no problem of this kind. Once our brain accepts that everything upper-case bold is a matrix, the subtraction I - H is obviously a matrix operation and I is the unit matrix.


 * I believe that the purpose of a mathematical text is to be understood and not to be beautiful. I will not touch this article on this issue unless people agree with me.


 * TomyDuby (talk) 18:18, 20 May 2009 (UTC)
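For readers who want to see the equation in this example computed concretely, here is a minimal numpy sketch. The data are invented purely for illustration; the point is that the residual obtained through the hat matrix H = X(XᵀX)⁻¹Xᵀ agrees with the direct residual y − Xβ̂:

```python
import numpy as np

# Made-up data: design matrix with an intercept and one regressor.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([0.1, 1.9, 4.2, 5.8])   # observed responses

H = X @ np.linalg.inv(X.T @ X) @ X.T   # hat matrix ("puts a hat on y")
r = (np.eye(len(y)) - H) @ y           # residuals via r = (I - H) y

# Cross-check against the direct computation r = y - X beta_hat:
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
r_direct = y - X @ beta_hat
print(np.allclose(r, r_direct))  # True
```

Note that for unweighted least squares H is both symmetric and idempotent, which is what makes (I − H) a projection onto the residual space.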


 * Personally I find boldface a helpful convention too. The other standard convention for matrices is that they are upper-case, but that's also the convention for random variables, so by itself it doesn't distinguish the two. The matrix (mathematics) article uses boldface. So does Shayle Searle's Matrix Algebra Useful for Statistics ISBN 0-470-00961-6, and Searle "is one of the first statisticians to use matrix algebra in statistical methodology" (or so his article claims, anyway). Of the four journal articles next to me at present, two use boldface (JRSS C: Applied Statistics and Controlled Clinical Trials) and two don't (The Review of Economics and Statistics and Epidemiology). Whichever it is, it should surely be consistent: either boldface for both vectors and matrices, or for neither. Qwfp (talk) 20:17, 20 May 2009 (UTC)


 * I agree that matrices and vectors should be bolded or not bolded together. My experience from math is overwhelmingly that these things are not bolded, but if people here think that bolding aids clarity, then I'm willing to go along. Mgnbar (talk) 15:58, 22 May 2009 (UTC)

Hmm, still no resolution. I guess my main objection to the bolding is that it looks really bad in inline text: it distracts the eye and makes it difficult to read a sentence (this is why italics rather than bold are used for emphasis in most books and on Wikipedia). See an old version (especially the latter half of the article). Perhaps this discussion should be brought up on Matrix (mathematics)? (which is bolded, though no discussion seems to have taken place when it was changed) —3mta3 (talk) 11:30, 23 June 2009 (UTC)


 * As Arthur Rubin suggested above, perhaps the logical place for discussion would be Wikipedia talk:Manual of Style (mathematics). Could notify WT:MATH and WT:WPSTAT. I'm not sure consistency across all articles is achievable or necessarily desirable though. I think bolding of matrices is useful in articles such as this about topics where matrices and random variables could otherwise be confused, and also has a place in articles on elementary topics such as Matrix (mathematics), but I wouldn't argue for it in more specialised articles that don't involve random variables such as QR decomposition. Qwfp (talk) 13:39, 23 June 2009 (UTC)

OLS
I hate to multiply entities without necessity, but probably Ordinary least squares should become a separate article.

As the first sentence currently states, "Linear least squares / OLS, is an important computational problem, that arises when it is desired to fit a linear mathematical model to measurements obtained from experiments."

Whereas I'm quite sure that the definition for OLS would be "Ordinary least squares is a very popular technique for estimation of a linear regression model."

See how these are two completely different aspects of the problem? The definition section uses notation incompatible with that used in linear regression, and then it even fails miserably at inverting the matrix X'X. I mean, of course you computer scientists will say that inverting that matrix is not the best way of solving the normal system; but honestly I'd like to see the argument for why I should care, given that no application ever contains more than, say, 20 regressors, and inverting a 20×20 matrix on a modern computer can be done faster than the blink of an eye. And having an ill-conditioned matrix is not a good reason, since in that case you'd be better off using ridge regression anyway.
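The commenter's point about small, well-conditioned problems is easy to check directly. The sketch below (invented data, 20 regressors as in the comment) compares the textbook explicit-inverse route with numpy's least-squares solver; for a well-conditioned design the two agree to machine precision:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 1000, 20                    # 20 regressors, as in the comment above
X = rng.standard_normal((n, k))
beta = np.arange(1.0, k + 1)
y = X @ beta + 0.1 * rng.standard_normal(n)

# Textbook route: solve the normal equations by inverting X'X explicitly.
b_inv = np.linalg.inv(X.T @ X) @ (X.T @ y)

# Numerically preferred route: orthogonal-decomposition-based solver.
b_qr, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.max(np.abs(b_inv - b_qr)))  # tiny for a well-conditioned X
```

The numerical-analysis objection only bites when X is ill-conditioned, which is exactly the case the commenter sets aside by invoking ridge regression.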

Ok, next the article proceeds to define non-linear least squares, then wastes a whole section on deriving a result which can be done in one line with matrix notation, then we discuss at great length fine points about solving the normal system (which really isn't of much interest to the average person, seeing as ALL modern software packages are already quite capable of doing it); the rest of the article is mainly unstructured, but it deals with properties and extensions.

However, I'd like to emphasize that the article never mentions the large-sample properties of the estimator: consistency and asymptotic normality. Nor does it mention any of those quantities which appear in the OLS output of a modern statistical package, such as R², the Akaike and Schwarz information criteria, the F-test, etc. I mean, a person who runs the first regression of his life might want to know what all those quantities spat out at him are, but the article doesn't give even the slightest hint.

In short, this article is about LLS as computation method; whereas OLS article should be about the estimation technique.

Stpasha (talk) 06:36, 1 July 2009 (UTC)


 * Between this article, linear regression, linear model, and least-squares estimation of linear regression coefficients there are already quite a lot of articles. Personally I think this article is better off focusing on the computational aspects (which are far from trivial even today: 20 may be a lot for a textbook example, but there are plenty of real-world problems which are orders of magnitude larger: Gaussian process regression, for example). I think a lot of your concerns would be better addressed in the linear regression article (the confidence interval stuff in the current article could be moved there as well).


 * Ahh, I see you have commented on the other talk pages. Anyway, I still think this should focus on the computational problem: there are uses for linear least squares other than linear regression (this is one aspect of the article that could be improved). —3mta3 (talk) 19:39, 1 July 2009 (UTC)

Merger proposal
Skbkekas has proposed to merge several articles devoted to linear least squares topic. Since there is no discussion as of yet, I'll start one.

It is my view that the topic is broad enough that it cannot be covered within a single article without going way over the suggested article length limit. As such it is probably not bad to have several distinct articles, as long as there is a clear understanding of what each article's topic is, so that the overlap between them is minimal and they are all cross-linked as necessary. The articles in question are the following; my vote is to delete the last article and keep the rest. // Stpasha (talk) 04:35, 14 July 2009 (UTC)
 * Linear regression (or Linear model) -- linear regression model in general, listing all the different methods for estimating this model
 * Ordinary least squares -- about the simplest and most popular method of estimating linear model: least squares.
 * Linear least squares -- about numerical methods of computation of the OLS coefficients. This article is distinct as it focuses on computational algorithms, not on the question of estimation per se. (Currently the article contains lots of redundant material and should be cleaned up).
 * Simple linear regression -- derivation of OLS estimates in case of a single regressor + a constant. Useful for practical purposes.
 * Least-squares estimation of linear regression coefficients -- ???


 * I have placed a note about this discussion at Wikipedia talk:WikiProject Statistics to try to attract contributors, otherwise this may be lost here. Melcombe (talk) 11:20, 14 July 2009 (UTC)
 * And have added notes of this discussion to some of the articles mentioned above. Melcombe (talk) 11:47, 14 July 2009 (UTC)


 * It may be good to have a new article "outline of regression analysis", similar to Outline of probability, in which all of the articles in Category:Regression analysis might be included and either grouped under appropriate headings or given a brief description (similar to the above). Melcombe (talk) 11:30, 14 July 2009 (UTC)


 * I certainly agree that multiple articles are necessary. Although I added the merger proposal to ordinary least squares, I see the value of keeping this and linear least squares separate, with the distinction being along computational versus statistical/inferential lines as suggested by Melcombe (although presently there is no such distinction, and linear least squares is much better than ordinary least squares, that is why I proposed the merger). There is also least squares which is an overview of linear and non-linear least squares.  I like the idea of a separate "outline of regression analysis" article, although alternatively, regression analysis itself could serve this role. I propose as first steps that we (1) AFD Least-squares estimation of linear regression coefficients and move any salvageable material into the other articles, and (2) clean up and distinguish linear least squares from ordinary least squares. Skbkekas (talk) 14:03, 14 July 2009 (UTC)


 * I agree both that multiple articles are necessary, and that the current structuring and division (/overlap) of content between these articles is less than ideal. There were similar discussions around 18 months ago initiated by User:Petergans, both at Talk:Least squares and then also above. Part of the problem is catering for a number of different audiences: statisticians, numerical analysts, (other) mathematicians, engineers & scientists, econometricians…. Each group has a somewhat different perspective and standard notation, which seems to be at least part of the reason for the current, erm, diversity of articles. It's great if those with the energy and a fresh perspective wish to improve matters; I'm not sure I can summon up the enthusiasm to contribute too much myself. A couple of concrete suggestions though (ok, three):
 * I can see the sense in keeping the computational aspects of linear least squares / ordinary least squares in a separate article, but the division needs to be clearly signposted, starting with the article names; the two terms are synonyms as far as I'm concerned at least, and I think one name should redirect to the other, with the 'computation aspects' article having a descriptive title something like "Computation of ordinary least squares estimates" or "Numerical methods for linear least squares". However, linear least squares has a far longer edit history and Talk page, so it might be less confusing for posterity to keep that as the main article, merge the new content from ordinary least squares and split off a new article on the computational aspects.
 * I agree with merging Least-squares estimation of linear regression coefficients (also suggested last time around, but never quite happened), probably to linear least squares as currently suggested. No need to take it to AfD though; in fact deleting the article isn't allowed if any of the content is re-used (something to do with licensing and preserving the edit history; see Merge and delete).
 * Don't forget redirects, e.g. at present Ordinary least squares regression redirects to Regression analysis (See for more) and loads of things redirect to Least squares. How much impact this has on the hitcounts at WikiProject Mathematics/Wikipedia 1.0/Frequently viewed/List I've no idea.
 * --Qwfp (talk) 19:36, 14 July 2009 (UTC)


 * It would be far easier to rename this article into numerical methods for linear least squares, and move all salvageable material into ordinary least squares (this material consists of only 2 sections: “typical uses” and “software”). Then we can have the name “linear least squares” redirect to “ordinary least squares” and this conversation will finally be over...  //  st pasha  »  03:36, 23 April 2010 (UTC)


 * Two points. (i) In my approach an "outline of regression analysis" article would be no more than a grouped and annotated list of articles, whereas Regression analysis is a readable article and should remain so. Perhaps it may be reasonable to have two lists of articles in the style of some existing pairs of "basic topics in ..." and "topics in ..." lists. (ii) Part of the problem is illustrated by the new ordinary least squares article which dives in with an explicit model rather than starting from the point that ordinary least squares is an estimation procedure which exists independently of any model, but whose properties can be evaluated for any model. Having an overview of what should be the main topics of the various articles would help to prevent such uncoordinated diversification. Melcombe (talk) 08:50, 15 July 2009 (UTC)


 * I have created Outline of regression analysis. Skbkekas (talk) 03:07, 28 July 2009 (UTC)


 * I agree with Qwfp that we should have Ordinary least squares redirect to Linear least squares. I think there has been enough time to comment, and I hope to get around to doing it (and transferring whatever is worth transferring) in a few weeks, if someone else doesn't do it. I also think we should have an article explicitly titled something like Numerical methods for linear least squares if we want an article on that subject, rather than trying to make Linear least squares be that. Eric Kvaalen (talk) 19:53, 19 April 2010 (UTC)


 * The LLS article has certainly been around for a much longer time than the OLS article, which doesn't necessarily mean it's a better article. If you compare the two, you'll see that both are mostly unreferenced; the LLS article is poorly organized and inconsistent in notation; the OLS article is organized and consistent, but somewhat lacking in content, failing to mention WLS and constrained OLS (topics which are only briefly touched on in the LLS article), as well as tests associated with the regression. If you want to merge the two articles, that's good, but the result shouldn't be worse than what we have right now.  //  st pasha  »  21:17, 19 April 2010 (UTC)


 * Honestly -- who keeps removing the calculations for a straight line? I realise that fitting a straight line is a simple example of the general linear least squares process, but it is also one of the most commonly used. It took me AGES to find out that the relevant formulae have somehow been relegated to Regression analysis, and it's not at all clear why they should be put on that page. Similarly, I invested a substantial amount of time creating the Mean and predicted response page, to explain a concept that is widely mentioned in university courses but for which no good source was previously available on the internet. The formulae for fitting a straight line SHOULD be included in one of these pages, whether it's under linear regression, linear least squares, etc. Or at the very least, there should be a link from the linear least squares page and the least squares page to a page on fitting a straight line. Perhaps it needs its own page, but given that this is the most common type of regression in the social sciences (and often in physics, chemistry, etc. as well), one would think that the page should a) be easily locatable, b) contain explicit formulae for the estimates of the slope and the intercept of the line, c) contain the formulae giving the standard errors, d) contain the formula which adjusts the standard errors for over/underdispersion of the data points about the straight line. Velocidex (talk) 23:19, 20 July 2009 (UTC)


 * We do have a dedicated page for this: Simple linear regression. It is linked both from the linear regression article and from the ordinary least squares article.  Well, maybe those links aren't very easy to spot, in which case they must be made to STAND OUT MORE somehow; however, it would be wrong to include the solution of the simple OLS case in an overview article such as linear regression.


 * Aaah excellent... sorry, I couldn't find that page. I have no issues with the straight line formulae being put on a different page, provided they are easily locatable... I know I have had physics students ask me for more info on the straight line formulae, so I know lots of people want to understand them better :) Velocidex (talk) 23:24, 21 July 2009 (UTC)


 * I’m not entirely sure what you meant by “similarly, I have invested a substantial amount of time creating the Mean and predicted response page …” — that is to say, exactly which point are you trying to prove “by similarity”? And since you mentioned that article, I'm afraid it contains factually incorrect material when considering the “predicted response”.  You see, the “mean response” is a function of only $$\hat\alpha$$ and $$\hat\beta$$; as such, $$\hat{y}_d$$ is asymptotically normal, and therefore its confidence interval may be constructed once we know its asymptotic variance.  I even gave such a formula in the OLS article; see the “Large sample properties” section.  On the other hand, your “predicted response” is a function of $$\varepsilon_d$$, and as such it is not asymptotically normal, and no inference about its confidence bands can be made (unless, of course, you're willing to assume that the errors are all normally distributed; however, such an assumption is usually frowned upon in modern statistics). ...  st pasha  » talk  » 00:28, 21 July 2009 (UTC)


 * Agreed that the mean response is asymptotically normal, while the predicted response depends on the particular distribution of the $$\epsilon$$s. I agree that the normality-of-errors assumption is frowned upon, but nevertheless it remains approximately true for a wide class of problems. Furthermore, it is at least a starting point to work from. Once you move away from this assumption, you need to get into robust statistics, which are numerically tractable but (in general) not analytically tractable. If you're a researcher and you want to include a more general error distribution, you can just do everything numerically. On the other hand, if you're a 2nd- or 3rd-year undergraduate psychology student, these formulae crop up everywhere. They are a starting point, not the final word, on predicted response. The appropriate thing to do is to caveat the derivation with "this assumes the errors are normally distributed". Velocidex (talk) 23:24, 21 July 2009 (UTC)


 * That is certainly true. The normality assumption is popular, easy to understand and work with, and can be valid in a variety of situations (depending on the field you're working in). Let whoever has never used the normality assumption cast the first stone.
 * However, I'm trying to point out that having two results side by side — one asymptotically valid for any distribution of the errors, the other applicable only when the errors are normal — is very misleading. This distinction should at least be emphasized and made clear. Even better would be to first consider normal errors, and after that to provide a method for calculating the predicted-response confidence interval in the case of arbitrary errors (probably the bootstrap). ...  st pasha  » talk » 12:51, 28 July 2009 (UTC)
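The mean-response versus predicted-response distinction debated in this thread can be made concrete with a small sketch. Assuming normally distributed errors and simple linear regression (all data invented for illustration), the variance of the mean response at a point x₀ is s²·x₀ᵀ(XᵀX)⁻¹x₀, while the predicted response for a single new observation adds the error variance s² on top:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
x = rng.uniform(0, 10, n)
X = np.column_stack([np.ones(n), x])      # intercept + slope design
y = 2.0 + 0.5 * x + rng.normal(0, 1.0, n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat
s2 = resid @ resid / (n - 2)              # error-variance estimate

x0 = np.array([1.0, 5.0])                 # new point: intercept term, x = 5
XtX_inv = np.linalg.inv(X.T @ X)
var_mean = s2 * x0 @ XtX_inv @ x0         # variance of the fitted mean response
var_pred = var_mean + s2                  # variance of a single new observation

print(var_mean < var_pred)  # True: the prediction interval is always wider
```

The extra s² term is exactly the piece that requires a distributional assumption on the individual error, which is the crux of the disagreement above.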


 * I think we need an article on OLS/least squares/any model that is given by $$E(Y|X) = X\beta$$, and there is no reason to have more than one article. I think an article on estimation of $$Y=X\beta+\epsilon$$ with $$\epsilon$$ distributed iid $$N(0,\sigma)$$ would cover that.
 * I can understand why we have the article on "Simple linear regression"; I think that article should be viewed as a simpler version of the OLS article, and we should not remove any content from the OLS article just because it is also there. Starting slow is always a good idea, even in the advanced article.
 * Even then, the article regression analysis is also sensible, as in generalized linear models, though this is not mentioned in the first article. It would almost make sense to have a concept disambig page. PDBailey (talk) 01:38, 28 July 2009 (UTC)

Broken link
The link associated with the words "example of linear regression" is broken; apparently the page it referred to was modified. RVS (talk) 00:36, 2 September 2009 (UTC)

Under-determined linear systems
The article erroneously restricts least squares to over-determined systems.

Optimization specialists use least squares for under-determined systems. Least squares solutions of nonnegatively constrained problems are discussed in Åke Björck's book (SIAM) or in George Dantzig's linear programming book (with Thapa, volume 2). Such problems have also become popular because of genetic applications where there are more parameters than observations; see the lasso estimator etc.

(Numerical analysts use least squares for over-determined systems.) Thanks. Kiefer.Wolfowitz (talk) 14:07, 29 March 2010 (UTC)

Requested move
Numerical methods for linear least squares → Linear least squares — Return to original name as content has little about numerical methods and original name was reasonable Melcombe (talk) 15:49, 20 October 2010 (UTC)


 * This discussion seems to be stale, and there doesn't seem to be consensus (after a very long time) for simply moving the page as requested, though other changes and reorganization seem to be desired. I'm therefore closing the move request to get it off the WP:RM backlog - feel free to reactivate it if you still think it's going anywhere.--Kotniski (talk) 13:55, 6 December 2010 (UTC)

Survey

 * Feel free to state your position on the renaming proposal by beginning a new line in this section with *'''Support''' or *'''Oppose''', then sign your comment with ~~~~. Since polling is not a substitute for discussion, please explain your reasons, taking into account Wikipedia's policy on article titles.


 * Support. This article has little to do with numerical methods, and there was no discussion of the move.  Suggest deletion of the redirect after the return move, as it's incorrect.  — Arthur Rubin  (talk) 17:30, 20 October 2010 (UTC)


 * support. I was totally surprised by the content of this page, given its title. The old title was a better description of the content. However, I think making a page at this spot would make sense, and there is enough on this page to make a start-class article with mostly copy and paste. 018 (talk) 20:17, 20 October 2010 (UTC)
 * I'm OK with splitting off the numerical stuff into a stub after the article is moved back. — Arthur Rubin  (talk) 00:21, 21 October 2010 (UTC)
 * Arthur, Linear least squares currently redirects to Ordinary least squares. Do we need both articles?  --Born2cycle (talk) 21:35, 23 November 2010 (UTC)
 * I'm OK with there being only one article. However, others have stated that "ordinary least squares" is a statistics term, and "linear least squares" is a mathematics term, which happen to have the same equations, but not the same meaning.  I do not agree, but I'm pointing out the conflicts we should expect.  — Arthur Rubin  (talk) 21:47, 23 November 2010 (UTC)


 * Partially support. I think that numerical methods for linear least squares can continue to exist, but the "motivational example", and everything from "properties of least squares estimators" to the end should be moved to recreate a new (shorter than the earlier version) "linear least squares" article. Skbkekas (talk) 22:01, 1 November 2010 (UTC)


 * Oppose. There was a lengthy discussion a long time ago (both on this Talk page, and on the talk page of the WPStatistics project) that we have too many articles devoted to linear regression, and all those articles have significantly overlapping content. It was decided that each article should be given its own clearly defined topic, which should first of all be reflected in the name of the page. Then each article should be cleaned up to remove the content overlap (or at least to summarize the overlap in small subsections, with the proper template). This is why this article got renamed, and this is why it has a big banner at the top. There is no point in renaming this article to “linear least squares”, since we already have a perfectly reasonable linear least squares article.  //  st pasha  »  21:46, 20 October 2010 (UTC)


 * Oppose, and let the Wikiproject members get on with it. Rome wasn't built in a day. Andrewa (talk) 12:26, 22 October 2010 (UTC)


 * Support. An article with the "numerical least squares" title should display the brilliant findings of Åke Björck (and mention the stable updating algorithms & worst-case worries of Gene Golub). This page has little numerical content. Kiefer.Wolfowitz (talk) 22:30, 1 November 2010 (UTC)

Discussion

 * Any additional comments:

Wow, placed a "clean up" tag. That is a... really underwhelming effort. I think it would make more sense to redirect the page and move this to a sandbox for now. OR I could do a hatchet job on it where I remove everything that isn't numerical OLS, and the scraps could be cleaned up... either of those is a WAY better option than leaving this in the article space. 018 (talk) 23:34, 20 October 2010 (UTC)
 * Wait, are you blaming me for not working on an article whose topic I know rather superficially? //  st pasha  » 03:43, 21 October 2010 (UTC)
 * stpasha, I'm not trying to blame anyone for anything (sorry if I was too strong in my wording). I'm just bemoaning the poor documentation of the previous decisions leading to this additional long discussion. Again, if this article was meant to be reworked so much, I think doing a terrible job and putting a cleanup tag would have made more sense. But I'm just one editor. 018 (talk) 15:22, 21 October 2010 (UTC)

I read this talk page before I commented. There's no discussion of this move; in fact, there seems to be consensus that OLS and LLS should be the same article. — Arthur Rubin (talk) 00:11, 21 October 2010 (UTC)
 * Which they are, as of right now (LLS redirects to OLS). I'm not sure what your point is? //  st pasha  » 03:43, 21 October 2010 (UTC)
 * This article was at LLS, and hasn't changed significantly since then. This article (LLS) should be merged with OLS, and if it is considered appropriate to create a "numerical methods" article, the numerical methods should be split out from the existing articles.  There isn't much in the way of numerical methods here.  — Arthur Rubin  (talk) 06:43, 21 October 2010 (UTC)
 * The articles were in fact merged, though it is hard to recognize right now because they have evolved significantly since then. The section “Motivational example” was not taken, because of its questionable motivational quality, and because the article already contains an example taken from the “simple linear regression” article. “The general problem” already exists in the OLS. “Uses in data fitting” is just a stray section which should have been in “Typical uses and applications”. “Derivation of the normal equations” was moved to the “proofs involving OLS” page. “Computation” is the numerical part, and was supposed to be left in this article. “Properties of the LS estimators” was merged into the OLS. “Weighted LS” was not merged, because it belongs to the WLS article. “Parameter errors, correlation and confidence limits” was merged. “Residual values and correlation” was merged. “Objective function” simply makes no sense, but its purported idea will be included in the “OLS#Testing” subsection when that is eventually added. “Constrained linear least squares” is another extension of the method, so it should have its own page and be mentioned here only briefly. “Typical uses and applications” deserves to be merged, though it wasn’t, mainly because its listing is so vastly incomplete. “Software for solving linear least squares problems” was not merged because it is meaningless — there is NO statistical package that is unable to solve the LLS problem. “External links” was not merged because it violates WP:EL anyway, so why bother. //  st pasha  » 03:26, 22 October 2010 (UTC)


 * The problem is that the content of Ordinary least squares was rearranged to make it look as if it is applicable only when there is a statistical model around and one knows that that model is correct. And you really should stop thinking of Wikipedia as something dedicated to statistics. Why talk about statistical packages, when routines for solving least squares problems are typically found in the "optimisation" section of subroutine libraries rather than under statistics? Melcombe (talk) 09:11, 22 October 2010 (UTC)


 * The redirect of LLS to OLS was foolish, since OLS is not what "LLS" means. Perhaps a way forward would be to rename the present version of what is here to "linear least squares (mathematics)" (perhaps reducing any residual stats content even more) and to have a "linear least squares (statistics)" article that, for now, could be a disambig page for OLS and simple linear regression, and for WLS and GLS articles if they exist. Of course, things are complicated by there being the content of Proofs involving ordinary least squares to consider in considering the overall content of the various articles. The article Outline of regression analysis presently has Linear least squares under the heading "Non-statistical articles related to regression", which rather contradicts the redirect to OLS. There are also some other articles presently containing links to sections of a former version of Linear least squares that have now been broken by the renaming ... I doubt that those are easily fixed. Melcombe (talk) 09:03, 21 October 2010 (UTC)
 * The relationship between the topics that you’ve mentioned is the following: “simple linear regression” ⊂ OLS ⊂ WLS ⊂ GLS (meaning that each next step is the generalization of the previous, and includes it as a particular case). Currently LLS is identified with the OLS. Are you saying that the term LLS is more general than OLS, and should be identified with GLS (myself, I don't know)? Also, there is another article which was somehow omitted from the previous discussion: the Least squares. As for the “Proofs involving ordinary least squares” — that’s a very awkward name, assigned by people who thought slashes are inappropriate in articles’ names. I’d rather have it renamed into the Appendix:Ordinary least squares instead... //  st pasha  » 03:26, 22 October 2010 (UTC)
 * I'm saying that "linear least squares" need not include anything statistical by way of statistical models, and is of use to many who would use it on a non-statistical basis. There used to be a correspondence between "linear least squares" and Non-linear least squares (note the similarity of names) in that both addressed the problems on a mathematical rather than statistical basis. The article Outline of regression analysis was created (see much earlier discussion above) to provide an overview of what sort of thing would be found in which article. And yes, Least squares is included in Outline of regression analysis. Of course that outline itself can be changed, but it should at least provide a guide for those contemplating ad-hoc changes to article structures. Melcombe (talk) 09:02, 22 October 2010 (UTC)

There is no non-statistical basis here. When someone says “Oh, I just want to fit a linear function to the data, so I’ll just minimize the norm ||y − Xβ||²”, what he really means is that there is an ε = y − Xβ, and that he minimizes the norm of this ε; and then, if he stops to think why, he would have to interpret this ε as a random variable, and conclude that the method is optimal when ε is in fact normally distributed. This is how the method of least squares was formulated by Gauss, and this is how the normal distribution was in fact discovered. Even if you use the LLS method without formulating any statistical model, the model is still implicitly there, just as we talk about random variables without specifying the underlying probability space.
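As an editorial illustration of the point above (the data and names here are hypothetical, not from the article), minimizing ||y − Xβ||² is exactly what `numpy.linalg.lstsq` does, and the minimized ε is the residual vector:

```python
import numpy as np

# Hypothetical data: y = 2 + 3x plus small noise (values chosen for illustration)
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
X = np.column_stack([np.ones_like(x), x])           # design matrix
beta_true = np.array([2.0, 3.0])
y = X @ beta_true + 0.1 * rng.standard_normal(50)   # epsilon = y - X beta

# Minimize ||y - X beta||^2 directly
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# The minimized epsilon; at the optimum it is orthogonal to the columns of X
eps = y - X @ beta_hat
```

Whether or not one writes down a statistical model, ε exists as soon as the norm is minimized, which is the commenter's point.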

As for the non-linear least squares article — it is a very poorly written article, and I feel sorry for it. The majority of the content is devoted to the description of the Newton–Raphson method, which should really be mentioned only briefly. If you remove all the irrelevant stuff, it will be only a stub. The actually important information, about the statistical properties of the NLLS method, is missing altogether. //  st pasha  »  10:54, 22 October 2010 (UTC)

derivation of normal equations
I see no reason to include this in an article titled "numerical methods..." The fact is that the normal equations are mainly useful for pointing out how terribly poor their numerical properties are. They are the straw man for jumping off to how you should actually solve Ax=b. I would also expect they are covered in other articles. I say delete the derivation. 018 (talk) 17:18, 23 October 2010 (UTC)
 * There seems to be a disagreement as to what the subject of the article is; it has never really been "numerical methods". — Arthur Rubin  (talk) 22:22, 23 October 2010 (UTC)
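For reference, the instability complaint above is easy to demonstrate numerically. A minimal sketch (the Vandermonde design matrix is an illustrative choice, not something from the article): in the 2-norm, cond(XᵀX) = cond(X)², which is why forming the normal equations loses roughly half the available digits compared with QR/SVD solvers such as `np.linalg.lstsq`.

```python
import numpy as np

# Polynomial (Vandermonde) design matrix: a classic ill-conditioned example
x = np.linspace(0.0, 1.0, 30)
X = np.vander(x, 8)

# Forming the normal-equations matrix squares the condition number
cond_X = np.linalg.cond(X)
cond_XtX = np.linalg.cond(X.T @ X)   # equals cond(X)**2 in the 2-norm
```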

Is hat matrix symmetric?
Under the section "Weighted Linear Least Squares"::"Residual values and correlation", I think that the statement that the hat matrix is symmetric is incorrect. The product $$HW^{-1}$$ is symmetric.

--Csgehman (talk) 23:21, 1 July 2011 (UTC)Curtis Gehman, 2011-07-01
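The claim can be checked numerically. A sketch under the weighted-least-squares definition H = X(XᵀWX)⁻¹XᵀW (the data and weights below are hypothetical): H itself is generally not symmetric when the weights are unequal, while HW⁻¹ = X(XᵀWX)⁻¹Xᵀ is, supporting the comment above.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((10, 3))       # arbitrary full-rank design
w = rng.uniform(0.5, 2.0, 10)          # unequal positive weights
W = np.diag(w)

# Weighted-least-squares hat matrix: H = X (X^T W X)^{-1} X^T W
H = X @ np.linalg.solve(X.T @ W @ X, X.T @ W)

# H is idempotent but generally NOT symmetric for unequal weights...
H_is_symmetric = np.allclose(H, H.T)

# ...whereas H W^{-1} = X (X^T W X)^{-1} X^T is symmetric
HWinv = H @ np.diag(1.0 / w)
HWinv_is_symmetric = np.allclose(HWinv, HWinv.T)
```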

complex values
When the data are complex, naïvely squaring the errors can make them negative. Then what? —Tamfang (talk) 01:29, 10 September 2011 (UTC)
 * Please use the Wikipedia help desk for such questions.
 * (You should read about complex inner product spaces or Hilbert spaces. You might consult the Springer Lecture Notes in Statistics on Graphical Models and the Complex Normal Distribution, written by Danish mathematical statisticians. The Brockwell & Davis yellow textbook on time-series analysis may discuss complex Hilbert spaces.)
 * Kiefer .Wolfowitz 02:40, 10 September 2011 (UTC)


 * I'm guessing that by Help Desk you mean Reference Desk/Mathematics. You don't think such a generalization would be worth mentioning in the article? —Tamfang (talk) 21:39, 12 October 2011 (UTC)


 * Complex-valued extensions are important in time series, spatial statistics, and signal processing. People taking those courses have studied linear algebra. Anybody who has taken a sophomore course in linear algebra (using e.g. Anton's book) knows that in a vector space over the field of complex numbers the inner product uses the complex conjugate. Thus, a warning about squaring complex numbers seems pointless here.
 * However, I may be wrong: Can you find a reliable reference, preferably an encyclopedia, that discusses this? Kiefer .Wolfowitz 08:37, 13 October 2011 (UTC)


 * "Anton's book" puzzled me for a moment: Anton is my name, and I didn't mention a book ....
 * When I tried to reinvent l.sq. for complex-valued polynomials, I ran into the problem that the conjugate has no derivative. (I resolved that problem once before, several years ago, but could not remember how.) —Tamfang (talk) 00:29, 14 October 2011 (UTC)


 * Even if the function values are complex, the minimization problem remains a real problem. So you have to consider one complex function as two real valued functions, one complex argument as two real arguments etc. Some of the structure remains, so in the end it is still possible to express the resulting expressions using complex numbers. E.g.,
 * $$\frac{d}{ds}|p(z)+sq(z)|^2=2\,\operatorname{Re}\left(\overline{(p(z)+sq(z))}\,q(z)\right)$$.
 * --LutzL (talk) 09:30, 14 October 2011 (UTC)
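The discussion above can be illustrated with a short sketch (hypothetical complex data): using the conjugate transpose Xᴴ in place of Xᵀ gives the complex normal equations, and the "squared" errors |ε|² = ε̄ε are real and nonnegative, so the concern about negative squares does not arise.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((20, 2)) + 1j * rng.standard_normal((20, 2))
beta_true = np.array([1.0 + 2.0j, -0.5j])
y = X @ beta_true                      # noiseless, so the fit should be exact

# Complex normal equations use the conjugate transpose X^H, not X^T
beta_hat = np.linalg.solve(X.conj().T @ X, X.conj().T @ y)

# |eps_i|^2 = conj(eps_i) * eps_i is real and nonnegative by construction
eps = y - X @ beta_hat
sq_norm = (eps.conj() @ eps).real
```

(`np.linalg.lstsq` accepts complex arrays directly and implements the same conjugate inner product.)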

Split2011
There has been a split template in the article for several months (split out section). I have added this section to prompt discussion. Melcombe (talk) 13:37, 4 November 2011 (UTC)

Notation of matrix variables
Uniform notation should be used for matrix variables in this article: American style (capital bold letters) or European style (capital italic letters). Kkddkkdd (talk) 09:09, 13 November 2011 (UTC)

Limitations
In the limitations section, there is the following:

"...whereas in some cases one is truly interested in obtaining small error in the parameter $$\mathbf{\hat{\boldsymbol{\beta}}}$$, e.g., a small value of $$\|\hat{\boldsymbol{\beta}}-\hat{\boldsymbol{\beta}}\|$$..."

Shouldn't this section be referring to $$\|\boldsymbol{\beta}-\hat{\boldsymbol{\beta}}\|$$? --128.117.194.137 (talk) 21:50, 23 April 2012 (UTC)
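The limitation being quoted (small residual does not imply small parameter error, written correctly as ||β − β̂||) can be demonstrated with a short sketch (hypothetical, nearly collinear data):

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(0.0, 1.0, 40)
X = np.column_stack([x, x + 1e-6])     # two nearly identical regressors
beta = np.array([1.0, 1.0])
y = X @ beta + 1e-4 * rng.standard_normal(40)   # tiny data perturbation

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

resid = np.linalg.norm(y - X @ beta_hat)      # fitted values: very close to y
param_err = np.linalg.norm(beta - beta_hat)   # parameters: badly determined
```

The fit reproduces y almost perfectly, yet the individual coefficients can be far from the truth because the design matrix is ill-conditioned.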