Talk:Least-squares support vector machine

LS-SVM is really just Kernel Ridge Regression (a.k.a. Kernel Regularized Least Squares)
See, for example, Rifkin et al., "Regularized Least-Squares Classification", NATO-CSS 2003. Shouldn't this be mentioned? Also, the "support vector" in the name is misleading, since there is no concept of a support vector: unlike the SVM, this algorithm does not induce sparsity. Jotaf (talk) 19:24, 5 November 2012 (UTC)
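The sparsity point is easy to check numerically. Below is a minimal sketch (my own illustrative data and parameters, not from any reference) using scikit-learn, with `KernelRidge` standing in for the LS-SVM/KRR solution: the SVM's dual solution is supported on a subset of the training points, while the least-squares dual solution is generically dense.

```python
# Sketch: SVM induces sparsity in the dual; kernel ridge / LS-SVM does not.
# Dataset and hyperparameters are arbitrary illustrative choices.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.kernel_ridge import KernelRidge

X, y = make_classification(n_samples=200, n_features=2, n_redundant=0,
                           class_sep=1.5, random_state=0)
y_pm = 2 * y - 1  # regression targets in {-1, +1} for the least-squares fit

svm = SVC(kernel='linear', C=1.0).fit(X, y)
krr = KernelRidge(kernel='linear', alpha=1.0).fit(X, y_pm)

# SVM: only the support vectors carry nonzero dual coefficients.
print("SVM support vectors:", int(svm.n_support_.sum()), "of", len(X))
# KRR / LS-SVM: (K + alpha*I)^{-1} y is generically dense, so every
# training point carries a nonzero coefficient -- no "support vectors".
print("KRR nonzero dual coefs:",
      int(np.sum(np.abs(krr.dual_coef_) > 1e-12)), "of", len(X))
```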

Indeed, ordinary LS-SVMs lack the notion of support vectors altogether, but the method was derived from the SVM optimization problem, so the name stuck. thisbugisonfire (talk) 14:53, 8 July 2016 (UTC)

?
A few remarks:
 * 1) the Least Squares Support Vector Machine link in the project workshop is wrong (hyphens are missing)
 * 2) the kernel matrix link is wrong: should it be kernel_(matrix)?

Illustrations do not seem to go with this discussion
I believe the pictures fail to illustrate the difference between LS-SVM and SVM. Here is why: first, the SVM provides the same results as the LS-SVM when the data is separable, that is, when both:

$$ \xi_i = 0,  i=1,\ldots,N $$

and

$$ e_{c,i} = 0,  i=1,\ldots,N $$

hold. Then the objective functions of the L1 and L2 versions become the same. However, the illustrations appear to show the SVM separating the two classes perfectly; if the same kernel that perfectly separated the example were used for the LS-SVM, there should be no difference (no classification error). Also, the LS-SVM illustration appears to misclassify half of the points.
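To spell out the equivalence claim, the two objectives being compared are (in the notation above, with regularization constants $$ C $$ and $$ \gamma $$):

$$ \min_{w,b,\xi} \; \tfrac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} \xi_i \qquad \text{and} \qquad \min_{w,b,e} \; \tfrac{1}{2}\|w\|^2 + \tfrac{\gamma}{2} \sum_{i=1}^{N} e_{c,i}^2 . $$

When $$ \xi_i = e_{c,i} = 0 $$ for all $$ i $$, both reduce to minimizing $$ \tfrac{1}{2}\|w\|^2 $$ subject to their respective separation constraints, so the two methods should produce the same separating boundary on separable data.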

To demonstrate the difference between these two methods, a linear kernel should be used with data that is not linearly separable. Then give two examples: one with large outliers, where the LS-SVM gives those outliers more relative weight than the SVM does, and one where the outliers lie at an algebraic distance of 1.0. The SVM and LS-SVM ought to produce nearly identical results when the only misclassified samples have an error of 1.0, since $$ 1^2 = 1 $$.