Talk:F-score

This stub almost completely duplicates Information_retrieval. Arguably, the two should be merged.

I think that recall should not be described jointly with the F1 Score. The redirect link from Recall to F1 Score should be suppressed.

In this way, I have corrected the redirect link from recall; instead of being linked to "F1 score", it is now linked to "Information Retrieval", which has a section with a Recall Description.

—

The sentence "Two other commonly used F measures are the $$F_{2}$$ measure, which weights recall twice as much as precision, and the $$F_{0.5}$$ measure, which weights precision twice as much as recall." is wrong. It's easy to see that $$\beta=2$$ weights recall four times as much as precision (q.v. the German article). 141.89.52.220 (talk) 12:29, 19 August 2009 (UTC)


 * I do not agree (anymore), neither with the claim that $$\beta>1$$ puts more weight on precision, nor with the factor (twice, four times, ...). Perhaps it would be clearer if the formula were written $$\frac{1}{F_\beta} = \frac{1}{1+\beta^2} \left ( \beta^2 \cdot \frac{1}{recall} + 1 \cdot \frac{1}{precision} \right )$$. Here, we can see that the reciprocal of recall is weighted $$\beta^2$$ times the reciprocal of precision. Have a look at a visual illustration of the situation. It shows that for $$\beta>1$$, the gradient is more vertical for large parts of the plot, which means that recall is more important there.--Jonas Wagner (talk) 18:23, 3 March 2011 (UTC)


 * I think you meant to say the lines for F2 are more horizontal, not vertical. Being horizontal means the value of precision matters less, which agrees with your core claim. — Preceding unsigned comment added by 78.22.80.252 (talk) 15:03, 23 August 2012 (UTC)

This statement, "$$F_\beta$$ measures the effectiveness of retrieval with respect to a user who attaches β times as much importance to recall as precision", as quoted directly from van Rijsbergen's book (linked from this article), appears to be in error. As per analysis above and also Chapter 8 of Manning et al.'s IR book (see here), I believe the "β times as much importance" part should read "β² times as much importance" instead. --unkx80 (talk) 15:19, 18 August 2013 (UTC)
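
The two readings discussed above ("β times" versus "β² times") can be checked numerically. The following is only a minimal sketch, not taken from the article or from either book. The function names are hypothetical, and the finite-difference step `h` is an arbitrary choice. It confirms that the reciprocal form quoted above matches the usual definition, and it estimates how much more a small change in recall moves $$F_\beta$$ than the same change in precision. Algebraically that ratio is $$\beta^2 \cdot (precision/recall)^2$$, so it equals $$\beta^2$$ where precision = recall and equals 1 where recall = β · precision.

```python
def f_beta(precision, recall, beta):
    """Usual definition: (1 + beta^2) * P * R / (beta^2 * P + R)."""
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def f_beta_reciprocal_form(precision, recall, beta):
    """Reciprocal form from the comment above:
    1/F = (beta^2 / recall + 1 / precision) / (1 + beta^2)."""
    b2 = beta ** 2
    inverse_f = (b2 / recall + 1 / precision) / (1 + b2)
    return 1 / inverse_f


def sensitivity_ratio(precision, recall, beta, h=1e-6):
    """Central finite-difference estimate of (dF/d recall) / (dF/d precision)."""
    d_recall = (f_beta(precision, recall + h, beta)
                - f_beta(precision, recall - h, beta)) / (2 * h)
    d_precision = (f_beta(precision + h, recall, beta)
                   - f_beta(precision - h, recall, beta)) / (2 * h)
    return d_recall / d_precision


if __name__ == "__main__":
    # Both forms agree for an arbitrary (illustrative) precision/recall pair.
    print(f_beta(0.6, 0.3, 2), f_beta_reciprocal_form(0.6, 0.3, 2))
    # Ratio of sensitivities where precision == recall.
    print(sensitivity_ratio(0.5, 0.5, 2))
    # Ratio of sensitivities where recall == beta * precision.
    print(sensitivity_ratio(0.3, 0.6, 2))
```

Under these assumptions, which factor one reports depends on what "importance" is taken to mean: the weight on the reciprocals is β², while the point at which precision and recall trade off one-for-one is recall/precision = β.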

—

What is the point of showing the Diagnostic Testing Diagram? F-score does not even appear in it. And it is presented with no discussion or a link back to Confusion_matrix. 70.166.151.52 (talk) 16:52, 5 April 2017 (UTC)
 * On the contrary, I clicked on the Talk tab to come here and comment on how exceptionally clear I found the Diagnostic Testing Diagram's definition of F1 Score. In the lower right corner you'll see the top-level definition, that of F1 score. Definitions upon which it depends are laid out in a tabular form of cells, but each of the cells includes not only the term being defined, but also its various synonyms that one might think mean something else if not warned of their redundancy. The only critique I can come up with is that the table should perhaps have been flipped to exchange the top right and bottom left corners. But that is so minor a critique compared to a rare example of how technical Wikipedia articles should be written in general that I hesitated to bring it up at all. Jim Bowery (talk) 22:52, 19 December 2019 (UTC)

Requested move 2 October 2020

 * The following is a closed discussion of a requested move. Please do not modify it. Subsequent comments should be made in a new section on the talk page. Editors desiring to contest the closing decision should consider a move review after discussing it on the closer's talk page. No further edits should be made to this discussion. 

The result of the move request was: moved. (closed by non-admin page mover) — Nnadigoodluck  █ █ █  15:59, 19 October 2020 (UTC)

F1 score → F-score – The article covers both the F1 score and the more general F-score or Fβ score. The F1 score is a special case of an Fβ score where β=1. As far as I can tell the most common spelling is "F-score" with a dash rather than "F score". Marko knoebl (talk) 17:07, 2 October 2020 (UTC) —Relisting. BegbertBiggs (talk) 18:54, 10 October 2020 (UTC)

"Equivalent" formulations for F-1 don't account for division by zero
The three formulations:


 * $$F_1 = \frac{2}{\mathrm{recall}^{-1} + \mathrm{precision}^{-1}} = 2 \frac{\mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}} = \frac{2\mathrm{tp}}{2\mathrm{tp} + \mathrm{fp} + \mathrm{fn}}$$

are undefined at different values:

• The first is undefined when either precision or recall (or both) is zero, or when there are only true negatives (i.e. a classifier perfectly predicts only negative examples on a completely negative dataset)

• The second is undefined when precision and recall are both zero (or when either of them is itself undefined), or when there are only true negatives

• The third is undefined only when there are only true negatives.

So they are not completely equivalent. Connorboyle (talk) 23:41, 24 December 2023 (UTC)
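
The difference between the three formulations can be illustrated with a short sketch. It is only a sketch under stated assumptions: the function names are hypothetical, and returning `None` for an undefined value is a convention chosen here, not something the article prescribes. A value counts as undefined wherever the literal formula would divide by zero or would need a precision or recall that is itself 0/0.

```python
def safe_div(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return None if denominator == 0 else numerator / denominator


def precision_recall(tp, fp, fn):
    """Precision = tp / (tp + fp), recall = tp / (tp + fn); None if undefined."""
    return safe_div(tp, tp + fp), safe_div(tp, tp + fn)


def f1_harmonic(precision, recall):
    """First form: 2 / (1/recall + 1/precision)."""
    if precision is None or recall is None or precision == 0 or recall == 0:
        return None  # a reciprocal of zero (or of an undefined value) is needed
    return 2 / (1 / recall + 1 / precision)


def f1_product(precision, recall):
    """Second form: 2 * P * R / (P + R)."""
    if precision is None or recall is None or precision + recall == 0:
        return None  # undefined input, or division by P + R == 0
    return 2 * precision * recall / (precision + recall)


def f1_counts(tp, fp, fn):
    """Third form: 2 tp / (2 tp + fp + fn)."""
    return safe_div(2 * tp, 2 * tp + fp + fn)


if __name__ == "__main__":
    # Illustrative confusion-matrix counts (tp, fp, fn); true negatives never enter.
    for counts in [(1, 1, 1), (0, 1, 1), (0, 0, 1), (0, 0, 0)]:
        p, r = precision_recall(*counts)
        print(counts, f1_harmonic(p, r), f1_product(p, r), f1_counts(*counts))
```

Viewed as functions of precision and recall alone, the first two forms differ when exactly one of the two is zero: `f1_harmonic(0.0, 0.5)` is undefined while `f1_product(0.0, 0.5)` is 0. When precision and recall are derived from counts, that case cannot arise (tp = 0 makes each of them either 0 or undefined), so in this sketch the first two forms are both undefined whenever tp = 0, while the count-based form is undefined only when tp = fp = fn = 0.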