Talk:Specificity (tests)

In gene structure prediction literature, specificity has traditionally been computed as $$S_p = \frac{TP}{TP+FP}$$. That is, $$S_p$$ is the proportion of predicted coding nucleotides that are actually coding. 22:31, 15 Aug 2004


 * The problem with using specificity that way is that it does not tell you much about the procedure or test, since it depends on what proportion of the underlying population is in fact positive or negative. --Henrygb 19:36, 22 Nov 2004 (UTC)


 * It would be nice to see a credible reference if you have one. I think you are talking about positive predictive value. --Henrygb 02:36, 12 Mar 2005 (UTC)


 * I'm not sure what you mean by credible reference, but Burset and Guigo (1996) defines specificity in this way. That's not to say that it's correct.  As they say in a later paper (free full text), "we essentially compute the proportion of actual coding nucleotides/exons that have been predicted correctly-(which we call Sensitivity) and the proportion of predicted coding nucleotides/exons that are actually coding nucleotides/exons (which we call Specificity)".  Thus, it may not be correct, but it has become the standard.  That said, the gene finding definition of specificity appears to be the same as precision from Information Retrieval, i.e. "(number of relevant documents retrieved) / (number of documents retrieved)".  This corresponds to TP/(TP+FP).  Thus, there seems to be some conflict in this article, which states that specificity is the same as precision.  It is not.  24.63.115.69 06:55, 15 September 2005 (UTC)


 * Fair enough, I asked for a reference and you kindly provided one. I still think it is an error: the CDC in their pages on Genomics use the standard definition. Meanwhile someone else manages to use "specificity" to produce a number 10^13,167,898.  I read the article as saying that positive predictive value and precision are equivalent, and that both depend on the underlying population, which specificity does not. --Henrygb 13:49, 15 September 2005 (UTC)


 * $$\frac{TP}{TP+FP}$$ is precision, also known as positive predictive value: 'the proportion of items classified positive which truly are positive'.  Specificity is the true negative rate, $$\frac{TN}{TN+FP}$$, 'the proportion of negative items classified negative'.  The 1996 Burset and Guigo paper simply got them mixed up.
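
The distinction drawn in this thread can be checked numerically. The following is a minimal Python sketch, assuming a hypothetical test with fixed sensitivity and specificity (0.9 each, chosen only for illustration) applied to populations with different proportions of true positives. The function names and all figures are invented for this example. Under those assumptions, TN/(TN+FP) stays the same while TP/(TP+FP) moves with the underlying population.

```python
def confusion_counts(n, prevalence, sensitivity, specificity):
    """Expected confusion-matrix counts for a test applied to n items.

    All inputs are illustrative assumptions, not data from any study.
    """
    positives = n * prevalence
    negatives = n - positives
    tp = positives * sensitivity   # positives correctly called positive
    fn = positives - tp            # positives wrongly called negative
    tn = negatives * specificity   # negatives correctly called negative
    fp = negatives - tn            # negatives wrongly called positive
    return tp, fp, tn, fn


def specificity_of(tp, fp, tn, fn):
    """True negative rate: TN / (TN + FP)."""
    return tn / (tn + fp)


def precision_of(tp, fp, tn, fn):
    """Positive predictive value: TP / (TP + FP)."""
    return tp / (tp + fp)


# Same test, three different populations.
for prevalence in (0.5, 0.1, 0.01):
    counts = confusion_counts(100_000, prevalence, 0.9, 0.9)
    print(prevalence,
          round(specificity_of(*counts), 3),
          round(precision_of(*counts), 3))
```

The printed specificity column should be constant across the three rows, while the precision column falls as positives become rarer.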

continuous interpretation of specificity (for instrumentation)
I came to this page looking for a continuous interpretation of specificity (for instrumentation). For example if you built an instrument to measure the salt content of a solution, it might (by imperfect design) also register the amount of sugar in the sample. Suppose the actual instrument reading was [reading] = 0.99*[true salt concentration] + 0.01*[sugar concentration]. Is there a concept of specificity that characterizes this kind of imperfection? 69.159.205.193 14:07, 15 February 2006 (UTC)


 * Such a thing does not fall under information retrieval definitions of specificity. What you are actually looking for is a statistical method to give you a confidence interval for the salt concentration... --Jettlogic

Reference for Information Retrieval and Binary Classification
This is based on http://www.musc.edu/dc/icrebm/sensitivity.html

Information Retrieval Basics
 * True Positives (TP) "Number of P's that you called P"
 * True Negatives (TN) "Number of N's that you called N"
 * False Positives (FP) "Number of N's that you called P" (Type I errors)
 * False Negatives (FN) "Number of P's that you called N" (Type II errors)
 * Positives (P=TP+FN) "Number of P's"
 * Negatives (N=TN+FP) "Number of N's"
 * Data set (A=P+N) "Number of P's and N's"
 * Sensitivity (TP/P) "Proportion of P's that you called P" (recall in IR)
 * Specificity (TN/N) "Proportion of N's that you called N"
 * False Positive Rate (FP/N) "Proportion of N's that you called P"
 * False Negative Rate (FN/P) "Proportion of P's that you called N"
 * Positive Predictive Value (TP/(TP+FP)) "Proportion of those you called P that are P" (precision in IR)
 * Negative Predictive Value (TN/(TN+FN)) "Proportion of those you called N that are N"
 * Prevalence (P/A) "Proportion of data that are P"
 * F-Measure (2 x Rec x Pre / (Rec + Pre)) "Harmonic mean of precision and recall"

--Jettlogic
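
The definitions listed above can be written out as a short Python sketch. The `Counts` class, the `metrics` function, and the example counts are hypothetical and exist only for illustration. The sketch assumes every denominator is non-zero and does no error handling.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Confusion-matrix counts (hypothetical container for this example)."""
    tp: int  # P's that you called P
    tn: int  # N's that you called N
    fp: int  # N's that you called P (Type I errors)
    fn: int  # P's that you called N (Type II errors)


def metrics(c):
    """Compute the listed measures; assumes no denominator is zero."""
    p = c.tp + c.fn          # number of P's
    n = c.tn + c.fp          # number of N's
    a = p + n                # size of the data set
    recall = c.tp / p                  # sensitivity
    precision = c.tp / (c.tp + c.fp)   # positive predictive value
    return {
        "sensitivity": recall,
        "specificity": c.tn / n,
        "false_positive_rate": c.fp / n,
        "false_negative_rate": c.fn / p,
        "positive_predictive_value": precision,
        "negative_predictive_value": c.tn / (c.tn + c.fn),
        "prevalence": p / a,
        "f_measure": 2 * recall * precision / (recall + precision),
    }


# Made-up counts, purely to exercise the formulas.
example = Counts(tp=90, tn=80, fp=20, fn=10)
for name, value in metrics(example).items():
    print(f"{name}: {value:.3f}")
```

Note that specificity and the false positive rate share the denominator N and sum to 1, as do sensitivity and the false negative rate over P.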

Table and edits
See Talk:Sensitivity (tests) re past wish list for a simpler description, setting out what it is before launching into mathematical jargon. I have also added a table and in Sensitivity (tests) added a worked example. The table is now consistent in Sensitivity, Specificity, PPV & NPV with the relevant row or column for calculation highlighted. David Ruben Talk 02:44, 11 October 2006 (UTC)

I like the table! I think it would be helpful to somewhere explicitly say "Power = Sensitivity" (which follows from your equations) but I did not know how to edit the linked to example myself. If you agree, can you perhaps add this somewhere? Best wishes, David (wp07 at kreil.org). (23:20, 11 June 2007 User:141.244.140.159)


 * No, it is not true that "Power=Sensitivity". Power has to do with the size of the study: it is a measure of how confident one can be that, for a given selection of test subjects from the wider population, statistical significance will be achieved in testing the null hypothesis (i.e. in showing whether the condition being tested really makes a difference).
 * As an example, consider a population of a million, with half having a disease, and a test that correctly identifies all but 2 cases of the 500,000 with the disease. Now if we do a study with 2 subjects, the power of the study is low: if one happens to randomly select the two people with the disease but a negative test as one's research subjects, one might wrongly assume the test has a low sensitivity, i.e. does not work. However, if one selects 10,000 test subjects from the one million, then the power of the study is higher and we can be more confident of the result (which would conclude that the test is helpful in all but extremely rare cases). Likewise, following a single person who smokes is never going to tell us much: if they get cancer, this gives no clue as to how likely smokers as a whole are to get cancer. Conversely, if this one individual smoker never gets cancer, we cannot conclude that no smokers ever get cancer; the power of the study is too low. However, following 1,000,000 smokers for 10 years will yield meaningful data on the incidence rates of cancers in smokers. Power is therefore a feature of the research study methodology in selecting enough test subjects, not just of whether the test itself has good sensitivity or specificity. See Statistical power for more information. David Ruben Talk 00:56, 12 June 2007 (UTC)
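
The point about study size can be illustrated with a rough Python sketch. It assumes, purely for illustration, a test whose true sensitivity is 0.9 and a simple binomial model for how many diseased subjects the test detects. The function name and all figures are hypothetical and are not taken from the example above. It shows how the uncertainty of an estimated sensitivity shrinks as more subjects are studied, which is related to, but not the same as, a formal power calculation.

```python
import math


def sensitivity_standard_error(true_sensitivity, n_diseased):
    """Binomial standard error of a sensitivity estimate.

    Assumes n_diseased independent diseased subjects, each detected
    with probability true_sensitivity (an illustrative assumption).
    """
    variance = true_sensitivity * (1 - true_sensitivity) / n_diseased
    return math.sqrt(variance)


# The test is the same in every row; only the study size changes.
for n in (2, 100, 10_000):
    print(n, round(sensitivity_standard_error(0.9, n), 4))
```

The property of the test is held fixed throughout, so any change in the printed uncertainty comes from the study design alone.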

Specificity and Selectivity
I would like to clarify in my mind the differences between the two concepts. At the moment specificity redirects to specificity (tests) but am I wrong in believing that in general terms the two words are synonymous? LouisBB (talk) 06:11, 22 May 2008 (UTC)


 * Indeed the same; it is just a matter of being consistent with Sensitivity (tests), as Sensitivity itself is a disambiguation page linking to several articles. Probably best to leave it as it is: redirects are cheap, so there is no problem leaving Specificity as a redirect to Specificity (tests). David Ruben Talk 12:50, 22 May 2008 (UTC)