Talk:Steve Shnider

Using Google Scholar hits or similar
Hi, if you check out Reliable sources/Noticeboard there is a reasonable discussion about these sorts of sources. These are excluded from articles on the basis of SYNTH and OR. You can, of course, use the information to support a discussion about notability on the article talk page. In the case of Steve Shnider it may be better to find international awards for his work or independent reviews of his books as sources to add. Cheers—Ash (talk) 12:11, 18 January 2010 (UTC)
 * You seem to be insufficiently familiar with notability criteria for scientific articles. The criteria explicitly state that both math reviews and google SCHOLAR provide valid indications of notability.  You seem further to confuse google and google scholar, a very different engine.  Tkuvho (talk) 12:31, 18 January 2010 (UTC)
 * Actually the Noticeboard discussion specifically discussed the case for Google Scholar. If you doubt this example discussion (just the first one I picked out) you can try searching RS/N for yourself. Notability of the article is not at issue, just the inclusion of this transient original research. As you do not seem to give much weight to my experience and have reversed my edit (again), I'll ask for an independent third opinion which may help explain the matter. I shall copy this discussion onto the article talk page for the convenience of an opinion.—Ash (talk) 13:15, 18 January 2010 (UTC)
 * I agree with your comment that "notability is not an issue". This is why I removed your "notability" tag a few hours ago.  Your claim that I have reversed your edit AGAIN is not accurate.  Tkuvho (talk) 13:22, 18 January 2010 (UTC)
 * You are correct, thanks for pointing that out, I should have checked the history before using the word 'again'.—Ash (talk) 13:38, 18 January 2010 (UTC)
 * No problem, and thanks for your additions! Tkuvho (talk) 13:39, 18 January 2010 (UTC)

Third opinion
Hi, I come from WP:3O. I agree with Ash: Gscholar hits are essentially meaningless (they depend on too many quirks of Google's algorithms and do not always return reliable results). A sounder alternative could be to use the Web of Science h-index instead. Hope it helps. -- Cyclopia talk  14:27, 18 January 2010 (UTC)
 * Just to clarify your statement, are you suggesting that it is okay to add citation statistics, such as the h-index, to BLP articles?—Ash (talk) 14:56, 18 January 2010 (UTC)
 * Well, why not? (And why would BLP or not make a difference in this respect?) I don't know if it has a place in this specific article, but in general I see no intrinsic problem with it. -- Cyclopia talk  15:00, 18 January 2010 (UTC)
 * Okay, to answer your question I'll drop BLP and talk in general terms.
 * For any article, it would seem a direct contravention of SYNTH to copy transient statistics from a citation search database and paste these into an article. If the h-index (or equivalent) were quoted in a publication (such as a book review) then it could be used as a non-transient quotable statistic. However, this approach would seem a poor precedent, as there would be no reason not to go through Wikipedia adding one-off snap-shot citation indices to all academic articles about people and/or their publications. Such statistics would become fairly meaningless within a relatively short period after being added as a reader would always have to check for a more recent citation index. I thought this was covered by previous WP:RSN discussion but if it is not clearly covered by current recognized policy, perhaps a RfC is in order.—Ash (talk) 15:32, 18 January 2010 (UTC)
 * I fail to understand how simply citing a primary source (a well-respected, official citation search source like ISI, not the precarious hodge-podge which is Gscholar) can be considered WP:SYNTH. About "there would be no reason", well, there could be: it could simply be marginally relevant information, and therefore fall under WP:UNDUE. However, for scientists whose h-index is somehow notable (say, the most cited chemist ever), such information can make perfect sense. Also, I cannot find relevant discussions on the subject in WP:RSN. I agree such statistics change (relatively) rapidly, but this is simply a matter of adding a correct "As of 20xx" and updating it now and then. -- Cyclopia talk  15:39, 18 January 2010 (UTC)
 * Your caveat concerning UNDUE would probably eliminate such statistics from this article, as neither Shnider nor his publications are notable for a particular level of citations. Using the ISI site as per your example, I get an h-index of 4. Perhaps that point resolves the particular case here.
 * I still believe that SYNTH is the most relevant policy as database queries are not a primary source, or a tertiary source as per WP:PSTS, but the reports are original analysis that only exist once you have put in your search criteria and requested the analysis. It cannot be considered published as if someone were later to follow a link to the data, they may well get different statistics and would not be able to reproduce the report that you originally generated.
 * In the general case, as we have failed to find clear current guidance based on prior consensus, I go back to the suggestion of raising an RfC as I do not think we are converging on a consensus view in our local discussion.—Ash (talk) 16:25, 18 January 2010 (UTC)
 * About this article: If the guy has an h-index of 4, probably he's not notable (I have an h-index of 4, and I am just a run-of-the-mill postdoc). I am going to add a notability tag, and perhaps deletion could be necessary. About the Web of Science thing: Agree on an RfC. Would you care to draft it? :) -- Cyclopia talk  17:26, 18 January 2010 (UTC)
 * Cool, happy to squeeze a short RfC from this discussion. Any suggestions about where it should be located? We started talking on this BLP but the general issue seems rather off-topic to stay here. Maybe on WT:Citing sources ?—Ash (talk) 17:35, 18 January 2010 (UTC)
 * Isn't WP:RSN the "place to be" in this case? After all, the discussion is about whether a WoS result can be used as a source or not. -- Cyclopia talk  19:18, 18 January 2010 (UTC)
 * Yes, I think that's right. As the practice appears to be to raise a notice at RSN and point to the RfC, I think I'll shortly raise it here in a new section so it has some context. After checking through Shnider's publication list on MathSciNet, I don't think there's much of a rationale to put the article up for deletion (rather than marking for improvement), so even if it is long-running there should be no issue.—Ash (talk) 11:13, 19 January 2010 (UTC)


 * You should never include these kinds of figures, unless they have been explicitly published in a reliable secondary source (e.g. a reliable source, such as The New York Times, that explicitly states "x gets 545 Google scholar hits"). Statistics based on a Wikipedia editor's Google searches and the like are unverifiable original research. Jayjg (talk) 02:24, 21 January 2010 (UTC)
 * I agree completely with you about Ghits or similar, but using a Web of Science link, with date, is not as arbitrary: the tool is used regularly to calculate h-index in academic settings, and it clearly includes only peer-reviewed journals, AFAIK. -- Cyclopia talk  02:47, 21 January 2010 (UTC)

H-index
Revisiting the above search on the Web of Knowledge, using Author=(Shnider) Refined by: Subject Areas=(MATHEMATICS) Timespan=All Years (assuming that there is only one S Shnider publishing articles in mathematics), I get an h-index of 6, a total of 20 publications and a total of 193 citations excluding self-citations. The previously calculated index of 4 came from a search restricted to his institution (but not to mathematics), which was probably overly narrow.—Ash (talk) 18:51, 18 January 2010 (UTC)


 * If your scientific engine gives a total of 20 publications for Shnider, then it is not an appropriate engine for the field of mathematics. The standard tool for mathematics is Mathematical Reviews, available on the web as MathSciNet.  It lists 67 publications for Shnider.  Tkuvho (talk) 05:56, 19 January 2010 (UTC)
 * Thanks, just checked it out. It's a big difference in publications listed and I'm not quite sure why. It might be due to restricting the WoK search to Mathematics; perhaps this ignored some of the theoretical physics journals? For a policy view on which database ought to be the default, I've raised the question on the Mathematics reference desk: "Policy recommendation for comparing citations h-index for any mathematician". For Shnider's notability it might be worth highlighting his most influential works.—Ash (talk) 09:51, 19 January 2010 (UTC)
 * Question now archived, it resulted in no serious comments. Ash (talk) 12:19, 4 February 2010 (UTC)
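
For reference, the h-index figures quoted in this section follow the usual definition: the largest h such that at least h of the author's papers have h or more citations each. A minimal sketch in Python, assuming a plain list of per-paper citation counts; the counts below are made up for illustration and are not Shnider's actual data from any database:

```python
def h_index(citations):
    """Return the largest h such that at least h papers have >= h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank   # the paper at this rank still has enough citations
        else:
            break      # every later paper has even fewer citations
    return h


# Hypothetical per-paper citation counts, for illustration only.
example_counts = [40, 22, 15, 9, 7, 6, 3, 1, 0]
print(h_index(example_counts))
```

Because the result depends only on the list of counts a database returns, two databases that index different publications for the same author can report different h-index values, as happened with the figures of 4 and 6 above.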

RfC Using citation totals in articles on academics
Would you support including reports of current citation totals or calculations of h-index in an article about an academic (or a particular work) to substantiate notability? If so, can a single standard source (such as the Web of Knowledge) be considered authoritative?—Ash (talk) 11:30, 19 January 2010 (UTC)


 * Comment The way we weigh notability if it's in serious doubt is at WP:AFD, where a decently large cadre of editors with a good bit of experience on weighing the notability of academics work. Citation counts and other metrics of authority are controversial in the academic world at large, and among Wikipedia editors, and likely to remain so. My own feeling is that where raw numbers are uncontroversial, there's enough other material to substantiate notability, and where raw numbers are not, they are probably not sufficient to make the case on their own. Ray  Talk 03:06, 20 January 2010 (UTC)


 * If as you say mathscinet and google scholar statistics are a good indication of notability, I see little reason not to mention them in biography pages. Tkuvho (talk) 12:17, 20 January 2010 (UTC)
 * Comment Such numbers have to be interpreted with the field in mind. This person has 300 citations, which in my field (neuroscience) would be rather paltry and it could be perceived as disparaging to even mention that in an article. I don't know how good/bad this is in mathematics, perhaps another editor with experience in that field can comment on that. In no case would I mention this kind of data in the lead, though. It should be in the section on this person's publications, where it belongs. --Crusio (talk) 10:19, 24 January 2010 (UTC)
 * In mathematics 300 cites in mathscinet is a high figure. For the sake of comparison, consider the fact that Panos Papasoglu is cited 124 times. Tkuvho (talk) 10:47, 24 January 2010 (UTC)
 * Can you give some figures on some undoubtedly notable mathematicians to compare with? It may just be that Shnider is non-notable and Papasoglu even less so. Is there a publication somewhere saying something about mean citation rates in mathematics and what constitutes a decent h-index in that field (and what h-index does Shnider have, using MathSciNet -unfortunately I don't have access myself)? --Crusio (talk) 10:52, 24 January 2010 (UTC)
 * I don't think Ash thinks Shnider is not notable. He mentioned a few times that notability is not the issue.  He is wary of google scholar statistics being used inappropriately in all the sciences.  His concern may be justified.  I think in mathematics the use of google scholar statistics is perfectly appropriate and consonant with mathscinet figures.  Note that Shnider's recent book on operads is one of the hottest monographs in mathematical physics.  In fact Shnider is the "undoubtedly notable mathematician" you are looking for. Tkuvho (talk) 10:59, 24 January 2010 (UTC)
 * For comparison, G. H. Hardy (famous in Number Theory) has 2588 citations, Oliver Pretzel (researcher in Combinatorics, and my tutor when I was an undergrad) has 43 citations and Landon Curt Noll (prime number researcher) has 3 citations. Regretfully, I would say that Pretzel would be too specialized for his own encyclopaedic article, though he might be mentioned in other articles such as Error-Correcting Codes. Noll has his own article due to plenty of public interest in his discovery of a large prime number rather than his academic track record. As has previously been said, citation levels are weak evidence of notability. For example someone could publish a highly cited concordance of other publications but make no substantial impact in their field (failing PROF) or publish a poorly cited fringe article that happens to fuel some media controversy (and so they become generally notable). Ash (talk) 11:26, 24 January 2010 (UTC)
 * I agree completely. I don't think that, at AfDs for scientists, having published just one highly cited article has very often been enough to keep an article. Notability comes more from a whole body of work being consistently cited a lot. >2500 citations would be notable in any field; 43, I'm afraid, would not make the cut anywhere. As for GS, I don't trust it much as it gives too many incorrect results. It's a rough approximation at best. --Crusio (talk) 11:34, 24 January 2010 (UTC)
 * PS after Ash's tweak of his comment: I think a high level of citations is evidence of notability. A low level of citations is an indication that someone probably will not meet WP:PROF, but that person (like Noll) could still be notable under GNG. --Crusio (talk) 11:53, 24 January 2010 (UTC)
 * Media citations affect neither mathscinet nor google scholar statistics. None of Shnider's publications is a concordance. Tkuvho (talk) 11:31, 24 January 2010 (UTC)


 * In references. Mentioning such numbers in the article body looks strange to me (unless there is something particularly notable and exceptional about those counts). However, I think citation counts are sometimes useful and appropriate in references. For example, when we mention that John Doe is the author of the famous handbook X, we could (and should) add references that shows that X is truly notable; in addition to book reviews, etc., we can mention citation counts if it is obvious that they are very high (say, >1000 citations). Or if we claim that John Doe is one of the most widely cited authors in the field X, then we could add a footnote with more detailed citation counts to back this claim, etc. H-index as such says very little; raw citation counts + a comparison with other people in the same field tells us much more; and a third-party source is naturally always better than our interpretation of the citation counts, but such sources are scarce. — Miym (talk) 11:51, 24 January 2010 (UTC)
 * I would be rather wary of inserting information like citation counts and h-index in an article. For arguments in an AfD such data often provides a quick raw indicator of where things stand, but using them in an article could easily create reliability and weight problems. As already mentioned, there are very significant differences between disciplines in publication rates and the typical lengths of publication cycles (the length of time from writing a paper to its appearing in print). Perhaps more importantly, the tools available (GScholar, WoS, Scopus, MathSciNet, etc) are highly imperfect and tend to produce significantly different results for the same person and for the same paper. For example, GoogleScholar, when computing citation counts, includes many sources that would not pass WP:RS under our standards; it also often counts essentially the same sources several times (first as a preprint and then as a published paper); on the other hand it does not capture citation data in many journals, books and conference proceedings, especially in the humanities. WoS typically undercounts because it includes citation data in journals only, while in many disciplines (e.g. computer science) much if not most is published in conference proceedings. WoS has another problem in computing citation hits and h-index: it is very format-sensitive in terms of how exactly a citation is made in a given publication and, as a result, WoS tends to split the citation block for a single paper into several pieces (e.g. if a paper is referenced without page/volume numbers but as "to appear" in a particular journal, WoS never collates such a citation with the main record of the paper, after it has already appeared; on the other hand, MathSciNet is quite good at collating such citations). MathSciNet has other weaknesses: its citation data is very incomplete. 
Only for some journals, and only for fairly recent years, does it include the References section of a paper in the record for this paper (when citation hits are computed, MathSciNet scans these References sections). For books and conference proceedings their bibliography sections are never included, and for older (pre-1995 or so) publications they are generally not included for journals either. MathSciNet is slowly expanding its citation data practices, covering more journals, but at the moment the picture it gives is very incomplete. Scopus has other problems yet. As a result, the same person can have an h-index of 25 in GScholar, of 20 in WoS and of 15 in MathSciNet. The same paper can have citation hits of 65 in GScholar, 20 in WoS and 25 in MathSciNet. And so on. A typical reader (or even a typical WP editor) would not be aware of all these differences, so including various raw numbers in an article could actually lead to WP:WEIGHT problems (whereas in an AfD such issues can be discussed more fully). If somebody is included in, say, ISI Highly Cited, this fact is easily verifiable and I would have no problems with mentioning this, with a ref, in an article about that person. If a paper is very highly cited (say in the thousands or close to that) according to WoS or Scopus or MathSciNet or Pubmed (which, unlike GScholar, only include WP:RS citations), it's probably OK to mention this explicitly in an article about the author. Perhaps it is also OK to include a GScholar link for a particular paper (without explicitly specifying citation counts) where this paper appears in the "Selected publications" section of a WP article about a person. Anything more would, IMO, be generally inadvisable. Nsk92 (talk) 15:27, 24 January 2010 (UTC)
 * Very good summary of the situation. I would like to emphasize two points: (1) mathscinet always undercounts, never overcounts; in this sense it is quite reliable as indication of notability.  (2) Google scholar is a reliable indicator when a particular text is very influential, as is the case with Shnider's book on Operads.  That's why I included it in the first place.  Tkuvho (talk) 15:58, 24 January 2010 (UTC)
 * Basically, I agree with (1) and, as a general point, with (2). However, I think that, by and large, unless the numbers in question are quite exceptional, they belong in a talk page/AfD discussion of the subject's notability, but not in an article itself. (A typical reader would have no clue what these numbers mean and it would be far too easy to misinterpret them, basically a WP:WEIGHT problem). That does not appear to be the case for the operads book of Shnider that you mention. IMO, in this case the correct thing to do is to provide a more detailed discussion of the book within the text of the article, including why it is important, perhaps with quotes from the reviews of the book or references to it in other reviews or papers, with specific examples of where and by whom it was used etc. This takes more work, but is doable in this case and, IMO, would be a better way of demonstrating notability within the text of the article than including citation numbers there, from either WoS or MathSciNet. Nsk92 (talk) 16:19, 24 January 2010 (UTC)


 * It's tending towards original research, it's a number that doesn't stay stable or otherwise define the subject, and it's hard to interpret without a lot of experience with the particular subject and its citation patterns. I think it's ok to use citation numbers to pick out which of his papers are the most heavily cited, and even to state in the article that these papers are the most heavily cited according to such-and-such a database accessed at such-and-such a date, but I would prefer to shy away from actual numbers. Put it another way: even if we know a reliable source for the information, would we put someone's exact salary as a dollar amount in one of our articles? —David Eppstein (talk) 16:39, 24 January 2010 (UTC)
 * I fear including citation numbers would not be fair to scientists and might even further mis-prioritize the scientific community. I agree with both Eppstein's and Nsk92's comments: not only are citations personal in a way, but putting in a citation number invariably perpetuates the myth that some formula for influence has anything to do with how much influence on society that person's scientific work has had. Citation numbers are already recognized as a problematic oversimplification much abused in scientific circles; the greater public does not understand this and most will not learn about it even if we tell them to. There is no sense in Wikipedia even accidentally suggesting that a scientist's worth can be quantified. In other words, simply because scientists have a dangerous habit of trying to quantify each other's influence doesn't mean a scientist's influence is meaningfully quantifiable.
 * Including citation numbers with each scientist, beyond misleading the public and being personally degrading, could also have a bad reverb on the scientific community. The scientific community already damages itself too much over the fallible system of citations. The cut-throat competition too often blocks open and friendly teamwork, and teamwork is critical to doing good science. Wikipedia must avoid transforming citations into a matter of general-public glory or shame, a continuum-based de facto measure of career assigned to the scientist's name in the public's record for hundreds of years after they die. Creating such an environment could only exacerbate career-oriented scientific short-sightedness. --Lyc. cooperi (talk) 06:18, 3 February 2010 (UTC)


 * I agree with you that the scientific community sometimes uses quantitative criteria, such as levels of mathscinet cites, that may be too mechanical to do justice to a scientist. I share your concern that we must be wary not to mislead the public, and be careful not to "be personally degrading", as you put it.  Where I disagree is the contention that the inclusion of mathscinet figures is either misleading or degrading.  As earlier users have commented, we are using such figures not as a required condition for notability, but rather as an indication toward sufficiency, of course in combination with other factors.  "Career-oriented scientific short-sightedness" is a bad thing but I don't think we are contributing toward it; on the contrary.  Furthermore, since this discussion has strictly nothing to do with the page in question (the notability having been established beyond doubt), would you suggest an alternative venue for a possible continuation of this discussion?  Tkuvho (talk) 08:48, 4 February 2010 (UTC)
 * (Re: RfC location) As this BLP is unusual for including citation statistics, currently stating "According to Mathematical Reviews, Shnider's work has been cited over 300 times by 290 authors" and "The book on operads is cited over 200 times at Google Scholar", the RfC appears to be suitably placed. When the RfC is completed, I recommend archiving to a sub-page in order to keep this talk page specific to Shnider. Ash (talk) 10:27, 4 February 2010 (UTC)

Reply to RfC: I think the use of information such as current citation totals or calculations of h-index could be quite interesting and useful and should be included—IF (major "if") such info is accompanied by interpretation from a reliable secondary source. If I’m not mistaken, current citation totals and calculations of h-index are primary sources of information. According to WP:OR, primary sources may be used with care, but interpretation must come from secondary sources. And in this case, as the learned commentators have explained above in nice detail, such numbers do require interpretation—x number of citations or an h-index of x really require some explanation for the ordinary person (and perhaps even experts) to make sense of.--Early morning person (talk) 16:35, 8 February 2010 (UTC)
 * But then the question arises, does the inclusion of secondary source interpretation of info from a primary source violate SYNTH? It does not seem so. WP:PRIMARY implies that interpretation from secondary sources is fine. So if you can find a secondary source that provides benchmarks for these kinds of figures, it would seem fine to use them. --Early morning person (talk) 16:45, 8 February 2010 (UTC)
 * If we treat mathscinet or google scholar figures on Prof. X as primary sources, presumably we would also have to treat the homepage of Prof. X as a primary source as well. Now there are very few biographies that provide details of that sort.  The mathematician Hardy mentioned earlier surely has many biographies written about him, but most wiki pages on Prof. X and others like him are not based on secondary sources.  Applying the rules in such a draconian fashion may result in the deletion of most wiki biographies.  At some point we have to decide whether it is more important to apply a strict interpretation of the rules, or to be of service to wiki users.  Tkuvho (talk) 17:48, 9 February 2010 (UTC)

Summary awaiting independent contribution... Ash (talk) 11:02, 26 February 2010 (UTC)

And the point of this article is …
Some heated discussion there, but what I fail to understand is: where are the goodies? This article reads every bit like a promotional page. I am not implying in any way that it's been authorized or even approved by its subject, but who cares about the number of MR citations and selected bibliography if the only notable fact presented is that Steve Shnider coauthored a book on operads reviewed by Alexander Voronov? A lot of mathematicians write books; is that a sufficient reason to create articles devoted to them? Do math books themselves deserve their own articles? What about WP:NOTDIR? Arcfrk (talk) 01:25, 8 April 2010 (UTC)