User talk:Yaris678/PSTS/Archive 2

I created User:Yaris678/PSTS to become a guideline page to replace WP:PSTS, which is currently part of WP:NOR. Hopefully, one day, the contents of the page will be moved to a page called something like Primary, secondary and tertiary sources. I have created it after a discussion on Wikipedia talk:No original research. Feel free to make changes to it if you think you can improve it. However, if I revert the change, please stick to discussing it here on the talk page until I agree with you - this is in my user space after all!

The first version of this page in the revision history is the text from the PSTS section in WP:NOR so you can compare the two.

I see this talk page being used mainly for discussion of:
 * What should be in the guidance on PSTS.
 * Whether that should have a separate page.

I can also anticipate some discussion on the status of the separate page. I think it should be a guideline because it is mostly about how to apply WP:NOR (and to a lesser extent WP:V), rather than being a stand-alone policy.

Yaris678 (talk) 21:18, 26 December 2009 (UTC)


 * Thanks for starting the ball rolling on this... I agree that it should probably be a Guideline rather than a Policy... but can understand some of the arguments for it going the other way. 21:54, 27 December 2009 (UTC)

Something else I should mention at the top of the talk page... I am not suggesting that WP:NOR should say nothing on the subject of primary, secondary and tertiary sources.  Rather, it should say less than it does now and a link should be provided to the guideline that this page will become. Yaris678 (talk) 15:48, 3 January 2010 (UTC)

Archives
User talk:Yaris678/PSTS/Archive 1

Why having a guideline on PSTS is a good thing

 * 1) It allows the PSTS section of WP:NOR to concentrate on the original research aspect of PSTS.
 * 2) It allows us to write slightly more in the guideline and clarify a few things.
 * 3) The current situation has led many (inexperienced) editors to misread the policy to mean, for example, "Primary sources are not allowed." This should be reduced by having a separate guideline for the reasons given in 1 and 2.
 * 4) Part of the reason why PSTS is important is that a primary source cannot establish a topic's notability - but WP:Notability is a guideline and not policy. Referring to a guideline from a policy could give the impression that the guideline has been upgraded to a policy.
 * 5) WP:Reliable sources is a guideline that deals with a different aspect of source categorisation.  It is arguably more important than WP:PSTS, so why is WP:PSTS at the policy level?
 * 6) "No original research based on primary sources" is an application of "No original research". However, that does not dictate that WP:PSTS should be part of the WP:NOR page.  For example, the guideline WP:Do not include the full text of lengthy primary sources is an application of the policy WP:What Wikipedia is not; the guideline WP:Conflict of interest is an application of the policy WP:Neutral point of view.

I thought it would be handy to keep a list of reasons to create a PSTS guideline. I have done this above. Please feel free to modify the list or give comments below. As with the user page, if you make a change to the list and I revert it, please stick to comments until you have persuaded me otherwise. Thanks, Yaris678 (talk) 16:42, 31 December 2009 (UTC)


 * As far as WP:RS and WP:NOTE go... both are more than "just guidelines"... they have real impact (in fact, they have more of an impact than many of our Policy pages). To some extent they fall into a special category of their own.  I could see PSTS falling into this special category as well.  Perhaps we should designate them as "Core Guidelines", in the same way that we designate certain Policies as being "Core Policies".  Blueboar (talk) 17:20, 31 December 2009 (UTC)


 * As I see it, the primary reason for restricting primary sources is that these are not transparent to the untrained reader and therefore require interpretation in order to be used; their use invites -- and seems to be an inherent violation of -- WP:SYNTH. That is a much more important concern than the concern with notability.
 * If we concede that any primary source whose notability is attested by a secondary source can be used, then we open the way to all sorts of unwanted syntheses. This draft must stress the primacy of WP:NOR.  It cannot open the way to justifying fringe interpretations of Aristotle's Physics (for example) because it's "in the primary source."  The requirement that "All interpretive claims, analyses, or synthetic claims about primary sources must be referenced to a secondary source" is an essential bulwark against such Original Research. --SteveMcCluskey (talk) 18:42, 31 December 2009 (UTC)


 * After having written the above, I took a close look at WP:NOTE. I find it has nothing to say about the notability of sources, although it does address the kinds of sources that are needed to establish a topic's notability.  The central concern of the notability guideline is to provide guidance for determining whether a topic is notable enough to merit a Wikipedia article.
 * What, then, does WP:NOTE have to do with WP:PSTS? --SteveMcCluskey (talk) 19:08, 31 December 2009 (UTC)
 * NOTE does not have anything to do with PSTS... PSTS however, has something to do with NOTE (in that you need secondary sources to establish that a topic is notable enough to merit a Wikipedia article). Blueboar (talk) 19:36, 31 December 2009 (UTC)
 * OK, I wasn't quite following the direction of the talk page discussion, which seemed to be stressing NOTE in a way that wasn't clear to me. I was juxtaposing point 1, about allowability of primary sources, with WP:NOTE to infer that the intent was to open the way to unrestricted use of notable primary sources.  Thanks for clarifying, SteveMcCluskey (talk) 19:40, 31 December 2009 (UTC)

SMcC, I'm glad that has been cleared up - it wasn't the intention to imply that any primary source can be used, provided that it is notable. Perhaps the term "expand things a bit" in point two was misleading - I have rephrased it now. Perhaps point 3 needs to be rephrased too. The point I was getting at was that you shouldn't really write an article about a book if the only source you have on the topic is the book itself. Perhaps one of you could make a more general point, or the same point in a clearer way.

On the fact that "No original research based on primary sources" is an application of "No original research"... That is true, but it does not dictate that WP:PSTS should go into WP:NOR. For example, the guideline WP:Do not include the full text of lengthy primary sources is an application of the policy WP:What Wikipedia is not; the guideline WP:Conflict of interest is an application of the policy WP:Neutral point of view.

Yaris678 (talk) 14:34, 1 January 2010 (UTC)


 * My thought on the final arrangement is that an abbreviated discussion of Primary, Secondary and Tertiary Sources should remain in the WP:NOR policy, while the expanded discussion we're working on here should be a separate guideline, with a template linking from WP:NOR to this article, following Summary style.  SteveMcCluskey (talk) 20:25, 2 January 2010 (UTC)


 * I agree, although I think it would be more of a than a  .  The part in WP:NOR would only talk about PSTS in relation to NOR, the guideline would talk about various aspects of PSTS, although NOR would be an important part of that.  Yaris678 (talk) 00:42, 3 January 2010 (UTC)


 * I have just modified point 3 and added a point 5, in line with discussions here. Yaris678 (talk) 00:53, 3 January 2010 (UTC)


 * I've just added to and re-jigged the list. Yaris678 (talk) 13:06, 12 January 2010 (UTC)

Looking good
Yaris... this looks very good.

I would start to advertise it as a potential guideline/policy proposal. Not just at the talk pages of NOR or NOTE... but to some of the regular editors to those pages. Well done and thanks for doing the grunt work on this. Blueboar (talk) 15:21, 8 January 2010 (UTC)


 * Thanks Blueboar,
 * I was going to ask about how to get this accepted as a guideline. Am I right in thinking that you have been contributing to policy pages for a while?  Would the adverts sound better if they came from you?  Yaris678 (talk) 15:33, 8 January 2010 (UTC)


 * Possibly... I can certainly ask a few of the "usual suspects" to drop by and comment, and will do so. Blueboar (talk) 16:12, 8 January 2010 (UTC)


 * As an occasional contributor, I'm also favorably impressed by the current state of the draft. Comments from interested parties who have not been involved in the draft so far would be welcome.  --SteveMcCluskey (talk) 16:17, 8 January 2010 (UTC)


 * This is a proposal to remove a key part of NOR and reduce it to guideline status, which means people can happily ignore it. I'm struggling to see the benefit of that. SlimVirgin  TALK  contribs 12:28, 9 January 2010 (UTC)


 * Thanks for responding SV. Would you be happier if we proposed it as a Policy page (so people had to pay attention to it)?
 * As for "removing" a key part of NOR... that is not the intent. The intent is to allow NOR to say what it says more clearly, by shifting extraneous explanatory material and definitions that have nothing to do with the concept of OR to this page (which can be pointed to at NOR). As far as I know, no one is suggesting that we completely remove all discussion of source types from NOR.  Certainly my intent is that we will keep the key statements of NOR (especially the statement that we need to use extra caution when using Primary sources).  All that we would "remove" is the distracting verbiage.
 * Another benefit of having this page is that we can discuss PSTS issues that relate to other guidelines and policies... such as how different source types are important to WP:NOTE and WP:RS. Blueboar (talk) 15:00, 9 January 2010 (UTC)


 * I see the benefit of having a page in the form of an essay that expands on the primary/secondary sourcing issue, but calling it a guideline implies that it's not policy, when it is, and having it as a standalone policy separates it from the reason it needs to be policy, namely that misuse of primary sources is a key source of OR. That is a central part of the NOR policy, and I see the detail there as an important component. I'd worry that this might become another WP:RS nightmare, where we have two pages that essentially say the same thing (one a policy, one a guideline) or, worse, two pages that ought to say the same thing, but don't. SlimVirgin  TALK  contribs 15:16, 9 January 2010 (UTC)


 * Just to expand, I often see the argument, and it's implied in this proposal, that it's just as easy to do OR based on secondary sources as it is on primary sources. But it isn't. OR based on primary sources isn't just a question of SYN, which is basically just bad editing. The misuse of a primary source is this, for example: you're writing an article about a notable writer, let's say not a living one so we don't get mixed up with BLP. You fill the article with secondary sources about his work, and some examples from his novels (primary sources, but you use them in a purely descriptive way). All well and good. Then you hop down to his local court house, and you do a search to see whether he was ever divorced, sued, arrested, or whatever. STOP! This is one of the key misuses of primary sources, and it is policy that editors should exercise extreme caution when using sources in this way, and would have to show that any edit that depended on those sources was entirely uncontentious, because no source other than you has said this is notable. It's in large measure to prevent this kind of editing that it's important to retain the primary/secondary issue as a key part of the NOR policy. SlimVirgin  TALK  contribs 15:28, 9 January 2010 (UTC)
 * I understand what you are saying... and for the most part I completely agree... the problem is that the statements about the misuse of sources are lost in all the verbiage that explains what the different types of sources are. This gives the impression that the problem is with the source, and not in the misuse.  It leads to innumerable arguments over whether a source is primary or secondary.
 * To use your example... if this came up at NORN, I would fully expect to see the argument that not all court documents are primary sources. The judge's final decision, for example, is secondary... a judge is a legal expert, whose job is to analyze the primary sources (the testimony presented, the legal arguments presented by the attorneys, etc) to reach an expert conclusion.  This leads to the further argument that going to the court house is no different than going to a library or searching on line.
 * I would also expect to see arguments saying that this is not the case... that a judge's decision is primary in that it deals with a specific case and not the Law in general.
 * Now, I am not really interested in debating which of these arguments is correct ... I raise it purely to make a point... I agree that what you are describing in your example is a red flag, and is very likely to result in OR. But whether it is OR or not is determined by how the document is used.  What would make it OR is the misuse of the document, not the document itself.  Yet because NOR spends an inordinate amount of space on definition, it focuses editors on the question of whether a source is primary or secondary, and that encourages the misconception that primary/secondary determines OR/not OR.  What we need at NOR is language that focuses on whether the source is being misused or not.
 * Understanding "primary vs. secondary" is important and helpful... but it is not the key to NOR... the key to NOR is understanding "appropriate use vs. misuse". Blueboar (talk) 16:22, 9 January 2010 (UTC)

Blueboar, you asked me to comment on this, so here is my take. I feel that NOR and PSTS are very closely inter-related, and if we remove the definition of PSTS from NOR we'll weaken NOR, and create more confusion, given the wiki process. I agree with much of what you say. In my opinion, for example, a court's ruling is primary, because the judge is also the one in charge of collecting and controlling the testimony and evidence, as well as making the decisions. I would require an independent secondary source to analyze the court's ruling, although if the latter includes a brief "bottom line" summary, that summary can be quoted in its entirety, with reduced risk of selective highlighting. As I see it, we need to clarify that primary sources are very close to the raw data and its collection and management process. We need a secondary source to interpret and summarize primary sources, and put them in perspective. If a primary source includes a brief built-in summary, it can generally be included as a quote, but for a good overview and perspective we'd still need a secondary source. The point is that the issue of deciding what is primary vs. secondary is so closely related to the issue of NOR itself that I believe that removing the PSTS discussion outside of NOR would weaken NOR, which is a key content policy. Crum375 (talk) 16:43, 9 January 2010 (UTC)
 * Thanks Crum... I totally agree that we need a discussion of PSTS in NOR, but not the one we currently have. I think the way we currently discuss it is more harmful than it is helpful. Far too often we end up with a discussion of PSTS instead of a discussion of NOR.  The tail is wagging the dog.  Also, there are PSTS issues other than OR (such as needing a secondary source to establish notability).  I think it is helpful to have one guideline/policy page that discusses all the PSTS issues. Blueboar (talk) 17:23, 9 January 2010 (UTC)
 * I think NOR and PSTS are so closely related that a discussion of PSTS is often also a discussion of NOR (and vice-versa), so they can't be easily separated. I agree that notability is also an issue which relates to PSTS, as are probably NPOV and UNDUE, as well as WP:SOURCES, but I think that the most logical location for PSTS is inside NOR. The problem is that the wiki process all too easily leads to discrepancies and confusion, as we have today with RS and V, so separating things out is not a good idea if we wish to retain coherency. Crum375 (talk) 17:33, 9 January 2010 (UTC)


 * (ec; reply to Blueboar) The establishment of notability via secondary sources is an important part of the NOR policy too; you can't separate them out like this. The primary/secondary distinction (and we do need to explain the terms in the policy) has been a key component of it for around five years. In fact, it was that issue that triggered the end of ATT. Jimbo found a group of editors using a primary source on a BLP. He explained to them that it was against policy. They argued with him, so he went to the NOR policy to find that section, which is when he saw someone had redirected NOR to ATT; he objected and the rest is history. The danger of tampering with such a long-standing and crucial part of the policy is that it will be weakened if relegated to a guideline or if separated from its context. The only reason primary sources are problematic is that they often lead to OR. SlimVirgin  TALK  contribs 17:38, 9 January 2010 (UTC)


 * Just another brief point: the essence of NOR is that Wikipedians should not produce primary or secondary sources. We are supposed to provide an overview of both, a tertiary source. We may use primary sources only to describe what is in them, and only if secondary sources have already discussed them or the issues they deal with. The issue of primary/secondary/tertiary therefore needs to be understood for the policy to make sense. SlimVirgin  TALK  contribs 17:43, 9 January 2010 (UTC)
 * Whoa... you started off fine, SV... but you are way off base with: "...and only if secondary sources have already discussed them or the issues they deal with." That is not part of any Wikipedia Policy. Are you perhaps confusing sourcing an article's topic with supporting a statement within an article?  A topic does need to be discussed by secondary sources (per NOTE)... but many individual statements can be supported solely by reference to a primary source (a perfect example of this is a basic plot summary for a work of fiction). Blueboar (talk) 19:04, 9 January 2010 (UTC)
 * I did say, "or the issues they deal with." SlimVirgin  TALK  contribs 19:59, 9 January 2010 (UTC)
 * Hmmm... in which case, perhaps you need to clarify what you mean by "or the issues they deal with"... because I suspect that I am reading a much narrower meaning into that phrase than you seem to. How does your phrase allow for a plot summary to be cited to the primary source, i.e. the work itself? Blueboar (talk) 20:53, 9 January 2010 (UTC)


 * If the work has been discussed by secondary sources, it's fine to use primary sources for general plot summaries, but if the particular plot elements are contentious, or the choice of them might be, then secondary sources would be needed for those particular edits too, not only for the article overall. We should be asking ourselves constantly when we write, "Who, apart from me, says that this matters?" The aim is to avoid producing an article that itself amounts to a primary source (a firsthand account) or a secondary source (one that discusses and interprets primary sources). SlimVirgin  TALK  contribs 20:58, 9 January 2010 (UTC)
 * I understand what you are saying... But one could ask the same question about a plot summary... "who, besides me, says that the plot of this book matters". Blueboar (talk) 21:16, 9 January 2010 (UTC)


 * The question would be, "Who, apart from me, thinks this book matters?" not the plot as such; otherwise we'd have to imagine a reliable source who says, "Love the book, couldn't care less what it says." SlimVirgin  TALK  contribs 22:34, 9 January 2010 (UTC)
 * I agree... Now let us relate this to some other article... Say a BLP. How is using a primary source to support a plot summary different than using a primary source such as a court ruling to support a statement that the subject was sued and lost his suit? Blueboar (talk) 22:46, 9 January 2010 (UTC)


 * Because no one other than you has said it matters. Going to search for someone's criminal record, which no other source has seen fit to mention or even knows about, is like highlighting a plot element that no one has mentioned, but which is contentious and changes your view of the whole work. Added to which, you may not even get it right, because you're operating on your own without secondary-source guidance. Ditto with the criminal record issue (or divorce, or whatever). You may misunderstand it, report it incorrectly, and yet it could seriously change the way the BLP is viewed. Nice, mild-mannered politician becomes monster who battered his wife during their divorce. SlimVirgin  TALK  contribs 22:54, 9 January 2010 (UTC)
 * I still am not sure I see the distinction you are making... How is simply stating: "On March 15, 2001, Joe Blow was successfully sued for breach of contract by John Smith" different than stating "A month later, Harry leaves the Dursleys' home to catch the Hogwarts Express from King's Cross railway station". In both cases, I am the one saying that this bit of fact matters in the context of the topic.  Yes, a mistake in a BLP has real-life consequences that a mistake in a book article does not have... and I wholeheartedly agree that we need to take far more care to get it absolutely right in a BLP than we do for a book article... but a mistake or misuse is not inherent in the nature of the source.  The source just sits there waiting to be used correctly or incorrectly. Blueboar (talk) 23:37, 9 January 2010 (UTC)
 * The reason we are not allowed to rely on primary sources for negative BLP material is two-fold: first, primary sources are "raw data" in nature and therefore easy to misinterpret by WP editors, thereby creating real-life harm to a living person, and second, there is the issue of notability and UNDUE. If the raw data in the primary source (e.g. court judgment) is notable and significant in the person's career, some secondary source should pick it up and report on it. If no secondary source has done so, then either it's not notable or significant enough, or we are misinterpreting it. In all such cases, we should err on the safe side and wait for the secondary source to publish it. This is vastly different than a plot summary, where if we make a mistake we would only harm a fictitious person. Crum375 (talk) 05:09, 10 January 2010 (UTC)
 * Well, this returns us to the issue of whether court rulings are primary or secondary... Perhaps we need to distinguish between a final court judgement and a filing (or allegation). I am still not convinced that a Final Judgement is primary (or at least not convinced that it is completely primary).  Final judgements are published annually in tertiary reference books (in both hard copy and by online services such as LexisNexis).  These are the equivalent of a legal encyclopedia.  I would say they are not "raw data" (the "raw data" would be the facts and testimony presented in the case, and the legal arguments made by the attorneys).  The judge is a step removed from the facts. In other words... I am not convinced that it is OR to mention a conviction (or a final judgement in a civil case).  As for "who says it matters"... I would argue that being convicted of a crime is inherently notable.  The legal system has said it matters.
 * However, I would agree that mentioning the fact that someone has been accused of a crime, or alleged to have committed some civil misconduct, is different. Accusations and allegations are not published in legal reference materials.  They are "raw data".  For accusations or allegations, I would agree that we would need it to be mentioned in a reliable source.  Does this distinction make sense?


 * Blueboar, the thing is that primary/secondary is a relative term. We can't formulate a description in a policy that is going to cover every eventuality. The point is that a primary source is very close to the topic, close to the point of involvement. That's really all we should say, because as the topic shifts, the relationship to it of the source shifts too. Trying to pin things down even further in a policy won't work. These discussions belong on the article talk pages where the issue has arisen, so they can be examined in context. SlimVirgin  TALK  contribs 16:37, 10 January 2010 (UTC)
 * I completely agree... and this is exactly why I find the PSTS section of NOR so problematic. Every case of OR needs to be examined based on specific context... whether the source (be it primary, secondary, or hexidecimalary) is being misused or used appropriately. As you say: "Trying to pin things down even further in a policy won't work". Blueboar (talk) 18:36, 10 January 2010 (UTC)
 * And the NOR policy doesn't try to pin it down. We say articles should be based on secondary sources, and that primary sources should be used only for descriptive claims. The rest we leave up to editors. Which part of PSTS do you feel pins things down even further? SlimVirgin  TALK  contribs 19:08, 10 January 2010 (UTC)
 * Almost everything after "Primary Sources:" tries to pin it down. We give a definition in order to pin it down... we give examples of each type of source in order to pin it down.  19:44, 10 January 2010 (UTC)
 * But we don't say anything about when they should or shouldn't be used, except that primary-source material should only be used descriptively, and that articles shouldn't be based on such sources. SlimVirgin  TALK  contribs 19:49, 10 January 2010 (UTC)

excluding tertiary secondary sources in favor of secondary primary ones
If I understand this page's assertions, Tertiary and secondary sources are to be used over primary ones. This is exactly 180 degrees reversed from my general experience in climate change pages where secondary sources are regularly denied inclusion and only peer reviewed papers (which I understand are primary sources) are accepted by a large number of editors who will edit war you to exhaustion if you try to include secondary sources, even those mentioning the underlying research. What's the proper way to resolve this? I'm hesitant to call for sanctions and am looking for constructive alternatives. Are there any? TMLutas (talk) 19:08, 9 January 2010 (UTC)


 * Um... Peer reviewed journal papers are normally considered secondary sources. Blueboar (talk) 19:11, 9 January 2010 (UTC)


 * Hm, I guess that I was misinformed then. It won't be the first time. The problem is thus excluding tertiary sources in favor of secondary ones. Still an issue if not what I thought it was. Or is excluding tertiary in favor of secondary acceptable? TMLutas (talk) 19:57, 9 January 2010 (UTC)


 * Peer-reviewed journal papers are primary sources if written by people who were involved in whatever study they're writing about. SlimVirgin  TALK  contribs 20:00, 9 January 2010 (UTC)


 * SV, if peer reviewed journal papers are primary... then we have a conflict between core policies that needs resolving. WP:V firmly encourages the use of peer-reviewed journals.  NOR on the other hand discourages the use of primary sources.  Or am I missing something in what you wrote? Blueboar (talk) 21:12, 9 January 2010 (UTC)


 * I think it depends on the type of study and the specific part of it. If it's a scientific experiment, with lots of raw data and analysis, it would be essentially primary. If there is an introductory section which reviews the state of the art, discussing other studies, that part would be secondary. The concept of secondary sources is that they interpret and provide perspective for other published sources, typically primary. And we are allowed to use primary sources, just more carefully, since they are trickier to use properly, without introducing OR. Crum375 (talk) 21:41, 9 January 2010 (UTC)
 * Here is one of the sources we use in the PSTS section. For sciences, it says "report of scientific discoveries" is primary, whereas if the source "analyzes and interprets scientific discoveries", it is secondary. The bottom line is distance from the data being reported on. Crum375 (talk) 22:00, 9 January 2010 (UTC)

I think this issue is dealt with quite nicely in the section called "Complex source categorisation". Yaris678 (talk) 22:56, 9 January 2010 (UTC)
 * I agree with some parts of it, but disagree that "conclusions" of a scientific report are automatically considered secondary. I would say if it is a review of other people's work, it would be secondary, but if it's the authors' own work reported in this document, it would still be primary, until reported on by others, since there is insufficient "distance" from the data being reported. This is not to say we can't use this as a source - we can and we do - but it has to be done carefully, and if there is a secondary source discussing it, it would be a preferable starting point for a top level view. Crum375 (talk) 23:14, 9 January 2010 (UTC)


 * And here we are... once again parsing out whether the source is primary or secondary, or which parts are primary or secondary... and no one has bothered to answer the underlying issue that TMLutas raises: Whether the journal articles that are used at the climate change articles are being used appropriately or not. This is what I find so frustrating.  All the focus goes to arguing "it's primary" and "no, it's secondary"... and we never get to "it is being used correctly/incorrectly". Blueboar (talk) 23:47, 9 January 2010 (UTC)


 * I am not familiar with those articles, but I would say, without looking, that if they are reviews of other articles, they would be secondary, while if they present their own data and analysis they would be primary. But regardless of the primary/secondary classification, in a contentious issue one has to be extra careful not to create original analysis or interpretation, and rely on sources which review other sources wherever possible, instead of doing the reviews ourselves. Crum375 (talk) 00:14, 10 January 2010 (UTC)


 * Obviously, we can change the wording of that section if we decide that it is better not to classify primary publications as partly primary sources and partly secondary. However, I think I should first describe why the current wording is as it is.  It was done to address an apparent discrepancy between how we deal with sources from, say, archeology, in comparison to how we deal with sources from, say, astronomy.
 * In archeology, an archeologist will search away until they find, say, a piece of ancient pottery. This pottery will then be a primary source. The archeologist (or a number of archeologists, perhaps led by someone other than the archeologist that made the discovery, but nonetheless including that person) will then write a paper about the pottery including analysis and interpretation.  This paper will be reviewed by a number of peers before appearing in a peer-reviewed journal.  This paper will be considered a secondary source.
 * In astronomy, an astronomer will search away until they find, say, an exo-planet. The observations that point to it being an exo-planet would be considered a primary source but they are highly unlikely to be published on their own. The astronomer (or a number of astronomers, perhaps led by someone other than the astronomer that made the discovery, but nonetheless including that person) will then write a paper about the observations of the exo-planet including analysis and interpretation.  This paper will be reviewed by a number of peers before appearing in a peer-reviewed journal.  This paper will be considered a primary source.
 * I hope you can see the inconsistency here and appreciate that if we are to be consistent, we should consider the observations of the exo-planet to be a primary source and the analysis and interpretation to be secondary. Fortunately, it appears that the correct term for the paper on the exo-planet is not "primary source" but "primary publication" (see the reference provided).  Hence, it makes most sense to state that, in the sciences, the observations which appear in a primary publication should be considered to be primary sources but the analysis and interpretation should be considered secondary sources.
 * Of course, if a source has more distance and a greater scope to synthesise information this is obviously a good thing. This is why the current wording says that greater weight should be given to such secondary publications.  However, if we only allow secondary publications to be treated as secondary sources we are not really being fair to the sciences.
 * Yaris678 (talk) 09:27, 10 January 2010 (UTC)
 * There is no perfect external definition of PSTS. An external source cited in the PSTS section sheds some light, but is not absolutely clear. But for WP's purposes, we traditionally define "primary" as a source close to the raw data, and "secondary" as a source which reviews and interprets a primary source, i.e. provides a more distant view of the data. So for your example, WP-wise, a piece of pottery would be "raw data", the archeologists' original paper reporting their discovery of this pottery would be a primary source, and a subsequent review and analysis of this report would be a secondary source. The point is that if a scientist is directly involved in the collection and initial analysis of the data, he would be "close to the data", hence his report would be primary. When someone subsequently writes about and interprets the primary report, it adds distance (and perspective), and this would be a secondary source. Crum375 (talk) 13:27, 10 January 2010 (UTC)
 * To emphasize, a reliable primary source is not "taboo", and is in fact highly desirable. It's just that we need to rely on a secondary source describing the primary source to establish its notability and provide perspective and top level interpretation and analysis. We may still refer to the original primary report for additional details, as long as we don't add our own interpretation or selectively highlight some parts (and exclude others) to advance a position. Ideally, the secondary source(s) should give us the overall framework (the forest), with the primary source(s) providing the details (the trees). Crum375 (talk) 14:13, 10 January 2010 (UTC)

Problem
One of the problems with this page is that it's engaging in OR itself. For example:

"In the sciences, research papers in which ideas are first published are commonly called primary literature or primary publications.[7] Such papers may include experimental data, which is a primary source for Wikipedia. However, such papers will also contain analysis of experimental data and drawing of conclusions - these parts should be considered secondary sources within Wikipedia's usage of the term. A scientific paper may also include a survey of previous work, which is also a secondary source."

That's not correct if the scientists are writing about their own work, whether current, previous, raw data, or analysis. Scientists writing about their own research is primary-source material, no matter how it's packaged.

If editors want to write an article on primary/secondary, I suggest you develop the articles that already exist about them in mainspace, where you'll have to cite academic sources who discuss sourcing. Then if you still feel the primary/secondary section in NOR is too long, we can link to the mainspace articles, and put some of our examples in a footnote. That would shorten that section without losing any material, and without creating a fork that may be misleading. SlimVirgin TALK  contribs 14:50, 10 January 2010 (UTC)


 * For refereed publications, this is incorrect. The interpretation of the authors (and I use the plural intentionally, because especially in the sciences, most papers have multiple authors) has been examined and critiqued by the referees and journal editor, and if their objections are not met, the article is ordinarily not published.--Curtis Clark (talk) 15:23, 10 January 2010 (UTC)


 * That doesn't change that it's a primary source. The people who conducted the experiment also wrote it up. SlimVirgin  TALK  contribs 16:38, 10 January 2010 (UTC)


 * Slim, the core issue in the Wikipedia Primary/Secondary distinction has long been, as Jimbo once noted, that primary sources require interpretation by the Wikipedia editor, and therefore invite the kind of original research or synthesis that are prohibited in Wikipedia. The results of experiments and original historical sources were put on the same level; this is the principal reason that the use of such primary sources is discouraged.


 * In all published research papers, whether by a historian or a scientist, the author(s) present the raw data (measurements, experimental results, or quotations from original sources) as well as their own interpretations of the data (whether those data are the results of their own or someone else's experiments, measurements, or archival research). These interpretations are clearly stated (and for published papers have been vetted through peer review and the editorial process) and do not require interpretation by a Wikipedia editor.  Consequently all such published papers should be treated equivalently in Wikipedia.  To call scientific research primary sources and historical research secondary sources is both inappropriate and confusing.  --SteveMcCluskey (talk) 16:09, 10 January 2010 (UTC)
 * As I noted above, the distinction between primary and secondary sources is their distance, or perspective, from the data they are reporting on. As an example, according to this source, which is cited on NOR, "report of scientific discoveries" is a primary source, while "analysis and interpretation of scientific discoveries" is a secondary source. The difference is the distance: if you are involved in the collection and initial analysis of the data, your report is a primary source, while if you write about such a report and interpret it, your work is a secondary source for that data. Crum375 (talk) 16:40, 10 January 2010 (UTC)
 * Let me add that the peer review process per se adds a layer of vetting to the reported results, and therefore makes the source more reliable, but it does not add perspective and distance. The latter is only added when a report or review is written about the original primary report, which makes the former a secondary source. Crum375 (talk) 16:46, 10 January 2010 (UTC)


 * (ec, for Steve) I don't see any comparison. A scientist says, "Let's mix A and B and see what happens. Oh look, there was an explosion. Therefore, A plus B leads to C in these circumstances." His report about that is a primary source, clearly, because he was directly involved in the events he describes. Indeed, he created them.


 * I don't see what the comparison is with an historian who describes what other people did many years ago, events that he was not involved in at all. SlimVirgin  TALK  contribs 16:41, 10 January 2010 (UTC)
 * Slim, by your logic, a history of the Vietnamese War, written by a historian who served in Vietnam would have to be considered a primary source... as the historian was "directly involved in the event". As for the scientific report... what if someone else (say a lab assistant) mixed the two chemicals that resulted in the explosion... is the report still primary? Blueboar (talk) 17:21, 10 January 2010 (UTC)


 * If he was writing about issues he had direct and personal knowledge of, then yes, of course. We have that situation in the Israel-Palestine history articles. Some of the early historians about 1948 were themselves involved in it, and were being paid by the IDF's history department, so we have to treat what they write with great caution, because their work is almost (or is) primary-source material. SlimVirgin  TALK  contribs 18:11, 10 January 2010 (UTC)


 * Blueboar, I think you're trying to grapple with issues here that would need to be discussed on the article talk pages, if and when those issues arose -- "what if the scientist later had amnesia and couldn't remember he was the one who conducted the experiment, if he wrote about it would that still be a primary source?". No policy can cover every possibility. What we do is give the broad brushstrokes of the meaning of the terms, explain what our policy is and why, and leave the rest to editors. SlimVirgin  TALK  contribs 18:13, 10 January 2010 (UTC)


 * Actually, I am not grappling with the issues... I fully understand what PSTS is saying... to some degree I have been playing devil's advocate... trying to demonstrate the kind of arguments that the current language generates over and over again on the policy talk pages and at NORN. I fully agree with what PSTS says... and what it says isn't "wrong"... however, my feeling is that in the process of saying it, we create confusion.  It encourages editors to ask exactly the type of nit-picky (and pointless) questions, and raise the same petty points that I have been asking and raising.  In our various discussions, you have been able to sum up quite clearly what the PSTS section is all about - in very clear one or two line statements.  I would be much happier if PSTS did the same. Blueboar (talk) 15:31, 11 January 2010 (UTC)


 * Bear in mind, Blueboar, that the people who come to the NOR page asking the kinds of questions being raised here are the ones who (by definition) don't understand what the policy is saying. I see these terms being used daily on article talk pages by editors who clearly do understand them and who are using them correctly. SlimVirgin  TALK  contribs 19:10, 11 January 2010 (UTC)


 * Just to add: I have no problem at all if we tighten the PSTS section, so long as it's written by editors who understand what the terms mean. The problem lies in creating a policy fork. SlimVirgin  TALK  contribs 19:12, 11 January 2010 (UTC)


 * I agree... but when enough people don't understand what the policy is saying and, more to the point, when their misunderstanding stems from confusion about the same section, you begin to think that maybe the problem isn't with the editors, but with the policy. Since I don't think what the policy says is wrong, I have come to the realization that the misunderstandings must stem from how we say it. If we can clear this up without the need for a separate PSTS page, fine... but I am not sure we can.  Blueboar (talk) 22:19, 11 January 2010 (UTC)

Definitions
O.K. The issue here seems to be that we have two different definitions of a secondary source.

Version 1 – A secondary source is a work of interpretation and analysis of primary sources.

Version 2 – A secondary source is separated from primary sources by “distance”. This basically means it is written by people independent of the primary source.

Version 1 is supported by the two sources cited so far in this argument: the post by Jimbo and the BMCC library website. I see little evidence for Version 2 so far although I dare say some people use the term to mean that. I suggest that we should call Version 2 independent sources, rather than secondary sources. This fits in nicely with WP:N.

Yaris678 (talk) 00:32, 11 January 2010 (UTC)


 * No, version 2 is wrong. A secondary source has distance from the topic, not from any other primary source. It may also have distance from the latter, but that's a side issue.


 * I crash my car and give a statement about it to the police. That statement is a primary source, because I was there, I saw it. A journalist takes my statement and writes a story about it. That story is a secondary source, because he wasn't there, he didn't see it. Every judgment about whether something is a primary or secondary source can be boiled down to that simple scenario. SlimVirgin  TALK  contribs 01:10, 11 January 2010 (UTC)


 * Substitute your definition for Version 2 and everything else I said still holds. Yaris678 (talk) 01:19, 11 January 2010 (UTC)


 * I don't understand what you mean. SlimVirgin  TALK  contribs 01:21, 11 January 2010 (UTC)


 * Version 1 isn't really correct either. A secondary source might simply be a description of something the writer has no direct involvement in. It needn't be analysis. SlimVirgin  TALK  contribs 01:22, 11 January 2010 (UTC)

This is what I mean by "Substitute your definition for Version 2 and everything else I said still holds":

O.K. The issue here seems to be that we have two different definitions of a secondary source.

Version 1 – A secondary source is a work of interpretation and analysis of primary sources.

Version 2 – A secondary source has distance from the topic.

Version 1 is supported by the two sources cited so far in this argument: the post by Jimbo and the BMCC library website. I see little evidence for Version 2 so far although I dare say some people use the term to mean that. I suggest that we should call Version 2 independent sources, rather than secondary sources. This fits in nicely with WP:N.

Yaris678 (talk) 01:41, 11 January 2010 (UTC)


 * Both are correct (the second always, the first mostly), and there's no need to introduce a new term. Yaris, I see you've made 500 article edits. I suggest you spend some more time on articles where these sourcing issues matter, and perhaps reading what academics say about primary/secondary, and then if you still feel there's a problem, come back to this at a later date. There's no substitute for editing to see how the policies work in practice, and they do work. I don't think I've ever seen a content dispute that a really thorough application of all the content policies can't resolve. SlimVirgin  TALK  contribs 02:01, 11 January 2010 (UTC)


 * Our mainspace article, Primary source, gives a pretty good overview. SlimVirgin  TALK  contribs 02:06, 11 January 2010 (UTC)


 * This gives a good definition too, as does Secondary source. Hope this helps. SlimVirgin  TALK  contribs 02:09, 11 January 2010 (UTC)


 * Sorry, one more point then I'll shut up. The thing to grasp is that the key issue is distance. Whether something is primary or secondary depends on the relationship to the issue. So for example, with the car accident I described above, my statement is a primary source, because I was there; the reporter's story is a secondary source because he wasn't. But in 100 years' time, the reporter's story will also be a primary source for students of car crashes in the year 2010. Similarly, Wikipedia is now a tertiary source. Let's suppose it ends tomorrow, and is somehow lost, then a copy of it resurfaces in a thousand years' time under a rock somewhere. It will then be a primary source of material about the years 2001 to 2010, because of its closeness to that period, because it was of that period. SlimVirgin  TALK  contribs 02:22, 11 January 2010 (UTC)


 * I've been away for a few hours and have to catch up, but as I read the early discussion I cited above, the primary concern behind Wikipedia's PSTS distinction is to avoid interpretation of raw observations. Avoiding original interpretation is the essential element to avoid original research.  Published interpretations (secondary sources) generally also imply personal distance from the observations, but this is not necessarily so in all cases.
 * That being said, we should try to source any discussions about these terms by reliable scholarly discussions of them. A well-sourced essay/guideline of this sort seems a useful place to do it.  That's my two cents here.  --SteveMcCluskey (talk) 03:48, 11 January 2010 (UTC)


 * No, SlimVirgin. Even if 100 years pass, and someone digs up that research it will still be a secondary source. The paper created based on that source will be considered a tertiary source. Now if he had camera footage of the crash, that would still be a primary source. Tertiary source gives a good definition of the distinction in such cases between secondary and tertiary. Since the person 100 years from now would be so peripheral, he would be more of a tertiary source. 陣 内 Jinnai 04:02, 11 January 2010 (UTC)
 * I think Slim has it right here... look at this from another angle... If I were to write a study on how car crashes are covered by the media through history, then SV's news article about the car crash (a secondary source when it comes to discussing that specific car crash) is, for me, a primary source... it is primary within the context of my study. I am not using it to discuss the specific car crash... but to discuss how SV covered the car crash.
 * Also, the age of a document does (to some degree) influence whether historians consider a document primary or secondary. Livy's "History of Rome" is considered a primary source by historians, even though it would have been considered a secondary source at the time he wrote it. Blueboar (talk) 15:16, 11 January 2010 (UTC)

How this proposal got started
Yaris, I'm curious about how this proposal evolved. Are you the anon who asked the initial question that triggered it here? The reason I ask is that we've had quite a few situations of new users finding they had problems adding something to an article, and then wanting to change the policies as a result, when in fact it was just a misunderstanding about how to apply the policies that was the issue. SlimVirgin TALK  contribs 19:58, 10 January 2010 (UTC)


 * You have the right conversation, but I was not the anon. You can see the evolution of the idea if you look at the whole conversation in the archive at WT:No original research/Archive 48.
 * There were various posts sparked by the anon user's comment and then Blueboar said that he had long thought that PSTS needed to be re-examined because the current situation leads to people asking those sorts of questions, which actually had nothing to do with original research. Crum375 suggested that one solution would be to move PSTS to WP:SOURCES.  Various people agreed, including myself - that was my first entry into the debate.  I came back, after thinking about it, and suggested it might actually be better to make PSTS a guideline.  Blueboar said that is what he'd been thinking for a long time.  The original debate (about the anon user's suggestion) continued.  After a bit Blueboar reiterated the call to have a separate guideline so I decided to come up with a draft in my userspace, which is what we have now.
 * Yaris678 (talk) 23:53, 10 January 2010 (UTC)


 * Okay, thanks, Yaris. I think really that question could have been dealt with either on the policy talk page or the talk page of whichever article prompted it. I don't think it's appropriate to remove, or seriously dilute, a long-standing key part of a core content policy because of a misunderstanding about scientific papers. The solution is: if you're the scientist who both conducted and is writing about the research, what you write is a primary source. If you're writing about someone else's work, it's a secondary source. Both can be used, but the former must be used very carefully; no analysis can be based on it; no article should rely on such sources. Creating a policy fork because of that one point would be a mistake, in my view. SlimVirgin  TALK  contribs 00:22, 11 January 2010 (UTC)


 * The question by the anon was dealt with on the policy page. Secondly, there are two separate issues here and I don't want to conflate them:
 * Whether to have a separate guideline on PSTS
 * How we categorise scientific papers that contain analysis and interpretation as well as experimental results.
 * Thirdly, that one point made by the anon user is not the only reason for creating a separate guideline. It was just the trigger.  The main reason is to allow the NOR policy to stick to the NOR aspects of PSTS and have a place where we can talk about all the things related to PSTS – not just the NOR aspects – i.e. The definitions, issues around notability etc.  This should reduce the number of comments we get in future from confused novice users.
 * Yaris678 (talk) 01:13, 11 January 2010 (UTC)


 * All those issues are related to NOR -- the notability issue is also related to NOR. I find this proposal worrying. I see definitions being made up on the hoof, issues being conflated and misunderstood, and a policy fork being proposed of one of our most important policies.


 * As for the scientific paper thing, can you give an example so we can see clearly what you mean? If the paper is analysing work by the writer, it's a primary source. If not, a secondary one. Do you have an example where that distinction is unclear? SlimVirgin  TALK  contribs 01:19, 11 January 2010 (UTC)


 * I think you're right here. Two scientific papers that I mentioned above (Watson and Crick's DNA paper and Einstein's Special Relativity paper) are both original syntheses that draw (primarily) on other people's experimental work or on existing basic principles.  These are clearly secondary sources because they are syntheses, although some scientific usage would call them "primary publications" or "primary literature," because they are the first places where the idea appears in print.  I don't think the first published appearance of an idea is what Wikipedia means by "primary source".  Our meaning seems to be closer to the historians' usage of it as the original/raw/uninterpreted source material.  --SteveMcCluskey (talk) 03:59, 11 January 2010 (UTC)


 * I suppose there might be ways in which these could be used that would make them secondary sources, but if they're being used to describe Watson and Crick's and Einstein's work, they're primary sources. SlimVirgin  TALK  contribs 21:04, 11 January 2010 (UTC)


 * I would disagree on the classification of these papers. If a scientist evaluates information and comes up with a novel conclusion or discovery, the original paper which presents the conclusion and the data it derives from is primary, because no one has ever reached that novel conclusion before. The scientist is directly involved in collecting the background information, performing calculations or derivations, and making the discovery, which is akin to an invention. We can report on this primary paper descriptively, but to put it in perspective (e.g. assess its notability) and analyze it we need a secondary source doing it for us. If the topic is "theory of special relativity", and Einstein is directly involved in discovering it, which he reports in his original paper, he is very close to the topic and thus his paper is primary. Similarly for Watson/Crick DNA. Crum375 (talk) 04:31, 11 January 2010 (UTC)
 * We both agree and disagree. In scientific usage these are "primary publications," since scientific usage makes "the first disclosure..." the element of primary publication. (Council of Biology Editors [now Council of Science Editors], "Proposed definition of a primary publication" Newsletter, Council of Biology Editors, 1968; as quoted in Michael Derntal, Basics of Research Paper Writing and Publishing, 2009).  This is a different thing than Wikipedia's (and humanistic disciplines' in general) concept of primary sources.


 * Historians or literary critics perform exactly the same functions you describe: "collecting the background information, performing calculations or derivations, and making the discovery...", yet we do not call their research primary sources. One of the reasons this essay was proposed was to resolve the confusion arising from these two very different concepts.


 * Although Watson & Crick and Einstein may be "primary publications," in the scientific sense, to the extent that they provide syntheses of the evidence, they are not primary sources in the humanistic/historical sense. SteveMcCluskey (talk) 14:39, 11 January 2010 (UTC) -- revised 14:46, 11 January 2010 (UTC)

I think the only "sense" that matters here is the Wikipedia one, where these papers are primary sources. This means that they may not be interpreted, analyzed or put into historical perspective without a secondary source which reviews them. Crum375 (talk) 20:55, 11 January 2010 (UTC)
 * I agree that the Wikipedia sense should be the operative one here, but where does it say in Wikipedia that "the first disclosure of an idea" is a primary source. If it does, when did this get added to the Wikipedia definition?  --SteveMcCluskey (talk) 21:08, 11 January 2010 (UTC)


 * Steve, why is this being made so horribly complicated? If you do something, then write about it, whether you're describing or analysing your actions, what you write is a primary source of material about the thing you did. If I read your account, then write up my own, what I write is a secondary source. Whether it's in the sciences or humanities makes no difference. SlimVirgin  TALK  contribs 21:15, 11 January 2010 (UTC)


 * I'll agree if it's a case where you do something and then write about it; but in the case of the two papers I mentioned above, the majority of the evidence discussed by Watson & Crick and Einstein was done by other scientists. I want to avoid the claim that a scientific paper that first publishes an idea (the Structure of DNA, the Theory of Special Relativity) is ipso facto a primary source and can't be used to discuss DNA or Special Relativity.
 * To add confusion to this discussion, in other contexts, say discussing the history of those scientific discoveries, these papers would be primary sources which require interpretation by professionals to understand the contributions of Watson, Crick, and Einstein--in your terms, what they did. The distinction isn't simple and can't be reduced to a few short lines.  --SteveMcCluskey (talk) 21:26, 11 January 2010 (UTC)


 * You need to be careful, though, that you're not engaging in OR when you describe Watson and Crick's work, for example, as a secondary source. If you're writing about their work, their papers are primary sources for that work, even if you think a lot of it came from elsewhere. A paper that first publishes an idea can indeed be used as a source to discuss its own idea, and e.g. DNA in general. Primary sources may be used. They just can't be used as the basis for interpretation. SlimVirgin  TALK  contribs 21:44, 11 January 2010 (UTC)


 * Steve, if there were a background section in Einstein's article reviewing and summarizing the state of the art prior to his discovery, that could be a secondary source for the level of scientific knowledge prior to Einstein's work. But the discovery itself, despite being a "synthesis", based on and derived from other people's contributions, is known as "the special theory of relativity" and is a novel concept. So for that specific topic, as well as its derivation, the paper would be primary. Similarly for Watson/Crick and DNA. Crum375 (talk) 22:35, 11 January 2010 (UTC)

Tag
Yaris, please don't add the guideline tag. It's not for use in userspace. SlimVirgin TALK  contribs 18:22, 12 January 2010 (UTC)

Apologies for the cryptic edit summary. That was supposed to read "Of course its not a guideline - it's in my user space."

By putting the template there I am not trying to pretend it is a guideline. I don't think anyone could mistake it for a guideline since there is the hatnote explaining what it is. Since we are trying to make a draft guideline, it makes sense to show what it will look like when it is a guideline.


 * People know what it will look like. Please don't add misleading tags. SlimVirgin  TALK  contribs 18:23, 12 January 2010 (UTC)


 * It's not misleading, as I explained. Please stop dicking about with my user space.  Yaris678 (talk) 18:25, 12 January 2010 (UTC)


 * Yaris, I also think it's misleading to place official policy or guideline tags inside user space. Newer editors could easily land there by mistake and misinterpret them. There is no need to add them anywhere unless the proposal is accepted. Using false and possibly misleading policy or guideline tags anywhere on Wikipedia is disruptive. Crum375 (talk) 18:44, 12 January 2010 (UTC)

I've put in a different tag now, to show where it will go. I hope this is acceptable. Yaris678 (talk) 18:47, 12 January 2010 (UTC)


 * May I suggest that we not include any tags at this point... this is still a user's draft... it isn't even an essay or a formal proposal at this stage. Patience. Blueboar (talk) 19:52, 12 January 2010 (UTC)


 * Think I've sorted it now. Rather than having two tags, one hatnote and a title trying to give a vague impression of what it's all about, I have a single tag which, I hope, explains it clearly.  Yaris678 (talk) 09:59, 13 January 2010 (UTC)