Talk:Research data archiving

William's deletion of the subsection Climate Science Research
William has deleted a section discussing one controversy around the lack of data archiving in climate science. He claims it has the look of an "attack." The section was not written as an attack but to explain to readers how the issue has played out in different examples. William is welcome to edit the passage to make it better, but a wholesale deletion is not warranted.RonCram 22:35, 23 March 2007 (UTC)


 * You're welcome to do a complete re-write to make it something other than an attack piece, but until you do, please don't put it back. Wholesale deletion of what you added was entirely warranted William M. Connolley 13:02, 24 March 2007 (UTC)


 * William, I already put it back. If you can point to something in what I wrote that is not accurate or well-sourced, I will change it.  But if you cannot, there is no point in attempting a rewrite.RonCram 13:42, 24 March 2007 (UTC)


 * Ron, I already took it out. If you can't see that the whole thing is badly unbalanced, you shouldn't be trying to edit this stuff William M. Connolley 13:57, 24 March 2007 (UTC)


 * William, your POV is showing. You cannot point to a single misstatement or single omission that would make it unbalanced.  Your response is to just delete it.  That is not Wikipedia policy.  I am afraid we are going to have to seek some kind of conflict resolution here.  You and I both know the facts are correct.  If you want to add something and can support it, please do.  But data withholding in climate science is a big issue and the Mann/McIntyre controversy is the most commonly known example.  I do not see how we can avoid discussing it.RonCram 14:15, 24 March 2007 (UTC)


 * Ron, I'm getting really bored with you and yours citing fake policy to try to force your edits in. Bad text can be deleted. I think your text is bad - the whole thing is just hopelessly one sided. Data withholding in CS is *not* a big issue; and the M&M stuff is the only example. I'd be happy to have it discussed in a neutral way, though. Why are you afraid of seeking conflict resolution? William M. Connolley 14:26, 24 March 2007 (UTC)

William, conflict resolution on Wikipedia is a hassle and is often based more on politics than truth. However, I think it may be unavoidable here. Your POV is such that you are unwilling for the truth to be available to Wikipedia readers. How can the Mann/McIntyre stuff be discussed in a "neutral" way without disclosing that Mann denied access to data and code for a long time and that it took an act of Congress to get him to release his code? Your POV will not allow you to admit the facts and call them "neutral." You will always claim the facts themselves are unbalanced. RonCram 14:33, 24 March 2007 (UTC)


 * It didn't take an act of congress. What exactly are you referring to? William M. Connolley 14:53, 24 March 2007 (UTC)

William, I think that you need to explain your specific points against this section rather than simply erase and say that the whole is not "appropriate" according to your view. This will allow the author to fix what you feel is wrong and perhaps find a more neutral wording. --Childhood's End 15:16, 24 March 2007 (UTC)


 * William, Mann did not provide his source code until Congress got involved. (I corrected your typo. Hope you don't mind)  BTW, I agree with Childhoodsend.RonCram 15:30, 24 March 2007 (UTC)


 * Ron Act of Congress is as described. If you don't mean that, what do you mean? William M. Connolley 17:09, 24 March 2007 (UTC)
 * William, I didn't mean act of Congress in the technical sense, only that Congress had to act - hold hearings, call Mann to testify and ask him to release his code. To avoid misunderstanding in the future, I will say "nearly took an act of Congress." RonCram 12:19, 25 March 2007 (UTC)
 * Ron, if you say AoC you can expect people to assume the std meaning. It's deeply misleading to suddenly veer off into an alternative meaning. As to your proposed new words... how can you possibly defend that? Firstly, we've no idea (now) in what sense you mean AoC. Secondly, how can you possibly know that it "nearly" took an Act, or was it just an act? William M. Connolley 13:31, 25 March 2007 (UTC)
 * Sorry I am responding late. I just found this note.  The latest version (which is now part of Scientific data withholding) links to the letter from congressmen asking Mann to provide his data, methods and source code.  This was the act of Congress I referred to, an act of investigation. I made a mistake in that I thought the technical term for making a new law required a capital A as in "Act of Congress."  RonCram 05:05, 3 April 2007 (UTC)

Stephan
Please read the article before claiming that methods are not required to be archived. All of the major journals and funding agencies require methods and source code to be archived. Regarding your fact tag on mathematics journals, I got that information from you on this Talk page. You said archiving was not required because in the field of mathematics everything needed to replicate would be included in the article. If this is not correct, please make the statement accurate and remove the tag. Thanks! RonCram 18:50, 1 April 2007 (UTC)
 * Ron, I listed a number of computer science and maths journals (because these are the ones I'm familiar with) to counter your claim that data archiving is a general requirement. The induction is all yours! --Stephan Schulz 19:17, 1 April 2007 (UTC)
 * Stephan, after listing a few journals, you wrote: "As far as I can tell, the paper itself should be detailed enough to allow reproduction." The whole point of requiring archiving is so researchers can reproduce the study. Clearly without archived data, reproduction is not possible in many fields like earth sciences and medical research. How was I supposed to understand your comment? I believe data archiving is generally required in most fields.   I will build a sample list from different fields here:
 * Astrophysics See section 3.8.1 of "Astronomy & Astrophysics: Author’s guide" July 2006 RonCram 20:55, 1 April 2007 (UTC)
 * Good idea. But I have the feeling that you still do not understand the scientific requirement for reproducibility. Research must be reproducible, not auditable. In medical research, original data collection may be impossible to reproduce because of ethical reasons (once you figure out a treatment is not working, you cannot usually try again). In that case, access to the original data is necessary for reproduction. But in most cases, reproduction is possible without access to the original data. In these cases, data availability is certainly desirable, but in no way necessary. M&M could have collected their own proxy series (Mann et al had extensive documentation about what series were used deposited with Nature). But they were not interested in reproducing Mann et al (and, in fact, not qualified to do so). What they were trying was not to refute the result, but to discredit the result by finding problems with the methods. --Stephan Schulz 21:07, 1 April 2007 (UTC)
 * Stephan, you seem to think there is no scientific value to discrediting a result which used uninformed and biased data selection methods and statistical methods. Your opinion shocks me.  This is how science progresses.  Methods are important.  And so is a complete knowledge of the data selected.  For example, if a doctor attempts a particular cure with 35 cases of cancer and 25 patients die from the treatment but the other 10 live - should the doctor be able to claim he "cured" cancer in 10 out of 10 cases? If you have read anything about the hockey stick controversy, you know that Mann changed the number of proxies he claimed he used to create the hockey stick.  He also claimed methods after the fact that were not claimed in the article itself. I do think you and I have a different idea about reproducibility.  In some situations, you may be trying to reproduce a certain result from an experiment.  In others, you may be trying to reproduce a particular image that was generated from certain datasets.  In that case, you would want to know exactly what datasets were used and the provenance of those datasets.  You would also want to know how the data was organized and analyzed.  Was smoothing used?  What kind of smoothing?  Do you see? RonCram 01:19, 2 April 2007 (UTC)
 * Oh, I see the value. But it does not affect the basic reproducibility that is required by the scientific method. And what M&M do is not refuting the result. They try to discredit the method, ignoring that a lot of other methods come to very similar results. I've read some of M&M, and know what they claim (well, some of it...they keep trying and trying). That's fairly different from what I believe. Are you aware of the fact that they have tried to attack various aspects of anthropogenic global warming before, always coming to similar results, but later to be refuted due to really embarrassing errors? I suggest you believe nothing that you cannot independently verify....--Stephan Schulz 01:50, 2 April 2007 (UTC)

Scientific data withholding
Looks like RC has just transferred the "controversial" material into Scientific data withholding, which looks ripe for VFD as a POV fork William M. Connolley 22:15, 1 April 2007 (UTC)
 * It is not a POV fork. Part of the criticism of the material earlier was that it was more about data withholding than data archiving.  Not archiving data is a violation of the policy in many, if not most, science journals.  However, it is not nearly as bad as data withholding, which is not just a violation of journal policy but also contrary to the scientific method. In the Scientific data withholding article, the material should not be considered controversial at all.  It is all well sourced and NPOV.RonCram 00:31, 2 April 2007 (UTC)


 * It cannot be a POV fork unless it takes a different point of view from the consensus of the present article - thus creating two articles on the same topic from different points of view.


 * This is a spin-off and a legitimate one. You can propose a merge, and I'd second the proposal and even be willing to help with the merge. --Uncle Ed 01:41, 2 April 2007 (UTC)

Looks like a blatant fork to me; and a neologism to boot. I've put it up for AFD William M. Connolley 11:39, 2 April 2007 (UTC)

Proposed merge of Data library into Research data archiving
similar definition; and data archive is mostly a list of external links fgnievinski (talk) 00:13, 20 April 2022 (UTC)
 * ✅ Klbrain (talk) 08:55, 14 January 2023 (UTC)