Talk:Cross-industry standard process for data mining

Link to IBM ASUM-DM
The link to IBM ASUM-DM is broken. Can someone fix it?

Link to Shearer's paper on the Journal of Data Warehousing
The link is broken; it points to IBM's SPSS page. Can someone fix it?

--Lucas Gallindo (talk) 12:54, 28 July 2011 (UTC)

fixed broken links, removed other dead links, and updated some text Karl (talk) 01:19, 6 November 2012 (UTC)

Added sentence to compare CRISP-DM to SEMMA. The SEMMA page also recently got an ORPHAN wiki-tag, and this page seemed to be one of the most natural pages to link to SEMMA. Karl (talk) 16:18, 15 November 2012 (UTC)

FYI - Spanish Source
FYI - Spanish Source, in case this is of interest to anyone: http://www.oldemarrodriguez.com/yahoo_site_admin/assets/docs/Documento_CRISP-DM.2385037.pdf. It reviews several data mining process models. I'm not sure exactly how to cite this source, or whether Wikipedia's convention is to cite sources in non-English languages or not. Karl (talk) 05:56, 18 December 2012 (UTC)

Inclusion of some CRISP-DM 2.0 material
The section titled CRISP-DM 2.0 was recently deleted. I agree that it does not warrant its own section, but I think that some of the deleted material should be retained in the history section. I think that it is useful to readers to have information in this article that points out that the original consortium is no longer working together, the original www.crisp-dm.org website is gone, and that the initiative to create a revised/updated CRISP-DM 2.0 is no longer active (no activity for years, website gone, etc).

I've spent some time looking, and I can't find any sources to cite about CRISP-DM 2.0 being inactive. But I suppose that this is a challenge that other Wikipedia pages have faced also. Can other people please suggest how to document this? When something goes away or stops, there's not always a published source that says it's gone. But (in my opinion) it can still be something worth noting on a Wikipedia page.

For now, I will add some very brief information to the CRISP-DM page. But I welcome others to please edit it to improve it and make it comply better with Wikipedia standards for this kind of thing. Thanks. Karl (talk) 15:34, 18 December 2012 (UTC)


 * "I can't find any sources to cite about CRISP-DM 2.0 being inactive". Then we simply don't say that.
 * Verifiability policy requires that even if you're sure something is true, it must be verifiable before you can add it. Deltahedron (talk) 17:50, 18 December 2012 (UTC)
 * I agree in principle. So I have removed my interpretation that the "efforts have stalled".
 * But I think that it is just a statement of fact that 1) the websites are now gone, and 2) the consortium leaders have not communicated to the CRISP-DM 2.0 SIG members (I was one of them, and I know other SIG members). It seems to me to be worthwhile to point these things out to readers of this page, because it gives readers a sense of the current status of CRISP-DM, and whether or not it was a static thing that happened around 2000, or if there is a vibrant group working to update it.  However, it seems impossible to expect that the absence of a webpage is going to be backed up by a peer-reviewed journal citation. So, in my opinion, stating that a website is no longer there is not a controversial claim that needs a citation.  Anyone who looks will see that it is not there.  And, if in the future someone puts the website back up, then edits can be made to this wikipedia entry to reflect that the website is active.
 * And since Deltahedron has had concerns about my COI in the past, let me state openly that:
 * I participated in one meeting in 2006 that was held to discuss possible things to address if CRISP-DM 2.0 revisions were to move forward
 * In 2006 I was on a CRISP-DM 2.0 SIG email list.
 * However, I do not feel that these two things (over 6 years ago) mean that I have a COI with CRISP-DM. It is not central to my work or my life, and is not something I spend much time thinking about.  Karl (talk) 18:50, 18 December 2012 (UTC)
 * I can only reiterate Wikipedia policy on the matter: this is not optional. If you wish for further comments, there is always Reliable sources/Noticeboard.  The requirement is not for academic peer-reviewed sources but for "reliable, third-party, published sources with a reputation for fact-checking and accuracy".  Using one's own personal experience is simply not acceptable.  Stating that a website is "no longer there" asserts (1) that it was once there and (2) it is not there now.  Part (2) may be capable of direct verification by the reader, but (1) certainly is not.  If no one else in the world has published a comment on the status of this group, then it is clearly not worthwhile to do so here.  Deltahedron (talk) 19:13, 18 December 2012 (UTC)
 * Good points. I will revise it.  However, it is my personal POV that such strictness reduces the overall quality of the article.  I also feel that such strictness is not applied consistently across articles or even within this article.  E.g., each of the sentences in the first 2 paragraphs of the History section contains claims that are not backed up with citations.  I think it is fine for the authors of those sentences to have written them the way they did, without citations for each point.  Over time, if other Wikipedia authors did not agree with the material, it would get modified.  I personally think that this more relaxed standard should be applied now too.  Karl (talk) 19:35, 18 December 2012 (UTC)
 * OK, Deltahedron, I've modified it. In the end, I think it looks fine.  Thanks for your coaching, even if I wasn't always happy to hear it.  OK, that's all the time and energy I have available to devote to this page now.  I'll step back and let others make modifications to enhance it further. Happy holidays. Karl (talk) 20:26, 18 December 2012 (UTC)

Standard Methodology for Analytical Models (SMAM)
References to this Wikipedia article continue to appear in this article. This concept appears to be original research and has created a circular reference back to this article. I have not been able to find any other published material that relates to this subject - Glenryman (talk) 05:19, 6 July 2015 (UTC)

Source link for "CRISP-DM 1.0 Step-by-step data mining guide"? (current one is wrong)
As of right now the sources include the following link under the name "CRISP-DM 1.0 Step-by-step data mining guides":

ftp://ftp.software.ibm.com/software/analytics/spss/documentation/modeler/14.2/en/CRISP_DM.pdf

This leads to a document that has the title "IBM SPSS Modeler CRISP-DM Guide" - which is an entirely different document than stated in the name. (see https://inseaddataanalytics.github.io/INSEADAnalytics/CRISP_DM.pdf for a non-FTP link to a copy of that document)

I've found other online sources that have some version of the CRISP-DM Guide 1.0, but they are not so "official" / "reliable":

https://www.semanticscholar.org/paper/CRISP-DM-1.0%3A-Step-by-step-data-mining-guide-Chapman-Clinton/54bad20bbc7938991bf34f86dde0babfbd2d5a72

the above in turn has links to pdfs hosted by university of Kassel (unfortunately with some pixelated graphics):

http://www.kde.cs.uni-kassel.de/lehre/ws2016-17/kdd/files/CRISPWP-0800.pdf

https://www.kde.cs.uni-kassel.de/wp-content/uploads/lehre/ws2015-16/kdd/files/CRISPWP-0800.pdf

I've also found another pdf with better graphics here:

https://www.the-modeling-agency.com/crisp-dm.pdf

I think I'll change the source to reference one of the University of Kassel links and also the link with the better graphics. However, it would be great to have better / more reliable sources.