Wikipedia:Articles for deletion/Big data (2nd nomination)


 * The following discussion is an archived debate of the proposed deletion of the article below. Please do not modify it. Subsequent comments should be made on the appropriate discussion page (such as the article's talk page or in a deletion review).  No further edits should be made to this page.

The result was keep. Arbitrarily0 (talk) 23:19, 28 April 2010 (UTC)

Big data

A (different) version was sent to AFD and consensus was to delete. This version was tagged as speedy, but it is not. Bringing here, for community to assess this version. I am neutral on it. -- Cirt (talk) 23:57, 21 April 2010 (UTC)
 * Merge to Computer data processing as "Large data sets" (a more usual term) then delete. The original article discussed the same subject, and the reasons for prior deletion - the term "big data" itself, and that the subject is insufficiently separable from generic data handling - still apply. Big is an adjective that can be applied to many, many subjects; there is nothing more notable about this combination than any other. "Big cake" (the first example that springs to mind) will require different preparation from small cake and may require a specialist oven - but it would be madness to have a separate article for this, and so it is here. I42 (talk) 06:47, 22 April 2010 (UTC)
 * But then you are merging this notable article containing numerous citations with a generic entry that has none?! Your "Big cake" argument is much more applicable to the Computer data processing article, don't you think? jk (talk) 19:51, 22 April 2010 (UTC)


 * Keep - The term is notable and has a distinct definition and technical implication. See many precedents where the sum meaning of the words is distinctly unique. Imagine a parallel debate for these terms:
 * Big Bang -- "Let's redirect Sir Fred Hoyle's coining to Nucleosynthesis."
 * Big_band -- "Let's merge this musicology term to Swing_(genre)."
 * And other entries where size matters:
 * Large_Hadron_Collider
 * Integrated_circuit
 * Let's focus on arguing the notability of the term - as referenced by numerous articles and books.
 * See particularly the Economist link where the reporter says: "“We are at a different period because of so much information,” says James Cortada of IBM, who has written a couple of dozen books on the history of information in society. Joe Hellerstein, a computer scientist at the University of California in Berkeley, calls it “the industrial revolution of data”. The effect is being felt everywhere, from business to science, from government to the arts. Scientists and computer engineers have coined a new term for the phenomenon: “big data”." jk (talk) 03:29, 23 April 2010 (UTC)
 * Do you think this newly coined term passes the high bar of WP:NEO? Abductive  (reasoning) 04:58, 23 April 2010 (UTC)
 * Yes, it does pass the bar for neologism. Here is the key distinction: "secondary sources such as books and papers about the term or concept" including full articles from ACM, Wired, Gigaom and Release 2.0. Reliable sources are writing about the term itself and not simply using the term in passing. In contrast, visit the categories at bottom to see hundreds of terms without secondary sources. Many of those need AfD notes, I would think: Dirty_data, Relational_calculus, Deductive_database, Multivalued_dependency, Media_hacker, Photoblog... jk (talk) 06:46, 23 April 2010 (UTC)
 * (ec with Abductive) The Economist article reinforces a prevailing view at the first AfD that the term is a neologism. See the lead of Avoid neologisms: "Neologisms are words and terms that have recently been coined, generally do not appear in any dictionary, but may be used widely or within certain communities" (which seems to exactly describe "Scientists and computer engineers have coined a new term"). The remainder of the guideline, and the policy WP:NOTNEO, explain why the article content and title are inappropriate, and I believe it would be best dealt with by following my recommendation that we use a more accepted term ("large data sets"), as part of the larger subject.
 * Your (JK) counterexamples reinforce my point. The Big Bang was a specific event; the term is not a reference to bangs in general which are bigger than the norm. Similarly, Big Bands are a specific kind of band and the LHC is a specific name for a single piece of equipment. However, the closest equivalent you have found - Large Scale Integration - is dealt with as part of the Integrated circuit page and this exactly mirrors my recommendation that large data sets be dealt with as part of computer data processing in general.
 * If my recommendations are followed then your work will not be undone, it will be incorporated into a parent article - and redirects will lead you to it even if you search for "big data" or "large data sets". But Wikipedia policy and guidelines seem to me to be quite clear about this. I42 (talk) 05:21, 23 April 2010 (UTC)
 * Where do you think the concept belongs - "parent article"? jk (talk) 05:55, 23 April 2010 (UTC)


 * Note: This debate has been included in the list of Computing-related deletion discussions.  -- • Gene93k (talk) 17:57, 22 April 2010 (UTC)
 * Keep, this article has a proper definition and explanation of the term "big data", and has proper references. The old version only consisted of a couple of sentences and a link to an article that didn't even mention the term. This version looks much better. J I P  | Talk 05:49, 23 April 2010 (UTC)
 * I agree that this is much better than the last article. Whether we decide to refactor it or not, there certainly is stuff here worth keeping in some form. I still have some doubts over the article title because the article could just as easily be called something like "large data sets" but I am not sure if that would point us towards a rename or a merge. I don't think that there would be anything harmful either in merging it and redirecting, or in keeping it separate and discussing the optimal name. I am not sure which is best so I will stay neutral. If we do keep them separate then a few sentences on the subject should be included in Computer data processing with a "main article" link to here and this article must link back to Computer data processing in order to have context. --DanielRigal (talk) 15:35, 23 April 2010 (UTC)
 * But the term "Big data" is addressed specifically in secondary sources by Reliable sources such as ZDNet, which makes the term unique and notable. "Large data sets" is generically descriptive and is a different species altogether. jk (talk) 17:35, 23 April 2010 (UTC)


 * Keep. Sources about this neologism itself exist. Big data is also the main topic of a number of sources. How much a few months brings, eh? Abductive  (reasoning) 04:04, 26 April 2010 (UTC)
 * Keep. Several sources, appears to show notability. Massively improved on its earlier versions. Alzarian16 (talk) 14:17, 26 April 2010 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made on the appropriate discussion page (such as the article's talk page or in a deletion review). No further edits should be made to this page.