Wikipedia:Bots/Requests for approval/Metriki


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Withdrawn by operator.

Metriki
Operator:

Time filed: 21:24, Tuesday February 28, 2012 (UTC)

Automatic, Supervised, or Manual: Manual

Programming language(s): Java

Source code available: no, not fully written yet

Function overview: I'm trying to write a bot that downloads page history information for data mining for my MS research.

Links to relevant discussions (where appropriate):

Edit period(s): no editing will be done, we are only downloading information, plan to use periodic batch runs

Estimated number of pages affected: no pages will be edited, estimate downloading page histories for hundreds of pages

Exclusion compliant (Y/N): Y

Already has a bot flag (Y/N): N

Function details: We are not editing any pages. We are interested in high-volume downloads of page histories without running into the limit of 500 revisions returned per request.
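
For illustration only, here is a minimal Java sketch of how a client could page through a long history in batches rather than needing a single larger response. It only builds the request URLs for the MediaWiki API's `prop=revisions` query and does no network I/O. The class name `RevisionQuery`, the helper `buildUrl`, the chosen `rvprop` fields, and the sample continuation token are assumptions made for this example; they are not taken from the operator's (unwritten) code. The idea is that each response may carry an `rvcontinue` value, which is passed back in the next request until none is returned.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds MediaWiki API URLs for fetching page history
// in batches. Names and parameter choices here are illustrative assumptions.
class RevisionQuery {
    static final String API = "https://en.wikipedia.org/w/api.php";

    // title:      page whose history is requested
    // limit:      revisions per request (the server enforces its own cap)
    // rvcontinue: continuation token from the previous response, or null
    //             for the first request
    static String buildUrl(String title, int limit, String rvcontinue) {
        StringBuilder sb = new StringBuilder(API)
            .append("?action=query&format=json&prop=revisions")
            // metadata only (no page text); "%7C" is an encoded "|"
            .append("&rvprop=ids%7Ctimestamp%7Cuser%7Csize")
            .append("&rvlimit=").append(limit)
            .append("&titles=")
            .append(URLEncoder.encode(title, StandardCharsets.UTF_8));
        if (rvcontinue != null) {
            sb.append("&rvcontinue=")
              .append(URLEncoder.encode(rvcontinue, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // First batch, then a follow-up using a made-up continuation token.
        System.out.println(buildUrl("Main Page", 500, null));
        System.out.println(buildUrl("Main Page", 500, "20120228212400|479342"));
    }
}
```

A real client would send each URL, parse the JSON, store the revisions, and repeat with the returned token, ideally with a descriptive User-Agent and a pause between requests. Accounts holding the `apihighlimits` right (normally part of the bot group) are generally allowed a higher per-request cap, which is the practical effect a bot flag would have for a read-only task like this one.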

Discussion
Can you use database dump? — HELL KNOWZ  ▎TALK 21:26, 28 February 2012 (UTC)

No, we don't have terabytes of space available for use. We are looking to download a representative sample of version histories from hundreds, not thousands, of examples.--Metriki (talk) 21:31, 28 February 2012 (UTC)
 * This bot has edited its own BRFA page. Bot policy states that the bot account is only for edits on approved tasks or trials approved by BAG; the operator must log into their normal account to make any non-bot edits. AnomieBOT ⚡ 21:33, 28 February 2012 (UTC)
 * Note 2 : Presumably the bot would be named MetrikiBot or similar, but the user is new and might not have expected the BRFA process to use the name of the bot, rather than the name of the user. Headbomb {talk / contribs / physics / books} 22:17, 28 February 2012 (UTC)
 * I have it downloaded and it takes about 1.6 TB with all the revision history for the English version with no talk pages. If you don't need all the revision history it drops to about 400-600 GB and gets smaller as you start breaking things off you don't need (templates for example). Also, take a look here. I'm not sure if a bot of this type would be allowed. 71.163.243.232 (talk) 02:42, 29 February 2012 (UTC)
 * Kumioko, if you're going to retire, retire. Or at the very least don't disrupt BRFAs by making BS claims about what BAG allows or does not allow for bots. Plenty of bots like this were approved in the past, and plenty will be in the future too. Headbomb {talk / contribs / physics / books} 14:04, 29 February 2012 (UTC)
 * I concur with Headbomb that this bot's purpose is allowed as soon as the requester confirms a new account name.  MBisanz  talk 17:59, 29 February 2012 (UTC)

 MBisanz  talk 20:37, 29 February 2012 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.