Wikipedia:Bots/Requests for approval/CensusBot


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was

CensusBot
Operator:

Time filed: 18:15, Friday, June 2, 2017 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python

Source code available:  https://github.com/CommerceDataService/census-wikidata-bot (specific bot referred to in this request is wikipedia_bot.py)

Function overview:  A bot that checks total population and ranking values in U.S. state page infoboxes and edits them to add official values from U.S. Census Bureau APIs

Links to relevant discussions (where appropriate): 

Edit period(s):  One time currently

Estimated number of pages affected:  50

Exclusion compliant (Yes/No): No

Already has a bot flag (Yes/No):  No

Function details:

This bot has been made in conjunction with the U.S. Census Bureau and the main purpose is to check Wikipedia pages to make sure that they contain the most up to date information available from the Census Bureau and that the entries are complete. Specifically, the bot does the following:
 * Reaches out to the U.S. Census Bureau's APIs to grab state total population values for the most recent year available (starting from now and going back to 2013). The API it uses belongs to the U.S. Census Bureau Population and Housing Unit Estimates program
 * Creates a population ranking for the states in the Census Bureau API response (excludes PR and DC)
 * Iterates through each item in the Census Bureau API response and for each item, it searches Wikipedia for the appropriate page (considering redirects as well).
 * Loops through the content of the page to find the template containing the infobox and specifically containing the appropriate properties for total population and ranking (includes all possible variants of properties used for these)
 * When these properties are found, the value is compared against the API response to see if there is a match.
 * If the data does not match (e.g., the year does not match or the value does not match or it is not a complete entry), then an up-to-date entry which is properly formatted will be substituted for the existing entry and saved to the page with a preset comment ("Updating population estimate and associated population rank (when applicable) with latest value from Census Bureau").
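The ranking and substitution steps listed above can be sketched roughly as follows. This is only an illustration and is not taken from wikipedia_bot.py: the function names, the sample rows, and the infobox text are hypothetical, and the population figures are made-up placeholders rather than Census Bureau values. The parameter names PopRank and 2010Pop are the ones mentioned later in this discussion.

```python
import re

# Hypothetical rows shaped like (state name, population).
# The figures are placeholders, not Census Bureau data.
SAMPLE = [
    ("Alpha", 500),
    ("Puerto Rico", 400),
    ("Beta", 900),
    ("District of Columbia", 100),
    ("Gamma", 700),
]

# Entities left out of the state ranking, as described in the list above.
UNRANKED = {"Puerto Rico", "District of Columbia"}


def rank_states(rows):
    """Return {state: rank}, largest population first, skipping PR and DC."""
    ranked = sorted(
        (row for row in rows if row[0] not in UNRANKED),
        key=lambda row: row[1],
        reverse=True,
    )
    return {name: position for position, (name, _) in enumerate(ranked, start=1)}


def set_infobox_params(wikitext, new_values):
    """Rewrite '|name = value' lines for every parameter in new_values.

    All parameters are handled in one pass so the caller can save the page
    once. Returns (new_wikitext, changed_names); values that already match
    are left alone.
    """
    changed = []
    for name, value in new_values.items():
        pattern = re.compile(
            r"^(\s*\|\s*" + re.escape(name) + r"\s*=[ \t]*)(.*)$", re.MULTILINE
        )
        match = pattern.search(wikitext)
        if match is None or match.group(2).strip() == value:
            continue
        # Keep the '|name =' prefix so the parameter tag is never dropped.
        wikitext = pattern.sub(lambda m: m.group(1) + value, wikitext, count=1)
        changed.append(name)
    return wikitext, changed
```

Keeping the captured `|name =` prefix and replacing only the value is one way to avoid breaking the template, and returning the list of changed parameters lets the caller decide whether a save is needed at all.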

This bot will be started manually, but in the future, the hope is to automate it once it has been further developed. It will also be expanded to touch on other relevant information outside of just state population values.

Discussion
I have created a similar bot for Wikidata already. I was granted a bot flag and successfully ran my bot for 2015 state population values and 2015 county population values. That bot is in the same code base and relevant discussions can be viewed here: https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Bot/CensusBot Sasan-CDS (talk) 18:15, 2 June 2017 (UTC)
 * I unblocked the bot for you. Could you please create a userpage for it with some info, including the bot template and a link to this approval? Since the number of pages involved is small, would it be possible to determine the changes that would be made (without editing) and build a list of them here? This would give a clearer sense of what we are dealing with. — Earwig   talk  21:36, 2 June 2017 (UTC)
 * Thanks. I just updated the userpage for CensusBot.  Let me know if I need to add additional information here. Sasan-CDS (talk) 16:36, 6 June 2017 (UTC)
 * For a 50-edit one-time only run this will be fairly trivial to process and not really in need of a bot flag; assuming you want to automate it - how frequently will updates be made? — xaosflux  Talk 00:18, 3 June 2017 (UTC)
 * I will automate this task eventually. Updates would only be made as new statistics are released by the Census Bureau (yearly roughly).  I also plan on eventually touching additional metrics beyond just total pop and ranking.  This is just the first metric I planned to work on to create a working prototype and prove that it works in order to make necessary edits.  I assumed I would just make a broader request eventually. Sasan-CDS (talk) 16:36, 6 June 2017 (UTC)

I am not sure what others feel, but given that the census data is likely to be the most reliable source for population, this is a non controversial task and thus does not need a whole separate thread to show consensus (like at the WP:VPP). I am pretty sure that such consensus is already existent and many states already use this source (Florida for example) and the US Census Bureau is the most reliable authority to provide this information. However, the BRFA should still assess the usual technical implementation/questions on implementation and trial process. TheMagikCow (T) (C) 16:58, 3 June 2017 (UTC)
 * This bot has edited its own BRFA page. Bot policy states that the bot account is only for edits on approved tasks or trials approved by BAG; the operator must log into their normal account to make any non-bot edits. AnomieBOT ⚡ 15:47, 6 June 2017 (UTC)

I just realized that these state pages are locked. Do I need to make a separate edit request within the talk page for each one or is there a way for me to edit it by running my bot as I planned to do? Thanks! Sasan-CDS (talk) 16:36, 6 June 2017 (UTC)
 * Can you identify a specific page that is protected that we can check on? Bots can be granted various flags that allow editing protected pages if need be. —  xaosflux  Talk 17:14, 6 June 2017 (UTC)
 * It appears that all of the State pages I am trying to edit are protected. One example is Florida.  I just read a page on editing multiple protected pages and so I added a new topic to the Florida talk page as well: https://en.wikipedia.org/wiki/Talk:Florida#Updating_Total_Population_and_Ranking_numbers_in_Infoboxes — Preceding unsigned comment added by Sasan-CDS (talk • contribs).
 * Florida is only semi-protected, what happens when you try to edit it? Please show your steps to reproduce the problem.  Also, please sign your posts when replying or pings won't be fired. Also, why are you using external links instead of wikilinks?  I'm worried you are not familiar enough with editing Wikipedia in general to have an automated process make edits on your behalf. —  xaosflux  Talk 20:24, 6 June 2017 (UTC)
 * Also no edits should be getting made by your bot account to articles until it is approved for at least a trial. — xaosflux  Talk 20:25, 6 June 2017 (UTC)
 * The reason I am using external links instead of wikilinks is because I am referencing the exact source in which this data came from (which is the Census Bureau API for Population Estimates). Please let me know if I am making some kind of error in my formatting of this reference.  The exact text for this reference is  Sasan-CDS (talk) 14:50, 8 June 2017 (UTC)
 * I don't think you understand what I am saying. For example above you put in Florida instead of Florida . This indicates you are not familiar with wikiediting. —  xaosflux  Talk 14:54, 8 June 2017 (UTC)
 * I am following what you are saying now. Going forward, when providing internal links, I will use the proper wiki code rather than providing an external link.  Thanks for pointing that out. Sasan-CDS (talk) 15:08, 8 June 2017 (UTC)
 * This is a larger concern though, that is not just displaying a wikilink. The issue is indicative of the fact that you may not yet understand enough wiki markup to operate a bot. Adding to templates is more difficult than wiki markup and I am concerned that you do not have enough knowledge yet. TheMagikCow (T) (C) 16:22, 9 June 2017 (UTC)


 * This bot appears to have edited since this BRFA was filed. Bots may not edit outside their own or their operator's userspace unless approved or approved for trial. AnomieBOT ⚡ 15:24, 7 June 2017 (UTC)
 * As well, the edits are problematic, they break the infobox templates and all need to be reverted/fixed. ɱ  (talk) · vbm  · coi) 15:41, 7 June 2017 (UTC)
 * I have reverted my changes. Sasan-CDS (talk) 14:50, 8 June 2017 (UTC)


 * I am concerned about the edits that were made earlier today. The bot has had no trial or approval, so the bot must not be editing until these have been given (WP:BOTAPPROVAL). Bot owners must be trusted members of the community, and given the lack of experience and editing after being told not to yesterday, I am concerned that the community can't trust the owner. Further, when the bot was editing, the templates were broken, and a line break was deleted. Bugs can of course be fixed, and a bug is no reason to deny the bot outright and I do like the idea behind the bot. However, at this time I don't feel that the operator is familiar with WP:BOTPOL, or general Wikipedia policies and markup (signing posts, semi-protection knowledge), so I don't feel they can confidently operate a Wikipedia bot yet. TheMagikCow (T) (C) 17:52, 7 June 2017 (UTC)
 * This bot has been indef blocked, again. — xaosflux  Talk 20:36, 7 June 2017 (UTC)
 * I was not intentionally ignoring anyone's request not to make any edits. I applied for a bot approval to begin with (trying to follow proper protocol), provided the exact code that was running that bot and was told that the flag should not be necessary for the specific edits that I listed and to go ahead and run it.  I then tried to run it, but the edits were blocked due to the semi-protected nature of the pages.  I then filed for permission to do that within the Florida talk page as instructed by the wiki page regarding making edits to protected pages (and made reference to that request).  Then in the midst of the back and forth, I tried to run the bot again just to test if anything had changed and it ran successfully.  While checking over the edits, I then noticed that the bot made an error by leaving off the property tag with the property replacement text so I went in and reverted each change as to not cause any problems. While I am indeed still learning the ins and outs of operating a bot and following Wikipedia rules properly, I am trying my best to get up to speed and am not attempting to openly defy anyone's requests.  Let me know what you need me to do in order to follow proper protocols or what I may be missing in regard to trials (since I seem to be getting conflicted messages from different admins) and I will be happy to comply.  Thanks :) Sasan-CDS (talk) 14:50, 8 June 2017 (UTC)
 * Did you just not read the explicit message I left above, Also no edits should be getting made by your bot account to articles until it is approved for at least a trial. ? — xaosflux  Talk 14:55, 8 June 2017 (UTC)
 * After reviewing this conversation again, I think I am now understanding that I misinterpreted the previous unblocking of my bot account. I will now look into what information is available regarding getting approved for a trial to see what I can learn about this process in order to properly follow it going forward.  Please let me know any other information that you think I may need to know in order to help facilitate this process.  Thanks for your assistance. Sasan-CDS (talk) 15:14, 8 June 2017 (UTC)

Your bot has been blocked for the unauthorized editing of articles. Please see above. — xaosflux  Talk 20:40, 7 June 2017 (UTC)
 * Personally, I'm inclined to decline, although I'll let a more experienced BAG member make the final call. WP:BOTAPPROVAL states "prospective bot operators should be editors in good standing, and with demonstrable experience with the kind of tasks the bot proposes to do." I agree with that the prospective operator doesn't have the requisite experience on-site to meet this criteria. This section can be overlooked when an operator is unusually competent, cautious, and taking on tasks that are relatively low-risk (see, for instance, Bots/Requests for approval/Wiki Feed Bot), but I don't think that's the case here. The running of the bot task without authorization after AnomieBOT already posted above noting that bots require authorization to edit factors negatively when looking for unusually high competency and caution. This task directly edits high-view pages in the mainspace, so I also wouldn't call it low-risk. ~ Rob 13 Talk 22:36, 7 June 2017 (UTC)
 * I'm sorry the CensusBot broke the state page infoboxes when attempting to update the population values. I understand the concern right now with the bot approval is with my action as a bot operator. The Census Bureau and Department of Commerce are admittedly new to the Wikimedia community, and we're doing our best to operate in a community of experts. I'm also not operating independently, but as part of a team that hopes to provide more frequent, reliable updates to geographic entity pages within Wikipedia. Even though this first attempt wasn't done right, we still think the mission is important, and we've been working with others in the larger Wikimedia community to get to this point. Please let us know what you would like to see before the bot approval is granted. Sasan-CDS (talk) 18:04, 8 June 2017 (UTC)
 * A first step would be to read WP:BOTPOL in full to get a good understanding of the requirements a bot operator must abide by. ~ Rob 13 Talk 00:21, 9 June 2017 (UTC)
 * Thanks. I took a look at that and I think I have a decent understanding now.  I also changed my code to make sure there are no errors and have thoroughly tested it.  If you would like to see the sandbox I conducted my tests in and what the changes the bot performed, you can look here: User:Sasan-CDS/sandbox.  Can you let me know what I can do next to be compliant and how I can go about getting approval for a trial run?  Thanks. - Sasan-CDS (talk) 16:25, 9 June 2017 (UTC)

You never responded to my 3 June question: what will be the editing frequency for this bot? (e.g. 1x/day/page; 1x/month/page) —  xaosflux  Talk 14:45, 10 June 2017 (UTC)
 * Sorry, must have missed it somehow. The bot will be editing at a frequency of approximately 2x per year per page. Sasan-CDS (talk) 21:27, 12 June 2017 (UTC)
 * OK, I've unblocked this bot account. You can do one VERY LIMITED trial to demonstrate edits. —  xaosflux  Talk 23:04, 12 June 2017 (UTC)
 * Thanks for approving the trial. I ran my trial just now.  My bot had some issues with duplicate reference tag names initially, but I resolved that.  My errors were reverted and then my bot successfully made an edit to the New York page.  Thanks. Sasan-CDS (talk) 14:29, 13 June 2017 (UTC)
 * Request for input left at Wikipedia_talk:WikiProject_United_States. — xaosflux  Talk 14:36, 13 June 2017 (UTC)
 * Please make 4 more edits (you pick the states). — xaosflux  Talk 03:56, 14 June 2017 (UTC)
 * Thanks. I went ahead and ran the bot to edit four more states (Illinois, Pennsylvania, Ohio, and Georgia).  FYI, I checked over the four edits and I noticed that the PopRank was properly edited on the page but the total population property (2010Pop) still had an error with the tag as happened with the last run of my bot.  I realized that this was due to me not undoing the edit last time.  So if you look at the history, you will see that I reverted the latest edit first (the population rank change), then undid the previous improper edit so I could revert the page back to its original state.  Then I ran the bot again and it properly edited that page with the correct reference (and tags).  Let me know if you have any additional questions regarding the edits my bot performed.  Thanks.  Sasan-CDS (talk) 15:59, 14 June 2017 (UTC)


 * go ahead and do 20 more, demonstrate with no errors. Due to the infrequent editing requirements and possibly useful edits appearing for watchlisters approval will likely be without a bot "flag". —  xaosflux  Talk 14:56, 17 June 2017 (UTC)
 * Thanks. The 20 allotted edits have now been performed.  Sasan-CDS (talk) 17:29, 19 June 2017 (UTC)
 * OK, need to review these. — xaosflux  Talk 23:27, 19 June 2017 (UTC)


 * Hi I noticed you are updating both the   and   parameters, can you do these in the same edit instead of in multiple edits? —  xaosflux  Talk 03:25, 20 June 2017 (UTC)
 * I see what you are saying. Yes, I can do it in the same edit.  I just changed my code to do all changes in a single update rather than an update per page change.  I have tested the change in my sandbox but let me know if you want me to run an additional trial incorporating this change into my bot.  Thanks. Sasan-CDS (talk) 21:12, 20 June 2017 (UTC)
 * Please do 10 that will incorporate both changes, when done post back here please. — xaosflux  Talk 22:16, 20 June 2017 (UTC)
 * I just checked over all affected pages and unfortunately, there are no state pages left that need to have both parameters altered. I could undo 10 changes from the last trial run and then run the altered bot though to demonstrate this change.  Let me know if you want me to do this.  Thanks. Sasan-CDS (talk) 13:31, 21 June 2017 (UTC)
 * undo 5, use edit summary "Population update bot testing Bots/Requests for approval/CensusBot" then run them. — xaosflux  Talk 15:16, 21 June 2017 (UTC)
 * The 5 edits are complete. Thanks. Sasan-CDS (talk) 20:30, 21 June 2017 (UTC)
 * This bot is approved. Due to its very slow scope of articles and edit rate, it may run for this specific task, but will not be hidden from watchlists with a bot 'flag'.  Should you want to add new tasks in the future, please file additional bot review requests first. —  xaosflux  Talk 23:26, 21 June 2017 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.