Wikipedia:Bots/Requests for approval/Wiki Feed Bot


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

Wiki Feed Bot
Operator:

Time filed: 18:57, Wednesday, January 11, 2017 (UTC)

Automatic, Supervised, or Manual: Supervised

Programming language(s): Python

Source code available: https://github.com/fako/datascope

Function overview: Get information from the API in batches. Edit a page in the user namespace when a user transcludes User:Wiki_Feed_Bot/feed. Notify users on their talk pages with an automated message (in the future).

Links to relevant discussions (where appropriate):

Edit period(s): It will edit a page that transcludes User:Wiki_Feed_Bot/feed on a daily basis and when a user clicks the "force update" link that is added to the page through the transclusion.

Estimated number of pages affected: Depends on how popular the tool will become. Each user will typically have one feed. Perhaps some will create more than one.

Exclusion compliant (Yes/No): Yes, it only edits user pages where editors place the transclusion tag. It does not check for the bots template, but it will check that the page is in the user namespace and that the tag was placed by the user who owns the page.

Already has a bot flag (Yes/No):

Function details:

Demo
You can see the tool in action on its demo page.

Bot read rights
The Wiki Feed system preprocesses information once a day. It fetches all of yesterday's recent changes, groups them by page and then starts getting meta information about these pages. It gets this information from the API and other services like Wikidata and, in the future, the Pageview API.

To be able to do this as efficiently as possible the Wiki Feed Bot would like bot read rights to fetch 5,000 items in one go. The bot reads information for about 40,000 pages each day.
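To illustrate why the higher limit matters, here is a minimal sketch of the batching arithmetic. The limit of 500 items per request for accounts without bot rights is an assumption (it is the usual limit for list queries), and `batches` is a hypothetical helper, not code from the datascope repository.

```python
def batches(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Stand-in for the roughly 40,000 pages the bot reads each day.
page_ids = list(range(40000))

# Number of API requests needed with bot read rights (5,000 per request).
requests_with_bot_rights = len(list(batches(page_ids, 5000)))

# Number of requests needed under the assumed regular limit of 500.
requests_without_bot_rights = len(list(batches(page_ids, 500)))
```

Under these assumptions the daily run needs 8 requests instead of 80.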

Currently Wiki Feed does not use the RCStream. We're considering it, but we need some time to implement this as it requires a fair amount of changes to the system.

Edit of pages in users namespace
To use Wiki Feed people need to paste some wiki text onto a page in their own user space. This wiki text has the following markup. When users add this to a page they own, Wiki Feed will create a feed on that page. The feed will show pages that have been recently changed. The user can decide how these pages should be ranked by specifying which "modules" should be used for the feed. If the transclusion specifies revision_count=1 and category_count=2 as modules, then recently edited pages with many categories and many revisions will come out on top, with the number of categories counting twice as much as the number of revisions.
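The weighting can be pictured as a weighted sum per page. The following is only a sketch under that assumption: the real modules may normalise or combine values differently, and `rank_pages` and the sample data are hypothetical.

```python
def rank_pages(pages, weights):
    """Sort pages by a weighted sum of their module values, highest first."""
    def score(page):
        # A module missing from a page contributes nothing to its score.
        return sum(weight * page.get(module, 0)
                   for module, weight in weights.items())
    return sorted(pages, key=score, reverse=True)


# Hypothetical sample of recently changed pages.
pages = [
    {"title": "A", "revision_count": 10, "category_count": 1},
    {"title": "B", "revision_count": 2, "category_count": 6},
    {"title": "C", "revision_count": 5, "category_count": 3},
]

# revision_count=1 and category_count=2, as in the example above.
weights = {"revision_count": 1, "category_count": 2}

ranked_titles = [page["title"] for page in rank_pages(pages, weights)]
```

Here page B scores 2 + 2×6 = 14, page A scores 10 + 2×1 = 12 and page C scores 5 + 2×3 = 11, so B is listed first even though A has more revisions.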

Transcluding User:Wiki_Feed_Bot/feed with the syntax above will also add a link to the page that says: "force refresh". When clicking this link the feed gets placed immediately instead of once a day. The tool makes the user wait until it is done. Once the feed has been calculated the results are added to the page where the link originated from and the user gets redirected back to their user page.

Discussion

 * This request specifies the bot account as the operator. A bot may not operate itself; please update the "Operator" field to indicate the account of the human running this bot. AnomieBOT ⚡ 19:09, 11 January 2017 (UTC)


 * This bot appears to have edited since this BRFA was filed. Bots may not edit outside their own or their operator's userspace unless approved or approved for trial. AnomieBOT ⚡ 19:09, 11 January 2017 (UTC)
 * I just looked at the code, and when I click on the "force refresh" link on the sample feed this code runs. It looks like you're using the requests library without a User-Agent header or maxlag parameter. Are there plans to add both before this bot hits production? Enterprisey (talk!) 19:13, 11 January 2017 (UTC)
 * I've updated the operator Wiki Feed Bot (talk) 20:05, 11 January 2017 (UTC)
 * As far as I know this bot has not been editing outside of its or my userspace. There is no mechanism in place to enforce this though, so it could be misused, but I was not expecting that to happen. I'll look into where this edit was made and report on this discussion thread. Wiki Feed Bot (talk) 20:05, 11 January 2017 (UTC)
 * I'm making maxlag a high priority; I only discovered its existence through this approval process. I'll make a ticket for the user agent. Both will be in place before we start announcing Wiki Feed to the public. See: &  Wiki Feed Bot (talk) 20:05, 11 January 2017 (UTC)
 * Usersearch does not reveal any edits on the enwiki for this user. Don't know what AnomieBOT found (maybe these approval pages?) and whether things are already reverted. Wiki Feed Bot (talk) 20:09, 11 January 2017 (UTC)
 * Special:Contributions/Wiki Feed Bot is what AnomieBOT is looking at. You should stop using your bot account to contribute to this BRFA, since (see WP:BOTACC) you should be using your regular account (Fako85, I assume) for responding to these. Enterprisey (talk!) 20:16, 11 January 2017 (UTC)
 * Ok will do, but I can't edit Wiki Feed Bot's user page with my own user, because I'm not editing enough. It would be great if that's possible, but otherwise I'll keep switching accounts. Fako85 (talk) 20:20, 11 January 2017 (UTC)
 * You can get that permission manually by becoming confirmed; see WP:RFP/C for instructions on how to do that. Enterprisey (talk!) 20:34, 11 January 2017 (UTC)
 * I'm highly skeptical on whether we should grant a bot flag to a bot run by an editor who isn't yet even autoconfirmed. Given the amount of damage that can be done with a bot account, bot operators are typically editors who have been around for at least a little while and built up trust with the community. ~ Rob 13 Talk 21:03, 11 January 2017 (UTC)
 * I'm here at the dev summit of my own accord, flying in from Europe. Surely that counts for something. Sitting at table #10 if you want to say hi. Fako85 (talk) 21:21, 11 January 2017 (UTC)
 * Also my partner in this is Ed Saperia who organized Wikimania 2014 Fako85 (talk) 21:24, 11 January 2017 (UTC)
 * So I'm actually at the Wikimedia Developer Summit and was able to chat with Fako85 about this. The idea is pretty neat – you can do cool things like get the most edited articles in a certain category over the past day, or get a list of recent articles documenting natural disasters, sorted by number of deaths. There is a web interface for the news feed, but Fako and Ed were hoping to bring it to the wiki as a subscription service. I personally think this could be useful, e.g. WikiProject Women could have a dedicated page that lists the most recent articles on Women, or the most recently edited by number of pageviews, etc. For now I'd like to put this BRFA on hold until the tool is more developed and we are able to discuss the idea further with the community. Given it would be subscription-only, I don't think it's particularly controversial, but the community may have input on how it should function. We should also respect community norms that we generally don't grant advanced rights to new-ish users. In that regard I can at least offer my word that the project Fako and Ed are working on is legitimate, and I do not think they are going to use the bot account to intentionally disrupt the wiki &mdash; MusikAnimal  talk  22:15, 11 January 2017 (UTC)
 * So I've been thinking about this more, and even with it being a subscription service, I think we should account for any potential misuse. My understanding, and correct me if I'm wrong, is that you subscribe by adding a configured link (that points to Tool Labs) to a wiki page, then click on the link. That will trigger the bot to update the page with the requested results. For this reason there are a few safeguards we should put in place:
 * For the userspace, the bot should only edit the page if the link was added by that user. This prevents a vandal from adding the link to someone's user page and making the bot add some unwanted content.
 * For now, the bot should only edit the userspace. If people show interest, we could extend this to the Wikipedia namespace (e.g. WikiProjects), and perhaps the template namespace. At the very least, the mainspace is a strict no-no.
 * If and when we do extend this to WikiProjects (and all of the Wikipedia or Template namespace), we'll want some sort of approval process. Again, a vandal could make the bot add unwanted, potentially offensive content unrelated to the WikiProject.
 * I'm not sure what the best approach is for the last point (having an approval process), but first we should consult a few major WikiProjects and see if they are interested. I'm going to talk to Fako more about this while we're here at the dev summit, and there also happen to be some WikiProject experts here who I'm sure will have something to say. I will ask any in-person participants to comment here as needed (rather than me speaking for them) &mdash; MusikAnimal  talk  22:43, 11 January 2017 (UTC)
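The first two safeguards listed above could be sketched as a single check before any edit. This is an illustration only: `may_edit` and `normalise` are hypothetical helpers, and how the bot determines who added the link (for example from the page history) is left out.

```python
USER_NAMESPACE = 2  # MediaWiki namespace number of "User:" pages


def normalise(name):
    """Treat underscores and spaces alike, as MediaWiki does in titles."""
    return name.replace("_", " ").strip()


def may_edit(namespace, page_title, link_added_by):
    """Allow an edit only on a user page whose owner added the link."""
    if namespace != USER_NAMESPACE:
        # Mainspace, Wikipedia and Template pages are all refused for now.
        return False
    # "User:Example/Feed" belongs to the user "Example".
    owner = page_title.split(":", 1)[-1].split("/", 1)[0]
    return normalise(owner) == normalise(link_added_by)
```

With this check a vandal who adds the link to someone else's user page gets no edit, because the page owner and the account that added the link differ.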
 * Talking to MusikAnimal about this we came up with a better way to include feeds on pages. In short: people will need to add a template to their user pages and we'll check if this template has indeed been added by the user to prevent misuse. The process is more precisely described in this ticket: Wiki Feed Bot (talk) 00:28, 12 January 2017 (UTC)
 * To summarize the discussion so far: we'll be looking for people in the community who want to use this. So far responses have been enthusiastic. We need to implement these tickets before going live:
 * Add a template that people can use
 * Specify a user agent when calling the API
 * Implement the maxlag API parameter
 * Thanks everybody for the feedback. It has been very helpful Wiki Feed Bot (talk) 00:28, 12 January 2017 (UTC)
 * Reminder to use your personal account when editing as a human! :) &mdash; MusikAnimal  talk  00:35, 12 January 2017 (UTC)


 * Regarding the use of loading images to these pages - what kind of check are you doing to ensure that fair-use images are not used? — xaosflux  Talk 02:58, 12 January 2017 (UTC)
 * It gets all the info through the API. No external images are being used. I saw a recent change in the API where some (pageprop) images are postfixed with _free and some aren't. Is that related to this topic? Currently the system uses the _free images and ignores the others. If possible I would like to show an image whenever one is available of course (even if it's not "free"), but I don't understand the policies completely. Fako85 (talk) 21:56, 12 January 2017 (UTC)
 * The policy is that enwiki-hosted images may be "fair use", and as such they can not normally be placed on pages such as user pages, project pages, etc. commons: does not have fair-use, so it is always safe to use a file from commons, but for an image from enwiki you would need to examine the licensing restrictions before including it on userpages.  —  xaosflux  Talk 02:44, 13 January 2017 (UTC)
 * Good point Xaosflux. I didn't know about these policy requirements. However, the behavior of the API seems to have changed recently (Nov 30, 2016), as described in this ticket. I'll make sure that I use the free images and steer clear of fair-use ones, which may mean that some pages will not show images in the feed. Fako85 (talk) 22:00, 13 January 2017 (UTC)
 * While Fako85 works on implementing the above, I'd like to ping Harej, who helps with WikiProject X, to get his input on whether this bot would be helpful for WikiProjects &mdash;  MusikAnimal  talk  02:35, 16 January 2017 (UTC)
 * Any updates on the above issues? &mdash; MusikAnimal  talk  16:55, 31 January 2017 (UTC)
 * It might be helpful, MusikAnimal? Depends on what filtering criteria you could use for generating lists of articles. Harej (talk) 10:50, 23 February 2017 (UTC)
 * As of now the tool has a user agent that mentions WikiFeedBot. It also makes requests with maxlag=5 and respects the Retry-After header. The remaining issue, adding a template that people can use, is still open and I hope to finish it sometime in February. Fako85 (talk) 19:54, 1 February 2017 (UTC)
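A rough sketch of what such a request loop can look like follows. It is not the code in the datascope repository: `api_get`, the `send` callable and the user agent string are hypothetical, and `send` stands in for the real HTTP call so that the retry logic can be read on its own.

```python
import time

# Hypothetical user agent string; the real one only needs to identify the bot.
USER_AGENT = "WikiFeedBot/0.1 (hypothetical contact details)"


def api_get(send, params, max_attempts=3, sleep=time.sleep):
    """Call `send(params, headers)` and retry while the API reports maxlag.

    `send` must return a (response_headers, json_body) tuple.
    """
    params = dict(params, maxlag=5, format="json")
    headers = {"User-Agent": USER_AGENT}
    for _ in range(max_attempts):
        response_headers, body = send(params, headers)
        if body.get("error", {}).get("code") != "maxlag":
            return body
        # The servers are lagged: wait as long as Retry-After asks.
        sleep(int(response_headers.get("Retry-After", 5)))
    raise RuntimeError("API still lagged after %d attempts" % max_attempts)
```

Passing `sleep` as a parameter is a design choice for this sketch: it lets the waiting behaviour be checked without actually pausing.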
 * OK sounds good. I will leave this open for now and check back with you at a later time &mdash; MusikAnimal  talk  20:09, 4 February 2017 (UTC)
 * Any updates on the planned changes? &mdash; MusikAnimal  talk  21:23, 13 March 2017 (UTC)
 * There is progress, but none that I can show. I expect to finish it by the end of next weekend. I'll keep you posted, and thanks for your patience. Fako85 (talk) 17:55, 15 March 2017 (UTC)
 * To test what happens when you include a wiki feed tag on a non-user page I'm going to include one here. It will do a fake run, so no edits will appear, but it will include some text from the feed page. Fako85 (talk) 09:47, 19 March 2017 (UTC)
 * I have it working locally now, but the tools environment is giving me some problems. Won't be able to finish this weekend. Hopefully I can make some time in the weekend to come. Keep you posted Fako85 (talk) 18:02, 19 March 2017 (UTC)
 * It's done. Sorry for the delay. What is the next step ? Fako85 (talk) 12:26, 30 March 2017 (UTC)
 * Sorry for *my* delay! I've been at WMCON but am back home now. So did we resolve this issue, whereby users add a template to a user page to have the bot update it? One thing with BRFAs is to keep the "Function details" updated. It looks like maybe the functionality described there is out of date. Let's update the function details to outline exactly how the bot will work then we'll go from there :) &mdash;  MusikAnimal  talk  21:47, 5 April 2017 (UTC)
 * It's done, and I squashed some bugs along the way. I wonder what you think about the proposal now. Fako85 (talk) 10:11, 20 April 2017 (UTC)
 * The function details look great! The only thing is I question the need to notify users when the feed is ready. If they want an immediate update, they could use the "force refresh" link, and continue their on-wiki work in a different tab in their browser. The intention of the bot is otherwise to get regular daily updates, so I don't think many would be upset if they didn't get an immediate notification. Rather, they'll just watchlist the page or remember to check back tomorrow. How does that sound? Lastly, we need some documentation on the available modules. I see mention of revision_count and category_count at User:Wiki Feed Bot, is there anything else? &mdash; MusikAnimal  talk  02:00, 27 April 2017 (UTC)
 * Thanks! The notification takes place when you press "force refresh". It takes about 30s to update the page. Currently you get redirected to a wait page. The idea was to immediately return somewhere instead of going to a wait page and notify when the page is done. However I think we can still improve on the performance quite a bit. Then perhaps the wait will be less long. This optimization recently occurred to me and I don't mind dropping the talk page requirement for now and add it if we really need it. So I removed it. The documentation is a good point. We also need many more modules. The next step is that people can write Javascript functions on pages which will get used as modules (in a sandboxed environment). Until that time we'll document the modules with comments in the methods and people can make a PR if they want to add anything. Information about this process can be next to the "force refresh" link. I think Ed Saperia should have a say in how we involve the community as he'll be taking the lead there more than me. However Britain is in the middle of an election as you probably know and he is very busy with campaigning. So we can pick this up earliest in June. Do you already have an idea what kind of module you would like to have? We can write one or two for testing purposes ;) Fako85 (talk) 08:31, 27 April 2017 (UTC)
 * At Dev Summit you did mention using pageviews, which would be cool :) But frankly I don't have many opinions on what modules to include. My position here is more to help you get this out the door as a bot approver. In order to approve the bot, I don't think we need to test every single module you think you'll ever add, but it may be good to cover a lot of ground and check the numbers for accuracy. The custom JavaScript modules also sound interesting, and it may be good to get that tested as part of this BRFA, if you intend on adding that functionality anytime soon &mdash; MusikAnimal  talk  15:52, 28 April 2017 (UTC)
 * Pageviews are possible, but relatively expensive. Because you can't get batches from the API yet, it takes as many API calls as there are pages in the set. I'm looking for more efficient ways, but maybe I should enable it before the improvements to see if it is useful in the first place. The dynamic modules will take a while to do right, I think. Perhaps it will be done after the summer. Fako85 (talk) 20:22, 8 May 2017 (UTC)
 * Is this going to be on hold for a while? — xaosflux  Talk 23:36, 8 June 2017 (UTC)
 * I hope not. I'd prefer to develop this project in an agile way and not get stuck on the approval, because new features may be introduced in the coming months. If we get approval we can start asking developers and editors to participate. If we do not have approval we'd infringe the rules afaiu. Would you like to see more before approval? Fako85 (talk) 19:51, 12 June 2017 (UTC)
 * We need to see the bot actually run (after a trial is approved, just to be clear) before being able to approve it. When you're ready to run a small trial, please let us know and we can approve one. Right now, I don't think it's very clear where we are on the development of this bot. ~ Rob 13 Talk 15:38, 13 June 2017 (UTC)
 * Thanks for clarifying that. I'm new to all these procedures. The bot is ready for a trial period. There were a few suggestions, and they have been implemented. However, I'm leaving tomorrow for a holiday with no internet, so I think it is best if I ping people here when I'm back in July to start the trial. Fako85 (talk) 16:32, 15 June 2017 (UTC)
 * Sounds good! I don't think there's any issue leaving this open, so long as we get to a trial at some point. Enjoy your holiday! Just give us a ping when you return. Looking forward to it &mdash; MusikAnimal  talk  16:04, 16 June 2017 (UTC)
 * I'm back from my holiday and I found 4 people outside of BAG interested to test. I'll need to write some modules for them, which I'll try to finish this weekend. After that these users would like to participate in the trial. Anything else that I need to do to start the trial? Fako85 (talk) 09:33, 12 July 2017 (UTC)
 * Will you be able to trial without the high API limits? —  xaosflux  Talk 17:39, 15 July 2017 (UTC)
 * It would definitely be possible. Not sure if it is desirable. Don't we want this to be part of the test?
 * I've found 3 users that are willing to be part of the test. One looks at possible bias in Wikipedia articles. The other two will watch "breaking news". I've created some modules for them, but I'll have to debug the breaking news one. I'll try to take a look at that on Wednesday.


 * OK to trial, if this can't function without highapi's please let me know - it will mean having to flag the account as a bot early. — xaosflux  Talk 22:21, 17 July 2017 (UTC)
 * Trial stopped, bot account is blocked pending operator response here. Any admin may unblock without consultation if the issue is resolved. —  xaosflux  Talk 01:50, 18 July 2017 (UTC)


 * Despite the response above regarding the use of non-free images, this account is still being used to place clearly marked non-free images outside of articles, in violation of fair-use practices. See page history for reported examples. — xaosflux  Talk 01:54, 18 July 2017 (UTC)
 * The response above is about a similar, but slightly different issue. The problem is that the image is initially marked as free. It is marked as free when the bot makes its edits. Then something happens in the real world and the commons image license gets changed. It is these changes that the bot is not picking up on. I've started a conversation with the editor that runs into problems with this case. I'm hoping to learn how things would work out for her. Fako85 (talk) 06:13, 18 July 2017 (UTC)
 * Out of pure curiosity, how does the bot block mechanism work? Does it disallow edits from those users? For good measure I've stopped the cronjob for the time being. Fako85 (talk) 06:14, 18 July 2017 (UTC)
 * It is the same as an editor block, disallows edits - can be removed by any admin. —  xaosflux  Talk 10:55, 18 July 2017 (UTC)
 * See also related discussion at Wikipedia_talk:Non-free_content. — xaosflux  Talk 10:55, 18 July 2017 (UTC)


 * To summarize the outcomes of discussions outside this page: the page_image_free property from the API is unreliable. Xaosflux and I decided that it would be better to check all images. Any images from commons are ok to use. Any images from enwiki that specify Category:All free media are also ok. If an editor accidentally places an image in this category, the Wiki Feed Bot may use that image. When this mistake is corrected, the image may remain visible in feeds for at most 24 hours. After that Wiki Feed Bot will remove or replace the image. We'll have to explain this policy clearly somewhere. I'm close to finishing these changes. Fako85 (talk) 12:42, 22 July 2017 (UTC)
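The rule agreed on above can be sketched as a small filter. This is an illustration under assumptions, not the bot's actual code: the `host` and `categories` fields and both function names are hypothetical, and fetching an image's host wiki and categories from the API is left out.

```python
FREE_CATEGORY = "Category:All free media"


def image_is_usable(host, categories):
    """Apply the rule: commons is always fine, enwiki only if marked free."""
    if host == "commons":
        return True
    if host == "enwiki":
        return FREE_CATEGORY in categories
    # Anything else is treated as not usable, to stay on the safe side.
    return False


def filter_feed_images(images):
    """Keep only the images that may be shown in a feed on a user page."""
    return [image for image in images
            if image_is_usable(image["host"], image["categories"])]
```

Running such a filter on every daily update is what bounds the exposure to 24 hours: an image that loses the category is dropped at the next run.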


 * the bot has been unblocked, and trials may proceed. — xaosflux  Talk 21:53, 22 July 2017 (UTC)


 * Your bot trial appears to be complete. Do you wish to continue on this BRFA?— CYBERPOWER  (Around ) 06:45, 20 August 2017 (UTC)
 * Also pinging — CYBERPOWER  (Around ) 06:47, 20 August 2017 (UTC)
 * Seeing as there are no further complaints regarding copyright, I am marking this approved.— CYBERPOWER  ( Chat ) 08:40, 26 August 2017 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.