Wikipedia:Bots/Requests for approval/GalliumBot 2


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at Bots/Noticeboard. The result of the discussion was

GalliumBot 2
Operator:

Time filed: 11:34, Tuesday, December 13, 2022 (UTC)

Function overview: Updates pages within Did you know/Statistics and notifies DYK nominators when their hooks perform exceptionally well

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python

Source code available: yes

Links to relevant discussions (where appropriate): None obtained, but I've been doing this semi-automatically for about a year and no one's objected. Also, it's not a very controversial task

Edit period(s): Daily

Estimated number of pages affected: 20 pages per month, with the exception of one-time full-sweeps

Namespace(s): Wikipedia, User talk

Exclusion compliant (Yes/No): Not yet

Function details: After hooks appear on Template:Did you know, they are archived to Recent additions. From Recent additions, editors used to add entries to Did you know/Statistics manually, and somewhat less than comprehensively, to log the best-performing hooks. With the addition of a semi-automatic script and standardizing templates about a year ago, updating Did you know/Statistics/Monthly DYK pageview leaders has become much easier, but it's something I should probably be running a bot for instead of clicking a button every day. I've finally rewritten the code in Python, and it's actually better than the original! It works like this: theleekycauldron (talk • contribs) (she/her) 23:39, 16 December 2022 (UTC)
 * 1) Every day, at around 03:00 UTC, it goes through Recent additions
 * 2) For each hook on the page, it creates a Hook object containing the start and end time, the text of the hook, and the image file. From there, the Hook object calls a method to calculate its own pageview data.
 * 3) For each Hook object, a function formats the object using the DYK stats table templates, building a statistics table of all the hooks and their pageviews, which the bot then saves to the stats page. An example of this can be found in the second table at Did you know/Statistics/Monthly DYK pageview leaders.
 * 4) Another function analyzes the pageviews data from the set of Hook objects and creates a summary table with some basic information such as low, high, median, and the percentage of hooks that clear a certain bar (600 views per hour for non-imaged hooks and 1000 views per hour for imaged hooks). An example of this table can be found at the top of Did you know/Statistics/Monthly DYK pageview leaders, as well as in the lines of Did you know/Statistics/Monthly summary statistics.
 * 5) Finally, for each Hook object that passes the threshold, the bot uses a search to find anyone given a DYK credit for the hook and gives them a DYK views template, which just gives them a nice compliment and a link to their hook on the stats page.
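
A minimal sketch of steps 2, 4, and 5 above, for illustration only and not the bot's actual source. The `Hook` fields, the `clears_bar` and `summarize` names, and the use of `>=` for "clearing" the bar are assumptions; the 600 and 1000 views-per-hour figures come from the description. Pageviews are assumed to be already fetched and stored on the object.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import List, Optional

# Bars quoted in the function details (views per hour).
NON_IMAGED_BAR = 600
IMAGED_BAR = 1000


@dataclass
class Hook:
    """Hypothetical stand-in for the bot's Hook object."""
    text: str
    start: datetime
    end: datetime
    views: int                   # total pageviews during the run (assumed precomputed)
    image: Optional[str] = None  # image file name, or None for a non-imaged hook

    @property
    def hours(self) -> float:
        return (self.end - self.start).total_seconds() / 3600

    @property
    def views_per_hour(self) -> float:
        return self.views / self.hours

    @property
    def clears_bar(self) -> bool:
        # Assumption: meeting the bar exactly counts as clearing it.
        bar = IMAGED_BAR if self.image else NON_IMAGED_BAR
        return self.views_per_hour >= bar


def summarize(hooks: List[Hook]) -> dict:
    """Low, high, median views per hour, and percent of hooks clearing their bar."""
    rates = [h.views_per_hour for h in hooks]
    cleared = sum(h.clears_bar for h in hooks)
    return {
        "low": min(rates),
        "high": max(rates),
        "median": median(rates),
        "percent_clearing": 100 * cleared / len(hooks),
    }
```

Under these assumptions, the hooks for which `clears_bar` is true would be the ones whose credited editors receive the DYK views notice in step 5.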

Discussion
Primefac (talk) 11:35, 31 January 2023 (UTC)
 * BAG assistance needed Forgive me for this buzz, but it seems like this nomination has languished without the attention of a BAG member for a bit too long. The template said to wait a week, it's been, I believe, 8 days, and most of the current BRFAs are getting an initial comment from a BAG member on the day of or the day after. That includes a bot in the same field, DYK, that was thrown to trial almost immediately. Is there something I'm doing wrong? theleekycauldron (talk • contribs) (she/her) 05:49, 22 December 2022 (UTC)
 * -- The SandDoctor Talk 23:44, 25 December 2022 (UTC)
 * Thanks! :) Okay, breaking down the edits made, because there were a lot of 'em:
 * User talk pages. These were the only edits outside of Did you know/Statistics and its subpages for this trial run.
 * WP:DYKSTATS edits are a bit more difficult to break down, because there are so many of them – I ended up doing many more edits than I was expecting in this category, trying to standardize all the past statistics pages. The relevant pages would be any edits post-trial to the following pages:
 * Did you know/Statistics/Monthly DYK pageview leaders and all of its month-by-month subpages, and their summary subpages. Not all of these need review, but I'd be happy to go into more detail about the bugs encountered through the process, where I ran into them, and how I got around them.
 * Did you know/Statistics/Monthly summary statistics and all of its year-and-type subpages
 * Monthly summary statistics/Navigation and Monthly DYK pageview leaders/Navigation, updated with a special script to handle new months and years.
 * Truly sorry about the edit volume! Let me know if there's anything more specific you want to know. theleekycauldron (talk • contribs) (she/her) 05:56, 2 January 2023 (UTC)
 * Oh, and there's an ongoing bug I've been told about – a page got moved during its DYK run, and the script isn't currently set up to count that correctly. I may implement some kind of system for that at a later time. theleekycauldron (talk • contribs) (she/her) 05:58, 2 January 2023 (UTC)
 * I see no reason for the bot to take care of that special condition, which simply should not happen. I warned the very day when it happened last, seeing that coming, and sure enough nobody listened. I should change my user name to Cassandra ;) - Private question: if the minimum for stats is 12 * 1.000 = 12.000, why didn't I get the nice compliment for Talia Or? What about stats for the image also, in relation? ('cause I'm sure in her case the hook didn't matter at all.) For more fine tuning, you could set the results in relation to the views of the Main page that specific day, which varies, seen on User talk:Dank, with a low on the Christmas days, and a spike afterwards (which I had also predicted). Happy new year! --Gerda Arendt (talk) 06:35, 2 January 2023 (UTC)
 * Hah! Cassandra indeed :) The answer to your question would be that Talia Or had a 24-hour run, not a 12-hour run, so the minimum bar would be 24,000 (for a consistent average of 1,000 views per hour). Or scored 545.7 vph, which is about average for images in December 2022. Happy new year! theleekycauldron (talk • contribs) (she/her) 06:48, 2 January 2023 (UTC)
 * As it turns out (courtesy ping to @Gerda Arendt), if you've thumbed through the number of views on the Main Page over time, you'll see weird jumps and falls – changes of millions of views per day at a time. They're not spikes, though, they're lasting sudden changes of millions of views per day. Those are likely caused by web scrapers or otherwise, and aren't easily adjusted for in calculating DYK views. theleekycauldron (talk • contribs) (she/her) 06:50, 2 January 2023 (UTC)
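
The threshold arithmetic from the Talia Or exchange above can be checked with a short sketch. The function name `minimum_views` is hypothetical; the 1,000 views-per-hour bar, the 12- and 24-hour run lengths, and the 545.7 vph figure are taken from the comments.

```python
IMAGED_BAR = 1000  # views per hour for imaged hooks, as stated in the request


def minimum_views(run_hours: float, bar_per_hour: float) -> float:
    """Total views needed over a run to average bar_per_hour."""
    return run_hours * bar_per_hour


# A 12-hour run needs 12,000 views; a 24-hour run needs 24,000.
bar_12h = minimum_views(12, IMAGED_BAR)
bar_24h = minimum_views(24, IMAGED_BAR)

# 545.7 views per hour over 24 hours is roughly 13,097 views in total,
# which falls short of the 24-hour bar.
talia_or_total = round(24 * 545.7)
meets_bar = talia_or_total >= bar_24h
```

The comparison is made on the per-hour average, so a longer run raises the total needed in proportion.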
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at Bots/Noticeboard.