User talk:Andrewa/Wikilink alert proposal

Why this page
It's been suggested that a (much needed) change in software might be proposed: one that allowed a list of terms requiring a mandatory cross-check for disambiguation before any edit using one of those terms is saved, so that bad links are caught and flagged before the edit is saved.

This is a page to discuss that proposal. Andrewa (talk) 00:17, 10 June 2017 (UTC)

Some links
Or that's as I see it. I have never used Phabricator, and it's years since I last used Bugzilla. Have I got it right? Andrewa (talk) 11:16, 10 June 2017 (UTC)
 * Feature request (archive) doesn't seem to include the declined request referred to here, presumably because the request was denied before the current system was implemented
 * But it does seem to be in Phabricator, see below Andrewa (talk) 17:17, 11 June 2017 (UTC)
 * Bug reports and feature requests describes the current system
 * Phabricator is the software currently used to raise feature requests
 * Bugzilla is also used to process and track feature requests, but these days only the developers and MediaWiki administrators (a different group to the English Wikipedia administrators) use it, taking input from Phabricator, whereas before Phabricator was implemented feature requestors would update Bugzilla directly

Phabricator task
This link may be relevant: Alert editors before they save an edit creating a disambiguation link (or at least a high-traffic disambiguation link).

Of course, it would only work for disambiguation pages, not articles with misleading titles which attract links to both the article's subject and a different topic! Certes (talk) 14:28, 10 June 2017 (UTC)


 * Looks good! Well spotted. Andrewa (talk) 21:40, 10 June 2017 (UTC)

Original proposal
This came out of New York naming discussions, see above.

The motivation, as I see it, is to reduce or eliminate the overhead currently experienced when users wrongly link to New York meaning the city, presumably not having previewed the edit and checked the destination of the wikilink. This overhead has been cited as one of several reasons to move the article on New York State away from the base name.

If the destination of the base name were to become New York City, then these particular mislinkings would not occur. However, many users also currently link correctly to New York meaning the state, so this move would also create other mislinkings. It is unknown how many of these would occur long term, as at least some of these correct but undesirable wikilinks are created by users who have checked the link destination, and these would decrease after such a move. But they would not immediately vanish, as the editor may have previously checked the destination and would not be alerted to the fact that it had changed.

So it has been proposed that it would be better for the base name New York to have a DAB as its destination. The result of this would be that editors received an alert from DPL bot whenever they linked to New York undisambiguated, and so all of these undesirable links would be discouraged.

The initial proposal pointed out two current limitations of DPL bot:
 * It takes some time for the alert to be posted, unlike the blacklist warning which occurs instantly.
 * It requires the destination to be a DAB. There is no way to easily flag another page as a probable mislinking.

It was subsequently agreed that the first limitation is inherent in any bot, owing to performance considerations. Therefore the possibility of a MediaWiki feature request was raised.

How am I doing? Andrewa (talk) 22:17, 10 June 2017 (UTC)
 * Nicely stated, thank you!!! Castncoot (talk) 04:39, 11 June 2017 (UTC)

Discussion of original proposal
I have several questions, and some proposed answers.

Who would add and remove alertable names to the list?

It seems to me that, for a start and for ease of administration, all logged-in users should be allowed to, at first at least. But it's not good to hard-code that. Better to have a configuration parameter that allows any class of users to be selected. Then if there's a problem, it's relatively easy for a developer to change this to just be admins, or even to set up a new user class. But a new user class, with all the admin overhead that this implies, seems unnecessary.

Or perhaps two parameters, one for adding and the other for removal, and initially set the add to all logged-in users but removal to admins only, remembering that there's not a lot of overhead to having unwanted entries but that someone needs to clean them up and alert those adding them to the correct procedures (which also need to be written of course, but we don't need a developer to do that). The extra work to have the second configuration parameter, rather than just the one, seems trivial to me, so that's what I'd do. Comments?
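To make the two-parameter idea concrete, here is a minimal sketch in Python. It is not MediaWiki code, and every name in it (`AlertListConfig`, `add_min_class`, `remove_min_class`, the three user classes) is a hypothetical stand-in chosen for illustration. It only shows that adding and removing can be controlled separately and changed without touching the logic.

```python
from dataclasses import dataclass

# Hypothetical user classes (assumed for illustration, not real MediaWiki groups).
ANONYMOUS, LOGGED_IN, ADMIN = "anonymous", "logged_in", "admin"

# Each class is assumed to include the rights of the classes ranked below it.
_RANK = {ANONYMOUS: 0, LOGGED_IN: 1, ADMIN: 2}


@dataclass(frozen=True)
class AlertListConfig:
    """Two independent settings: who may add entries, who may remove them."""
    add_min_class: str = LOGGED_IN   # initial setting suggested above
    remove_min_class: str = ADMIN    # initial setting suggested above

    def may_add(self, user_class: str) -> bool:
        return _RANK[user_class] >= _RANK[self.add_min_class]

    def may_remove(self, user_class: str) -> bool:
        return _RANK[user_class] >= _RANK[self.remove_min_class]


# Default configuration: any logged-in user adds, only admins remove.
config = AlertListConfig()

# If adding turns out to be abused, only the setting changes, not the code.
locked_down = AlertListConfig(add_min_class=ADMIN)
```

With the defaults, `config.may_add(LOGGED_IN)` is true while `config.may_remove(LOGGED_IN)` is false, which is exactly the asymmetry described above.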

The exact format of the entries in the table controlling this function needs to be thought out. A simple entry such as New York is an obvious need, but there may be others. Would allowing a wildcard character in the syntax be a good idea? Perhaps, but perhaps we don't want just anyone to add those; that might need to be admins only. And I can't think of any example that this would usefully cover, so maybe it's not such a good idea after all. But monitoring cross-namespace links would be useful: we would use that, for example, to alert to any link from the main namespace to the user namespace, or to any talk namespace. And monitoring links to pages in a particular category would be useful too: alerting to links to DABs, for example, would provide an efficient and more timely function similar to that already provided by DPL bot. It wouldn't replace the bot but would reduce its workload.

But we'd want to restrict the alerts for links to DABs to those from the main namespace. So the table is beginning to look quite elaborate.
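One way to picture such a table is sketched below in Python. This is only an illustration under assumptions: the `Link` and `AlertRule` structures, the field names and the namespace label "Main" are all invented here, and the software is assumed to know the namespace and categories of a link target at save time. Each rule lists the conditions that must all hold; a field left as `None` means "don't care".

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class Link:
    """A wikilink about to be saved (hypothetical structure)."""
    source_namespace: str
    target_namespace: str
    target_title: str
    target_categories: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class AlertRule:
    """One table entry; every field that is not None must match."""
    target_title: Optional[str] = None
    target_namespace: Optional[str] = None
    target_category: Optional[str] = None
    source_namespace: Optional[str] = None

    def matches(self, link: Link) -> bool:
        pairs = [
            (self.target_title, link.target_title),
            (self.target_namespace, link.target_namespace),
            (self.source_namespace, link.source_namespace),
        ]
        if any(want is not None and want != got for want, got in pairs):
            return False
        if (self.target_category is not None
                and self.target_category not in link.target_categories):
            return False
        return True


def should_alert(link: Link, rules: Iterable[AlertRule]) -> bool:
    """True if any rule in the table matches the link."""
    return any(rule.matches(link) for rule in rules)


DAB_CATEGORY = "All disambiguation pages"

RULES = [
    # Simple entry: any link to the base name New York.
    AlertRule(target_title="New York", target_namespace="Main"),
    # Cross-namespace: main namespace linking to the user namespace.
    AlertRule(source_namespace="Main", target_namespace="User"),
    # Category-driven: links to DABs, restricted to the main namespace.
    AlertRule(source_namespace="Main", target_category=DAB_CATEGORY),
]
```

Wildcards are deliberately left out of the sketch, in line with the doubts above about whether they would be useful.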

Comments? More ideas? Andrewa (talk) 22:56, 11 June 2017 (UTC)


 * Phase 1 could cover links to dab pages with names not ending in (disambiguation). That doesn't need anyone to maintain a list; the software can identify these pages dynamically.  We may need a short whitelist of exceptions in case some such pages have legitimate links, but I can't think of any.  I would also be wary of allowing ordinary users to do something like add to a list but not to undo it without admin help.  We already have far too many cases like that in Wikipedia.  (I understand why my privileges would let me create an article full of libel and nonsense but not delete it, but it's not an ideal state.) Certes (talk) 08:10, 12 June 2017 (UTC)


 * Yes, the software can identify these pages dynamically. But IMO the effort required to specify, write, implement and maintain such code is going to be considerable, while if we drive it from an easily updated table it's a prototyping exercise, and we develop, test and maintain it "live", and the effort is still nontrivial but far less. Andrewa (talk) 10:58, 14 June 2017 (UTC)
 * Our most frequent problems come from a fairly discrete set of frequently linked disambiguation pages - examples include links generally intended for genres (pop, rock, and heavy metal); for musical instruments (bass, keyboard, strings, horn); for scientific concepts (cell, tissue, organ); for language/ethnicity/nationality terms (English, German, Russian, Chinese, Tamil); for news outlets (Bloomberg, Fortune, Variety); for technology-related products (Amazon, Android); and for two- and three-letter acronyms like BA, CA, EP, LP, MA. I would suggest an approach that initially focuses on these areas, and does not distinguish between IPs and registered editors. bd2412  T 12:38, 14 June 2017 (UTC)
 * Fascinating... we definitely can and should do something about that, and I may need to retract my dismissal above of the suggestion, which this excellent research supports.


 * All of these are DABs and with one exception (strings (disambiguation)) the corresponding redir exists (see here). As Certes suggests, these would all be caught by a simple routine that identified links to dab pages with names not ending in "(disambiguation)". That doesn't need anyone to maintain a list; the software can identify these pages dynamically... (my slight change in format).


 * It makes me wonder, are we right to ever have dab pages with names not ending in "(disambiguation)"? Is WP:PRIMARYTOPIC wrong to allow two options... the term should be the title of a disambiguation page (or should redirect to a disambiguation page on which more than one term is disambiguated) (my emphasis)? Perhaps what it should say is just the term should redirect to a disambiguation page.... Then we would never have a DAB at a base name, but in the case of no primary topic, would always redirect the base name to a DAB.


 * But perhaps that's a suggestion for another time. Certes' suggestion is vindicated in any case. Andrewa (talk) 16:46, 14 June 2017 (UTC)


 * Redirects such as Pop (disambiguation) →‎ Pop provide a way to mark deliberate links to the disambiguation page, for example in Pops. Linking to Pop (disambiguation) means "I really did mean to link to a dab page"; links to Pop from article space need to be fixed.


 * Pop →‎ Pop (disambiguation) would work too. We're just preferring the simplest alternative as the page title, in the same way that New York City, New York, United States redirects to New York City rather than vice versa.


 * I think the redirect bit in WP:PRIMARYTOPIC is to cover cases like Strings. That's a redirect to dab String, which covers topics called Strings as well as those called String.  There's no Strings (disambiguation) because deliberate links-to-dab can point to  String (disambiguation) instead. Certes (talk) 17:29, 14 June 2017 (UTC)


 * Good point regarding strings etc. But this situation would be even more clearly covered by my rephrasing above, would it not? Andrewa (talk) 00:00, 15 June 2017 (UTC)


 * As a rule, we do not redirect "Foo" titles to "Foo (disambiguation)" titles, per WP:MALPLACED because it creates the misleading impression that the "Foo" title is available for a primary topic article. We could undoubtedly change the rule and move all disambiguation pages to their "Foo (disambiguation)" title, but that would involve moving hundreds of thousands of pages and reconfiguring a lot of our tools, which are not well suited to fix disambiguation redirect links. bd2412  T 14:12, 15 June 2017 (UTC)


 * I should have said "Pop →‎ Pop (disambiguation) would have worked too." As BD2412 said, lots of complicated machinery now relies on doing things the other way with Pop being the actual page.  It would be difficult and unwise to change the convention. Certes (talk) 17:58, 15 June 2017 (UTC)


 * That seems settled, and it's good that it was raised IMO. Andrewa (talk) 22:06, 15 June 2017 (UTC)
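As a rough illustration of the check discussed in this thread (flag links to dab pages whose names do not end in "(disambiguation)", while leaving deliberate links such as Pop (disambiguation) alone), here is a minimal Python sketch. It assumes the software can supply the set of dab page titles at save time; `dab_titles`, `suspicious_links` and the simplified link pattern are hypothetical and do not reflect how MediaWiki actually parses wikitext.

```python
import re
from typing import List, Set

# Simplified pattern for [[Target]], [[Target|label]] and [[Target#Section]].
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")
SUFFIX = " (disambiguation)"


def normalise(title: str) -> str:
    """Trim, treat underscores as spaces, capitalise the first letter."""
    title = title.strip().replace("_", " ")
    return title[:1].upper() + title[1:]


def link_targets(wikitext: str) -> List[str]:
    """Return the normalised target of every wikilink in the text."""
    return [normalise(m.group(1)) for m in WIKILINK.finditer(wikitext)]


def suspicious_links(wikitext: str, dab_titles: Set[str]) -> List[str]:
    """Targets that are dab pages and were not linked via the
    '(disambiguation)' form, in order of first appearance."""
    flagged: List[str] = []
    for target in link_targets(wikitext):
        if (target in dab_titles
                and not target.endswith(SUFFIX)
                and target not in flagged):
            flagged.append(target)
    return flagged


# Example: dab_titles is assumed to come from the wiki's own page data.
example_text = (
    "She plays [[bass]] in a [[Pop]] band; see also "
    "[[Pop (disambiguation)]] and [[New York City|New York]]."
)
example_flagged = suspicious_links(example_text, {"Bass", "Pop"})
```

Here `example_flagged` contains Bass and Pop but not Pop (disambiguation), since the latter marks a deliberate link. A redirect such as Strings, which points to the dab String, would only be caught if redirects to dab pages were also included in `dab_titles`.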

At the risk of taking a tangent here, I had a look at Category:All article disambiguation pages - ''This is a tracking category for disambiguation pages. It enables us to use  to get the exact number of disambiguation pages in the main namespace...'' - 281,238 pages, and Category:All disambiguation pages - ''This category lists disambiguation pages in all namespaces. (For technical reasons it does not list pages in the template namespace, but there should be no disambiguation pages there anyway.)... See also Category:All article disambiguation pages...'' - 281,008 pages.

Does anyone else see it as surprising that there seem to be more DABs in the main namespace than there are in all namespaces combined?

Or is it something about Category:Tracking categories? Both wp:tracking category and wp:tracking categories are currently redlinks.

More on-topic, at some stage we should give the participants at Disambiguation pages with links a heads-up. But perhaps that's not appropriate while this is in my user space. Andrewa (talk) 23:32, 14 June 2017 (UTC)


 * Interesting. A search for pages in All article disambiguation pages but not All disambiguation pages draws a blank. Certes (talk) 10:45, 15 June 2017 (UTC)


 * Curiouser and curiouser. It seems possible that the tools that we're trying not to confuse are already rather confused. Andrewa (talk) 01:36, 16 June 2017 (UTC)

Disclosure
I should have said before, I'm skeptical that this is a useful suggestion. There's a likelihood that it will actually create more work than it avoids.

But anxious to be proven wrong. I'm going to do my best to get a positive result, as I did with the HLJC. Andrewa (talk) 11:02, 14 June 2017 (UTC)


 * In view of discussion above, the likelihood is that something practical will come out of this. I stand corrected. Andrewa (talk) 16:46, 14 June 2017 (UTC)