Wikipedia:Disambiguation page traffic test

One of the two factors in deciding whether a particular topic is primary for a given term is usage: whether it is highly likely (much more likely than any other single topic, and more likely than all the other topics combined) to be the topic sought when a reader searches for that term. A typical question, therefore, is: what proportion of readers who land on a given disambiguation page are interested in a particular article linked there? Relevant information can usually be extracted from the clickstream dataset, but in rare cases a dedicated test can be performed.

Clickstream option
The clickstream dataset shows how many times readers have arrived at article A from article B (typically by following a link). If article A is a disambiguation page, then this data will neatly show how many of the viewers of the disambiguation page have sought each of the linked articles there.

The most recent month is visualised by https://wikinav.toolforge.org. Data from earlier months (going back to 2017) is available at https://dumps.wikimedia.org/other/clickstream/, where it's accessible for those who are comfortable with querying text files or using spreadsheets.

There are some limitations: the data is available only on a monthly basis, and links that have been followed fewer than 10 times in the given month are not included.
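
For those who prefer a script to a spreadsheet, the following is a minimal sketch of how the shares could be computed from a downloaded dump. It assumes each line of the file has four tab-separated fields (referring page, target page, link type, count) and that titles use underscores; the function name and the page titles in the example are invented for illustration.

```python
import csv
from collections import Counter


def outgoing_shares(rows, source_title):
    """Return {target: share of link clicks} for clicks leaving source_title.

    rows is any iterable of tab-separated lines in the assumed layout
    (prev, curr, type, n), e.g. an opened clickstream file.
    """
    counts = Counter()
    reader = csv.reader(rows, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 4:
            continue  # skip malformed lines
        prev, curr, link_type, n = row
        # Keep only clicks that followed a link on the source page.
        if prev == source_title and link_type == "link":
            counts[curr] += int(n)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {target: count / total for target, count in counts.items()}


# Made-up sample lines; a real file would be opened with
# gzip.open(path, "rt", encoding="utf-8") and passed in the same way.
sample = [
    "Example_(disambiguation)\tExample_A\tlink\t900",
    "Example_(disambiguation)\tExample_B\tlink\t100",
    "other-search\tExample_(disambiguation)\texternal\t5000",
]
shares = outgoing_shares(sample, "Example_(disambiguation)")
for target, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{target}: {share:.1%}")
```

The shares are relative to readers who clicked some link on the page, not to all viewers of the page, and any link followed fewer than 10 times in the month is missing from the data altogether.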

Redirect test
When clickstream data is unavailable or not easily applicable, usage can be gauged by routing the links of interest through redirects that otherwise receive no traffic and observing how many views those redirects get over a set period. The procedure is as follows:
 * 1) Find a redirect for the topic of interest that doesn't receive any meaningful views. If no such redirect is available, create one; if the new redirect will otherwise serve no purpose, tag it with an appropriate template to avoid confusing people who stumble upon it.
 * 2) Pipe that redirect from the relevant entry on the disambiguation page (or from a hatnote).
 * 3) Wait a couple of weeks, then compare the page views of that redirect with those of the disambiguation page or of another redirect.
 * 4) Clean up after the test: restore the links to their previous state; if any otherwise implausible dedicated redirects have been created, have them deleted.
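
The comparison in step 3 comes down to simple arithmetic. The sketch below assumes that daily view counts for the test redirect and for the disambiguation page have already been collected for the same days; the function name and the numbers are made up for illustration.

```python
def redirect_share(redirect_daily_views, dab_daily_views):
    """Fraction of disambiguation-page views that went on to the test redirect.

    Both arguments are lists of daily view counts covering the same days.
    Returns None if the disambiguation page had no views in that period.
    """
    if len(redirect_daily_views) != len(dab_daily_views):
        raise ValueError("both series must cover the same days")
    dab_total = sum(dab_daily_views)
    if dab_total == 0:
        return None
    return sum(redirect_daily_views) / dab_total


# Invented figures for a four-day window.
redirect_views = [12, 15, 9, 14]
dab_views = [40, 50, 30, 80]
print(f"{redirect_share(redirect_views, dab_views):.0%}")
```

The result is only as reliable as the assumption that the redirect receives no views other than those coming through the piped link.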

This method has clear limits. It takes effort and time, and the piped links may confuse readers as well as editors. Traffic that reaches the articles directly from external search engines (e.g. Google) is not factored in. (Then again, those search engines have their own ways of getting readers to the correct article.) Any newly created redirects may also pick up traffic from other sources, which reduces the reliability of the data.

History
The redirect test has previously been discussed or used for:


 * Stressed Out at Talk:Stressed Out (is psychological stress a significant or primary topic for Stressed Out?)
 * New York (state) at Talk:New York (is New York (state) the primary topic for New York — as explained here, the page titles in question underwent a number of traffic tests)
 * Ovens at Talk:Ovens (disambiguation) (is oven the primary topic for ovens? — extensive testing and discussion)
 * Kylie at Talk:Kylie (is Kylie Minogue the primary topic for Kylie?)

An additional suggestion comes from one of these discussions:


 * "One confounding issue is that some of the people who land on the dab page will just balk without proceeding to select any of the links. We don't know what percentage of visitors will do that. Perhaps we should use "special redirects" for all of the serious candidates and then compare those page view numbers to each other. Only then will we know what topics are attracting visitors who come through the dab page." —BarrelProof (talk) 20:51, 1 July 2017 (UTC)

An extensive discussion of various "data collection" redirects occurred at WT:WikiProject Redirect in early 2021.