User:Chlod/Analysis/2021 Pacific typhoon season

The 2021 Pacific typhoon season article has accumulated a total of 4,268 revisions (and counting) after only 9 months. Many season articles in WikiProject Tropical cyclones suffer from inflated edit counts. For 2021 (not including two-year season articles), the North Atlantic basin accumulated a total of 3,582 revisions, the Eastern Pacific basin 2,454 revisions, and the North Indian Ocean basin 1,137 revisions. The 2020 Pacific typhoon season accumulated a total of 5,980 revisions.

The extremely inflated edit count prompted me to investigate what exactly causes all these edits. This page documents the results of that investigation.

A brief note: "diff size" refers to the change in bytes of a diff.
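
As a minimal sketch of that definition, the diff size of each revision can be derived from the page sizes of consecutive revisions. The function name and the sample sizes below are made up for illustration and are not taken from the article's actual history.

```python
def diff_sizes(page_sizes):
    """Return the signed byte change introduced by each revision.

    page_sizes: page sizes in bytes, oldest revision first.
    The first revision is compared against an empty page (0 bytes).
    """
    previous = 0
    changes = []
    for size in page_sizes:
        changes.append(size - previous)
        previous = size
    return changes


# Made-up page sizes, for illustration only
sizes = [1200, 1204, 1198, 3650]
print(diff_sizes(sizes))  # [1200, 4, -6, 2452]
```

A positive value means the revision added bytes, and a negative value means it removed them.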

Why bother?
Edits with barely any changes are rather mundane. They mostly change numbers and storm data around in order to present the latest values. The inflated edit counts, however, pose a significant problem for editors working on prose.


 * 1) An inflated edit count makes it harder to navigate the page history for significant additions. Special:PageHistory does not offer filters for diff size. This is especially hard on those looking for specific revisions, such as editors attributing intrawiki copies, which calls for a revision ID wherever possible.
 * 2) Modifying current storm data on the season article will inevitably inflate edit counts on the storm articles as well, given that they are published while the storm is active. This is because editors will attempt to update the data on every article, not just the season article.
 * 3) For new editors who have edited no other article, this might be considered a form of gaming extended-confirmed rights.
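
The first point can be worked around offline. The sketch below filters a list of revisions down to those with a large diff size, which is the filter the page history view lacks. It assumes each revision is a dict with "revid" and "size" keys; the function name, the 500-byte threshold, and the sample history are all hypothetical.

```python
def significant_revisions(revisions, min_change=500):
    """Return IDs of revisions whose absolute diff size is at least min_change bytes.

    revisions: dicts with "revid" and "size" keys, oldest revision first.
    min_change: hypothetical threshold; pick whatever counts as "significant".
    """
    kept = []
    previous_size = 0
    for rev in revisions:
        if abs(rev["size"] - previous_size) >= min_change:
            kept.append(rev["revid"])
        previous_size = rev["size"]
    return kept


# Made-up history: two prose additions surrounded by small data updates
history = [
    {"revid": 101, "size": 40000},
    {"revid": 102, "size": 40003},
    {"revid": 103, "size": 42100},
    {"revid": 104, "size": 42098},
]
print(significant_revisions(history))  # [101, 103]
```

Anyone looking for the revision that introduced a passage would then only need to inspect the revisions that survive the filter.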

Interpretation
Based on the data above, at least % of edits can be attributed to low-byte data-related changes. These edits (which eventually get removed when a storm passes) do nothing but pad the edit history, making it hard to navigate and wasting computational resources. Due to the sizes of season articles (some going over 250 kB), saving edits may take a long time for some editors. When crawling through diffs to look for insertions, the ~250 kB page must be re-rendered over and over again for small number changes that, once the storm has passed, were essentially in vain.

Solutions
What can be done about it? We are presented with a few options.

Option 1
Entirely stop the usage of current storm information sections and current tropical cyclone infoboxes. This is the least recommended option.

Current storm information sections and infobox tropical cyclone current are highly visible parts of an article that provide the latest information to a reader. Wikipedia must recognize its role as a source of information; we should not remove these sections for the sake of edit counts, especially when readers expect to see them most.

Option 2
Move all data-related edits to the template space. This way, edits to templates are entirely isolated and kept separate from article edits, effectively reducing the edit count for articles.

This system follows a scheme similar to the X1 sandbox template set: each tropical cyclone is assigned a placeholder value, which it holds until it dissipates. This resembles how the Japan Meteorological Agency handles tropical cyclones on its cyclone details page (numbers assigned from 60 to 65). This method requires only wikitext and is fairly easy to implement. It does, however, inflate the edit counts of the templates being used or, even worse, show incorrect storm information in earlier revisions of the article, since an old revision would display whichever storm currently occupies the placeholder. Thus, it is considered a less likely option.
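
The placeholder scheme can be sketched as a small pool of reusable slots. This is only an illustration of the idea, not a proposed implementation: the class name and storm names are hypothetical, and the 60 to 65 range is borrowed from the JMA example above rather than from any actual template numbering.

```python
class SlotPool:
    """Hands out placeholder numbers to active storms and reclaims them on dissipation."""

    def __init__(self, first=60, last=65):
        self.free = list(range(first, last + 1))  # unused slots, lowest first
        self.assigned = {}  # storm name -> slot number

    def assign(self, storm):
        """Give the storm the lowest free slot, or return the slot it already holds."""
        if storm in self.assigned:
            return self.assigned[storm]
        if not self.free:
            raise RuntimeError("no free placeholder slots")
        slot = self.free.pop(0)
        self.assigned[storm] = slot
        return slot

    def release(self, storm):
        """Free the storm's slot once it dissipates so a later storm can reuse it."""
        slot = self.assigned.pop(storm)
        self.free.append(slot)
        self.free.sort()
        return slot


pool = SlotPool()
print(pool.assign("Storm A"))  # 60
print(pool.assign("Storm B"))  # 61
pool.release("Storm A")
print(pool.assign("Storm C"))  # 60
```

The last line shows the weakness described above: slot 60 now belongs to a different storm, so anything still pointing at that slot would show the wrong storm's data.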

Option 3
Move all current storm information and current tracking data to Wikidata, where it should be. This is the most recommended option, as a structured data system like Wikidata is already optimized to handle large edit counts and minute data changes.

This falls under the idea of keeping data edits on Wikidata and prose edits on Wikipedia, forming a clear separation between the two. Implementation would require a bit of Lua programming, but will definitely be useful in the long term. A problem that has to be addressed with this, however, is the Watches and warnings section, which cannot simply be imported from structured data without a complicated data structure. If this option is used, W&W sections may need to be kept in prose anyway. Nonetheless, this already cuts down a lot of edits that would otherwise be in the season article.
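
To illustrate the separation, the sketch below renders a status sentence from a structured record, so that updating the numbers never touches the article text. It is written in Python purely for illustration; an on-wiki version would be a Lua module reading from Wikidata. The function name, the record fields, and the sample values are all hypothetical.

```python
def render_current_storm(record):
    """Build a one-sentence status line from a structured storm record.

    The record fields used here are hypothetical, not an actual Wikidata schema.
    """
    return (
        f"As of {record['as_of']}, {record['name']} has maximum sustained "
        f"winds of {record['wind_kmh']} km/h and a central pressure of "
        f"{record['pressure_hpa']} hPa."
    )


# Made-up record standing in for data that would live outside the article
record = {
    "name": "Example",
    "as_of": "00:00 UTC",
    "wind_kmh": 120,
    "pressure_hpa": 975,
}
print(render_current_storm(record))
```

Under this arrangement, a new advisory changes only the record, and the article's revision history would record prose changes alone.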

Implementation plans for this option are detailed in User:Chlod/WikiProject Tropical cyclones data migration.