User:Pbsouthwood/Minimum standard for article creation

The problem is that we want to encourage the creation of articles on notable topics by new users, and at the same time manage the creation of articles by established users (quality control).

New users may have interests beyond the existing range of Wikipedia articles, and there are many gaps in our coverage. They may contribute to cover some part of those gaps. Established editors are more likely to expand coverage of topics we already cover, but in more detail. Both of these are desirable in principle.

New users can be expected to not be familiar with the criteria, and should be helped and given enough slack that they feel welcome and get an opportunity to learn the ropes and possibly become established contributors.

New users using AfC are inherently prevented from flooding the NPP queue, but the work involved in reviewing AfC is not generally less than for draft or mainspace articles, and is often more. Autoconfirmed users can bypass the queue and create directly into mainspace, which has advantages and disadvantages. By being autoconfirmed they can create articles without ever having to learn the rules, which can make life difficult for patrollers. A possible way past this problem is to autoconfirm only after the editor has displayed a minimum level of competence in article creation. This would require either criteria for a minimum quality of article and an automated way to assess it, or the editor successfully creating one or more drafts without needing major assistance, which would not really be autoconfirm. This would conflict with other purposes of autoconfirm. At present notability is one criterion, but it is open to interpretation and could not easily be automated. It should be possible to automate checking for the existence of one or more plausible references. It may also be possible to use automation to identify how many unpatrolled articles any given editor has outstanding (an Xtools type of thing, or a user subpage; see below).
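As a rough illustration of what an automated reference check might look like, here is a minimal Python sketch. It is not an existing tool. The definition of a "plausible" reference used here (a non-empty ref element containing a URL or a citation template) is my assumption, as are the function names, and it says nothing about whether the source is reliable or shows notability.

```python
import re

# Assumption: a "plausible" reference is a <ref>...</ref> element whose body
# contains either a URL or a citation template. This is a hypothetical rule.
REF_RE = re.compile(r"<ref\b[^>]*(?<!/)>(.*?)</ref>", re.IGNORECASE | re.DOTALL)
URL_RE = re.compile(r"https?://\S+")
CITE_RE = re.compile(r"\{\{\s*(?:cite\s+\w+|citation\s*\|)", re.IGNORECASE)


def count_plausible_references(wikitext: str) -> int:
    """Count ref elements that contain a URL or a citation template."""
    count = 0
    for body in REF_RE.findall(wikitext):
        if URL_RE.search(body) or CITE_RE.search(body):
            count += 1
    return count


def meets_minimum_referencing(wikitext: str, minimum: int = 1) -> bool:
    """True if the wikitext has at least `minimum` plausible references."""
    return count_plausible_references(wikitext) >= minimum


sample = (
    "Foo is a bar.<ref>{{cite web |url=https://example.org |title=Foo}}</ref> "
    "More text.<ref></ref><ref name=\"x\" />\n<references />"
)
print(count_plausible_references(sample))
print(meets_minimum_referencing("Just text, no sources."))
```

Empty ref elements and self-closing reuses of a named reference are deliberately not counted. A check like this could only flag the total absence of references; judging whether a reference supports notability would still need a human.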

Competence is required
Some competence is required to create articles that are worth the effort of review, but we allow autoconfirmed users to create mainspace articles without first demonstrating even very basic competence. This puts a huge burden on competent editors who could be doing more useful work than weeding out rubbish. Allowing "anybody" to edit has costs. It also encourages wannabe gatekeepers to get involved in patrolling, and this can have disadvantages if they RfD, Prod or Speedy articles that should be accepted and we consequently lose editors that might have reduced some of the systemic imbalance in coverage. (Our notability criteria enforce some systemic imbalance, but that is a separate issue.)

Flooding the NPP queue
Established users can reasonably be expected to know the rules and comply with them. This is particularly important when they create large numbers of articles over a short period and thereby flood the new pages patrol feed. One of the problems is that the rules can be interpreted differently, as some flexibility is required. The recent development of large language models with a tendency to hallucinate and no facility to provide references supporting their statements could aggravate this situation as it becomes easier to draft an article, but it still requires patrollers to check verifiability.

A method of throttling the flow of new articles could help. The most basic would be to not allow any new articles in mainspace when the NPP queue is too long. This could encourage editors who want to create new articles to do some patrolling to make space, but unless they patrol competently, it just exacerbates the problem. NPP is an area where competence actually is required and assessed (whether the required competence criteria are realistic or adequate is an orthogonal question). More competence is required for NPP than for article creation, so the load remains on the patrollers.

Limits linked to editor
Another way of throttling the flow would be to limit the number of unpatrolled articles per editor, so that if their personal backlog gets too high they cannot create new articles in mainspace or move drafts into mainspace until it is reduced. This should be amenable to automation. It would require a personal queue for each user, which is reduced by each article reviewed and increased by each article added to mainspace. When an editor opens a new article, the list should be automatically checked for available space, and a message displayed that the article will not be published if the queue is full. An editor could apply for a longer queue limit if and when needed. The practicability of such a queue is critical, but it should be relatively straightforward, as it looks like most of the software infrastructure already exists. This queue would probably only apply to mainspace articles, though similar queues could be made for redirects and other page types. Excessive NPP backlog could be used to shorten queue lengths across the board if the community agrees, but this is a tradeoff that might not be accepted regardless of utility.

Maybe queue overruns could be shunted to draft space. Autoconfirmed users can move them to mainspace as and when their queues shorten sufficiently.
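The per-editor queue described above can be sketched in a few lines of Python. This is only a toy model to make the idea concrete: the class name, the default limit of 5 and the "mainspace"/"draft" return values are all hypothetical, and nothing here reflects how MediaWiki or PageTriage actually stores this information.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EditorQueue:
    """Toy model of one editor's list of unreviewed mainspace articles."""
    limit: int = 5  # hypothetical default; the real value would need consensus
    unreviewed: List[str] = field(default_factory=list)

    def has_space(self) -> bool:
        """Check done when the editor opens a new article."""
        return len(self.unreviewed) < self.limit

    def publish(self, title: str) -> str:
        """Add to mainspace if there is space, otherwise shunt to draft."""
        if self.has_space():
            self.unreviewed.append(title)
            return "mainspace"
        return "draft"

    def mark_reviewed(self, title: str) -> None:
        """A review (or deletion) frees one slot in the queue."""
        if title in self.unreviewed:
            self.unreviewed.remove(title)


queue = EditorQueue(limit=2)
print(queue.publish("Article A"))
print(queue.publish("Article B"))
print(queue.publish("Article C"))  # queue is full at this point
queue.mark_reviewed("Article A")
print(queue.publish("Article C"))  # space is available again
```

Extending or shortening an editor's queue would then amount to changing `limit`, whether on request or in response to the NPP backlog.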

If an article makes reviewing easy by compliance with MoS, good referencing etc., it is likely to be reviewed sooner than an article that makes reviewing difficult.

Autopatrolled bypasses the system entirely, as the list will normally never grow, except for unpatrolled pages, which should be rare.

The intention is to encourage non-autopatrolled editors to put more effort into making new pages easier to assess, which is a win for all.

Conflicting characteristics
A large new article with several sources is more valuable if it is well structured and provides a lot of encyclopedic information. It is also more difficult and more work to review effectively. A basic stub providing only enough to show notability and scope of the topic is less work to review but of less value to the reader.

Minimum standard
Any article created directly into mainspace must have at least one reference indicating notability (GNG or SNG). It must clearly indicate the topic of the article and should reasonably indicate scope.

The minimum standard should take into account the possibility of AI page creation. We may have to require a higher standard of referencing to check verifiability. The problem with identifying LLM pages is that they will often look more plausible and better written than human-edited pages, but are inherently unreliable due to the possibility of unpredictable hallucinations. Of course human page creation is also susceptible to accidental and deliberate misinformation. The editor remains responsible for all content they add. If something looks dubious it can be tagged if it can't be checked.

Tracking unreviewed articles created by an editor
An on-wiki way of keeping track of the number, rate of creation and rate of assimilation into the wiki of a user's new article contributions is a possible solution to the "mass article creation" issue. A list could be stored on a user subpage and updated by a bot, or by several bots for different types of update. The initial population could be done by one bot; additions to the list might then be made by a bot or other routine monitoring article creation, and removals by something monitoring deletions and something monitoring review status, which could be autoreview, NPP review through the toolset, or other possibilities.

What is less clear is what options there are for methods of doing this, and what the overheads would be. As this would be needed for all articles created, reviewed and deleted, there might be quite a high setup overhead and an ongoing overhead of unknown magnitude, so there would be a cost-benefit trade-off of a size I cannot guess.

The possible benefit of such a system would be a way to put a brake on high-rate article creation automatically, based on the NPP backlog, the number of articles the user has in the queue for review, and possibly input from ORES on article quality. People occasionally creating articles of reasonable quality might never notice the system; those creating large numbers of stubs in a short period would hit a wall until their articles were reviewed or deleted to make space in their personal list for more. List length could be extended on request if needed, or reduced in problem cases or when NPP is overloaded. It might encourage more article creators to help with NPP if they had skin in the game and were competent.

There are probably also factors I have not thought of yet, but that is where I am now. Some idea of feasibility and practicability would help decide if this is worth further investigation.
 * Date of creation order, latest on top?
 * Procedure to deal with unreviewing. Should put article back in list, but where? Top of list or date of original creation? Probably top of list where it is clear that it is newly unreviewed. This may be useful for identifying and assessing some kinds of problem. Brings up question of how review and unreview are recorded.
 * Record ORES quality assessment? At which point? Some people publish when quite a bit of work has been done, others create a minimal article and then work on it for a while, others create a minimal article and abandon it to create the next. The most relevant time is probably when applying for an extension (a longer list permitted for a special reason; the argument for an extension is more convincing if most articles on the unreviewed list are start or higher by ORES). Articles that have passed review will no longer be on the list, so are irrelevant.
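To make the ordering questions in the list above concrete, here is a small Python sketch of one possible answer: newest entries on top, and an unreviewed article returned to the top of the list while keeping its original creation date. The class and field names are hypothetical, and the sketch assumes creation events arrive in chronological order; it is one option, not a recommendation.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Entry:
    title: str
    created: date                   # original creation date is always kept
    returned_to_list: bool = False  # True if put back by an unreview


class UnreviewedList:
    """Toy model of a per-user list of unreviewed articles, newest on top."""

    def __init__(self) -> None:
        self.entries: List[Entry] = []

    def add_created(self, title: str, created: date) -> None:
        # New creations go on top (assumes events arrive in date order).
        self.entries.insert(0, Entry(title, created))

    def remove(self, title: str) -> None:
        # Called when an article is reviewed or deleted.
        self.entries = [e for e in self.entries if e.title != title]

    def add_unreviewed(self, title: str, created: date) -> None:
        # An unreviewed article goes back on top, flagged so it stands out.
        self.remove(title)
        self.entries.insert(0, Entry(title, created, returned_to_list=True))

    def titles(self) -> List[str]:
        return [e.title for e in self.entries]


tracked = UnreviewedList()
tracked.add_created("Article A", date(2024, 1, 1))
tracked.add_created("Article B", date(2024, 1, 2))
tracked.remove("Article A")                            # A passes review
tracked.add_unreviewed("Article A", date(2024, 1, 1))  # A is later unreviewed
print(tracked.titles())
```

Keeping both the original creation date and a flag for returned entries would allow either ordering to be reconstructed later if the top-of-list choice turned out to be unhelpful.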

Records
Information that a page has been reviewed or unreviewed is stored on WP:

The SQL logging table keeps a record of the history in the Page Curation log. The SQL pagetriage_page table keeps a record of the current status of articles reviewed within the last month. The system assumes an article is reviewed if not in the log.

SQL logging table for review and unreview:
 * Review log
 * Unreview log

Discussion/information

 * https://en.wikipedia.org/w/index.php?title=Wikipedia:Village_pump_(technical)&oldid=1178364642#Tracking_unreviewed_new_articles_by_user.
 * The RfC on mass article creation after the Lugnuts arbitration case, Arbitration Committee/Requests for comment/Article creation at scale, in which there was no consensus on what "creation at scale" would mean, so the discussion collapsed, although there was general agreement that there is a problem with some cases of mass stub creation, particularly if based on a database. Large-scale rapid creation of any quality of article is recognised as a problem for NPP even if theoretically otherwise beneficial to the encyclopedia.
 * The Lugnuts arbitration case https://en.wikipedia.org/wiki/Wikipedia:Arbitration/Requests/Case/Conduct_in_deletion-related_editing