User:John Broughton/Wikipedia 2.0/Rationale

Summary/top
Clay Shirky, "A Group Is Its Own Worst Enemy", speech at ETech, April 2003

Founder of Wikipedia wants quality; reviewed ideas of others; this whitepaper is a proposal for taking Wikipedia quality to the next level.

Quality problems in Wikipedia 1.0

 * New articles that should not have been added.
 * Moves that should not have been made.
 * Addition of unsourced or poorly sourced information
 * Link rot
 * Vandalism (direct damage and the time and energy to fight it). (Measuring vandalism: the most popular articles in Wikipedia are one subpopulation of interest.)
 * Too many screens, keystrokes, and complicated processes for things other than editing articles (for example, proposing an article for deletion; reporting vandalism)
 * Article talk pages where chronology and authorship are difficult or impossible to understand.

Roughly 4,000 articles are added each day, and about half that number are deleted that same day, Wikipedia says, by administrators who determine that an article is not up to standards.

One day's analysis of deletions (about 4,000), posted to WP:AN on 1/4/2007, showed about 2,100 mainspace deletions, with talk pages (presumably most associated with corresponding articles) being the second largest category of deletions.

See: Why Wikipedia is not so great

What this proposal doesn't address
This proposal doesn't propose ways to reduce arguments between editors, to improve grammar, spelling, and other writing deficiencies in articles, or to address the lack of depth of articles. Nevertheless, even for such problems, the recommendations in this paper can help indirectly: as articles improve and the complexity of being an editor decreases, good editors will have more time to edit, will be more likely to stay, and will be more likely to join.

Wikipedia 2.0 uses the following new (or modified) concepts:

Current situation
Currently, Wikipedia uses the label "project page" for three different categories of pages (all of which have the "Wikipedia" prefix as part of the name):
 * Discussions pertaining to individual pages, such as Articles for deletion/Kevin Zeese
 * Pages which have multiple discussions ongoing related to different pages: Articles for deletion/Log/2006 September 25.
 * Pages on policy and/or process: Articles for deletion.

Proposed

 * A project page is a permanent page for a discussion about a single page: to create an article, move an article, or delete an article. It is used in cases where information was previously editable as a section of a page (articles being proposed for deletion; votes on moving a page; etc.). A project page is normally closed after a certain time, either automatically or by an administrator.
 * A project log contains no content editable by a regular user. It consists of some introductory text (or a template), protected, and system-generated links (with limited text) to individual project pages.  Someone reading the project log for (say) articles proposed for deletion would click on a link to a given project page to see a detailed discussion and/or vote.
 * A policy page has no ongoing discussions that are editable by a user; there can be one or more links to project logs. The page is editable, of course, just as other policy pages are, but what is being edited is the policy, not an ongoing discussion about another page.

Task page
A task page is essentially a very smart dialog box, which displays information and accepts user input (some of which may be just checking a box). A task page is a temporary page; when the user is done ("submit"), the information on the task page is written to various places (typically a project page, an article's talk/discussion page, or a user talk page). Normally a set of automated processes (posting a template to a talk page, for example) is part of the process of using a task page.

Task pages are designed to (a) improve quality, because they structure things (for example, a user proposing a new article on a person would be shown any existing article whose title differed only because a middle initial was or was not included in the name; a user proposing a new article must include a certain number of or a user voting on a proposed deletion can't vote twice); (b) eliminate some user keying (for example, it's impossible to pretend to be someone else when voting on deleting an article, because the task page process handles adding a signature and date/time); and (c) help administrators (for example, closing out a discussion by using check boxes rather than directly editing a page).
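The middle-initial check mentioned above could be sketched roughly as follows. This is only an illustration: the function names and the rule "drop any single-letter word between the first and last name" are assumptions, not part of the proposal.

```python
import re


def normalize_person_title(title):
    """Return a comparison key that ignores case and middle initials.

    Assumption: a "middle initial" is any single letter, with or without
    a trailing period, that is neither the first nor the last word.
    """
    words = title.split()
    last = len(words) - 1
    kept = [
        word
        for index, word in enumerate(words)
        if not (0 < index < last and re.fullmatch(r"[A-Za-z]\.?", word))
    ]
    return " ".join(kept).casefold()


def find_near_duplicates(proposed_title, existing_titles):
    """List existing titles that differ from the proposal only by a middle initial or case."""
    key = normalize_person_title(proposed_title)
    return [
        title
        for title in existing_titles
        if title != proposed_title and normalize_person_title(title) == key
    ]
```

A task page for "Thomas Carper" would then show "Thomas R. Carper" to the user before the new article is created.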

Process-generated restricted section
A process-generated restricted section (PGRS) is a section of a page that mentions and points to a project page, or is generated by a task page, or is otherwise generated by an automated process rather than by a human editor. Such a section consists of two parts: a system-generated, non-editable part and a free-form part (comments allowed).

A PGRS is similar to a template, except that it cannot be removed by anyone other than an administrator and normally is not edited by administrators.

EXACTLY HOW SIMILAR NEEDS SOME THOUGHT: possibly - (a) it typically is found in the middle of a page; (b) it can be updated by the system - for example, when a project page is closed; (c) it looks like a normal section except for the background shading.

''Intent: primarily for placing warnings on user talk pages. ''

Current situation
As of mid-September 2006, about 15 pages per day were being placed on Requested moves, asking for administrators to do moves.

What causes moves to be needed?
A causal analysis of the reasons for this volume (representing X% of new articles added each day) is needed, but the causes include:

Mostly preventable

 * There is a change in the name of an organization; someone starts a new article rather than doing a move.
 * Articles started under two names (for example, Frank Clair Stadium at Lansdowne Park rather than just Frank Clair Stadium).
 * Naming convention issues (for example, whether a person should have a middle initial, as Thomas R. Carper, moved from Thomas Carper).

Sometimes preventable

 * Errors or confusion in moving pages, or trying to correct errors in moving pages.

Probably not preventable

 * Disagreement about best name (sometimes POV/NPOV issue).
 * Disagreement about disambiguation pages

Reducing the number of needed moves
A move where the target article already exists (other than a redirect) is generally a defect; where there are a lot of defects with a similar cause, changes in procedures should be considered.

Improving the process for both creating and naming new articles would significantly reduce the number of moves that are needed, and the amount of administrator time needed to deal with these.

Proposed new procedure
There are four goals of the new procedure:
 * More automated (reduces errors)
 * Mandatory review by others before move occurs (reduces errors)
 * Speedier moves in cases where administrator assistance is needed
 * Less work for administrators where administrator assistance is needed

Initial nomination
A task page appears when the "move" tab is selected, which asks for the target page. The system checks for possible duplicates (e.g., change in capitalization if more than one word; possibly a Google search or weighted search); the user is asked whether he/she still wants to do the move (yes; no; yes, but change the target).

The system checks if the target page exists, and, if so, if it is only a redirect. If the target page exists and is not merely a redirect, the user is asked if he/she wants to propose a merge instead. (If yes, see "Merges" section.)


 * If not, the system compares create dates, number of edits to the articles and discussion pages, number of links, and character counts, and displays this information. A cancel/continue option is then displayed for the user.


 * If the user selects "continue", and two or more of these four characteristics indicate the target article is more "durable" than the article to be moved, the system asks the user again if he/she really wants to continue to propose the move; the user can cancel out, propose a merge, or continue.


 * If the user wants to continue, he/she must enter an explanation of why a merge is inappropriate. (If the bulk of article history and discussion is on the page to be moved, the user can just check a couple of boxes.)


 * The user then adds a brief comment that (if the move is successful) will go into the history log.


 * The system posts a template on the article page and talk page of the page proposed for a move; creates a project page; and posts a section on the talk page that is a mirror of the project page.


 * If a target page exists, the system posts similar templates on the target page and its talk page, and a similar mirror to the same project page. If the target page is only a redirect, changes to it and its talk page are blocked (to preserve it for a move).  If the target page does not exist, then the system records in a log (hidden history) that a move has been proposed and that the page should not be created other than by a move. (Blocks should expire in 7 days, as a precaution against system problems.)
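The "durability" comparison in the steps above could be sketched as follows. The proposal does not say which direction counts as more durable for each characteristic; this sketch assumes an earlier create date, more edits, more links, and more characters each count in the target's favor. `PageStats` and the function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PageStats:
    created: date     # create date of the article
    edit_count: int   # edits to the article and its discussion page
    link_count: int   # number of links
    char_count: int   # character count


def target_advantages(source, target):
    """Count how many of the four characteristics favor the target page."""
    checks = [
        target.created < source.created,        # assumed: older is more durable
        target.edit_count > source.edit_count,
        target.link_count > source.link_count,
        target.char_count > source.char_count,
    ]
    return sum(checks)


def needs_second_confirmation(source, target):
    """True when two or more characteristics favor the target, as in the text."""
    return target_advantages(source, target) >= 2
```

When `needs_second_confirmation` returns true, the task page would ask the user again whether to cancel, propose a merge, or continue.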

Review by others
The system identifies the most recent 20 editors of the article proposed for a move, its talk page, the target page, and its talk page. If, for either article/talk page pair, the most recent 20 include fewer than 5 of its editors, the system adds additional editors from that pair for notification until either 5 is reached or all of its editors have been included. The system also adds to the notification list ALL those who have the article on their watchlist.
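One possible reading of this notification rule, as a sketch. It assumes edits arrive as `(timestamp, username)` tuples, that "most recent 20" means the 20 most recent distinct editors across all four pages, and that the minimum of 5 applies to each article/talk page pair separately. All names and defaults here are illustrative.

```python
def distinct_newest_first(edits):
    """Return distinct usernames ordered by their most recent edit."""
    seen, ordered = set(), []
    for _, user in sorted(edits, key=lambda edit: edit[0], reverse=True):
        if user not in seen:
            seen.add(user)
            ordered.append(user)
    return ordered


def build_notification_list(source_edits, target_edits, watchers,
                            recent_limit=20, min_per_pair=5):
    """Build the list of users to notify about a proposed move.

    source_edits / target_edits: edits to an article plus its talk page.
    """
    notified = distinct_newest_first(source_edits + target_edits)[:recent_limit]

    # Top up each pair until it has min_per_pair editors or runs out of editors.
    for pair_edits in (source_edits, target_edits):
        pair_editors = distinct_newest_first(pair_edits)
        covered = sum(1 for user in pair_editors if user in notified)
        for user in pair_editors:
            if covered >= min_per_pair:
                break
            if user not in notified:
                notified.append(user)
                covered += 1

    # Everyone watching the article is always notified.
    for user in watchers:
        if user not in notified:
            notified.append(user)
    return notified
```

With 25 recent editors on the article to be moved and only 3 older editors on the target, the 20 most recent editors are all from the first pair, so the 3 target editors are added as well.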

The system posts a notice (NOT a PGRS) on the talk/discussion pages of those users, providing a link to the project page and notifying the users that the move could take place within 24 to 72 hours, and that how fast the move takes place, if at all, depends on how many users respond, and how quickly. (24 hours would be put into the notice if no target page exists; 72 hours would be used where administrator action is needed.)

A user who follows the link, from either his/her talk page or the talk page of an article, to the project page can select "vote", which generates a task page to record the user's vote.


 * This is a standard "voting task page": (a) Choices: Support, oppose, comment without supporting or opposing, comment on another user's edit, or change a prior edit by the same user; and (b) text. Users are notified if another user comments on their vote or comment.

The system uses heuristics to decide when to close the project page if no administrator assistance is required for the move. For example, if no target article exists, the move would occur automatically at the 24-hour point if 5 or more users have supported the move by that time and none have opposed it.

The project page is placed in an administrator queue based on similar heuristics (the project page remains open), when administrator help is needed to make the move. It also goes into the queue when a proposed move gets at least 75% support but does not meet the criteria for an automatic move (for example, 3 or more "oppose" votes, with 80% support overall).
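The closing heuristics above could be sketched as a small decision function. The thresholds (24 hours, 5 supporters, no opposition, 75% support) come from the text; applying the same 24-hour and 5-supporter floor before a page enters the administrator queue is an assumption, as are all the names.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    AUTO_MOVE = "auto_move"
    ADMIN_QUEUE = "admin_queue"
    KEEP_OPEN = "keep_open"


@dataclass
class MoveProposal:
    support: int
    oppose: int
    hours_open: float
    needs_admin: bool  # True when the move requires administrator help


def decide(proposal):
    """Apply the example heuristics to one proposed move."""
    # Assumed floor: nothing happens before 24 hours and 5 supporters.
    if proposal.hours_open < 24 or proposal.support < 5:
        return Outcome.KEEP_OPEN

    # Unopposed and no administrator needed: the move happens automatically.
    if proposal.oppose == 0 and not proposal.needs_admin:
        return Outcome.AUTO_MOVE

    # Otherwise at least 75% support sends the page to the administrator queue.
    ratio = proposal.support / (proposal.support + proposal.oppose)
    if ratio >= 0.75:
        return Outcome.ADMIN_QUEUE
    return Outcome.KEEP_OPEN
```

For instance, 12 "support" and 3 "oppose" votes (80% support) would not qualify for an automatic move but would place the project page in the administrator queue.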

Closure
When a proposed move is closed, the system updates the notices on user talk pages if these have been left as is; if a notice has been edited or deleted, then no update is posted.

Different types of editing
Edits consist of one or more of the following:
 * Reverts in full
 * Copyediting: minor adds and deletes of words; revising words; no new sources - purpose is improved accuracy and clarity of existing information
 * Reorganization: moving text between sections; moving sections around
 * Adding information - new text with (if quality is to be improved) mandatory sourcing
 * Removing information - removal of information other than as part of a copyedit normally should be done because of specific reasons: unsourced negative information, non-notable or uninteresting information; duplicate information; information contained in a main article; etc.
 * Adding an external link or footnote/reference
 * Adding a category

Anonymous editing (IP address only)
Anonymous editing should continue to be allowed, but might be limited to copyediting and additions of information WITHIN a single section. (A user could change more than one section by making separate edits, one per section.)

Anonymous editing should be required to be approved by a non-anonymous editor before posting to an article. The suggested approach:


 * When an anonymous editor completes an edit,

TO BE CONTINUED <<<

Advantages of treating different types of edits differently

 * Disclosing what one is doing (or claims to be doing) aids other editors in distinguishing vandalism or ill-intended edits.


 * If the editor indicates a type of edit, the system can evaluate whether that type of edit was in fact done (for example, saying "copyedit" while deleting all content can be stopped by the system).


 * The system can prompt users who are adding information or links to provide the full information, not just a URL, via a task page.


 * A revert should be done only for a specific reason, which could well be from a pull-down menu. Reverting another editor's changes might be done for a reason that should be posted to the user's talk page, or otherwise "counted against" that user.
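As a rough illustration of how the system might evaluate whether a declared type of edit matches what was actually done, the sketch below compares the old and new text with Python's standard `difflib`. The 0.8 similarity threshold and the function name are assumptions chosen only for the example; a real check would need tuning and would cover the other edit types too.

```python
import difflib

# Hypothetical threshold: a copyedit should leave most of the text intact.
COPYEDIT_MIN_SIMILARITY = 0.8


def edit_matches_claim(claimed_type, old_text, new_text):
    """Return False when an edit declared as a copyedit changes too much.

    Only the "copyedit" claim is checked in this sketch; every other
    declared type is accepted as is.
    """
    if claimed_type != "copyedit":
        return True
    similarity = difflib.SequenceMatcher(None, old_text, new_text).ratio()
    return similarity >= COPYEDIT_MIN_SIMILARITY
```

An edit labeled "copyedit" that deletes all content has a similarity of zero and would be stopped, while changing a single word passes.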

Building quality into the edit and talk/discussion page process
Edits to talk/discussion pages should ONLY be allowed via a task page. That would immediately solve a number of problems: new sections at the top rather than bottom; failing to add a signature/time; faking a signature/time (vandalism); deleting comments of others; editing comments of others; poor/nonexistent indenting; and even (to some extent) incivility (by running text through a "civility checker", and asking the user if he/she REALLY wants to post exactly what he/she just typed).

Link rot
When a link is added to a page, the linked page goes into archive.org. If it cannot be added, the user gets a warning notice, which can be ignored, but the link is flagged by the system. (Note that Wikipedia currently does block some links from being added, to reduce linkspam.)

If link goes bad
Editors can flag a link as "bad". This automatically results in (a) a link to archive.org being added next to the link reported as bad; (b) testing of the link by Wikipedia to confirm (page comparison).
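A minimal sketch of these two steps, under stated assumptions: the archive link uses the Wayback Machine's `web.archive.org/web/<timestamp>/<url>` address form with a date-only timestamp, and the "page comparison" is a plain text-similarity test with an arbitrary 0.5 threshold. Fetching the live page is left to the caller, and the function names are hypothetical.

```python
import difflib
from datetime import date

# Hypothetical threshold below which the live page no longer matches the archived copy.
MIN_SIMILARITY = 0.5


def wayback_url(url, added_on):
    """Build an archive.org address for the copy nearest the date the link was added."""
    return f"https://web.archive.org/web/{added_on:%Y%m%d}/{url}"


def link_confirmed_bad(archived_text, live_text):
    """Compare the archived copy with the live page (None means the fetch failed)."""
    if live_text is None:
        return True
    similarity = difflib.SequenceMatcher(None, archived_text, live_text).ratio()
    return similarity < MIN_SIMILARITY


def handle_bad_link_report(url, added_on, archived_text, live_text):
    """Return what the system would attach to a link that an editor flagged as bad."""
    return {
        "archive_link": wayback_url(url, added_on),
        "confirmed_bad": link_confirmed_bad(archived_text, live_text),
    }
```

The archive link is added whether or not the test confirms the report, which matches the order of steps (a) and (b) in the text.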

If owner is unwilling to let archive.org have a copy
If the link is from a source that either initially or subsequently requests archive.org to not store its contents, the copy of the page is given to Wikipedia to be stored with very limited (sequestered) access (definitely not publicly viewable). The concept is that the page is in a "filing cabinet", not in the public domain, and that the "filing cabinet" approach still is fair use. (There should obviously be an appeals process for this, in case of privacy violations, for example.)

There are several ways that sequestered pages (sources) could be handled, in cases of questions from editors. One would be to allow the page to be viewed only by an administrator, who would have the responsibility of changing the text in the article if necessary. A second would be to provide the editor with a copy, similar to what is done with deleted pages (an image, not HTML?). A third would be to have a separate group of editors (librarians?) who would have viewing privileges and subsequent editing responsibilities.

Ideally, the text that is supported by a sequestered link should be protected from subsequent editing after being edited by an administrator/librarian; this would stop repeated objections to valid but controversial text.

Echo chamber
Is it true that Neil Bush "was restricted from undertaking future banking activity"? When I looked in September 2006, I found numerous articles with that phrase, but they all pointed back to the Wikipedia article. In the end, I removed it for lack of a source.

Unsourced articles
As of September 12, 2006, the English version of Wikipedia had over 230,000 articles without any sources whatsoever, almost 20 percent of the total number of articles.