User:Tony1/International survey of overlinking

Work in progress: should be completed and transferred to project space within a few days; please feel free to join below.

Internal linking is one of the most important features of all of the Wikipedias (WPs). It binds each project into an interconnected whole and provides instant pathways to locations within the project that are likely to increase our readers' understanding of the topic at hand. This hyperlinking feature is an important advantage of an online encyclopedia.

As the project emerges from the exponential growth phase of its life cycle, its priorities are expected to change. The first eight years of the grand WP project placed little emphasis on the skill that is required for optimal wikilinking. Only recently, and very unevenly, has a sense begun to develop that while internal linking is a great strength, the decision of whether to link an item should involve weighing its 'costs' against its advantages. Excessive linking has three potential disadvantages: it can (i) dilute other links in the vicinity, (ii) reduce the ease of reading, and (iii) reduce the professional look of the text. When these costs are out of balance with the relevance and utility of a link to the readership (and the likelihood that the link will be clicked on), the result is overlinking. We believe that striking the right balance between overlinking and underlinking across all languages is now essential to retaining the project's leadership as the Internet's number-one information site.

This MetaProject aims to provide the information all WPs need, so each can take stock of its wikilinking practices and decide, for the benefit of its readers, whether and how it needs to act to get the balance right. This is a significant step in the history of Wikimedia, one that requires inter-wiki, cross-language collaboration on a basic matter of content and formatting. A guiding principle of ISOL (the International Survey of Overlinking) is that our readers' interests are the primary concern of all WPs, and should be given greater weight than the desires of individual editors and groups of editors.

Membership
This MetaProject is open to anyone who has a good-faith interest in optimising wikilinking for the readers of all WP editions through the avoidance of overlinking. We prefer members to have an account on at least one Wikimedia site, but IP contributions are acceptable. Members provide information to us about internal linking practices, rules and guidelines on their own and other WPs, and provide lines of communication between the projects on overlinking. Particularly valuable are users who have skills in the languages covered by the WP projects and who already contribute to those projects. People who speak only one language are also welcome.

Inter-wiki communication. The default language of this WikiProject is English. Because contributions by those whose native language is not English are critical to our goals, we will do our best to translate entries and threads that are in other languages, using human-improved computer translation, and where necessary to copy-edit contributions that are in English. If you feel you can express yourself in passable but faulty English, that would be preferable (we all write faulty English, native and non-native speakers alike)!

Google computer translator: the raw computer translations almost always need human editing.

List of members
We are delighted to accept bona fide members who wish to participate in this inter-wiki project. Please sign up below if you are interested in being involved.

Please provide the following information, in this order:
 * Sign with the four tildes;
 * State your languages - rating yourself for each as 5 (native), 4 (near native), 3 (reasonable facility), 2 (intermediate), or 1 (beginner)
 * State your level of activity on the Wikipedias
 * If you're willing to survey a particular Wikipedia, please say so.


 * 1) Tony   (talk)  10:06, 13 May 2009 (UTC) English 5, French 3, German 2. Highly active on en.WP. Willing to contribute to surveys of en.WP and others.
 * 2) Ohconfucius (talk) 15:07, 13 May 2009 (UTC) English 5, Chinese 5, French 4, Czech 2. Highly active on en.WP. Willing to translate and contribute to surveys of en.WP, fr.WP, zh.WP and others.
 * 3) --Goodmorningworld (talk) 16:21, 13 May 2009 (UTC) English 5, German 5, French 2. Mostly active on en.WP, some activity on de.WP. Willing to contribute to surveys of en.WP and de.WP.
 * 4) Laser brain   (talk)  22:43, 14 May 2009 (UTC); English 5, Italian 3, Spanish 2, German 1; Reasonably active on en.wikipedia; Willing to survey it.wikipedia and possibly es.wikipedia if no one better steps up.
 * 5) Bishonen | talk 20:36, 15 May 2009 (UTC). English 5, Swedish 5, Danish 3, Norwegian 3. Active on en.wiki. I'm willing to contribute to surveys and translations from Swedish, although I'm not active on sv.wiki.
 * 6) Waltham, The Duke of 23:24, 2 June 2009 (UTC). Greek 5, English 4, German 1. I am willing to contribute to surveys of el.wikipedia, although I am not active there.

Methodology
See also: Methodology subpage.
How the survey is conducted and how the data are expressed have not been finalised; both are subject to agreement on the talk page, where anyone is invited to provide feedback.

Thus far, the plan is that survey editors do four things for their chosen WP edition: (i) identify any guidelines, rules or even essays on wikilinking in the edition and provide links to them; (ii) review six "benchmark" articles in the edition; (iii) provide a summary statement about wikilinking in the edition; and (iv) provide responses to a set of standardised questions in the discussion sections.

Survey the benchmark articles

 * We invite survey editors to determine, for a representative part of each of the six articles, the percentage of (i) common "dictionary" items (e.g., in most contexts, "tourism", "United States", "music", "sport"), and (ii) chronological items (e.g., years, months, decades, and centuries) that are unnecessarily linked. The six articles are chosen on the basis of wide cross-cultural recognition; substantial article treatment over many WP editions; and dissimilarity of topic (sportsperson, nation, artist, etc).
 * We ask that survey editors analyse (i) the lead of the article, and (ii) one other, substantial and representative section below the lead (i.e., a section that generally appears to be linked about as much as the article as a whole).
 * Since it is essential that a single set of criteria be used throughout the survey (without this, a comparison is invalid), we suggest that the English WP's guidelines be used to identify overlinked items. In no way should this be regarded as a prescription for any edition other than en.WP: naturally, each edition has been and will continue to be free to determine its own guidelines and practices for wikilinking. The aim of the survey is simply to produce good data, comparative and absolute.
 * The survey should include the main text, titles and subtitles, table items [?], and image and figure captions.
 * The survey should exclude direct quotations; links to daughter articles that are set off at the start of sections; text within navboxes and infoboxes; and the sections at the bottom, including References and See also (there are separate questions about these aspects in the discussion sections below).
 * Reviews should be based on the article version at 23:59, 30 April 2009 (UTC). Survey editors are also asked to make brief summary observations about the WP edition in question in the summary section.
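The percentage calculation described in the list above can be sketched as follows. This is only an illustration under stated assumptions: it assumes the percentage is taken over all links counted in the surveyed part (the lead plus one representative section), and the category labels, the function name `overlink_percentages` and the sample tally are hypothetical, not part of the agreed methodology.

```python
from collections import Counter

# Hypothetical category labels; the survey text does not fix these names.
DICTIONARY = "dictionary"        # common "dictionary" items, e.g. "music"
CHRONOLOGICAL = "chronological"  # years, months, decades, centuries
OTHER = "other"                  # links not counted as overlinked


def overlink_percentages(link_categories):
    """Return the share (in percent) of surveyed links in each overlinked category.

    `link_categories` holds one label per link found in the surveyed part,
    after the exclusions (quotations, navboxes, infoboxes, etc.) are applied.
    """
    total = len(link_categories)
    if total == 0:
        # No links surveyed: report zero rather than divide by zero.
        return {DICTIONARY: 0.0, CHRONOLOGICAL: 0.0}
    counts = Counter(link_categories)
    return {
        DICTIONARY: 100.0 * counts[DICTIONARY] / total,
        CHRONOLOGICAL: 100.0 * counts[CHRONOLOGICAL] / total,
    }


# Made-up tally for one article: 3 dictionary links, 2 date links, 15 others.
sample = [DICTIONARY] * 3 + [CHRONOLOGICAL] * 2 + [OTHER] * 15
result = overlink_percentages(sample)
print(result)
```

Classifying each link still has to be done by hand against a single set of criteria; the sketch only covers the arithmetic once the tally exists.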

Write your summary statement
Insert this statement under the table row allocated for your edition. It should be succinct: if possible, fewer than 150 words. More can be said underneath in the discussion section for your edition.

Respond to the standardised questions
We have posed ?five key questions about overlinking in the discussion section for each edition. We ask that survey editors respond to these questions.

Ratings of the overall level of overlinking (OL)
Ratings are on a nine-point scale from 0 (the ultimate goal for all Wikipedias) to 8 (extreme overlinking), applied separately to dates and other words. [These need to be illustrated by examples.]


 * 0 (very low): very few instances of dates linked in body or reference sections; linking of relevant words on first occurrences only; no linking of commonly used terms
 * 1: most date links restricted to infoboxes; multiple repeated linking of a relevant term; no linking of commonly used terms
 * 2 (low): occasional/selective dates linked in infoboxes and body; multiple repeated linking of fewer than 5 relevant terms; no linking of commonly used terms
 * 3:
 * 4 (moderate): few links to mmdd or ddmm dates in body or reference sections, years mostly linked; multiple repeated linking of fewer than 10 relevant terms; some linking of commonly used terms
 * 5:
 * 6 (high): many instances of dates linked in infoboxes, body and reference sections; extensive multiple repeated linking of relevant terms; pervasive linking of commonly used terms
 * 7:
 * 8 (extreme): almost all instances of dates linked in body and reference sections; multiple linking of terms; linking of commonly used terms that appears to be indiscriminate
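One possible way to record a rating on this scale is sketched below. It assumes one integer from 0 to 8 for dates and another for other words, as the scale describes; the class name `OverlinkRating`, its field names and the example values are hypothetical, and the labels are only those the scale itself supplies (the odd points are unlabelled).

```python
from dataclasses import dataclass

# Labels taken from the scale above; points 1, 3, 5 and 7 carry no label there.
LABELS = {0: "very low", 2: "low", 4: "moderate", 6: "high", 8: "extreme"}


@dataclass(frozen=True)
class OverlinkRating:
    edition: str       # language code of the WP edition, e.g. "de"
    dates: int         # rating for date links, 0 to 8
    other_words: int   # rating for all other links, 0 to 8

    def __post_init__(self):
        # Reject anything outside the nine-point scale.
        for name in ("dates", "other_words"):
            value = getattr(self, name)
            if not isinstance(value, int) or not 0 <= value <= 8:
                raise ValueError(f"{name} must be an integer from 0 to 8, got {value!r}")

    def describe(self):
        def label(value):
            return LABELS.get(value, "unlabelled")
        return (f"{self.edition}: dates {self.dates} ({label(self.dates)}), "
                f"other words {self.other_words} ({label(self.other_words)})")


# Made-up values, purely to show the shape of a record.
rating = OverlinkRating("de", dates=4, other_words=3)
print(rating.describe())
```

Keeping the two ratings as separate fields mirrors the instruction that the scale is applied separately to dates and to other words.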

English
en: User:Tony1/International_survey_of_overlinking/Data

German
de:

French
fr:

Japanese
ja:

Polish
pl:

Italian
it:

Dutch
nl:

Chinese
zh:

Portuguese
pt:

Spanish
es:

Russian
ru:

Swedish
sv:

Catalan Wikipedia
ca:

Czech Wikipedia
cs:

Danish Wikipedia
da:

Esperanto Wikipedia
eo:

Finnish Wikipedia
fi:

Hungarian Wikipedia
hu:

Indonesian Wikipedia
id:

Norwegian Wikipedia
no:

Romanian Wikipedia
ro:

Slovak Wikipedia
sk:

Turkish Wikipedia
tr:

Ukrainian Wikipedia
uk:

Arabic Wikipedia
ar:

Bulgarian Wikipedia
bg:

Croatian Wikipedia
hr:

Estonian Wikipedia
et:

Haitian Creole Wikipedia
ht:

Hebrew Wikipedia
he:

Korean Wikipedia
ko:

Lithuanian Wikipedia
lt:

Persian Wikipedia
fa:

Simple English Wikipedia
simple:

Serbian Wikipedia
sr:

Slovene Wikipedia
sl:

Vietnamese Wikipedia
vi: