User talk:Yurik/Interwiki Bot FAQ

ATTENTION
Most of the messages I get take this form:

'Your bot keeps adding interwiki xx:xxxxx to the article xx:xxxxx. I have removed it 10 times, but it still does it. They are not the same articles. Please make it stop!!!'.

Before you remove it again, or leave an angry message on my talk page, please understand why it happens:
 * Not sure if this is the right place to comment, but this is already in the "talk" namespace apparently, so here goes:
 * It's not my problem why it happens. It is inappropriate of you to suggest that the malfunction of the interwiki bots should be addressed by editors having to go around to an unlimited number of other wikis and correct their links.  The bots should not do this, period.  If they are reverted once, they must stop. --Trovatore (talk) 01:29, 11 February 2012 (UTC)

How does the bot operate?

 * The bot is given a single site (in this example, ru). It takes one page and looks at all of its interwiki links to other sites. It then follows the interwiki links on each of those pages. The process repeats until no site yields any new links. If there is no more than one page per site, the bot places links to all found pages on every page involved. As a result, all the pages become interlinked.
 * Example: ru:Wikipedia has links to en and fr, fr has links to zh, fr, and da, etc. As a result, the list will include pages from ru, en, fr, zh, da, and any others found. As long as each site has only one page, the bot will place links to all found pages on each of them.
 * Conflicts: If the bot finds more than one page on any of the sites, it stops and asks the operator for help. The operator has to analyze each page and choose the one that most accurately reflects the original topic. Once all conflicts are resolved, all pages are updated with the new information.
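The discovery and conflict steps described above can be sketched roughly as follows. This is only an illustration, not the bot's actual code: the `links` dictionary stands in for reading pages from the live sites, and the names `discover` and `conflicts` are made up for this example.

```python
from collections import deque

def discover(start, links):
    """Follow interwiki links from `start` until no new pages turn up.

    `links` is a hypothetical stand-in for the live sites: it maps a
    (site, title) page to the list of (site, title) pages it links to.
    Returns a dict mapping each site to the set of titles found there.
    """
    found = {}
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        site, title = page
        found.setdefault(site, set()).add(title)
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return found

def conflicts(found):
    """Sites with more than one candidate page; the operator must pick one."""
    return {site: titles for site, titles in found.items() if len(titles) > 1}

# Hypothetical link graph modelled on the ru/en/fr/zh/da example.
links = {
    ("ru", "Википедия"): [("en", "Wikipedia"), ("fr", "Wikipédia")],
    ("en", "Wikipedia"): [("ru", "Википедия")],
    ("fr", "Wikipédia"): [("zh", "维基百科"), ("da", "Wikipedia")],
}
found = discover(("ru", "Википедия"), links)
print(sorted(found))      # ['da', 'en', 'fr', 'ru', 'zh']
print(conflicts(found))   # {}
```

With one page per site there are no conflicts, so every page would receive links to all the others. If any site ended up with two titles, `conflicts` would report it and nothing would be written until the operator chose one.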

The bot does not know anything about the subject matter, nor does it care whether the articles are the same. If the bot placed a link, that link already exists somewhere else and was simply copied. Removing it on one page will not fix the problem: somewhere a human made the mistake of linking two unrelated articles, and the bot propagated that mistake to another site (see more details below). To fix it, you must manually remove all the bad links. If just one remains, the link will come back. I am still working on a web-based tool to make the removal easier, but it is not ready yet.

I would like the bot to run on language XX...
To let the bot run on a new language, you must first put a note at Requests for bot status. Once the flag is granted, I will add it to the list.

How to change many interwikies at once
See Interwiki Conflict Resolver tool - eventually it will be real time, but for now use it to tell me what needs to get done.
 * This tool is temporarily suspended. You can use it to view the links, but it will not change any pages. --Yurik 05:24, 13 February 2007 (UTC)
 * This tool has expired. --Bulwersator (talk) 10:10, 20 May 2011 (UTC)

Is there a dictionary bot to find new links?

 * No. The bot operates only on the links found on the given page, and uses them to discover more links.

What about dates, years, etc?

 * The bot knows about the different year and date formats used on different sites. Enter more formats here: User:Yurik/Formats. For example, February 25 on en is recognized as the 25th day of February, and is matched with the corresponding day on all other known sites, if they have it. No existing interwiki links are needed for this. At present the bot recognizes years AD/BC, decades AD/BC, centuries AD/BC, millennia AD/BC, and days of the month. It correctly handles Arabic and Roman numerals, and knows which sites have decided that the year 2000 is in the 21st century.
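As a rough illustration of the century handling mentioned above, here is a minimal sketch. It is not taken from the bot: the function names and the `year_2000_is_21st` flag are invented for this example, and the real per-site formats live in User:Yurik/Formats.

```python
def century_of(year, year_2000_is_21st=False):
    """Return the AD century number for a year.

    By the usual convention the 21st century starts in 2001; the flag
    models a site that has decided that 2000 already belongs to it.
    """
    if year_2000_is_21st:
        return year // 100 + 1
    return (year - 1) // 100 + 1

def to_roman(n):
    """Convert a positive integer to Roman numerals, e.g. for century titles."""
    numerals = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    out = []
    for value, symbol in numerals:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)

print(century_of(2000))                           # 20
print(century_of(2000, year_2000_is_21st=True))   # 21
print(to_roman(century_of(2001)))                 # XXI
```

A site that writes centuries in Roman numerals and one that writes them in Arabic numerals can then be matched on the same number, whatever the title looks like.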

The bot keeps adding back an incorrect link to site xx, what should I do?

 * This tool was designed to help users sort out these kinds of problems, but it is not yet complete. Use it to tell me how links should be resolved.


 * One or more of the sites found during discovery also point to site xx.
 * Any of the following solutions can be used to solve this problem:
 * Find or create the correct page on site xx, and edit just one of the other sites' pages to use the new link instead of the existing one.
 * The bot will see two links to site xx, and will ask the operator what to do.
 * or
 * Edit the page on xx to link to the proper existing page on the other sites, thus also causing a conflict.
 * or
 * Comment out the incorrect link; the bot will do the rest in all the other language versions. After that you can remove the link completely.
 * Example: en, ru, ja, and ko are all interconnected, but ko describes a different topic from the first three. Removing the link on just ru will not help, as all the other sites still point to it. To fix this, create or find a page on ko that matches the topic and edit just one site, such as en, to point to the new ko page. Alternatively, find the ko page's topic on en, ru, or ja and change the ko page to point to it.

The bot deleted a link, but I know it's there!

 * Links are case-sensitive; please make sure the link has the same case as the article.

Why is the bot replacing non-Latin characters with question marks or blanks?

 * It's not. Your computer does not have an appropriate font installed, so characters such as Chinese or Japanese will appear as question marks. The links still work and will take you to the proper page (you probably won't be able to read it, as most of its characters will also show as question marks). The bot makes this change to get rid of the unreadable HTML numeric character notation (for example, 國 used to be written as &#22283;). The ease of use should be self-evident.
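To show what replacing the numeric notation amounts to, here is a small sketch using Python's standard `html.unescape`. The bot's own conversion code may well differ, and the link text is only an example.

```python
import html

# An interwiki link written with a numeric character reference.
encoded = "[[zh:&#22283;]]"

# html.unescape turns character references into the characters themselves.
decoded = html.unescape(encoded)
print(decoded)  # [[zh:國]]
```

Both forms point to the same page; the decoded one is simply readable in the wikitext for anyone with a suitable font.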

Bot is adding empty links to other sites

 * See above.

Why should the bot change all sites at once?

 * To find all linked pages, the bot needs to check all linked sites (count N). Afterwards, the bot used to change just one page. Other sites were running their own bots, which also checked N sites and changed one. The total server load was N sites * N reads + N writes. Changing all sites at once brings the total server load down to N reads + N writes -- a very significant improvement.
 * Another reason is that when sites are kept in sync, if some site renames page A to AA, that change is immediately seen everywhere. If someone later decides that A should be a topic of its own, there will be no conflict, as no site is pointing to A, only to AA. This is a fairly common scenario I have had to resolve.
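The load comparison in the first point above can be worked through with a short calculation. The function names and the figure of 20 sites are made up for illustration.

```python
def load_separate_bots(n):
    """Each of n sites runs its own bot: n reads per bot, one write per bot."""
    reads = n * n
    writes = n
    return reads + writes

def load_single_pass(n):
    """One bot reads each of the n sites once and writes to each once."""
    reads = n
    writes = n
    return reads + writes

n = 20  # hypothetical number of linked sites
print(load_separate_bots(n))  # 420
print(load_single_pass(n))    # 40
```

The number of writes is the same either way; the saving comes entirely from reading each site once instead of N times.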

Disambiguation handling

 * When running in autonomous mode, the bot checks whether the page is a disambig, and makes sure that all the other pages it links to have the same status. This means that when page A has a disambiguation template, all linked pages must also have a disambiguation template; otherwise they will be ignored. The reverse is also true: a link from a regular page to a disambig page will also be ignored.
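The rule above boils down to keeping only the linked pages whose disambiguation status matches the starting page. A minimal sketch, with a hypothetical function name and made-up data:

```python
def filter_by_disambig_status(start_is_disambig, candidates):
    """Keep only the linked pages whose status matches the starting page.

    `candidates` is a list of (page, is_disambig) pairs; in the real bot the
    status would come from checking for a disambiguation template.
    """
    return [page for page, is_disambig in candidates
            if is_disambig == start_is_disambig]

candidates = [
    ("fr:Mercure", True),            # disambiguation page
    ("de:Merkur (Planet)", False),   # regular article
]
print(filter_by_disambig_status(False, candidates))  # ['de:Merkur (Planet)']
print(filter_by_disambig_status(True, candidates))   # ['fr:Mercure']
```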

The bot is hiding vandalism!
Please be aware that there is an option to hide bot edits from your watchlist and from recentchanges. Alternatively, choose 'expand view' for the watchlists and RC in your preferences. That way you'll be able to observe all human edits, even if a bot made an edit afterwards.

The bot replaces one link with another
Sometimes the bot will modify a link to a site by replacing it with another link to that same site. This may happen for one of two reasons:
 * 1) The target is a redirect, in which case the bot will link to the actual page rather than going through a redirect. Redirects are created automatically when a page is given a new name.
 * 2) The target is a disambiguation page, yet another linked page in another language has a link to a non-disambiguation page. A regular page is always chosen over a disambig.
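Both cases can be sketched in a few lines. This is an illustration under assumptions, not the bot's code: `redirects` and `disambigs` are hypothetical lookups standing in for what the bot learns from the target site.

```python
def resolve_redirect(title, redirects):
    """Follow redirects to the actual page, guarding against redirect loops."""
    seen = set()
    while title in redirects and title not in seen:
        seen.add(title)
        title = redirects[title]
    return title

def choose_target(candidates, disambigs):
    """Prefer a regular page over a disambiguation page on the same site."""
    regular = [t for t in candidates if t not in disambigs]
    return regular[0] if regular else candidates[0]

redirects = {"A": "AA"}   # page A was renamed to AA
disambigs = {"Mercury"}   # known disambiguation pages
print(resolve_redirect("A", redirects))                          # AA
print(choose_target(["Mercury", "Mercury (planet)"], disambigs)) # Mercury (planet)
```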

Some questions

 * Where is the bot physically hosted? Is the code open and free, and can I run it on my computer? Is there some kind of progress report available? And if this is not a good place to ask questions like these, what is? --Preceding unsigned comment added by 193.166.137.75 (talk) 10:40, 28 July 2009 (UTC)


 * How are inter-wiki links to a #section of an article handled? --Vicky Ng (talk) 14:54, 13 January 2011 (UTC)

interwiki Bot made problem
Hi, according to Bot_policy I run a bot in the template ns and it causes some problems in /doc sub-pages. Please take a look at User_talk:Reza1615. --Reza1615 (talk) 05:50, 23 October 2011 (UTC)

Two English pages: We Were Here and We Were Here (film)
There are two pages in English: We Were Here and We Were Here (film). In French there is only one, We Were Here, and it is about the film. Its interwiki should be en: We Were Here (film), with no interwiki to sv (that page is not about the film). --Almanach94 (talk) 16:35, 24 January 2012 (UTC)

Sicily
Hi Yurik, whom can I ask for an explanation of this wrong edit? The same problem, and it isn't a small one, happened here. Thanks. --Sal73x (talk) 20:54, 12 May 2012 (UTC)

Hello
Hi, I was granted admin rights on my local wiki, tw.wikipedia.org, not long ago, and I would be glad if you could help me create an interwiki bot to do tasks that are tedious to do manually on my local wiki. Thank you -- Robertjamal12   (talk)  13:34, 7 December 2021 (UTC)
 * You don't need an interwiki bot anymore; you can use Wikidata for that. --Yurik (talk) 15:30, 7 December 2021 (UTC)
 * Alright that's fine, Thanks -- Robertjamal12   (talk)  16:17, 7 December 2021 (UTC)