Wikipedia:Reference desk/Archives/Computing/Early/ParseMediaWikiDump

Parse::MediaWikiDump is a Perl module created by Triddle that makes it easy to access the information in a MediaWiki dump file. Its successor, MediaWiki::DumpFile, was written by the same author and is also available on CPAN.

Download
The latest versions of Parse::MediaWikiDump and MediaWiki::DumpFile are available at https://metacpan.org/pod/Parse::MediaWikiDump and https://metacpan.org/pod/MediaWiki::DumpFile respectively.

Find double redirects in the main name space
This program does not follow the proper case-sensitivity rules for matching article titles; the documentation that comes with the module includes a much more complete version of this program.
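A minimal sketch of a program for this task is given here. It assumes the module's documented Parse::MediaWikiDump::Pages interface (new, next, and the page methods namespace, title and redirect). The helper name find_double_redirects is hypothetical and not part of the module. Titles are compared as plain case-sensitive strings, so the caveat about case sensitivity applies.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): given a hash ref that maps
# each redirect title to its target title, return the sorted list of titles
# whose target is itself a redirect.
sub find_double_redirects {
    my ($redirs) = @_;
    my @double = sort grep { exists $redirs->{ $redirs->{$_} } } keys %$redirs;
    return @double;
}

# Load the module and read a dump only when a file name is given.
if (@ARGV) {
    require Parse::MediaWikiDump;

    my $pages = Parse::MediaWikiDump::Pages->new($ARGV[0]);
    my %redirs;

    while (defined(my $page = $pages->next)) {
        next unless $page->namespace eq '';     # main namespace only
        next unless defined $page->redirect;    # keep redirect pages only
        $redirs{ $page->title } = $page->redirect;
    }

    print "$_\n" for find_double_redirects(\%redirs);
}
```

Run it with the path to a dump of the current pages as the only argument. Collecting all redirects into a hash first keeps the check to a single pass over the dump, at the cost of holding every redirect title in memory.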

Extract articles linked to important Wikis but not to a specific one
The script checks whether an article contains interwiki links to :de, :es, :it, :ja and :nl but not to :fr. This is useful for finding "popular" articles that still need to be linked to a specific wiki. It may also hint at which articles should be given priority for translation.
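A minimal sketch of such a check is given here. It assumes the same Parse::MediaWikiDump::Pages interface, plus the documented behaviour that the page method text returns a reference to the article text. The helpers has_interwiki and is_candidate are hypothetical names. The regular expression is a simplification that only recognises links written as [[xx:Title]].

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @required = qw(de es it ja nl);   # wikis the article must already link to
my $missing  = 'fr';                 # wiki the article must not link to yet

# Hypothetical helper: true if $text contains an interlanguage link [[xx:...]].
# A leading colon, as in [[:fr:Title]], makes an inline link and is not matched.
sub has_interwiki {
    my ($text, $lang) = @_;
    return $text =~ /\[\[\s*\Q$lang\E\s*:[^\]\[]+\]\]/i ? 1 : 0;
}

# Hypothetical helper: true if $text links to every required wiki
# but has no link to the missing one.
sub is_candidate {
    my ($text) = @_;
    for my $lang (@required) {
        return 0 unless has_interwiki($text, $lang);
    }
    return has_interwiki($text, $missing) ? 0 : 1;
}

# Load the module and read a dump only when a file name is given.
if (@ARGV) {
    require Parse::MediaWikiDump;

    my $pages = Parse::MediaWikiDump::Pages->new($ARGV[0]);

    while (defined(my $page = $pages->next)) {
        next unless $page->namespace eq '';    # main namespace only
        next if defined $page->redirect;       # skip redirect pages
        print $page->title, "\n" if is_candidate(${ $page->text });
    }
}
```

To target a different wiki, change @required and $missing at the top of the script.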

Related software

 * Wikipedia preprocessor (wikiprep.pl) is a Perl script that preprocesses raw XML dumps, builds link tables and category hierarchies, collects anchor text for each article, and so on.
 * WikiProject Interlanguage Links/Ideas from the Hebrew Wikipedia - a project in the Hebrew Wikipedia to add relevant interwiki (interlanguage) links to as many articles as possible. It uses Parse::MediaWikiDump to search for pages without such links. It is now being exported to other Wikipedias.