User:Singpolyma/git-mediawiki

Brainstorming for a Git <=> MediaWiki bridge (à la git-svn).

Code (in Ruby) at: http://github.com/singpolyma/git-mediawiki

More advanced approach (in Perl): https://github.com/Bibzball/Git-Mediawiki/wiki/

Related project (in Python): http://github.com/scy/levitation

Get revisions for a page
http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Main%20Page&rvprop=ids|flags|timestamp|user|comment|content&rvlimit=1&format=json

Set rvlimit to something bigger to get more than one revision per request. Wikipedia caps this at 50 revisions per request. If that's common, the initial clone will definitely have to do paginated requests. Gross.
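A minimal sketch of what the paginated requests could look like, in Python. It assumes the wiki returns a `continue` object when more results are available, as current MediaWiki releases do (older releases used `query-continue` instead). `iter_revisions` and the `fetch` callable are hypothetical names; `fetch` stands in for whatever performs the HTTP GET against api.php and decodes the JSON.

```python
def iter_revisions(fetch, title, limit=50):
    """Yield every revision of `title`, oldest first, following continuation.

    `fetch` is a hypothetical callable: it takes a dict of query parameters,
    performs the api.php request, and returns the decoded JSON response.
    """
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids|flags|timestamp|user|comment|content",
        "rvlimit": limit,
        "rvdir": "newer",  # oldest first, so commits can be created in order
        "format": "json",
    }
    while True:
        data = fetch(params)
        for page in data.get("query", {}).get("pages", {}).values():
            yield from page.get("revisions", [])
        # Assumption: the server signals "more to come" with a "continue" object
        if "continue" not in data:
            break
        # Merge the continuation values into the next request
        params = {**params, **data["continue"]}
```

Passing the HTTP layer in as a callable keeps the paging logic testable without a network connection.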

The revision ids should be added to the git commits (probably appended to the commit message à la git-svn). Then when doing a fetch, we can just get changes since the most recent revision we pulled (using rvstartid or rvstart, together with rvdir=newer).
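One way the revision id could be stored in and read back from a commit message, sketched in Python. The trailer name `mediawiki-revision` is made up for illustration (git-svn uses a `git-svn-id:` line); `with_revision` and `revision_of` are hypothetical helpers.

```python
import re

# Hypothetical trailer name, modeled loosely on git-svn's "git-svn-id:" line
TRAILER = "mediawiki-revision"

def with_revision(message, revid):
    """Append the wiki revision id to a commit message as a trailer line."""
    return f"{message.rstrip()}\n\n{TRAILER}: {revid}\n"

def revision_of(message):
    """Return the revision id recorded in a commit message, or None."""
    match = re.search(
        rf"^{re.escape(TRAILER)}: (\d+)\s*$", message, re.MULTILINE
    )
    return int(match.group(1)) if match else None
```

On fetch, the newest commit's message would be passed to `revision_of` to decide where to resume.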

Documentation on this API and more at http://www.mediawiki.org/wiki/API:Query_-_Properties#revisions_.2F_rv and related.

Clone/Fetch
Will need to get a list of all pages. Probably using http://www.mediawiki.org/wiki/API:Query_-_Lists#allpages_.2F_ap

Should be able to clone whole wiki, one namespace, or stuff rooted at one page (to check out one wikibook, for example).
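A rough Python sketch of how the three clone scopes might map onto `list=allpages` parameters. `allpages_params` is a hypothetical helper. It relies on `allpages` enumerating one namespace per request, so a whole-wiki clone would loop over the namespace ids.

```python
def allpages_params(namespace=0, prefix=None, limit=500):
    """Build query parameters for one list=allpages request.

    namespace: numeric namespace id (0 is the main namespace)
    prefix:    only list titles starting with this text, e.g. "Cookbook/"
    """
    params = {
        "action": "query",
        "list": "allpages",
        "apnamespace": namespace,
        "aplimit": limit,
        "format": "json",
    }
    if prefix is not None:
        params["apprefix"] = prefix
    return params
```

For "stuff rooted at one page", a prefix such as `"Cookbook/"` would list the subpages only; the root page itself would need to be fetched separately, since a bare `"Cookbook"` prefix would also match unrelated titles like "Cookbooks".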

There are no branches, so everything goes in master.

Each page becomes a text file. Pages with / in the title get put in directories.
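The title-to-path mapping could look roughly like this in Python. The `.mw` extension and the replacement of spaces with underscores are assumptions, not something decided in these notes; `title_to_path` is a hypothetical name.

```python
from pathlib import PurePosixPath

def title_to_path(title, extension=".mw"):
    """Map a page title to a repository-relative file path.

    Slashes in the title become directory separators. The extension is an
    assumption; it also keeps "Foo" (a file) from colliding with the
    directory needed for "Foo/Bar".
    """
    parts = [part.replace(" ", "_") for part in title.split("/") if part]
    # Refuse titles that would be empty or escape the working tree
    if not parts or any(part in (".", "..") for part in parts):
        raise ValueError(f"cannot map title to a path: {title!r}")
    parts[-1] += extension
    return str(PurePosixPath(*parts))
```

Note that this mapping is lossy (empty segments from doubled slashes are dropped, and spaces and underscores are conflated), so a real implementation would need a way to recover the exact title on push.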

Use an authors.txt similar to git-svn's to map MediaWiki users onto a name and email address. If a user is not found, fall back to the MW username with whitespace replaced by hyphens (s/\s+/-/), followed by @host-for-wiki.tld
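A small Python sketch of the author lookup with the fallback described above. It assumes the file uses the git-svn authors-file layout (`loginname = Full Name <email>`); `parse_authors` and `git_author` are hypothetical names, and the fallback keeps the wiki username as the display name, which is a guess.

```python
import re

def parse_authors(text):
    """Parse 'wiki user = Full Name <email>' lines into a dict."""
    authors = {}
    for line in text.splitlines():
        match = re.match(r"^(.*?)\s*=\s*(.*?)\s*<(.*)>$", line.strip())
        if match:
            user, name, email = match.groups()
            authors[user] = (name, email)
    return authors

def git_author(username, authors, wiki_host):
    """Return (name, email) for a wiki user, falling back to a synthetic email."""
    if username in authors:
        return authors[username]
    # Fallback: every run of whitespace in the username becomes a hyphen
    local_part = re.sub(r"\s+", "-", username)
    return (username, f"{local_part}@{wiki_host}")
```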

Push
No branches on remote. Push from any branch other than master should warn.

There are edit and auth APIs. This should be possible.

Each commit becomes a revision. The revision number gets amended into the local commit so we don't pull it back again as a duplicate later.
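Turning one commit into one revision might be sketched like this in Python. The parameter names are those of MediaWiki's `action=edit`; obtaining the token and sending the POST are left out, and `edit_params` and `new_revision_id` are hypothetical helpers. The response shape (`result` and `newrevid` under `edit`) is an assumption about what the wiki returns on success.

```python
def edit_params(title, text, summary, token):
    """Build the POST parameters for saving one page revision."""
    return {
        "action": "edit",
        "title": title,
        "text": text,        # full page content from the commit
        "summary": summary,  # the commit message becomes the edit summary
        "token": token,      # edit token fetched from the wiki beforehand
        "format": "json",
    }

def new_revision_id(response):
    """Extract the id of the revision created by an edit response."""
    edit = response.get("edit", {})
    if edit.get("result") != "Success":
        raise RuntimeError(f"edit was not accepted: {response!r}")
    # May be None, for example when the edit did not change the page
    return edit.get("newrevid")
```

The id returned by `new_revision_id` is what would get amended into the local commit.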

Misc

 * Have an option to clone as Creole?
 * Should write in sh as much as possible (for portability); Perl is acceptable (all git users have Perl, it's a dependency), as are small C utilities.
 * I don’t think sh is very portable: there are many inconsistent implementations and it’s not a nice fit for Windows. IMHO, it’s best to go with a high-level, cross-platform scripting language like Ruby or Python. I’m not sure about Ruby, but Python goes out of its way to make sure the code you write works on all platforms: you can let Python figure out the correct path separator (\ or /), for example. And you can easily bundle the runtime for Windows. --Betekenis (talk) 12:11, 9 February 2011 (UTC)
 * Since C gives you the additional hassle of compiling for different architectures, I would only recommend it if speed turns out to be a bottleneck for this project. Though, given that the utilities will spend most of their time idly waiting for HTTP requests and Git calls, this seems unlikely. --Betekenis (talk) 12:11, 9 February 2011 (UTC)
 * Make sure it runs on Windows too (as long as there is an sh.exe, perl.exe, and other utils present)
 * License it ISC or similar