User:Skagedal/Fafafa

This is a Python program I made that creates RSS feeds of the latest featured articles, pictures of the day, selected anniversaries and Wikiquote's quote of the day. It runs every night at 02:05 UTC, and generates feeds at:


 * http://toolserver.org/~skagedal/feeds/fa.xml (featured articles; validate feed)
 * http://toolserver.org/~skagedal/feeds/potd.xml (picture of the day; validate feed)
 * http://toolserver.org/~skagedal/feeds/sa.xml (selected anniversaries; validate feed)
 * http://toolserver.org/~skagedal/feeds/qotd.xml (quote of the day; validate feed)

For the code, see the GitHub repository. Patches and all kinds of feedback are of course welcome!

The program was inspired by User:Dze27's featured article feed, but shows the latest 20 articles, not just today's article.
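A minimal sketch of how "the latest 20 articles" could be turned into a list of days to fetch, assuming one featured article per calendar day. The function name `latest_days` is made up for this illustration and is not taken from the Fafafa code.

```python
from datetime import date, timedelta


def latest_days(today, count=20):
    """Return `count` consecutive days ending at `today`, newest first."""
    return [today - timedelta(days=offset) for offset in range(count)]


days = latest_days(date(2008, 12, 12))
print(days[0], days[-1])  # newest and oldest day in the window
```

Each of these days would then get its own feed item, newest first.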

News

 * 2008-12-12:
   * version now at 0.9.2
   * the feeds now validate as correct RSS
   * the SA feed should now work better, as each item has a unique GUID
   * new feed: Wikiquote's "quote of the day"; also started work on Wiktionary's "word of the day"
   * import into Subversion repository
   * fix bug: add hostname to links to "/w/index.php"
 * 0.9.1: (2008-12-10)
   * remove "surroundings" on POTD pages; now only the picture and description are in the feed (and a little link to the archive)
 * 0.9: (2008-12-10)
   * now runs on the Wikimedia Toolserver. The feeds at the old place point to this with multiple redirect mechanisms (HTTP 301 redirect, RSS "redirect" tag, and a regular item if neither of the previous works), so I hope this won't cause much trouble. (Do readers actually update to the new URL, or do they keep accessing the old one just to get the redirect...?)
   * remove HTML comments to reduce XML size
   * remove "footers" on FA feed (from "Recently featured:" on) and SA feed (from "More events:" on)
   * on SA feed, remove the "documentation" div
 * 0.8.3: fixed POTD, now working again (2008-09-01)
 * 0.8.2: correct URLs for wikilinks in articles (2008-08-16)
 * 0.8.1: updated for reorganization of POTD (2007-01-11)
 * 0.8: added support for selected anniversaries (could be made prettier...)
 * 0.7: added support for picture of the day
 * 0.6: initial version

To do

 * Use the render action (action=render on index.php), which returns only the rendered page content, to minimize traffic
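A rough sketch of what a render-action request URL could look like. MediaWiki's index.php accepts `action=render` and returns only the rendered page body. The host in `INDEX_PHP` and the helper name `render_url` are assumptions for this example; the page above only mentions the "/w/index.php" path.

```python
from urllib.parse import urlencode

# Assumption: the English Wikipedia endpoint; only "/w/index.php" is given above.
INDEX_PHP = "https://en.wikipedia.org/w/index.php"


def render_url(title):
    """Build a URL that asks MediaWiki for the rendered page body only."""
    query = urlencode({"title": title, "action": "render"})
    return f"{INDEX_PHP}?{query}"


print(render_url("Template:POTD/2008-08-30"))
```

Fetching this URL instead of the full page view would skip the site skin and navigation, so less HTML needs to be downloaded and stripped.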

Assumptions
Since Fafafa generates feeds from Wikipedia HTML, it relies on the content being laid out in a specific way. If these assumptions no longer hold, the script will fail. When possible, it will fail "gracefully", so that the content is still there but might not be pretty. The following are a few of the things Fafafa assumes:
 * URL:
   * "Featured article" for a specific day, say April 2, 2006, can be found at Today's featured article/April 2, 2006
   * "Selected anniversaries" for a specific day, say April 2, can be found at Selected anniversaries/April 2
   * "Picture of the Day" for a specific day, say August 30, 2008, can be found at Template:POTD/2008-08-30
 * Content: The content of the article is assumed to sit between a start marker and an end marker in the page HTML.
 * Title: The following works right now, but will probably have to be adjusted...
   * The text inside the first bolded a-tag is the title.
   * If that can't be found, the first bolded text is the title (e.g. March 21, 2006 POTD).
   * If that can't be found, the first a-tag is the title (e.g. March 15, 2006 POTD).
 * For further assumptions, look at the /Code subpage and search for "ASSUMPTION:"
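The URL and title assumptions above can be sketched roughly like this. This is not the actual Fafafa code (see /Code for that). The names `MONTHS`, `long_date`, `page_names` and `guess_title` are made up for the example, the regular expressions are only one guess at how the three title rules might be checked, and "bolded a-tag" is read here as a link placed directly inside a b-tag. Page names are written as they appear in the list, without any namespace prefix.

```python
import re
from datetime import date

# Hard-coded so the page names do not depend on the system locale.
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def long_date(day):
    """Format a date as e.g. 'April 2, 2006' (no zero padding)."""
    return f"{MONTHS[day.month - 1]} {day.day}, {day.year}"


def page_names(day):
    """Page names for one day, following the URL assumptions listed above."""
    return {
        "fa": f"Today's featured article/{long_date(day)}",
        "sa": f"Selected anniversaries/{MONTHS[day.month - 1]} {day.day}",
        "potd": f"Template:POTD/{day.isoformat()}",
    }


# Hypothetical patterns for the three title rules, tried in order.
TITLE_PATTERNS = [
    # 1. text inside the first a-tag that directly follows an opening b-tag
    re.compile(r"<b\b[^>]*>\s*<a\b[^>]*>(.*?)</a>", re.S | re.I),
    # 2. the first bolded text
    re.compile(r"<b\b[^>]*>(.*?)</b>", re.S | re.I),
    # 3. the first a-tag
    re.compile(r"<a\b[^>]*>(.*?)</a>", re.S | re.I),
]


def guess_title(html):
    """Return the first title candidate found, or None if no rule matches."""
    for pattern in TITLE_PATTERNS:
        match = pattern.search(html)
        if match:
            # Drop any nested tags and surrounding whitespace.
            return re.sub(r"<[^>]+>", "", match.group(1)).strip()
    return None


print(page_names(date(2006, 4, 2))["fa"])
print(guess_title('<p><b>Some picture</b> by <a href="/wiki/X">X</a></p>'))
```

Returning None when no rule matches leaves it to the caller to pick a fallback title, which fits the idea of failing "gracefully" rather than dropping the item.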