Wikipedia:Bots/Requests for approval/Autotitlechange


 * The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Denied.

Autotitlechange
Operator: Autotitlechange (talk)

Automatic or Manually Assisted: ''Semi-supervised. Batch files are created manually and executed automatically.''

Programming Language(s): ''"mvs" to upload and download pages, Perl for text processing, and C++ for the implementation of the layout regularization algorithm.''

Function Summary: ''Regularize section titles for biography pages.''

Edit period(s): ''One-time write run; daily read-only checks for updates.''

Edit rate requested: 1k edits in total

Already has a bot flag (N):

Function Details: ''Titles and subtitles are extracted from biography pages. A natural language processing algorithm is run to find common structures in biography entries. Extracted structures are applied to pages in order to match their layout. Only section subtitles are changed for that purpose.''
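
The first step described above, extracting titles and subtitles from a page, can be sketched as follows. This is only an illustration and not the operators' Perl/C++ code: the regular expression, the function name `extract_headings`, and the sample page are all assumptions made for this example.

```python
import re

# Assumed pattern for wikitext headings such as "== Early life ==" or
# "=== Awards ===": the same run of 2-6 equals signs on both sides.
HEADING_RE = re.compile(r"^(={2,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def extract_headings(wikitext):
    """Return (level, title) pairs for each section heading, in page order."""
    return [(len(m.group(1)), m.group(2)) for m in HEADING_RE.finditer(wikitext)]


# Hypothetical biography page used only for illustration.
sample = """'''Jane Doe''' is a hypothetical writer.

== Early life ==
Text.

=== College years ===
Text.

== Bibliography ==
* A book

== References ==
"""

for level, title in extract_headings(sample):
    print(level, title)
```

Reading the headings from the current revision of a page, rather than from an older dump, keeps the extracted titles in step with what editors actually see.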

Discussion
This project is part of both a course project and a research project at MIT. The course project is by Alvin Raj, Jenny Yuan, Serdar, and Erdong Chen for course 6.830, Database Systems, at MIT. The research project on language generation is under Erdong Chen.

We think this algorithm is useful for enforcing a layout format for biography pages on Wikipedia. We aim to test it on a small subset of pages, gather feedback from the community, and improve it. Any feedback and opinions are welcome.
 * Welcome to Wikipedia! Before we proceed, we'd like you to create a userpage for your bot, to describe to other users what your bot does. User:AntiVandalBot is a decent, if a little verbose, example, however all you really need is the text bot.
 * As for the request itself, I'd like to see a dry run before I approve any live edits - this request is a little more advanced than some we get, and I'm not sure what to expect - can you run it without making any edits, and post the changes it would have made to, say, 10-20 pages, as an example? Thanks, --uǝʌǝsʎʇɹnoɟʇs (st47) 02:41, 6 December 2007 (UTC)
 * Who is the actual operator? It states on the page that the operator is the bot. < DREAMAFTER > < TALK > 03:09, 6 December 2007 (UTC)
 * It's the account User:Serdarbalci, which seems to be the student "Serdar" named above. --uǝʌǝsʎʇɹnoɟʇs (st47) 03:15, 6 December 2007 (UTC)
 * Ah, ok. < DREAMAFTER > < TALK > 03:21, 6 December 2007 (UTC)


 * Thanks for the comments. We uploaded some sample changes to the Autotitlechange page. They show how similar sections are mapped into broader, more general categories. We are eager to test this algorithm on a small subset of pages, get user feedback, improve the algorithm, and contribute to the Wikipedia community. Serdar Balci is going to monitor the changes to Wikipedia for our group. Serdar Balci 12:59, 6 December 2007 EST.
 * Excellent. Please make 50 edits. --uǝʌǝsʎʇɹnoɟʇs (st47) 11:28, 6 December 2007 (UTC)
 * Thanks! Serdar Balci
 * Does anyone notice that the operator has very few edits? They also do not seem to be familiar with Wikipedia policy, as shown above. Soxred93 has a boring sig 23:18, 6 December 2007 (UTC)
 * We ran the bot on 50 sample pages and provided links to the changes at Autotitlechange. I would be happy to hear any feedback. We will be monitoring the persistence of the changes. We made the changes mainly to infrequently used pages, but are willing to test the algorithm on more popular pages. I am new to the Wikipedia community and indeed have few edits. Sorry for my incorrect update; I'll check the guideline page for edits. Serdar Balci 12:24, 7 December 2007 EST.


 * Preliminary tests don't look good. On Dan Abnett, the algorithm wants to change "Bibliography" to "References"? There's already a references section, so the Bibliography is likely something else. A "Bibliography" section on Wikipedia can be used for either the idea of "further reading" about the subject not used as references, or "works published" by the subject of the article. Same idea for Michael Flood and "Recent publications". These should not be changed to "References". The suggestion for Troy Evans suggests you're working from an archive rather than reading the active page, and a pretty old archive at that. The title "College years" was changed months ago. On David Ellett, I don't see why "Career statistics" should be changed to "Career". I'm not really seeing what this is supposed to be doing, so I don't know what to suggest to improve the language algorithm. Gimmetrow 23:25, 6 December 2007 (UTC)
 * You seem to have 2 accounts, that have very few edits on them. You even seem to not be familiar with Wikipedia policy, let alone bot policy. You have 1 edit to the mainspace, which was removed in the next edit. Please come back when you have a little more history. Soxred93 has a boring sig 02:00, 7 December 2007 (UTC)
 * Thanks for the valuable comments. We are trying to achieve the rather difficult objective of a uniform layout across different pages, but it seems that we have to improve the algorithm. The algorithm considers each section independently, so it fails in some cases, such as changing "Bibliography" to "References" when a "References" section already exists. It seems that acceptable results for the logical ordering and naming of sections are too difficult for an algorithm to achieve. A more acceptable and perhaps more fruitful direction would be to concentrate on low-level variation in layouts, e.g. "Reference" to "References", where the formats and styles of the sections are matched. Again, we are thankful for your valuable comments. We will be working to improve this algorithm. Serdar Balci 06:04, 7 December 2007 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.
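
The failure raised in the discussion, renaming "Bibliography" to "References" on a page that already has a "References" section, comes from treating each section independently. A minimal sketch of a more conservative approach, assuming a hypothetical mapping `CANONICAL` and a hypothetical function `normalize_headings` (neither is the operators' code), looks at all headings on the page first and skips any rename that would duplicate an existing title:

```python
import re

# Assumed pattern for wikitext headings ("== Title ==", "=== Title ===", ...).
HEADING_RE = re.compile(r"^(={2,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)

# Hypothetical low-level spelling variants mapped to a preferred title.
CANONICAL = {"reference": "References", "external link": "External links"}


def normalize_headings(wikitext, mapping=CANONICAL):
    """Rename variant headings, but never create a duplicate section title."""
    # Collect every title on the page before deciding on any rename.
    existing = {m.group(2).casefold() for m in HEADING_RE.finditer(wikitext)}

    def rename(match):
        marks, title = match.group(1), match.group(2)
        target = mapping.get(title.casefold())
        # Leave the heading alone if it is unmapped or the target already exists.
        if target is None or target.casefold() in existing:
            return match.group(0)
        existing.add(target.casefold())
        return f"{marks} {target} {marks}"

    return HEADING_RE.sub(rename, wikitext)


print(normalize_headings("== Reference ==\nText\n"))
```

With this guard, a page containing both "Bibliography" and "References" is returned unchanged even if the mapping proposes "Bibliography" to "References". Whether a given rename is appropriate at all remains an editorial judgement that the sketch does not attempt to make.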