Wikipedia:Bots/Requests for approval/DatabaseBot


 * The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Request Expired.

DatabaseBot
Operator: Gsonnenf (talk) / DatabaseBot (talk) (I would like to manage this bot primarily through the DatabaseBot talk page, thanks ^_^.)

Automatic or Manually Assisted: Automatic:supervised

Programming Language(s): Java - JWBF framework

Function Overview:
 * This bot will build data templates from Infoboxes, by iterating a category, which will be used to create lists and similar collections.
 * Changes to these data templates will propagate to all elements that draw from them.
 * This bot will be operated primarily within the wp:c++ project which needs the cleanup.

Edit period(s):

Roadmap
 * One time trial run on "Category:C++ software libraries" for trial period. This is a project I am active in. Data templates will be stored in user space.


 * One time run on "Category:C++ software libraries". Data templates will be stored in the main space.
 * Next proposal application to other C and C++ categories after discussion with other members of wp:c++.
 * Future proposal application to other projects after discussion with members of respective groups.

Already has a bot flag (Y/N):

Function Details:

1. This bot will simplify the creation and maintenance of lists and other data-driven templates by creating template pages that draw from the visitor pattern and model-view-controller, applied to the category:C++ library. This may sound "complicated", but be assured this is only an academic way of stating a simple-to-understand process.

For the bot's proposed run, the bot will iterate a category and create data templates from Infoboxes. The bot will also create a list that uses the data template.

These data templates can be called in many pages and paired with different templates, to present a single information source in many ways. This concept is illustrated in. If the data template is changed, this change will propagate to the infoboxes and lists that use it. This saves users the maintenance problem of tracking and updating all infoboxes and lists when a change occurs. This method relies entirely on templates, so the automatic propagation is done by the MediaWiki software. This bot is not involved in this maintenance, only the initial change.
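As a rough illustration of the migration step described above, the sketch below pulls the named parameters out of an infobox and writes them into a data template. It is not the bot's actual source and does not use JWBF; the class and method names, the sample page text, and the `{{ {{{1}}} | key = value }}` layout of the data template are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not the bot's real code.
final class InfoboxToDataTemplate {

    /** Extracts the top-level "| key = value" parameters of the first {{Infobox ...}} call. */
    static Map<String, String> parseInfobox(String wikitext) {
        Map<String, String> fields = new LinkedHashMap<>();
        int start = wikitext.indexOf("{{Infobox");
        if (start < 0) {
            return fields; // no infobox on this page
        }
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;     // nesting depth of inner {{...}} and [[...]]
        int i = start + 2; // skip the opening "{{"
        while (i < wikitext.length()) {
            if (wikitext.startsWith("{{", i) || wikitext.startsWith("[[", i)) {
                depth++;
                current.append(wikitext, i, i + 2);
                i += 2;
            } else if (wikitext.startsWith("}}", i) || wikitext.startsWith("]]", i)) {
                if (depth == 0) {
                    break; // assumed to be the "}}" that closes the infobox
                }
                depth--;
                current.append(wikitext, i, i + 2);
                i += 2;
            } else if (wikitext.charAt(i) == '|' && depth == 0) {
                parts.add(current.toString()); // a pipe at depth 0 separates parameters
                current.setLength(0);
                i++;
            } else {
                current.append(wikitext.charAt(i));
                i++;
            }
        }
        parts.add(current.toString());
        // parts.get(0) is the template name; the rest are "key = value" pairs.
        for (String part : parts.subList(1, parts.size())) {
            int eq = part.indexOf('=');
            if (eq > 0) {
                fields.put(part.substring(0, eq).trim(), part.substring(eq + 1).trim());
            }
        }
        return fields;
    }

    /** Writes the fields as a data template that passes them to whichever template is named in {{{1}}}. */
    static String toDataTemplate(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("{{ {{{1}}}\n");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append("| ").append(e.getKey()).append(" = ").append(e.getValue()).append('\n');
        }
        return sb.append("}}").toString();
    }

    public static void main(String[] args) {
        // Made-up page text for a made-up library.
        String page = "{{Infobox software\n| name = ExampleLib\n"
                + "| developer = [[Example Org|Example]]\n| license = MIT\n}}\nExampleLib is a library.";
        System.out.println(toDataTemplate(parseInfobox(page)));
    }
}
```

The scanner only tracks `{{`/`[[` nesting so that pipes inside links or nested templates are not mistaken for parameter separators; real infoboxes may need more care (comments, `<nowiki>`, unnamed parameters).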

2. If a flag is set by the operator, this bot will add an infobox to elements in a category if it can't find one.

3. New additions to the category would still require manual user maintenance or regeneration of the list. This bot is not operated continuously.

4. This bot will need permissions to post websites, as these are common elements of software lists.

For more information:

User:DatabaseBot



Discussion
A bot can't run itself. Please provide accurate operator information. Q T C 08:10, 31 March 2009 (UTC)

Fixed, I added my primary account and left the rationale for the prior operator information. Gsonnenf (talk) 08:33, 31 March 2009 (UTC)


 * It's difficult for me to understand what this request is proposing. It looks like you're proposing separating out data from articles throughout Wikipedia, using a bot, and displaying that data via templates in a difficult-to-understand way using the principles of a lesser known data architecture theory. Do I have that right? – Quadell (talk) 14:02, 1 April 2009 (UTC)

You have some of the idea correct, but the scope and subject of your comment do not reflect what I intend. Below is a clarification:

SCOPE
 * The initial scope of this project is the category:c++ libraries, an orphaned category I have decided to maintain.
 * I need to generate tables for these libraries anyway, but it is too tedious to do it by hand.
 * Further propagation of this method would require additional discussion and review of prior results and reception.

ARCHITECTURE
 * I'm proposing using the Model-View-Controller architecture, one of the most well-known and widely used architectures in GUI and web programming.
 * This architecture pattern is discussed on pages 4 and 5 of the highly regarded book Design Patterns: Elements of Reusable Object-Oriented Software (ISBN 0-201-63361-2), as well as nearly every other recent book on design patterns.

USABILITY
 * Displaying data with a template would be very easy. You would type { { dataPage | templateDisplayPage } }. I couldn't think of an easier way to display data in a template.
 * To change data, you would simply edit the data page.
 * To change the display, you would simply edit the display template page.
 * This method is completely compatible with current display templates.
 * The current way to change data is to try to find every page, list, navbox, etc. where a specific piece of data is used and change it. This is especially ridiculous for rapidly changing data such as software release version or country population.

Gsonnenf (talk) 20:07, 1 April 2009 (UTC)
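The data page / display template split described in the USABILITY list can be modelled outside the wiki in a few lines. The following is only a sketch under stated assumptions: the class name, the two views, and the field names `name` and `version` are hypothetical, and `transclude` merely imitates what { { dataPage | templateDisplayPage } } would do inside MediaWiki.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical model of the proposal, not MediaWiki behaviour itself.
final class DataTemplateDemo {

    /** A "display template": turns the fields of a data page into wikitext. */
    static final Function<Map<String, String>, String> INFOBOX_VIEW =
            d -> "{{Infobox software | name = " + d.get("name")
                    + " | latest release version = " + d.get("version") + "}}";

    /** A second display template that renders the same fields as a table row. */
    static final Function<Map<String, String>, String> TABLE_ROW_VIEW =
            d -> "|-\n| " + d.get("name") + " || " + d.get("version");

    /** Imitates {{dataPage|displayTemplate}}: the data page hands its fields to the chosen view. */
    static String transclude(Map<String, String> dataPage, Function<Map<String, String>, String> view) {
        return view.apply(dataPage);
    }

    public static void main(String[] args) {
        Map<String, String> dataPage = new LinkedHashMap<>();
        dataPage.put("name", "ExampleLib"); // made-up library
        dataPage.put("version", "1.0");     // made-up version

        System.out.println(transclude(dataPage, INFOBOX_VIEW));
        System.out.println(transclude(dataPage, TABLE_ROW_VIEW));

        // One edit to the data page is picked up by every view on the next render.
        dataPage.put("version", "1.1");
        System.out.println(transclude(dataPage, INFOBOX_VIEW));
        System.out.println(transclude(dataPage, TABLE_ROW_VIEW));
    }
}
```

The point of the model is that the version string lives in exactly one place; neither view stores a copy of it.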

Update: The bot is coded. I will post the source upon its tentative approval. Gsonnenf (talk) 20:43, 1 April 2009 (UTC)


 * As a rule, I think we generally aren't that bothered about the details of the implementation (your MVC comments, if I read them right, fall under this category), but more about the kinds of edits that the bot will be making. Basically, it would be useful to see what the average wiki-user will see when looking at the bot's contributions.   [[Sam Korn ]] (smoddy) 22:07, 1 April 2009 (UTC)


 * Hi Sam, the MVC comments are related to how the data will be stored on Wikipedia, not the implementation of the bot (I think this is what you meant?). I had posted an example on the bot's page, but it seems the example is unclear. I'll go ahead and draw up a better presentation with less CS and software engineering jargon. Gsonnenf (talk) 01:54, 2 April 2009 (UTC)
 * The devs usually frown on bots using Wikipedia as a database instead of using an actual database. Q : Chat  02:09, 2 April 2009 (UTC)


 * I agree that bots shouldn't use Wikipedia as their database. This bot would be putting some redundant information strings, which occur in multiple places on multiple wiki pages, in a reference template. This would help solve the maintenance problems of certain pages, lists, etc. displaying outdated information. It would result in a reduction of the amount of redundant data on Wikipedia. Instead of the same (or outdated) cut-and-pasted information existing in multiple pages, users could simply reference the data template.


 * Of course, this bot will also make lists from proposed categories, helping people maintain lists. Having accurate lists is very important to the quality of Wikipedia, as it helps users organize and identify invalid, outdated, and missing content.Gsonnenf (talk) 11:19, 2 April 2009 (UTC)


 * This diagram ([[File:User DatabaseBot Presentation Of Information Architecture.pdf]]) shows how the data template I was talking about works. For the scope of its use please see the initial application. Thanks! If you have any further questions feel free to ask, but PLEASE be specific and cordial. It's very difficult to answer vague or ambiguous questions, re-answer questions that are addressed elsewhere, or answer questions that are veiled insults. Thank you! Gsonnenf (talk) 00:42, 3 April 2009 (UTC)


 * Thanks for the added information -- I understand much more what you intend the bot to do. (It's similar to a template I've come across: .)  My principal concern is that this is more than just the introduction of an automated task -- it is a new content-storage paradigm.  As such, I think it should probably be discussed at e.g. the Village Pump.  There are valid questions to be asked before it is approved.  Firstly, do we want to go further down the route of not having the wiki-markup of a page correspond to its display, thereby creating an extra hurdle for inexperienced users to edit?  Secondly (and not unrelated to the first point), do we want to go further down the route of abstracting data from its display?  MVC is a fine principle for programming, but MW syntax is not a programming language.  These are concerns that are bigger than this bot request, and should probably be discussed elsewhere first.
 * Another query I have is about "this bot will also make lists from proposed categories, helping people maintain lists". What do you mean by this?  Could you give an example?
 * Thanks!
 * [[Sam Korn ]] (smoddy) 22:41, 4 April 2009 (UTC)


 * Hi Sam, that is an interesting template. I wouldn't mind using a different template style that had the same result. I am aware that it is a shift from the "dominant" data storage paradigm, but both you and Anomie presented instances where this paradigm is currently in use on Wikipedia. I think applying this paradigm to several subsets of pages and discussing its adoption within local wiki communities before bringing it to the attention of the more bureaucratic global community is the best approach. Discussing it on a global level before any experimentation, metrics or user feedback has taken place is a sure way to breed heated/moot arguments based predominantly on conjecture. Grassroots approaches in the open source community are encouraged and often result in iterative design that provides solutions to problems as they occur. In this way, we can either:


 * Present a working model to the global community with positive results that can address criticisms with real experience.
 * or
 * Conclude that a working model cannot be established, and roll back the changes. (This can easily be done by a bot, because all instances of its usage will be linked to the info template.) We can optionally report our failure to the wiki community so others can learn from it.


 * For your second question related to making lists from categories: This bot will receive the content of a category, such as Category:C++ libraries, take the data from the infoboxes of the pages in this category, and build a table which may be titled "List of C++ Libraries and Frameworks" where users can look at the libraries and sort them by license, platform, genre, etc. This is consistent with the WP handbook of style which encourages the co-existence of categories, tables, lists, etc. Gsonnenf (talk)
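For the list-building step described in the comment above, a minimal sketch might look like the following. It assumes the infobox fields of each category member have already been collected into maps; the class name, column names, and sample rows are hypothetical, and only the `wikitable sortable` table class is standard MediaWiki markup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not the bot's real code.
final class LibraryTableBuilder {

    /** Builds a sortable wikitable with one row per page and one column per requested field. */
    static String buildTable(List<String> columns, List<Map<String, String>> rows) {
        StringBuilder sb = new StringBuilder("{| class=\"wikitable sortable\"\n");
        sb.append("! ").append(String.join(" !! ", columns)).append('\n');
        for (Map<String, String> row : rows) {
            List<String> cells = new ArrayList<>();
            for (String column : columns) {
                cells.add(row.getOrDefault(column, "")); // leave the cell empty if the infobox lacks the field
            }
            sb.append("|-\n| ").append(String.join(" || ", cells)).append('\n');
        }
        return sb.append("|}").toString();
    }

    public static void main(String[] args) {
        // Made-up entries standing in for pages of a category.
        List<Map<String, String>> rows = List.of(
                Map.of("name", "LibA", "license", "MIT", "platform", "Cross-platform"),
                Map.of("name", "LibB", "license", "GPL"));
        System.out.println(buildTable(List.of("name", "license", "platform"), rows));
    }
}
```

Sorting by license, platform, or genre is then handled by the `sortable` class in the reader's browser, so the bot itself does not need to order the rows.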


 * I disagree that giving something a trial and then assessing it is necessarily better than a well-formulated proposal. You can, after all, show exactly what this will look like -- a demonstration will gain nothing on a comprehensive proposal with good examples.
 * My engagement with was to repeatedly remove it!  I run a bot task that updates cricketers' statistics in infoboxes, making the template unnecessary.  It might be worth considering whether you could follow this method with a bot to do a similar task in updating many pages using simple wiki-text.
 * Whatever our conclusion about your main proposal, your plan for creating lists sounds excellent!
 * [[Sam Korn ]] (smoddy) 13:45, 8 April 2009 (UTC)


 * Hi Sam, I don't think the binary nature of Experiment vs. Proposal fully reflects the situation. I believe that submitting a full proposal that has a huge impact on Wikipedia to a global group, without real-world metrics or case studies, is a sure way to kill any proposal. If brought to a global committee there are a great number of people who would oppose this simply because it "changes" things or is performed by a "bot". Two of the primary arguments I see from these people:
 * * It would be too complicated to code. (Pure Conjecture : Answerable by case study)
 * * It would be too complicated for new users. (Pure Conjecture: Answerable by case study)
 * are based on conjecture. A good example of this is the current debate in Date formatting and linking poll/Autoformatting responses. A very large number of its opponents liberally state that it's "too complicated to program correctly" and too difficult for new users. These statements are purely speculative. It is not known how new users will respond, and the difficulty of the program is subject to the experience of the programmer. A case study could answer these questions.


 * So I am making this less formal proposal within a local community (C++ and Bots), so we can study the users' reaction and adaptation to the format. This method is analogous to SBIR/STTR Phase I grants, wherein small amounts of money are given to study "high risk/high benefit" ideas in an effort to reduce the risk (which is often attributed only to lack of data) and prove the benefit. If a phase is successful, Phase II/III grants are usually granted with much higher effect and consequence but are subject to much higher scrutiny. There are many other analogous systems within academia and industry.


 * The idea of updating these pages using a maintenance bot would work, but then these pages would be entirely dependent on me. I'd have to keep a list of places the data was used, and other users wouldn't be able to create lists using it without my intervention. An idea behind template-driven data is that it's not dependent on the bot. The bot is only used for an initial migration. Gsonnenf (talk)

It looks like this bot is intended to create for C++ software libraries the same sort of setup we have for the country data templates used by flag and the like. The question I have is whether each of these data templates would have any use besides just two articles; for example, where would the Qt data be used besides an infobox in Qt (toolkit) and a row in a table in List of widget toolkits? Anomie⚔ 17:42, 5 April 2009 (UTC)


 * Hi Anomie, the following is an incomplete list of Qt occurrences in lists. More could exist that I didn't locate.


 * infobox
 * Qt (toolkit)


 * table
 * List of widget toolkits
 * List of formerly proprietary software
 * List_of_PIGUI_packages
 * List of trademarked open source software


 * list
 * Graphical_user_interface_builder
 * List_of_software_engineering_topics


 * (Tentative)
 * List of C++ Frameworks and Libraries (A table I intend to create)


 * Please note that if we were to link these lists via info templates, we could automatically generate a list of pages that linked to the info page using the built-in Wikipedia search. Another thing to note is that the existence of an info page can increase its usage (hopefully within useful lists). I'd be interested in knowing how the country template affected the usage of its contained data. Gsonnenf (talk)

No activity or discussion in a month. – Quadell (talk) 13:11, 8 May 2009 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.