User:Kpjas/Proposed keyword search mechanism


 * Please remember that this text by Krzysztof P. Jasiutowicz is of historical value only.

The search mechanism on Wikipedia is rather basic. As Wikipedia grows it will become more and more inefficient and inadequate. We are not aware of any plans to make the Wikipedia software more robust, to take into account the size of the encyclopedic data being gathered (at an astonishing speed) and the hopefully growing number of Wikipedia end-users. BTW we have no idea what even the approximate figures on hits for different Wikipedia pages are. Sigh. Another point is that Wikipedia is weak at categorizing its data. Some efforts are being made, but with the arrival of EB articles they will definitely lag behind. Why is Google successful and Altavista not? Because Google has an ingenious system of categorizing and ranking Web pages. I have only a vague idea about the Wikipedia software, but perhaps my idea might be viable. The mechanism:

* Each page should have a subpage called "Keywords"
* on the page the author/editor puts keywords describing the page's data
* on each line there are one or more keywords
* keywords are delimited (white space or comma)
* the keywords are ranked
* ranking is in top down order
* the higher a keyword is on the page, the higher its relevance
* each line ends with
* lines beginning with "-" are skipped
* ranking is absolute - on all pages keywords in the same line have the same weight
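The rules above could be sketched roughly as follows. This is only an illustration, not existing Wikipedia software: the function name `parse_keywords` is made up, blank lines are assumed to be ignored, skipped "-" lines are assumed not to count towards the ranking, and keywords are lowercased for convenience. The line terminator is not specified in the text, so the sketch simply treats the end of the line as the end.

```python
import re


def parse_keywords(text):
    """Map each keyword to the rank (1 = top line) of the line it appears on.

    Hypothetical helper: the name and the details below are assumptions.
    """
    ranks = {}
    rank = 0
    for line in text.splitlines():
        line = line.strip()
        # Lines beginning with "-" are skipped; blank lines too (assumption).
        if not line or line.startswith("-"):
            continue
        rank += 1  # ranking is top down: the first kept line has rank 1
        # Keywords are delimited by white space or commas.
        for keyword in re.split(r"[,\s]+", line):
            if keyword:
                # If a keyword repeats, keep its first (highest) position.
                ranks.setdefault(keyword.lower(), rank)
    return ranks


sample = """search, keywords
- this line is skipped
ranking wikipedia
"""
print(parse_keywords(sample))
```

Because the rank depends only on the line position, a keyword on the second line of one "Keywords" subpage carries the same weight as a keyword on the second line of any other, which is what "ranking is absolute" asks for.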

Then suitable software can easily search those pages, digest them and present nice search results. Ha.
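A minimal sketch of what such software might do with the digested pages. Everything here is an assumption made for illustration: the `search` function, the sample `index` (page title mapped to keyword ranks, where 1 is the top line) and the weighting of 1/rank per matching keyword are not part of the proposal, which only says that higher lines are more relevant.

```python
def search(index, query):
    """Return (page, score) pairs for a query, best match first.

    `index` maps a page title to a dict of keyword -> line rank (1 = top).
    """
    terms = [t.lower() for t in query.split()]
    results = []
    for page, ranks in index.items():
        # Assumed weighting: a keyword on line r contributes 1/r to the score.
        score = sum(1.0 / ranks[t] for t in terms if t in ranks)
        if score > 0:
            results.append((page, score))
    # Highest score first; ties are broken alphabetically by page title.
    results.sort(key=lambda item: (-item[1], item[0]))
    return results


# Made-up sample data, as if read from each page's "Keywords" subpage.
index = {
    "Google": {"search": 1, "engine": 1, "ranking": 2},
    "AltaVista": {"search": 2, "engine": 1},
    "Poland": {"country": 1, "europe": 2},
}
print(search(index, "search ranking"))
```

Any other function that decreases with the line number would do as a weight; 1/rank is just the simplest choice.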

What do you think? Or maybe a much, much better software solution is currently under way at Bomis? My 2 cents. Kpjas