Wikipedia talk:Top 10 Google hits

Not using exact matches on phrases
All the ones I've added have turned up in the top 10 without using quotations to return an exact match. (I'm guessing that's the way most people search.) --KQ 11:21 Sep 2, 2002 (PDT)


 * I wonder if we should standardize on using the q argument for Google's search for single-word searches and the as_epq argument for multi-word searches (which parses them as a phrase, not as a group of words)? The results differ... --Shallot 22:02, 6 Apr 2004 (UTC)


 * Some search links also include the argument X=1, which may be an old name for the present exact match (as_epq). --Shallot 22:10, 6 Apr 2004 (UTC)
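
To make the proposed convention concrete, here is a minimal sketch of how such links could be built. The function name `google_search_url` and the base URL are assumptions for illustration; only the `q` and `as_epq` argument names come from the discussion above.

```python
from urllib.parse import urlencode

# Assumed base URL for illustration; the discussion only names the arguments.
GOOGLE_SEARCH = "https://www.google.com/search"


def google_search_url(terms: str) -> str:
    """Build a search link following the convention proposed above.

    Single-word searches use the `q` argument; multi-word searches use
    `as_epq`, so the words are treated as an exact phrase.
    """
    words = terms.split()
    if len(words) <= 1:
        params = {"q": terms.strip()}
    else:
        params = {"as_epq": " ".join(words)}
    return GOOGLE_SEARCH + "?" + urlencode(params)


single = google_search_url("glockenspiel")
phrase = google_search_url("ECMAScript programming language")
print(single)
print(phrase)
```

Whether exact-phrase links are the right default is still the open question here, since most people search without quotes.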

Strangely titled articles
This page has some misleading hits, because we titled our articles strangely. "ECMAScript programming language" is indeed hit #1 on a search for "ECMAScript Programming Language" (without the quotes). But who would search for that? A more likely search "ECMAScript" has the Wikipedia article at #22. DanKeshet

Google+Wikipedia outage
At the moment, Google doesn't seem to be returning any wikipedia articles as hits (though it still comes up with the front page if you search for "wikipedia"). Should we be worried about this? --Camembert 16:22 Jan 5, 2003 (UTC)


 * Yes! We should be very worried, Google is one of our main sources of new contributors. Does anyone know anything about this? Enchanter


 * Looking more closely, it seems to have the urls indexed, but for some reason hasn't cached any of the texts or page titles from the /wiki directory (see results from, for example). Why this has happened, I don't know. Could it have crawled us when we were down, or something? I'm guessing, not something I know anything about, this. --Camembert

Removal policy
What is the policy on removing entries here if they are not in the top ten? I would suggest flagging them if they are in the top 20, and removing them if they are not. WDYT? --snoyes 02:42 Feb 21, 2003 (UTC)


 * Well, the current Google situation is a bit problematic. We dropped off the index due to a robots.txt mishap for a while, now we have reappeared, but not all pages seem to be indexed. I'd suggest waiting until mid-March or so, which is when a second Google index round should be complete. Then we should go about deleting the pages which no longer show up. --Eloquence


 * Yeah we still haven't recovered from dropping-off Google and I don't expect we will be back where we were before the drop-off until late spring. It's a real bummer that we got all the media attention at a time when our articles lost their previously very high Google ratings. --mav

Large article
This article is now Large (32k+). I recommend that we reorganise it. It also seems to me that it would be a good idea to include the date on which the article was in the "top ten" since the top ten is pretty dynamic over time. The two ideas could be combined if we split the page into monthly or quarterly sub-pages, something like this:
 * 2002
   * Top 10 Google hits (2002, Jan-Mar)
   * Top 10 Google hits (2002, Apr-Jun)
   * Top 10 Google hits (2002, Jul-Sep)
   * Top 10 Google hits (2002, Oct-Dec)
 * 2003
   * Top 10 Google hits (2003, Jan-Mar)
   * Top 10 Google hits (2003, Apr-Jun)
   * Top 10 Google hits (2003, Jul-Sep)
   * Top 10 Google hits (2003, Oct-Dec)
 * etc.

Of course the current page doesn't have dates, so it's a bit difficult to do this retrospectively but it could be done by anybody (or anybot) willing to consult the page history. Doing this would also make it easy to drop old entries which were of lesser interest if need be. -- Derek Ross


 * This is probably a good idea, though I should probably point out that we still haven't been fully reindexed by Google (glockenspiel, for example). --Camembert


 * The date idea would not be very maintainable. Breaking the list up alphabetically is the better choice. --mav

Red links
Interestingly, red links seem to get spidered by Google, such as Braga. I thought our robots.txt file sorted this out? Martin


 * Yep - that is a bug. It might be a leftover from when we had the Google problems. --mav
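
One way to check what a robots.txt file actually permits is Python's standard `urllib.robotparser`. The sketch below is only illustrative: the robots.txt rules, the host `example.org`, and both URL layouts are hypothetical stand-ins, not the site's real configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block the script directory that edit links point into,
# but leave the article paths crawlable.
ROBOTS_TXT = """\
User-agent: *
Disallow: /w/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical URLs: an existing article, and the edit form a red link leads to.
article_url = "http://example.org/wiki/Glockenspiel"
red_link_url = "http://example.org/w/wiki.phtml?title=Braga&action=edit"

article_allowed = parser.can_fetch("Googlebot", article_url)
red_link_allowed = parser.can_fetch("Googlebot", red_link_url)

print("article crawlable:", article_allowed)
print("red link crawlable:", red_link_allowed)
```

If a red-link target still shows up in search results despite a rule like this, the page may have been indexed before the rule took effect, which would fit the leftover theory above.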

Title not NPOV
The title of this page is not NPOV. I, like others, don't really agree that we should "bundle" WP so tightly with a private company. --Nerd 15:11 Mar 20, 2003 (UTC)


 * Wikipedia is not being bundled. Google handles around 80% of all English language searches made, and the Wikipedia list is neither greatly boosting the reputation of the company nor acting as much of an advert. Wikipedia's high pagerank is pleasing and the article reflects the value people put on Google searches to return accurate results. There is a certain pleasure-perturbation in seeing Wiki material top a Google search. 62.253.64.7


 * But there are some opinions, such as http://www.google-watch.org/pagerank.html, saying that pageranking creates popularity rather than expressing it. :) Nevertheless, I do think that this title is not neutral: why "Google" and not "alltheweb" or something else? Or should we just mention the search engines that bring us traffic? Is this really neutral? --Nerd


 * Why "Google"? Because they refer to results on the Google search engine.  If you want to add results for other search engines, add them to another page--this one has already grown too big to fit on one in any easily editable form.


 * I think your argument is essentially a non-starter: it's not asserting anything controversial or subject to interpretation; it's asserting something that is easily and independently verifiable. In other words, we're not saying "No one should see Midnight Cowboy because it's base and disgusting"; rather, we're saying something more akin to "2 + 2 = 4"; and anyone who comes along can count the pieces and see if indeed they do.  You might as well argue that we shouldn't have lists of Academy Awards winners because there are other awards for films like the ones at Cannes and the Golden Raspberries.  (Personally, I find the Raspberry winners inevitably entertaining, but for all the wrong reasons).  Koyaanis Qatsi 19:44 24 Jul 2003 (UTC)

Safe vs. non-safe search(?)
I found a good website that may be useful for this: (I used Eurovision Song Contest 1956 as an example) and we come 1st for both safe search and non-safe search. -fonzy

Google hits for Wikipedia articles
Moved from Village pump on Thursday, September 25th, 02003.

Just as an experiment, I searched in Google for topics I was interested in / contributed to / created / whatever in Wikipedia. In many a case, Wikipedia articles seem to feature within the first two pages! Has something changed so that the whole site has a better weightage, or is it something to do with individual articles? I had tried a month ago and did not get any hits. In any case, I think the responsibility of giving factual information has increased tremendously. KRS 03:03, 21 Sep 2003 (UTC)


 * I'd noticed this. I suspect Google has given Wikipedia articles a built-in higher weighting ... since most Wikipedia articles won't be linked-to from outside, and thus wouldn't rate high with Google's default algorithms.


 * Definitely a responsibility, I agree --Morven 03:08, 21 Sep 2003 (UTC)


 * It is very noticeable when you see a poor stub, go to Google to try to patch it up a bit and find the WP page is #1 (often because the page title is worded obscurely). The overall WP average is pushing these pages up above the good pages out there on the web on these subjects! :) Pete 11:21, 21 Sep 2003 (UTC)


 * There is definitely special consideration from Google. Oftentimes, even minutes after a new page is created, it registers in the Google search results, which I find fascinating. Fuzheado


 * A colleague of mine heard a talk by someone from Google, giving information on their page ranking. One thing that would probably do Wikipedia pages much good is that if a page is linked to with a link text that corresponds to the search text, this is a big plus in ranking. Thus, Wikipedia-Wikipedia links might well give high rankings. Andre Engels 13:11, 22 Sep 2003 (UTC)

Automation?
Manually updating a topic like this seems crazy to me. It doesn't seem like it would be hard to write a script that Googles every Wikipedia page by its title (without quotes?), parses the first page of output, and collects statistics on the appearance of the Wikipedia page in the top 10. (I'm not sure how happy Google.com would be about queries, but we could always spread them out over a few weeks if necessary.)  Naturally, I don't have time to do this myself right now. ;) --Steven G. Johnson
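
A rough sketch of the statistics step of such a script follows. It assumes the search results have already been fetched and parsed into ordered lists of URLs (fetching, parsing, and spreading the queries out over time are left aside), and the names `wikipedia_rank` and `top10_titles`, the `wikipedia.org` host, and the sample data are all made up for illustration.

```python
from urllib.parse import unquote, urlparse


def wikipedia_rank(result_urls, title, site="wikipedia.org"):
    """Return the 1-based position of the article for `title`, or None.

    `result_urls` is the ordered list of result URLs from one search.
    A result counts only if its host is `site` (or a subdomain of it)
    and the last path segment matches the title.
    """
    wanted = title.replace(" ", "_").lower()
    for position, url in enumerate(result_urls, start=1):
        parts = urlparse(url)
        host = (parts.hostname or "").lower()
        if host != site and not host.endswith("." + site):
            continue
        page = unquote(parts.path.rsplit("/", 1)[-1]).lower()
        if page == wanted:
            return position
    return None


def top10_titles(results_by_title):
    """Return the sorted titles whose own article is within the first ten results."""
    found = []
    for title, urls in results_by_title.items():
        if wikipedia_rank(urls[:10], title) is not None:
            found.append(title)
    return sorted(found)


# Made-up sample data standing in for parsed search result pages.
sample = {
    "Glockenspiel": [
        "http://example.com/instruments",
        "http://en.wikipedia.org/wiki/Glockenspiel",
    ],
    "ECMAScript": ["http://example.com/spec"],
}
print(top10_titles(sample))
```

Keeping the ranking logic separate from the querying means it can be tried on saved result pages before anyone decides how, or whether, to query Google in bulk.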

What about our clones?
Seems like people who have taken a copy of Wikipedia can have higher pagerank than ours. For example, see, where http://www.wordiq.com and http://www.ezresult.com show up, but we don't.

Is this a good thing or a bad thing? Do we list such pages on this page?


 * This is a rather bad thing, as it makes it significantly less likely that someone who reads the article will actually contribute to it, or the encyclopedia in general. I have come to the conclusion that our clones _must_ be using various schemes of search engine optimisation in order to get a higher rank on google, given our prominence. I'm not sure what we could do about this, but those searches shouldn't be listed here if our results are not in the top 10. - snoyes 05:36, 7 Feb 2004 (UTC)