User talk:OsamaK/June 2012

(the following orphaned text note belongs with the June archive --Lexein (talk) 14:40, 28 October 2012 (UTC))

I'm reviving this discussion here from archive, because I have a better response to Jimmy Wales --Lexein (talk) 02:43, 25 June 2012 (UTC)

Alexa Bot
Hi Osama! Long time, no chat!

I'm curious as to what you might think about generally replacing alexa statistics throughout English Wikipedia with quantcast numbers. In the industry, Quantcast numbers are considered to be much more accurate, and Alexa numbers to be highly questionable. At the very least, I wonder what the possibility might be of updating these numbers in infoboxes to include both, rather than Alexa alone.--Jimbo Wales (talk) 11:35, 2 April 2012 (UTC)
 * Hi, Jimmy! I think it will be easy to adapt the code to any ranking services but it seems that Quantcast only provides US-based rankings. If this is actually true, I wonder if we should adopt it and be US-centric. What do you think?--OsamaK (talk) 06:15, 3 April 2012 (UTC)
 * Hmm, that's a good point. I don't know.  Maybe if we report both with a notation?  Like, give the Alexa one first, and then the Quantcast one (with "US only" in parentheses). We should probably seek advice from others!--Jimbo Wales (talk) 11:04, 3 April 2012 (UTC)
 * Jimbo - what's a good source for industry analysis of Alexa vs Quantcast vs others? I'm a little surprised that Alexa is questionable, since it's affiliated with Internet Archive and the Wayback Machine (or was, anyways). --Lexein (talk) 15:07, 3 April 2012 (UTC)
 * I don't think it was ever affiliated with Internet Archive and the Wayback Machine. It is an Amazon company and has been for many years. I should clarify that I don't think that Alexa is in any way morally questionable. It's just that their methodology and accuracy have been questioned by many. Here is a typical example. Quantcast, on the other hand, directly measures traffic for lots of sites that are signed up to be "Quantified". See this list for many examples of quantified sites. For those sites, the data is as accurate as you are going to find anywhere, because they directly measure the traffic with an invisible 1x1 pixel. They extrapolate in various ways (I don't know exactly) to get traffic figures for sites that are not Quantified. Quantcast simply has a lot more data to work with and will therefore be a lot more accurate overall.--Jimbo Wales (talk) 09:42, 4 April 2012 (UTC)
 * Ah - hm, dunno where I got the idea about an Alexa/IA connection. Fair enough. -Lexein (talk) 02:37, 6 April 2012 (UTC) See below. --Lexein
 * I added the suggestion here.--OsamaK (talk) 06:31, 7 April 2012 (UTC)


 * Quantcast, as Jimbo noted, provides accurate stats only for entities which subscribe to be "quantified". For all nonsubscribers, such as Flickr, it declares: "This publisher has not implemented Quantcast Measurement. Data is estimated and not verified by Quantcast. Get Quantified!™"
 * Nitpick: Alexa and Internet Archive have been "affiliated" for a very long time, though only for archive purposes, not for web statistics. From the Alexa Internet article, we have: "Alexa's operation includes archiving of webpages as they are crawled. This database served as the basis for the creation of the Internet Archive accessible through the Wayback Machine. In 1998, the company donated a copy of the archive, two terabytes in size, to the Library of Congress. Alexa continues to supply the Internet Archive with Web crawls."
 * The critique Jimmy found and the one it references are both outdated in terms of browser and OS support, but the main point, that the statistics are undersampled, is valid.
 * If Alexa's traffic and ranking data are partially estimated (without being declared so), and are based on a limited number of sources (users), should we be relying on them at all, per WP:RS, since RS prohibits using user-generated content? If Quantcast's rankings for non-"quantified" sites are estimated (and declared so), should we be using ranking numbers at all? I mean, the assignment of a rank implies certainty, where certainty is not assured by the source.
 * If we use any ranking in infoboxes, should it be from only Quantcast "quantified" sites?
 * Or should we only post rankings where Alexa and Quantcast agree? Posting both seems like having two clocks - you'll never know which is really right. --Lexein (talk) 02:43, 25 June 2012 (UTC)

Why not start a separate RfC on the discussion page of the infobox(es)? mabdul 08:07, 25 June 2012 (UTC)
 * I don't mind moving the discussion. JW raised a point, which closer examination has made very interesting (to me). So I thought the thread deserved a more complete response. I don't know: should the use of Alexa or Quantcast be discussed at the infobox(es), or RSN? I guess I'd hope that the whole thread would be transcluded... --Lexein (talk) 16:11, 25 June 2012 (UTC)