Wikipedia talk:Wikipedia Signpost/2014-01-29/Traffic report


 * Re "why is important information like view counts outsourced to volunteer servers liable to crash or lose functionality?" - because until recently the analytics team was two men and a dog ;p. I understand that a pageviews API is in the pipeline, just as soon as we've (a) built a system that can actually host the pageviews data and (b) worked out what precisely a pageview is. Ironholds (talk) 05:32, 2 February 2014 (UTC)
 * Referer data should be available in a way that doesn't compromise privacy, i.e. as aggregate data, just like access log data.  For example, how many hits come from Google, Facebook, BBC, etc. is not a privacy problem.  Also, when a single webpage includes a link to a Wikipedia page that sends more than 200,000 hits our way in a month, that referer data is really useful and doesn't affect privacy.  Even at low referer rates over smaller timespans, e.g. 100 or so per day, there is no issue.  I couldn't find a relevant enhancement request in the Bugzilla database. John Vandenberg (chat) 08:15, 2 February 2014 (UTC)
 * That data is in the request logs, so building a system that can even store those logs is a dependency. Ironholds (talk) 03:52, 3 February 2014 (UTC)
 * 'is' should be 'isn't'? John Vandenberg (chat) 04:35, 3 February 2014 (UTC)
 * No, the referrer for a page is in the request logs as they currently stand. The problem is that the volume of requests is such that the WMF is still building a system that can store them, which is kind of necessary to have, say, a reliable API for this kind of information. As an example: if we look just at Mobile, and ignore everything that isn't a direct request (in other words, ignore requests for page elements), we're talking 70 million rows of data every day. Ironholds (talk) 04:45, 3 February 2014 (UTC)
 * Thanks for clarifying. So you have the request logs stored.  It isn't a new scale problem.  It is at most a 2x expansion.  The WMF already provides raw pageview data, and academics and hobbyists would want raw referer data that would look almost identical to the raw pageview data.
 * A separate issue is providing a user-friendly system for accessing the data. WMF pageviews infrastructure is a lot more mature now, and the WMF may want to deliver referer analytics tools in the new style of infrastructure, but that is a layer on top.  And a new system to process referer data isn't much different from one that processes pageviews, so it should be fairly straightforward, since the WMF has conquered most of the relevant problems with pageviews. John Vandenberg (chat) 10:54, 3 February 2014 (UTC)
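To make the aggregation idea in this thread concrete, here is a rough sketch of how referer data could be reduced to per-host counts with a minimum-count cutoff. Everything in it is an assumption for illustration only: the `(page, referer)` record shape, the function names, and the cutoff of 100 (borrowed from the "100 or so per day" figure above) are hypothetical and do not describe the WMF's actual log format or pipeline.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical cutoff: (page, host) pairs seen fewer times than this are dropped.
MIN_COUNT = 100


def referer_host(referer):
    """Reduce a Referer URL to its host, discarding path and query string."""
    try:
        host = urlsplit(referer).hostname
    except ValueError:
        # urlsplit rejects some malformed URLs (e.g. a broken IPv6 literal).
        host = None
    return host if host else "(none)"


def aggregate_referers(records, min_count=MIN_COUNT):
    """Count hits per (page, referer host) and suppress low-count pairs.

    `records` is an iterable of (page_title, referer_url) tuples; this shape
    is an assumption, not the real request-log schema.
    """
    counts = Counter((page, referer_host(ref)) for page, ref in records)
    return {key: n for key, n in counts.items() if n >= min_count}
```

Only the host survives aggregation, so search terms or paths carried in the referring URL never reach the published counts. Whether a cutoff is needed at all, and at what value, is exactly the policy question raised above.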