Talk:Opinion polling for the 2011 Canadian federal election

From Wikipedia, the free encyclopedia

Date of polling

Now that this has its own page, I'll move my reminder with it.

Remember that the Date of Polling in the chart is the last day on which the polling took place. When the poll was released (there's usually a date on the press release) is irrelevant, as pollsters or news agencies often sit on polls for some time in order to release them in a way to receive broader coverage. For example, the Ipsos-Reid poll released on January 24, 2010, was conducted over three days, January 19-21. So the Date of Polling should be January 21, 2010. Again, the date the poll results were released does not matter and has no place in the chart. Thank you. --Llewdor (talk) 19:20, 25 January 2010 (UTC)[reply]

Seriously, if you're linking to the actual poll results, take the time to read them to find out when the polling took place. The date on the press release DOES NOT MATTER. The polling dates are always mentioned in the methodology description at the end, and often in the opening paragraphs. Listing "June 3" just because that's the day the results were printed in the newspaper is just lazy. Somewhere in the official poll results there will be a line telling you that the polling took place from May 28 to June 1, and then the date you want to list in the table is June 1. Please try. --Llewdor (talk) 19:18, 5 June 2010 (UTC)
When I add results, which I usually get from [1], the date that the poll ended is underneath the table. This is the correct date, right? I think so, but I want to make sure that I'm doing it correctly. Bkissin (talk) 19:48, 5 June 2010 (UTC)
You are doing it correctly. The Date of Polling is the last day on which the poll took place. Listing any other day, or even the full range, would be incorrect. --Llewdor (talk) 20:08, 9 August 2010 (UTC)[reply]

Graph

I created this file with Excel using data from Opinion polling in the 41st Canadian federal election. It is a graph of the data compiled in the article, as well as the average of the data.

Please advise if it can be included in the main article.


—Preceding unsigned comment added by Bob Traver (talkcontribs) 22:18, 2 February 2010 (UTC)[reply]

I like it, but not so big ;) Ottawa4ever (talk) 22:34, 2 February 2010 (UTC)[reply]
It seems too jagged. During the discussion to create this page, New Zealand '08 was mentioned. I think that the graph should have a line of best fit like at Opinion polling for the New Zealand general election, 2008#Graphical summary. 117Avenue (talk) 04:22, 3 February 2010 (UTC)
I agree that the New Zealand graph is easier on the eyes, but the idea of trends just opens it up to point-of-view and original research issues. For my tastes (maybe others differ, I don't know...) I like raw data that I can see and interpret as a reader the way I'd like. Just my thoughts. Ottawa4ever (talk) 12:49, 3 February 2010 (UTC)
Hi all. Thanks for your input and a special thanks to Rrius for shrinking the graph :) As for the style of the graph; this is where I admit my somewhat-Pakled nature and admit I'm not sure how to do it. I'll check it out and get back to you. Can-eh-dian Redhead (talk) 22:53, 3 February 2010 (UTC)[reply]
Would it be beneficial to put it at the top of the article, rather than the end? Ottawa4ever (talk) 09:48, 5 February 2010 (UTC)
I think so too... I did it before reading this. 117Avenue (talk) 03:46, 6 February 2010 (UTC)[reply]
How about a scatterplot? If a line of best fit would be too much interpretation, just show the data points and don't draw the line. I find a scatterplot is the best way to illustrate trends in this sort of dataset without making those trends explicit. --Llewdor (talk) 17:56, 8 February 2010 (UTC)

Does new month mean updated graph? 117Avenue (talk) 21:50, 6 March 2010 (UTC)[reply]

How was this new graph calculated? I'm not seeing enough points for all the polls that have taken place. 117Avenue (talk) 23:28, 30 April 2010 (UTC)[reply]
I would like to know this too; it appears that the graph is only showing one poll (I may be wrong though), but certainly not the actual raw data of each poll. It seems to be averages of all polls, which I would disagree with, as I think the data shouldn't be interpreted in that way but should be presented raw. I am being bold and reverting back for the time being and leaving a note on the user's page. Please feel free to revert my edit if you disagree with my reasoning. Ottawa4ever (talk) 08:59, 1 May 2010 (UTC)
I'm not sure why we can't do a scatterplot with a line for the average. Even just an average is okay because it is simply a summary of the data already set out. In any event, we need some sort of update as the current graph is only current through January. -Rrius (talk) 01:17, 2 May 2010 (UTC)
Using an average (unless you have a source which shows it) is, I think, original research. We are introducing unpublished ideas by displaying an average in the graph (it's not a routine calculation like simply adding two numbers, and it can be done in different ways). Besides that, each poll is conducted differently (different sample sizes, questions, etc.), and by calculating an average we are giving equal weight to each poll. However, when it comes to new vs. old information, I personally would prefer non-OR that is out of date over OR that is up to date. I think a plot showing raw data that can be kept up to date is still preferable. But if everyone (consensus) prefers and agrees on an average then I won't argue the issue further. Ottawa4ever (talk) 08:57, 2 May 2010 (UTC)
It is not OR, it is a summary. -Rrius (talk) 23:43, 3 May 2010 (UTC)[reply]
I just don't see it that way, unless I'm interpreting WP:OR entirely wrong: Do not combine material from multiple sources to reach or imply a conclusion not explicitly stated by any of the sources. If one reliable source says A, and another reliable source says B, do not join A and B together to imply a conclusion C that is not mentioned by either of the sources. This would be a synthesis of published material to advance a new position, which is original research. "A and B, therefore C" is acceptable only if a reliable source has published the same argument in relation to the topic of the article. (From WP:OR under SYNTH.) I don't see how we can just calculate an average of all polls (especially considering the circumstances of each poll are different, as mentioned before) and apply it without a source that does this. Ottawa4ever (talk) 08:53, 4 May 2010 (UTC)
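
To illustrate the point made above that an average "can be done in different ways", here is a minimal Python sketch. The three polls and their sample sizes are invented for illustration and are not taken from the article; the function names are likewise hypothetical. It compares an unweighted mean, which treats every poll equally, with a mean weighted by sample size.

```python
# Hypothetical polls for one party: (support in percent, sample size).
# These figures are invented for illustration only.
polls = [(36.0, 1000), (33.0, 2000), (39.0, 500)]


def simple_mean(polls):
    """Unweighted average: every poll counts equally."""
    return sum(share for share, _ in polls) / len(polls)


def weighted_mean(polls):
    """Average weighted by sample size: larger polls count for more."""
    total_n = sum(n for _, n in polls)
    return sum(share * n for share, n in polls) / total_n


print(round(simple_mean(polls), 2))    # 36.0
print(round(weighted_mean(polls), 2))  # 34.71
```

The two results differ by more than a point on the same made-up data, which is the sense in which the choice of averaging method is a judgment call rather than a single routine calculation.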

This graph is now almost seven months out of date, unless someone can bring it up to date I think it ought to be removed from the article as it misleads readers as to the current state of affairs. - Ahunt (talk) 10:34, 3 November 2010 (UTC)[reply]

Agreed, it needs to be updated at least once a month. 117Avenue (talk) 18:46, 3 November 2010 (UTC)[reply]
It has now been a week since I proposed this. In accordance with WP:SILENCE I think we have a consensus. Since no one has come forward to update the graph I will remove it. If someone updates it and agrees to keep it up to date then it can go back in. - Ahunt (talk) 23:07, 9 November 2010 (UTC)[reply]
I have managed to use OpenOffice.org Calc to create an entirely new graph for this article based on the data from the article. Since I now have this I can keep it up to date as time goes by, so I will reinsert it into the article. Comments are welcome. - Ahunt (talk) 02:08, 15 January 2011 (UTC)
Could be cropped, I think there's a lot of white space. 117Avenue (talk) 05:29, 15 January 2011 (UTC)[reply]
I like it (thanks for taking the time to do this!), but 117 is right some white space could be trimmed. Ottawa4ever (talk) 11:34, 15 January 2011 (UTC)[reply]
Thanks for the feedback and the formatting. I have cropped it to reduce the border whitespace and can make further adjustments as time goes by. - Ahunt (talk) 13:20, 15 January 2011 (UTC)[reply]
Since there are no datapoints at 55% or higher, you could probably improve the image by reducing the range of the vertical axis to 0-55, rather than 0-60. Unless you've already tried that and decided it didn't work. Just a thought. --Llewdor (talk) 19:51, 3 March 2011 (UTC)[reply]
Okay, I have done that, see what you think. Of course if any party gains some real popularity I'll have to expand the vertical scale, but probably not for long before the election would be called! - Ahunt (talk) 20:59, 3 March 2011 (UTC)[reply]
I think it's terrific. As you point out, if any party got anywhere near that outside of some weird crisis (those datapoints over 50% occurred during the 2008 Canadian parliamentary dispute), we'd see an election immediately. I'm pretty confident the 0-55 range will be sufficient. --Llewdor (talk) 18:50, 15 March 2011 (UTC)

Updated image

Had a go at making a rough version of the updated graph. Please feel free to recommend additional improvements. I can also send the Excel document if anyone wants to work on it. Ottawa4ever (talk) 17:37, 2 May 2010 (UTC)

I tried to update my original image, but...

I'm trying to do my best to update the image, but since I created it, somehow it was moved to something called the commons. I have no idea or appreciation for what this bloody "commons" is, but I joined it anyway in the hopes of updating the poll graph. I did all the work in Excel, but when I tried to upload it, the commons told me I'm "too new" a member to update the file. This sort of frustrating thing makes me want to cease contributing altogether.

Call me a Pakled, but why the hell can't I just pull up the file THAT I CREATED and hit one button to upload/update the damn thing?!?!?!? Given some of the subsequent comments, perhaps other Wikipedians who seem to be less than satisfied with my original graph (Ottawa4ever, perhaps?) can take over and update the graph to their liking.

--Can-eh-dian Redhead (talk) 15:20, 15 May 2010 (UTC) aka a frustrated and disgruntled semi-wikipedian

You are more than welcome to upload the file under the name of the graph file that I proposed above (replace the image). I do not believe the one I created was moved to commons (yet). You could then update your original on commons afterwards, when you are no longer 'new' there. I may express my dissatisfaction with averages being interpreted/plotted on the graphs, and prefer simply raw data, but that is my interpretation of the OR policy guidelines. As you can see, others disagree with me at times here. I accept that and respect that, and hence I haven't posted the image I made on the page, and won't unless others agree with me. Do feel free to make the changes you think are a must; be bold, go ahead. We can improve images as we go with greater participation here and iron out any kinks with some consensus. As for experiences with commons, the basic idea is to make the image available for other Wikipedia projects, so that those in French, for instance, can use the graph. It can be a bit confusing to work with, though. I've had a few problems with images, so I can relate to the pain of needing a one-click solution. Ottawa4ever (talk) 20:30, 15 May 2010 (UTC)

Margins of error

As per User:142.104.139.204's edit, aren't margins of error meaningless without stating the confidence interval? —Arctic Gnome (talkcontribs) 06:37, 28 July 2010 (UTC)[reply]

If they are to be included in this table, shouldn't they be included for all polls? In my opinion sample size and margin of error don't need to be said, the columns were created without discussion. 117Avenue (talk) 06:57, 28 July 2010 (UTC)[reply]
We can drop sample size, but I think that margin of error is important to mention. Too often in news comment boards I see people either a) making a big deal about changes in polls that are within the margin of error, or b) complaining that the poll can't be meaningful because it didn't include all Canadians. Telling people the exact margin of error helps clarify what the numbers actually mean. —Arctic Gnome (talkcontribs) 07:36, 28 July 2010 (UTC)[reply]
Well b) is just stupid, everyone should know that a country wide election is a lot of tax dollars. 117Avenue (talk) 07:44, 28 July 2010 (UTC)[reply]
I'm very much in favour of reporting sample sizes and would think they are as important as listing errors. They go hand in hand. Both sets of information give the data a neutral presentation. Ottawa4ever (talk) 08:08, 28 July 2010 (UTC)
The only important part I see the sample size playing in this kind of data is determining the margin of error, which is also reported. I don't see why we need to report both. If one study has a small sample size, it will have an accordingly large margin of error, so I don't see why we actually need to report the sample size. —Arctic Gnome (talkcontribs) 17:21, 5 August 2010 (UTC)[reply]
Point taken. I'm just wondering if the average reader sees that correlation, but provided the links are there, OK, consider my opposition withdrawn. So long as the margin of error is still presented I wouldn't have any objection now to the removal of the sample size. Ottawa4ever (talk) 18:43, 5 August 2010 (UTC)
Thanks. I strongly support the margin of error being there, so there shouldn't be a worry of that changing. —Arctic Gnome (talkcontribs) 18:50, 5 August 2010 (UTC)[reply]

The margin of error needs to be added for all polls, this is an incomplete table. 117Avenue (talk) 19:16, 5 August 2010 (UTC)[reply]

We could calculate the margin of error for any poll where we have the sample size, but that might constitute original research. --Llewdor (talk) 20:06, 9 August 2010 (UTC)[reply]
Any of the ones that report the sample size will probably also report the margin of error, so I don't think we'll have a problem, we just need to get it done. The main thing to remember is to use the margin of error from the responses to the voting question, not the margin of error of the whole sample. —Arctic Gnome (talkcontribs) 04:47, 10 August 2010 (UTC)[reply]
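
For reference, the textbook relationship between sample size, confidence level and margin of error discussed in this thread can be sketched in Python as below. This is only an illustration under stated assumptions: it assumes a simple random sample and the worst-case share of 50%, the sample sizes are hypothetical, and the function name is made up for this example. Pollsters may weight their samples or apply design effects, so a published margin of error should be preferred over anything computed this way.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sample_size, confidence=0.95, share=0.5):
    """Margin of error in percentage points, assuming a simple random sample."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return 100 * z * sqrt(share * (1 - share) / sample_size)


full_sample = 1000  # hypothetical number of respondents
decided = 850       # hypothetical number who answered the voting question

print(round(margin_of_error(full_sample), 1))                   # 3.1
print(round(margin_of_error(decided), 1))                       # 3.4
print(round(margin_of_error(full_sample, confidence=0.99), 1))  # 4.1
```

The second figure shows why the margin for the voting question is wider than the margin for the whole sample, and the third shows that a stated margin only means something alongside its confidence level.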

Ipsos Reid

Has anyone else noticed that the new Ipsos poll adds to 101% and not 100 (39+25+18+10+9=101).....Probably a reason of rounding, but still weird Ottawa4ever (talk) 14:20, 15 February 2011 (UTC)[reply]

I imagine it is a rounding off error as they seem to always report in whole percents and not decimals. Actually some of the other polls that report decimals and then indicate that the accuracy is +/- 3.5% aren't adding any accuracy with those decimal points. - Ahunt (talk) 15:30, 15 February 2011 (UTC)[reply]
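
A small Python sketch of how rounding alone can produce a total of 101. The unrounded shares below are hypothetical: they were chosen to be consistent with the reported whole numbers (39, 25, 18, 10, 9) and are not Ipsos Reid's actual figures, which are not given in this discussion. Shares are stored as integer tenths of a percent to avoid floating-point noise.

```python
# Hypothetical unrounded shares in tenths of a percent (387 means 38.7%).
# Invented to match the reported whole numbers; not the pollster's real data.
shares_in_tenths = [387, 247, 176, 98, 92]


def round_to_whole_percent(tenths):
    """Round a share given in tenths of a percent to the nearest whole percent."""
    return (tenths + 5) // 10


rounded = [round_to_whole_percent(t) for t in shares_in_tenths]

print(sum(shares_in_tenths) / 10)  # 100.0
print(rounded)                     # [39, 25, 18, 10, 9]
print(sum(rounded))                # 101
```

Four of the five shares round up and only one rounds down, so the rounded figures gain a full point even though the underlying shares sum to exactly 100.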

Angus Reid poll 11-18 February 2011

I recently found this article that describes an exclusive poll for the Toronto Star. It doesn't seem to be available anywhere else including on the Angus Reid website. The problem with the Star story is that the data is incomplete, once again the Bloc national numbers are missing, meaning it can't be used unless they can be found. - Ahunt (talk) 19:10, 26 February 2011 (UTC)[reply]

Looks like User:Llewdor found it, thank you! I have updated the graph to include that data. - Ahunt (talk) 21:58, 26 February 2011 (UTC)[reply]

Graph for the election period

Hey Everybody!

So, with an election looming, my thought was that we should create a new graph for polling during the election period, much like they did during the last British general election. This does not mean getting rid of the full chart between elections, however. It's just to give the reader a more accurate sense of the polls during the election period.

Thoughts?

Bkissin (talk) 00:38, 25 March 2011 (UTC)[reply]

I take care of the current graph. I was thinking that it might be more useful to mark the election call date (today) with a line on the current graph, which will show the trends both before and after the call (in other words, how the election call affected the poll results), but I can do a separate graph easily if there is a consensus to go that route instead. Let's see if we can get some input on this here and I will be happy to do it either way. - Ahunt (talk) 14:36, 25 March 2011 (UTC)
I can't remember, but I'm assuming the polls will be more frequent, in which case I would support an additional graph with just the election polling. A line is a nice touch though, per Ahunt. Ottawa4ever (talk) 12:09, 26 March 2011 (UTC)
The polls will definitely be more frequent. I expect that we will have about 6 polls a week X 5 weeks or so, for a total of somewhere around 30 data points. I can always do both, adding the election polling to the larger graph with a line to show where the election started (which is today officially) and also having a separate detail graph that just shows the election period. That is easy to do and I think would keep everyone happy! - Ahunt (talk) 12:25, 26 March 2011 (UTC)
We don't seem to have reached a consensus on this yet. Today we got our first post-election-call poll so I have added it to the existing graph, with a line to show the polls after the call. See what you think. - Ahunt (talk) 15:05, 28 March 2011 (UTC)[reply]

If someone is planning on making a new graph for the election period, may I point out the graph that was made for the previous election: Opinion polling in the Canadian federal election, 2008. The scatterplot style makes the results easy to read and interpret. (I personally find the current graph here hard to read). I also like the margin of error bands graphically represented in the 2008 graph. But many thanks to the person/people who are keeping this up to date. Best, Eb.eric (talk) 19:10, 28 March 2011 (UTC)[reply]

I have seen that graph and thought that it really lacked detail. I am doing the current graph and the software I am using won't do the sort of graph you are talking about, but perhaps I can find an application that will. - Ahunt (talk) 19:41, 28 March 2011 (UTC)[reply]
I don't understand what you mean by lacked detail... as far as I can tell it has everything the current graph has (dates, trend line, legend) but more as well (individual polls plotted, margin of error). Maybe you just mean the overall appearance, which is a fair comment. Good luck finding software that will let you plot the data like the 2008 graph. Perhaps someone can recommend something? Cheers Eb.eric (talk) 22:13, 28 March 2011 (UTC)[reply]

I would support a separate graph if it wasn't a line graph like the current one. As I have expressed previously (3 February 2010), a line graph like this is too up and down; it doesn't display trends because the gap between the highs and lows is 5 points tall. If you take the busiest month from the current graph and stretch it horizontally, I think that you won't see the proper trend waves. The graph would either have to be made by a program that can calculate a line of best fit (allegedly computers aren't biased), or only one polling firm can be graphed (I don't know if that means different shades of blue for each firm's Conservative numbers etc., or separate graphs for each firm). 117Avenue (talk) 23:48, 28 March 2011 (UTC)

Good thoughts. My own graphing limitations may preclude your former suggestion, but I may be able to make the latter work. Let me give it a try and post some results here tomorrow on the talk page before they go live to see what everyone thinks. We have a little bit of time before we have enough data to make a really nice looking graph that means something. - Ahunt (talk) 00:17, 29 March 2011 (UTC)[reply]
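
As a rough illustration of what "a program that can calculate a line of best fit" would do, here is a minimal Python sketch of an ordinary least-squares straight line. This is an assumption about what is meant: it is the simplest kind of best-fit line, not the smoothed curve used on other polling pages, and the data points are invented for the example.

```python
def best_fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: day of polling and one party's support in percent.
days = [0, 2, 4, 6, 8]
support = [38.0, 41.0, 37.0, 40.0, 39.0]

slope, intercept = best_fit_line(days, support)
print(round(slope, 2), round(intercept, 1))  # 0.05 38.8
```

In this made-up series the polls swing by three or four points from one to the next, yet the fitted line rises by only about 0.05 points per day, which is the kind of trend a jagged connect-the-dots line graph tends to hide.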

Ipsos Reid March 23, 2011

This needs a link to the actual study not the National Post article where it is cited. —Preceding unsigned comment added by 134.117.251.4 (talk) 01:05, 25 March 2011 (UTC)[reply]

Yes it should be, but it doesn't seem to be available publicly, just behind a pay-wall. If you can find the link it can be added. - Ahunt (talk) 14:44, 25 March 2011 (UTC)[reply]

Telephone or Online Poll

Should we create a new column to indicate whether it was an online or a telephone poll? We could start with the in-campaign polls. I think there is certainly a difference in the numbers that come up in online vs telephone polls and could help to locate different trends. Thoughts? Krazytea(talk) 00:29, 29 March 2011 (UTC)[reply]

I don't have any qualms about this. Anewshore (talk) 01:23, 29 March 2011 (UTC)[reply]

Unless we can show a documented validity difference I don't see much point to this. Right now the numbers vary far more based on the polling firm and the questions asked than the polling method. - Ahunt (talk) 05:15, 29 March 2011 (UTC)[reply]
It's more information for the reader. There is an important theoretical difference between the two approaches. Online panels are not random selection--each Canadian is not equally as likely to be picked to participate. I'm not saying phone samples are perfect either, but they follow the random selection model. Anewshore (talk) 14:07, 29 March 2011 (UTC)[reply]
Actually the problem is differentiating the different kinds of on-line surveys. The ones where anyone can answer them, like news-site polls, are nowhere near random surveys, have no scientific validity and thus shouldn't be included here. Most polling firms these days do on-line surveying by randomly picking people via phone call and then asking them to fill in an on-line form versus a telephone survey. This saves phone manpower and allows direct input onto the database, again saving manpower for data input. The results are quite similar to telephone surveys these days, because the selection is done by phone in both cases. Of course none of the polls here are strictly random, because people can opt out, and especially because certain demographics no longer answer their phones to unknown numbers, skewing the data towards seniors, etc., who do. - Ahunt (talk) 14:22, 29 March 2011 (UTC)
There are still different ways to interpret the answers. Phone surveys can be subject to interviewer bias and, as you argue, some people do not have a land line on which to answer surveys; so too do online surveys miss a demographic of people who may not have easy access to the internet or perhaps are not technologically literate. Either way I think any information that is added for the reader allows them to come to their own determinations; the online/phone column to me adds to the information rather than distracts. Krazytea(talk) 21:27, 29 March 2011 (UTC)
Found a good quote "There is also vigorous disagreement over the merits of online vs. telephone polling. Online surveys tend to be skewed in favour of younger, more affluent voters. Telephone surveys tend to over-represent older, rural and lower-income voters. Web polls are typically larger, but voice polls, which use random sampling, offer a better microcosm of the electorate." Link Krazytea(talk) 18:21, 1 April 2011 (UTC)[reply]
That is comparing random sample telephone polls with self-selected (e.g. anyone can participate) internet polls. The latter are non-scientific, but that is not the type of polling we are talking about here, where people are selected randomly by a polling company and asked to complete the poll online and where self-selected people cannot just jump in and complete the survey. That is actually why I object to the indication on the chart, as there are two kinds of on-line surveys in current use, one valid and one not valid, and it does not differentiate between them. - Ahunt (talk) 22:16, 1 April 2011 (UTC)

Scatterplot?

I'm not sure who is in charge of the graph, but I think a scatterplot would be much more suitable. Since there are so many polls, and there will be even more polls over the campaign trail, the lines make the data messy and hard to read on a graph. A scatterplot will help the reader identify trends and more clearly see where the poll data is.

What does everyone else think? — Preceding unsigned comment added by Anewshore (talkcontribs) 01:21, 29 March 2011 (UTC)[reply]

That would be me. As per above I was planning to work on a special election period graph as we get enough data to display and will try out a scatter plot to see if that will give a better picture. My initial inclination is to try, as suggested above, that we track each polling firm differently as it seems clear that different companies get different results, based on the wording of the questions, question order, etc. This really makes the results from one firm hard to compare with that of another and makes separate lines make more sense. - Ahunt (talk) 05:19, 29 March 2011 (UTC)[reply]
Thanks for the graphs, Ahunt! —Arctic Gnome (talkcontribs) 05:25, 29 March 2011 (UTC)[reply]
There is a bit of a history there. We had a graph at one point a while ago, but it wasn't kept up and got very out of date. I started a new graph with the raw data we have and my somewhat limited software and skills and that is what we have today, not ideal, perhaps but better than no graph at all. It has been improved by suggestions from other editors as described above. I am glad it is viewed as being of some value! - Ahunt (talk) 05:34, 29 March 2011 (UTC)[reply]
Thanks for the hard work guys. I do like the idea of having separate graphs for each firm, but didn't suggest it because I know that we all have day jobs! But if you want to do that, I think it would be the most accurate, but make sure there are clear dots for each poll. They can be connected with a line because it shouldn't get as messy as all the polls combined. Anewshore (talk) 14:04, 29 March 2011 (UTC)[reply]
Actually I have the great advantage of not having a day job! Yes that is exactly what I was thinking - a different coloured line for each polling company/party and clear dots and lines. Need a bit more data, but it should work. - Ahunt (talk) 14:14, 29 March 2011 (UTC)[reply]
(edit conflict) Hear, hear! Maybe we could keep the line graph for the last two years and add a more zoomed-in scatterplot with a line of best fit for the election, like we did here for the last election. —Arctic Gnome (talkcontribs) 05:20, 29 March 2011 (UTC)[reply]
That was my plan, that the existing graph would continue in its present form to show continuity and then we will have an election time-frame graph to show more detail. I am working on the latter, but we need to wait for a bit more data to arrive to show anything right now other than an almost blank field! As outlined above when I get it working and some more data to show the first trends I'll post it here on the talk page first, to get some critique before we go live on the article page. I am limited by the software I have and know how to use, but let's see what I can do. - Ahunt (talk) 05:34, 29 March 2011 (UTC)[reply]
Ahunt, I wish there was something more appropriate than a barnstar for your graph work efforts here; you're on the ball :) Ottawa4ever (talk) 20:18, 29 March 2011 (UTC)
I accept gifts of cash, too ;) Naw, it is all part of just building a great encyclopedia! - Ahunt (talk) 20:35, 29 March 2011 (UTC)[reply]

Hello! Anyone interested in me continuing the plots as done in the last election? http://en.wikipedia.org/wiki/Opinion_polling_in_the_Canadian_federal_election,_2008 galneweinhaw (talk) 22:24, 29 March 2011 (UTC)[reply]

You are a bit late to the party here, we have been working on this for a bit. See below. - Ahunt (talk) 23:55, 29 March 2011 (UTC)[reply]
Sorry, I didn't mean to butt in. I can provide you my R-script that generated the plots from last election if you want to try it? Just want to help provide pretty graphics =) galneweinhaw (talk) 16:52, 30 March 2011 (UTC)[reply]
I can't decide for the entire group, but I really like your graphics. Could you provide me with the script for my own personal use? Anewshore (talk) 17:55, 30 March 2011 (UTC)[reply]
My graphing capabilities are admittedly pretty limited, so if the consensus is that editors would rather see a 2008-style graph for the election period polling than the one I put together yesterday (see below and on the article page) then that is fine with me, as long as someone will commit to updating it daily while the election is going on.
On another note I would like to see the existing 2008-2011 graph retained as I think it provides a good long-term view and I am happy to keep updating it until 2 May. - Ahunt (talk) 18:03, 30 March 2011 (UTC)[reply]
Anewshore, here's the script I have so far, not totally up to date with the latest R and ggplot2 :
library(ggplot2)

# Load the 2011 poll data and the 2008 election results
polls <- read.csv("Polls2011b.csv", header=TRUE)
election <- read.csv("Election2008.csv", header=TRUE)

# Open an 8 x 5 plotting window (windows() is available on Windows only)
windows(width = 8, height = 5)

ggplot(polls, aes(x = Date, y = Popular_Support, colour=Party, size=Error, weight=Error)) +
  scale_area(to = c(0,3), legend=F) +
  #scale_shape(solid=FALSE)+
  # Plot each row of the poll data as a point
  geom_point() +
  #geom_pointrange(aes(ymin=Min, ymax=Max)) +
  # Smoothed trend curve for each party
  stat_smooth(span = 1.0) +
  # Party colours
  scale_colour_manual(values = c(Conservative=alpha("blue",1),Liberal=alpha("red",1),NDP=alpha("orange",1), Green=alpha("green3",1),Bloc=alpha("turquoise4", 1))) +
  # Mark the 2008 election results and label them with the vote shares
  geom_point(data=election, size=3, shape=5, legend=F) +
  geom_point(data=election, legend=F) +
  geom_text(data=election, legend=F, aes(label = c(37.6,26.2,18.2,"10.0",6.8)), size=3, hjust=-0.3, vjust=-0.2)+
  #scale_x_date(format="%d-%b", major="4 days", minor="1 day", lim = c(as.Date("2011-03-24"), as.Date("2011-05-02"))) +
  # x axis in numeric day values, with hand-written date labels
  scale_x_continuous(name = "Date", limits=c(40625,40665), breaks=c(40625, 40629, 40633, 40637, 40641, 40645, 40649, 40653, 40657, 40661, 40665), labels=c('14 Oct 08\nElection', '27 Mar  11', '31 Mar 11', '4 Apr 11', '8 Apr 11', '12 Apr 11', '16 Apr 11', '20 Apr 11', '24 Apr 11', '28 Apr 11', '2 May 11\nElection')) +
  scale_y_continuous(name = "% Popular Support", lim=c(0,45), expand=c(0,0)) +
  # Axis text sizes and angles
  opts(axis.text.x = theme_text(size = 9, angle = 90))+
  opts(axis.title.x = theme_text(size = 8))+
  opts(axis.text.y = theme_text(size = 10))+
  opts(axis.title.y = theme_text(size = 10, angle = 90))

galneweinhaw (talk) 02:00, 2 April 2011 (UTC)[reply]
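
A note on the x-axis numbers in the script above (40625 to 40665): they look like spreadsheet serial day numbers, i.e. days counted from 30 December 1899 as in Excel and OpenOffice.org Calc. That is an assumption, not something stated in the script, but the short Python sketch below shows that it lines up with the script's own labels. The helper name is made up for this example.

```python
from datetime import date, timedelta

# Day zero used by Excel-style serial dates (assumed here, not stated in the script).
SPREADSHEET_EPOCH = date(1899, 12, 30)


def serial_to_date(serial):
    """Convert a spreadsheet serial day number to a calendar date."""
    return SPREADSHEET_EPOCH + timedelta(days=serial)


for serial in (40625, 40629, 40665):
    print(serial, serial_to_date(serial).isoformat())
# 40625 2011-03-23
# 40629 2011-03-27
# 40665 2011-05-02
```

40629 and 40665 match the "27 Mar 11" and "2 May 11" labels. 40625 works out to 23 March 2011, and the script labels that position "14 Oct 08 Election", apparently so that the 2008 result sits at the left edge of the plot.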

Updated script for anyone interested. galneweinhaw (talk) 19:49, 4 April 2011 (UTC)[reply]

Polling graph

Daily polling graph, 2011 election

Okay as per the requests above for a detailed graph for the actual election period - I have finished creating it and added the first three polls that we have. The graph will show polls for each day, with lines and dots for each polling firm X each party. Once the same company has their second poll the lines will join the two dots for the same company and party, showing trends. The same party has the same colour symbol and line regardless of polling company, but the combination of dots and colours is unique to each company/party combination as per the legend. As with the existing graph I will keep this up to date on a daily basis. Have a look and see what you think. If there are no required fixes then this can be added to the existing graph on the article page in the same "slider" format, probably right below the main graph. - Ahunt (talk) 23:50, 29 March 2011 (UTC)[reply]

The lines are just going to make it messy. I was thinking above that you meant a separate graph for each polling firm. I think a scatterplot alone is fine (no lines). Differentiating the polling companies with different symbols (but same colour) is useful though, although there will probably be a lot of overlap. Anewshore (talk) 00:35, 30 March 2011 (UTC)
I definitely like the idea, I was thinking after I had made the mixed firm graph suggestion, that the different lines didn't need to be different shades, the space between them can be our margin of error. I think though, that the legend is long, would there be a way to abbreviate it? Perhaps a list of coloured lines for parties, and a list of dots for the firms? 117Avenue (talk) 02:17, 30 March 2011 (UTC)[reply]
I think the lines are going to be needed or else it will just end up as a sea of dots, but perhaps we can see how it looks after a few polls from the same company are included, which will produce some lines. You can't see right now, because we don't have two polls from the same company, but I have made the lines very thin (0.2 mm) and light and therefore not too obtrusive. The lines are all the same colour for each party, so that they will be easy to correlate visually. Unfortunately the software picks the dot style and it is not selectable, meaning that if I have a legend for party colour and for dot-by-company it will make the legend even longer than it is now. The software is definitely the limitation there. - Ahunt (talk) 11:55, 30 March 2011 (UTC)[reply]
You can note as the polls come in I'll keep updating the graph, it now has the Nanos data from yesterday added. - Ahunt (talk) 12:47, 30 March 2011 (UTC)[reply]
Since there don't seem to be any major objections I'll put it into the article and we can make adjustments as we go. For instance if people don't like the lines when they start to appear I can certainly cut them easily. - Ahunt (talk) 13:28, 30 March 2011 (UTC)
Scatterplots are useful. They are much better than having a bunch of lines that go up and down, as that just makes trends invisible. I'd argue that it wouldn't look like a sea of dots; instead you could see the ups and downs and how much each poll varies. If it appears as a sea of dots, then that means we shouldn't take polling seriously at all, as there is no pattern to any of the polling companies' results. However, if there is a pattern, then we can argue that there must be something in the polls. Look at the 2008 election scatterplots (those have an average line, which you can add if you want, but it's not needed); if you just look at the dots you can see definite patterns. - Preceding unsigned comment added by 70.72.130.79 (talk) 15:17, 30 March 2011 (UTC)
That is a good point. I did some simulations both ways and I thought the lines helped more than hindered. I would suggest that we leave it for now and see how it looks in a day or two, since there are no lines yet. If the consensus is that people don't like the lines I can quickly remove them and I am happy to make the graph look as the consensus dictates. - Ahunt (talk) 17:56, 30 March 2011 (UTC)[reply]
It's confusing having the multiple dot shapes, and them not being attributed to anything. Microsoft Excel should be able to do it, I'll give it a try. 117Avenue (talk) 20:02, 30 March 2011 (UTC)[reply]
Agreed. The points themselves only need to represent two variables: Party and Polling Authority, and should therefore only have two degrees of variability (i.e. the total number of symbol/colour combinations should equal the total number of parties * the total number of different polling agencies). As it stands a square vs a triangle has no meaning. It would be much clearer if there were a one-colour-per-party relationship and a one-shape-per-polling-agency relationship. 199.213.163.136 (talk) 20:27, 30 March 2011 (UTC)
I think I just figured out how to make that happen - give me a few minutes here and I'll try to get a new version up! - Ahunt (talk) 22:13, 30 March 2011 (UTC)[reply]
Okay I got it working, with the same symbol for each polling company and the same colour for each party. I have to admit that this little project is teaching me a lot about Calc! I will next come up with a simplified legend. - Ahunt (talk) 22:46, 30 March 2011 (UTC)[reply]
Looks good. Except should the titles at the top be capitalized? 117Avenue (talk) 22:49, 30 March 2011 (UTC)[reply]
I have a new version with a new simplified legend now posted. The title is uppers/lowers right now, do you think it should be all caps? -Ahunt (talk) 23:25, 30 March 2011 (UTC)[reply]
No, per MOS:CAPS, everything after the first letter should be lower case. 117Avenue (talk) 00:34, 31 March 2011 (UTC)[reply]
No problem I will fix that with the next upload for both graphs! - Ahunt (talk) 15:27, 31 March 2011 (UTC)[reply]
Okay the title is fixed, thanks for pointing that out. I uploaded new graphs for the Nanos data from 30 Mar and this also means the lines are starting to show up as this was Nanos' second result. - Ahunt (talk) 15:45, 31 March 2011 (UTC)[reply]
I know we've finally settled on a good graphical representation, and it does look good, but here is something else to consider which may be a bit of a compromise... In the New Zealand general election, 2008, the graphical representation of polling was both a scatterplot and a line-graph. I don't know what program was used to create it, but it may be a good idea. Bkissin (talk) 23:54, 30 March 2011 (UTC)[reply]
This graph is a scatterplot and line graph, but the lines will join the results for each party by each polling company, unless people decide they want the lines turned off once they see how it looks. Right now there are no lines because no polling company has had two polls in the campaign yet. - Ahunt (talk) 00:04, 31 March 2011 (UTC)[reply]

Sorry I've gotta be so nit-picky about this, but why isn't "Polling since the 2008 election" centred? 117Avenue (talk) 19:11, 31 March 2011 (UTC)[reply]

No problem, we should get it right! That was because the legend was moved from the side to the graph itself, but I'll fix it in the next update. - Ahunt (talk) 19:18, 31 March 2011 (UTC)[reply]
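
The one-colour-per-party, one-symbol-per-firm scheme settled on in this thread can be sketched as follows. This is only an illustration in Python, not the Calc setup actually used for the graph; the colour names, marker names and the choice of firms are hypothetical placeholders.

```python
from itertools import product

# Hypothetical palettes for illustration; not the ones used in the actual graph.
PARTY_COLOURS = {"CPC": "blue", "LIB": "red", "NDP": "orange", "BLQ": "cyan", "GRN": "green"}
FIRM_MARKERS = {"Nanos": "circle", "EKOS": "square", "Ipsos-Reid": "triangle"}

def series_styles(parties, firms):
    """Return one (colour, marker) style per (firm, party) series."""
    return {
        (firm, party): (PARTY_COLOURS[party], FIRM_MARKERS[firm])
        for firm, party in product(firms, parties)
    }

styles = series_styles(PARTY_COLOURS, FIRM_MARKERS)

# Every series gets a unique colour/marker pair, but the legend only needs
# one entry per party plus one entry per firm, not one per combination.
legend_entries = len(PARTY_COLOURS) + len(FIRM_MARKERS)
```

With five parties and three firms this gives 15 distinct series but only 8 legend entries, which is the shortening of the legend asked for above.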

Rolling Polls

Some polling firms - Nanos Research among them - are now conducting rolling polls. Each day they poll roughly 400 respondents, and then release the previous three days' worth of results (with a 1,200 sample size). So while we do have a new dataset each day, it's not all new data. Should we report on them every time (effectively triple-counting each day's results), or should we only report each poll with entirely new data? --Llewdor (talk) 18:30, 31 March 2011 (UTC)[reply]

A number of companies have done three-day rolling polls in the past and statistically they are valid, so I think they should be included. That is one of the reasons why I have at least initially included the lines on the scatterplot by party and polling company, because as 1/3 is added and 1/3 is dropped off each day you will see the poll data evolve. It should give results similar to other polling methods, such as once a week, but the lines will be smoother and trends should be evident sooner. Each polling company is asking different population samples different questions at different intervals. Because participating in these polls is refusable (and lots of people refuse or don't even answer the phone), none of the samples are random and so the results really aren't comparable between different companies. This all means that each company needs to have its data compared only to its own previous polls and that comparisons between different companies' results will be difficult and imprecise. In this case Nanos is doing his own thing and as long as his methods are sound, which they seem to be, they should be recorded. - Ahunt (talk) 19:30, 31 March 2011 (UTC)
I agree with Ahunt. We are not averaging out polls, we are seeing how the parties progress day to day. Like Ahunt said, polling data between companies is not really comparable, as they all have different methodologies. This is why sites such as ThreeHundredAndEight are flawed in combining polls with arbitrary weights. Each company's data must be judged on its own merits. - Preceding unsigned comment added by Anewshore (talk · contribs) 21:13, 31 March 2011 (UTC)
I'm aware the polls are statistically valid, but including the overlapping polls will give the impression that the datapoints are independent, and they're not. We're not doing anything statistically incorrect by listing them each day, but I do think we're doing something misleading. --Llewdor (talk) 03:57, 1 April 2011 (UTC)[reply]
Well then I think just an explanatory note to their data would suffice. Perhaps this could be a footnote or similar? - Ahunt (talk) 13:00, 1 April 2011 (UTC)[reply]
I have added a note, see if you think that handles it. I think the key thing is that readers understand what is going on. - Ahunt (talk) 13:06, 1 April 2011 (UTC)[reply]
I like it. As long as the reader is given the information, what he does with it isn't our problem. Thanks. --Llewdor (talk) 19:00, 5 April 2011 (UTC)[reply]
I agree, that is the important thing - clear information. - Ahunt (talk) 19:13, 5 April 2011 (UTC)[reply]
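
To make the overlap discussed in this thread concrete, here is a minimal sketch of a three-day rolling poll, assuming roughly 400 respondents per day as described above. The daily figures are made up and the function names are hypothetical; this is not any polling firm's actual procedure.

```python
def rolling_releases(daily_samples, window=3):
    """Pool each run of `window` consecutive daily samples into one release.

    daily_samples: list of (day, respondents, support_share) tuples, one per day.
    Returns a list of (last_day, pooled_n, pooled_share) tuples.
    """
    releases = []
    for end in range(window, len(daily_samples) + 1):
        chunk = daily_samples[end - window:end]
        pooled_n = sum(n for _, n, _ in chunk)
        pooled_share = sum(n * share for _, n, share in chunk) / pooled_n
        # The release is dated by its last day of polling.
        releases.append((chunk[-1][0], pooled_n, pooled_share))
    return releases

def independent_releases(releases, window=3):
    """Keep every `window`-th release so that no respondent is counted twice."""
    return releases[::window]

# Made-up daily figures, for illustration only.
daily = [(1, 400, 0.38), (2, 400, 0.40), (3, 400, 0.39),
         (4, 400, 0.41), (5, 400, 0.37), (6, 400, 0.42)]
releases = rolling_releases(daily)
fresh = independent_releases(releases)
```

Six days of polling yield four releases of 1,200 respondents each (dated days 3, 4, 5 and 6), but only the releases for days 3 and 6 share no respondents, which is why consecutive rolling releases are not independent data points.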

2008 Style Plot

2008 style plot. I had to update my R script to be compatible with the latest version of R and ggplot2. Still need to work out what's going on in the legend there =P. Also, the LOESS regression trendlines and error bands like in 2008 aren't useful until there is more data available. galneweinhaw (talk) 08:09, 1 April 2011 (UTC)

This is great, thank you. Eb.eric (talk) 11:58, 1 April 2011 (UTC)[reply]

Thanks! I find this easier to read and spot trends. - Preceding unsigned comment added by Anewshore (talk · contribs) 15:30, 1 April 2011 (UTC)
There appears to be enough data now to add in the LOESS regression like in 08. galneweinhaw (talk) 15:11, 2 April 2011 (UTC)[reply]
The main problem I have with this representation is that you can't differentiate polling companies and, as discussed above, they all use different methodologies, asking different questions to different not-completely-random samples. As the 2008 election graph shows, almost none of the polling in the last week or so of that campaign accurately predicted the Conservative final result, with Nanos' three-day rolling polls one of the least accurate and Angus Reid, an outlier from the rest, the most accurate. That is why the other graph (which admittedly I created) distinguishes the different polling companies and links their individual results together. You can still visually integrate all the results for each party, but more importantly you can see the differences by polling company. Is there any way to account for that in this graph? - Ahunt (talk) 15:23, 2 April 2011 (UTC)
A different symbol could be provided for each polling company, like your plot does (Personally I like that too, but it's a balance of visual simplicity vs info content). The regression/trendline could also be weighted according to the sample size/error of each poll. Which polling company gets the closest is going to be a crapshoot right? I remember in 2006 when Nanos got the numbers bang almost to the decimal, and everyone was praising his rolling poll methodology =) ONE of them has to do the best ;). galneweinhaw (talk) 21:32, 2 April 2011 (UTC)[reply]
Since the 2008 election results are on the graph, it would be informative to also have some results of the polling leading to the 2008 elections (maybe the last two weeks?) in this graph to the left of the 2008 election's date if possible. This will give everybody a chance to see how well the polling surveys agreed with the election and put the current graph in perspective. Dangerouslycurious (talk) 11:22, 7 April 2011 (UTC)[reply]
  • In looking at the individual polling company results from 2006 and 2008 versus the actual election results there is no pattern there, no one company that seemed to do better than others from one election to the other. It seems that they change their methods over time and as a result I am not sure that would give any benefit to see how they did in the past, as it probably wouldn't be a good predictor of accuracy this time around. - Ahunt (talk) 11:56, 7 April 2011 (UTC)[reply]
It certainly is a trade-off. Perhaps we could just use both graphs in the article, as they both have presentational advantages and disadvantages. I know a lot of people are checking the article page regularly now that the campaign is on, including people involved in the election itself. Since each party and media outlet have their own polling firm and ignore the others, what we have here is the best aggregate election polling webpage on the internet. One more graph could only make it better and I don't see any reason to waste anyone's work! - Ahunt (talk) 21:43, 2 April 2011 (UTC)[reply]
Here's the default with polling companies added by shape (can tweak for aesthetics). Ahunt, do you have any interest in taking this one over if I provide you with the code? It's really simple to generate from a csv text file. By the sounds of it, you might enjoy trying all the different things you can possibly plot in different ways with simple parameter changes/additions. Let me know, since I won't be around for the whole election to keep it up (and I don't want to if the article isn't going to use it anyway =) galneweinhaw (talk) 21:56, 2 April 2011 (UTC)
2008 style plot sample indicating polls by pollster
That is interesting. I wish I knew anything at all about running scripts like that. I think I tracked it down, it runs on R (programming language), does it? - Ahunt (talk) 22:25, 2 April 2011 (UTC)
Yep, that's the one! Fairly easy to use and extremely powerful. Here's a pretty awesome reference manual with loads of examples: [2]. I don't think you need to know anything about running scripts. Trial and error based on examples given in the references will get you far (that's how I learned). Then you just run it and see what coolness pops out, haha. galneweinhaw (talk) 22:43, 2 April 2011 (UTC)[reply]
Given my lack of familiarity with it, the fact that you won't be around later, and that we have a current graph that is doing the job, I am probably inclined to go with what I have, which is just a simple OpenOffice Calc spreadsheet. I have that all trouble-shot, set up and working! - Ahunt (talk) 00:19, 3 April 2011 (UTC)
Updates: The size (area) of each plot point is proportional to the (inverse of the) error, and the trend is now weighted by the (inverse of the) error of each poll. galneweinhaw (talk) 20:54, 4 April 2011 (UTC)
I think it looks fine with all three graphs there, now! You can never have too many graphs! Besides each one shows different things. One request on this graph (the third one on the article page) can a title be added to it to say what it is showing? It isn't clear what the gray area or the line represents.- Ahunt (talk) 21:38, 4 April 2011 (UTC)[reply]
I agree that the three graphs all show different things and are each useful. I like the adjustments you've made to the third graph Galneweinhaw. I don't know if it's possible--but perhaps the Nanos polls should only contribute 1/3 to the mean since the same results appear in 3 separate polls (rolling polls). Ahunt will you be able to maintain this graph when Galneweinhaw can't? If not, I could try to learn but I don't have that much time. Thanks to both of you for your continued contributions. Eb.eric (talk) 02:00, 5 April 2011 (UTC)[reply]
I agree that the weighting for the Nanos rolling polls should be reduced to one third for the average line, otherwise each data set is essentially being counted three times. Add to that that Nanos' data, as in 2008, is not at all agreeing with the other polls and that indicates a possible methodology problem there. At the very least it shouldn't be over-weighted. Sorry, I don't understand how it works well enough to be able to use it and adjust what it is displaying as we go along. If someone else can do that then great! - Ahunt (talk) 11:49, 5 April 2011 (UTC)
Great points about the 1/3 for Nanos polls! I'll try this when I update this evening. galneweinhaw (talk) 23:29, 6 April 2011 (UTC)[reply]
I had a look at it - that certainly changes the trend line direction. We still need the other polling companies to put out some data, even with the weighting there is too much Nanos here and too little anyone else. I mean Ipsos-Reid and Angus Reid haven't even done a poll yet since the election was called! - Ahunt (talk) 00:37, 7 April 2011 (UTC)
I agree. 2008 had a flood of polls, like 3 or 4 a day! where are they all? galneweinhaw (talk) 08:00, 7 April 2011 (UTC)[reply]
It looks like User:Krazytea found two! - Ahunt (talk) 10:39, 7 April 2011 (UTC)[reply]

Ditto for the Ekos polls now, their weighting will have to be affected. 117Avenue (talk) 14:54, 29 April 2011 (UTC)[reply]
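
The two weighting ideas from this thread (weight each poll by the inverse of its error, and count each overlapping rolling release at one third) can be combined in a short sketch. This is a plain weighted mean in Python with made-up numbers, not the R/ggplot2 script or the LOESS trend used for the actual graph; the function names are hypothetical.

```python
def poll_weight(margin_of_error, rolling_window=1):
    """Weight a poll by the inverse of its error, split across overlapping releases."""
    return (1.0 / margin_of_error) / rolling_window

def weighted_mean(polls):
    """polls: list of (support, margin_of_error, rolling_window) tuples."""
    weights = [poll_weight(moe, window) for _, moe, window in polls]
    total = sum(weights)
    return sum(w * support for w, (support, _, _) in zip(weights, polls)) / total

# Made-up numbers: three overlapping rolling releases and one stand-alone poll.
polls = [
    (40.0, 3.0, 3),  # rolling release, counted at one third weight
    (41.0, 3.0, 3),
    (39.0, 3.0, 3),
    (36.0, 3.0, 1),  # stand-alone poll
]
average = weighted_mean(polls)
unweighted = sum(support for support, _, _ in polls) / len(polls)
```

In this example the three rolling releases together carry the same weight as the single stand-alone poll, so the average comes out at 38 rather than the unweighted 39. The same weights could in principle be passed to a weighted regression instead of a mean.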

Ranges in polling dates

I liked when the polling dates were expressed as ranges. If half of a poll was done before a given event, it's useful to know that only half of the population of a poll knew about the event when giving their answer. —Arctic Gnome (talkcontribs) 01:33, 2 April 2011 (UTC)[reply]

Aesthetics of this page

A couple recommendations I'd like to make to make this page look a little less weird:

1. Thumbnail and caption the inter-election period polls chart.
2. Move the inter-election period polls chart to the bottom (the title of the article is "polling in the Canadian federal election" and the first thing you see is a bit confusing)
3. Shrink Ahunt's polling-during-the-election plot (I've never seen an image on Wikipedia that I had to scroll to see the whole thing)
4. Add caption or footnote with description of my plot with trend

Thoughts? galneweinhaw (talk) 02:31, 5 April 2011 (UTC)[reply]

Sounds fair. Eb.eric (talk) 18:14, 5 April 2011 (UTC)[reply]
I can make my graphs smaller if that would be a better solution, or they can be reduced in size by thumb-nailing on the page and keep the full size graphs available for readability. They were originally reduced by thumb-nailing - the scrolling was added later. I also wanted to point out that the top graph also covers the election period, the second and third graphs really just show an expanded part of the election campaign period, so all three cover the campaign. - Ahunt (talk) 18:38, 5 April 2011 (UTC)[reply]
In thinking about this and trialing a few things it occurs to me that if we want to neaten up the lead section where the graphs are and make it more "small-screen friendly" perhaps the easiest thing would be to "gallery" all three graphs, which would make them three, side-by-side thumbnails with captions to explain what each one is. Readers could just then click on each one to see the full-sized graph.
On a related topic I am a little concerned that while I have been updating the first two graphs at least daily, the third one is now two days and two polls out of date. - Ahunt (talk) 11:46, 6 April 2011 (UTC)
As for thumbnails and descriptions, I think that is a good idea. As for the out-of-date graph, I too share concerns on that, but am willing to support its inclusion so long as it is not too much out of date, within reason (i.e. 3 or 4 days; a lot can happen to affect polls). Other thoughts? Ottawa4ever (talk) 12:40, 6 April 2011 (UTC)
Okay let me make that gallery change on the page itself so people can see what it looks like. Feel free to revert it if you don't like the result. - Ahunt (talk) 13:11, 6 April 2011 (UTC)[reply]
I think the default is too small, so I've made them bigger. 117Avenue (talk) 18:52, 6 April 2011 (UTC)[reply]
That actually looks pretty nice! - Ahunt (talk) 21:20, 6 April 2011 (UTC)[reply]

Date tiebreaker

What should be the tiebreaker between polls from the same date? I think currently it's just whichever gets added first, but shouldn't it be something quantitative, like alphabetical order of firms? 117Avenue (talk) 03:36, 7 April 2011 (UTC)

On the second graph that I am doing it doesn't make any difference graphically as they both show up on the same date. It does on the first graph, but the polling period is so small that you can't really visually discern that. I think in looking at the third graph that it doesn't matter there either. Unless there is a real concern I wouldn't sweat it! - Ahunt (talk) 10:37, 7 April 2011 (UTC)[reply]
That's why I suggested the second graph, you can't discern the first one :) 117Avenue (talk) 18:26, 7 April 2011 (UTC)[reply]
Exactly! - Ahunt (talk) 22:41, 7 April 2011 (UTC)[reply]
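
For reference, the alphabetical tiebreaker suggested at the top of this thread could look like the sketch below. The rows are made up, and the newest-first ordering is an assumption about how the table is laid out; the function name is hypothetical.

```python
from datetime import date

# Made-up rows for illustration: (last day of polling, polling firm).
polls = [
    (date(2011, 4, 6), "Nanos Research"),
    (date(2011, 4, 7), "Ipsos-Reid"),
    (date(2011, 4, 6), "EKOS"),
    (date(2011, 4, 7), "Angus Reid"),
]

def table_order(rows):
    """Newest polls first; ties on date broken alphabetically by firm name."""
    # Python's sort is stable, so sorting by firm first and by date second
    # keeps the alphabetical order within each date.
    by_firm = sorted(rows, key=lambda row: row[1].lower())
    return sorted(by_firm, key=lambda row: row[0], reverse=True)

ordered = table_order(polls)
```

This makes the order independent of which editor happened to add a poll first.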

Nanos polls, Conservative bias?

Am I the only one who has noticed that Nanos polls almost always report higher numbers for the Conservatives than other firms? Looking through all the polls for the last year, almost every single Nanos poll has higher Con numbers than the polls immediately before and after. Sysys (talk) 19:47, 11 April 2011 (UTC)[reply]

I have noticed that as well, they do seem to have some methodology problem there, although in the 2008 election they underestimated the Conservative vote. It could be something as simple as their polling this time around is being done on residential landlines only and in the middle of the workday, thus getting a lot of seniors on the phone and missing the working people and students. None of the polling companies are getting close to a true random sample and that, along with the choice of, and order of questions accounts for most of the variability in the data. That is why on the second graph in the article I have joined the same company/same party only with lines, as I don't believe that different polling companies' data are comparable. In the end we'll have to wait until 2 May to see which was closest. In 2008 it was Angus Reid. - Ahunt (talk) 19:53, 11 April 2011 (UTC)[reply]
Nanos polls have a bias for the two major parties, slightly against the minor parties, more so against the Greens. That's simply because they don't prompt party names so people are more likely to have the major parties on the tips of their tongues. --JGGardiner (talk) 20:05, 11 April 2011 (UTC)[reply]
Maybe I shouldn't have used the word "bias" there because it has certain connotations. That's not a comment on their accuracy but I meant relative to the other firms. --JGGardiner (talk) 20:09, 11 April 2011 (UTC)[reply]
No you are quite right, in the field of stats that is correctly called "bias". - Ahunt (talk) 21:21, 11 April 2011 (UTC)[reply]
Thanks for the vote of confidence. I did think I was using the term correctly but I know that some Wikipedians are touchy around political articles, especially during elections. I was imagining a user accusing me of suggesting that Nanos was in bed with the Conservatives. --JGGardiner (talk) 08:56, 12 April 2011 (UTC)[reply]
Pretty sure that even Nanos said on CTV that the polls were slightly biased due to support in the west. "When speaking to Question Period, Nanos said the Conservative lede is skewed towards the Prairies and B.C. to some degree because of a large local advantage in those provinces." link Not sure if that really addresses a national bias or not though. Krazytea(talk) 21:43, 11 April 2011 (UTC)[reply]
Thanks for the link. Interesting read. Though I think Nanos was talking more about a wasted vote effect. --JGGardiner (talk) 08:58, 12 April 2011 (UTC)[reply]
If I were you I would be more worried about the Toronto Sun poll commissioned by Dr. Conrad Winn for COMPAS. Krazytea(talk) 15:53, 13 April 2011 (UTC)[reply]
I saw that one, talk about a statistical outlier. I have been a little concerned in past elections that some outlying polls are intended to create a self-fulfilling prophecy and that these are not polls at all, but press releases. - Ahunt (talk) 17:02, 13 April 2011 (UTC)
Technically the word "bias" is wrong since we don't know what the real numbers are -- maybe COMPAS is correct and everybody else has an anti-CPC bias. Better to stick to "house effect". 96.49.183.90 (talk) 01:46, 14 April 2011 (UTC)[reply]

COMPAS poll

I have just read through the COMPAS polling report and I am really wondering whether this should be included or not. It was commissioned by a newspaper that supports the CPC, widely quotes the aforementioned right wing Dr. Conrad Winn, even though he wrote the report ("The principal investigator on this study was Dr. Conrad Winn.") and is written in a very non-scientific and highly biased manner. It draws polling conclusions from the English leaders' debate, and yet the polling was done 6-11 April. It reads like a party press release instead of a report on a poll. The results are also wildly divergent from the other current polling. I really don't think this is credible enough to include in the list. - Ahunt (talk) 17:29, 13 April 2011 (UTC)

I'm for its removal per the above concerns after reading this PDF. But I'm trying to do some digging into COMPAS and its reliability; my understanding is that they've only had two polls in the last 4 or 5 years. Am I wrong in my search? Ottawa4ever (talk) 18:06, 13 April 2011 (UTC)
Meh I'm fine with leaving it. A poll is a poll is a poll. We all know that certain groups and polling organizations have a natural bias. It could be in the polling method or perhaps because they lean politically one way or the other. Conservative bias is generally a little more extravagant than say liberal biased polls but the fact remains in any political organization there is going to be bias. Obviously people can look at certain polls and see the patterns. I think if anything what the polls are really telling us is that things are all over the map in BC and Quebec, but they all confirm a narrowing race in Ontario. It is information like that where we can find correlation rather than one party is up 1-10% nationally or down 1-10% nationally. I'm fine with leaving it, even if it is a little clumsy. Krazytea(talk) 18:30, 13 April 2011 (UTC)
Okay, with that one objection to excluding it from User:Krazytea I have included it in the graph. Just looking at the second graph, the COMPAS poll discredits itself by being so out of whack with everyone else. The past COMPAS polls were also way off, too. - Ahunt (talk) 20:41, 13 April 2011 (UTC)[reply]
I agree to keep it. Professional polling company commissioned by a national media chain. I'm not sure why we'd not include it just because it's wacky, the latest EKOS appears just as out of whack in the opposite direction. That's political opinion polling for you =) All the info is available to the reader of this article. galneweinhaw (talk) 20:49, 13 April 2011 (UTC)[reply]
If we exclude COMPAS for having a house effect in favour of CPC (past polls support this -- it's not just one weird poll), why not also exclude Nanos for having a house effect against GPC? How much of a house effect do we count as being enough to exclude someone? Better to keep all the data and let the reader decide which ones are outliers. 96.49.183.90 (talk) 01:42, 14 April 2011 (UTC)[reply]
I don't know if it has any bearing, but if it's only one or two polls in a sea of hundreds this campaign, it's not being given undue weight (how often is COMPAS going to be conducting polls?). In this case I'd certainly be willing to step aside and form consensus on inclusion. Ottawa4ever (talk) 08:37, 14 April 2011 (UTC)
Leafing through the report, I don't see anything terrible. They conclude the Conservatives are strong but that is in line with their data. Maybe the data is flawed as well but we don't know that. I think pollsters should be treated much like RS; they may have their biases but if they are reputable enough, we need a specific reason to exclude them. --JGGardiner (talk) 08:52, 14 April 2011 (UTC)[reply]
Thanks for the responses to this issue. I have been convinced it should stay and so it remains in the list and is also on the graphs. We'll let the readers decide what weight to give it. - Ahunt (talk) 12:50, 14 April 2011 (UTC)

If anyone is still interested, Eric Grenier, the journalist, says that COMPAS is an outlier because it appears to exclude "leaning" voters from the results which is unusual. --JGGardiner (talk) 19:11, 15 April 2011 (UTC)[reply]

Youth vote analysis from EKOS

How to fit this in I don't know but check it out. Of course the seat counts are based on - I think - preferential balloting....maybe not, maybe those are FPTP outcomes, which is very interesting re the Greens....makes me wonder which ridings those are.Skookum1 (talk) 09:08, 14 April 2011 (UTC)[reply]

It is an interesting analysis, but doesn't fit into this article. Perhaps it could be included in Canadian federal election, 2011. - Ahunt (talk) 12:47, 14 April 2011 (UTC)[reply]
Here is another poll that is again very interesting but we can't really fit into this list Harper stumbles, Duceppe shines during French debate: poll. - Ahunt (talk) 14:05, 14 April 2011 (UTC)[reply]

Polls schmolls

What is the purpose of this article's existence? Once the 2011 election is held, this article's info will be moot. GoodDay (talk) 15:18, 14 April 2011 (UTC)[reply]

Umm... No, not really. After the election is held, it will give readers the ability to track the changes in support for the parties over time, especially given specific news stories that changed public opinion (i.e. the Coalition Crisis, the Vancouver Olympics and the prorogation that it followed, etc.). Plenty of articles like this exist for other elections, and have not been deleted after the election has been finished (UK 2010, USA 2008). Additionally, we originally had the polls as part of the Canadian federal election, 2011 article, and it became too long and unruly that we felt the need to move it to its own page. I hope that clears things up on the necessity of this article. Bkissin (talk) 16:01, 14 April 2011 (UTC)[reply]
I would argue that this article's value actually increases after the election as it provides a complete polling history for researchers. I have personally used past election articles in the series to draw conclusions about the accuracy of the various polling companies versus the final election results. - Ahunt (talk) 16:34, 14 April 2011 (UTC)
agreed, though it needs someone to build a proper prose section that discusses the results over time and what has caused changes in support. Resolute 18:44, 14 April 2011 (UTC)[reply]
The reason that isn't done is because it is very hard to find reliable refs. You can't just look at the polls yourself and write your own analysis, as that would be WP:OR. - Ahunt (talk) 19:15, 14 April 2011 (UTC)[reply]
ORLY? [3], [4], [5], [6], [7], etc. WP:NOTSTATS applies. If we can't write an article to go with the list, then we shouldn't have the bare stats either. However, I think people far more motivated than I could do so. Resolute 19:44, 14 April 2011 (UTC)[reply]
Original Research would be conducting our own poll for publishing on wikipedia. However, WP:SYN may come into play if the description of the stats is poorly written. maclean (talk) 01:08, 15 April 2011 (UTC)[reply]
WP:NOTSTATS gives nationwide opinion polling in the US 2008 election as an example where a table of data is acceptable. I agree an intro would be nice, but not required. I just copied the intro from the exemplified article. galneweinhaw (talk) 05:16, 15 April 2011 (UTC)[reply]

House Effects

This can't go into the article itself because it's WP:OR, but I figured this table of pollster house effects relative to polling consensus over the set of polls listed on the 2006, 2008, and 2011 pages might help cut down on some of the "why is poll X so different from the rest" questions:

Polling firm                # polls      CPC     LIB     NDP     BLQ     GRN
EKOS                            145     -0.44   -0.20    0.02   -0.01    0.64
DecimaResearch                  141     -0.53   -0.01   -0.00   -0.04    0.58
NanosResearch                   130     -0.09    1.12   -0.10   -0.12   -0.81
IpsosReid                       113      0.69   -0.06   -0.41    0.01   -0.23
StrategicCounsel                 88     -0.16   -0.39   -0.13    0.27    0.41
AngusReid                        63      0.38   -0.76    0.91    0.17   -0.70
Environics                       26      0.35   -0.39    0.67   -0.31   -0.32
LegerMarketing                   21      0.46    0.01    0.03   -0.20   -0.30
Pollara                          13      0.60    0.47   -0.21   -0.33   -0.53
SESResearch                       9     -0.56    0.97    0.03    0.31   -0.75
COMPAS                            5      2.65   -1.15   -0.97   -0.08   -0.45
AbacusData                        5     -0.56   -1.25    1.48    0.08    0.25
Segma                             4      1.64   -1.25   -0.09   -0.10   -0.20
ForumResearch                     3     -0.32   -1.40    1.23   -0.26    0.75
UniMarketing                      2     -0.42   -0.53    1.40   -0.03   -0.42
Praxicus                          2      1.37   -1.40    0.46   -0.37   -0.06
Election2008                      1      0.87    0.16   -0.51    0.10   -0.62
Election2006                      1     -0.06    1.36   -0.52   -0.25   -0.54

Comparing COMPAS and Nanos, for example, we can see that COMPAS shows on average a 5% wider spread between CPC and LIB than Nanos. 96.49.183.90 (talk) 20:42, 17 April 2011 (UTC)[reply]
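
For anyone checking the arithmetic behind that comparison, here is a minimal sketch that uses only the four CPC and LIB values for COMPAS and Nanos copied from the table above. The dictionary layout and the function name `cpc_lib_spread` are illustrative choices, not part of the original calculation.

```python
# House effects (percentage points) copied from the table above.
house_effects = {
    "COMPAS": {"CPC": 2.65, "LIB": -1.15},
    "NanosResearch": {"CPC": -0.09, "LIB": 1.12},
}

def cpc_lib_spread(firm):
    """CPC house effect minus LIB house effect for one firm."""
    effects = house_effects[firm]
    return effects["CPC"] - effects["LIB"]

compas_spread = cpc_lib_spread("COMPAS")        # 2.65 - (-1.15) = 3.80
nanos_spread = cpc_lib_spread("NanosResearch")  # -0.09 - 1.12 = -1.21
spread_gap = compas_spread - nanos_spread       # about 5.01 points

print(f"COMPAS vs Nanos CPC-LIB spread gap: {spread_gap:.2f} points")
```

The gap of roughly 5 points is a difference between two house effects, so it describes how the firms differ from each other on average, not how far either is from an election result.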

Interesting calculations. What does this compare, the polling companies to the final result or to the mean for the week of the poll or what? - Ahunt (talk) 12:13, 18 April 2011 (UTC)[reply]
This is comparing polls to LOESS regression curves through all the polls. The last two lines show how the election results differ from the polling consensus -- there's no way to know if these differences are due to the polls being wrong or due to the elections being wrong (in the sense that if young voters don't turn up at the polls, the election is providing a biased sample of the population). 96.49.183.90 (talk) 19:51, 18 April 2011 (UTC)[reply]
Thanks for clarifying that! It makes sense. I guess these numbers are percentages, then, and not std deviations or some other measure? It also shows that COMPAS should be ignored as they are the most inaccurate or biased. - Ahunt (talk) 20:40, 18 April 2011 (UTC)[reply]
Yes, those numbers are percentages. A large house effect doesn't mean that a poll should be ignored -- just that you should adjust the results before relying on them. A pollster who consistently gives the Green party 10% extra support but has zero random error is far more useful than a pollster who has zero house effect but has 2% of normally distributed noise added to all the numbers. 96.49.183.90 (talk) 23:06, 18 April 2011 (UTC)[reply]
Makes sense. The one thing it doesn't account for is companies changing their methods. For instance in 2008 Nanos seemed to be the company that most underestimated the Conservatives (compared to the other companies and also the final result), whereas this time they seem to be overestimating them compared to the other polling companies. I suspect they are doing something different and this may average your numbers out over the three elections. - Ahunt (talk) 23:23, 18 April 2011 (UTC)[reply]
I just ran the numbers for each year separately. Out of the pollsters with >50 polls, four show constant house effects; EKOS shows a gradual increase from roughly zero up to roughly +1 CPC, -0.5 NDP, +1.5 GRN; and Nanos shows a shift some time in 2009 or 2010 from -0.5 CPC, +1.25 LIB, +0.25 NDP to +0.5 CPC, +0.75 LIB, -0.25 NDP. I'll have to go back and add regressions for those into my modelling now... 96.49.183.90 (talk) 04:07, 19 April 2011 (UTC)[reply]
That is really neat! I think you have statistically identified the Nanos change in technique there that I saw in comparing their 2008 data to their current numbers. I suspect that they noted that they were the least accurate versus the actual outcome in 2008 and made some adjustment, but, as your figures note, they seem to have over-compensated in whatever they did, at least compared to other polling companies. I am guessing what they did was a weighting adjustment for demographic representation. It will be interesting to see who gets closest to the final election result, since that is really the best test. In 2008, looking at the last week of polling, it was Angus Reid, and at the time they looked like an outlier. - Ahunt (talk) 11:29, 19 April 2011 (UTC)[reply]
I'm not convinced that they changed their methodology. Top-of-mind polls have a bias towards the parties people are most aware of -- hence the lower GRN vote -- and the longer a party is in power, the more aware people will be of it. 96.49.183.90 (talk) 14:40, 19 April 2011 (UTC)[reply]
I think Nanos had been using the unprompted method for all previous elections; but if you look at the last poll done by Nanos during the '08 campaign, they actually over-estimated the Greens by about 2 points. Then again, other polling companies had the Greens at around 10 percent. I guess what I'm trying to say is no methodology is perfect, and will usually fail in extreme cases (for example people who sympathize with the Greens but who vote for someone else come election day). 02:08, 20 April 2011 (UTC) - Preceding unsigned comment added by Tony Kao (talk • contribs)
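
For readers wondering what "comparing polls to LOESS regression curves" could look like in practice, below is a rough sketch, not the code actually used for the table above. It fits a locally weighted linear curve through all polls for one party and takes each firm's mean residual as its house effect. Everything in it is an assumption for illustration: the tricube kernel, the `frac=0.3` bandwidth, the function names, and above all the data, which is synthetic (three made-up firms with known offsets), not real polling.

```python
import random

def tricube(u):
    """Tricube kernel: weight 1 at u=0, falling to 0 at |u|>=1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0

def local_fit(x0, xs, ys, frac=0.3):
    """Locally weighted linear regression (LOESS-style) evaluated at x0."""
    k = max(2, round(frac * len(xs)))
    # Bandwidth: distance from x0 to its k-th nearest poll.
    h = sorted(abs(x - x0) for x in xs)[k - 1] or 1e-9
    ws = [tricube((x - x0) / h) for x in xs]
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)

def house_effects(polls, frac=0.3):
    """Mean residual (poll value minus consensus curve) per firm."""
    xs = [day for day, _, _ in polls]
    ys = [value for _, _, value in polls]
    residuals = {}
    for day, firm, value in polls:
        consensus = local_fit(day, xs, ys, frac)
        residuals.setdefault(firm, []).append(value - consensus)
    return {firm: sum(r) / len(r) for firm, r in residuals.items()}

# Synthetic data only: a slow upward trend plus a fixed per-firm offset
# and Gaussian noise. The firm names and offsets are hypothetical.
rng = random.Random(0)
true_offsets = {"FirmA": 2.0, "FirmB": -1.0, "FirmC": 0.0}
firms = list(true_offsets)
polls = []
for day in range(180):
    firm = firms[day % 3]
    value = 35 + 0.02 * day + true_offsets[firm] + rng.gauss(0, 1)
    polls.append((day, firm, value))

effects = house_effects(polls)
for firm, effect in sorted(effects.items()):
    print(f"{firm}: {effect:+.2f}")
```

Note that the curve is fitted through every firm's polls, so the estimates are relative to the polling consensus rather than to any true value. In this toy setup they come out centred on the average offset, which matches the "relative to polling consensus" wording used for the table. A shift in a firm's house effect over time, like the one discussed for Nanos, would need the residuals split by period instead of averaged over the whole span.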

Commented out graph[edit]

Reluctantly I have "commented out" User:galneweinhaw's graph, as it is now five days out of date for data. I do know he said earlier he wouldn't be available to update it at some point in the campaign. If it can be updated please do remove the comment tags in the gallery so it will display again. - Ahunt (talk) 12:40, 18 April 2011 (UTC)[reply]

Thanks Ahunt, I'm back at a computer now and will be able to update daily for the remainder of the campaign. galneweinhaw (talk) 06:46, 28 April 2011 (UTC)[reply]
It is great to have you and your graph both back! I think that graph adds some real value, especially as the campaign has become very interesting in the last week and half. - Ahunt (talk) 12:02, 28 April 2011 (UTC)[reply]

Where the NDP are taking votes from[edit]

My reading of the polls is that it's clear that the NDP are taking votes from the Liberals and Bloc, but we need more data to say if they're taking any from the Conservatives yet. My regression curves are showing a dip in Conservative from 39.11% to 38.32%, but that's within the range of variation from aggregated polling noise; the drop it's showing for the Liberals, from 29.44% down to 24.28%, and the drop for the Bloc, from 8.74% down to 6.95%, are definitely more than just noise. Based on this I'd like to revert the addition of the word "Conservative" into my sentence about post-debate polling shifts which Ahunt made (at least until there's enough polling data to be more confident about it). Thoughts? 96.49.183.90 (talk) 22:00, 23 April 2011 (UTC)[reply]

The sentence in the article currently says "Over the weeks following the debates, Conservative, Liberal and Bloc support slipped while NDP support increased;" It doesn't say that the NDP is picking votes from the Conservatives and the ref cited precisely backs up the quoted statement, saying "The Conservatives continue to hold on to a significant lead at 34.4 points, short of the last election and down from our last poll where they were 37.4." On that basis I don't see any reason to remove it as the article is factually accurate and the ref backs up the statement. I probably should add that while combining polls into new data and then commenting on that is interesting, it is WP:OR and WP:SYNTHESIS and can't be used in the article. As per above I was against adding narrative for exactly this reason, but if we are going to have it it has to be very carefully referenced to reliable refs. - Ahunt (talk) 22:11, 23 April 2011 (UTC)[reply]
I'm not saying that we should include WP:OR in the article... but I'm not aware of any policy against using WP:OR as a basis for excluding statements which don't seem to be true. Still, you know the policies far better than I do, so I'll defer to your judgment. How about adding the Ipsos poll as a reference to this sentence and rewording to say that Liberal and Bloc support slipped and some polls also show Conservative support slipping? 96.49.183.90 (talk) 22:42, 23 April 2011 (UTC)[reply]
I agree with you that OR is fine to use to know which information to exclude. I also think you have a great solution, quoting Ipsos as well and indicating that the polls disagree on where the Conservatives are going, because they do. Currently Ipsos is showing them increasing in support poll-over-poll, Nanos and EKOS show them declining, while Forum and Environics show them as flat. - Ahunt (talk) 23:52, 23 April 2011 (UTC)[reply]
I'm not sure that EKOS or Nanos really shows the Conservatives declining. EKOS has polled at 34, 35, 37, and 34. Everything there seems to be in the margin of error, and if you look at the 37 as an outlier, as noise, they held steady at 34/35 from 12 April to 20 April. Nanos doesn't seem to be showing a decline either. Aside from the 23rd, they've polled at 39 to 40. A drop to 38 isn't exactly a huge skid. Like EKOS, the result could well be noise. For the time being, it would be best to say that as of 23 April the Conservatives appear to be holding steady while the Libs and Bloc are declining and the NDP increasing. -Rrius (talk) 00:13, 24 April 2011 (UTC)[reply]
I have quoted the EKOS ref that says they are declining; we'll need a ref that says they are holding steady. I am not picking on any one party here, but we need refs that say what we are writing. If we want to sum up what all the polls are saying together, then we need a ref that analyses them or at least indicate disagreement of refs. - Ahunt (talk) 00:24, 24 April 2011 (UTC)[reply]
I still have some serious doubts about this intro text that has been added, much of it is unreferenced and I have tagged it as such. - Ahunt (talk) 11:20, 24 April 2011 (UTC)[reply]
Just look at the data, there are literally hundreds of references on this page. 117Avenue (talk) 20:55, 24 April 2011 (UTC)[reply]
That is WP:SYNTHESIS, which says "Do not combine material from multiple sources to reach or imply a conclusion not explicitly stated by any of the sources. If one reliable source says A, and another reliable source says B, do not join A and B together to imply a conclusion C that is not mentioned by either of the sources. This would be a synthesis of published material to advance a new position, which is original research." - Ahunt (talk) 21:06, 24 April 2011 (UTC)[reply]
Doesn't WP:CALC apply? We are allowed to make graphs, and you can see that "Throughout the life of the 40th Parliament, public opinion support for the Conservative party fluctuated from the low 30s percentage to the low 40s" isn't an incorrect statement. 117Avenue (talk) 22:03, 24 April 2011 (UTC)[reply]
CALC allows routine calculations, but it doesn't justify what is being done here, taking stats from one poll, comparing them to stats from another poll, drawing conclusions and writing text from them. That is WP:OR and I think unless proper refs, such as media or polling articles describing the trends can be found, then the words should be removed as unsourced OR. - Ahunt (talk) 22:12, 24 April 2011 (UTC)[reply]
Are we going to be able to find a news source taking into account all the polls? I'm thinking the only stories out there, are going to be looking at polls from one company. 117Avenue (talk) 00:56, 25 April 2011 (UTC)[reply]
The EKOS source is not good enough because they're only saying the Conservatives are down from the last EKOS poll. Trying to make the point that Conservative support has fallen generally on the back of a change from one poll to another is fallacious. If you look at each of the polling companies separately, it is hard to make the point that there has been a decrease in Conservative support since the debates. -Rrius (talk) 01:09, 25 April 2011 (UTC)[reply]

I agree on both counts - this all adds up to the conclusion that this text should be removed as unsourced OR. - 12:53, 25 April 2011 (UTC)

Nanos daily polling[edit]

Nanos seem to have ceased their daily polling over the Easter weekend. I will keep checking it daily however to see when they pick it up again. - Ahunt (talk) 11:20, 24 April 2011 (UTC)[reply]

Looks like the Nanos poll from the 23rd just turned up on their website, but none for the 22nd. - Ahunt (talk) 12:16, 24 April 2011 (UTC)[reply]
Their Twitter feed says that they would not call people on Good Friday... it would have been in poor taste to do otherwise in my opinion. EricLeb01 (Page | Talk) 16:00, 24 April 2011 (UTC)[reply]
Makes sense to me. I guess we will find out tomorrow whether the same applies to Easter Sunday! - Ahunt (talk) 17:40, 24 April 2011 (UTC)[reply]
The Nanos website indicates that they did not do polling on Easter Monday either and so the next report will not be until Wednesday. - Ahunt (talk) 12:47, 26 April 2011 (UTC)[reply]

Value and accuracy of polling[edit]

Editors working on this article might find this item of interest. It does show the limits and inaccuracies in the election polling this time around. - Ahunt (talk) 16:36, 25 April 2011 (UTC)[reply]

A thought[edit]

What would be good to keep everything neutral, might be (rather than provide interpretations of polls) to supply the methodology each polling company is actually using to get the results they are. This would allow the reader to come to a conclusion in a neutral way. Just a thought. Ottawa4ever (talk) 09:15, 26 April 2011 (UTC)[reply]

It is a good idea, but most companies only provide general information on that. For instance Nanos says: "Methodology - A national random telephone survey is conducted nightly by Nanos Research throughout the campaign. Each evening a new group of 400 eligible voters are interviewed. The daily tracking figures are based on a three-day rolling sample comprised of 1,200 interviews. To update the tracking a new day of interviewing is added and the oldest day dropped. The margin of error for a survey of 1,200 respondents is ±2.8%, 19 times out of 20. The respondent sample is stratified geographically and by gender. The data may be weighted by age according to data from the 2006 Canadian Census administered by Statistics Canada. Percentages reported may not add up to 100 due to rounding. The research has been registered with the Marketing Research and Intelligence Association of which Nanos is a member." That sort of information would really be only useful if it included whatever adjustment calculations they made to try to correct the representativeness of the sample, the time of day of the polling, exact question order, more details on survey method (eg land line only) but these seem to be trade secrets. - Ahunt (talk) 10:32, 26 April 2011 (UTC)[reply]
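
As a quick check on the figures in that quoted methodology, here is a minimal sketch of the textbook margin-of-error formula. It assumes simple random sampling, the worst-case proportion p = 0.5, and z = 1.96 for "19 times out of 20" (95% confidence). Nanos does not say which formula they use, so this is only the standard approximation, and the function name `margin_of_error` is an illustrative choice.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from a simple random sample."""
    return z * math.sqrt(p * (1 - p) / n)

daily_sample = 400   # new interviews each evening, per the quoted methodology
window_days = 3      # three-day rolling sample
rolling_n = daily_sample * window_days  # 1,200 interviews

moe_rolling = margin_of_error(rolling_n)
moe_daily = margin_of_error(daily_sample)

print(f"n={rolling_n}: +/-{100 * moe_rolling:.1f} points")
print(f"n={daily_sample}: +/-{100 * moe_daily:.1f} points")
```

Under these assumptions the 1,200-interview window gives about ±2.8 points, matching the quoted figure, while a single night of 400 interviews would give about ±4.9. Consecutive daily releases also share two of their three days of interviews, so day-to-day changes in the tracking numbers are not independent readings.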

Seat projections[edit]

Some pollsters like EKOS and Forum Research have started putting out Seat Projections along with their regular numbers. Should these be included as well in the article? I don't see why not as the Opinion_polling_for_the_next_United_Kingdom_general_election article does it. Sima Yi (talk) 22:19, 27 April 2011 (UTC)[reply]

Seat projections are hokum. They really don't have any scientific or theoretically sound backing. I'd suggest leaving them out.

coming closer[edit]

The opinion polls do not differ as much as they did earlier. I am getting the feeling that the pollsters understated and overstated estimates on purpose earlier, and then fix their act as the election comes closer. Is this view accurate or is it wrong? (LAz17 (talk) 07:02, 29 April 2011 (UTC)).[reply]

The polls do seem to be converging this time around, but there is no evidence that any techniques are being changed; it could just be that the people being polled are less unsure in their responses. See also the article above under Value and accuracy of polling. - Ahunt (talk) 11:32, 29 April 2011 (UTC)[reply]

New data. [8] - but is it 29th or 30th of the month? (LAz17 (talk) 05:48, 30 April 2011 (UTC)).[reply]

COMPAS[edit]

Are they a credible pollster? The disparity between their last poll and every other poll is surprisingly large. Educatedseacucumber (talk) 07:33, 1 May 2011 (UTC)[reply]

Seems to be the case for every one of their polls. I just ignore them. Sysys (talk) 09:45, 1 May 2011 (UTC)[reply]

This came up above. Eric Grenier, the journalist, says they appear to exclude "leaning" votes which most pollsters include. --JGGardiner (talk) 10:40, 1 May 2011 (UTC)[reply]

So they only include decided voters? The undecided and the leaning can't just be ignored. EricLeb01 (Page | Talk) 14:05, 1 May 2011 (UTC)[reply]
These polls are done by Dr Conrad Winn, to put it mildly he has a particularly right wing POV/agenda and he often quotes himself in the reports. In the last poll he even accounted for events influencing the polling that happened after the poll. For some background on him see this. - Ahunt (talk) 14:18, 1 May 2011 (UTC)[reply]
It is not for us to judge methodology. What matters more is that the poll results are being reported by the media. By the logic of removing because Dr Winn might have a right-wing agenda, it would only be fair to remove the EKOS polls as well -- Frank Graves is a Liberal. This recent poll giving the Tories 46% is almost certainly an outlier, but for example, the EKOS poll from April 12 gives the Tories 33.8% when everyone else gave them around 39-40%. The COMPAS poll from the day before gave the Tories 45% -- so both polls are equal outliers there. See April 24 EKOS poll, another big outlier. It doesn't make sense to only get rid of the COMPAS poll. Maxim(talk) 21:59, 1 May 2011 (UTC)[reply]
To respond to the edit summary: fewer g-hits is not surprising if only two polls were done during the campaign. However, why then is something like Abacus data in the list? They have 12 g-hits, but not all of them seem to be about the election. In the news, I get the first one about polygamy, and the third about aquaculture... Maxim(talk) 22:03, 1 May 2011 (UTC)[reply]
I agree they should stay - we have a consensus above at #COMPAS_poll to include them, removing them requires a new consensus. Personally while I think their results are silly and will probably be proven very wrong tomorrow, it is worth including them - their data speaks for itself. - Ahunt (talk) 22:05, 1 May 2011 (UTC)[reply]
EKOS includes "Others", so their total is lower. Besides, being five points off is within the MOE; being 9-13 points off isn't.
9-13 points "off" what? Note that the EKOS and COMPAS final polls were both 6% off the final Conservative vote. Goes to show the importance of including all professionally conducted polls and methodologies, because the only time any judgement call can be made is after the election. 77.70.191.198 (talk) 05:48, 6 May 2011 (UTC)[reply]
Well, I'm not adamantly against it staying. Since the two of you favor it staying I'll concede. Educatedseacucumber (talk) 22:07, 1 May 2011 (UTC)[reply]
And if it is discussed in reliable sources, you note the reasons for the difference in prose. The Compas polls are being used in RS publications, so in my view, removing them would serve only to push our own POV. Resolute 22:11, 1 May 2011 (UTC)[reply]