Talk:Scatter plot/Archives/2013

Untitled
"If no dependent variable exists, either type of variable can be plotted on either axis and a scatter plot will illustrate only the degree of correlation (not causation) between two variables." - This seems to suggest that measurements in which the experimenter can control one of the variables do illustrate causation, whereas scatterplots do not. To me, the latter is not obvious. The fact that an experimenter has no control over a variable does not preclude a causal relationship between the two variables, does it? In the example in the article, there might be a causal relationship between lung capacity and breath-holding time, e.g. because people with bigger lungs can store more oxygen in one breath and thus can hold it longer before they run out of oxygen. In that case the scatterplot would indicate a causal relationship, in my opinion. —Preceding unsigned comment added by 193.67.21.67 (talk) 14:18, 11 November 2008 (UTC)

I have seen it cited both ways in statistical literature, but scatterplot seems to appear more frequently:

"The scatterplot is one of our most powerful tools for data analysis." -- Cleveland, W. S., and McGill, R. (1984), "Many Faces of a Scatterplot," Journal of the American Statistical Association, [79], 388, 807-822.

"To identify potentially significant changes in expression, we used a scatter plot of the observed relative difference d(i) vs. the expected relative difference dE(i)." -- Tusher, V. G., Tibshirani, R., and Chu, G. (2001), "Significance analysis of microarrays applied to the ionizing radiation response," Proceedings of the National Academy of Sciences, 98, 5116–5121.

What is the line across the graph for? This is a line of best fit. I have edited the article to show this. - Micropw

Have done further work, adding an explanation of the correlation coefficient from another article. Feedback welcome. Micropw 15:43, 15 October 2006 (UTC)

"An equation for the line of best fit can be computed using the method of linear regression." The regression line is not in general the same if y is regressed on x as when x is regressed on y (see Fig. 1.1 on page 18 of the 3rd ed. of Draper & Smith, Applied Regression Analysis). By contrast, there is only one correlation between two variables. Should one not use the line of the eigenvector in a scatterplot reporting a correlation? —The preceding unsigned comment was added by 134.60.108.123 (talk • contribs).
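To make the point above concrete, here is a minimal sketch (not taken from the article or from Draper & Smith) that computes the three candidate slopes for the same data: y regressed on x, x regressed on y, and the line along the leading eigenvector of the sample covariance matrix. The function name `candidate_slopes` and the toy data are made up for illustration, and the sketch assumes the covariance is nonzero and the leading eigenvector is not vertical.

```python
import numpy as np

def candidate_slopes(x, y):
    """Return (slope of y on x, slope of x on y drawn in the x-y plane,
    slope of the leading-eigenvector line, correlation r).

    Hypothetical helper for illustration only. Assumes cov(x, y) != 0
    and that the leading eigenvector has a nonzero x component.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    cov = np.cov(x, y)                      # 2x2 sample covariance matrix
    sxx, syy, sxy = cov[0, 0], cov[1, 1], cov[0, 1]

    # Least squares of y on x: minimizes vertical distances.
    slope_y_on_x = sxy / sxx

    # Least squares of x on y has slope sxy / syy in x = a + b*y;
    # drawn on the same axes (dy/dx) it becomes syy / sxy.
    slope_x_on_y = syy / sxy

    # Line along the eigenvector with the largest eigenvalue:
    # minimizes perpendicular distances and treats x and y symmetrically.
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: symmetric input
    v = eigvecs[:, np.argmax(eigvals)]
    slope_eigen = v[1] / v[0]

    r = sxy / np.sqrt(sxx * syy)            # Pearson correlation
    return slope_y_on_x, slope_x_on_y, slope_eigen, r

# Toy data, chosen only so the arithmetic is easy to check by hand.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
b_yx, b_xy, b_eig, r = candidate_slopes(x, y)
print(b_yx, b_xy, b_eig, r)  # approximately 0.8, 1.25, 1.0, 0.8
```

On this toy data the two regression lines differ (slopes of about 0.8 and 1.25), the eigenvector line falls between them, and the ratio of the two regression slopes equals r squared, which is one way to see how a single correlation value relates to both regressions.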

I think the discussion of the line of "best fit" is out of scope for an article on scatterplots. When I produced the image in question (Old Faithful data) I put in a lowess smoother to show the relationship between eruption times; however, I think the image without the line would be better suited for this article. Nwstephens 23:01, 19 August 2007 (UTC)nwstephens