Wikipedia:Reference desk/Archives/Mathematics/2015 February 28

= February 28 =

Median
When a data set has an even count of numbers, so that two middle values together determine the median (e.g. 3, 6, 9, 10, 12, 11), what do you do with those two numbers to find the median? Ohyeahstormtroopers6 (talk) 01:03, 28 February 2015 (UTC)


 * As illustrated in the lead of the article Median, for an even number of data points in the data set, the median is the mean of the centre-most pair of data points. —Quondum 03:03, 28 February 2015 (UTC)
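For the example above, a short Python sketch of the even-count rule, using the standard library's statistics module as a cross-check:

```python
import statistics

data = [3, 6, 9, 10, 12, 11]
ordered = sorted(data)          # [3, 6, 9, 10, 11, 12]
mid = len(ordered) // 2
# Even count: average the two centre-most values, (9 + 10) / 2.
manual_median = (ordered[mid - 1] + ordered[mid]) / 2
print(manual_median)                 # 9.5
print(statistics.median(data))       # 9.5, same result from the library
```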

Thank You. 2602:306:C541:CC60:6866:CFB1:2D1B:5526 (talk) 05:38, 28 February 2015 (UTC)

Identify partial differential equation
These are really two and a half questions.

I stumbled upon a partial differential equation: $$ u = u(x,y), \quad v = v(x,y), \quad u_x v_y - u_y v_x = 1. $$

Question one: How can I google or otherwise search the net for this kind of equation?

Question two: does anyone happen to recognize this as a famous and named equation I can search by name?

Question 2 1/2: u(x,y), v(x,y) rings a (probably wrong) bell, making me think of the Cauchy–Riemann equations. I guess this would be a special case of the equation, giving me $$ u_x^2 + u_y^2 = 1$$. This looks somehow familiar, but I have neglected math for far too long to see what that would be.

93.132.5.112 (talk) 19:17, 28 February 2015 (UTC)


 * I don't know the name of the equation, but in classical mechanics, the left-hand side is the Poisson bracket of u and v if x and y are canonical coordinates. --Mark viking (talk) 22:28, 28 February 2015 (UTC)


 * I have to admit that, on other occasions, I have made several utterly unsuccessful attempts to get a grasp of Poisson brackets. I guess there is some tiny ugly heuristic clue that I'm too dumb to pick up myself and everyone else is too tired to mention. 93.132.5.112 (talk) 23:28, 28 February 2015 (UTC)


 * I'd call this the "Jacobian determinant equation", or something like that, and Google finds several useful results for this search term. It is one equation for two unknown functions u,v, so it is underdetermined.
 * If u+iv additionally solves the Cauchy–Riemann equations, the only solutions will be affine functions (u+iv)(z) = az + b, where a is a complex number of modulus 1.
 * The equation $$ u_x^2 + u_y^2 = 1$$ is called the eikonal equation. —Kusma (t·c) 12:19, 2 March 2015 (UTC)
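The equation just says that the map (x,y) ↦ (u,v) has Jacobian determinant 1 everywhere, i.e. it preserves area. This is easy to check numerically for a candidate solution; the example pair below (a shear map, u = x + sin y, v = y) is invented for illustration, and the partial derivatives are approximated by central finite differences:

```python
import math

# Invented example solution (a shear map): u = x + sin(y), v = y.
# Analytically u_x v_y - u_y v_x = 1*1 - cos(y)*0 = 1 everywhere.
def u(x, y): return x + math.sin(y)
def v(x, y): return y

def jacobian_det(f, g, x, y, h=1e-6):
    # Central finite differences for the four partial derivatives.
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    gx = (g(x + h, y) - g(x - h, y)) / (2 * h)
    gy = (g(x, y + h) - g(x, y - h)) / (2 * h)
    return fx * gy - fy * gx

for (x, y) in [(0.0, 0.0), (1.5, -2.0), (3.0, 0.7)]:
    print(round(jacobian_det(u, v, x, y), 6))   # ~1.0 at every point
```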

Regression with a graph (machine learning)
Suppose I have a graph of nodes and connections between nodes. The independent variables are this graph and other numerical values assigned to each node. The dependent variable is a point value for the node. How could I use regression to predict the point value of new nodes, based on their connections in the network? Is regression even the right tool to use here? 70.190.182.236 (talk) 20:55, 28 February 2015 (UTC)


 * Just trying to understand the problem here. So the nodes on the graph each have a value, and that value is in some way based on the other numerical values assigned to that node and nearby nodes, right? If the relevant "graph distance" is simply the smallest number of steps needed to reach each node, then you might use that in a second regression analysis, after first doing a regression analysis without considering nearby nodes.


 * For example, let's say each node's dependent value is 90% based on the (single, in this example) independent value of that node, and 10% based on the node(s) one step away. I think it might be better to look at as few variables at a time as possible. I think you would be more likely to find convergence that way. StuRat (talk) 21:19, 28 February 2015 (UTC)


 * Yes, that's a good statement of the problem. 70.190.182.236 (talk) 21:28, 28 February 2015 (UTC)
 * This does indeed sound like a regression problem. However, "regression" is the name of a class of problems, not a specific tool, so you can't just "use regression"; you need to choose which regression technique to use.
 * Usually, either the problem is simple enough that a standard cookie-cutter technique can handle it, or you have a huge search space and you need human intelligence and domain knowledge to constrain it. Your question seems to belong to the second category - so without a description that is less abstract, I don't think we can really help with a solution that will actually give good results for the problem you have at hand. -- Meni Rosenfeld (talk) 10:47, 1 March 2015 (UTC)

This link has a section on network regression models: http://faculty.ucr.edu/~hanneman/nettext/C18_Statistics.html. There's also an R package for this: http://svitsrv25.epfl.ch/R-doc/library/sna/html/netlm.html — Preceding unsigned comment added by OldTimeNESter (talk • contribs) 20:30, 4 March 2015 (UTC)
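A minimal sketch of the two-feature idea discussed above (all graph data, features, and weights here are invented for illustration): give each node two regressors, its own feature and the mean feature of its neighbours, and fit ordinary least squares on those. The toy target is built noise-free from known weights, so the fit recovers them exactly:

```python
# Toy graph as an adjacency list, with one numeric feature per node.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
feature = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0}

def neigh_mean(n):
    """Mean feature value over a node's neighbours."""
    return sum(feature[m] for m in adj[n]) / len(adj[n])

# Noise-free toy target: 90% own feature, 10% neighbour mean,
# so ordinary least squares should recover the weights exactly.
y = [0.9 * feature[n] + 0.1 * neigh_mean(n) for n in adj]
a = [feature[n] for n in adj]        # own-feature column
b = [neigh_mean(n) for n in adj]     # neighbour-mean column

# Solve the 2x2 normal equations (X'X) beta = X'y by Cramer's rule.
saa = sum(x * x for x in a)
sbb = sum(x * x for x in b)
sab = sum(p * q for p, q in zip(a, b))
say = sum(p * q for p, q in zip(a, y))
sby = sum(p * q for p, q in zip(b, y))
det = saa * sbb - sab * sab
w_own = (say * sbb - sby * sab) / det
w_nbr = (saa * sby - sab * say) / det
print(round(w_own, 6), round(w_nbr, 6))   # 0.9 0.1

# Predicting a new node is then just plugging in its two features:
# e.g. a hypothetical new node attached only to node 3, with feature 5.0.
new_pred = w_own * 5.0 + w_nbr * feature[3]
print(round(new_pred, 6))                 # 4.9
```

In practice one would add an intercept, more node features, and possibly features from nodes two or more hops away, as StuRat suggests; the two-column case just keeps the algebra visible.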

Statistical tests for normality
If the Kolmogorov-Smirnov and Shapiro-Wilk give significantly different results, how does one resolve this ambiguity? All the best: Rich Farmbrough, 21:50, 28 February 2015 (UTC).


 * It is not really an ambiguity. As the articles you pointed to state, these are different tests, with Shapiro-Wilk giving greater power for a test of normality and Kolmogorov-Smirnov being a nonparametric test that could be applicable to many different kinds of distributions. So it is entirely possible that the tests give different results. Without more details, it is hard to say anything more specific. --Mark viking (talk) 22:14, 28 February 2015 (UTC)
 * My "go to" first test is Jarque-Bera, because it's so intuitive - read the instructions carefully, as you need to be aware of the sample size - but it's certainly not the best. It's usually a good place to start, as you just drop the data into Excel and look at the skewness and kurtosis immediately, and those values on their own are a useful place to start thinking about the data. If you're using K/S you may have accidentally re-invented the Lilliefors test. My personal preference would be S/W over K/S, but see our articles on the weaknesses in both tests. Where I have found K/S very useful is in fitting general stable distributions: though it takes some care and attention, you might use this approach to check the stability parameter. RomanSpa (talk) 23:40, 28 February 2015 (UTC)
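The Jarque-Bera suggestion is easy to sketch in plain Python (the sample data below are invented): the statistic combines sample skewness S and kurtosis K as JB = n/6 · (S² + (K − 3)²/4), and under normality it is asymptotically chi-squared with 2 degrees of freedom, so the 5% critical value is about 5.99:

```python
import random

def jarque_bera(xs):
    """Jarque-Bera statistic: JB = n/6 * (S^2 + (K - 3)^2 / 4)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skew = m3 / m2 ** 1.5          # sample skewness S
    kurt = m4 / m2 ** 2            # sample kurtosis K
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)

random.seed(1)
# Exponential data: skewness ~2, kurtosis ~9, so JB is huge and
# normality is rejected decisively at the ~5.99 critical value.
skewed = [random.expovariate(1.0) for _ in range(2000)]
print(jarque_bera(skewed) > 5.99)   # True

normal_ish = [random.gauss(0.0, 1.0) for _ in range(2000)]
print(jarque_bera(normal_ish))      # typically small for normal-looking data
```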
 * Thanks, both, for the replies. I certainly draw some warm fuzzies from the replies, and will look into these tests in a little more depth.  All the best: Rich Farmbrough, 23:45, 28 February 2015 (UTC).


 * (My last edit got lost in an edit conflict, so I'll try to repeat it...) Re: fitting stable distributions: it's not a graceful process! Re: other tests: I had a colleague who swore by the Epps-Pulley test, which we don't have an article on; the raw reference is "Epps, T. W., and Pulley, L. B. (1983). A test for normality based on the empirical characteristic function. Biometrika 70, 723–726". I haven't used E/P myself for a long time, but I do remember it was a pain in the ass to code up! E/P leads naturally to the BHEP test, which I don't really know, but have heard mildly positive comment on. I suspect it's also a nuisance to code up, alas! RomanSpa (talk) 23:56, 28 February 2015 (UTC)
 * There's obviously one other point, which we should really have put first: think about the data in practical terms, and ask yourself whether the underlying experiment is one that's likely to produce a normal distribution. Remember that what you're doing is modelling reality. There is always a model, so think about what the reality is that you're modelling: is a normal distribution a plausible outcome from the imagined/theoretical mechanism of the experiment? RomanSpa (talk) 00:04, 1 March 2015 (UTC)