Wikipedia:Reference desk/Archives/Mathematics/2010 March 6

= March 6 =

is number theory worth studying if one doesn't plan to make a contribution?
I understand that number theory is important for things like encryption (indispensable for fast international banking), and so on. Likewise computer science is important for advances in every industry and for society in general.

But for computer science, there is a reason for ME to learn it, even if I never plan to contribute anything to the field of computer science; namely, I can program for my own benefit.

My question is: for ME, is there a similar, analogous reason to learn number theory, if I never plan to make a contribution to the science; something I will be able to use? Thank you. 82.113.121.94 (talk) 00:03, 6 March 2010 (UTC)


 * Is it worth taking piano lessons if you're never going to be a professional performer or composer? 66.127.52.47 (talk) 01:26, 6 March 2010 (UTC)


 * It's fun. That's the main reason for doing any pure maths! --Tango (talk) 02:31, 6 March 2010 (UTC)


 * If I were to ask an arbitrary person the reason he/she does what he/she is doing, I think the most probable answer would be "money" (though there are definitely many exceptions). If I were to ask a mathematician the reason he/she is doing mathematics, the answer would be totally different. There are jobs in existence that yield a far higher salary than doing mathematics, and at the same time, require little, if any, thinking.
 * How do you expect us to know how number theory will benefit you? When I first started doing mathematics, I did it because I liked it; simply because I found the ideas interesting, and wanted to have fun (on that note, I think the main fun of doing pure mathematics is "getting ideas"; especially the excitement after having thought and thought and eventually "proved your intended result").
 * Go ahead and learn number theory! See if you like it. If you do not, then you do not have to do it. If you do, then that is great, and you can do it for enjoyment (which is for YOU). The small risk that you will not like it is worth it because if you do like it, I am certain that you will not only LIKE it but ENJOY it. Besides, what do you have to lose? PS  T  04:59, 6 March 2010 (UTC)


 * Not many people contribute to number theory; the easiest way for most would be via Paypal :) It all depends on what type of computing you're hoping to do, but I'd advise studying what they're teaching if you're at university so you get good grades and a nice job. I think the main help something like number theory would give is a general background in solving digital problems with a mathematical bent. Dmcq (talk) 15:30, 6 March 2010 (UTC)


 * You know, I studied a fair amount of computer science in college and it's not that useful to me as a programmer. I wish I'd studied more logic and statistics.  Logic doesn't get much respect in applied areas yet, but it's becoming quite relevant to programming practice as the more interesting new programming languages are being explicitly designed around the Curry-Howard correspondence.  And statistics knowledge is very useful for making sense of messy real-world problems. 66.127.52.47 (talk) 15:55, 6 March 2010 (UTC)

Quotient constructions
Why do quotient groups and quotient categories (both have an algebraic nature) differ from quotient spaces (which are not algebraic)? For a quotient space, we can use an arbitrary partition, but for groups and categories we need the partition to be the cosets of a normal subgroup or a congruence on the morphisms of a category.

I think that for any partition G/R of a group G, if there exists a group structure on G/R such that the canonical projection p of each element to its class is a homomorphism, and that the universal property (for any homomorphism h:G->K there's a unique homomorphism j:G/R->K that makes h,j,p commute) is satisfied, then the partition is the cosets of a normal subgroup. It would be really neat if this is so; that'll answer why we can't use arbitrary partitions for groups and categories and other algebraic structures. Money is tight (talk) 00:42, 6 March 2010 (UTC)
 * It is fairly obvious that this is the case. You don't even need the universal property. The identity of G/R must be a normal subgroup N of G (since the kernel of a homomorphism is always a normal subgroup). Suppose g in G is in some other part A of the partition. Then for every n in N, p(gn)=p(g)p(n)=p(g), so gn is in A. Conversely, if h is in A, then $$p(g^{-1}h)=p(g)^{-1}p(h)=\mathrm{id}$$, so $$g^{-1}h$$ is in N. Thus A=gN as desired. Algebraist 00:55, 6 March 2010 (UTC)

The short and (somewhat) informal answer to your first question is this: An arbitrary subset of a topological space can be equipped with a topology such that the corresponding inclusion map is continuous (i.e. a morphism in the concrete category of topological spaces), whereas an arbitrary subset of a group cannot necessarily be equipped with a group structure such that the corresponding inclusion map is a homomorphism (i.e. a morphism in the concrete category of groups). PS T  05:26, 6 March 2010 (UTC)


 * To Algebraist: Wow.. I didn't know it was that simple. Thanks. To PST: I was talking about quotient objects, but nevertheless you're right that sub-objects also exhibit the same phenomenon; an arbitrary subset may not have a structure such that inclusion is a homomorphism. It's weird how there are two types of structures: algebraic (includes partial orders/lattices, but I think they're called relational structures), and the special subsets: topological and measure spaces. Money is tight (talk) 10:41, 6 March 2010 (UTC)

Inner Product Spaces
Does ZF prove that if an inner product space is such that all absolutely convergent series converge, then the space must be complete? JumpDiscont (talk) 01:22, 6 March 2010 (UTC)


 * Yes, even for a normed space E, convergence of all absolutely convergent series (i.e. with $$\sum_k \|x_k\| < \infty$$) is easily seen to be equivalent to completeness of E. A useful variation, apparently weaker but still equivalent to completeness, is: "convergence of all series with $$\|x_k\| \le 2^{-k}$$". The latter property passes immediately to the quotient by any closed subspace V, showing that E/V is complete if E is. --pm a  08:47, 6 March 2010 (UTC)


 * I think he means if the axiom of choice is needed in your proof. Money is tight (talk) 10:56, 6 March 2010 (UTC)
 * No, AC is not needed. It's a plain consequence of these two facts: (1) if a Cauchy sequence in a metric space has a convergent subsequence, then it is convergent to the same limit. (2) Any Cauchy sequence $$(x_n)$$ in a metric space has a subsequence such that $$d(x_{n_{k+1}}, x_{n_k}) \le 2^{-k}$$. In (2) you just define by induction the subsequence of indices $$(n_k)$$.--pm a 18:24, 6 March 2010 (UTC)
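
For readers who want the steps spelled out, here is a sketch of how facts (1) and (2) combine in a normed space where every absolutely convergent series converges. The symbol $$s_K$$ is introduced here only for illustration. Given a Cauchy sequence $$(x_n)$$, let $$n_k$$ be the least index greater than $$n_{k-1}$$ such that $$\|x_i - x_j\| \le 2^{-k}$$ for all $$i, j \ge n_k$$; taking the least index each time is what avoids any appeal to choice. Then

$$ \sum_{k \ge 1} \|x_{n_{k+1}} - x_{n_k}\| \le \sum_{k \ge 1} 2^{-k} = 1 < \infty , $$

so the series of differences is absolutely convergent and, by hypothesis, convergent. Its partial sums telescope:

$$ s_K = \sum_{k=1}^{K} \left( x_{n_{k+1}} - x_{n_k} \right) = x_{n_{K+1}} - x_{n_1} . $$

Hence the subsequence $$(x_{n_k})$$ converges, and by fact (1) the whole sequence $$(x_n)$$ converges to the same limit.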

120 deg?
I need to construct angles of 120 deg on a circle without using a protractor. Is there a way?--79.68.242.68 (talk) 01:35, 6 March 2010 (UTC)
 * Yes. Assuming you have a pair of compasses, anyway. Set the pair of compasses to the radius of the circle, mark a point anywhere on the circle, then put the point of the compass on that mark and mark a point one radius away on the circle. Move the point of the compass onto that new mark, and mark another point a radius away. The outer two points will be 120 degrees apart. Did that make any sense? If not, I'll draw a diagram... --Tango (talk) 02:35, 6 March 2010 (UTC)


 * Yes I tried it and it Works! Not sure why but Thanks!--79.76.146.18 (talk) 21:06, 6 March 2010 (UTC)
 * What you're doing is basically partially constructing a hexagon, which is composed of six equilateral triangles with angles all equal to 60°. Circéus (talk) 20:32, 7 March 2010 (UTC)
 * Indeed. The key thing is that you get an equilateral triangle when you have a chord with a length equal to the radius of the circle. --Tango (talk) 23:07, 8 March 2010 (UTC)

Just seen it--79.76.188.14 (talk) 18:08, 11 March 2010 (UTC)
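
The construction above can be checked numerically. The following is a minimal sketch, not part of the original discussion; the function name `step_on_circle` is made up for illustration. It uses the fact that a chord of length c on a circle of radius r subtends an angle of 2·asin(c/(2r)) at the centre, and steps off two chords whose length equals the radius.

```python
import math

def step_on_circle(angle, radius=1.0):
    """Return the polar angle reached after stepping one radius-length chord along the circle."""
    chord = radius  # the compasses are set to the radius of the circle
    # A chord of length c subtends 2*asin(c / (2r)) at the centre.
    return angle + 2 * math.asin(chord / (2 * radius))

start = 0.0                      # first mark, anywhere on the circle
first = step_on_circle(start)    # second mark, one radius away
second = step_on_circle(first)   # third mark, one more radius away

print(math.degrees(first - start))   # central angle covered by one step
print(math.degrees(second - start))  # central angle between the outer two marks
```

Each step covers 60 degrees (up to floating-point rounding) because the two radii and the chord form an equilateral triangle, so two steps give the 120 degrees asked for.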

sin⁻¹
On my calculator it says sin⁻¹ on a key. Is this the same as arcsin (and cos⁻¹ as arccos, etc.)? 68.248.236.144 (talk) 02:47, 6 March 2010 (UTC)


 * Yes. --Tango (talk) 02:52, 6 March 2010 (UTC)


 * The Inverse trigonometric functions article (arcsin's redirection target) addresses this notation in its lead, pointing out that it may result in confusion between multiplicative inverse and compositional inverse. On a calculator, sin⁻¹ always means arcsin.  In trigonometric identities, such as sin² x + cos² x = 1, sin² x always means (sin x)².  To avoid confusion, most people will not write sin⁻¹ x for (sin x)⁻¹, although they will write sinⁿ x for (sin x)ⁿ, even where -1 is a possible value for n.  What I wonder is, if the compositional use is understood from the context, is sin² x ever used for sin(sin x)? 124.157.249.168 (talk) 23:33, 7 March 2010 (UTC)
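
The difference between the two readings is easy to see numerically. This is only an illustrative sketch using Python's standard `math` module; the variable names are chosen here and the input value 0.5 is arbitrary.

```python
import math

x = 0.5

arcsin_x = math.asin(x)              # compositional inverse: the angle whose sine is x
reciprocal = 1 / math.sin(x)         # multiplicative inverse (sin x)^-1, i.e. cosecant of x

sin_squared = math.sin(x) ** 2       # the conventional meaning of sin^2 x
sin_of_sin = math.sin(math.sin(x))   # the compositional reading, sin(sin x)

print(arcsin_x, reciprocal)
print(sin_squared, sin_of_sin)
```

Applying `math.sin` to `arcsin_x` returns the original 0.5, which is what makes it the compositional inverse, while `reciprocal` is a different number altogether.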

Expected minimum from non-parametrized distribution
What is the best way of calculating the expected minimum value in a sample set of size n drawn from an arbitrary distribution, the functional form of which is unknown? To clarify, given a set of N samples from the distribution, how could you go about calculating the minimal value likely to be observed, as a function of n, when performing future samplings of size n from the distribution?

Further details: I am using a non-deterministic minimization protocol, and am performing multiple (independent) trials with different starting values to better locate the global minimum (otherwise unknowable). I would like to gauge how quickly the algorithm converges with increasing number of repeated trials. Using a representative example, I have performed a large number of trials. Just plotting the minimum obtained so far against output order is sub-optimal, because "lucky guesses" near the start of the run skew things. I've seen equations which give the minimum as a function of sample size, but they assume you know the functional form of the underlying distribution. Is there a "mathematically rigorous" way to compute it without the functional form? Thanks. -- 174.21.226.184 (talk) 03:56, 6 March 2010 (UTC)
 * This seems like an underspecified problem. Consider an unknown distribution on the two-value sample space {I am alive, I am not alive}.  Every day I wake up and observe a sample drawn from this distribution, and it's been "I am alive" every day so far.  After some thousands of such observations, all the same, should I expect to be alive on any particular day (perhaps in the year 3000) with probability 1?  If not, then I need a more precise model of how I'm predicting.  The article statistical inference might be of some help. 66.127.52.47 (talk) 05:46, 6 March 2010 (UTC)
 * Granted, a sample of size N may not be representative of the underlying distribution, but what happens if we assume it is? - That is, if we assume that each of the samples in N has been randomly drawn in an unbiased fashion from the (otherwise unspecified) underlying distribution? (I'll note that the alive/dead distribution is time dependent in a way I'm assuming the sampling not to be: the probability of being alive at day n+1 is much greater if the state was "alive" at day n, versus being "not-alive" at day n. Let me be clear that I'm only considering systems where samples are uncorrelated - that is, where the probability of a particular result in trial n+1 is completely independent of the result of trial n, modulo the fact they are drawn from the same population.) -- 174.21.226.184 (talk) 06:20, 6 March 2010 (UTC)
 * Do you even know what the sample space is? Just how bad is it if you don't actually find the minimum?  The point is that with N samples, you will probably miss any "black swan" events whose probability is less than 1/N, so if you also know nothing about the sample space, you're probably hosed.  It would probably help if you can describe your actual application, to get more useful answers.  There has to be something you know or can infer about the prior distribution.  The article hyperprior might help you see some ways to formalize or quantify this. 66.127.52.47 (talk) 06:50, 6 March 2010 (UTC)
 * I have absolutely zero a priori knowledge of the sample space - all knowledge of it is solely contained within the N test samples. The underlying optimization problem is probably NP-hard (it's a derivative of the protein folding problem), so I'm unlikely to ever know what the true global minimum is, let alone how my non-deterministic minimization protocol samples the underlying search space. I realize that means that I won't be able to predict rare events, which is okay. I'm simply trying to gauge, to the best of my ability, with the knowledge I have, how effective my minimum-finding protocol is, as a function of number of runs. Yes, I may not find the true minimum with N, or even 1 000 000*N samples, but I simply want to know how much increasing the sample size from, say, n ~ N/5 to n ~ N/4 will improve the estimate of the minimum. -- 174.21.226.184 (talk) 16:23, 6 March 2010 (UTC)
 * Well why do you think the distribution even has a minimum? Suppose it's just the normal distribution, which means the "minimum" is minus infinity. 66.127.52.47 (talk) 21:59, 6 March 2010 (UTC)

If it makes it any easier, just assume that N *is* the underlying distribution. So my question now becomes: assuming a sample of size n drawn from a (total) population of size N, how do you calculate the expected minimum value for some parameter x as a function of sample size n, with the clarification that multiple members of N may share the same value of x? -- 174.21.226.184 (talk) 16:23, 6 March 2010 (UTC)
 * I'd do the following:
 * Take a random sample of n points from the N you have.
 * Calculate their minimum.
 * Repeat B times.
 * Calculate the average of all B experiments.
 * -- Meni Rosenfeld (talk) 16:50, 6 March 2010 (UTC)
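
The resampling steps above can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the function name `resampled_expected_min`, the default number of repeats, and the fixed seed are all assumptions made here, and the draws are made with replacement (matching the "black box" description later in the thread). Use `rng.sample` instead of `rng.choices` to draw without replacement.

```python
import random

def resampled_expected_min(samples, n, repeats=10_000, seed=0):
    """Estimate the expected minimum of a size-n sample by resampling the observed data."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    total = 0.0
    for _ in range(repeats):                  # repeat B times (B = repeats)
        subsample = rng.choices(samples, k=n) # random sample of n points, with replacement
        total += min(subsample)               # its minimum
    return total / repeats                    # average over all B experiments

# Hypothetical data standing in for the N observed values.
observed = [0, 6, -2, 10, 6, 3, 8, 1]
for n in (1, 2, 4, 8):
    print(n, resampled_expected_min(observed, n))
```

The estimate is noisy, with an error that shrinks roughly like one over the square root of the number of repeats, and it can never fall below the smallest value actually observed.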
 * Well what do you mean by "expected" in "expected minimum value"? Again this seems to suppose some knowledge of a hyperprior of some sort.  Otherwise all you know is your n samples (if drawn without replacement) have n/N chance of containing the global minimum.  Are we even presuming that the sample space is a set of real numbers (or maybe integers, since this sounds like a combinatorial problem)?  Again I'm thinking of a "black swan" distribution: N=1000000, N-1 elements of the sample space are in the interval [0,10], and the Nth element (the "black swan") is $$-10^{1000}$$.  Either you happen to see the black swan or else you don't.  If you don't see it, there is nothing to suggest that it is there at all.  You don't even know the expected value of a single draw, since the black swan completely dominates the sum of the possible outcomes. 66.127.52.47 (talk) 22:24, 6 March 2010 (UTC)
 * I just came across this:
 * Bayesian_probability
 * It has some interesting discussion. Added: Sunrise problem and the articles it links to are also interesting. 66.127.52.47 (talk) 23:59, 6 March 2010 (UTC)
 * By "expected minimum" I meant the likely (real-valued) minimum value which would be observed within a subsample limited to size n. To some extent, I was using the phrase "expected minimum" to refer to the expected value of the sample minimum (for the sample of size n), although at this point I'll take any relatively-easy-to-compute "average"-like metric. Let me clarify: Effectively, I have a black box which spits out numbers at me. For all practical purposes, I know nothing about how this black box decides which number to return, except for the fact that each number is *completely independent* of any number previously produced or any number that will be produced, as the black box completely forgets which number it produces immediately after it returns it and maintains no internal state between sessions. I can run the black box N times, and get a list of representative values. (In the "if it makes things easier" formulation, I can say that the black box randomly samples a member from a known population set of size N.) In a hypothetical future, I will produce n numbers from the same black box, for some (as yet to be determined) value of n. Given this data, I wish to know what I can conclude about the properties of this hypothetical sample of size n, without having to actually pull n numbers from the black box. Specifically, I am interested in the minimum value observed in the sample of size n. (That is, if N=300, n=4 and the values of the hypothetical future sample are {0, 6, -2, 10}, the relevant parameter is '-2', the minimum value within the sample set of size n). As the sampling is random, it is expected that this value should vary based on n (specifically, as the size n of the sample increases, the observed minimum should decrease). To be precise, there really isn't going to be a single minimum value for any given n, but rather a probability distribution m(x,n), giving the probability that the observed minimum value within a sample of size n is equal to x. 
I was hoping for some way of calculating some sort of parameter on this distribution (like the expected value) which describes how the distribution is "centered" on the number line. In writing my question, I was thinking of things like Student's t-test, where meaningful conclusions can be drawn on populations who are characterized by a sample set, rather than a fully-parametrized distribution. -- 174.21.235.250 (talk) 06:34, 7 March 2010 (UTC)

I think I've figured out a way to calculate what I was looking for. Let:
 * p(x) = the probability of drawing x
 * P(x) = the cumulative distribution function for p(x) (i.e. the probability that a number drawn is less than or equal to x)
 * m(x,n) = the probability of having a sample minimum of x in a randomly drawn sample of size n
 * M(x,n) = the CDF for m(x,n) (i.e. the probability of having a sample minimum less than or equal to x)

By definition, we know that
 * m(x,1) = p(x)
 * M(x,1) = P(x)

We now can iteratively calculate m(x,n) by hypothesizing drawing an additional number, and adding it to the n-1 numbers already drawn:
 * m(x,n) = (probability the minimum so far is greater than x)*(probability of drawing an x) + (probability the minimum so far is equal to x)*(probability of drawing an x) + (probability the minimum so far is equal to x)*(probability of drawing higher than x)
 * m(x,n) = (1-M(x,n-1))*p(x) + m(x,n-1)*p(x) + m(x,n-1)*(1-P(x))

I would have preferred a non-iterative calculation (and one where I didn't have to explicitly calculate the CDF), but this works for my use, and hopefully is more robust than the average-of-random-samples method. -- 174.21.235.250 (talk) 04:37, 10 March 2010 (UTC)
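
The recurrence can be sketched in code as below, treating the N observed values as the whole distribution and assuming independent draws with replacement. The function names are made up for illustration. For comparison, the second function uses the standard identity for independent draws, M(x,n) = 1 - (1 - P(x))^n, which gives the same distribution without iterating over n; that identity is not stated in the post above and is added here as a cross-check.

```python
from collections import Counter

def min_distribution(samples, n):
    """Return {x: m(x, n)} via the recurrence, for n >= 1."""
    total = len(samples)
    counts = Counter(samples)
    values = sorted(counts)
    p = {x: counts[x] / total for x in values}   # p(x)
    P, running = {}, 0.0                         # P(x) = probability of drawing <= x
    for x in values:
        running += p[x]
        P[x] = running
    m = dict(p)                                  # m(x, 1) = p(x)
    for _ in range(n - 1):
        M, running = {}, 0.0                     # M(x, n-1), CDF of the previous minimum
        for x in values:
            running += m[x]
            M[x] = running
        m = {x: (1 - M[x]) * p[x] + m[x] * p[x] + m[x] * (1 - P[x])
             for x in values}
    return m

def min_distribution_closed_form(samples, n):
    """Same distribution from P(min >= x) - P(min > x), with no iteration over n."""
    total = len(samples)
    counts = Counter(samples)
    result, below = {}, 0                        # below = number of samples strictly less than x
    for x in sorted(counts):
        at_or_below = below + counts[x]
        result[x] = ((total - below) / total) ** n - ((total - at_or_below) / total) ** n
        below = at_or_below
    return result

def expected_min(dist):
    """Expected value of the sample minimum, given {x: probability}."""
    return sum(x * prob for x, prob in dist.items())

observed = [0, 6, -2, 10, 6]  # hypothetical data
print(expected_min(min_distribution(observed, 4)))
print(expected_min(min_distribution_closed_form(observed, 4)))
```

Both functions agree up to floating-point rounding. Unlike the average-of-random-samples method, the result has no sampling noise, though it still describes only the values that were actually observed.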


 * That tells you the probability that the minimum is in the sample you drew. It tells you nothing about the minimum if the minimum is outside your sample. 66.127.52.47 (talk) 11:19, 12 March 2010 (UTC)

How to determine the angle a line drawn between two concentric arcs makes with the tangent of the outer arc..
(Apologies if this isn't a clear explanation)

In trying to draw something I came across the following issue...

A line is drawn between two concentric arcs (centered on O). Define the starting point on the inner arc as A. Define the end point on the outer arc as B.

What is the angle the line AB makes with the tangent to the curve at A or B?

If it helps the distance AB and the radii of the two arcs are known.

Is there a simple solution to this?

The reason I was wanting to know was so as to set the relevant values when performing a transformation on other values.


 * Yes, express A and B in polar coordinates. That lets you immediately figure out the tangent angles (i.e. 90 degrees away from A and B respectively) and compare them to the vector AB as expressed in polar coordinates. 66.127.52.47 (talk) 14:53, 6 March 2010 (UTC)

That makes sense, but having drawn out a diagram I am still none the wiser... OK, assume for simplicity that A is at Cartesian (0,-r) on the inner arc. A has polar form (r, -(pi/2))? B has polar form (R, -(pi/2+delta))? AB as a polar vector is (l, theta)...

r,R and l are known. delta and theta are not known

ShakespeareFan00 (talk) 19:05, 6 March 2010 (UTC)


 * I don't really see either how polar coordinates are useful, but you can restate the problem as such: you have a triangle formed by A, B and O, you know the three side lengths r, R and l, and you'd like to find 2 of the angles (minus π/2). The formula you'd want is the law of cosines. Rckrone (talk) 05:32, 7 March 2010 (UTC)

$$c^2 = a^2 + b^2 - 2ab\cos(\gamma)\,$$

$$ l^2 = r^2 + R^2 - 2Rr\cos(\gamma)\,$$

Solving for gamma:

$$ l^2 - r^2 - R^2 = -2Rr\cos(\gamma), $$

$$ \cos(\gamma) = \frac {l^2-r^2-R^2} {-2Rr} ,$$ This gives the angle AOB between the end points, as seen from O.

From there the values needed to solve the angles I actually require are obvious :)

I'll post up a full worked out example in the next few days...

Notes
 * If $$ \gamma $$ is 0  then AB is quite obviously the difference in radii of the two arcs, which is the minimal length l can be
 * It should also be possible (using the cosine or sine laws) to determine the maximum l for


 * This looks like it might also be useful in respect of another problem, that of approximating a curve using fixed length lines. ShakespeareFan00 (talk) 11:11, 7 March 2010 (UTC)


 * Hmm, Just had another look at it.

If you consider the triangle OAB and work out angle A then that's the solution desired. ShakespeareFan00 (talk) 14:54, 7 March 2010 (UTC)
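
Pending a full worked example, here is a minimal sketch of the calculation using the law of cosines on triangle OAB. The function name `tangent_angles` is made up for illustration, and the sketch assumes R - r ≤ l ≤ R + r so that the triangle exists; at the extremes, rounding can push an `acos` argument slightly outside [-1, 1]. Since each tangent is perpendicular to its radius, the angle between AB and a tangent is the triangle's interior angle at that point, minus 90 degrees (taken as an absolute value here).

```python
import math

def tangent_angles(r, R, l):
    """Return (angle at A, angle at B, central angle), all in degrees.

    r, R: radii of the inner and outer arcs; l: length of AB.
    The first two values are the angles between AB and the tangents at A and B.
    """
    # Interior angles of triangle OAB, each from the law of cosines.
    angle_a = math.acos((r**2 + l**2 - R**2) / (2 * r * l))  # angle OAB
    angle_b = math.acos((R**2 + l**2 - r**2) / (2 * R * l))  # angle OBA
    gamma = math.acos((r**2 + R**2 - l**2) / (2 * r * R))    # angle AOB
    # Each tangent is perpendicular to its radius, so subtract 90 degrees.
    at_a = abs(math.degrees(angle_a) - 90)
    at_b = abs(math.degrees(angle_b) - 90)
    return at_a, at_b, math.degrees(gamma)

print(tangent_angles(3, 5, 4))
```

With r = 3, R = 5 and l = 4 the triangle is right-angled at A, so AB lies along the tangent at A, which makes a convenient sanity check.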

statistics-- median in following case
We know that to get the median of grouped data with equal class widths, we first find the cumulative frequency (cf) of each class. Then we find n/2 and the first class whose cf is greater than or equal to n/2; that class is called the median class. We then use the following formula to get the median:  median = l+[(n/2 - cf)/f]*h  where
 * l is the lower class limit of the median class
 * n is the number of observations
 * cf is the cumulative frequency of the preceding class
 * f is the frequency of the median class
 * h is the width (length) of the class

SO THE QUESTION IS
 * What will be the cf(of preceding class) in the formula if the median class is the first class.

THANX --Myownid420 (talk) 16:38, 6 March 2010 (UTC)
 * 0. -- Meni Rosenfeld (talk) 16:53, 6 March 2010 (UTC)


 * You should really have enough classes so the very first class isn't the one with the median in it. Normally one should have at least ten classes; the data probably has rather a long tail compared to its interquartile range if it still has the median in the first class. Dmcq (talk) 17:03, 6 March 2010 (UTC)


 * I will point out, just as an aside, that the one fatal mistake in statistics is to focus so much on the math that you lose track of common sense. The median is a simple and easy to understand concept  (it's the counting midpoint of a set of ordered data points - figure out what n is, count from the left until you reach n/2, and there you are).  That's all your equation does, after having taken a couple of spins through the Twilight Zone.  Think first, calculate later.  -- Ludwigs 2  21:29, 6 March 2010 (UTC)
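
To make the answer concrete, here is a small sketch of the grouped-median formula from the question, with cf starting at 0 so that the first class is handled like any other. The function name `grouped_median` and the example data are made up for illustration; equal class widths are assumed, as in the question.

```python
def grouped_median(lower_limits, frequencies, width):
    """Median of grouped data: l + ((n/2 - cf) / f) * h."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("no observations")
    cf = 0  # cumulative frequency of the preceding classes; 0 before the first class
    for lower, f in zip(lower_limits, frequencies):
        if cf + f >= n / 2:  # this is the median class
            return lower + (n / 2 - cf) / f * width
        cf += f

# Classes 0-10, 10-20, 20-30; the median class is the first one, so cf = 0.
print(grouped_median([0, 10, 20], [6, 3, 1], 10))
# Same classes, but the median now falls in the second class, so cf = 2.
print(grouped_median([0, 10, 20], [2, 5, 3], 10))
```

In the first call the formula reduces to 0 + (5 - 0)/6 * 10, which is just "count from the left until you reach n/2", interpolated within the class.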