Talk:Random variable/Archive 1

Delete example?
Would it be appropriate to delete the "example" section at the bottom? 203.173.32.84 06:52, 22 Mar 2005 (UTC)
 * I think so, don't see what it adds. Frencheigh 08:20, 22 Mar 2005 (UTC)
 * Yes. Moreover, it is not an example of a random variable; it's just a realization of a random variable. Highly misleading. --140.78.94.103 17:59, 24 Mar 2005 (UTC)

I thought this was very useful because it showed an example where the map was not the identity map. I propose putting it back. Pdbailey (talk) 00:48, 5 July 2008 (UTC)

False statement
A continuous random variable does not always have a density. —Preceding unsigned comment added by 194.4.140.135 (talk) 14:08, 11 June 2008 (UTC)

Definition
I don't like the definition here (not that I'm letting my ignorance of stats stop me from commenting), but could this be considered a clearer definition?

"Random Variable

The outcome of an experiment need not be a number, for example, the outcome when a coin is tossed can be 'heads' or 'tails'. However, we often want to represent outcomes as numbers. A random variable is a function that associates a unique numerical value with every outcome of an experiment. The value of the random variable will vary from trial to trial as the experiment is repeated.

There are two types of random variable - discrete and continuous.

A random variable has either an associated probability distribution (discrete random variable) or probability density function (continuous random variable)."


 * I think this is definitely an improvement, but it should be pointed out that there exist r.v.s which are neither discrete nor continuous (for example, take the sum of one r.v. of each type) Brian Tvedt


 * The current definition is bad in that it is neither clear to the lay reader nor specific to the technical reader. The above definition is far better on both counts and so I will put it on the page. Pdbailey (talk) 21:35, 22 March 2008 (UTC)


 * BTW, sorry about the no summary edit, I hit the enter key accidentally. Pdbailey (talk) 21:36, 22 March 2008 (UTC)

Examples
"
 * 1) A coin is tossed ten times. The random variable X is the number of tails that are noted. X can only take the values 0, 1, ..., 10, so X is a discrete random variable.
 * 2) A light bulb is burned until it burns out. The random variable Y is its lifetime in hours. Y can take any positive real value, so Y is a continuous random variable.
"
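The quoted discrete example can be sketched in a few lines of code (a toy illustration, not from the original reference; the names are made up):

```python
import random

# Toy model of quoted example 1: the sample space is all sequences of ten
# coin tosses, and the random variable X maps an outcome to the number of
# tails noted.
def X(outcome):
    return sum(1 for toss in outcome if toss == 'T')

outcome = tuple(random.choice('HT') for _ in range(10))  # one realization
value = X(outcome)
assert 0 <= value <= 10  # X is discrete: it takes values in {0, 1, ..., 10}
```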

I think the above definition (for the original reference, please search via Google) explains to me that it is a mapping between a value and the sequence number of that value, rather like a numeric index to a database record. This would appear to allow greater generalization in statistical definitions and algorithms. The earlier Wikipedia definition led me to comment incorrectly on another Wikipedia definition, because I think I may have misunderstood the concept after looking at this Wikipedia definition.

A random variable is a function that assigns a numerical value to an event. This example:


 * $$X = \begin{cases}\text{player 1 wins},& \text{if a 1 is rolled},\\ \text{player 1 wins},& \text{if a 2 is rolled},\\ \text{player 1 wins},& \text{if a 3 is rolled},\\ \text{player 1 wins},& \text{if a 4 is rolled},\\ \text{player 1 wins},& \text{if a 5 is rolled},\\ \text{player 2 wins},& \text{if a 6 is rolled}.\end{cases}$$

just assigns another event to an event. Could we change it to something like:

Consider a game where if a 6 is rolled on a fair six-sided die, player 1 wins, and otherwise, player 2 wins. Let X represent the number of times player 1 wins. Then


 * $$X = \begin{cases}0,& \text{if a 1, 2, 3, 4, or 5 is rolled},\\ 1,& \text{if a 6 is rolled}.\end{cases}$$

and


 * $$\rho_X(x) = \begin{cases}\frac{5}{6},& \text{if } x=0,\\ \frac{1}{6},& \text{if } x=1,\\ 0,& \text{otherwise}.\end{cases}$$

or maybe include an example of a continuous random variable? What do you think? Thelittlestspoon (talk) 07:33, 21 May 2008 (UTC)
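A quick simulation of the proposed die game (a sketch; the seed, roll count, and function name are arbitrary):

```python
import random

def X(roll):
    """The proposed random variable: 1 if player 1 wins (a 6 is rolled), else 0."""
    return 1 if roll == 6 else 0

random.seed(0)  # fixed seed so the run is reproducible
n = 100_000
p_hat = sum(X(random.randint(1, 6)) for _ in range(n)) / n
# rho_X(1) = 1/6 ≈ 0.167; the empirical frequency should be close to it
assert abs(p_hat - 1/6) < 0.01
```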

This section should begin with a sentence of the form "A random variable is ...". To avoid this is to show that you don't quite know what to say -- like "I cannot define pornography, but I know it when I see it". Begin by trying to define a variable (of the type seen in elementary school): "A variable is a symbol with an associated domain; this symbol represents any member of the domain." Now what is a random variable? "A random variable is a symbol with an associated domain and a probability measure on that domain. The associated probability indicates how likely the symbol is to represent the various values in the domain." Wrstark (talk) 04:15, 19 February 2010 (UTC)

Minimal meanderings on meta-definitions
In relation to the above, comparing the two definitions, one seems more mathematically formal than the other. We should probably bear in mind when posting (not that I've done much) that a definition should stand on its own as far as possible. That is, it should be comprehensible to an average target reader in a minimum of time. This means we may need to take great care when using references to other definitions, and also avoid overly specialized terms. This is not to suggest that relevant specialized references should be avoided. Experts in a field often use highly specialized terms in a definition because they are most familiar with that form, although it may not be the most suitable for the general reader. This probably means, among other things, that we should try to use plain language to describe the concept as far as possible, together with (or based on) a minimum number of external definitions and explanatory references, especially those external to Wikipedia. Perhaps this comment could be added to the general rules for posting as a form of guidance.

Definition of cumulative distribution function
I have edited the page to consistently use the right-continuous variant (using less-than-or-equals as opposed to less-than) for the cumulative distribution function. This is the convention used in the c.d.f. article itself.

Also, I fixed a minor error in the example: it implicitly assumed the given r.v. was continuous. For what it's worth, I agree that the example adds little to the article and should be deleted. Brian Tvedt 03:05, 14 August 2005 (UTC)
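The right-continuous convention can be seen concretely with the two-valued die-game variable discussed on this page, with P(X=0) = 5/6 and P(X=1) = 1/6 (a hand-rolled sketch, not code from the article):

```python
# c.d.f. of a random variable with P(X=0) = 5/6 and P(X=1) = 1/6,
# using the right-continuous convention F(x) = P(X <= x).
def F(x):
    if x < 0:
        return 0.0
    elif x < 1:
        return 5/6
    else:
        return 1.0

assert F(0) == 5/6      # the jump at 0 is included, since F uses <=
assert F(-1e-9) == 0.0  # just to the left of 0 the c.d.f. is still 0
assert F(1) == 1.0
```

With the strict less-than variant, F(0) would instead be 0, and the function would be left-continuous at the jumps.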

Reverted change to definition
The change I reverted was:


 * The expression "random variable" is an embarrassingly dogmatic misnomer: a random variable is simply a function mapping from events to numbers.

It's not so simple. A random variable has to be measurable. Also, the opinionated tone is not appropriate. Brian Tvedt 01:06, 21 August 2005 (UTC)

A question about Equality in mean
Why is "Equality in mean" as defined in 4.2 different from "Almost sure equality" ? A nonnegative random variable has vanishing expectation if and only if the random variable itself is almost surely 0, no ? Apologies if I missed a trivial point. MBauer 15:48, 28 September 2005 (UTC)


 * Yes, that seems the same.--Patrick 20:20, 28 September 2005 (UTC)

A 'first iteration' simple explanation possible?
In the first paragraph I find: "For example, a random variable can be used to describe the process of rolling a fair die and the possible outcomes { 1, 2, 3, 4, 5, 6 }. Another random variable might describe the possible outcomes of picking a random person and measuring his or her height." I am not at all sure what the random variable *is* in these two cases from the description. How does a rnd.var. describe a process? Isn't it the result of a process that it pertains to? How does a rnd.var. *describe* the possible outcomes? My understanding is that it could be e.g. the number of dots facing upwards in an experiment being one throw of a die (but what about a coin? head/tail is not a number, nor are the faces of a die with differently colored faces), or e.g. the number of occurrences of the die showing a '1' in a given number of throws. In the second example, I guess it is the height of a person that is the rnd.var. Again I'm not sure how this *describes* the possible outcomes; I thought the 'range' of possible outcomes would be the interval from the height of the shortest person in a population to the tallest, and I don't see how a rnd.var. *describes* this. What I often miss in definitions are some simple, yet precise descriptions and examples before the more elaborate definitions, which are often too technical to be helpful at first (but good to visit later when my understanding of the subject has grown). Could anybody with a better knowledge of the subject please change/amend the introduction? Thank you. M.Andersen

80.202.85.143 11:07, 9 August 2006 (UTC)

I think it would be convenient to have sections in discussion pages in order to manage different questions around the same subject. Apart from this suggestion, I sincerely think that the text at the beginning of the article is circular. Indeed, saying that "a random variable is a function that maps results of a random experiment..." is saying nothing. I agree with the point that "we should try to use plain language to describe the concept as far as possible...". But I think it must be clear when we are describing (informally) a subject and when we are defining (formally) the same thing. The correct definition is already in the text, but it begins with a horrible "Mathematically, a random variable is..." If we introduce, in mathematical subjects, a section of informal description and a section of formal definition, I think we will gain in clarity. I will not change the article pending comments on my suggestion. --Crodrigue1 15:54, 19 November 2006 (UTC)

Other definition
Incorrect sentence - "random variable"

I don't think this sentence is correct:
 * Some consider the expression random variable a misnomer, as a random variable is not a variable but rather a function that maps outcomes (of an experiment) to numbers

A random variable is a variable... a value that can take on different values at different times. It is NOT a function. A function could describe the rate at which that random variable takes on different values. They aren't the same thing, just as an object in space or a thrown ball isn't the same thing as the function that describes its trajectory. Fresheneesz 07:05, 29 January 2007 (UTC)


 * Looking into it further, I think that there are two different definitions of a "random variable". In statistical experiments, you hold as many variables as possible constant so that you can study a very few (preferably just one). This variable *will* be random, with some probability distribution. This is not the random variable that this article is talking about.


 * My question is: whats the difference between this and a probability distribution? We should add a second page covering the other definition or at least put a note on this article. Fresheneesz 07:11, 29 January 2007 (UTC)


 * Well, when you formalize things, a random variable becomes nothing other than a measurable function, in the same way as a probability becomes nothing other than a kind of measure. Oleg Alexandrov (talk) 16:07, 29 January 2007 (UTC)


 * I suppose. Well, I think some effort needs to go into reconciling these two mergable ideas of a "random variable" and a "random function" so that either definition will be consistent with the one in the article. Fresheneesz 21:08, 5 February 2007 (UTC)


 * As my calculus/probability professor said many years ago, "Random Variables - they are neither RANDOM nor VARIABLES" 65.96.187.78 19:34, 27 July 2007 (UTC)

A Lacking Explanation for the Average Reader
This page suffers from the same problem as most of the math definitions in Wikipedia. For the average reader, the definition is mostly NOT helpful. There is a single short sentence at the beginning that attempts to describe generally what a random variable is. Then the article goes straight into the formal definition. In order for this article to be helpful, this first sentence needs to be expanded upon. What are Random Variables for? How can one envision a random variable? Where would one be likely to use a random variable? In what other terms could one describe what a random variable means (without the formal math terms)? Perhaps a "function that maps outcomes from experiments to numbers" may not be entirely accurate, but it can certainly help the average person to better understand the concept. I would propose putting the formal definition in a completely separate second section, after the introduction. It's not just math people who are trying to read and understand some of these concepts. Please consider the average user when you write these definitions!


 * Bravo!
 * It seems that Wikipedia is really changing in that regard, and that it is no longer aiding the accessibility of science for average people, as I first thought. You cannot imagine how much dedication and advertisement I have been carrying out for this great project, but it seems that something is going seriously wrong, since I am no longer able to benefit from reading the mathematical Wikipedia, for example, along with some other types of articles :^(
 * This article was not helpful at all for me 77.30.105.76 (talk) 19:15, 14 November 2008 (UTC).

Title of Article: "Random Variable" vs. "Random Value"
Sorry to be negative, but this article is misnamed. A random variable is a variable chosen at random. E.g., if I have variables x, y, and z, then I might choose variable x at random.

I think this definition refers to a random value which might be assigned to a variable. As in:

x = rand(95) + 32;

Where if x represented an ASCII character, it could be anything from a space to a tilde. It's linked from so many places I'm a little scared to rewrite it. But in the context of many of the links, which are in cryptography, pseudorandom number generators, and password strength, it is even more clearly misnamed. What do others think?

P.S. Functions are not random. Functions are, by definition, deterministic. Call it pseudorandom function if you must, but there is another article on that.

—The preceding unsigned comment was added by GlenPeterson (talk • contribs).


 * They are called random variables. It's not Wikipedia's job to make up new names for things. --Zundark 17:55, 12 July 2007 (UTC)

Flipping coins
Is there any explanation for the seemingly astounding fact that we have managed to flip a negative number of tails in the first example? No surer way to confound the enterprising novice than to begin right away with total absurdity. My powers of visualization are nowhere near the point of being able to picture five negative coin flips. And how is ω equal to both T and H if T and H have different values? Sometimes I think you guys are inventing your own mathematical notation as you go. —Preceding unsigned comment added by 76.116.248.48 (talk) 03:46, 23 February 2008 (UTC)


 * It doesn't say anything about "five negative coin flips". Nor does it say that ω is equal to both T and H; it only says what value X(ω) takes if ω = T, and what value it takes if ω = H. I'll add a couple of ifs to make this clearer. --Zundark (talk) 08:32, 23 February 2008 (UTC)

Problem with "Functions of random variables"
The article asserts that the composition of measurable functions is measurable, and makes use of this "fact". However, this is false. In fact, there is a measurable g and continuous f such that g(f(x)) is not measurable. (See Counterexamples in Analysis by Gelbaum and Olmsted.) I am reluctant to edit the article to correct this, since I do not know how the probability community deals with this issue. My inclination would be to say that if X is a random variable and f is a Borel function, then Y=f(X) is a random variable; this works because the composition of a Borel function with a measurable function is measurable.

cshardin (talk) 17:31, 29 April 2008 (UTC)
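The corrected statement cshardin proposes could be phrased as follows (a standard measure-theory sketch, not text quoted from the article):

```latex
% Borel functions of random variables are random variables;
% merely Lebesgue-measurable f need not work, per the counterexample.
\textbf{Claim.} Let $(\Omega, \mathcal{F}, P)$ be a probability space,
let $X : \Omega \to \mathbb{R}$ be a random variable, and let
$f : \mathbb{R} \to \mathbb{R}$ be Borel measurable. Then
$Y = f \circ X$ is a random variable, since for every Borel set
$B \subseteq \mathbb{R}$,
\[
  Y^{-1}(B) = X^{-1}\bigl(f^{-1}(B)\bigr) \in \mathcal{F},
\]
because $f^{-1}(B)$ is Borel and $X$ is $\mathcal{F}$-measurable.
```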


 * Just to clarify, the counterexample involves a non-measurable composition, and you believe that the statement is true when the outer function is Borel? Pdbailey (talk) 01:42, 30 April 2008 (UTC)


 * The counterexample is three functions: a measurable function g, a continuous function f, and their composition g o f, which is not measurable. Since all continuous functions on the reals are measurable, this example gives two measurable functions whose composition is not measurable.  I am not saying that for all measurable functions, their composition is not measurable.  I am saying that for some measurable functions f and g, their composition is not measurable.  It is wrong to say "the composition of measurable functions is measurable" when this only happens to be true for some measurable functions. cshardin (talk) 14:38, 30 April 2008 (UTC)


 * Perhaps you should start at measurable function (see the properties section) and then move over here--that article is more critical to the counterexample and is more critically flawed. BTW, I'm still not clear on whether f is measurable in the counterexample. Pdbailey (talk) 15:19, 30 April 2008 (UTC)


 * The article measurable function is careful not to state outright that the composition of measurable functions is measurable. It makes a much more precise statement that specifies the exact sense in which each function involved is measurable.  I think the issue here is one of usage, not mathematics.  When one speaks of a function g from the reals to the reals as being measurable, does one mean that g is Lebesgue measurable, Borel, or something else?  (In the counterexample I refer to, g is Lebesgue measurable but not Borel; f is continuous, Borel, and Lebesgue measurable.)  The prevailing usage I have seen is that, in the context of functions from the reals to the reals, "measurable" means Lebesgue measurable unless otherwise specified; the article measurable function implies that, in the context of functions from the reals to the reals, "measurable" means Borel unless otherwise specified.  I am not an expert on usage so I will not take issue with that, and if that's the convention here, then so be it.  It might benefit from a clarification, though, since a lot of mathematicians will read "measurable" to mean Lebesgue measurable by default, when one is speaking of a function from the reals to the reals. cshardin (talk) 16:43, 30 April 2008 (UTC)

definition
I'm all for adding a technical definition later, but i think this article needs to state what it is talking about up front. Pdbailey (talk) 17:41, 30 April 2008 (UTC)

Determinism
I noticed the subject of determinism has not been brought up, and other articles seem to champion the incorrect viewpoint that random variables are not deterministic. It is important to recognize that a deterministic system can create variables where each of a set number of outcomes is equally likely.

I.e., those variables in which small changes in the conditions create large variance in the outcome, such that all the infinitely many possibilities are rolled into a finite number of outcomes, with an equal percentage of the possibilities being channeled into each one...

More general definition
At the moment the article seems to assume that a random variable only takes on values in the real numbers, but many authors use a more general definition. An informal definition might be: A random variable (often abbreviated to rv) is an object used widely in Probability and Statistics. An rv does not have a fixed value but instead takes on any of a set of values with frequencies determined by a probability distribution. The values taken on by the rv may be real numbers or integers or more generally they may be vectors or vertices of a graph or colors or members of any set whatsoever. Many authors restrict their definition to the real numbers or the integers.

We would then have to give alternate formal definitions for the restricted and the general case.

I think we need to do this because different authors do use different definitions. I'll make the changes if I don't get negative comments. Dingo1729 (talk) 05:31, 16 October 2008 (UTC)
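A toy illustration of the more general definition, where the value space is a set of colors rather than numbers (the color assignment is invented for illustration):

```python
from collections import Counter

# Sample space: faces of a fair die; value space: colors, not numbers.
COLOR = {1: 'red', 2: 'red', 3: 'blue', 4: 'blue', 5: 'blue', 6: 'green'}

def X(face):
    """A random variable in the generalized sense: outcomes -> colors."""
    return COLOR[face]

# Distribution induced on the value space by a fair die:
counts = Counter(X(face) for face in range(1, 7))
probs = {color: count / 6 for color, count in counts.items()}
assert probs == {'red': 2/6, 'blue': 3/6, 'green': 1/6}
```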


 * Can you give a reference that uses a definition like this? Pdbailey (talk) 01:54, 17 October 2008 (UTC)

Trying to Understand Random Variables
I'm trying to refresh my memory on what a random variable is (maybe I never understood what it was in the first place), and it seems like something very basic is missing: (it seems) a random variable is basically an assignment of real values to elements in the sample space. Such an assignment imposes a natural mapping from events to (real) values. If some version of my previous statement is correct, it could be added to the intro.

In the definition section, "real values" could be generalized to measurable spaces.

While I do not understand the purpose of using general measurable spaces (I am barely acquainted with them), it seems we should also add the following restriction on the rv $$X$$, if this generalization is to be of much use: $$\forall f \in \mathcal{F},\; X[f] \in \Sigma$$.

A cosmetic suggestion: Why not use a more suggestive symbol for the set of events, such as $$E$$? If $$\mathcal{F}$$ is an established convention, we can stick to that.

Danielx (talk) 10:58, 9 November 2008 (UTC)

This example could be misleading...
"Continuous random variables can be realized with any of a range of values (e.g., a real number between zero and one)." It might easily lead to the idea that all r.v.'s must take values in [0, 1]. —Preceding unsigned comment added by 71.231.101.149 (talk) 20:27, 10 February 2009 (UTC)

Why Random Variable have such name?
How is it "Random"? Why is it called Random Variable? —Preceding unsigned comment added by 220.241.115.165 (talk) 07:49, 31 March 2009 (UTC)

More on the introduction
Everyone has done a great job with this article and I see you've wrestled with many of the thorny issues surrounding "random variables," balancing mathematical precision and user-friendliness. You're almost there. But the definition in the introduction is still not right:
 * "Continuous random variables can be realized with any of a range of values (e.g., a real number between negative infinity and positive infinity) that have a probability greater than zero of occurring."

Actually, the probability for any value of a continuous random variable is zero. We can speak of a positive probability density, but not a positive probability. A set of values may have a positive probability. If this is what you mean by saying a "range of values ... that have a probability greater than zero", then this needs to be made clearer. The total probability of the range of values that the continuous random variable is defined on is 1, not just "greater than zero." However this sentence was meant to be understood (and it is not clear), it is not right. --seberle (talk) 20:02, 28 May 2009 (UTC)
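seberle's point can be illustrated with a uniform variable on [0, 1] (a hand-rolled sketch; the function name is made up):

```python
# For a continuous random variable, any single value has probability zero,
# yet a set of values can have positive probability. Sketch with
# U ~ Uniform[0, 1]:
def P_interval(a, b):
    """P(a < U <= b) for U uniform on [0, 1]."""
    lo, hi = max(a, 0.0), min(b, 1.0)
    return max(hi - lo, 0.0)

assert P_interval(0.3, 0.3) == 0.0    # a single point has probability zero
assert P_interval(0.25, 0.75) == 0.5  # an interval can have positive probability
assert P_interval(0.0, 1.0) == 1.0    # the total probability over the range is 1
```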

Ambiguity in the meaning of "random variable"
The meaning of "random variable" and the corresponding denotation of a random variable symbol is ambiguous throughout the article.

On the one hand "random variable" is used to refer to an observation or generation process from which values can be obtained (in some distribution). This process is modeled by a probability distribution over a space of possible values. So the random variable symbol denotes a structure consisting of a probability space and a mapping to a value space.

But on the other hand many of the examples indicate that the random variable symbol actually denotes some value taken at random from the value space.

The second meaning is not fully coherent, since a mathematical symbol must have a unique denotation --- it cannot randomly refer to a value. But despite not being fully coherent, this second meaning is somewhat useful as it suggests how certain notation should be interpreted, which would be incoherent if the random variable always denoted the whole structure that models the random process.

For instance consider: P( X < k ) denoting the probability that the random variable X yields a value less than k (which is a member of the value space of X).

The X here seems to denote some value from the value space because it is compared to a particular element k of the value space. But the symbol "X" cannot itself have a random value, otherwise the expression P(X < k) would on some occasions have the value 1 and on other occasions have the value 0. What the notation is intended to denote is the probability of the set of sample elements of X such that their associated value determined by X is less than k. This is a kind of quantification operation, which might be represented by something like the following: Prob_X(S) where S = {x | x in Samp_X and Val_X(x) < k}

Clearly P( X < k ) is much simpler, but relies on an implicit and rather subtle interpretation of the notation.

Sifonios (talk) 23:31, 27 June 2009 (UTC)
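Sifonios's Prob_X / Samp_X / Val_X reading of P(X < k) can be made concrete in a few lines (a sketch using a fair die and the die-game variable discussed earlier on this page; the names mirror the notation above):

```python
from fractions import Fraction

# A fair die: Samp_X is the sample space, Prob_X the probability measure,
# and Val_X the mapping from sample elements to values (the die-game X:
# 1 if a 6 is rolled, else 0).
Samp_X = {1, 2, 3, 4, 5, 6}
Prob_X = {s: Fraction(1, 6) for s in Samp_X}
Val_X = {s: (1 if s == 6 else 0) for s in Samp_X}

def P_less_than(k):
    """P(X < k): the measure of S = {x | x in Samp_X and Val_X(x) < k}."""
    S = {s for s in Samp_X if Val_X[s] < k}
    return sum(Prob_X[s] for s in S)

assert P_less_than(1) == Fraction(5, 6)  # X < 1 means X = 0: faces 1..5
assert P_less_than(2) == 1               # X is always below 2
assert P_less_than(0) == 0               # X is never negative
```

The symbol X here denotes the whole structure (sample space, measure, and value map), while P(X < k) quantifies over the preimage, exactly as the comment suggests.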