Memorylessness

In probability and statistics, memorylessness is a property of certain probability distributions. It describes situations where the time you've already waited for an event doesn't affect how much longer you'll have to wait. To model memoryless situations accurately, we have to disregard the past state of the system – the probabilities remain unaffected by the history of the process.

Only two kinds of distributions are memoryless: geometric and exponential probability distributions.

With memory
Most phenomena are not memoryless, which means that observers gain information about them over time. For example, suppose that $X$ is a random variable representing the lifetime of a car engine, expressed as the "number of miles driven until the engine breaks down". Intuition suggests that an engine which has already been driven for 300,000 miles will have a much shorter expected remaining lifetime than a second (equivalent) engine which has only been driven for 1,000 miles. Hence, this random variable does not have the memorylessness property.
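As a rough illustration of a lifetime with memory, the sketch below assumes (purely for illustration, not from any engine data) that the lifetime is uniformly distributed between 0 and 400,000 miles. The function names and the 400,000-mile cap are hypothetical. Under that assumption, the chance of surviving another 50,000 miles depends on the miles already driven.

```python
def survival_uniform(t, max_miles=400_000):
    """Pr(X > t) for a lifetime assumed uniform on [0, max_miles] (hypothetical model)."""
    if t <= 0:
        return 1.0
    if t >= max_miles:
        return 0.0
    return 1.0 - t / max_miles


def conditional_survival(extra, driven, survival=survival_uniform):
    """Pr(X > driven + extra | X > driven), defined when Pr(X > driven) > 0."""
    return survival(driven + extra) / survival(driven)


# Chance of lasting another 50,000 miles for a nearly new and a well-used engine.
new_engine = conditional_survival(50_000, 1_000)
old_engine = conditional_survival(50_000, 300_000)
print(new_engine, old_engine)
```

In this toy model the well-used engine has a noticeably smaller chance of lasting the extra distance, so the history matters and the distribution is not memoryless.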

Without memory
In contrast, consider a situation that does exhibit memorylessness. Imagine a long hallway, lined on one wall with thousands of safes. Each safe has a dial with 500 positions, and each has been assigned an opening position at random. Imagine that an eccentric person walks down the hallway, stopping once at each safe to make a single random attempt to open it. We might define the random variable $X$ as the lifetime of their search, expressed as the "number of attempts the person must make until they successfully open a safe". The expected number of further attempts is then always 500, regardless of how many attempts have already been made. Each new attempt has a 1/500 chance of succeeding, so the person opens one safe per 500 attempts on average – but with each new failure they make no "progress" toward ultimately succeeding. Even if the safe-cracker has just failed 499 consecutive times (or 4,999 times), we still expect to wait 500 more attempts until the next success. If, instead, this person focused their attempts on a single safe and "remembered" their previous attempts to open it, they would be guaranteed to open the safe after at most 500 attempts (and, at the outset, would expect to need only about 250 attempts, not 500).
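The figures in the safe example can be checked with a few lines of exact arithmetic. This is a minimal sketch, assuming that attempts on different safes are independent with success chance 1/500, and that the single-safe strategy tries every position once in a uniformly random order. The function names are illustrative.

```python
from fractions import Fraction

POSITIONS = 500  # dial positions per safe, as in the example above


def expected_attempts_new_safe_each_time(positions=POSITIONS):
    """Expected attempts until the first success when each try succeeds with chance 1/positions."""
    p = Fraction(1, positions)
    return 1 / p  # mean of a geometric distribution on {1, 2, ...}


def expected_attempts_single_safe(positions=POSITIONS):
    """Expected attempts when each position of one safe is tried once, in random order."""
    # The correct position is equally likely to be the 1st, 2nd, ..., last one tried.
    return Fraction(sum(range(1, positions + 1)), positions)


def prob_success_within(n, positions=POSITIONS):
    """Pr(at least one success in n independent attempts on fresh safes)."""
    return 1 - (1 - Fraction(1, positions)) ** n


print(expected_attempts_new_safe_each_time())
print(float(expected_attempts_single_safe()))
print(float(prob_success_within(500)))
```

Under these assumptions the single-safe expectation is (500 + 1)/2 = 250.5, which rounds to the figure of 250 mentioned above, and the chance of at least one success within 500 memoryless attempts is roughly 63%, not a certainty.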

The universal law of radioactive decay, which describes the time until a given radioactive particle decays, is a real-life example of memorylessness. An often-used (theoretical) example of memorylessness in queueing theory is the time a storekeeper must wait before the arrival of the next customer.

Discrete memorylessness
If a discrete random variable $$X$$ is memoryless, then it satisfies $$\Pr(X>m+n \mid X>m)=\Pr(X>n)$$ where $$m$$ and $$n$$ are natural numbers. The equality still holds when $$\ge$$ is substituted for $$>$$ on the left-hand side of the equation.

The only discrete random variable that is memoryless is the geometric random variable. It describes when the first success in an infinite sequence of independent and identically distributed Bernoulli trials occurs. The memorylessness property asserts that the number of previously failed trials has no effect on the number of future trials needed for a success.
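The discrete property can be checked numerically. The sketch below assumes a geometric variable counting trials up to and including the first success, so that $\Pr(X>k)=(1-p)^k$. The value of `p` and the function names are arbitrary choices for illustration.

```python
def geometric_survival(k, p):
    """Pr(X > k) for a geometric variable on {1, 2, ...} with success probability p."""
    return (1.0 - p) ** k


def geometric_conditional_survival(m, n, p):
    """Pr(X > m + n | X > m), computed from the definition of conditional probability."""
    return geometric_survival(m + n, p) / geometric_survival(m, p)


p = 0.2  # arbitrary illustrative success probability
for m in (0, 3, 10):
    for n in (1, 4, 7):
        lhs = geometric_conditional_survival(m, n, p)
        rhs = geometric_survival(n, p)
        # Prints True whenever the two sides agree up to floating-point error.
        print(m, n, abs(lhs - rhs) < 1e-12)
```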

Continuous memorylessness
If a continuous random variable $$X$$ is memoryless, then it satisfies $$\Pr(X>s+t \mid X>t )=\Pr(X>s)$$ where $$s$$ and $$t$$ are nonnegative real numbers. The equality is still true when $$\ge$$ is substituted for $$>$$.

The only continuous random variable that is memoryless is the exponential random variable. It models random processes like time between consecutive events. The memorylessness property asserts that the amount of time since the previous event has no effect on the future time until the next event occurs.
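The continuous property can be checked in the same way. This is a minimal sketch, assuming an exponential variable with survival function $\Pr(X>t)=e^{-\lambda t}$. The rate value and the function names are illustrative only.

```python
import math


def exponential_survival(t, rate):
    """Pr(X > t) for an exponential variable with the given rate (lambda)."""
    return math.exp(-rate * t)


def exponential_conditional_survival(s, t, rate):
    """Pr(X > s + t | X > t), computed from the definition of conditional probability."""
    return exponential_survival(s + t, rate) / exponential_survival(t, rate)


rate = 0.5  # arbitrary illustrative rate
for t in (0.0, 1.5, 10.0):
    for s in (0.25, 2.0):
        lhs = exponential_conditional_survival(s, t, rate)
        rhs = exponential_survival(s, rate)
        # math.isclose allows for floating-point rounding in the ratio.
        print(t, s, math.isclose(lhs, rhs, rel_tol=1e-12))
```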

Exponential distribution and memorylessness proof
The only memoryless continuous probability distribution is the exponential distribution, as the following proof shows:

First, define $$S(t) = \Pr(X > t)$$, also known as the distribution's survival function. From the memorylessness property and the definition of conditional probability, it follows that $$\frac{\Pr(X > t + s)}{\Pr(X > t)} = \Pr(X > s)$$

This gives the functional equation $$S(t + s) = S(t) S(s)$$ which implies $$S(pt) = S(t)^p$$ where $$p$$ is a natural number. Similarly, $$S\left(\frac{t}{q}\right) = S(t)^{\frac{1}{q}}$$ where $$q$$ is a nonzero natural number. Therefore, all nonnegative rational numbers $$a=\tfrac{p}{q}$$ satisfy $$S(at) = S(t)^a$$. Since $$S$$ is continuous and the set of rational numbers is dense in the set of real numbers, $$S(xt) = S(t)^x$$ where $$x$$ is a nonnegative real number. When $$t=1$$, $$S(x) = S(1)^x$$. As a result, $$S(x) = e^{-\lambda x}$$ where $$\lambda = -\ln S(1) \geq 0$$, assuming $$S(1) > 0$$ so that the logarithm is defined.