Talk:Randomness extractor

Overly Mathematical
I agree with the comment that the article suffers the defect of being too mathematical at the expense of intuition and, on that account, perhaps only mathematicians have a chance to understand it. I invite the commentor to lend a hand to clarifying the meaning and importance of key concepts and also point out what are mere assertions. Skinnerd (talk) 14:17, 9 April 2013 (UTC)

Most of the Math in the article attempt to distil rather simple concepts into a mathematical domain. This make the article difficult or confusing for all but mathematicians in the field. On the other hand: the meaning and importance of the mathematical key concepts -- and their difficulties -- is not discussed or even mentioned. The article also bravely states some unprovable facts, that don't have any meaning in the real world. The interesting part is the difference between what has been proven, what is believed or assumed to be true, and how things work in practice. (End note from TRNG98.se)
 * — Preceding unsigned comment added by Bo at TRNG98 (talk • contribs) 13:42, 7 March 2013 (UTC)

This page is very badly written and there are several mistakes and inaccuracies in it. An expert in the field should have a look at it.

Also, it should be merged with Extractor. 132.77.4.129 (talk) 10:33, 24 December 2008 (UTC)

Attempted to Improve
I have tried to improve the wording without changing the math (which I don't consider myself qualified to do). My edits reflect my best understanding as a non-expert who nonetheless has some slight acquaintance with the theory of pseudo-random number generation. I note that a previous contributor has said the article contains several mistakes and inaccuracies. It may be the case that some of these have been taken care of in rewording. If not, they need to be attended to if this article is to stand alone. I also note the article's use of the term obliviate; this term needs to be defined and I am not knowledgeable enough to do it. Also a better job needs to be done of defining what is meant by error as used in connection with an extractor. Skinnerd (talk) 20:44, 23 October 2012 (UTC)

I also added a reference to extractors as unbiasing algorithms because this is a term that is encountered in some (older?) literature. Cited reference to Gifford. Skinnerd (talk) 23:58, 23 October 2012 (UTC)

Definitions Needed
To make this article more clear all of the symbols in the following definition should be defined:

Definition (k-ERF): An adaptive k-ERF is a function $$f$$ where, for a random input $$r$$ , when a computationally unbounded adversary $$A$$ can adaptively read all of $$r$$ except for $$k$$ bits,  $$|\Pr\{A^{r}(f(r)) = 1\} - \Pr\{A^{r}(R) = 1\}| \leq \epsilon(n)$$ for some negligible function $$\epsilon(n)$$.

For example, what is the meaning of $$ A^{r}$$? Skinnerd (talk) 00:14, 24 October 2012 (UTC)

RFC 5869?
RFC 5869 and the related paper (http://eprint.iacr.org/2010/264) seem relevant here. The paper indicates that it is important that the extraction and expansion are independent. I'm not qualified to comment on the quality of the work, but if both the paper and my understanding thereof is correct, then this independence is important to someone implementing an extractor (e.g., one shouldn't use SHA-256 both to extract and to generate output, or use HMAC-SHA256 with the same key for both operations--or, for that matter, use an unkeyed construct like SHA-256 for extraction at all). Wecotanoxa (talk) 12:17, 10 October 2013 (UTC)

Explicit Two-Source Extractors and Resilient Functions
I suspect this new development should be of interest to readers of this article. -- The Anome (talk) 09:42, 18 May 2016 (UTC)

Von Neumann extractor
The wording in the section about the extractor is a little difficult to understand. I am fixing it. Old: "His extractor took successive pairs of consecutive bits (non-overlapping) from the input stream." I read this five times and didn't understand the part about non-overlapping. I found a webpage "Generating uniformly random data from skewed input..." etc.  that explains it a little better, so I finally understand it. New: "From the input stream, his extractor took bits, two at a time." Friendly Person (talk) 02:51, 28 May 2020 (UTC)


 * Your wording is clearer. Thanks! I added (first and second, then third and fourth, and so on) to make it even clearer.


 * The original paper is here: Various techniques used in connection with random digits by John Von Neumann (1951)


 * A common error when writing a routine that does Von Neumann extraction is to overlap.
 * Such an algorithm is not the algorithm Von Neumann described.
 * Assume that we start with this (generated just now with a coin, heads = 1, tails = 2):
 * 1000100101101010
 * Von Neumann paired them up up like this: 10 00 10 01 01 10 10 10
 * ...and discarded any heads-heads or tails-tails result like this:
 * 10 = 1
 * 00 = Throw away result
 * 10 = 1
 * 01 = 0
 * 11 = Throw away result
 * 01 = 0
 * 10 = 1
 * 11 = Throw away result
 * 10 = 1
 * An overlapping algorithm would pair up 1000100101101010 like this:
 * 10 00 00 01 10 00 01 10 01 11 10 01 10 01 10
 * Instead of taking the first and second then third and forth toss like Von Neumann did, it takes the first and second, second and third, third and forth toss.
 * This destroys the property of the input pairs being independent. Now the second flip in one pair is the same as the first flip in the next pair.
 * 10 = 1
 * 00 = Throw away result
 * 00 = Throw away result
 * 01 = 0
 * 10 = 1
 * 00 = Throw away result
 * 01 = 0
 * 10 = 1
 * 01 = 0
 * 11 = Throw away result
 * 10 = 0
 * 01 = 0
 * 10 = 1
 * 01 = 0
 * 10 = 1
 * What's interesting is that nobody who flips physical coins would ever think of using the overlap method, but clever programmers who want to throw away fewer bits sometimes do. --Guy Macon (talk) 04:44, 28 May 2020 (UTC)

"Weakly random" and "weak entropy source"
Can anyone elaborate on what weak entropy is, and why radioactive decay is considered weakly random? Is there an article that could be linked here? 208.98.222.123 (talk) 13:59, 24 August 2023 (UTC)


 * There are two types on entropy sources: (near-)perfect "full entropy" (the term used by NIST, it is convenient since it is not overloaded) and everything else. The article had adopted a terminology of "strong" for the former, "weak" for the latter. All physical sources are subject to the formulas of the laws of nature and thus are typically not "full entropy". An average interval between two successive radioactive decay events, for example, is in practice nearly a constant, so some kind of postprocessing seems to be warranted (for the avoidance of doubt, I am neither a nuclear scientist, nor a statistician, so feel free to correct me). Dimawik (talk) 16:11, 24 August 2023 (UTC)