Talk:Fisher–Yates shuffle

code style
I separated testing the index from decrementing it, since doing both at once is discouraged in many standard coding styles. Also, this makes it easier to state what n exactly means, as an invariant. (The old version would need 'n is the maximum pertinent index', but then the initialization statement doesn't gibe with that.) It might be more natural to move the --n to the end of the loop body (and index n-1). I suspect a for-loop is actually more appropriate (esp. if the target reader is expected to recognize a pre/post decrement).
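(For concreteness, here is the structure I'm describing, sketched in Python; this is my own rendering, not the article's pseudocode. The invariant is that a[n:] is already in its final position:)

```python
import random

def shuffle_while(a):
    """Modern Fisher-Yates with the test and the decrement separated.
    Invariant at the top of the loop: a[n:] is already in final position."""
    n = len(a)
    while n > 1:
        n -= 1                        # decrement first...
        j = random.randrange(n + 1)   # ...then pick j with 0 <= j <= n
        a[n], a[j] = a[j], a[n]
```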

Should the variable 'n' be replaced with something more index-like, like 'i'? In particular, there is confusion between the algorithm's n (which is 0-based), and the n in the pseudocode's An (which is 1-based).

Finally, I like the discussion at the end about drawbacks of pseudo-random number generators, but that's orthogonal to the current page; should this page just link to that? (Like MaxEnt below, I kinda prefer expository treatments over encyclopedic ones in many cases.)

not-just-yeti (talk) 17:29, 14 March 2008 (UTC)


 * I agree with you regarding the indices. I changed 'n' to 'i', 'k' to 'j' and 'N' to 'n', according to common convention. I hope I didn't introduce any errors. Adrianwn (talk) 05:45, 30 April 2010 (UTC)

Comments from Ilmari Karonen's talk page
(Copied from User talk:Ilmari Karonen.)

I see you contributed this article recently. I read it through on a superficial level. It's quite good in pointing out the majority of pitfalls. You can never say enough about pitfalls where random numbers are concerned. A couple of points: there are many applications, such as video games, where a small bias is immaterial. In the case of using a language's built-in sort function, it would be helpful to mention which sort algorithms are unbiased for this purpose. I think plain quicksort is OK, but some quicksorts oversample the pivot element. I would worry about those extra calls to the comparison function. The C++ STL is quite explicit about the properties of the provided sort algorithms, though I can't recall right now which are in and which are out for this purpose.

There are some who might complain that this article is more tutorial than encyclopedic (not citing every claim from another reference). I personally would favour seeing more expository synthesis, where the stiff/authoritative sources are assimilated for the benefit of a less expert reader. A good contribution. (Not expecting a reply, use my talk page if you wish to.) MaxEnt 06:52, 15 August 2007 (UTC)


 * Thanks for the comments. Since you've given me the excuse, I'll try to outline a bit where I've been planning to go with this article.  I've copied your comments here to the article's talk page, since I feel they could be useful to others besides just me.  Similarly, the rest of my response below is as much for myself (thinking out loud) and anyone else who might be reading as for you.


 * I agree that the "Potential sources of bias" section is rather lengthy; it just turned out that way. Part of the reason for spending so much effort and so many words on that section was my (perhaps misguided) aim to include more or less all of the issues already listed in the shuffling article; another was my personal experience that the Fisher-Yates shuffle, though simple, tends to be an easy algorithm to implement incorrectly.  Of course, much of the difficulty comes from not understanding the logic behind the algorithm, but instead treating it as a magic incantation that "just works somehow", a problem that I hope the earlier sections will help to correct.  Certainly I wouldn't see condensing the problems section into a more concise and easily understandable form as a bad thing at all.


 * As for sourcing, I do need to work on it more, if only by matching the existing sources with claims that are found in them. Much of the "Potential sources of bias" section can be sourced to either Knuth or Fisher&Yates themselves, and I'm sure there are other sources that treat the issues in more depth — it's not like I just pulled it all out of a hat.  I did name all the <ref> tags, in anticipation of reusing them, but I'll still need to actually do that.


 * Another issue I'm aware of that needs work is distinguishing between the general problem of shuffling a pre-existing finite set and the specific problem of obtaining a random permutation of the integers 1–N (or 0–(N−1)). A particular reason for emphasizing the distinction is that there's a variant of the algorithm (some might consider it a different algorithm entirely) due to Knuth which is specific to the second task, avoiding the swap operation by running the shuffle "backwards" and initializing the array as it is shuffled:


 * Let n := 1.
 * Choose a random k such that 1 ≤ k ≤ n.
 * If k ≠ n, let An := Ak.
 * Let Ak := n.
 * Increase n by one.
 * If n ≤ N, repeat from step 2.
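(Translated to 0-based Python, purely as my own illustration of the steps above, the variant might read:)

```python
import random

def random_permutation(n):
    """Return a random permutation of 0..n-1, initializing the array
    as it is shuffled (the "backwards" variant described above)."""
    a = []
    for i in range(n):
        k = random.randint(0, i)   # 0 <= k <= i
        if k == i:
            a.append(i)            # new value lands directly in the new slot
        else:
            a.append(a[k])         # old occupant of slot k moves to the new slot...
            a[k] = i               # ...and the new value takes slot k
    return a
```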


 * I haven't actually added this to the article yet, both because I want to try chasing down the original reference (Knuth, The Stanford GraphBase, 1994, p. 104) rather than cite the oblique mention in The Art of Computer Programming (3rd ed.) and because I want to clarify other parts of the article at the same time.


 * Similarly, I don't think Sattolo's algorithm really makes much sense when applied to any set that isn't ordered to begin with, although I've half a mind to split that one off into its own article anyway. Come to think of it, I do believe the "backwards" variant above can also be made to produce cyclic permutations by adjusting the range of k like in Sattolo's algorithm.  For that I have no cite, though I do believe I can sketch a proof.


 * As for the "sort with random comparator" approach, the reason I haven't mentioned which sorting algorithms produce unbiased results with it is that I'm not aware of any, though I'm willing to believe there could be some. In any case, though, those who attempt that approach usually do so using a built-in sort function whose implementation details may vary and thus can't be relied on.  It might be best to just remove the weaselly "some" from that paragraph entirely...  In any case, it really could use better references, though I don't know if the approach has actually been analysed in any academic publication.  —Ilmari Karonen (talk) 11:51, 15 August 2007 (UTC)


 * The "backwards" running of the shuffle (where the random element to be swapped is chosen from the previously traversed section of the array) is worthy of mention in the article. The big advantage of the "forward" running is that, since the final location of an element of the array is determined with each iteration, pigeon-hole analysis is made easier; specifically, it is clear that all N! permutations are possible.  It is interesting that such a minor change as choosing the random element from the "other half" of the array brings about quite a different process (the final location of no element is determined until the very last iteration), but it still results in a fair, random shuffle.  The validity of the "backwards" running shuffle is easier to show inductively.  (It can even be implemented recursively, calling itself to permute all but the last element of a list, then randomly swapping in that final element.)


 * This "backwards" running shuffle merits its own section because, as you said, despite its similarity it could almost be considered an entirely different algorithm. The algorithm should be introduced and its validity proven before presenting the trick of initializing the array on the fly when a permutation of a range of integers is desired.  -- Kirk Hilliard (talk) 06:42, 27 July 2009 (UTC)
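(As an illustration of the recursive formulation mentioned above; my own sketch, not sourced:)

```python
import random

def shuffle_recursive(a, n=None):
    """Permute a[0..n-1] in place by recursively permuting all but the
    last element, then swapping a random element into the last slot."""
    if n is None:
        n = len(a)
    if n <= 1:
        return
    shuffle_recursive(a, n - 1)
    j = random.randrange(n)        # 0 <= j <= n-1; j == n-1 is a no-op swap
    a[j], a[n - 1] = a[n - 1], a[j]
```

The inductive validity argument reads straight off the code: the recursive call yields each of the (n−1)! arrangements with equal probability, and the final swap places each element in the last slot with probability 1/n.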

Example Programs are Confused
"Below is the same algorithm in C#". The program below that statement is obviously not the same algorithm because all of the variables have been renamed in a contradictory manner. "n" is now "i", "k" is now "n". How on Earth is this supposed to make sense? Miqrogroove (talk) 22:41, 18 January 2008 (UTC)


 * I cleaned up the C# code to use the same variable names, but then decided to remove it entirely since the result was nearly identical to the Java version. Having multiple example implementations might not be a bad idea as such, but there's little point in providing minor variations for a bunch of different C-derived languages.  Let's at least have something with different-looking syntax, like Python for example.  —Ilmari Karonen (talk) 05:09, 20 January 2008 (UTC)

I don't think there is any reason for the Java implementation to index array elements at "i-1" - why not just have the loop go from "length - 1" to "1"? The way it is just adds confusion.--68.53.229.169 (talk) 02:44, 8 October 2010 (UTC)

Inaccuracy
"Also, of course, no pseudorandom number generator can produce more distinct sequences than there are distinct seed values it may be initialized with. Thus, it doesn't matter much if a generator has 1024 bits of internal state if it is only ever initialized with a 32-bit seed."

This is not necessarily true. If we switch to a different algorithm, such as the reservoir model (cite: Knuth vol 2 section 3.4.2 algorithm R), we only need our pseudorandom number generator to produce numbers [0 N) to produce all N! sequences. —Preceding unsigned comment added by 142.104.68.102 (talk) 23:51, 19 March 2008 (UTC)


 * Interesting, I didn't know that. But maybe what the original author wanted to say was that implementations probably do something like one seeding - one shuffle. Then the program terminates and the long period of the PRNG goes to waste. I understand that is why under Unix we should use the system-provided random (but I don't really know much about that). If we use a PRNG library in our program (e.g. a Mersenne Twister is provided for Free Pascal under Windows) we probably should better save the internal state of the PRNG (and reload+continue from there on the next run) instead of seeding every time we start our program. I hope that wasn't too off-topic. Reading about the potential implementation problems here was very helpful for me; I might have fallen into one of those traps. --Stevemiller (talk) 04:09, 27 March 2008 (UTC)

comparison with other algorithms
"However, care must be taken to ensure that the assigned random numbers are never duplicated, since sorting algorithms in general won't order elements randomly in case of a tie."

Isn't the whole point of the assign+sort algorithm that duplicates don't matter, since which pair of numbers is duplicated depends on the distance between occurrences of the same random number, which is of course equally random? The order of the duplicates would be implementation specific, but even if you knew that a particular platform always kept duplicates unswapped, you wouldn't know which pair are duplicates, so there is no loss of randomness.


 * I think it does matter -- consider where you only have two elements, and a tie occurs 2% of the time. Then the first element stays first 51% of the time.  Similarly, if you have three elements and occasional ties, the first element has a (slightly) higher-than-1/3 chance of staying first. not-just-yeti (talk) 21:33, 6 October 2008 (UTC)
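(To make the 2%-tie example above concrete, here is a small Python simulation, entirely my own, of sorting a two-element list with a stable sort and a random comparator that reports a tie 2% of the time:)

```python
import random
from functools import cmp_to_key

def random_cmp(x, y):
    """A 'random comparator': ties 2% of the time, otherwise a fair coin."""
    r = random.random()
    if r < 0.02:
        return 0                       # tie; the stable sort keeps the order
    return -1 if r < 0.51 else 1       # 49% each way

def estimate_first_stays(trials):
    """Fraction of trials in which the first of two elements stays first.
    Analytically: 0.02 (tie) + 0.49 (coin says keep) = 0.51."""
    stays = 0
    for _ in range(trials):
        if sorted([0, 1], key=cmp_to_key(random_cmp))[0] == 0:
            stays += 1
    return stays / trials
```

The estimate comes out near 0.51, matching the 51% figure above.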

Float resolution evenness
Article section "Modulo bias" begins : "A related problem occurs with implementations that first generate a random floating-point number ...". It may be worth considering possible consequences of the following :-

Looking at some Random [0..1), using IEEE Double, in browsers and elsewhere, I see some cases where the PRNG is 32-bit and converted to Doubles which are multiples of 2^-32 (poor, but OK), and some with 53 bits of PRNG making Double multiples of 2^-53 (no problem, if done correctly) - and I hear of one, in Mac iCab, which apparently takes a 64-bit PRNG, divides by 2^64, and returns the nearest Double.  In that final case, the absolute "fineness" of the Doubles varies with magnitude.  That could have some effect on results.

ASIDE : I see no links to Wikipedia on Random [0..1) as opposed to PRNGs generating integers. Many Wiki pages have a "see also" section near the end.  But as yet I see no page dealing with Random [0..1) (which would make a link even more useful, if such a page exists).

82.163.24.100 (talk) 15:23, 21 August 2008 (UTC)

PRNG state (moved from article page)
131.15.48.58 recently added the following comment to the article page: "Note on the above: While the advice about the size of internal state and distinct seed values is no doubt good, some details must be wrong. A simple random number generator with 7 internal states [1 .. 7] was able to produce 12 different shuffles of four values; that is more different resulting sequences than internal states. Is there perhaps an assumption that every shuffle starts from the same sequence? My experiment reshuffled the already shuffled sequence." I've removed the comment from the article and placed it here where it belongs. The answer, incidentally, is yes: in the Limited PRNG state space section, I did assume that the array being shuffled, as well as the internal state of the PRNG, would be reset between runs. If you store the shuffled array and use it as the initial state for the next shuffle, you'll do much better: in fact, you've essentially made the array part of the PRNG state, increasing the effective period by a factor of up to N! (and probably at least a significant fraction of that), where N is the length of the array.

(This is precisely why the often half-hearted shuffling of real card decks manages to produce such a good semblance of randomness: depending on the game, the deck usually retains significant randomness even after a round of play, and just needs a bit of mixing to appear mostly random again. Effectively, the deck itself acts as a roughly 226-bit PRNG (or more accurately, since some true randomness is being injected, an entropy pool), with the shuffle plus the game forming the transition function.  This is also why a new, sorted deck needs extra shuffling before use.  The Solitaire cipher is based on a similar idea.)  —Ilmari Karonen (talk) 23:22, 11 September 2008 (UTC)

Proof?
Is there a proof or something? I think a concise mathematical proof should be added to the article because at the moment there are just statements and nothing to back them up... 85.3.133.216 (talk) 14:23, 26 September 2009 (UTC)


 * Proofs of correctness can be found in the referenced sources. In any case, a simple proof follows from the observation that
 * there are n! distinct possible sequences of random numbers which may be generated during the shuffling of an n-element array,
 * each of these sequences is generated with the same probability,
 * each of these sequences generates a different permutation of the array, and
 * there are exactly n! different possible permutations of the array.
 * Thus, the algorithm must generate every possible permutation with the same probability, Q.E.D. —Ilmari Karonen (talk) 16:17, 31 December 2009 (UTC)
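(The counting argument can also be sanity-checked empirically for small n; here is a throwaway Python check of mine. Note that CPython's random.shuffle is, to my knowledge, itself a Fisher-Yates loop:)

```python
import random
from collections import Counter

n, trials = 4, 240_000
counts = Counter()
for _ in range(trials):
    a = list(range(n))
    random.shuffle(a)          # CPython implements this as a Fisher-Yates loop
    counts[tuple(a)] += 1

# Every one of the 4! = 24 permutations should appear, each with a
# frequency close to trials / 24 = 10000.
assert len(counts) == 24
```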

Java example code appears to be wrong
The Java version has the comment:

int k = rng.nextInt(n); // 0 <= k <= n.

But actually Java's nextInt(n) method returns a k such that 0 <= k < n. And this fact is vitally important! If k == n, then we would index outside of the array. This caught me when translating the Java code into another language, and I used the comment at face value. —Preceding unsigned comment added by 79.78.227.249 (talk) 12:03, 3 October 2009 (UTC)


 * Note the most recent edit to the code. Did that leave the code correct but the comment broken? -- ToET 13:15, 3 October 2009 (UTC)
 * I've updated the comment to match the code change.-- ToET 13:23, 3 October 2009 (UTC)

Not O(n)
The shuffle can't be O(n), because it requires us to generate random numbers in the ranges [1, 1], [1, 2], ..., [1, n]. Since generating a random number in the range [1, x] requires log2(x) random bits, the total number of random bits needed is log2(1) + log2(2) + log2(3) + ... + log2(n) = log2(n!) = O(n log n). A PRNG or HRNG can't possibly run in less than constant time per bit, so the shuffle as a whole can't be faster than O(n log n). NB: Since a list of length n has n! permutations and it takes log2(n!) random bits to choose among them, similar reasoning shows that no unbiased shuffle can be faster than O(n log n). Neon Merlin  22:46, 17 December 2009 (UTC)

By convention we typically agree that fundamental numeric operations take constant time, since they are of fixed width on real computers. Otherwise, multiplication needs to be treated as an O(n lg n) operation (although typical implementations are O(n^2)), which means that array accesses also do. This would make a typical in-place array sorting algorithm O(n lg^2 n), since it does O(n lg n) swaps, each with a cost of (lg n) for the array indexing. So Fisher-Yates would still win with cost O(n lg n).

If you're shuffling large enough arrays that you need an arbitrary-precision random number generator and math library you'll see a logarithmic slowdown or worse, for both sort-shuffle and FY shuffle. This is not the common case, and we typically describe the algorithms as O(n lg n) and O(n) respectively --69.168.48.167 (talk) 20:29, 25 December 2009 (UTC)


 * When you say "we typically agree that fundamental numeric operations take constant time", do you mean that in the limiting case memory words are infinitely long and the supply of input entropy (if we're using an HRNG) is infinite? Neon  Merlin  08:26, 31 December 2009 (UTC)

Original Research
Since no source is cited for the (various language) implementations, then surely these constitute "Original Research" and should be struck out. (Note : I don't disagree with this sort of thing being posted ; but I've been slapped down often enough for posting "original research" that I'd like to hear how/why this shouldn't suffer the same fate.) —Preceding unsigned comment added by A Karley (talk • contribs) 13:41, 1 March 2010 (UTC)

Too many examples.
Why are there examples in so many different languages? It gets in the way of the point of the article (which is shuffling, not a comparison of different languages). Also, as demonstrated in the other discussion topics for this page, it leads to several incorrect and/or contradictory examples. There should be one example and it should be an ideal implementation of this algorithm in pseudocode. Any competent programmer can then implement it in their language of choice for their use.

Adam Butt (talk) 19:26, 1 March 2010 (UTC)


 * Ditto, you should feel free to get rid of extraneous examples. I've cleared out many. — sligocki (talk) 04:54, 2 March 2010 (UTC)


 * I agree that all the explicit code serves little purpose. Note that pseudocode is already there, for the main cases. However, some stuff in the article refers to previous (or removed) code examples, so care should be taken to leave/bring the text in a consistent state. Marc van Leeuwen (talk) 13:13, 2 March 2010 (UTC)

Python example seems wrong
The Python code, as given, produces an index error in "items". To fix, in the expression "range(2, len(items) + 1)" on line 4, remove "+ 1" to get: "range(2, len(items))".

The function range(start, end[, step]) returns a list containing the integers from start up to the one preceding end, with the option of controlling step size and direction with the third argument. Indices in Python start with 0, therefore the last element has an index of len(items) - 1. Therefore, "range(2, len(items) + 1)" will make the program try to access the element behind the last, which does not exist.

Also, the description of the "modern algorithm" calls for randomization to start with index 1, not 2, for 0-based indices (where an index of 1 means the _second_ element), which means the 2 should get replaced by a 1.

All of this in mind, I suggest to change the complete line to read (without the quotes):

"for n in range(len(items)-1, 0, -1):"

This will count down from the last element to the second (inclusive). Getting the reversed list right away seems more efficient than having to iterate a second time just to reverse.
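(For the record, the complete loop with the suggested countdown would read as follows; randrange excludes its upper bound, hence the i + 1. My rendering, kept deliberately close to the article's pseudocode:)

```python
import random

items = list(range(10))
for i in range(len(items) - 1, 0, -1):   # last index down to 1, inclusive
    j = random.randrange(i + 1)          # 0 <= j <= i
    items[j], items[i] = items[i], items[j]
```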

Til 85.178.23.84 (talk) 19:20, 23 April 2010 (UTC)


 * Yeah, sorry for the confusion over the mistake. I have reverted. Probably better not to try to make this more pythonic, it just needs to be an algorithm which is clear the way it is. Cheers, — sligocki (talk) 04:19, 24 April 2010 (UTC)


 * Agreed, good point about the need for clarity here. The algorithm's much less obscure without the python specifics. Til 85.178.23.84 (talk) 09:05, 24 April 2010 (UTC)

inside-out method doesn't work
Either the pseudocode for the inside-out method isn't clear to me, or there's something wrong with it. I read the leftward arrows as assignments. But this would lead to a null assignment with 50% probability on the first iteration, so something is wrong. To be sure, I implemented it, and, as expected, it gave an invalid answer.

Again, I may just be reading it wrong. But in that case, the pseudocode is not well-written. (This commentator is an engineer with 16 years experience.) — Preceding unsigned comment added by 140.180.191.41 (talk) 05:08, 23 February 2012 (UTC)


 * Pseudocode looks OK to me. At the first iteration, there should be a 50% chance of a null exchange: it's adding a second card to get a two-card deck, and that new card is equally likely to end up as the top card (position 0) or the bottom card (position 1/null exchange). At iteration i, there should be a 1/(i+1) probability of a null exchange.
 * I don't know what you mean by an invalid answer. Are you using a language that raises an exception when an uninitialized value is referenced? The first assignment need only be done if i!=j. Are you calling  instead of  ?
 * Glrx (talk) 17:03, 24 February 2012 (UTC)

I suppose the loop should go from 0 to n-1, that might be the problem. --134.76.205.137 (talk) 09:51, 4 May 2012 (UTC)


 * Did you actually read the text that precedes the code? It says "In case the random position happens to be number i, this "move" (to the same place) involves an uninitialised value, but that does not matter, as the value is then immediately overwritten." Performing a[i] ← a[i] is a no-op regardless; a[i] will not be any more or less undefined after executing it than before it. Marc van Leeuwen (talk) 14:11, 4 May 2012 (UTC)
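(A Python rendering of the inside-out loop, my own, with an explicit placeholder standing in for the uninitialised slot, makes the point visible:)

```python
import random

def inside_out(source):
    """Return a randomly shuffled copy of source (inside-out method)."""
    n = len(source)
    a = [None] * n                # None stands for "uninitialised"
    for i in range(n):
        j = random.randint(0, i)  # 0 <= j <= i
        a[i] = a[j]               # when j == i this copies the stale None...
        a[j] = source[i]          # ...which is immediately overwritten here
    return a
```

When j == i the first assignment is a no-op on an uninitialised value, and the second assignment immediately overwrites it, exactly as the article text says.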

Circular definition codes
My mind emits

on:
 * j ← random integer with 0 ≤ j ≤ i

Proposed bug fix:
 * j ← random integer between 0 and i, limits included

Rursus dixit. ( m bork3 !) 19:55, 12 August 2012 (UTC)


 * No, better:


 * j ← random integer from interval [0, i]


 * Rursus dixit. ( m bork3 !) 19:58, 12 August 2012 (UTC)

Illegal offset by one in examples
I don't want to change anything yet since it might be me who is being stupid. But I'm fairly sure this is wrong in at least the modern algorithm:

j ← random integer with 0 ≤ j ≤ i

This should be non-inclusive of i, as in j ← random integer with 0 ≤ j < i. There are probably errors in the other examples as well, but I haven't verified them. How do I know this? I tried it. It doesn't work if i is inclusive; it does seem to work if it's not inclusive. — Preceding unsigned comment added by 213.103.201.140 (talk) 18:48, 29 March 2013 (UTC)


 * I don't see a problem. In the first algorithm, the outer loop has i march down from n−1, so an inclusive random is not out of range for the array. Glrx (talk) 18:23, 30 March 2013 (UTC)

is there error in Python example?
j = randrange(i)  # 0 <= j <= i-1
items[j], items[i] = items[i], items[j]

In this code j will never be equal to i, so there is no chance for an element to stay at its position in the current iteration. I tested this example, and it did not produce a uniform distribution. The right example seems to be:

j = randrange(i+1)  # 0 <= j <= i
items[j], items[i] = items[i], items[j]

— Preceding unsigned comment added by 84.245.209.190 (talk) 13:24, 30 October 2013 (UTC)
 * Please read the whole “Sattolo's algorithm” section again, carefully. The fact that the element cannot stay at its current position is the whole point of this algorithm. It is supposed to generate the uniform distribution on n-cycles, not on arbitrary permutations.—Emil J. 14:14, 30 October 2013 (UTC)


 * Here are my tests: http://pastebin.com/rgB85fz6
 * Test procedure: t times, perform a shuffle of the array [0..9] and sum the values at corresponding positions - the expectation is an array of 10 elements with similar numbers in it.
 * The first function generates a uniform distribution, but the function from the example doesn't. — Preceding unsigned comment added by ‎84.245.209.190 (talk) 16:01, 30 October 2013‎ (UTC)
 * I don’t see any test results on that page, and I’m not going to read through your Python code. For any i < n, the expected position where i is moved by a random n-cycle on {0, ..., n − 1} is $$\frac n2-\frac i{n-1}$$, whereas for a random permutation it is $$\frac{n-1}2$$, so you should get different results from these two cases.—Emil J. 16:30, 30 October 2013 (UTC)

Results are different.

100000 arrays of [0..9], have been shuffled and summed: j = randrange(i) [499261, 489663, 476216, 465605, 456629, 445249, 434744, 421796, 409994, 400843]

j = randrange(i+1) [449078, 451217, 449838, 449029, 449888, 449720, 450071, 450504, 451427, 449228] — Preceding unsigned comment added by 84.245.209.190 (talk) 16:45, 30 October 2013 (UTC)
 * Right, so the results confirm what I am telling you, the algorithm behaves correctly.—Emil J. 16:50, 30 October 2013 (UTC)
 * i'm totally wrong, advice "read again, carefully" was very good one... — Preceding unsigned comment added by 84.245.209.190 (talk) 17:12, 30 October 2013 (UTC)
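(For future readers of this thread, here is a minimal Python sketch, my own, of Sattolo's variant together with a check that every output is a single n-cycle:)

```python
import random

def sattolo(a):
    """Sattolo's variant: j is chosen strictly below i, so every element
    is forced to move; the result is a uniform random n-cycle."""
    for i in range(len(a) - 1, 0, -1):
        j = random.randrange(i)   # 0 <= j < i; i itself is excluded
        a[i], a[j] = a[j], a[i]

def is_single_cycle(p):
    """True iff permutation p (as a list) is one cycle through all elements."""
    seen, k = 0, 0
    for _ in range(len(p)):
        k = p[k]
        seen += 1
        if k == 0:
            break
    return seen == len(p)
```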

Difference between "permutation" (Fisher-Yates) and "cycle" (Sattolo)
This is a great article overall, but I have a little problem. Maybe it's I who is dumb, but the difference between those is not clear. It just says you can accidentally implement Sattolo instead of Fisher-Yates, but it doesn't explain, for people who aren't familiar with math theory, what the difference is between a "permutation" and a "cycle", from a computer science (not math) perspective.85.1.241.60 (talk) 23:21, 20 January 2014 (UTC)

Inside-Out Adjustment
I'm tempted to adjust the Inside-Out shuffle to the following as the loop's first cycle is pointless. Any thoughts? 203.109.198.155 (talk) 21:43, 19 May 2014 (UTC)

To initialize an array a of n elements to a randomly shuffled copy of source, both 0-based:

a[0] ← source[0]
for i from 1 to n − 1 do
    j ← random integer with 0 ≤ j ≤ i
    if j ≠ i
        a[i] ← a[j]
    a[j] ← source[i]


 * Keep existing. The pointless part of the first iteration shows a lot about how the algorithm works. The if statement was added to avoid an uninitialized reference, but makes the algorithm look more complicated. With the change, the code would have a bounds check when asked to shuffle zero elements; the original code handles that case. Glrx (talk) 21:57, 19 May 2014 (UTC)
 * Keep existing. I agree with Glrx. This premature micro-optimization (pulling out one iteration of the loop to save a single if statement) clutters up the code and causes a crash when n = 0. We should be aiming for readability and correctness here, not trying to squeeze out small increments of speed. —David Eppstein (talk) 22:35, 19 May 2014 (UTC)

Fair enough. I'll leave it as is. 203.109.198.155 (talk) 04:52, 27 May 2014 (UTC)


 * Modify. As regards readability, Mister Eppstein and User_talk:Glrx, what is the point of using a zero-based array? A common countdown-loop termination in assembler looks like this:

dec   ecx
jcxnz shuffle

In other words, the loop does not cycle when n=0, so you are using a one-based array, which is convenient in threaded contexts, where you want to store flags for an array like: counted, shuffled, sorted, saved, read-only: I would store the flags in the zero-element. Zero-based countdown loops are misguided attempts at efficiency, and in some languages, yes, also a bounds-check failure. Plus, I would be apt to make off-by-one errors. 70.74.164.9 (talk) 20:49, 6 February 2015 (UTC)
 * Do you really do most of your programming in assembler? I did, long ago, but I don't think most of the readers of this article are likely to. Most modern programming languages (C/C++/Java/Python/etc) use zero-based indexing, so we should use it too to avoid unnecessarily confusing readers who program in those languages. —David Eppstein (talk) 21:40, 6 February 2015 (UTC)

Plain English Description?
Pseudorandom number generator

Start with such a thing. Scale output to number of tunes. Use series calculated as an index for numbers to exchange in an array of counting to avoid duplication. The other index is another count. Get a list of tunes in track or alphabetical order (which might be the same). Number tunes according to your random series. Radix Sort tunes by numbers.

If you run WinAmp, press misc/sort/randomize list, then play the tunes *in order* ("123" is dark), then you can see some of this.

70.74.164.9 (talk) 15:41, 5 February 2015 (UTC)

last step
for i from 0 to n − 1 do
    j ← random integer with i ≤ j < n
    exchange a[j] and a[i]

In this code the last iteration will always do nothing, because j will be equal to i; therefore the loop might as well go only to n − 2. Norill (talk) 16:53, 7 April 2015 (UTC)

Separate (Additional) article specifically for Modulo Bias?
Are there other ways modulo bias can be avoided/minimized? Should there be a separate article (or at least a redirect) specifically for it? I just ran into the term while looking for information about the ObjC/Swift PRNG function arc4random_uniform, which is supposed to avoid it. (So I don't know enough about it to WP:BOLD :) Jimw338 (talk) 07:10, 4 July 2015 (UTC)
 * Rejection sampling. Glrx (talk) 17:56, 7 July 2015 (UTC)

Another (equivalent?) algorithm
This variation recently occurred to me:

-- Randomly shuffle array a[0...n-1]
for i from 1 to n-1 do
    r ← random value, where 0 ≤ r < 1
    if r < 0.5 then exchange a[0] and a[i]

The loop shuffles the array by randomly deciding whether or not to swap the first card with the 2nd card, then the 3rd card, and so on until the last card is reached; each swap has a 50% chance of occurring. If this is indeed a valid working algorithm, then I am certainly not the first one to think of it, so it undoubtedly has a formal name. (I am not entirely certain, however, if it is capable of producing all n! permutations of the elements of a.) If it is, then the question is, is this algorithm similar enough to the F-Y algorithm to be considered a variation of it, and thus be worthy of mention in the article? — Loadmaster (talk) 18:08, 29 December 2015 (UTC)


 * WP is not a place for your WP:OR. You need to have reliable sources for algorithms. I doubt you'll find a reliable source for the above nonsense. The last element, a[n−1], can only end up in two places rather than n places. You should also consider that $n! \gg 2^{n-1}$, and the scheme consumes only n − 1 random bits. And that doesn't even start to address bias: a[0] is exponentially more likely to make it to position 1 than to position n−1. Glrx (talk) 18:26, 29 December 2015 (UTC)


 * Well, thank you for the positive critique. Just thought I'd ask, since I was curious about it. I did not intend it to be OR, but rather thought that it might be another existing variant of F-Y. But as you show, it is not. — Loadmaster (talk) 19:13, 29 December 2015 (UTC)
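(Glrx's first objection is easy to check empirically; the following quick Python count, entirely my own, tracks where the element that starts in the last slot can end up:)

```python
import random

def coin_flip_shuffle(a):
    """The proposed variation: one fair coin flip per i decides whether
    to exchange a[0] and a[i]."""
    for i in range(1, len(a)):
        if random.random() < 0.5:
            a[0], a[i] = a[i], a[0]

n = 5
positions = set()
for _ in range(2000):
    a = list(range(n))
    coin_flip_shuffle(a)
    positions.add(a.index(n - 1))  # final position of the element that started last
```

The set positions comes out as {0, n−1}: only two of the n slots are reachable for that element, so the scheme cannot produce all n! permutations uniformly.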

Is "inside-out" really Fisher-Yates?
I can't see much relationship between them. If inside-out is Fisher-Yates, then what algorithm to choose uniformly from the $n!$ options is not Fisher-Yates? --73.93.124.14 (talk) 07:19, 4 August 2016 (UTC)

Comparison: assign random numbers + radix sort
The section "Comparison with other shuffling algorithms" suggests that a reasonable alternative algorithm is to assign a random number to each element of the permutation and then sort the permutation based on those random numbers, and notes that this might be preferable in some cases due to being parallelizable.

However, this relies on assigning unique random numbers to each element. But the most efficient way that I know to assign N random numbers that are guaranteed to be unique is...to create a random permutation of the numbers 1..N. And obviously, if the first step of your random-permutation algorithm is to create an (equal-size) random permutation, then you have only moved the problem, not solved it.

Is there an efficient way to assign unique random tags that does not require creating a random permutation (or doing equivalent work)? If so, perhaps the page should explain how. (Maybe with example code?) If not, then I don't see how this can be considered a valid alternative. - Antistone (talk) 07:34, 11 August 2016 (UTC)
 * The radix sort algorithm still works if you make the range of the random numbers bigger, say 1..N³. That's big enough that the probability of getting a collision and having to repeat is O(1/N), negligibly small. —David Eppstein (talk) 15:55, 11 August 2016 (UTC)
 * I agree with David Eppstein. Also, one could assign random numbers, sort the array, find all those blocks that got the same random number by chance, and then just shuffle those blocks. Glrx (talk) 23:56, 17 August 2016 (UTC)
 * You can create a set of unique numbers without shuffling a list using format-preserving encryption and a counter. In fact if you have such an algorithm, a key, and a function to map integers 0..(n-1) to the values you want shuffled then you needn't use an array at all.  You simply calculate the value at index i as a function of i when it's needed. --73.93.124.14 (talk) 23:31, 4 March 2017 (UTC)
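The draw-keys-then-sort approach with the retry-on-collision strategy described above can be sketched as follows (Python; this uses the built-in comparison sort rather than a radix sort for brevity, and the key range n³ follows the suggestion above):

```python
import random

def sort_shuffle(items):
    """Shuffle by sorting on random keys drawn from a range of size n^3.

    If any two keys collide (probability O(1/n)), redraw all keys and
    retry, so the result remains a uniform random permutation.
    """
    n = len(items)
    if n == 0:
        return []
    while True:
        keys = [random.randrange(n ** 3) for _ in items]
        if len(set(keys)) == n:            # all keys distinct: safe to sort
            paired = sorted(zip(keys, items))
            return [item for _, item in paired]

print(sort_shuffle(list(range(10))))
```

Since the keys are guaranteed distinct before sorting, ties never reach the comparison on the second tuple element, and every permutation of the items is equally likely.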

I don't see how radix sort can sort the random numbers in O(n) time as claimed in the article. The random numbers need at least log_2(n) bits, so radix sort will require at least log_2(n) passes, giving O(n log n) time. — Preceding unsigned comment added by 2A02:1210:2E16:4D00:6D2A:ACA8:919E:62C4 (talk) 05:17, 4 October 2022 (UTC)


 * Let me guess: You are using one of those bad algorithm texts that presents radix sort only on binary numbers, sorted one bit at a time, so that its running time, $O(n\log_2 U)$, is never faster than a comparison sort whenever the range of values is large enough to allow all inputs to be distinct?
 * That might be a useful way to start to understand the algorithm, but it is not the correct way to implement radix sort. Instead you should use a radix roughly proportional to $n$ (usually a nearby power of two, so that the digit extraction can be done quickly with bitwise shifts and masks). Then the running time is $O(n\log_n U) = O((n\log U)/\log n)$, and in this case, when $U = n^3$, you do indeed get $O(n)$ running time. —David Eppstein (talk) 05:56, 4 October 2022 (UTC)
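The radix-proportional-to-n point can be illustrated with a small sketch (Python, not from the discussion; plain division is used instead of shifts and masks for clarity). Keys below n³ have exactly three base-n digits, so LSD radix sort finishes in three O(n) counting-sort passes:

```python
def radix_sort_keys(keys, n):
    """LSD radix sort of integer keys in [0, n^3) using radix n: 3 passes.

    Each pass is a stable bucket/counting sort on one base-n digit, so
    the total work is O(n) rather than a comparison sort's O(n log n).
    """
    for digit in range(3):                 # keys < n^3 have 3 base-n digits
        buckets = [[] for _ in range(n)]
        for k in keys:
            buckets[(k // n ** digit) % n].append(k)
        keys = [k for bucket in buckets for k in bucket]
    return keys

keys = [37, 2, 55, 14, 0, 63, 21, 49]      # 8 keys, all < 8^3 = 512
print(radix_sort_keys(keys, len(keys)))    # sorted ascending
```

In a real implementation one would round n up to a power of two so each digit can be extracted with a shift and a mask; the pass count and asymptotics are unchanged.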

"Example" section: Modern Method builds output in reverse order
The description of the "modern method" says nothing about inserting to the output list at the head. The previous example appends the newly selected item to the end of the list, which seems clearer.

Is there a specific reason to put the newly selected number at the beginning of the output list?

<< For our first roll, we roll a random number from 1 to 8: this time it is 6, so we swap the 6th and 8th numbers in the list:

The next random number we roll from 1 to 7, and turns out to be 2. Thus, we swap the 2nd and 7th numbers and move on:

This should be

to match the prior example. Mjshulman (talk) 23:05, 24 November 2017 (UTC)

Comparison: misleading statements
The comparison section states that methods like merge sort are not uniform, without any reference. It then states that coin flipping a bounded number of times cannot produce uniform results, but then provides a calculation only for the situation with a fixed number of coin flips. Since merge sort does not need a fixed number of coin flips, the comparison does not apply and the statements are misleading.

I suggest that those un-referenced and IMO misleading statements are removed. 90.145.31.194 (talk) 16:28, 8 February 2019 (UTC)
 * The argument is correct, and applies whenever the number of coin flips is bounded, not just when it is fixed. (A bounded coin flip algorithm can be thought of as one that flips a fixed number of coins, the number given by the bound, but then might not look at some of them, so the two models are not really different.) In merge sort with coin flips replacing the comparisons, the number of flips is indeed bounded. So you can only get probabilities per permutation that are dyadic rational numbers (with powers of two as denominators). But the correct probabilities have other denominators, so this algorithm does not generate the correct probabilities. —David Eppstein (talk) 17:11, 8 February 2019 (UTC)


 * Thanks for the clarification. I still think more references are required, since the only reference now is the specific case of the Microsoft Browser Selection with 5 options. 90.145.31.194 (talk) 12:56, 12 February 2019 (UTC)
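The dyadic-rational argument above can be checked exhaustively for n = 3 (a Python sketch, not a cited reference: each comparison in the merge is replaced by a coin flip, and all 2³ = 8 flip sequences are enumerated, since this merge sort uses at most 3 comparisons):

```python
import itertools
from collections import Counter

def merge_sort(items, flips):
    """Merge sort where each comparison is replaced by the next coin flip."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid], flips)
    right = merge_sort(items[mid:], flips)
    out = []
    while left and right:
        # a coin flip decides which side "wins" the comparison
        if next(flips):
            out.append(left.pop(0))
        else:
            out.append(right.pop(0))
    return out + left + right

counts = Counter()
for seq in itertools.product([True, False], repeat=3):
    counts[tuple(merge_sort([1, 2, 3], iter(seq)))] += 1

print(counts)   # 8 equally likely flip sequences over 6 permutations
```

Each permutation's probability is its count divided by 8, a dyadic rational, so no permutation can get the required probability 1/6; the tally is necessarily uneven.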

Need to compare with naive/simple method
[Edit: OK I added a short section with code example of what not to do. Hopefully it will help prevent readers from using a naive method.] — Preceding unsigned comment added by Ericjster (talk • contribs) 07:13, 9 December 2019 (UTC)

Readers need to be told that the naive method is biased, and it should be clear just sweeping over the article. By naive method I mean:

for (i = 0; i < n; i++) {
    swap i with rand(0...n-1)   // naive method -- bad, use Fisher-Yates instead
}

Currently it seems not to be even mentioned. Good reference with details:
 * https://blog.codinghorror.com/the-danger-of-naivete/
 * https://stackoverflow.com/questions/859253/why-does-this-simple-shuffle-algorithm-produce-biased-results-what-is-a-simple (2nd answer) — Preceding unsigned comment added by Ericjster (talk • contribs) 06:42, 9 December 2019 (UTC)
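The bias of the naive method can be demonstrated without any randomness by enumerating every possible sequence of "random" draws for n = 3 (a Python sketch along the lines of the references above):

```python
import itertools
from collections import Counter

def tally(shuffle, draw_ranges):
    """Run `shuffle` once for every possible sequence of 'random' draws."""
    counts = Counter()
    for draws in itertools.product(*draw_ranges):
        a = [0, 1, 2]
        shuffle(a, iter(draws))
        counts[tuple(a)] += 1
    return counts

def naive(a, draws):
    for i in range(len(a)):
        j = next(draws)                  # rand(0..n-1): the range never shrinks
        a[i], a[j] = a[j], a[i]

def fisher_yates(a, draws):
    for i in range(len(a) - 1, 0, -1):
        j = next(draws)                  # rand(0..i): the range shrinks each step
        a[i], a[j] = a[j], a[i]

n = 3
naive_counts = tally(naive, [range(n)] * n)                                   # 3^3 = 27 runs
fy_counts = tally(fisher_yates, [range(i + 1) for i in range(n - 1, 0, -1)])  # 3*2 = 6 runs

print(sorted(naive_counts.values()))   # unequal counts: biased
print(sorted(fy_counts.values()))      # all equal: uniform
```

The naive method produces 3^n equally likely outcomes, and 3^n is not divisible by n! for n > 2, so it cannot be uniform; Fisher-Yates produces exactly n! equally likely outcomes, one per permutation.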

Low end?
Sorry but what does "counting from the low end" mean? I googled this phrase and this article is the only time this phrase is used, on the entire internet. — Preceding unsigned comment added by 82.46.42.169 (talk) 10:02, 16 July 2020 (UTC)

“inside-out” seems to be biased, no truly random shuffle
The “inside-out” algorithm always begins by copying the first element of the source to the first element of the destination. It then relies on random chance to swap it out later. This seems to be biased in favour of keeping it there. The probabilities for each element don’t look like an even 50% either. Some proofs would be welcome.

mirabilos (talk) 20:15, 26 September 2020 (UTC)
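One way to check the claim is to enumerate every possible sequence of random draws for a small n. A Python sketch of the inside-out algorithm as described in the article (n = 3; at step i the draw ranges over 0..i, so there are 1 × 2 × 3 = 6 equally likely runs):

```python
import itertools
from collections import Counter

def inside_out(source, draws):
    """Inside-out shuffle: build a shuffled copy of `source`."""
    a = [None] * len(source)
    for i in range(len(source)):
        j = next(draws)        # j drawn uniformly from 0..i
        a[i] = a[j]            # a no-op when j == i (overwritten on the next line)
        a[j] = source[i]
    return a

n = 3
counts = Counter()
for draws in itertools.product(*(range(i + 1) for i in range(n))):
    counts[tuple(inside_out(list(range(n)), iter(draws)))] += 1

print(counts)   # each of the 3! = 6 permutations appears exactly once
```

Each of the 6 equally likely draw sequences yields a different permutation, so for this n the algorithm is exactly uniform; readers can rerun the enumeration for larger n.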

First example graphic
On the one hand, shuffling 'sycophant' into 'actsphony' is amusing, but given that it's clearly been chosen rather than being random, is it the best example that there could be? Lovingboth (talk) 10:16, 20 October 2021 (UTC)