Talk:Binary search/Archive 2

Original Wirth's code restored!
Dear Editors!

Please see the original book by N. Wirth before your edits:

You wrote:

k := (i + j) shr 1;

But N.Wirth wrote:

k := (i + j) div 2;

Did you know that "shr" was not a standard operator in the standard Pascal defined by Wirth?

You wrote:

until A[k] = x or i > j;

But N.Wirth wrote:

until (A[k] = x) or (i > j);

Did you know that even in modern OOP Delphi Pascal the expression "A[k] = x or i > j" would be wrong? -- the first step may be "x or i"...

Please, see sources before your original edits!!!--Tim32 (talk) 23:17, 6 May 2009 (UTC)


 * Yes indeed, Pascal compilers require brackets around the terms of compound boolean expressions. Yes, the use of shift operations instead of division by (powers of) two adds unnecessary obscurity in the form of something the compiler should handle anyway. And the referenced version suffers from the overflow error for large arrays, engages in unnecessary extra effort involving L and R (the reference uses the unhelpful names i and j), has to mess about with k ± 1 because it uses inclusive reckoning, and with its single comparison per iteration abandons the opportunity for early discovery of a match, so that its average performance is proportional to log(N) rather than log(N) - 1. All of this has been endlessly argued over but, since no reference presents such arguments, they are Original Research and so should be purged from the article along with the deviant code examples. NickyMcLean (talk) 04:43, 9 May 2009 (UTC)
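For reference, Wirth's repeat-until formulation (with the parenthesised condition and div 2, as in the book) can be rendered in C roughly as follows. This is a sketch for illustration, not the book's code; the function name wirth_search and the 0-based bounds are my own choices:

```c
#include <assert.h>

/* Sketch of Wirth's binary search (inclusive bounds, repeat-until style),
   translated from the Pascal fragment quoted above. Returns the index of
   x in the sorted array A[0..n-1], or -1 if absent. */
int wirth_search(const int A[], int n, int x)
{
    if (n == 0) return -1;             /* guard the empty-array probe */
    int i = 0, j = n - 1, k;
    do {
        k = (i + j) / 2;               /* div 2, as Wirth wrote, not shr */
        if (A[k] < x) i = k + 1;
        else          j = k - 1;
    } while (A[k] != x && i <= j);     /* until (A[k] = x) or (i > j) */
    return (A[k] == x) ? k : -1;
}
```

Note that in C the parenthesisation issue does not arise, because relational operators bind more tightly than && and ||; in Pascal the brackets are mandatory because "or" binds more tightly than the relational operators, so A[k] = x or i > j would parse as A[k] = ((x or i) > j).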

Single-comparison code is wrong
 // high == low, using high or low depends on taste
 if ((low < N) && (A[low] == value))

The low < N test could only fail if N = 0, in which case the first test of low < high would have evaluated to false (and the comment "high == low" would be untrue also). But if such a check is necessary in the context of the program, it probably should be done before this procedure is called. —Preceding unsigned comment added by 128.149.77.217 (talk) 21:20, 16 July 2009 (UTC)


 * Not in the last element but after the last element! You'll see that the search boundaries in all other examples go from 0 to N-1 or from 1 to N. Here it is from 0 to N, making N an invalid index and thus you have to check for that possibility before trying to access A[N]. This is probably intentionally done so to allow for binary inserts. —Preceding unsigned comment added by 80.218.135.235 (talk) 16:02, 13 October 2009 (UTC)

Binary Inserts
I have not found any references to binary inserts in this article, i.e. at what position new elements should be inserted into a sorted list or array. All shown examples simply return -1 when a given value is not found.

What I need is a binary search function that returns true or false according to whether a given value was found _AND_ the position it was found at or where it should be inserted. I think this is a commonly used and very minor extension to the standard binary search and really belongs in this article!

In fact, the last example about single comparisons was intended to support this, thus the range from 0 to N (N+1 elements) to allow insertion before the first and after the last element, but someone (the original poster?) castrated it to make it conform to the other examples.

Here is how the lower part should look like:

 // high == low, using high or low depends on taste
 if ((low < N) && (A[low] == value))
     return true, low    // found
 else
     return false, low   // not found

I realise this is sort of inconvenient since most languages do not allow returning multiple values. Maybe return -low-1 when not found? 80.218.135.235 (talk) 16:47, 13 October 2009 (UTC)
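For what it's worth, the "-low-1" idea suggested above is exactly the convention Java's Arrays.binarySearch settled on: a found value yields its index, a missing value yields -(insertion point) - 1, which is always negative. A C sketch (the name search_or_insert is invented here; the half-open 0..N range matches the version discussed above):

```c
#include <assert.h>

/* Sketch: binary search over sorted A[0..n-1] that encodes the insertion
   point on failure. Returns the index of value if found; otherwise
   returns -(insertion_point) - 1, always negative, so the caller can
   recover the slot as -(result) - 1. */
int search_or_insert(const int A[], int n, int value)
{
    int low = 0, high = n;                /* half-open range [low, high) */
    while (low < high) {
        int mid = low + (high - low) / 2;
        if (A[mid] < value) low = mid + 1;
        else                high = mid;
    }
    if (low < n && A[low] == value) return low;   /* found */
    return -low - 1;                              /* insert before A[low] */
}
```

This answers the "multiple return values" objection: one integer carries both the found/not-found flag (the sign) and the position.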


 * If you look at the three-way-IF version (in Fortran) while it yet survives you'll see that it returns -L for "Not Found" working on an array indexed 1:N. This is precisely so that one may determine where to insert a missing value without having to search the array afresh to re-find the insertion position and indeed I have used this ability. Others have questioned why -L was returned instead of -1 or 0. The point was carefully discussed as one of the possibilities of the "Extensions" section, but that discussion was removed by someone who had no interest in the matter. Similarly, there had been a discussion of functions returning compound results (such as you describe), and it too was removed by the simplifiers.NickyMcLean (talk) 19:44, 13 October 2009 (UTC)


 * Thanks for clearing that up. I hope there will be an Extensions section again. 80.218.135.235 (talk) 15:41, 14 October 2009 (UTC)

overuse of OR tag
Those kinds of things are allowed. Go check out almost every science article with a diagram. Just as examples in the same subject, check out the diagrams in Dijkstra's algorithm and Johnson's algorithm. They are all uploaded by the editors themselves. In fact, almost every one of these trivial and non-controversial diagrams is made by the editors themselves. Not everything has to be sourced to published sources. Do you think it is necessary to pick out every diagram on Wikipedia that isn't from academic sources and put a giant OR tag on it? Also, if you really have to be ridiculously rigid with the rules, these diagrams and simple examples can qualify as trivial calculations/compilations of data, which is explicitly said to be allowed in the OR policy article.130.126.27.191 (talk) 23:41, 23 August 2010 (UTC)
 * By the way, I didn't see it at first, but weren't you the one who uploaded them in the first place? Why are you doing this now?130.126.27.191 (talk) 23:42, 23 August 2010 (UTC)
 * Yes it was I, which is why I know that the flowchart is not found in the usual approved sources (it is not the one in Knuth's compendium) and it does not correspond to the algorithm found in many publications. The "exclusive bounds" version is of course not my own invention, but I have not seen it published - indeed, I can't now recall how I came to use it since the 1980s; I had thought I'd found it in Knuth, but was surprised to find on checking in 2004 or so that it is not there, nor in Wirth's famous book either. I must have read something somewhere, or talked with someone who had! Similarly, to produce the graphs of performance, I wasted a day's computer time on my work computer in exhaustively running a particular type of test trial on each of N = 1 to 32767 to gather the statistics for the plots. While others have performed somewhat similar trials, and may have published somewhat similar plots, I have never seen any published article with the same plots nor the specific trial I used. Thus, both the flowchart and the plots are the result of my endeavours unsupported by published sources, and differ from published sources. (Indeed, that was why I prepared them, as the standard forms are less satisfactory, but, the discussion of the differences and why to prefer one over the other has long since been ejected, as being verbose, unnecessary and, unsupported by published sources, naturally enough) I have elsewhere been roundly criticised for just this practice, and it was emphasised that even if the algorithm worked and the results presented and discussed were true, without presence in approved published sources, my screed constituted OR and was against W. policy and was removed. With the suggestion that I could submit a paper to a suitable publication, and if published, I (or anyone) could then re-enter the material. I have yet to make such an attempt. Accordingly, similar effusions in this article must also be retracted, in accordance with W. orthodoxy. 
Similarly, someone has marked the example source files as unreferenced; the fortran version is taken from one of my source files that I've used for decades and is definitely correct, but, is not published anywhere (though surely equivalent to published versions, except it has been explained to me that this is also inadmissible, being my opinion) and so likewise should be purged if one were resolute in following W. policy. NickyMcLean (talk) 22:20, 24 August 2010 (UTC)
 * It is pointless to follow Wikipedia's policy so dogmatically. The policies are nowhere near perfect, so common sense should override them if there is no controversy. If you think about it, it would be near impossible to add any charts and diagrams on Wikipedia if they were absolutely required to have been published in an academic source. First, a science journal would be very concise and would not provide a friendly picture to elucidate the text, so any useful diagrams would have to be taken from textbooks. Second, it would be a long and complicated process to ask for copyright permission from the publishers and the authors, and I think most editors would be driven away if they were required to go through so much trouble just to put up a picture. I think it makes much more sense, and is much more practical, to follow the OR policy so closely only when it's some controversial political topic. Otherwise, if no one objects, it is not for the good of anyone to be so rigid.130.126.27.191 (talk) 01:43, 25 August 2010 (UTC)

Copying a flowchart or code from an approved source would be a copyright violation, so it mystifies me that editors think we should be doing that. —David Eppstein (talk) 01:55, 25 August 2010 (UTC)
 * These apologia have been raised, and rejected. Talk:Interpolation_search, for example. I agree that over-literal application amounts to a demand that published articles be plagiarised, and it would make it impossible for any author to write a new textbook about subjects already covered in other texts. Thus, another description of an established algorithm is going to involve much the same wording, similarly, redrawing the same diagram. Surely it is proper for a W. author (or any author!) to draw upon prior work, and indeed it is W. policy to require this, with acknowledgement, of course. But to repeat, the algorithm I described is not the standard one (as found in published sources), the flowchart is not a redrawing of the standard one (as found in published sources), the fortran example source is from my own (known-correct, but not published) collection, and the statistics on the behaviour of the method are definitely the result of my own experimentations, as has been decried. All of this violates W. orthodoxy, as has been emphasised. NickyMcLean (talk) 02:08, 26 August 2010 (UTC)
 * "Redrawing the same diagram" as you suggest, using the same graphical elements in the same geometric relation to each other, is just as much of a copyvio as retyping a passage of text using the same characters in the same order. —David Eppstein (talk) 02:39, 26 August 2010 (UTC)
 * So, we're stuck. Referring to Prof. Knuth's flowchart, redrawing it with trivial variations in layout, shape and names (I'd use L and R, not L and U, for example) would be a violation of copyright as obviously, the entire diagram has to be copied rather than sampled for the purposes of scholarly reference. Actually, I'd expand on the rather sparse descriptive elements that otherwise can only be understood with reference to the text, so it just might not be regarded as plagiarism even if Prof. Knuth wanted to be rigorous over copyright rather than favouring the dissemination of understanding. So, such a diagram would not violate W. orthodoxy with regard to Original Research. But again, the stuff I have supplied in the article does violate OR, and according to W. orthodoxy, must be ejected. NickyMcLean (talk) 22:09, 26 August 2010 (UTC)

Single comparison per iteration
The premise of this section appears wrong. The notion of counting comparison operations is that the comparison is expensive. Consider a comparison on 64-byte keys. The section is suggesting that testing condition codes is expensive. For many library routines, the comparison routine returns positive, zero, or negative. Once the comparison is done, a three way branch can be done with two conditional branch instructions. The comparison need not be redone for the second branch. The complexity is counting required comparisons - not instructions or redundant calls. Glrx (talk) 18:37, 1 November 2010 (UTC)


 * When I wrote the prog. in IBM1130 assembler, indeed the three-way branching was effected via conditional branch instructions examining the accumulator's state tags of +, -, zero, overflow (etc.) without recalculating the comparison and there is no difficulty. Such code might be generated by a compiler written for a language that encompasses syntax for a three-way test (as in Fortran, for example, but not in fortrans compiled by translating the source into C or similar that lack three-way testing, then compiling that) but in other languages the two expressions are written out, and they are different expressions. (One has >, the other has <) Hagiographers of "modern optimising compilers" claim superb code for their cherished champions and suppose that they too would recognise this situation. I doubt this, so, examples demonstrating it are called for. There has already been a discussion of the problems with diff:=A(i).Key - A(j).Key and testing diff. Yes, you can always devise a function that compares two keys and returns -1, 0, +1 and the problem goes away somewhat (yet still there are two tests of the result variable), but now you have the overhead of function entry/exit and parameter passing. Proper machine code for the binary search method is tiny, so that any waste effort (or "overhead") has a large effect. So, the problem, discussed in terms of high-level languages, remains a problem. NickyMcLean (talk) 20:26, 1 November 2010 (UTC)
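To make the two positions concrete: with a three-way comparator whose integer result is saved in a variable, the source contains only one key comparison per iteration, and the two subsequent branch tests inspect the saved integer rather than re-running the comparison. A C sketch of that shape, using strcmp as the (relatively expensive) comparator (the function name is invented for illustration):

```c
#include <string.h>
#include <assert.h>

/* Sketch: one key comparison per iteration via a saved three-way result.
   Searches the sorted array A[0..n-1] of strings; returns the index of
   key, or -1 if absent. The two tests of cmp only examine an int. */
int search_strings(const char *A[], int n, const char *key)
{
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        int cmp = strcmp(key, A[mid]);    /* the one expensive comparison */
        if (cmp < 0)      high = mid - 1; /* cheap test of saved result */
        else if (cmp > 0) low = mid + 1;  /* cheap test of saved result */
        else              return mid;
    }
    return -1;
}
```

Whether a compiler achieves the same effect for the inline form "if a < b ... else if a > b ..." without an explicit saved result is exactly the point in dispute above.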


 * First, the section has been challenged to provide references since January 2009. That has not been done, so the entire section should be deleted. Where are the WP:RS that extol one comparison per iteration?
 * Second, the above reply focuses on counting instructions rather than key comparisons; its premise is at odds with the thrust of using comparisons to measure performance. Although code examples use integer comparisons to illustrate the concept, typical searches use comparisons that are more involved than simple arithmetic operations. Typical library sorting routines use a function call, and that fits well with the traditional notion that the cost of the comparison dominates all other operations (such as the instructions that manage the loop and calculate the midpoint). If < stands for comparing words in a dictionary, then the comparisons will have costs in terms of instruction cycles, memory references and cache misses; an extra conditional branch (which neither references data memory nor causes a cache miss) is insignificant. The optimizing compiler comments are a garden path; furthermore, modern compilers have no trouble recognizing simple arithmetic common sub-expressions and testing condition codes. Theoretical comparisons should not and do not hinge on the use of crummy compilers. When counting comparisons, the two tests of the condition codes don't cost anything; neither does the small code that forms the heart of the algorithm.
 * Glrx (talk) 03:46, 2 November 2010 (UTC)
 * Offhand, I know of no references extolling one comparison per iteration nor discussing the issue and although the person who supplied the section might, I suspect that this was a piece of knowledge being explained for the presumed benefit of all. For what it's worth, I have already expunged my own offerings as being unreferenced, despite the nice images. Orthodoxy is to be upheld, no matter the consequences.
 * Second, indeed, library routines for sorting usually provide some generalised approach for specifying the key, being multi-part and possibly with ordering exceptions as well. Or, this may be done by requiring the specification of a function that takes two keys and reports their ordering. As with numerical integration and the like, speed assessment is dominated by the number of such function evaluations, which may be complex. Well and good. Unless the result of such a comparison function is saved so that in the absence of a three-way test it may be inspected twice, the equivalent of the code if a < b then ... else if a > b then ... else ...; contains two comparisons, and they will not be converted into one comparison with multiple inspection of condition codes, because the two expressions "a < b" and "a > b" are different; they are not common. And this is quite aside from the added modern confusions due to NaN usage, which will damage the usual interpretation of not(a < b). Thus, in the code under discussion, despite comparing only integers, there are explicitly two comparisons per iteration, and they will be executed according to the discussed statistics, because the compiler will not recognise the opportunity to inspect condition codes instead. And yes, I have not performed an exhaustive inspection of all compilers of all computer languages on this issue. I await a counterexample, preferably of an ordinary compiler for an ordinary language, not some special research exercise. I agree that two tests of the condition codes resulting from one comparison are trivial (and standard for code produced by humans with their brain engaged), but I say that there will be two comparisons in code produced by compilers compiling source code with two comparisons, with the expectation of half the time only one being executed.
 * So, the section describes what will happen, and should stand, aside from issues of provenance. NickyMcLean (talk) 20:39, 2 November 2010 (UTC)

Using a single comparison per iteration only makes the search take one iteration longer on average. I think this gain of one iteration is not worth investing two comparisons per iteration in most situations. (Maybe there are some cases where this is appropriate, I don't know.) But I think it is better to keep both variants. This way, the reader can decide which version fits his needs best. By the way, I have looked up binary search in the book "Programming in Modula-2" by N. Wirth (page 35, 1983). He does exactly that: he first states the two-comparison version, and then improves it to a "more sophisticated version" that uses only a single comparison per iteration. Wullschj (talk) 03:46, 11 April 2012 (UTC)
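For comparison, here is a C sketch of the single-comparison ("deferred equality") variant Wirth describes: the loop performs only the A[mid] < value test, and equality is checked once, after the range has shrunk to a single candidate. (The function name and the -1 "not found" value are chosen here purely for illustration.)

```c
#include <assert.h>

/* Sketch of deferred detection of equality: one key comparison per
   iteration; the equality test runs once, after the loop. Returns the
   index of value in sorted A[0..n-1], or -1 if absent. */
int search_deferred(const int A[], int n, int value)
{
    int low = 0, high = n - 1;
    while (low < high) {
        int mid = low + (high - low) / 2;
        if (A[mid] < value) low = mid + 1;   /* the only comparison */
        else                high = mid;
    }
    /* here low == high (or the array was empty): one equality check */
    if (n > 0 && A[low] == value) return low;
    return -1;
}
```

The trade-off discussed above is visible in the loop condition: the range always shrinks to one element (about log2(N) iterations) with no chance of early exit on a lucky match.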


 * I don't have a problem with including a deferred equality version; it has important features. I do have a problem with confusing order stats, comparisons and instruction counts. Typically, theoretical sorting statistics are big-O and do not care whether there are one or two comparisons each time through a loop. Using < is fine for examples, but adding a second = won't slow the loop by a factor of two because there is a lot of other instruction overhead. In practice, even the two-comparison version will essentially look at condition codes rather than repeating a typically expensive string comparison routine. Glrx (talk) 17:21, 12 April 2012 (UTC)


 * It may have important features, but where are the references for the uptight who purged the previous manifestation? As for the repeated comparisons, I have checked actual machine code produced by a compiler and condition codes from the first comparison are not checked a second time in place of a second comparison, as explained above. And as explained above, there is not at all a lot of instruction overhead to an iteration (aside from the comparison itself, as for string compares), so small differences have a comparatively large effect. But demonstrations of this involve OR. NickyMcLean (talk) 21:48, 12 April 2012 (UTC)

Midpoint calculation
The article's discussion about numerical difficulties should be removed. Its origin is not in anything fundamental, but rather in some peculiarities of using limited-size indices. I would characterize it as Losedows 16-bit braindamage.

Practical tables will have multi-byte records. Consequently, indices would be multiplied by at least 2 to get a byte address. That provides a factor-of-2 safety margin for the midpoint calculation. The problem therefore occurs when indices and pointers are different sizes. Such a choice appears foolish -- it means, for example, that an implementation would be using 16-bit indices and 32-bit addresses. That means the sorting routine would not be suitable for tables with more than 64K entries. Why cripple the routine in such a large address space?

Consequently, I believe the numerical difficulty issue should be removed.

Failing that, the code examples should use midpoint(i1, i2) instead of the unreadable closed midpoint expression.

Glrx (talk) 01:24, 15 November 2010 (UTC)


 * Err,
 * 1) Sixteen-bit numbers were used in the discussion because the resulting digit strings are much smaller than those for thirty-two-bit integers, and the point is the same for both, and for 24-bit, 48-bit, 64-bit, 128-bit and any other size limit.
 * 2) Although I'm using a 32-bit system I have often enough been messing with files exceeding two and even four gigabytes in size, the 32-bit limit. Thus the "factor of 2 safety margin" is irrelevant. NickyMcLean (talk) 19:59, 18 November 2010 (UTC)

bd - April 2012: I'm going to remove the "disputed" tag from the article. It is and will always be a severe logical error to calculate the mid-point as "(L + R) / 2" with indices of a limited size. The fact that this bug has even occurred in widely-used library code like the Java run-time library only adds to the importance and relevance of mentioning this issue in the article. — Preceding unsigned comment added by Briandamgaard (talk • contribs) 09:36, 23 April 2012 (UTC)


 * I disagree with the removal of the tag. This article confuses the distinction between an algorithm and a program. The main interest should be the algorithm. The expression (L+R)/2 is almost instantly recognized as a midpoint equation; the alternative expression is not. Hence it causes confusion. Issues such as integer overflow are programming issues.
 * I disagree that index arithmetic needs such protection when an int covers the address space, the items being indexed are greater than 4 bytes (allow sign and doubled value; typically true when one has over a billion entries), and the array index domain is reasonable (e.g., 0 based or 1 based arrays).
 * If pointer arithmetic is used, then the game is different because the pointers might be in high memory. However, compilers should complain when confronted with a dubious (p1 + p2)/2. Adding pointers doesn't make sense (subtraction does). Even if the sum is typed as an integer, that means the quotient will then be used as a pointer, so there should be a type check error somewhere in code unless the compiler is clueless about integers and pointers.
 * The above issue about 2 GB and 4 GB files occurs when an int does not cover the address space. Furthermore, the value is really a pointer, and the file pointer access routines would be returning 64-bit values; the compiler should issue a truncation warning if that pointer is assigned to a 32-bit variable. Finally, calculating a midpoint using a pointer (memory or file) can land one in the middle of a record even when the records are fixed size. This page does not address the variable-size record issue.
 * Once again, there is too much focus on programming issues here. WP is not a programming textbook.
 * I'm willing to have a section that discusses the midpoint calculation error, but the rest of the page should either bottle the midpoint issue up into a midpoint function or use the ordinary midpoint formula (as it currently does).
 * Glrx (talk) 23:48, 26 April 2012 (UTC)

The last paragraph states that if (L + R) exceeds the number range then, for signed integers, a more efficient (R + L) >>> 1 can be used. How does this not exceed the number range? This seems to be an entirely separate point that somehow got mixed in here. — Preceding unsigned comment added by 178.203.134.182 (talk) 07:42, 1 January 2013 (UTC)
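For concreteness: in Java, >>> is the unsigned right shift. Even when L + R wraps to a negative signed value, its bit pattern is still the true sum modulo 2^32, so shifting it as an unsigned quantity recovers the correct midpoint for non-negative L and R. A C sketch of both remedies (in C, signed overflow is undefined behaviour, hence the casts before the addition; function names are mine):

```c
#include <assert.h>
#include <limits.h>

/* Avoids overflow altogether: the difference high - low cannot
   overflow when 0 <= low <= high. */
int midpoint_safe(int low, int high)
{
    return low + (high - low) / 2;
}

/* Mimics Java's (low + high) >>> 1: add in unsigned arithmetic (exact
   modulo 2^32), then halve; correct for non-negative low and high. */
int midpoint_unsigned(int low, int high)
{
    return (int)(((unsigned)low + (unsigned)high) >> 1);
}
```

Both give the same result even when low + high would overflow a signed int, e.g. midpoint near INT_MAX.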

Language support?
Why do we need this section? Today we have 10,000++ programming languages and every one of them is capable of implementing the bin search. Do we need a list of 10,000++ refs, one to every language implementation? --Tim32 (talk) 21:47, 13 March 2011 (UTC)


 * I would dump most of the entries, but I would mention that many programming language runtime libraries include binary search routines. I'd include C and perhaps a couple others as examples, but no more. This article is not about programming language support. Glrx (talk) 00:28, 14 March 2011 (UTC)


 * This is not about being capable of implementing it. This is about languages that include binary search in their standard library. Not many languages do that. --67.169.97.192 (talk) 19:42, 10 April 2011 (UTC)


 * This is an encyclopedia, not a catalog. The section should be removed.  — Preceding unsigned comment added by 68.183.92.210 (talk) 02:35, 15 May 2014 (UTC)

Cauchy's intermediate value theorem
It may be worth mentioning here that Cauchy's proof of the intermediate value theorem was a divide and conquer algorithm. Tkuvho (talk) 08:00, 8 April 2011 (UTC)

terrible state of this article
Really folks, this article is still a disgrace. Just a few of the many problems that still exist: McKay (talk) 07:08, 13 June 2011 (UTC)
 * no separation between conceptual and implementation issues
 * some paragraphs are incomprehensible (such as that referring to an "exclusive" version that is no longer present)
 * the pseudocode should precede any "extensions" or advanced discussion and should be correct

I propose to replace the iterative pseudocode by this, please comment:

 binarysearch(A, lo, hi, target)
   // Search for target in the sorted array A[lo..hi].
   // If target is present, return some i such that A[i] = target.
   // Otherwise, return the special value "not present".
   down := lo
   up := hi
   while down <= up do
     mid := midpoint(down, up)
     if A[mid] > target then
       up := mid - 1
     elseif A[mid] < target then
       down := mid + 1
     else
       return mid
     endif
   endwhile
   return "not present"

and to follow it with this explanation:
 * The expression midpoint(down, up) should calculate an integer mid-way between down and up. If there are two such integers, either is ok. In implementations, care should be taken so that intermediate values in the computation of midpoint(down, up) do not exceed the maximum value of an integer. If down and up are non-negative, the expression down + (up - down) div 2 is usually better than (down + up) div 2 for this reason.

My justification is (1) it is correct even for boundary cases, (2) it doesn't assume anything about whether subscripts are positive or negative and what index is first in an explicitly declared array (even C allows negative subscripts!), (3) implementation niceties are left until later, (4) exactly the same specification can be used for the recursive version. McKay (talk) 05:02, 16 June 2011 (UTC)

Humm. In an Algolish pseudocode,

 %To find a value V in N elements of array A, indexed 1 to N.
        L:=0;                 %Exclusive bounds: first - 1.
        R:=N + 1;             %Last + 1.
 Probe: if (p:=(R - L)/2) <= 0 then Return(-L);  %Search exhausted?
        case(sign(A[p:=L + p] - V))              %Compare the array element to V.
   Negative: L:=p; GoTo Probe;                   %A[p] < V.
   Positive: R:=p; GoTo Probe;                   %A[p] > V.
       Zero: Return(p);                          %A[p] = V: found at p.
        esac;                 %No other cases. (NaN not allowable.)

The exclusive-bounds version in this form remains superior, but the explanations as to why have long since been suppressed as being unnecessary/excessive/boring and, anyway, unsourced. Similarly, neither of us has quoted any reference to a proper published source for our effusions. So they are OR and to be excluded. Despite its unsatisfactory nature, the Pascal version from Wirth's book is properly sourced, so conforms to the requirement. NickyMcLean (talk) 21:00, 16 June 2011 (UTC)
 * Having something that is correct and clear is more important than having a source. Anyway, since I have a large number of algorithms books on my shelves it was easy to find it in Kingston, Algorithms and Data Structures, p8, and in Rawlins, Compared to what?, p106. The variation with lo=1 is in lots of other books too. Also, I don't know why you gave this Algolish code, it is (0) less general, (1) unstructured and difficult to read, (2) wrong wrt implementation (N+1 might be too large for an integer). This last problem is a serious handicap of the exclusive method as it doesn't allow the array to start at index 0 when the indexes are unsigned. Incidentally, your "exclusive" version uses the same number of arithmetic operations and one extra assignment per iteration compared to my "inclusive version". In actual timing tests I always get faster times for the inclusive method (C with gcc -O4 on Intel Core 2 Duo). McKay (talk) 04:46, 17 June 2011 (UTC)

You may say that having something that is correct and clear is more important than having a source, and I'd be inclined to agree, but it remains W orthodoxy that a proper published source is required and that Original Research is rejected. The problem with overflow is discussed in the article (although some are arguing that the discussion should be ejected) and applies to both inclusive and exclusive versions. Some twit converted 16-bit numbers to 32-bit numbers just to have more imposing digit strings. The exclusive-bounds version is just as able to deal with arbitrary array bounds as the inclusive. If the array is A(first:last) then the initialisations are L:=first - 1; R:=last + 1; (as indicated in the comments), and for the inclusive-bounds version are L:=first; R:=last;. As it happens, I have a prog. dealing with something like A(-600:33000) and although published, it is not in an approved journal. The only difficulty is the encoding of "not found", which in my case can be zero, as index zero in my application is not used. The Return(-L) is useful for A(1:N) in general as it indicates where to insert a missing element, but this discussion has long since been ejected from the article. In short, both versions can deal with arrays of arbitrary bounds, given attention to the encoding of "not found", the possibility of signed variables being needed, and the bounds on their range. Your version dodges this detail by not specifying what the "not found" return is. If the array starts at zero instead of one, then zero is indeed a possible array index and so can't be the value for "not found", and the Return(-L) protocol can't be used - perhaps Return(-32768) or similar? Anyway, here is a less fearsome version.

 %To find a value V in N elements of array A, indexed 1 to N.
        L:=0;                 %Exclusive bounds: first - 1.
        R:=N + 1;             %Last + 1.
 Probe: p:=(R - L)/2;         %Offset to the middle.
        if p <= 0 then Return(-L);   %Search exhausted?
        p:=p + L;             %Index of the middle element.
        case(sign(A[p] - V))  %Compare the array element to V.
   Negative: L:=p; GoTo Probe;      %A[p] < V.
   Positive: R:=p; GoTo Probe;      %A[p] > V.
       Zero: Return(p);             %A[p] = V: found at p.
        esac;                 %No other cases. (NaN not allowable.)

It's true that the dreaded "go to" decried by the structuralist fanatics appears here, rather than being hidden with "while" and the like. There was a nice flowchart for the method that seemed clear to me, and it had a nice structure too, or so I thought, but it was OR. I'm unclear about your claim of arithmetic operations per iteration, as your version does not lay out the arithmetic operations within function midpoint. Conversely, my pseudocode involving "case" does not mean the usual case statement. It is intended to stand for a three-way IF test (or equivalent) that does not repeat the comparison action, as in the usual if a < b then ... else if a > b then ... else ...; in the absence of such a syntactic offering. Ah, I see that someone has removed the discussion of this point and the associated versions from the article, so the "Deferred Detection of Equality" dangles.

Counting operations is tricky without details, but I note two differences: the inclusive method's loop control has while L <= R (as in while down <= up) and discards the result after testing, then goes on to determine midpoint (which will surely involve something of the sort), whereas the exclusive version I offer retains the result as a part of indexing the middle element. Secondly, the inclusive form has to compute mid + 1 or mid - 1 to place in the appropriate bound, whereas the exclusive form merely places a value without offset. These all involve additional arithmetic operations. Assessing exact counts would require an agreed machine code, such as Prof. Knuth's MIX, and both versions being analysed as he does. My original efforts in this regard were in assembler on an IBM1130, and actual machines have particularities that may mean that registers can be used with cunning and assignments be deferred, etc. Incidentally, L + (R - L)/2 is better than (L + R)/2.

But yes, the article is a mess. The removers have ejected stuff without bothering to consider internal references. NickyMcLean (talk) 21:20, 19 June 2011 (UTC)


 * So, in summary, my version can be cited to at least two reliable sources and has the advantages I listed above as (1)-(4). You did not identify any problems with it. It does not have any overflow problem if midpoint is implemented carefully (as it says). Your version has other overflow problems: try it for N=32767 when indexes are signed 16-bit integers. The return "not found" is fine; this is an algorithm in pseudo-code, not a program to feed into a compiler, and for the same reason a 3-part if and a 3-branch case are the same (but I have no objection to using a case syntax if it doesn't introduce new overflows like hi-lo). Anyway, returning an integer value for a missing target is bad engineering since every integer value is potentially an array index. Hardware-level concerns don't belong here at all.  So I am going to put my version in along with a recursive procedure having the same specification (which is not possible with your version). For checking, here is the planned recursive version (I have several sources):

binarysearch(A, lo, hi, target)
 // Search for target in the sorted array A[lo..hi].
 // If target is present, return some i such that A[i]=target.
 // Otherwise, return a special value "not present".
 if lo > hi then return "not present" endif
 mid := midpoint(lo,hi)
 if    A[mid] > target then return binarysearch(A,lo,mid-1,target)
 elsif A[mid] < target then return binarysearch(A,mid+1,hi,target)
 else  return mid
 endif
McKay (talk) 07:23, 20 June 2011 (UTC)
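A direct Python transcription of this recursive pseudocode, for anyone who wants to check it mechanically (using None to stand in for "not present" is my choice, not the pseudocode's):

```python
def binarysearch(A, lo, hi, target):
    """Recursive binary search over the sorted slice A[lo..hi]
    (both bounds inclusive). Returns an index i with A[i] == target,
    or None when the target is not present."""
    if lo > hi:
        return None                      # empty span: not present
    mid = lo + (hi - lo) // 2            # overflow-safe midpoint
    if A[mid] > target:
        return binarysearch(A, lo, mid - 1, target)   # search first half
    elif A[mid] < target:
        return binarysearch(A, mid + 1, hi, target)   # search second half
    else:
        return mid                       # found
```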


 * If you stay with pseudocode, leaving the "Not found" nature undefined, then there is no difficulty over the details of array bounds and signed/unsigned integers and whether or not an integer value might be usable as an array index. With regard to integer overflow, I repeat again that both versions have this problem, as is/was discussed in the article, though in different ways. For N = 32767 as you suggest, try your inclusive-bounds version where the sought value follows A(32767) in order. A clue for you: mid + 1. It is for this reason that the article explained a limit of N = 32766 rather than 32767, or the equivalent for other sizes, whether signed integers or not. As for recursion, the version you present is tidy, though I'd still use L and R. The algorithm may be "basic", but it can be implemented in ways that are surprisingly different (midpoint-and-width, for example), and used in a context it can have additional attributes (the Return(-L) feature, for example), but describing these details takes space.
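The 16-bit failure mode being described can be checked with a short simulation. Python's integers don't overflow, so the helper below wraps results to signed 16 bits the way the fixed-width arithmetic under discussion would (the helper and the chosen values are illustrative assumptions):

```python
def int16(x):
    """Wrap x to a signed 16-bit value, as fixed-width hardware would."""
    x &= 0xFFFF
    return x - 0x10000 if x >= 0x8000 else x

# Inclusive bounds, array A[1..32767]: if the sought value follows
# A[32767] in order, the search eventually computes low := mid + 1
# with mid = 32767, and the increment wraps negative:
assert int16(32767 + 1) == -32768

# The (L + R)/2 midpoint can wrap even earlier, as soon as the two
# bounds together exceed the 16-bit range:
assert int16(20000 + 30000) == -15536    # so the "midpoint" is -7768
```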
 * And an afterthought: here is a recursive version of the exclusive search.

Function BinarySearch(A,L,R,V)                   %Exclusive bounds version.
 %Find some index value p such that A[p] = V, or report "not found"
 %For an array span with bounds first to last use BinarySearch(A,first - 1,last + 1,V)
 p:=(R - L)/2;                                   %Integer division. 3/2 gives 1.
 if p <= 0 then return "Not found";              %Remaining search span is exhausted!
 p:=L + p;                                       %Apply the offset to index the middle element.
 if A[p] < V then return BinarySearch(A,p,R,V);  %Search the second half.
 if A[p] > V then return BinarySearch(A,L,p,V);  %Search the first half.
 return p;                                       %Found at p!
end function BinarySearch;
 * This version can enjoy all the praise words employed above, and has no weird mid + 1 or mid - 1 trickery...

NickyMcLean (talk) 22:28, 20 June 2011 (UTC)
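As a cross-check, here is the exclusive-bounds recursion transcribed into Python (the function name and the None return for "Not found" are my substitutions; the odd invocation with bounds one outside the span is the scheme's premise):

```python
def binary_search_x(A, L, R, V):
    """Exclusive-bounds recursive search. Call as
    binary_search_x(A, first - 1, last + 1, V); for a whole Python
    list that means binary_search_x(A, -1, len(A), V).
    Returns an index p with A[p] == V, or None."""
    p = (R - L) // 2                 # integer division: 3 // 2 gives 1
    if p <= 0:
        return None                  # remaining search span is exhausted
    p += L                           # apply the offset to index the middle
    if A[p] < V:
        return binary_search_x(A, p, R, V)   # search the second half
    if A[p] > V:
        return binary_search_x(A, L, p, V)   # search the first half
    return p                         # found at p
```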


 * You are quite right that my code might compute lo-1 or hi+1. It is not hard to fix, but only by leaving the sources behind and adding complexity. (Incidentally, for someone who likes to put assignments into the middle of arithmetic expressions to describe adding 1 as "trickery" is a bit rich.) Now that I think more about it, it might be best to leave the overflow issue entirely to the comments, so instead of using midpoint(down,up) use $$\lfloor (down+up)/2 \rfloor$$ like many textbooks do. (I was avoiding the issue of what "/2" means before anyway.) McKay (talk) 03:48, 22 June 2011 (UTC)


 * Ah, I admit to a taste for trickery. An assembler version would have the same topology, but the placement of actions along each path would be rearranged to reduce the number of jumps in the iteration sequences. It is only via assignments within expressions that many assembler trickeries, flinging off side effects to later advantage, can be accommodated; otherwise high-level languages force lame semi-repetitions. Anyway, in comparing the two recursive versions (and incidentally, your version need not invoke the "else" clauses), I think it fair to say that the mid+-1 version requires a bit more thinking to follow when assessing its correctness, and only afterwards does it become "obvious". But the exclusive bounds version does have an odd invocation in order to maintain its premise. The (L + R)/2 overflow issue, whether decorated with fancy notation for "floor" and claims of unlimited range for arithmetic in pseudocode, is regularly ignored in texts, and innumerable resulting programs use it as it stands. Whereas it has no place in the exclusive bounds version. Similarly with the 32766 issue being unmentioned. So, I'd favour an explication of the exclusive bounds version, plus a later commentary about the inclusive bounds and the (L + R)/2 overflow that it entices, associated with the 32766 issue applying to both versions. However, no reference near my hand describes the exclusive bounds version, or indeed mentions overflow problems at all. If details are to be discussed, there are many, and some are traps to avoid. If details are not discussed, then traps will be fallen into if people actually use this article to produce a program. Unfortunately, Wikipedia descriptions of famous algorithms are routinely miscorrected (QuickSort is notorious). This was why I introduced a flowchart depiction (images are more difficult to mess with!), but I have no reference for it, so it was OR, as were the results on performance, despite the pretty graphs with interesting bends.
NickyMcLean (talk) 20:50, 23 June 2011 (UTC)

Why include both an iterative and a recursive version? Any loop is trivially equivalent to a tail-recursion. --Macrakis (talk) 14:23, 20 June 2011 (UTC)


 * A recursive version is conceptually closer to the way binary search is usually described, i.e. a divide and conquer algorithm where we take a binary search problem, and we turn it into a subproblem (smaller binary search problem), recursively. A recursive version represents this in a very apparent way by a recursive call. --208.80.119.67 (talk) 01:12, 12 July 2011 (UTC)

It still sucks. — Preceding unsigned comment added by 75.79.3.184 (talk) 23:03, 19 May 2014 (UTC)

Sorted Array
This article is entirely too long for something as basic as a binary search. I'd suggest creating the sorted array article and splitting / removing content. --Scandum (talk) 12:37, 20 June 2011 (UTC)


 * It could be a section of a sorted array article, that is not a bad idea and would be an excuse to start afresh (which this article badly needs). Whether as a stand-alone article, or a section of another, I propose subsections as follows:
 * Informal description and example
 * Pseudocode formal description (both iterative and recursive)
 * Running time analysis
 * Brief mention of variations (no code, no details)
 * To answer Macrakis, the reason to include both iterative and recursive is that elementary sources are split more or less evenly between the two views. Also, the view that people find easiest to understand varies (I've been teaching this since 1978 and didn't yet find an explanation that suits everyone). McKay (talk) 03:48, 22 June 2011 (UTC)
 * Dear McKay, I am not sure that any consensus is possible for this article. For example, see section "Implementations. Iterative". Somebody wrote "The following incorrect (see notes below) algorithm is slightly modified (to avoid overflow) from Niklaus Wirth's in standard Pascal". I think, if you will add your iterative pseudocode, then somebody will write that it is incorrect because it uses Goto (vs structured programming) and so on. Perhaps the article needs some sort of protection to restrict its editing? --Tim32 (talk) 09:58, 22 June 2011 (UTC)
 * You misread; my pseudocode is structured and has no gotos. McKay (talk) 03:34, 24 June 2011 (UTC)
 * Yes. Sorry for my mistake. I think, Pascal-like pseudocode would be better for beginners:

function binarySearch(A, lo, hi, target)
{ Search for target in the sorted array A[lo..hi].
  If target is present, return some i such that A[i]=target.
  Otherwise, return the special value notPresent.
  For example, if lo and hi are non-negative, then notPresent=-1 }
begin
 down := lo; up := hi;
 while down <= up do
  begin
   mid := midpoint(down,up);
   if A[mid] > target then up := mid-1
   else if A[mid] < target then down := mid+1
   else Result := mid
  end;
 Result := notPresent;
end;
 * --Tim32 (talk) 07:55, 24 June 2011 (UTC)
 * Um, no, "Result := mid" needs to return immediately, not just set the return value. I was thinking of using the style shown at the style page you mentioned. McKay (talk) 08:14, 24 June 2011 (UTC)
 * Ok. It may be "else begin Result := mid; exit end;"--Tim32 (talk) 08:20, 24 June 2011 (UTC)
 * That could be taken to mean exiting from the loop (as it does in some computer languages). I don't think it is wise to avoid "return". McKay (talk) 08:35, 24 June 2011 (UTC)
 * Yes. This is interesting problem of pseudocode understanding. Let it be "else begin Result := mid; exitFromFunction end;" A beginner may have a problem to understand Return as well.--Tim32 (talk) 08:47, 24 June 2011 (UTC)
 * Err, why the copy of the parameters to internal variables? Ah, because their values are changed and you don't want to damage the originals. Proper pascal messes about with the keyword "Var" and its absence for value parameters. NickyMcLean (talk) 21:38, 26 June 2011 (UTC)
 * This is Pascal-like pseudocode, not standard Pascal. For realization the 1st line has to be:
 * function binarySearch(A : intArray; lo, hi, target : integer) : integer;
 * or
 * function binarySearch(A : realArray; lo, hi : integer; target : real) : real;
 * etc. But if you wish you can write:
 * function binarySearch(var A : realArray; var lo, hi : integer; var target : real) : real;
 * Ok?--Tim32 (talk) 23:26, 26 June 2011 (UTC)


 * Ah, Pascal programming. I've always been annoyed at the amount of babble needed in Pascal parameter lists. Not really OK. If the parameters are not "Var" type, so that they are passed by value, there is no need for internal copies that can be adjusted, since the parameters are already copies (though not the array). If we abandon the Pascal detail for proper pseudocode, then there should be some explanation of why copies are made. Or alternatively, make no such copies and so don't talk about those details (which add nothing to the understanding of the method), as is desired by the simplifiers, though they do not want to see any code at all. Oddly enough, the exclusive-bound iterative version does not raise this problem because it uses internal variables anyway, but everyone and their source texts seem to be glued to the inclusive bounds version. NickyMcLean (talk) 20:48, 27 June 2011 (UTC)


 * WP:NOT I strongly suggest removing any code and pseudocode as Wikipedia is not a textbook. Like I mentioned earlier, this article is entirely too long for a relatively simple subject matter. --Scandum (talk) 12:21, 22 June 2011 (UTC)
 * The fact is that the majority of WikiArticles about algorithms have code and/or pseudocode. "Within this WikiProject, the consensus has generally been that where possible algorithms should be presented in pseudocode." WikiProject_Computer_science/Manual_of_style_(computer_science)--Tim32 (talk) 13:25, 22 June 2011 (UTC)
 * Taken in context that is to mean that pseudocode is preferred above a language specific example, not that algorithms in pseudocode should be added 'where possible'. It's in itself not a huge issue to me if the other problems are addressed. --Scandum (talk) 04:15, 23 June 2011 (UTC)
 * However, in contrast with you they did not write "We strongly suggest removing any code and pseudocode as Wikipedia is not a textbook".--Tim32 (talk) 08:04, 24 June 2011 (UTC)

Perl Implementation Did Not Contain Bug
The accusation that the Perl code contained a bug is false. The binary search algorithm check was fine.

The problem was that the index must be less than half the maximum value of the type used to store it (be it an integer, unsigned integer, or other).

This was not an algorithm bug as is purported on this page - and I feel strongly that this is unjust. A simple comment at the start of the algorithm (input maximum index may not be more than half the maximum of the variable type) would suffice as a routine interface contract. — Preceding unsigned comment added by 89.16.173.123 (talk) 12:31, 21 November 2011 (UTC)
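The proposed interface contract, and the way a safer midpoint computation removes the need for it, can be illustrated as follows (the 32-bit wrapper and the names are illustrative assumptions, not taken from the Perl code under discussion):

```python
def midpoint(low, high):
    """Overflow-safe midpoint: the intermediate high - low never
    exceeds the index range, unlike the sum low + high."""
    return low + (high - low) // 2

def add_i32(a, b):
    """Simulate a signed 32-bit addition, which is where the naive
    (low + high) / 2 goes wrong once an index passes half the
    type's maximum."""
    s = (a + b) & 0xFFFFFFFF
    return s - 0x100000000 if s >= 0x80000000 else s

high = 2**31 - 1                    # a perfectly legal 32-bit index
assert add_i32(1, high) < 0         # low = 1 already wraps the sum
assert midpoint(1, high) == 2**30   # the safe form stays in range
```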


 * Oh? And can you provide even one example of a textbook or documentation or interface specification that from the beginning announced this restriction? NickyMcLean (talk) 20:47, 23 November 2011 (UTC)

How about beginning the Implementations section with a correct implementation?
The Implementations section start with an implementation which says "The following incorrect (see notes below) algorithm [...]". If it is incorrect, why put it first? Nicolas1981 (talk) 11:19, 15 March 2012 (UTC)
 * Simply because WP is prowled by righteous purists who will decry as Original Research and excise any entry not backed by an approved published source, and the textbooks etc. uniformly present versions where integer overflow and the like are ignored as not relevant to the explanation of the method. Many potential contributors have well-tested and even provably correct versions of a binary search routine in various computer languages, but none of them have been published in a proper approved publication. And yes, I have had stuff rejected for this reason too. For instance, there used to be a version that performed one test per iteration (rather than the two forced by most computer language syntax) under the heading "Deferred Detection of Equality", something that had never occurred to me, and there was an explanation of its behaviour, but, because there was no approved source, out it went. The example binary search routine is from a famous text on programming, and thus is a proper source. Despite all the noted detractions. NickyMcLean (talk) 20:18, 15 March 2012 (UTC)


 * The section is terrible. Wikipedia policy on OR is no excuse for poor content and poor presentation.  The current text is more a critique of Wirth's code than it is an exposition of a binary search algorithm.  Just silly. --Macrakis (talk) 23:52, 15 March 2012 (UTC)


 * There is a statement somewhere to the effect that any rule slavishly followed that leads to poor results is to be disregarded, however the righteous disregard it and hold themselves all the more righteous thereby. NickyMcLean (talk) 02:57, 16 March 2012 (UTC)


 * The problem is that some people here don't recognize the distinction between algorithm and implementation and they think that an algorithm which has problems when copied unmodified onto a computer is erroneous. This mindset is contrary to how the great majority of algorithms are presented in computer science. Until it is changed, the prospect for this article is not great. McKay (talk) 01:57, 16 March 2012 (UTC)

imid-1 becomes negative
I see nothing in the recursive or iterative algorithms that avoids the "imid - 1 becomes negative" error. Assuming the use of unsigned integers, a simple fix would be to return KEY_NOT_FOUND if imid is zero - before decrementing imid. — Preceding unsigned comment added by 68.183.92.210 (talk) 02:13, 15 May 2014 (UTC)


 * The given algorithms use  rather than , so the underlying premise is faulty. If   ever goes negative, then it is caught at the top of the recursion or in the   condition. Consequently, a separate test for the absolute left edge of the search region is not needed. Glrx (talk) 19:44, 16 May 2014 (UTC)


 * Nope - these are supposed to be algorithms, not implementations in a specific language/architecture. (By the way, there is nothing to "catch" a -1, even if legal)   Also, nothing prevents (imid/right/last) + 1 from exceeding the maximum possible.  — Preceding unsigned comment added by 75.79.3.184 (talk) 22:28, 19 May 2014 (UTC)

Algorithm section needs to specify how midpoint is calculated
Only one way should suffice for all three algorithms, but this is critical for determining how implementation problems (notably integer range issues) can occur and may be avoided.

Deferred method
What is supposed to be meant by "assert ..."? Does the algorithm blow up if it is false? Does it do something to ensure it's true? Or, most likely, is this another piece of unsupported fluff in this article?

C code replaced with pseudocode
Because C code is inaccessible to many readers and seems to be unsourced (or at least unsupported) and written by an editor, I have replaced it with a rough synthesis of the pseudocode found in The Art of Computer Programming (particularly §6.2, "Algorithm B"). Esquivalience  t 01:52, 19 March 2016 (UTC)

Improvements
Article is much better! Still, I think that too much space is spent on things that aren't binary search (like tree balancing). Perhaps the real problem is that we're missing an article that is more general than binary search algorithm but more specific than search algorithm (which is extremely broad). Not sure what the right term is... I'll have to look at Knuth next time I have it nearby. --Macrakis (talk) 01:03, 8 April 2016 (UTC)
 * Since the section is titled "Improvements" I might as well just post some here. Many of the other sorting/searching algorithm pages have pseudocode, which I personally find extremely helpful to readers, but this article doesn't have any. I understand the complexities involved with binary search, but adding pseudocode for binary search would be a huge improvement. Zamaster4536 (talk) 12:53, 17 April 2016 (UTC)

Bloom filters etc.
Recently, a paragraph was added about Bloom filters. Certainly an interesting and important algorithm. But in what way is it more important or significant than the other algorithms covered in the "Other Data Structures" section? In particular, it is an approximate algorithm, and it is only useful for set membership, not for attaching values to keys, so it really doesn't compete directly with binary search, hashing, etc. The simplest set membership algorithm, the bit array, is also—appropriately—in that section.

We shouldn't try to cover all possible lookup algorithms in the "Binary search versus other schemes" section, and certainly not in detail. What we should do is compare the domains of applicability and the time and space efficiency, and leave the implementation mechanism to the individual articles. --Macrakis (talk) 16:09, 2 August 2016 (UTC)
 * What I would like to see retained is the clear statement that for problems involving equality testing only (that is, both set membership and dictionary lookups), binary search is usually a bad choice, because other schemes (hash tables, Bloom filters, etc) are clearly better. The details of those other schemes are important here only to set the context, but the comparison of them against binary search is crucial. Otherwise, in my experience, too many students jump from the fact that you can use binary search for equality-based problems to the incorrect conclusion that you should use it. —David Eppstein (talk) 00:19, 5 August 2016 (UTC)


 * I certainly agree that we should be clear about the domain of applicability of different lookup algorithms, and their relative advantages and disadvantages. I don't think that that requires that we describe each of these other algorithms in detail in this article. --Macrakis (talk) 21:42, 5 August 2016 (UTC)

Trivial implementation observations
Article currently reads:
 * The iterative version of binary search only requires three extra variables, taking constant extra space. When compiled with a compiler that does not support tail call elimination, the recursive version takes space proportional to the number of iterations, as extra stack space is required to store the variables. If the array is passed by value instead of by reference to the binary search subroutine, the original array will have to be copied, costing O(n) extra time and space. Since binary search does not modify the elements of the array, it is safe to pass the array by reference.

This is all true, but utterly trivial, as it is true of any iterative algorithm converted to a recursive algorithm, even linear search. I have deleted it.

I have also deleted other references to a recursive form. After all, any iterative algorithm can be written in recursive form, and the efficiency issues are always the same. --Macrakis (talk) 21:38, 6 August 2016 (UTC)

Deferred detection of equality
There used to be an interesting (imho) paragraph on deferred detection of equality which vanished through user Esquivalience edits.

Why was this removed entirely? Gpakosz (talk) 09:13, 14 September 2016 (UTC)
 * The second-to-last paragraph of the "performance" section still appears to cover this. —David Eppstein (talk) 16:12, 14 September 2016 (UTC)

Boundary search is wrong
Boundary search examples are wrong:
 * 1) Set L to -1 and R to n - 1
 * 2) while L - R > 1: the condition is false on the first test, so the body is never executed.

The L + (R - L)/2 is really academic and obfuscating, as the variables in the pseudocode are not typed. Also, in most languages, a type could be used which prevents overflow. I suggest (L+R)/2 in the pseudocode, with a formulation like:

To avoid overflow, L + (R - L)/2 is used instead of (L + R)/2 in concrete implementations.

instead of the currently last sentence (about overflow) in the section. — Preceding unsigned comment added by 82.136.103.121 (talk) 16:47, 8 April 2017 (UTC)

Ambiguity of "next smallest" and "next largest"
The section "Approximate matches" talks about "predecessor (next-smallest element)" and "successor (next-largest element)". I find this ambiguous. We're talking about a list in ascending order. So in my usual understanding of the phrases, 'next largest' would mean the highest value smaller than the given one (i.e. the predecessor) and 'next smallest' would normally mean the smallest value higher than the given one (i.e. the successor). Am I correct? --David Edgar (talk) 16:49, 6 May 2017 (UTC)

Approximate matches doesn't belong in this article
First of all, someone needs to define which regex is meant by "approximate matches". I believe they are exclusively talking about "begins with..." / "starts with..." searches, but that has almost nothing to do with binary searching. AlwaysAngry (talk) 20:21, 7 November 2017 (UTC)


 * This has nothing to do with regexes. These are approximate matches w.r.t. the ordering, as clearly stated in the article. In particular, binary search efficiently finds where a given element would go (between what other elements), unlike hashing. --Macrakis (talk) 22:06, 7 November 2017 (UTC)


 * Then it should be called something else. In fact, as I clearly stated, this is a "begins with" type search which is much more specific and limited than a 'LIKE' in a SQL WHERE clause or regex which ARE approximate matches. AlwaysAngry (talk) 17:01, 28 December 2017 (UTC)


 * Well, it's not a "begins with" search, either. The algorithm allows you to search for Q in [A, D, O, T, Z] and determine that Q would fall between the 3rd and 4th elements. I'm not sure what the best name for that is. Ideas? --Macrakis (talk) 19:35, 28 December 2017 (UTC)


 * It seems to be a "begins with" search to me, although in your trivial case the begins with is also the exact match case.  Anyway, since you don't like "begins with", my suggestion would be "incremental matching". AlwaysAngry (talk) 13:44, 2 January 2018 (UTC)
 * Approximate matching is search as in finding an element approximately equal to the target, hence the name. If you view the array being searched as a list of numbers, like [1, 3, 7, 12, 16], and say you want to find which number in the array is most approximately equal to 9, then you yield 7 using nearest neighbor. Approximate matching has nothing to do with text processing or regexes. Esquivalience (talk) 02:55, 8 January 2018 (UTC)
 * I understand your perspective now. Since you are only considering numeric keys, approximate matching makes more sense.  This however isn't true for all binary searching, and my mind was focused on searching for non-trivial text in a binary tree.  and approximate matching doesn't make sense for me.  This is because there are other methods of approximating text comparisons and there aren't for numbers.  For numbers there is only subtraction.  For text there are regular expressions, like, wildcards, exact match. AlwaysAngry (talk) 15:50, 8 January 2018 (UTC)


 * , we use the standard terminology of the field. Incremental search is something else again. --Macrakis (talk) 14:27, 8 January 2018 (UTC)
 * right, for me it's extremely similar to that link. AlwaysAngry (talk) 15:50, 8 January 2018 (UTC)
 * , you still haven't got it right:
 * Binary search is equally applicable to text and to numbers.
 * It is not useful for searching in binary trees, but in linear arrays.
 * Incremental search has nothing to do with binary search. It uses some sort of string searching algorithm.
 * Approximate string matching involving wildcards or edit distance or phonetic similarity or whatever are certainly interesting, but again have nothing to do with binary search.
 * It seems to me that you need to go a bit further in your computer science education before you are ready to edit articles like this one. --Macrakis (talk) 21:26, 10 January 2018 (UTC)
 * You still cannot understand that the term approximate matching is thought of differently when applied to text as it is when applied to numbers.  Readers will believe that Approximate string matching should certainly apply when it really does not.
 * It seems to me that you need to interact with more people and less computers, as you are coming off as a complete ass. AlwaysAngry (talk) 14:55, 12 January 2018 (UTC)

My feeling is that binary search is useful for finding nearest neighbors in a linear order (a form of approximate searching), and NOT USEFUL for exact matching (because you should be using a hash table for that instead). So yes, we should definitely cover approximate searching in this article. —David Eppstein (talk) 17:29, 8 January 2018 (UTC)
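A nearest-neighbour lookup of the kind described is a small wrapper around a binary search; this Python sketch uses the standard library's bisect_left and the [1, 3, 7, 12, 16] example from earlier in the thread (the tie-breaking rule toward the smaller element is an arbitrary choice):

```python
import bisect

def nearest(A, x):
    """Return the element of the sorted list A closest to x,
    found via binary search; ties go to the smaller element."""
    i = bisect.bisect_left(A, x)     # insertion point for x in A
    if i == 0:
        return A[0]                  # x precedes everything
    if i == len(A):
        return A[-1]                 # x follows everything
    before, after = A[i - 1], A[i]   # the two candidate neighbours
    return after if after - x < x - before else before
```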
 * the topic is fine, the term is the only thing in question. Approximate matching can be construed as Approximate string matching which will not work. AlwaysAngry (talk) 14:55, 12 January 2018 (UTC)
 * There are many kinds of approximation. The article is quite clear about the sorts of approximate search that binary search can perform efficiently, and certainly doesn't lead the user to believe that it's talking about approximate string matching or incremental search. Have you found some reliable source that uses a better name?
 * And kindly avoid the personal attacks. --Macrakis (talk) 23:15, 13 January 2018 (UTC)
 * By that logic, common words like "run" are in trouble as there are tens or even hundreds of synonyms for them. Yet English speakers, even non-native ones, seem to speak the language just fine by examining their context and deducing. The article clearly gives context that binary search is not a search algorithm primarily for strings. Esquivalience (talk) 03:46, 14 January 2018 (UTC)

I don't think the wording is bad enough to argue over. By comparison, consider the correctness: "Rank queries can be performed using a modified version of binary search. By returning m on a successful search, and L on an unsuccessful search, the number of elements less than the target value is returned instead." The value m is the index of one of the successful matches, but not necessarily the first. So on a successful search it is not necessarily true that m is the number of smaller elements. This can be fixed by requiring only distinct elements in the array but that would severely hamper the usefulness. McKay (talk) 02:40, 15 January 2018 (UTC)
 * The better fix is to be careful to make the search always return the first match. —David Eppstein (talk) 03:14, 15 January 2018 (UTC)
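For concreteness, here is a leftmost-match search of the kind suggested (a Python sketch; the names are mine). With it, the returned index equals the number of elements strictly less than the target whether or not the target is present, which repairs the rank-query wording being criticised:

```python
def leftmost(A, T):
    """Leftmost-match binary search over the sorted list A.
    Returns the number of elements strictly less than T, which is
    also the index of the first occurrence of T when T is present."""
    L, R = 0, len(A)              # R is an exclusive bound
    while L < R:
        m = L + (R - L) // 2      # overflow-safe midpoint
        if A[m] < T:
            L = m + 1             # everything up to m is too small
        else:
            R = m                 # A[m] >= T: m stays a candidate
    return L
```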

Obscure text
Immediately after the pseudocode for the algorithm, there is
 * "Some implementations may check whether the middle element is equal to the target at the end of the procedure. This results in a faster comparison loop, but requires one more iteration on average."

I have no idea what this means. More correctly, I know what it means, but no novice reader has a chance of knowing. It refers to replacing the pair of tests (for < and >) by a one-sided test until the possible matches are reduced to one, followed at the end by an equality comparison with that single possibility. Either this has to be spelt out or deleted; currently it is pretty useless. McKay (talk) 02:32, 15 January 2018 (UTC)
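Spelt out, the deferred variant looks like this: a one-sided test narrows the candidates to a single index, and equality is checked exactly once at the end (a Python sketch; using None as the "not found" signal is my choice):

```python
def binary_search_deferred(A, T):
    """Deferred detection of equality: one comparison direction per
    iteration, with the equality test performed only once, after the
    candidates have been narrowed to a single index."""
    if not A:
        return None
    L, R = 0, len(A) - 1
    while L < R:
        m = L + (R - L) // 2
        if A[m] < T:
            L = m + 1             # target, if present, lies right of m
        else:
            R = m                 # A[m] >= T: keep m as a candidate
    return L if A[L] == T else None   # the single deferred equality test
```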
 * There was example code and a description for deferred detection of equality, explaining the average expectation of two tests per probe due to the lack of a three-way IF-statement in many computer languages (for instance, the Pascal of Wirth's version), but along the way it was removed (probably because no-one can quote a recognised reference) in the usual piecewise fashion, so obscure references remain. NickyMcLean (talk) 11:26, 12 June 2018 (UTC)
 * That version is already covered in the alternative procedure and duplicate elements section, and there are multiple alternate procedures which defer the equality check: Bottenbruch's version (which will return the rightmost element), and the version that used to be there (which will return the leftmost element). The performance of this version is covered in the only subsection in the Performance section. Esquivalience (talk) 12:01, 12 June 2018 (UTC)

Flawed algorithms in "Alternative procedure" and "Duplicate elements" sections
The pseudocode for the functions in the "Alternative procedure" and "Duplicate elements" sections (as of 2018-06-19) can fail if given an array of one element (n=1), because they can return an uninitialized value of m, due to m only being set in the loop body which is never executed for n=1. --Brouhaha (talk) 23:23, 19 June 2018 (UTC)
 * I have fixed the procedure by setting R to n (I had missed that part in the source), which is the same solution used by Python's implementation. However, the alternative procedure is from Bottenbruch's ALGOL 60 guide, and looking at the procedure from the guide, it also has that error (given A[1]...A[n], it takes the midpoint even if n=1), but I added a clarifying note for that. Esquivalience (talk) 21:05, 20 June 2018 (UTC)
 * It's broken. The first clue that something is fishy should be  .  If A[L] equals T then why return m -- you should just return L.  But the code will never return a successful search in any case.  If the target value is in the array then outside the while loop L will be one past the rightmost occurrence of the target and the A[L] == T test will always fail.  In fact it will be an out of bounds array reference if T >= A[n-1].  You probably need to do something like  .  Quale (talk) 06:40, 23 June 2018 (UTC)
 * Hopefully it is fixed now, but I decided to just make it return L and it can be interpreted in certain ways. Esquivalience (talk) 15:45, 23 June 2018 (UTC)
 * What certain ways can it be interpreted? I don't see that explained in the article.
 * Also the alternative procedure given earlier in the article and attributed to Bottenbruch is obviously broken too. I haven't seen his Algol 60 code, but I'm pretty sure it doesn't implement the algorithm described.  The word description of the algorithm doesn't match the pseudocode because it's an infinite loop.  Step 5 is dead code.  It will never be executed so the loop never terminates.  Quale (talk) 16:06, 23 June 2018 (UTC)
 * It is technically flawed, because if n=1, then the algorithm fails. Bottenbruch even assumed the existence of an element A[L+1] in his paper, implying that multiple elements are required for his algorithm to work. It could be corrected by setting R = n, but I left it that way for historical accuracy. However, because his version takes ceil((L + R)/2) instead of floor, even if, say, n=10, L = 8 and R = 9, then ceil(8.5) = 9 and the algorithm arrives at L = 9, so step 5 is always reached. Esquivalience (talk) 16:33, 23 June 2018 (UTC)
 * It's more than technically flawed, it's a totally broken infinite loop, possibly because of a typo in step 4. How is step 5 reached???
 * Set L to 0 and R to n − 1.
 * Set m (the position of the middle element) to the ceiling, or the least integer greater than (L + R) / 2.
 * If Am > T, set R to m − 1 and go to step 2.
 * Otherwise, if Am ≤ T, set L to m and go to step 2.
 * Now L = R, the search is done. If AL = T, return L. Otherwise, the search terminates as unsuccessful.
 * In step 4 Am ≤ T is a useless test because it will always be true and step 5 is never reached. Probably step 4 should use < rather than ≤.  Quale (talk) 21:48, 23 June 2018 (UTC)
 * It's supposed to terminate once L = R. The point of the alternative procedure is to not check for Am = T or any similar comparisons like that until the end of the search. I added a clarifying note to go to step 5 once L = R, but it is implied by step 5 (now L = R). Esquivalience (talk) 22:52, 23 June 2018 (UTC)
 * It should be almost identical to left most but with ceil instead of floor. — Preceding unsigned comment added by 24.16.83.93 (talk) 00:00, 7 April 2019 (UTC)
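A sketch of that suggestion in Python (an illustration, not taken from any reference, and assuming a non-empty array): compared with the leftmost variant, the midpoint is rounded up instead of down and the comparison direction flips, so L converges on the rightmost candidate:

```python
def rightmost(A, T):
    # Assumes len(A) >= 1. Returns the last index with A[L] <= T
    # (or 0 if every element exceeds T); as in the deferred-check
    # variant, the caller must still test A[L] == T for a match.
    L, R = 0, len(A) - 1
    while L < R:
        m = (L + R + 1) // 2   # integer ceiling of (L + R) / 2
        if A[m] > T:
            R = m - 1
        else:
            L = m
    return L
```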

I can't tell what has been fixed since June. As of August 15, 2018, the section looks like this:
 * Hermann Bottenbruch published the first implementation to leave out this check in 1962. However, Bottenbruch's version assumes that there is more than one element in the array. The subroutine for Bottenbruch's implementation is as follows, assuming that n > 1:


 * Set L to 0 and R to n − 1.
 * If L ≥ R, go to step 6.
 * Set m (the position of the middle element) to the ceiling, or the least integer greater than (L + R) / 2.
 * If Am > T, set R to m − 1 and go to step 2.
 * Otherwise, if Am ≤ T, set L to m and go to step 2.
 * Now, the search is done. If AL = T, return L. Otherwise, the search terminates as unsuccessful.

The procedure certainly does work for n=1. Both L and R are set to 0 in Step 1, then Step 2 sends us to Step 6, which compares the single element A0 against T. So that caveat is (no longer) required. In addition, the test L ≥ R in Step 2 can be L = R unless we want to allow n = 0 (in which case Step 6 would have to be changed to allow for it). Furthermore, the test Am ≤ T in Step 5 always succeeds if the test in Step 4 failed, so it isn't needed. Finally, the explanation of ceiling is wrong. Overall, I think a correct procedure for n ≥ 1 is:


 * 1) Set L to 0 and R to n − 1.
 * 2) If L = R, go to step 6.
 * 3) Set m (the position of the middle element) to the ceiling, i.e. the least integer greater than or equal to (L + R) / 2.
 * 4) If Am > T, set R to m − 1 and go to step 2.
 * 5) Set L to m and go to step 2.
 * 6) Now, the search is done. If AL = T, return L. Otherwise, the search terminates as unsuccessful.

Any disagreement? McKay (talk) 05:37, 15 August 2018 (UTC)
 * I think your proposed version is correct, so I have no objections. Esquivalience (talk) 13:51, 15 August 2018 (UTC)
 * Thanks, I put it in with your wording change. McKay (talk) 04:10, 16 August 2018 (UTC)
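Translated literally into Python (an illustration of the corrected steps above, assuming n ≥ 1; the integer expression (L + R + 1) // 2 computes the ceiling of (L + R) / 2):

```python
def bottenbruch(A, T):
    # Direct transcription of the corrected steps; assumes len(A) >= 1.
    L, R = 0, len(A) - 1          # step 1
    while L != R:                 # step 2
        m = (L + R + 1) // 2      # step 3: ceiling of (L + R) / 2
        if A[m] > T:              # step 4
            R = m - 1
        else:                     # step 5
            L = m
    # step 6: single deferred equality check
    return L if A[L] == T else None
```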

Malicious Attack
A number of pictures from obvious malicious attackers were added to this page today, when it was the featured article on the main page. — Preceding unsigned comment added by Rustc firefox (talk • contribs)
 * It was vandalism to one of the templates used in the article. —David Eppstein (talk) 06:56, 29 October 2018 (UTC)