Byte Sieve

The Byte Sieve is a computer-based implementation of the Sieve of Eratosthenes published by Byte as a programming language performance benchmark. It first appeared in the September 1981 edition of the magazine and was revisited on occasion. Although intended to compare the performance of different languages on the same computers, it quickly became a widely used machine benchmark.

The Sieve was one of the more popular benchmarks of the home computer era, another being the Creative Computing Benchmark of 1983, and the Rugg/Feldman benchmarks, mostly seen in the UK in this era. Byte later published the more thorough NBench in 1995 to replace it.

Origins
Jim Gilbreath of the Naval Ocean System Center had been considering the concept of writing a small language benchmarking program for some time, desiring one that would be portable across languages, small enough that the program code would fit on a single printed page, and did not rely on specific features like hardware multiplication or division. The solution was inspired by a meeting with Chuck Forsberg at the January 1980 USENIX meeting in Boulder, CO, where Forsberg mentioned an implementation of the sieve written by Donald Knuth.

Gilbreath felt the sieve would be an ideal benchmark as it avoided indirect tests on arithmetic performance, which varied widely between systems. The algorithm mostly stresses array lookup performance and basic logic and branching capabilities. Nor does it require any advanced language features like recursion or advanced collection types. The only modification from Knuth’s original version was to remove a multiplication by two and replace it with an addition instead. With the original version, machines with hardware multipliers would otherwise run so much faster that the rest of the performance would be hidden.

After six months of effort porting it to as many platforms as he had access to, the first results were introduced in the September 1981 edition of Byte in an article entitled "A High-Level Language Benchmark". Gilbreath was quick to point out that:

I should emphasize that this benchmark is not the only criterion by which to judge a language or compiler.

The article provided reference implementations in ten languages, including more popular selections like BASIC, C, Pascal, COBOL, and FORTRAN, and some less well-known examples like Forth, ZSPL, Ratfor, PL/1 and PLMX.

Example runs were provided for a variety of machines, mostly Zilog Z80 or MOS 6502-based. The best time was initially 16.5 seconds, turned in by Ratfor on a 4 MHz Z80 machine, but Gary Kildall personally provided a version in Digital Research's prototype version of PL/1 that ran in 14 seconds and set the mark for this first collection. The slowest was Microsoft COBOL on the same machine, which took a whopping 5115 seconds (almost one and a half hours), longer even than interpreted languages like BASIC. A notable feature of this first run was that C, Pascal and PL/1 all turned in a roughly similar performance that easily beat the various interpreters.

A second set of tests was carried out on more powerful machines, with Motorola 68000 assembly language turning in the fastest times at 1.12 seconds, slightly besting C on a PDP-11/70 and almost twice as fast as 8086 assembler. Most PDP-11 and HP-3000 times were much slower, on the order of 10 to 50 seconds. Tests on these machines using only high-level languages was led by NBS Pascal on the PDP-11, at 2.6 seconds.

UCSD Pascal provided another interesting set of results as the same program can be run on multiple machines. Running on the dedicated Ithaca InterSystems Pascal-100 machine, a Pascal MicroEngine based computer, it ran in 54 seconds, while on the Z80 it was 239, and 516 on the Apple II.

Spread
Gilbreath, this time along with his brother Gary, revisited the code in the January 1983 edition of Byte. This version removed most of the less popular languages, leaving Pascal, C, FORTRAN IV, and COBOL, while adding Ada and Modula-2. Thanks to readers providing additional samples, the number of machines, operating systems and languages compared in the resulting tables was greatly expanded.

Motorola 68000 (68k) assembly remained the fastest, almost three times the speed of the Intel 8086 running at the same 8 MHz clock. Using high-level languages the two were closer in performance, with the 8086 generally better than half the speed of the 68k and often much closer. A wider variety of minicomputers and mainframes was also included, with times that the 68k generally beat except for the very fastest machines like the IBM 3033 and high-end models of the VAX. Older machines like the Data General Nova, PDP-11 and HP-1000 were nowhere near as fast as the 68k.

Gilbreath's second article appeared as the benchmark was becoming quite common as a way to compare the performance of various machines, let alone languages. In spite of his original warning not to do so, it soon began appearing in magazine advertisements as a way to compare performance against the competition, and as a general benchmark.

Byte once again revisited the sieve later in August 1983 as part of a whole-magazine series of articles on the C language. In this case the use was more in keeping with the original intent, using a single source code and running it on a single machine to compare the performance of C compilers on the CP/M-86 operating system, on CP/M-80, and for the IBM PC.

In spite of Gilbreath's concern in the original article, by this time the code had become almost universal for testing, and one of the articles remarked that "The Sieve of Eratosthenes is a mandatory benchmark". It was included in the Byte UNIX Benchmark Suite introduced in August 1984.

Today
New versions of the code continue to appear for new languages, eg Rosetta Code and GitHub has many versions available. It is often used as an example of functional programming in spite of the common version not actually using the sieve algorithm.

Implementation
The provided implementation calculated odd primes only, so the 8191 element array actually represented primes less than 16385. As shown in a sidebar table, the 0th element represented 3, 1st element 5, 2nd element 7, and so on.

This is the original BASIC version of the code presented in 1981. The dialect is not specified, but a number of details mean it does not run under early versions of Microsoft BASIC (4.x and earlier), among these the use of long variable names like SIZE and FLAGS. The lack of line numbers may suggest a minicomputer variety that reads source from a text file, but may have also been a printing error. And in C, with some whitespace adjustments from the original: