Colossus computer

Colossus was a set of computers developed by British codebreakers in the years 1943–1945 to help in the cryptanalysis of the Lorenz cipher. Colossus used thermionic valves (vacuum tubes) to perform Boolean and counting operations. Colossus is thus regarded as the world's first programmable, electronic, digital computer, although it was programmed by switches and plugs and not by a stored program.

Colossus was designed by General Post Office (GPO) research telephone engineer Tommy Flowers based on plans developed by mathematician Max Newman at the Government Code and Cypher School (GC&CS) at Bletchley Park.

Alan Turing's use of probability in cryptanalysis (see Banburismus) contributed to its design. It has sometimes been erroneously stated that Turing designed Colossus to aid the cryptanalysis of the Enigma. (Turing's machine that helped decode Enigma was the electromechanical Bombe, not Colossus.)

The prototype, Colossus Mark 1, was shown to be working in December 1943 and was in use at Bletchley Park by early 1944. An improved Colossus Mark 2, which used shift registers to quintuple the processing speed, first worked on 1 June 1944, just in time for the Normandy landings on D-Day. Ten Colossi were in use by the end of the war and an eleventh was being commissioned. Bletchley Park's use of these machines allowed the Allies to obtain a vast amount of high-level military intelligence from intercepted radiotelegraphy messages between the German High Command (OKW) and their army commands throughout occupied Europe.

The existence of the Colossus machines was kept secret until the mid-1970s. All but two machines were dismantled into such small parts that their use could not be inferred. The two retained machines were eventually dismantled in the 1960s. In January 2024, new photos were released by GCHQ that showed a re-engineered Colossus in a very different environment from the Bletchley Park buildings, presumably at GCHQ Cheltenham. A functioning reconstruction of a Mark 2 Colossus was completed in 2008 by Tony Sale and a team of volunteers; it is on display in The National Museum of Computing at Bletchley Park.

Purpose and origins


The Colossus computers were used to help decipher intercepted radio teleprinter messages that had been encrypted using an unknown device. Intelligence information revealed that the Germans called the wireless teleprinter transmission systems "Sägefisch" (sawfish). This led the British to call encrypted German teleprinter traffic "Fish", and the unknown machine and its intercepted messages "Tunny" (tunafish).

Before the Germans increased the security of their operating procedures, British cryptanalysts diagnosed how the unseen machine functioned and built an imitation of it called "British Tunny".

It was deduced that the machine had twelve wheels and used a Vernam ciphering technique on message characters in the standard 5-bit ITA2 telegraph code. It did this by combining the plaintext characters with a stream of key characters using the XOR Boolean function to produce the ciphertext.
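The XOR combination described above can be illustrated with a minimal sketch. The 5-bit values below are hypothetical, chosen only for illustration, and do not represent real ITA2 traffic or a real Tunny keystream; the function name `vernam` is likewise an assumption.

```python
def vernam(chars, key):
    """XOR each 5-bit character code with the matching key character."""
    if len(chars) != len(key):
        raise ValueError("text and key must be the same length")
    # Mask to 5 bits, the width of a teleprinter character.
    return [(c ^ k) & 0b11111 for c, k in zip(chars, key)]


# Hypothetical 5-bit values, for illustration only.
plaintext = [0b00011, 0b11001, 0b01110, 0b10000]
key = [0b10101, 0b00110, 0b11100, 0b01011]

ciphertext = vernam(plaintext, key)
# XOR is its own inverse, so applying the same key again recovers the text.
recovered = vernam(ciphertext, key)
```

Because XOR is self-inverse, the same operation serves for both enciphering and deciphering, provided both ends generate the same key stream.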

In August 1941, a blunder by German operators led to the transmission of two versions of the same message with identical machine settings. These were intercepted and worked on at Bletchley Park. First, John Tiltman, a very talented GC&CS cryptanalyst, derived a keystream of almost 4000 characters. Then Bill Tutte, a newly arrived member of the Research Section, used this keystream to work out the logical structure of the Lorenz machine. He deduced that the twelve wheels consisted of two groups of five, which he named the χ (chi) and ψ (psi) wheels; the remaining two he called μ (mu) or "motor" wheels. The chi wheels stepped regularly with each letter that was encrypted, while the psi wheels stepped irregularly, under the control of the motor wheels.
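Why two messages sent with identical settings are so damaging can be sketched from the general property of a Vernam cipher: XOR-ing the two ciphertexts cancels the shared key and leaves the XOR of the two plaintexts. The sketch below shows only that property, with hypothetical 5-bit values; it is not a reconstruction of Tiltman's actual working.

```python
def xor_streams(a, b):
    """XOR two equal-length streams of 5-bit character codes."""
    return [x ^ y for x, y in zip(a, b)]


# Hypothetical values: two slightly different plaintexts, one shared key.
key = [0b10101, 0b00110, 0b11100, 0b01011]
p1 = [0b00011, 0b11001, 0b01110, 0b10000]
p2 = [0b00011, 0b11001, 0b00101, 0b10010]

c1 = xor_streams(p1, key)
c2 = xor_streams(p2, key)

# The key cancels: c1 XOR c2 equals p1 XOR p2.
combined = xor_streams(c1, c2)

# Once one plaintext has been worked out, the key stream follows directly.
recovered_key = xor_streams(c1, p1)
```

Separating `combined` into the two plaintexts is the hard, linguistic part of the job; the code only shows why the key drops out and how a recovered plaintext yields the keystream.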

With a sufficiently random keystream, a Vernam cipher removes the uneven character frequency distribution that is natural to a plaintext message, producing a uniform distribution in the ciphertext. The Tunny machine did this well. However, the cryptanalysts worked out that the frequency distribution of the character-to-character changes in the ciphertext, rather than of the characters themselves, departed from uniformity, and this provided a way into the system. The changes were obtained by "differencing", in which each bit or character was XOR-ed with its successor. After Germany surrendered, Allied forces captured a Tunny machine and discovered that it was the electromechanical Lorenz SZ (Schlüsselzusatzgerät, cipher attachment) in-line cipher machine.
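Differencing as described above can be sketched in a few lines. The function name `delta` and the sample values are illustrative assumptions. The sketch also checks a property that follows from the algebra of XOR: the difference of the ciphertext equals the difference of the plaintext XOR-ed with the difference of the key.

```python
def delta(stream):
    """XOR each element with its successor; the result is one shorter."""
    return [a ^ b for a, b in zip(stream, stream[1:])]


# Hypothetical 5-bit values, for illustration only.
plain = [0b00011, 0b00011, 0b11001, 0b01110]
key = [0b10101, 0b00110, 0b11100, 0b01011]
cipher = [p ^ k for p, k in zip(plain, key)]

d_plain = delta(plain)
d_key = delta(key)
d_cipher = delta(cipher)

# Differencing distributes over XOR.
d_combined = [p ^ k for p, k in zip(d_plain, d_key)]
```

A repeated character in the plaintext shows up as an all-zero difference (the first element of `d_plain` here), which is the kind of non-uniformity the differenced stream can expose.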

In order to decrypt the transmitted messages, two tasks had to be performed. The first was "wheel breaking", which was the discovery of the cam patterns for all the wheels. These patterns were set up on the Lorenz machine and then used for a fixed period of time for a succession of different messages. Each transmission, which often contained more than one message, was enciphered with a different start position of the wheels. Alan Turing invented a method of wheel-breaking that became known as Turingery. Turing's technique was further developed into "Rectangling", for which Colossus could produce tables for manual analysis. Colossi 2, 4, 6, 7 and 9 had a "gadget" to aid this process.

The second task was "wheel setting", which worked out the start positions of the wheels for a particular message and could only be attempted once the cam patterns were known. It was this task for which Colossus was initially designed. To discover the start position of the chi wheels for a message, Colossus compared two character streams, counting statistics from the evaluation of programmable Boolean functions. The two streams were the ciphertext, which was read at high speed from a paper tape, and the keystream, which was generated internally, in a simulation of the unknown German machine. After a succession of different Colossus runs to discover the likely chi-wheel settings, they were checked by examining the frequency distribution of the characters in the processed ciphertext. Colossus produced these frequency counts.

Decryption processes
By using differencing and knowing that the psi wheels did not advance with each character, Tutte worked out that trying just two differenced bits (impulses) of the chi-stream against the differenced ciphertext would produce a statistic that was non-random. This became known as Tutte's "1+2 break in". It involved calculating the following Boolean function:
 * $$\Delta Z_1 \oplus \Delta Z_2 \oplus \Delta\chi_1 \oplus \Delta\chi_2 = \bullet$$

and counting the number of times it yielded "false" (zero). If this number exceeded a pre-defined threshold value known as the "set total", it was printed out. The cryptanalyst would examine the printout to determine which of the putative start positions was most likely to be the correct one for the chi-1 and chi-2 wheels.
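The counting procedure can be sketched as follows. This is a toy model under stated assumptions: the wheel patterns and lengths are hypothetical and far shorter than the real ones, the psi and plaintext contributions are set to zero so that the true setting scores perfectly, and the function names are invented for illustration. It is not a description of the Colossus circuitry.

```python
def delta_bits(bits):
    """Difference a bit stream: XOR each bit with its successor."""
    return [a ^ b for a, b in zip(bits, bits[1:])]


def wheel_stream(pattern, start, length):
    """Bits from a regularly stepping wheel, beginning at position `start`."""
    n = len(pattern)
    return [pattern[(start + i) % n] for i in range(length)]


def one_plus_two_count(z1, z2, chi1, chi2, start1, start2):
    """Count how often dZ1 ^ dZ2 ^ dChi1 ^ dChi2 is 0 for one trial setting."""
    n = len(z1)
    dz1, dz2 = delta_bits(z1), delta_bits(z2)
    dx1 = delta_bits(wheel_stream(chi1, start1, n))
    dx2 = delta_bits(wheel_stream(chi2, start2, n))
    return sum(1 for a, b, c, d in zip(dz1, dz2, dx1, dx2) if a ^ b ^ c ^ d == 0)


def scores_above_set_total(z1, z2, chi1, chi2, set_total):
    """Try every pair of start positions; keep those exceeding the set total."""
    hits = []
    for s1 in range(len(chi1)):
        for s2 in range(len(chi2)):
            count = one_plus_two_count(z1, z2, chi1, chi2, s1, s2)
            if count > set_total:
                hits.append((s1, s2, count))
    return hits


# Hypothetical toy wheels (not the real patterns or lengths).
CHI1 = [1, 0, 0, 1, 1]
CHI2 = [0, 1, 1, 0, 1, 0, 0]

# Toy "ciphertext" impulses: the chi streams alone, starting at (2, 4).
z1 = wheel_stream(CHI1, 2, 35)
z2 = wheel_stream(CHI2, 4, 35)

true_count = one_plus_two_count(z1, z2, CHI1, CHI2, 2, 4)
hits = scores_above_set_total(z1, z2, CHI1, CHI2, set_total=33)
```

With real traffic the correct setting gives only a modest excess over the chance level rather than a perfect score, which is why a threshold and a cryptanalyst's judgement were needed.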

This technique would then be applied to other pairs of, or single, impulses to determine the likely start position of all five chi wheels. From this, the de-chi (D) of a ciphertext could be obtained, from which the psi component could be removed by manual methods. If the frequency distribution of characters in the de-chi version of the ciphertext was within certain bounds, "wheel setting" of the chi wheels was considered to have been achieved, and the message settings and de-chi were passed to the "Testery". This was the section at Bletchley Park led by Major Ralph Tester where the bulk of the decrypting work was done by manual and linguistic methods.

Colossus could also derive the start position of the psi and motor wheels. Regular use of this additional capability became feasible in the last few months of the war, when there were plenty of Colossi available and the number of Tunny messages had declined.

Design and construction
Colossus was developed for the "Newmanry", the section headed by the mathematician Max Newman that was responsible for machine methods against the twelve-rotor Lorenz SZ40/42 on-line teleprinter cipher machine (code-named Tunny, for tunafish). The Colossus design arose out of a parallel project that produced a less-ambitious counting machine dubbed "Heath Robinson". Although the Heath Robinson machine proved the concept of machine analysis for this part of the process, it had serious limitations. The electro-mechanical parts were relatively slow and it was difficult to synchronise two looped paper tapes, one containing the enciphered message, and the other representing part of the keystream of the Lorenz machine. Also the tapes tended to stretch and break when being read at up to 2000 characters per second.

Tommy Flowers MBE was a senior electrical engineer and Head of the Switching Group at the Post Office Research Station at Dollis Hill. Prior to his work on Colossus, he had been involved with GC&CS at Bletchley Park from February 1941 in an attempt to improve the Bombes that were used in the cryptanalysis of the German Enigma cipher machine. He was recommended to Max Newman by Alan Turing, who had been impressed by his work on the Bombes.

The main components of the Heath Robinson machine were as follows.
 * A tape transport and reading mechanism that ran the looped key and message tapes at between 1000 and 2000 characters per second.
 * A combining unit that implemented the logic of Tutte's method.
 * A counting unit that had been designed by C. E. Wynn-Williams of the Telecommunications Research Establishment (TRE) at Malvern, which counted the number of times the logical function returned a specified truth value.

Flowers had been brought in to design the Heath Robinson's combining unit. He was not impressed by the system of a key tape that had to be kept synchronised with the message tape and, on his own initiative, he designed an electronic machine which eliminated the need for the key tape by having an electronic analogue of the Lorenz (Tunny) machine. He presented this design to Max Newman in February 1943, but the idea that the one to two thousand thermionic valves (vacuum tubes and thyratrons) proposed could work together reliably was greeted with great scepticism, so more Robinsons were ordered from Dollis Hill. Flowers, however, knew from his pre-war work that most thermionic valve failures occurred as a result of the thermal stresses at power-up, so not powering a machine down reduced failure rates to very low levels. Additionally, if the heaters were started at a low voltage and then slowly brought up to full voltage, thermal stress was reduced. The valves themselves could be soldered in to avoid problems with plug-in bases, which could be unreliable. Flowers persisted with the idea and obtained support from the Director of the Research Station, W Gordon Radley.

Flowers and his team of some fifty people in the switching group spent eleven months from early February 1943 designing and building a machine that dispensed with the second tape of the Heath Robinson, by generating the wheel patterns electronically. Flowers used some of his own money for the project. This prototype, Mark 1 Colossus, contained 1,600 thermionic valves (tubes). It performed satisfactorily at Dollis Hill on 8 December 1943 and was dismantled and shipped to Bletchley Park, where it was delivered on 18 January and re-assembled by Harry Fensom and Don Horwood. It was operational in January and it successfully attacked its first message on 5 February 1944. It was a large structure and was dubbed 'Colossus'. A memo held in the National Archives written by Max Newman on 18 January 1944 records that "Colossus arrives today".

During the development of the prototype, an improved design had been developed – the Mark 2 Colossus. Four of these were ordered in March 1944 and by the end of April the number on order had been increased to twelve. Dollis Hill was put under pressure to have the first of these working by 1 June. Allen Coombs took over leadership of the production Mark 2 Colossi, the first of which – containing 2,400 valves – became operational at 08:00 on 1 June 1944, just in time for the Allied Invasion of Normandy on D-Day. Subsequently, Colossi were delivered at the rate of about one a month. By the time of V-E Day there were ten Colossi working at Bletchley Park and a start had been made on assembling an eleventh. Seven of the Colossi were used for 'wheel setting' and three for 'wheel breaking'.

The main units of the Mark 2 design were as follows.
 * A tape transport with an 8-photocell reading mechanism.
 * A six character FIFO shift register.
 * Twelve thyratron ring stores that simulated the Lorenz machine generating a bit-stream for each wheel.
 * Panels of switches for specifying the program and the "set total".
 * A set of functional units that performed Boolean operations.
 * A "span counter" that could suspend counting for part of the tape.
 * A master control that handled clocking, start and stop signals, counter readout and printing.
 * Five electronic counters.
 * An electric typewriter.

Most of the design of the electronics was the work of Tommy Flowers, assisted by William Chandler, Sidney Broadhurst and Allen Coombs, with Erie Speight and Arnold Lynch developing the photoelectric reading mechanism. Coombs remembered Flowers, having produced a rough draft of his design, tearing it into pieces that he handed out to his colleagues for them to do the detailed design and get their team to manufacture it. The Mark 2 Colossi were both five times faster and simpler to operate than the prototype.

Data input to Colossus was by photoelectric reading of a paper tape transcription of the enciphered intercepted message. This was arranged in a continuous loop so that it could be read and re-read multiple times – there being no internal storage for the data. The design overcame the problem of synchronizing the electronics with the speed of the message tape by generating a clock signal from reading its sprocket holes. The speed of operation was thus limited by the mechanics of reading the tape. During development, the tape reader was tested up to 9700 characters per second (53 mph) before the tape disintegrated, so 5000 characters/second (40 ft/s) was settled on as the speed for regular use. Flowers designed a 6-character shift register, which was used both for computing the delta function (ΔZ) and for testing five different possible starting points of Tunny's wheels in the five processors. This five-way parallelism enabled five simultaneous tests and counts to be performed, giving an effective processing speed of 25,000 characters per second. The computation used algorithms devised by W. T. Tutte and colleagues to decrypt a Tunny message.
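The role of the 6-character shift register can be sketched in software. Six stored characters yield five successive differences, one for each of the five processors. How the processors actually tapped the register is not described here, so the arrangement below is an assumption made for illustration, as are the names `five_way_deltas`, `TAPE_RATE` and `PROCESSORS`.

```python
from collections import deque


def five_way_deltas(tape):
    """Yield five adjacent differences for each new character read.

    A deque with maxlen=6 stands in for the 6-character shift register:
    each new character pushes the oldest one out.
    """
    register = deque(maxlen=6)
    for ch in tape:
        register.append(ch)
        if len(register) == 6:
            window = list(register)
            # Six stored characters give five character-to-character changes.
            yield [a ^ b for a, b in zip(window, window[1:])]


# Figures from the text: tape speed and degree of parallelism.
TAPE_RATE = 5000   # characters per second read from the tape
PROCESSORS = 5     # simultaneous tests and counts
effective_rate = TAPE_RATE * PROCESSORS

# Hypothetical tape contents, for illustration only.
windows = list(five_way_deltas([1, 2, 3, 4, 5, 6, 7]))
```

The effective rate is simply the tape rate multiplied by the number of parallel counts, which gives the 25,000 characters per second quoted above.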

Operation
The Newmanry was staffed by cryptanalysts, operators from the Women's Royal Naval Service (WRNS) – known as "Wrens" – and engineers who were permanently on hand for maintenance and repair. By the end of the war the staffing was 272 Wrens and 27 men.

The first job in operating Colossus for a new message was to prepare the paper tape loop. This was performed by the Wrens who stuck the two ends together using Bostik glue, ensuring that there was a 150-character length of blank tape between the end and the start of the message. Using a special hand punch they inserted a start hole between the third and fourth channels 2½ sprocket holes from the end of the blank section, and a stop hole between the fourth and fifth channels 1½ sprocket holes from the end of the characters of the message. These were read by specially positioned photocells and indicated when the message was about to start and when it ended. The operator would then thread the paper tape through the gate and around the pulleys of the bedstead and adjust the tension. The two-tape bedstead design had been carried on from Heath Robinson so that one tape could be loaded whilst the previous one was being run. A switch on the Selection Panel specified the "near" or the "far" tape.

After performing various resetting and zeroizing tasks, the Wren operators would, under instruction from the cryptanalyst, operate the "set total" decade switches and the K2 panel switches to set the desired algorithm. They would then start the bedstead tape motor and lamp and, when the tape was up to speed, operate the master start switch.

Programming
Howard Campaigne, a mathematician and cryptanalyst from the US Navy's OP-20-G, wrote the following in a foreword to Flowers' 1983 paper "The Design of Colossus": "My view of Colossus was that of cryptanalyst-programmer. I told the machine to make certain calculations and counts, and after studying the results, told it to do another job. It did not remember the previous result, nor could it have acted upon it if it did. Colossus and I alternated in an interaction that sometimes achieved an analysis of an unusual German cipher system, called 'Geheimschreiber' by the Germans, and 'Fish' by the cryptanalysts."

Colossus was not a stored-program computer. The input data for the five parallel processors was read from the looped message paper tape and the electronic pattern generators for the chi, psi and motor wheels. The programs for the processors were set and held on the switches and jack panel connections. Each processor could evaluate a Boolean function and count and display the number of times it yielded the specified value of "false" (0) or "true" (1) for each pass of the message tape.

Input to the processors came from two sources: the shift registers from tape reading and the thyratron rings that emulated the wheels of the Tunny machine. The characters on the paper tape were called Z and the characters from the Tunny emulator were referred to by the Greek letters that Bill Tutte had given them when working out the logical structure of the machine. On the selection panel, switches specified either $$Z$$ or $$\Delta Z$$, either $$\chi$$ or $$\Delta\chi$$ and either $$\psi$$ or $$\Delta\psi$$ for the data to be passed to the jack field and 'K2 switch panel'. These signals from the wheel simulators could be specified as stepping on with each new pass of the message tape or not.

The K2 switch panel had a group of switches on the left-hand side to specify the algorithm. The switches on the right-hand side selected the counter to which the result was fed. The plugboard allowed less specialized conditions to be imposed. Overall the K2 switch panel switches and the plugboard allowed about five billion different combinations of the selected variables.

As an example: a set of runs for a message tape might initially involve two chi wheels, as in Tutte's 1+2 algorithm. Such a two-wheel run was called a long run, taking on average eight minutes unless the parallelism was utilised to cut the time by a factor of five. The subsequent runs might only involve setting one chi wheel, giving a short run taking about two minutes. At first, the choice of the next algorithm to be tried after the initial long run was specified by the cryptanalyst. Experience showed, however, that decision trees for this iterative process could be produced for use by the Wren operators in a proportion of cases.

Influence and fate
Although the Colossus was the first of the electronic digital machines with programmability, albeit limited by modern standards, it was not a general-purpose machine, being designed for a range of cryptanalytic tasks, most involving counting the results of evaluating Boolean algorithms.

A Colossus computer was thus not a fully Turing complete machine. However, University of San Francisco professor Benjamin Wells has shown that if all ten Colossus machines made were rearranged in a specific cluster, then the entire set of computers could have simulated a universal Turing machine, and thus be Turing complete.

Colossus and the reasons for its construction were highly secret and remained so for 30 years after the War. Consequently, it was not included in the history of computing hardware for many years, and Flowers and his associates were deprived of the recognition they were due. All but two of the Colossi were dismantled after the war and parts returned to the Post Office. Some parts, sanitised as to their original purpose, were taken to Max Newman's Royal Society Computing Machine Laboratory at Manchester University. Two Colossi, along with two Tunny machines, were retained and moved to GCHQ's new headquarters at Eastcote in April 1946, and then to Cheltenham between 1952 and 1954. One of the Colossi, known as Colossus Blue, was dismantled in 1959; the other in the 1960s. Tommy Flowers was ordered to destroy all documentation. He duly burnt it in a furnace and later said of that order: "That was a terrible mistake. I was instructed to destroy all the records, which I did. I took all the drawings and the plans and all the information about Colossus on paper and put it in the boiler fire. And saw it burn."

The Colossi were adapted for other purposes, with varying degrees of success; in their later years they were used for training. Jack Good related how he was the first to use Colossus after the war, persuading the US National Security Agency that it could be used to perform a function for which they were planning to build a special-purpose machine. Colossus was also used to perform character counts on one-time pad tape to test for non-randomness.

A small number of people who were associated with Colossus—and knew that large-scale, reliable, high-speed electronic digital computing devices were feasible—played significant roles in early computer work in the UK and probably in the US. However, being so secret, it had little direct influence on the development of later computers; it was EDVAC that was the seminal computer architecture of the time. In 1972, Herman Goldstine, who was unaware of Colossus and its legacy to the projects of people such as Alan Turing (ACE), Max Newman (Manchester computers) and Harry Huskey (Bendix G-15), wrote that,

"Britain had such vitality that it could immediately after the war embark on so many well-conceived and well-executed projects in the computer field."

Professor Brian Randell, who unearthed information about Colossus in the 1970s, commented on this, saying that: "It is my opinion that the COLOSSUS project was an important source of this vitality, one that has been largely unappreciated, as has the significance of its places in the chronology of the invention of the digital computer."

Randell's efforts started to bear fruit in the mid-1970s. The secrecy about Bletchley Park had been broken when Group Captain Winterbotham published his book The Ultra Secret in 1974. Randell was researching the history of computer science in Britain for a conference on the history of computing held at the Los Alamos Scientific Laboratory, New Mexico on 10–15 June 1976, and got permission to present a paper on wartime development of the COLOSSI at the Post Office Research Station, Dollis Hill (in October 1975 the British Government had released a series of captioned photographs from the Public Record Office). The interest in the "revelations" in his paper resulted in a special evening meeting when Randell and Coombs answered further questions. Coombs later wrote that "no member of our team could ever forget the fellowship, the sense of purpose and, above all, the breathless excitement of those days". In 1977 Randell published an article, "The First Electronic Computer", in several journals.

In October 2000, a 500-page technical report on the Tunny cipher and its cryptanalysis—entitled General Report on Tunny—was released by GCHQ to the national Public Record Office, and it contains a fascinating paean to Colossus by the cryptographers who worked with it: "It is regretted that it is not possible to give an adequate idea of the fascination of a Colossus at work; its sheer bulk and apparent complexity; the fantastic speed of thin paper tape round the glittering pulleys; the childish pleasure of not-not, span, print main header and other gadgets; the wizardry of purely mechanical decoding letter by letter (one novice thought she was being hoaxed); the uncanny action of the typewriter in printing the correct scores without and beyond human aid; the stepping of the display; periods of eager expectation culminating in the sudden appearance of the longed-for score; and the strange rhythms characterizing every type of run: the stately break-in, the erratic short run, the regularity of wheel-breaking, the stolid rectangle interrupted by the wild leaps of the carriage-return, the frantic chatter of a motor run, even the ludicrous frenzy of hosts of bogus scores."

Reconstruction
A team led by Tony Sale built a fully functional reconstruction of a Colossus Mark 2 between 1993 and 2008. In spite of the blueprints and hardware being destroyed, a surprising amount of material had survived, mainly in engineers' notebooks, but a considerable amount of it in the U.S. The optical tape reader might have posed the biggest problem, but Dr. Arnold Lynch, its original designer, was able to redesign it to his own original specification. The reconstruction is on display, in the historically correct place for Colossus No. 9, at The National Museum of Computing, in H Block, Bletchley Park, in Milton Keynes, Buckinghamshire.

In November 2007, to celebrate the project completion and to mark the start of a fundraising initiative for The National Museum of Computing, a Cipher Challenge pitted the rebuilt Colossus against radio amateurs worldwide in being first to receive and decode three messages enciphered using the Lorenz SZ42 and transmitted from radio station DL0HNF in the Heinz Nixdorf MuseumsForum computer museum. The challenge was easily won by radio amateur Joachim Schüth, who had carefully prepared for the event and developed his own signal processing and code-breaking code using Ada. The Colossus team were hampered by their wish to use World War II radio equipment, delaying them by a day because of poor reception conditions. Nevertheless, the victor's 1.4 GHz laptop, running his own code, took less than a minute to find the settings for all 12 wheels. The German codebreaker said: "My laptop digested ciphertext at a speed of 1.2 million characters per second—240 times faster than Colossus. If you scale the CPU frequency by that factor, you get an equivalent clock of 5.8 MHz for Colossus. That is a remarkable speed for a computer built in 1944."

The Cipher Challenge verified the successful completion of the rebuilding project. "On the strength of today's performance Colossus is as good as it was six decades ago", commented Tony Sale. "We are delighted to have produced a fitting tribute to the people who worked at Bletchley Park and whose brainpower devised these fantastic machines which broke these ciphers and shortened the war by many months."



Other meanings
There was a fictional computer named Colossus in the 1970 film Colossus: The Forbin Project which was based on the 1966 novel Colossus by D. F. Jones. This was a coincidence as it pre-dates the public release of information about Colossus, or even its name.

Neal Stephenson's novel Cryptonomicon (1999) also contains a fictional treatment of the historical role played by Turing and Bletchley Park.