BinHex

BinHex, originally short for "binary-to-hexadecimal", is a binary-to-text encoding system that was used on the classic Mac OS for sending binary files through e-mail. The first versions used a hexadecimal encoding; later versions are more similar to uuencode, but combine both "forks" of the Mac file system together with extended file information. BinHexed files take up more space than the original files, but are not corrupted by software that is not "8-bit clean".

TRS-80 BinHex (.hex)
BinHex was written in 1981 by Tim Mann for the TRS-80 as a standalone version of an encoding scheme built into a popular terminal emulator, ST80-III by Lance Micklus. BinHex was used for sending files via major online services, such as CompuServe, which were not "8-bit clean" and required ASCII armoring for binary files to survive. Not everyone used ST-80, however, so Mann wrote BinHex to allow users of other terminals to use the format.

The original ST-80 system worked by converting the binary file contents to hexadecimal numbers, which were encoded as ASCII digits and letters (0–9, A–F). It then added a newline after every 60 characters. The system became very popular after Mann uploaded it to CompuServe's TRS-80 files area. It soon gained a checksum at the end of every line to check for errors. Bill Stockwell converted that version to the BASIC/S compiler, which ran much faster than Mann's interpreted version.
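The hexadecimal scheme described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the use of uppercase digits is an assumption, and the per-line checksum is left out because its exact form is not given here.

```python
def hex_armor(data: bytes, width: int = 60) -> str:
    """Encode bytes as hex digits, with a newline after every `width` characters."""
    digits = data.hex().upper()  # uppercase is an assumption, not from the source
    lines = [digits[i:i + width] for i in range(0, len(digits), width)]
    return "".join(line + "\n" for line in lines)


def hex_unarmor(text: str) -> bytes:
    """Reverse hex_armor: drop all whitespace, then decode the hex digits."""
    return bytes.fromhex("".join(text.split()))


if __name__ == "__main__":
    payload = bytes(range(40))
    armored = hex_armor(payload)
    print(armored, end="")
    print(hex_unarmor(armored) == payload)
```

Every input byte becomes two characters, so the armored text is at least twice the size of the original file.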

BinHex files of the era were typically given the file extension .hex. Ports soon appeared for other popular platforms, including the Apple II. CompuServe later added support for 8-bit transfers, and the format quickly disappeared.

Mac BinHex (.hex)
The file upload problem still existed on CompuServe when the Mac was first released in 1984. In April 1984, William Davis ported BinHex to the Mac using Microsoft BASIC to produce a version that was largely identical to the TRS-80 versions of the same era. This version only supported encoding of the "data fork", ignoring the resource fork, which meant it could only be used for data files. The rise in use of Internet e-mail coincided roughly with the release of the Macintosh, and Davis's version was posted on the Info-Mac mailing list by Joel Heller in June 1984. Several newer versions were published during 1984, resulting in BinHex 3 that could encode both forks.

Yves Lempereur, author of the first assembler for the Mac, MacASM, found that in order to upload his files to CompuServe he had to use BinHex. The BASIC version was very slow, so Lempereur ported BinHex 3 to assembler and released it as BinHex 1.0. The program was roughly a hundred times as fast as the BASIC version, and soon upgrade requests were flooding in.

Compact BinHex (.hcx)
The original BinHex was a fairly simple format, and not a very efficient one: the hexadecimal representation, an 8-to-4 bit encoding, expanded every byte of input into two characters. For BinHex 2.0, Lempereur used a new 8-to-6 encoding, which produces four characters for every three bytes of input and so made encoded files about one third smaller than their hexadecimal equivalents. He also took the opportunity to expand the checksum from 8 to 16 bits.
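A quick calculation of the encoded sizes for the two schemes, ignoring line breaks, headers and checksums. The function names are hypothetical and serve only to make the arithmetic explicit.

```python
def hex_encoded_size(n_bytes: int) -> int:
    """8-to-4 encoding: each byte becomes two 4-bit hex digits."""
    return 2 * n_bytes


def six_bit_encoded_size(n_bytes: int) -> int:
    """8-to-6 encoding: 8 * n bits packed 6 bits per character, rounded up."""
    return -(-8 * n_bytes // 6)  # ceiling division


if __name__ == "__main__":
    n = 3000
    h = hex_encoded_size(n)
    s = six_bit_encoded_size(n)
    print(h, s, s / h)
```

For a 3000-byte input this gives 6000 characters for the hexadecimal form and 4000 for the 8-to-6 form.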

This new encoding used the first 64 ASCII printing characters, including the space, to represent the data, similarly to uuencode. Even though the new encoding was no longer hexadecimal in nature, the established name of the program was retained. The smaller files were incompatible with the older ones, so the extension became .hcx, c for compact. The new version replaced the earlier ones "overnight".

BinHex 4 (.hqx)
Lempereur had concerns about some of the features of BinHex, notably its use of a checksum instead of a cyclic redundancy check (CRC) and the fact that the metadata information in the header was in plain text and thus could be corrupted in the same way as the data.

To address these problems, Lempereur released BinHex 4.0 in 1985, skipping 3.0 to avoid confusion with the now long-dead BASIC version. BinHex 4.0 first combined the data fork, resource fork and file metadata into a common 8-bit format, ran run-length encoding (RLE) on the result to provide some compression, then ran the 8-to-6 conversion on that output, and protected everything with multiple CRCs. The resulting .hqx files were roughly the same size as the .hcx files, but much more robust.
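The run-length step can be illustrated with a small decoder. The marker convention is not spelled out in this article; the sketch below assumes the one commonly documented for BinHex 4.0 (for example in RFC 1741), in which the byte 0x90 followed by a count repeats the preceding byte, and 0x90 followed by 0x00 stands for a literal 0x90. The function name is hypothetical.

```python
RLE_MARKER = 0x90  # assumed repeat marker, per the commonly documented format


def rle_decode(data: bytes) -> bytes:
    """Expand marker/count pairs back into runs of the preceding byte."""
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        if byte == RLE_MARKER and i + 1 < len(data):
            count = data[i + 1]
            i += 2
            if count == 0:
                out.append(RLE_MARKER)  # escaped literal 0x90
            elif out:
                # count is the total run length, and one copy is already in out
                out.extend(out[-1:] * (count - 1))
            else:
                raise ValueError("repeat marker with no preceding byte")
        else:
            out.append(byte)
            i += 1
    return bytes(out)


if __name__ == "__main__":
    print(rle_decode(bytes.fromhex("1122900633")).hex())
    print(rle_decode(bytes.fromhex("112290003344")).hex())
```

Because the RLE pass runs before the 8-to-6 conversion, a decoder has to undo the two steps in the opposite order: characters to bytes first, then run expansion.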

BinHex 5
At about the time BinHex 4 was released, most online services started supporting robust 8-bit file transfer protocols such as ZMODEM, and the need for ASCII armoring went away. This left a problem on the Mac, however, as there was still the need to encode the two forks into one.

A team effort among Macintosh communications programmers, including Lempereur, resulted in MacBinary. MacBinary files left the contents of the forks in their original 8-bit format and added a simple header for combining them on reception; they were thus much smaller than BinHex files. Lempereur released BinHex 5.0, almost identical to 4.0 except that it used MacBinary to combine the forks before running the 8-to-6 encoding. This saw little use, as he expected.

On the Internet, e-mail was still the primary method of moving files. At the time, relatively few people had full access to the Internet, and services like FTPmail were the only way many users could download files. Years later when he first got onto the Internet, Lempereur was surprised to find that BinHex 4.0 was still extremely popular.

The same ends could be achieved by first using MacBinary or AppleSingle to combine the forks, and then using uuencode or Base64 on the resulting file, but none of these solutions ever became popular and BinHex 4.0 survived well into the late 1990s. File archives of classic Mac OS software are still filled with BinHexed files.

BinHex 4 file format
Looking at the contents of a BinHex file, one will notice that it has a message usually on the first line identifying it as BinHex, followed by many 64-character lines made up of seemingly random letters, numbers, and punctuation marks. Here is a sample of what BinHex actually looks like:

(This file must be converted with BinHex 4.0)


:$f*TEQKPH#jdCA0d,R0TG!"6594%8dP8)3#3"!&m!*!%EMa6593K!!%!!!&mFNa
KG3,r!*!$&[rr$3d,BQPZD'9i,R4PFh3!RQ+!!"AV#J#3!i!!N!@QKUjrU!#3'[q
3"&4&@&483N)f!3#Xaj6bV-H8mJ!!!B3!N!0"!*!$[3#3!cR@iiY)!*!'[I%4!!J
Fp$X%X3@J!mZE6!GRiKUi$HGKMf0U61S46%i1"AB!TI,fLl!d1X3RDDE8ALfTCbM
8UP9p4iUqY-0k4krHpk9XK@`rbj2Ti'U@5rGH@+[fr-i4T6-qXpfl26,k!H5$Nml
TIkI'(l3GI4)f8mII&01CNEbC2LrNLBeaZ1HG@$G8!Z6"k)hh,q9p"r6FC*!!Se"
(ic,Pd(4(b`pflKC`H1&JN5)GVX3mREdH55[l`%`Yhp%q092c`A(hPV)!83Dr&f4
$$L#I1aM-"VjqV-q$34KQq6$M$f8#,Zc,i),!(`*ZN!$K$rS!LA%3cL+dYi"@,K(
Z"`#3!fKi!!!:

The file must contain the text line "(This file must be converted with BinHex 4.0)", which users and tools rely on to recognize the BinHex version. Any text before this line is to be ignored.

The rest of the file consists of three parts, a header (containing file name, size etc.), a data fork (containing the file data) and a resource fork. Each has a two-byte CRC checksum.
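The two-byte CRC can be sketched as follows. The parameters are an assumption here rather than something this article states: the CRC-CCITT polynomial 0x1021 with an initial value of zero, which is what Python's binascii.crc_hqx is documented to implement for the binhex4 format. The function name is hypothetical.

```python
def crc16_ccitt(data: bytes, crc: int = 0) -> int:
    """Bitwise 16-bit CRC, polynomial x^16 + x^12 + x^5 + 1 (0x1021)."""
    for byte in data:
        crc ^= byte << 8  # bring the next byte into the high bits
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


if __name__ == "__main__":
    value = crc16_ccitt(b"123456789")
    print(hex(value))
    print(value.to_bytes(2, "big").hex())  # the two bytes as they would be stored
```

Passing the previous result back in as `crc` allows a fork to be checksummed in chunks rather than in one call.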

Everything after the identification line is then treated as an area of binary data, which is encoded to ASCII characters. The encoding algorithm divides every three bytes of input into four 6-bit values, in a way similar to Base64. The values 0–63 are mapped to characters in the following order:

!"#$%&'()*+,-012345689@ABCDEFGHIJKLMNPQRSTUVXYZ[`abcdefhijklmpqr

When encoding, a line break should be inserted after every 64 characters. After encoding, a colon is placed before and after the data.
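The 8-to-6 step and the framing can be sketched as below. The function names are hypothetical, and two details are assumptions: a short final group is padded with zero bits and only the characters that carry data are emitted, and the colons count toward the 64-character line width (which matches the first line of the sample above). The RLE and CRC steps are not included.

```python
# The 64-character BinHex 4 alphabet, in value order 0..63.
ALPHABET = "!\"#$%&'()*+,-012345689@ABCDEFGHIJKLMNPQRSTUVXYZ[`abcdefhijklmpqr"


def encode_6bit(data: bytes) -> str:
    """Map each group of three bytes to four characters of ALPHABET."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        # Pad a short final group with zero bits (assumption).
        n = int.from_bytes(chunk.ljust(3, b"\0"), "big")
        chars = [ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        out.extend(chars[:len(chunk) + 1])  # 1 byte -> 2 chars, 2 -> 3, 3 -> 4
    return "".join(out)


def frame(encoded: str, width: int = 64) -> str:
    """Put a colon before and after the data and wrap the result in lines."""
    text = ":" + encoded + ":"
    lines = [text[i:i + width] for i in range(0, len(text), width)]
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    # A length byte of 15 followed by the start of a file name.
    print(frame(encode_6bit(b"\x0fbinhe")), end="")
```

The six input bytes used here encode to the same eight characters that open the sample data, where they represent the name-length byte and the first letters of the file name.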