Unified Emulator Format

Unified Emulator Format (UEF) is a container format for the compressed storage of audio tapes, ROMs, floppy discs and machine state snapshots for the 8-bit range of computers manufactured by Acorn Computers. First implemented by Thomas Harte's ElectrEm emulator and related tools, it is now supported by major emulators of Acorn machines and carried by two online archives of Acorn software numbering thousands of titles.

UEF attempts to concisely reproduce media-borne signals rather than simply the data represented by them, the intention being an accurate archive of original media rather than merely a capability to reproduce files stored on them. A selection of metadata can be included, such as compatibility ratings, position markers, images of packaging and the text of instruction manuals.

The Acorn machines implement the Kansas City standard (KCS) for tape data encoding and as a result the file format is suitable for creating backups of original media for several non-Acorn machines. As of version 0.10 the file format carries BASICODE signals as well.

TZX is a chunked format with similar scope for the ZX Spectrum series.

History
Before the development of the UEF, archives of Acorn computer software on the World Wide Web had adopted a convention of hosting ZIP archives of the raw files on a tape, each raw file accompanied by a sidecar file, with extension .INF, carrying the load and execution addresses from the file header. The INF convention, described and implemented by Wouter Scholten in bbcim (1995), extends the output format of the *INFO command (built into Acorn DFS and ADFS, which lists file lengths and other metadata attached to files on disc) to cover CRCs and the order of files on tape. While it works adequately for storing user files, it does not preserve the baud rate of the recording, precise timing information or the non-standard data streams used in copy-protected titles.

In the case of disc-based software, it became increasingly convenient to send a sector dump of the disc instead, and by the time of the UEF's introduction the file extensions .ssd and .dsd were already established for single-sided and double-sided raw images of DFS discs, respectively. Distributed bare or in a ZIP archive, they remain popular on archive sites.

Aims
In a 2010 post to the Stardot forum, Harte explained at length his reasons for creating the format: being the first to address emulation of the Acorn Electron and its primary medium, tape, Harte wanted a fine-grained and technically optimal representation of media, compared to existing ad hoc formats; and to package the multiple media elements of a software release into a single file, so that downloading a UEF is "more like obtaining the original product". He went on to observe that it was the tools in use, and "user need", that determined the actual uses to which the UEF had been put.

Structure
A UEF file consists of a fixed-length header that identifies the format, followed by a sequence of chunks containing the data of interest. The header comprises the magic string "UEF File!", a terminating null character, and the two-byte version number of the UEF specification in use. A reading application needs to pay attention to the version number, as the unit of measurement in some chunks differs according to the specification version, and one chunk has been redefined between versions.

Each chunk consists of a two-byte ID which determines its meaning, the length of the body in four bytes, and the body itself. An application can readily skip the bodies of chunks it does not need to process. After the last chunk the file simply ends. Currently, UEF chunks do not nest.
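The header and chunk layout described above can be walked with very little code. The following Python fragment is a minimal sketch rather than a reference implementation. It assumes that the magic string is "UEF File!", that the version is stored as a minor byte followed by a major byte, and that chunk IDs and lengths are little-endian; the names parse_uef and build_chunk are illustrative only.

```python
import struct

MAGIC = b"UEF File!\x00"  # assumed magic string plus its terminating null


def parse_uef(data: bytes):
    """Return ((major, minor), [(chunk_id, body), ...]) for uncompressed UEF data."""
    header_len = len(MAGIC) + 2
    if len(data) < header_len or not data.startswith(MAGIC):
        raise ValueError("not an uncompressed UEF file")
    # Assumed order: minor version byte first, then major version byte.
    minor, major = data[len(MAGIC)], data[len(MAGIC) + 1]
    chunks = []
    pos = header_len
    while pos < len(data):  # after the last chunk the file simply ends
        if pos + 6 > len(data):
            raise ValueError("truncated chunk header")
        # Two-byte ID and four-byte body length, assumed little-endian.
        chunk_id, length = struct.unpack_from("<HI", data, pos)
        pos += 6
        body = data[pos:pos + length]
        if len(body) != length:
            raise ValueError("truncated chunk body")
        chunks.append((chunk_id, body))
        pos += length
    return (major, minor), chunks


def build_chunk(chunk_id: int, body: bytes) -> bytes:
    """Helper for the demonstration: serialise one chunk."""
    return struct.pack("<HI", chunk_id, len(body)) + body


# Synthetic example: a version 0.10 header followed by two chunks.
sample = MAGIC + bytes([10, 0]) + build_chunk(0x0100, b"\x2aHELLO") + build_chunk(0x0102, b"\x00")
version, chunks = parse_uef(sample)
print(version, [(hex(cid), len(body)) for cid, body in chunks])
```

Because each chunk announces its own length, a reader that does not recognise an ID can step over the body without interpreting it, which is what the loop does for every chunk.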

The whole UEF file, including the header, may optionally be compressed in gzip format. By examining the start of the file for a gzip or UEF header, a decompression library can be invoked as appropriate.
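The header check described above might look like the following Python sketch. The gzip signature bytes 0x1F 0x8B are those of the gzip format; the UEF magic string "UEF File!" and the function name load_uef_bytes are assumptions made for illustration.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"       # first two bytes of any gzip stream
UEF_MAGIC = b"UEF File!\x00"   # assumed UEF magic string plus null


def load_uef_bytes(raw: bytes) -> bytes:
    """Return the uncompressed UEF image, decompressing only when needed."""
    if raw.startswith(GZIP_MAGIC):
        raw = gzip.decompress(raw)
    if not raw.startswith(UEF_MAGIC):
        raise ValueError("neither a gzip-compressed nor a plain UEF file")
    return raw


# Both forms of the same synthetic file yield identical bytes.
plain = UEF_MAGIC + bytes([10, 0])
packed = gzip.compress(plain)
print(load_uef_bytes(plain) == load_uef_bytes(packed))
```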

Content
The Unified Emulator Format models software on cassette as a contiguous sequence of segments, which may be carrier tones, the modulated asynchronous signals of ordinary data blocks, security cycles (modulated synchronous signals, said to be an "identification feature") or gaps where no recognised signal is present. Tape UEF chunks are concatenated in the order they appear, to build up the representation of a whole recording. When generated from a real source tape, each waveform on the tape corresponds directly to a tape chunk, such that the source can be accurately reconstructed (with any non-encodable signals replaced by gaps of equal length).

Standard Acorn streams (chunk ID: 0x0100) are encoded so that their bytes reappear in the UEF chunk body. From version 0.10, direct support is extended to all asynchronous formats (0x0104), including the format used by BASICODE. Otherwise there is a generic chunk (0x0102) to accommodate any arbitrary sequence of bits. Security wave chunks (0x0114) also carry bit streams, encoded in a different form to allow the half-length one bits observed in commercial recordings to be represented.
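To illustrate why storing only the bytes is sufficient for a standard stream, the Python sketch below expands bytes into the bit sequence a player would have to modulate. It assumes the usual asynchronous framing of one start bit (0), eight data bits sent least significant bit first, and one stop bit (1); the data_bits and stop_bits parameters are hypothetical knobs showing how other asynchronous formats could differ, and are not taken from the UEF specification.

```python
def frame_byte(value: int, data_bits: int = 8, stop_bits: int = 1) -> list:
    """Expand one byte into start bit, data bits (LSB first) and stop bits."""
    bits = [0]                                            # start bit
    bits += [(value >> i) & 1 for i in range(data_bits)]  # LSB first
    bits += [1] * stop_bits                               # stop bit(s)
    return bits


def frame_stream(body: bytes, data_bits: int = 8, stop_bits: int = 1) -> list:
    """Concatenate the framed bits of every byte in a chunk body."""
    out = []
    for value in body:
        out.extend(frame_byte(value, data_bits, stop_bits))
    return out


print(frame_byte(0x2A))
print(len(frame_stream(b"AB")))
```

A generic bit-stream chunk, by contrast, has to carry every bit explicitly, because no framing can be assumed.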

There are some modal variables affecting the interpretation of these chunks: the baud rate, 1200 baud for Acorn signals or 300 baud for KCS; the exact carrier frequency, which determines the playing time of the reconstructed tape; and the phase of the signal. The latter two may change within a published recording, and their absolute values depend on the tape player, amplifier and sound card used to digitise the signal.
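The effect of these variables on playing time can be estimated with simple arithmetic. The Python sketch below assumes ten transmitted bits per byte (one start bit, eight data bits, one stop bit) and a nominal high tone of 2400 Hz, and scales the bit rate in proportion to the measured carrier frequency; the function name and parameters are illustrative, not part of the specification.

```python
def data_seconds(n_bytes: int, baud: float = 1200.0, bits_per_byte: int = 10,
                 carrier_hz: float = 2400.0, nominal_carrier_hz: float = 2400.0) -> float:
    """Estimate the playing time of n_bytes of framed data.

    A tape digitised from a fast or slow player has a carrier that is off
    nominal, and the effective bit rate is scaled by the same ratio.
    """
    effective_baud = baud * carrier_hz / nominal_carrier_hz
    return n_bytes * bits_per_byte / effective_baud


block = 256  # bytes in one hypothetical data block
print(round(data_seconds(block, baud=1200), 3))                   # nominal 1200 baud
print(round(data_seconds(block, baud=300), 3))                    # nominal 300 baud
print(round(data_seconds(block, baud=1200, carrier_hz=2424), 3))  # player running 1% fast
```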

A UEF file can contain markers to separate the tapes of a multiple-tape distribution, and the sides of each tape; positions of interest within each side can also be marked.

Discs are stored as raw sector dumps of each surface, along with their geometry and a byte identifying the file system. Previous versions of the specification had provisions to encode discs at the byte stream level, or the magnetic domain level. With SSD and DSD sector dumps serving standard BBC discs well, and the mature FDI format catering for copy-protected software, the disc image function of UEF is little used.

Sideways ROMs are likewise stored as raw data, plus an indication of their purpose and a ROM slot recommendation. Again the user base prefers bare ROM dumps for archival.

State snapshot UEF files include standardised chunks to store the major portions of an Acorn Electron or BBC Micro's state: main, shadow and expansion bus memory, the CPU and the WD1770 floppy drive controller; also the Electron ULA and the Slogger Master RAM Board, a common Electron add-on. A patch memory chunk rewrites a block of memory at any address, allowing the UEF format to package pokes. To store state elements not accommodated in the standard chunks, emulators can define their own chunks. A private use area of chunk IDs is reserved for this or any other purpose, although some emulators save state under invalid chunk IDs in the public space.

Multiplexed data is an extension for emulators, used by ElectrEm but without a published specification:

"Bit multiplexing supplies the emulator with additional information so that old programs may be run to produce a greater quality of output. This feature is really only for emulation use of UEF files and ignoring bit multiplexing will have no effect on the accuracy of your tool to original hardware."

One salient application mentioned by Harte is to superimpose "new graphics on old games", and a single example, a 256-colour enhanced Daredevil Dennis, is available from StairwayToHell.com to run in ElectrEm.

Multiplexed data chunks are intended to follow ordinary data chunks in any of the above classes, supplementing the data. Their contents are not meant to be visible to the Acorn computer, whether real or emulated, but otherwise their meaning has not been specified.

Chunks providing content information include the file origin chunk, which identifies the application that generated the UEF file. Inlay scan chunks, intended as a file preview, hold a raw bitmap of the cover art although anything beyond a thumbnail can take up more data than a typical game. The UEF author can also provide the text of an instruction booklet or a URL for more information, a short title for display, minimum machine specification and keyboard mapping for the enclosed software; and where a game does not use the whole screen, the coordinates of the visible area can be given. A minority of UEF files available online contain anything in this class but an origin chunk.

A UEF file can contain multiple classes of data at once, as Harte intended; it is not possible to know which classes it contains without scanning the whole file. In its file selection box ElectrEm displays an icon according to the first data class chunk it finds.
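A full scan of this kind is cheap, because only the six-byte chunk headers need to be read. The Python sketch below collects the data classes present in a file in order of first appearance. It assumes that the high byte of the chunk ID selects the class, with the mapping shown in CLASS_NAMES, and that the header is 12 bytes long with little-endian chunk fields; these details and the sample chunk IDs are assumptions for illustration and should be checked against the UEF specification.

```python
import struct

HEADER_LEN = 12  # assumed: 10-byte magic string with null, plus 2 version bytes

# Assumed mapping from the high byte of the chunk ID to a data class.
CLASS_NAMES = {
    0x00: "content information",
    0x01: "tape",
    0x02: "disc",
    0x03: "ROM",
    0x04: "state snapshot",
}


def data_classes(data: bytes) -> list:
    """List the data classes in a UEF image, in order of first appearance."""
    found = []
    pos = HEADER_LEN
    while pos + 6 <= len(data):
        chunk_id, length = struct.unpack_from("<HI", data, pos)
        name = CLASS_NAMES.get(chunk_id >> 8, "other")
        if name not in found:
            found.append(name)
        pos += 6 + length  # skip the body without reading it
    return found


def chunk(chunk_id: int, body: bytes) -> bytes:
    """Helper for the demonstration: serialise one chunk."""
    return struct.pack("<HI", chunk_id, len(body)) + body


# Synthetic file with hypothetical chunk IDs from three classes.
sample = (b"UEF File!\x00" + bytes([10, 0])
          + chunk(0x0000, b"demo\x00") + chunk(0x0100, b"\x2a")
          + chunk(0x0100, b"\x2a") + chunk(0x0401, b"\x00" * 4))
classes = data_classes(sample)
print(classes)
```

An application that only needs the first class, as in the icon selection described above, could stop at the first chunk whose class is not content information.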

MakeUEF
MakeUEF is a Windows application written by Thomas Harte and expanded by Fraser Ross to convert audio samples into UEF files. Two grades are offered. An 'amateur' version reads WAV files or a live signal played to the sound card, and transcribes only standard data blocks with accuracy. The 'professional' grade accepts only CSW files, which represent waves preprocessed into rectangular pulse trains, but it encodes all audio information supported by the UEF specification.

MakeUEF claims to have been the sole creator of all UEF files available on the Web before November 2004, the month of its version 1.0 release. Although the file format was more capable, supporting "gap lengths" since February 2001 at the latest, only "program data" was retained by MakeUEF prior to version 1.0. From November 2004 the fidelity of MakeUEF improved and the file specification was further refined, and an extension of .hq.uef ("high quality") was adopted to reflect this. The AcornPreservation.org archive carries only the HQ.UEF variety, together with the CSW source files. Its sister site StairwayToHell.com accepts 'amateur' UEF translations and files produced by pre-1.0 MakeUEF. The latter site hosts 1,494 transcriptions of BBC Micro cassette titles and at least 800 of Electron titles.

Others

 * Several emulators of Acorn machines support UEF natively, to read and write tape data (at original speed or faster) and store state snapshots. Examples include ElectrEm, BeebEm and B-Em.
 * FreeUEF by Thomas Harte and the UEFReader Java Sound plugin convert a UEF file to a wave suitable for recording on tape or playing back to a physical computer.
 * UberCassette is a cross-platform, multi-format encoder emitting UEF from samples of Acorn cassettes.
 * The UEFwalk Perl script validates and extracts data from UEF files.
 * The XVUEF patch extends the Xv image editor to support the little-used inlay scan chunks of the UEF.

Use on real BBC Micros
The GoMMC and GoSDC hardware extensions, produced by John Kortink from 2004, provide a virtual cassette playing capability. The accompanying PC tools import the cassette data from UEF files and store the extracted cassette stream on a memory card.

In February 2012, Martin Barr released version 5.0 of UPURS, a ROM-based suite of utilities to aid data transfer to real BBC Microcomputers. The tool UPCFS was first released as part of that version. It has a claimed 86% compatibility rate with existing decompressed UEF files, allowing them to be transferred to a real BBC Micro using a custom User Port cable that presents an RS-232-capable connection to a PC.