Atari BASIC

Atari BASIC is an interpreter for the BASIC programming language that shipped with Atari 8-bit computers. Unlike most American BASICs of the home computer era, Atari BASIC is not a derivative of Microsoft BASIC and differs in significant ways. It includes keywords for Atari-specific features and lacks support for string arrays.

The language was distributed as an 8 KB ROM cartridge for use with the 1979 Atari 400 and 800 computers. Starting with the 600XL and 800XL in 1983, BASIC is built into the system. There are three primary versions of the software: the original cartridge-based "A", the built-in "B" for the 600XL/800XL, and the final "C" version in late-model XLs and the XE series.

Despite the Atari 8-bit computers running at a higher speed than most of their contemporaries, several technical decisions placed Atari BASIC near the bottom in performance benchmarks. The original authors addressed most of these issues in a series of improved versions: BASIC A+ (1981), BASIC XL (1983), and BASIC XE (1985).

The complete, annotated source code and design specifications of Atari BASIC were published as The Atari BASIC Source Book in 1983.

Development
The machines that would become the Atari 8-bit computers were originally developed as second-generation video game consoles intended to replace the Atari VCS. Ray Kassar, the new president of Atari, decided to challenge Apple Computer by building a home computer instead.

This meant the designs needed to include the BASIC programming language, the standard for home computers. In early 1978, Atari licensed the source code to the MOS 6502 version of Microsoft BASIC. It was offered in two versions: one using a 32-bit floating point format that was about 7800 bytes when compiled, and another using an extended 40-bit format that was close to 9 KB.

Even the 32-bit version barely fit into the 8 KB size of the machine's ROM cartridge format. Atari also felt that they needed to expand the language to support the hardware features of their computers, similar to what Apple had done with Applesoft BASIC. This increased the size of Atari's version to around 11 KB; Applesoft BASIC on the Apple II+ was 10,240 bytes long. After six months the code was pared down to almost fit in an 8 KB ROM, but Atari was facing a January 1979 deadline for the Consumer Electronics Show (CES), where the machines would be demonstrated. They decided to ask for help to get a version of BASIC ready in time for the show.

Shepardson Microsystems


In September 1978, Shepardson Microsystems won the bid on completing BASIC. At the time they were finishing Cromemco 16K Structured BASIC for the Z80-based Cromemco S-100 bus machines. Developers Kathleen O'Brien and Paul Laughton used Data General Business Basic, an integer-only implementation, as the inspiration for their BASIC, given Laughton's experience with Data General on a time-sharing system.

Cromemco BASIC included an extended floating point implementation using a 14-digit binary-coded decimal (BCD) format made possible using all 16 registers of the Zilog Z80 processor. As it converted all data to the internal format at edit time, small constants like "1" would use up a considerable amount of memory, and this could be a particular issue when storing arrays of numbers. To address this, the language also supported a 6-digit BCD format. It also included a separate 16-bit integer format for storing internal values like line numbers and similar system values.

Even the smallest BASICs on the 6502 generally used about 10K. For instance, Commodore BASIC used 9K but also relied on support from the KERNAL, while Applesoft BASIC used 10780 bytes. To meet the goal of fitting in an 8K ROM, the new BASIC would be in two parts: the language itself on the cartridge, and a separate FP library using 2K in the system's 10K ROM. To fit within 2K, the floating-point system supported only the 6-digit format.

Atari accepted the proposal, and when the specifications were finalized in October 1978, Laughton and O'Brien began work on the new language. The contract specified the delivery date on or before 6 April 1979 and this also included a File Manager System (later known as DOS 1.0). Atari's plans were to take an early 8K version of Microsoft BASIC to the 1979 CES, then switch to Atari BASIC for production. Development proceeded quickly, helped by a bonus clause in the contract, which led to the initial version being delivered in October. Atari took an 8K cartridge version to CES instead of Microsoft's. Atari Microsoft BASIC later became available as a separate product.

Releases
The version Shepardson gave to Atari for the CES demo was not intended to be final, and Shepardson continued to fix bugs. Unknown to Shepardson, Atari had already sent the CES version to manufacturing.

This version was later known as Revision A. It contains a major bug in a routine that copies memory: deleting lines of code that are exactly 256 bytes long causes a lockup after the next command is entered. The Reset key does not fix it.

Revision B attempted to fix the major bugs in Revision A and was released in 1983 as a built-in ROM in the 600XL and 800XL models. While fixing the memory copying bug, the programmer noticed the same pattern of code in the section for inserting lines, and applied the same fix. This instead introduced the original bug into this code. Inserting new lines is much more common than deleting old ones, so the change dramatically increased the number of crashes. Revision B also adds 16 bytes to a program every time it is SAVEd and LOADed, eventually causing the machine to run out of memory for even the smallest programs. Mapping the Atari described these as "awesome bugs" and advised Revision B owners "Don't fool around; get the new ROM, which is available on cartridge" from Atari. The book provides a type-in program to patch Revision B to Revision C for those without the cartridge.

Revision C eliminates the memory leaks in Revision B. It is built-in on later versions of the 800XL and all XE models including the XEGS. Revision C was also available as a cartridge.

The version can be determined by typing PRINT PEEK(43234) at the READY prompt. The result is 162 for Revision A, 96 for Revision B, and 234 for Revision C.
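The same check can be sketched as a short program. This assumes the commonly documented ROM address 43234 and the values 162, 96, and 234 for Revisions A, B, and C; the variable name V is arbitrary.

```basic
10 REM Identify the BASIC revision from one byte of the BASIC ROM
20 V=PEEK(43234)
30 IF V=162 THEN PRINT "REVISION A"
40 IF V=96 THEN PRINT "REVISION B"
50 IF V=234 THEN PRINT "REVISION C"
```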

Program editing


Like most home computer BASICs, Atari BASIC is anchored around its line editor. Program lines can be up to three physical screen lines of 40 characters, 120 characters total. The cursor can be moved freely, with the editor automatically tracking which BASIC program line the current screen line is part of. For instance, if the cursor is currently positioned in line 30 and the user uses cursor-up into line 20, any editing from that point will be carried out on line 20.

Atari BASIC's editor catches many errors that would not be noticed in MS-derived versions. If an error is found, the editor re-displays the line, highlighting the text near the error in inverse video. Errors are displayed as numeric codes, with the descriptions printed in the manual. Because of the way the line editor works, the user can immediately fix the error. For example, if PRINT has been mistyped as PRITN, the error can be fixed by moving the cursor over the TN, typing NT (the editor only has an overwrite mode), and pressing RETURN.

A line entered with a leading number, from 0 to 32767, is inserted in the current program or replaces an existing line. If there is no line number, the interpreter assigns it the number -1 (8000₁₆) and the commands are executed immediately, in "immediate mode". The RUN command executes the stored program from the lowest line number. Atari BASIC allows all commands to be executed in both modes. For example, LIST can be used inside a program, whereas in many interpreters this would be available in immediate mode only.

During entry, keywords can be abbreviated using the pattern set by Palo Alto Tiny BASIC, by typing a period at any point in the word. So L. is expanded to LIST, as is LI. Only enough letters have to be typed to make the abbreviation unique, so PLOT requires PL. because the single letter P is not unique. To expand an abbreviation, the tokenizer searches through its list of reserved words to find the first that matches the portion supplied. More commonly used commands occur first in the list of reserved words, with REM at the beginning (it can be typed as a lone period). When the program is later LISTed it will always write out the full words with three exceptions: PRINT has a synonym, ?; GOTO has a synonym, GO TO; and LET has a synonym which is the empty string (so 10 LET A = 10 and 10 A = 10 mean the same thing). These are separate tokens, and so will remain as such in the program listing. MS BASICs also allowed ? as a short form for PRINT, but this used the same token, so it expanded back to PRINT when LISTed, treating it as an abbreviation, not a synonym.
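As an illustrative sketch of these rules, the lines below are shown exactly as they might be typed. The strings and values are arbitrary.

```basic
10 PR. "HELLO"
20 ? "WORLD"
30 LET A=10
40 B=10
50 G. 10
```

Following the rules described above, a later LIST would be expected to show line 10 with the full word PRINT and line 50 with GOTO. Line 20 keeps its ?, and lines 30 and 40 keep their LET and implied-LET forms, because those synonyms are separate tokens.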

Tokenizer
When the user presses RETURN while editing, the current line is copied into the BASIC Input Line Buffer in memory between 580₁₆ and 5FF₁₆. Atari BASIC's tokenizer scans the text, converting each keyword to a single-byte token (for example, PRINT is 20₁₆), each number to a six-byte floating point value, each variable name to an index into a table, and so on, until the line is fully turned into an easy-to-interpret format. The result is stored in an output buffer located at the first 256 bytes of the lowest available free memory, pointed to by the LOMEM pointer stored at 80, 81₁₆. The output from the tokenizer is then relocated. The program is stored as a parse tree.

Shepardson referred to this complete-tokenizing concept as a "pre-compiling interpreter". The resulting tokenized code eliminates any parsing during runtime, making it run faster. It has the disadvantage that small constants, like 0 or 1, are six bytes each, longer than the original text.
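One common way to offset the cost of constants is to hold frequently used values in variables, since a variable reference occupies a single byte in the tokenized line. The following is a sketch of that idea only; the names C0 and C1 are arbitrary, and the saving only appears once a constant is reused often enough to outweigh the variable's own table entries.

```basic
10 REM Hold frequently used constants in variables
20 C0=0:C1=1
30 T=C0
40 FOR I=C1 TO 10
50 T=T+C1
60 NEXT I
70 PRINT T
```

The loop adds C1 to T ten times, so line 70 prints 10.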

A set of pointers (addresses) indicates various data: variable names are stored in the variable name table (pointed to at VNTP – 82, 83₁₆) and their values are stored in the variable value table (pointed to at VVTP – 86, 87₁₆). By indirecting the variable names in this way, a reference to a variable needs only one byte to address its entry in the appropriate table. String variables have their own area (pointed to at STARP – 8C, 8D₁₆), as does the runtime stack (pointed to at RUNSTK – 8E, 8F₁₆) used to store the line numbers of looping statements and subroutines. Finally, the end of BASIC memory usage is indicated by an address stored in the MEMTOP pointer (90, 91₁₆).
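These pointers can be inspected from BASIC itself. The sketch below assumes the zero-page addresses given above, converted to decimal, and the usual 6502 convention of storing the low byte first.

```basic
10 REM Each pointer is a two-byte address stored low byte first
20 PRINT "LOMEM  ";PEEK(128)+256*PEEK(129)
30 PRINT "VNTP   ";PEEK(130)+256*PEEK(131)
40 PRINT "VVTP   ";PEEK(134)+256*PEEK(135)
50 PRINT "STARP  ";PEEK(140)+256*PEEK(141)
60 PRINT "RUNSTK ";PEEK(142)+256*PEEK(143)
70 PRINT "MEMTOP ";PEEK(144)+256*PEEK(145)
```

The printed addresses depend on the machine's memory configuration and on the size of the program in memory.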

Math functions
Atari BASIC includes three trigonometric functions: sine, cosine, and arc tangent. DEG and RAD set whether these functions use degrees or radians, defaulting to radians. Eight additional functions include rounding, logarithms, and square root. The random function, RND, generates a number between 0 and 1; its parameter is not used.
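A brief sketch of these functions in use follows. The particular arguments are arbitrary, and the exact digits printed depend on the floating-point routines.

```basic
10 DEG
20 PRINT SIN(30)
30 RAD
40 PRINT ATN(1)*4
50 PRINT SQR(16)
60 REM RND ignores its argument; scale the result for a die roll
70 PRINT INT(RND(0)*6)+1
```

Line 40 approximates pi, since the arc tangent of 1 is a quarter of pi in radians.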

String handling
Atari BASIC copied the string-handling system of Hewlett-Packard BASIC, where the basic data type is a single character, and strings are arrays of characters. Internally, a string is represented by a pointer to the first character in the string and its length. To initialize a string, it must be DIMensioned with its maximum length. For example:
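A minimal sketch of such a program; the prompt text is illustrative:

```basic
10 DIM A$(20)
20 PRINT "ENTER MESSAGE: ";
30 INPUT A$
40 PRINT A$
```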

In this program, a 20 character string is reserved, and any characters in excess of the string length will be truncated. The maximum length of a string is 32,768 characters. There is no support for arrays of strings.

A string is accessed using array indexing functions, or slicing. A$(1,10) returns a string of the first 10 characters of A$. The arrays are 1-indexed, so a string of length 10 starts at 1 and ends at 10. Slicing functions simply set pointers to the start and end points within the existing allocated memory.
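A short sketch of slicing, using an arbitrary example string:

```basic
10 DIM A$(20)
20 A$="ATARI COMPUTER"
30 REM Characters 1 through 5
40 PRINT A$(1,5)
50 REM Character 7 to the end of the string
60 PRINT A$(7)
70 REM A single character
80 PRINT A$(7,7)
```

This prints ATARI, COMPUTER, and C on successive lines.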

Arrays are not initialized, so a numeric array or string contains whatever data was in memory when it was allocated. The following trick allows fast string initialization, and it is also useful for clearing large areas of memory of unwanted garbage. Numeric arrays can only be cleared with a FOR...NEXT loop:
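A sketch of both techniques follows; the sizes and the fill character are arbitrary. In line 30, the first assignment sets character 1, the second sets character 1000 and so extends the string to its full length, and the third copies the string onto itself one position later, which propagates the first character through the whole string.

```basic
10 REM Fill A$ with 1000 copies of X
20 DIM A$(1000)
30 A$="X":A$(1000)=A$:A$(2)=A$
40 REM Numeric arrays need an explicit loop
50 DIM N(50)
60 FOR I=0 TO 50:N(I)=0:NEXT I
```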

String concatenation works as in the following example. The target string must be large enough to hold the combined string or an error will result:
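A minimal sketch, with A$ dimensioned to exactly the combined length of the two example strings:

```basic
10 DIM A$(12),B$(6)
20 A$="HELLO ":B$="THERE!"
30 REM Assign B$ starting one position past the current end of A$
40 A$(LEN(A$)+1)=B$
50 PRINT A$
```

This prints HELLO THERE!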

Values in DATA statements are comma-delimited and untyped. Consequently, strings in DATA statements are not typically enclosed by quote marks. As a result, it is not possible for data items to contain a comma but they can incorporate double-quotes. Numeric values in DATA statements are read as strings or as numbers according to the type of the variable they are read into. The READ statement cannot be used with array variables.
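As a sketch of these rules, the following program reads two records, each an unquoted string followed by a number. The item names and prices are made up for illustration.

```basic
10 DIM N$(20)
20 FOR I=1 TO 2
30 READ N$,P
40 PRINT N$;": ";P
50 NEXT I
60 DATA SWORD,150
70 DATA "MAGIC" LAMP,75
```

The quote marks in line 70 are read as ordinary characters of the string, while the comma ends the item.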

Input/output
The Atari OS includes a subsystem for peripheral device input/output (I/O) known as CIO (Central Input/Output). Most programs can be written independently of what device they might use, as they all conform to a common interface; this was rare on home computers at the time. New device drivers could be written fairly easily that would automatically be available to Atari BASIC and any other program using the Atari OS, and existing drivers could be supplanted or augmented by new ones. A replacement E:, for example could displace the one in ROM to provide an 80-column display, or to piggyback on it to generate a checksum whenever a line is returned (such as used to verify a type-in program listing).

Atari BASIC supports CIO access with reserved words OPEN #, CLOSE #, PRINT #, INPUT #, GET #, PUT #, NOTE #, POINT # and XIO #. There are routines in the OS for simple graphics drawing functions but not all are available as specific BASIC keywords. PLOT and DRAWTO for line drawing are supported while a command providing area fill for primitive linear geometric shapes is not. The fill feature can be used through the general CIO entry point, which is called using the BASIC command XIO.
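The fill operation might be invoked along the following lines. This sketch follows the pattern commonly shown in Atari's BASIC documentation; the XIO command number 18, the use of channel 6 for the screen, and memory location 765 for the fill color are assumptions not stated in the text above, and the coordinates are arbitrary.

```basic
10 GRAPHICS 5+16
20 COLOR 3
30 REM Draw the right and top edges of the shape
40 PLOT 70,45:DRAWTO 50,10:DRAWTO 30,10
50 REM Set the end point of the left edge and the fill color
60 POSITION 10,45
70 POKE 765,3
80 XIO 18,#6,0,0,"S:"
90 GOTO 90
```

Line 90 loops forever so that the full-screen graphics display is not replaced by the text screen when the program ends.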

The BASIC statement OPEN # prepares a device for I/O access:
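A minimal sketch of such a statement, using a cassette file name chosen for illustration:

```basic
10 REM Open the cassette device on channel 1 for reading
20 OPEN #1,4,0,"C:MYPROG.DAT"
```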

Here, OPEN # means "ensure channel 1 is free" and calls the C: driver to prepare the device (this sets the cassette tape spools onto tension and advances the heads, keeping the cassette tape player "paused"). The 4 means "read" (other codes are 8 for write and 12 = 8 + 4 for "read-and-write"). The third number is auxiliary information, set to 0 when not needed. The C:MYPROG.DAT is the name of the device and the filename; the filename is ignored for the cassette driver. Physical devices can have numbers (mainly disks, printers and serial devices), so "P1:" might be the plotter and "P2:" the daisy-wheel printer, or "D1:" may be one disk drive and "D2:" another. If the number is not present, 1 is assumed.

The LPRINT statement sends a string to the printer.

A keypress is read by PEEKing memory locations maintained by the keyboard driver or by opening the keyboard as a file (e.g. OPEN #1,4,0,"K:":GET #1,A). The latter waits for a keypress.
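The file-based approach can be sketched as follows. The channel number and the variable name K are arbitrary; GET stores the character code in a numeric variable.

```basic
10 OPEN #1,4,0,"K:"
20 REM GET waits until a key is pressed
30 GET #1,K
40 PRINT "CODE ";K;" = ";CHR$(K)
50 CLOSE #1
```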

Typing DOS from BASIC exits to the Atari DOS command menu. Any unsaved programs are lost unless a memory-swapping file feature has been enabled on the current disk. There is no command to display a disk directory from within BASIC; this must be done by exiting to DOS.

Graphics and sound
Atari BASIC supports sound (via the SOUND statement), graphics (GRAPHICS, SETCOLOR, COLOR, PLOT, DRAWTO), and controllers (STICK, STRIG, PADDLE, PTRIG). The SOUND statement sets one of the hardware's four square-wave channels with parameters for volume, pitch and distortion.
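A short sketch of the SOUND statement follows. It assumes the usual parameter order of voice, pitch, distortion, volume; the pitch value 121 and the delay length are arbitrary choices for illustration.

```basic
10 REM SOUND voice,pitch,distortion,volume
20 SOUND 0,121,10,8
30 REM Crude delay loop so the tone is audible
40 FOR W=1 TO 300:NEXT W
50 REM Silence the channel
60 SOUND 0,0,0,0
```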

Advanced capabilities of the hardware such as higher pitch resolution, high-pass filters, digitised sound and waveforms, player/missile graphics (sprites), redefined character sets, scrolling, and custom graphics modes are not supported by BASIC; these will require machine language routines or PEEK/POKE statements. A few of the 17 basic character/graphics modes supported by the hardware cannot be simply accessed from BASIC on the Atari 400/800 as the OS ROMs do not support them. These include some multicolour character modes (ANTIC modes 4 & 5), descender character mode (ANTIC mode 3) and the highest resolution 2 and 4-color modes (ANTIC modes C & E, 160x192 pixels). The only way to access them is via PEEK/POKE or machine language, setting the ANTIC registers and Display List manually. The OS ROMs on the XL/XE added support for these modes except for ANTIC mode 3, which requires a character set redefined in RAM to operate correctly.

Bitmap modes in BASIC are normally set to have a text window occupying the last four rows at the bottom of the screen so the user may display prompts and enter data in a program. If 16 is added to the mode number invoked via the GRAPHICS statement, the entire screen will be in bitmap mode (e.g. GRAPHICS 8+16). If full-screen bitmap mode is invoked, Atari BASIC gracefully switches back into text mode when program execution terminates, so the user is not left with an unresponsive screen that must be escaped by typing a blind command or resetting the computer.

Bitmap coordinates are in the range of 0 to maximum row/column minus one; thus in full-screen mode 6 (160x96), the maximum coordinates for a pixel are 159 and 95. If Atari BASIC attempts to plot beyond the allowed coordinates for the mode, a runtime error occurs.
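As a sketch of the coordinate limits, the following draws both diagonals of a full-screen GRAPHICS 8 display, assuming that mode's 320x192 resolution, so the largest valid coordinates are 319 and 191.

```basic
10 GRAPHICS 8+16
20 COLOR 1
30 PLOT 0,0:DRAWTO 319,191
40 PLOT 0,191:DRAWTO 319,0
50 REM Loop so the display is not reset to text mode
60 GOTO 60
```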

Line labels
Atari BASIC allows numeric variables and expressions to be used to supply line numbers to GOTO and GOSUB commands. For instance, a subroutine that clears the screen can be written as GOSUB CLEARSCREEN, which is easier to understand than GOSUB 10000.
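A minimal sketch of this pattern follows. The line number 10000 and the use of CHR$(125) to clear the screen are illustrative choices.

```basic
10 REM A numeric variable serves as a label for the subroutine
20 CLEARSCREEN=10000
30 GOSUB CLEARSCREEN
40 PRINT "SCREEN CLEARED"
50 END
10000 REM Clear the screen
10010 PRINT CHR$(125)
10020 RETURN
```

The variable must be assigned before its first use, and it must be updated by hand if the program is renumbered.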

Strings as a way to manipulate memory
The base address of a string is stored in a variable table. String addresses can be redirected to point to arbitrary areas of RAM. This allows the rapid memory-shifting routines underlying string and substring assignment to be applied from BASIC to the memory used for the screen or player/missile graphics. This is particularly useful for achieving rapid vertical movement of player/missile images directly from Atari BASIC.

Random access via DATA/RESTORE
Numeric variables and expressions can be used as the parameter for the RESTORE statement, allowing DATA statements to be randomly accessed through code such as RESTORE ROOMBASE+ROOMNUMBER:READ DESCRIPTION$, TREASURE$, EXITS. This can also be used to emulate static string arrays: RESTORE STRBASE+INDEX:READ A$:PRINT A$.
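The string-array emulation can be sketched as a complete program. The base line number 1000 and the data values are arbitrary; each DATA line must sit at exactly STRBASE plus its index.

```basic
10 DIM A$(20)
20 STRBASE=1000
30 FOR INDEX=0 TO 2
40 RESTORE STRBASE+INDEX:READ A$:PRINT A$
50 NEXT INDEX
60 END
1000 DATA NORTH
1001 DATA SOUTH
1002 DATA EAST
```

This prints NORTH, SOUTH, and EAST in turn.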

Error handling with TRAP
The TRAP statement jumps to a line number when an error occurs, and this reduces the need for manual error-checking. For example, when drawing graphics on the screen it is not necessary to check whether lines go beyond screen boundaries of the current graphics mode. This error state can be trapped, and the error handled if necessary.
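A sketch of trapping an out-of-bounds drawing error follows. The memory locations used to report the error number (195) and the line where it occurred (186 and 187) are assumptions taken from common Atari memory maps, not from the text above.

```basic
10 TRAP 100
20 GRAPHICS 8
30 COLOR 1
40 REM The end point lies outside the 320-pixel-wide screen
50 PLOT 0,0:DRAWTO 400,100
60 PRINT "LINE DRAWN"
70 END
100 PRINT "ERROR ";PEEK(195);" AT LINE ";PEEK(186)+256*PEEK(187)
```

A trap is cleared once it has fired, so a handler that expects further errors needs to issue TRAP again.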

Includes
The ENTER command reads source code from a device and merges it into the current program, as if the user had typed it in. This allows programs to be saved out in sections via LIST, reading them in using ENTER to merge or replace existing code. By using blocks of line numbers that do not overlap, programmers can build libraries of subroutines and merge them into new programs as needed.
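As a sketch, the two immediate-mode commands below save a block of subroutine lines to disk and later merge it into whatever program is in memory. The file name and line range are hypothetical.

```basic
LIST "D:SUBS.LST",1000,1999
ENTER "D:SUBS.LST"
```

The first command is typed while the program containing the subroutines is loaded; the second is typed after loading the program that should receive them.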

Self-modifying code
The editor can be set up to repeatedly read input from the screen until an EOF is reached. This allows a program to write new program code followed by a CONT statement to the screen and then, after positioning the screen cursor at the start of the new code, STOP the running program. The new code is then read in, and execution is resumed by the CONT statement.
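The technique might look like the following sketch. It relies on an assumption not stated above: that POKE 842,13 puts the screen editor into the mode where it reads lines from the screen, and that POKE 842,12 restores normal keyboard input. The screen positions are chosen so that the new text lies where the editor starts reading after the STOP message and READY prompt are printed.

```basic
10 GRAPHICS 0
20 POSITION 2,4
30 REM Write a new program line and CONT onto the screen
40 PRINT "1000 PRINT ";CHR$(34);"ADDED LINE";CHR$(34)
50 PRINT "CONT"
60 POSITION 2,0
70 REM Switch the editor to read from the screen, then stop
80 POKE 842,13:STOP
90 REM Execution resumes here; restore normal input
100 POKE 842,12
110 GOTO 1000
```

CONT resumes at the line following the one that stopped, which is why the restoring POKE is placed on its own line.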

Embedded machine language
Atari BASIC can call machine code subroutines stored in strings or POKEd into memory. The 256-byte area starting at address 1536₁₀ (600₁₆) is often used for this purpose.

Machine code is invoked with the USR function. The first parameter is the address of the subroutine and the following values are parameters. If the code is stored in a string named ROUTINE$, it can be called with two parameters as ANSWER=USR(ADR(ROUTINE$),VAR1,VAR2).

Parameters are pushed onto the hardware stack as 16-bit integers in the order specified in the USR call, in low byte, high byte order. A final byte is pushed indicating the number of arguments. The machine language code must remove these values before returning via the RTS instruction. A 16-bit value can be returned to BASIC by placing it in addresses 212₁₀ and 213₁₀ (D4₁₆ and D5₁₆).
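The calling convention can be sketched with a tiny routine that returns its single argument unchanged. The eight bytes below are hand-assembled 6502 code (PLA, PLA, STA $D5, PLA, STA $D4, RTS) placed at address 1536; the routine and the test value are illustrative.

```basic
10 REM PLA removes the argument count, then the high and low bytes
20 FOR I=0 TO 7:READ B:POKE 1536+I,B:NEXT I
30 DATA 104,104,133,213,104,133,212,96
40 PRINT USR(1536,1234)
```

Because the low byte is pushed first, it is pulled last. The routine stores the high byte in 213 and the low byte in 212, so line 40 prints 1234.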

Performance
In theory, Atari BASIC should run faster than contemporary BASICs based on the MS pattern. Because the source code is fully tokenized when it is entered, the entire tokenization and parsing steps are already complete. Even complex mathematical operations are ready to run, with any numerical constants already converted to the internal six-byte format, and variable values are looked up by address rather than having to be searched for. In practice, Atari BASIC is slower than most other home computer BASICs, often by a large amount.

On two widely used benchmarks of the era, Byte magazine's Sieve of Eratosthenes and the Creative Computing benchmark test written by David H. Ahl, the Atari finished near the end of the list in terms of performance, and was much slower than the contemporary Apple II and PET, in spite of having the same CPU running at roughly twice the speed. It finished behind slower machines like the ZX81 and even some programmable calculators.

Most of the language's slowness stems from three problems.

The first is that the floating-point math routines are poorly optimized. In the Ahl benchmark, a single exponent operation was responsible for much of the machine's poor showing. The conversion between floating-point and 16-bit integers is also particularly slow. Internally, these integers are used for line numbers and array indexing, along with a few other tasks, but numbers in the tokenized program are stored in binary-coded decimal (BCD) format. Whenever one of these is encountered, such as the line number in GOTO 100, the BCD value is converted to an integer, which can take up to 3500 microseconds.

Another issue is how Atari BASIC implements branches. To perform a branch in a GOTO or GOSUB, the interpreter searches through the entire program for the matching line number. In contrast, contemporary versions of MS-derived BASICs would search forward from the current line if the line number of the branch target was greater, thereby improving branch performance about two times on average.

A related and more serious problem is the implementation of FOR...NEXT loops. When a FOR statement is executed, Atari BASIC records its line number. Every time the NEXT is reached, it searches through the program for that line, despite it being in the same place as the last time. All other BASICs instead record the memory location of the FOR statement and can immediately return to it without having to search.
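Loop overhead can be measured from BASIC with a sketch like the one below. It assumes the operating system's frame counter at locations 19 and 20, which advances once per video frame (about 60 times per second on NTSC machines, 50 on PAL); these locations are not mentioned in the text above.

```basic
10 REM Reset the frame counter, run an empty loop, read the counter
20 POKE 19,0:POKE 20,0
30 FOR I=1 TO 1000:NEXT I
40 J=PEEK(19)*256+PEEK(20)
50 PRINT J;" FRAMES"
```

Because the line search described above starts from the beginning of the program, moving such a loop further down a long program would be expected to increase the count.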

The reason for this poor performance is best illustrated by a quote from one of its primary authors, Bill Wilkinson; in 1982 he stated:

"Personally, I have never been sure it is necessary for an interpreted language (e.g., BASIC) to be fast. Authors... have claimed that Atari BASIC is the slowest language ever created. My first impulse was to say, 'Who cares?'"

One may contrast this philosophy with that of Steve Wozniak's Apple BASIC for the original Apple I which was designed specifically to have the performance required to write games:

"After designing hardware arcade games, I knew that being able to program them in BASIC was going to change the world."

Several third-party BASICs emerged on the platform that addressed some or all of these issues. These included Wilkinson's own BASIC XL, which reduced the time for the Byte benchmark from 194 to 58 seconds, over three times as fast. On the Ahl benchmark, Atari BASIC required 405 seconds, while exactly the same code in Turbo-BASIC XL took 41.6 seconds, an order of magnitude improvement.

Differences from Microsoft BASIC

 * Syntax is checked and errors highlighted immediately on line entry.
 * Variable names can be of arbitrary length, and all characters are significant.
 * The following keywords are not in Atari BASIC: INKEY$, CLS, DEF FN, SPC, TAB, ELSE.
 * All arrays must be dimensioned prior to use while Microsoft BASIC defaults an array to 10 elements if not dimensioned.
 * String variables are treated as character arrays and must be dimensioned before use. MS BASIC stores strings on the heap and sometimes pauses for garbage collection.
 * The functions LEFT$, MID$, and RIGHT$ are replaced by string indexing.
 * There is no operator for string concatenation.
 * There are no arrays of strings.
 * There is no support for integer variables.
 * There are no bitwise operators.
 * INPUT does not allow a prompt.
 * PRINT may be abbreviated as ? as in Microsoft BASIC, but Atari BASIC does not tokenize it into PRINT. It remains a question mark.
 * The target of GOTO and GOSUB can be a variable or expression.
 * RESTORE may take a numeric constant, variable, or expression as a parameter, causing the next READ to begin from the specified line number.
 * FOR...NEXT loops in Atari BASIC must have a variable name referenced by the NEXT statement, while Microsoft BASIC does not require it.
 * Multiple variables are not permitted with NEXT statements as they are in Microsoft BASIC (e.g., NEXT X,Y).
 * LIST uses a comma to separate a range instead of a minus sign.
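Several of these differences can be seen side by side in a short sketch. The Microsoft BASIC equivalents appear only in the REM comments, and the strings are arbitrary.

```basic
10 DIM A$(20)
20 A$="HELLO WORLD"
30 PRINT A$(1,5):REM LEFT$(A$,5)
40 PRINT A$(7,9):REM MID$(A$,7,3)
50 PRINT A$(LEN(A$)-4):REM RIGHT$(A$,5)
60 PRINT "YOUR NAME";:INPUT A$:REM INPUT takes no prompt string
70 FOR X=1 TO 2:FOR Y=1 TO 2:NEXT Y:NEXT X:REM Not NEXT Y,X
```

Lines 30 to 50 print HELLO, WOR, and WORLD.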