User:Dr.S.Pandikumar/sandbox

= Introduction to Assembler =

== Basics of Assembler ==
An assembler is system software that converts an assembly language program into its equivalent object code. Its input is source code written in assembly language (using mnemonics), and its output is the object code. Because mnemonics map directly onto machine instructions, the design of an assembler depends on the machine architecture.

== Basic Assembler Functions ==
The basic assembler functions are:


 * Translating mnemonic language code to its equivalent object code.
 * Assigning machine addresses to symbolic labels.

To perform these functions, an assembler carries out the following steps:


 * Scanning (tokenizing)
 * Parsing (validating the instructions)
 * Creating the symbol table
 * Resolving the forward references
 * Converting into the machine language
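The scanning step can be illustrated with a minimal sketch. It assumes a simple line layout that the text does not spell out: a label, when present, starts in column 1, fields are separated by whitespace, and a line beginning with `.` is a comment. The function name `scan_line` is hypothetical.

```python
def scan_line(line):
    """Split one source line into (label, opcode, operand).

    Assumptions (not from the text): a label starts in column 1,
    fields are whitespace-separated, and '.' starts a comment line.
    Returns None for blank lines and comment lines.
    """
    text = line.rstrip()
    if not text.strip() or text.lstrip().startswith("."):
        return None
    fields = text.split()
    # A line that begins with whitespace carries no label.
    label = None if text[0].isspace() else fields.pop(0)
    opcode = fields[0] if fields else None
    operand = fields[1] if len(fields) > 1 else None
    return (label, opcode, operand)


print(scan_line("FIRST   STL     RETADR"))
print(scan_line("        RSUB"))
```

Splitting on whitespace keeps the sketch short, but it would break a character constant that contains a space, so a real scanner needs more careful handling of operands.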

In other words, the assembler must:


 * Convert mnemonic operation codes to their machine language equivalents
 * Convert symbolic operands to their equivalent machine addresses
 * Decide the proper instruction format
 * Convert the data constants to internal machine representations
 * Write the object program and the assembly listing
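As an illustration of converting data constants to their internal representation, the sketch below handles the operands of the BYTE and WORD directives. It assumes the standard SIC conventions: a 3-byte (24-bit) word, ASCII characters, and constants written as `C'...'` or `X'...'`. The function name `constant_to_hex` is hypothetical.

```python
def constant_to_hex(directive, operand):
    """Return the object code (as a hex string) for a BYTE or WORD constant."""
    if directive == "WORD":
        # One 24-bit word; masking yields two's complement for negatives.
        return format(int(operand) & 0xFFFFFF, "06X")
    if directive == "BYTE":
        kind, body = operand[0], operand[2:-1]  # strip the C'...' or X'...' wrapper
        if kind == "C":
            # Each character becomes its ASCII code.
            return body.encode("ascii").hex().upper()
        if kind == "X":
            # Hex digits are already the internal representation.
            return body.upper()
    raise ValueError("unsupported constant: %s %s" % (directive, operand))


print(constant_to_hex("BYTE", "C'EOF'"))  # 454F46
print(constant_to_hex("WORD", "3"))       # 000003
```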

So, to design the assembler, we need to concentrate on the machine architecture of the SIC/XE machine and identify the algorithms and data structures to be used. In addition to the steps above, the assembler also has to handle assembler directives. These are not translated into machine instructions; instead, they direct the assembler to perform certain operations. These directives are:

SIC Assembler Directives:


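 * START: Specify the name and starting address of the program.
 * END: Mark the end of the program and specify the first executable instruction.
 * BYTE: Generate a character or hexadecimal constant.
 * WORD: Generate a one-word integer constant.
 * RESB: Reserve the indicated number of bytes for a data area.
 * RESW: Reserve the indicated number of words for a data area.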

Input data conventions (these are not directives):

 * End of record: a null character (hex 00)
 * End of file: a zero-length record

== Types of Assembler ==
An assembler can be designed as either of the following:


 * Single pass assembler
 * Multi-pass assembler

Single-pass Assembler:

In this case, the whole process of scanning, parsing, and object code conversion is done in a single pass. The main difficulty with this method is resolving forward references, that is, references to symbols that are defined later in the program. This is shown with an example below:

 Line   Loc    Label     Opcode   Operand   Object code
 10     1000   FIRST     STL      RETADR    141033
 ...
 95     1033   RETADR    RESW     1

In the above example, the STL instruction on line 10 stores the contents of the linkage register in RETADR. But while this instruction is being processed, the value of the symbol RETADR is not yet known, because it is defined on line 95. In a single-pass assembler, scanning, parsing, and object code conversion happen together: each instruction is fetched, scanned for tokens, and parsed for syntactic and semantic validity. If it is valid, it has to be converted to its equivalent object code. Here the object code for the opcode STL can be generated, but the address of the symbol RETADR, which must be combined with it, is not yet available.
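The problem can be reproduced with a small sketch that walks the statements in source order, recording each label as it is defined and checking whether each instruction operand is already known. The tuple layout and the name `unresolved_operands` are assumptions made for illustration; the line numbers and addresses come from the example above.

```python
# Directives whose operands are not symbol references in this sketch.
DIRECTIVES = {"START", "END", "BYTE", "WORD", "RESB", "RESW"}


def unresolved_operands(statements):
    """Return (line_number, symbol) for every operand used before its definition.

    Each statement is (line_number, address, label, opcode, operand).
    """
    symtab = {}
    unresolved = []
    for number, address, label, opcode, operand in statements:
        if label:
            symtab[label] = address  # the label is known from here on
        if opcode in DIRECTIVES:
            continue
        if operand and operand not in symtab:
            unresolved.append((number, operand))
    return unresolved


statements = [
    (10, 0x1000, "FIRST", "STL", "RETADR"),
    (95, 0x1033, "RETADR", "RESW", "1"),
]
print(unresolved_operands(statements))  # [(10, 'RETADR')]
```

When line 10 is reached, RETADR is not yet in the table, so its address cannot be placed in the object code during that single scan.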

For this reason, the design is usually done in two passes. A multi-pass assembler first resolves the forward references and then converts the program into object code. The process of a multi-pass (here, two-pass) assembler is as follows:

Pass-1


 * Assign addresses to all the statements
 * Save the addresses assigned to all labels to be used in Pass-2
 * Perform some processing of assembler directives such as RESW and RESB to find the length of data areas, so that address values can be assigned.
 * Define the symbols in the symbol table (that is, generate the symbol table).
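The Pass-1 steps can be sketched roughly as follows. This is a simplified illustration, not a complete assembler: it assumes the standard SIC machine, where every instruction occupies 3 bytes (SIC/XE formats vary in length), treats any opcode that is not a directive as an instruction, and takes the START operand as hexadecimal. The names `pass_one`, `directive_length`, and the sample program are hypothetical.

```python
INSTRUCTION_LENGTH = 3  # assumption: standard SIC, all instructions are 3 bytes


def directive_length(opcode, operand):
    """Bytes occupied by a data directive, or None if opcode is not one."""
    if opcode == "WORD":
        return 3
    if opcode == "RESW":
        return 3 * int(operand)
    if opcode == "RESB":
        return int(operand)
    if opcode == "BYTE":
        body = operand[2:-1]  # text between the quotes of C'...' or X'...'
        return len(body) if operand[0] == "C" else len(body) // 2
    return None


def pass_one(lines):
    """Assign addresses and build the symbol table.

    lines holds (label, opcode, operand) tuples. Returns the symbol table,
    the address-annotated statements for Pass-2, and the program length.
    """
    symtab, intermediate = {}, []
    start = locctr = 0
    for label, opcode, operand in lines:
        if opcode == "START":
            start = locctr = int(operand, 16)
            continue
        if opcode == "END":
            break
        if label:
            if label in symtab:
                raise ValueError("duplicate symbol: " + label)
            symtab[label] = locctr
        intermediate.append((locctr, label, opcode, operand))
        size = directive_length(opcode, operand)
        locctr += INSTRUCTION_LENGTH if size is None else size
    return symtab, intermediate, locctr - start


program = [
    ("COPY", "START", "1000"),
    ("FIRST", "STL", "RETADR"),
    (None, "LDA", "THREE"),
    (None, "RSUB", None),
    ("THREE", "WORD", "3"),
    ("EOF", "BYTE", "C'EOF'"),
    ("RETADR", "RESW", "1"),
    ("BUFFER", "RESB", "4096"),
    (None, "END", "FIRST"),
]
symtab, intermediate, length = pass_one(program)
for name, address in symtab.items():
    print(name, format(address, "04X"))
```

Because the whole program is scanned before any object code is produced, RETADR is already in the symbol table by the time Pass-2 needs it.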

Pass-2


 * Assemble the instructions (translating operation codes and looking up addresses).
 * Generate data values defined by BYTE, WORD etc.
 * Perform the processing of the assembler directives not done during pass-1.
 * Write the object program and assembler listing.
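The instruction-assembling step of Pass-2 can be sketched as below. It assumes the standard SIC instruction format (8-bit opcode, 1-bit index flag, 15-bit address) and a symbol table already produced by Pass-1. The small `OPTAB` subset and the function name `assemble_instruction` are chosen for illustration; data directives, the object program records, and the listing are left out.

```python
# A small subset of the SIC operation code table.
OPTAB = {"LDA": 0x00, "STA": 0x0C, "STL": 0x14,
         "JSUB": 0x48, "RSUB": 0x4C, "STCH": 0x54}


def assemble_instruction(opcode, operand, symtab):
    """Translate one SIC instruction into a 6-digit hex object code."""
    code = OPTAB[opcode]  # translate the mnemonic
    address = 0
    if operand:
        indexed = operand.endswith(",X")
        symbol = operand[:-2] if indexed else operand
        if symbol not in symtab:
            raise KeyError("undefined symbol: " + symbol)
        address = symtab[symbol]  # look up the address saved in Pass-1
        if indexed:
            address |= 0x8000  # set the index (x) bit
    return format((code << 16) | address, "06X")


print(assemble_instruction("STL", "RETADR", {"RETADR": 0x1033}))  # 141033
```

With RETADR at address 1033, the result matches the object code 141033 shown for line 10 in the earlier example.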