Macro (computer science)

In computer programming, a macro (short for "macro instruction"; ) is a rule or pattern that specifies how a certain input should be mapped to a replacement output. Applying a macro to an input is known as macro expansion. The input and output may be a sequence of lexical tokens or characters, or a syntax tree. Character macros are supported in software applications to make it easy to invoke common command sequences. Token and tree macros are supported in some programming languages to enable code reuse or to extend the language, sometimes for domain-specific languages.

Macros are used to make a sequence of computing instructions available to the programmer as a single program statement, making the programming task less tedious and less error-prone. Thus, they are called "macros" because a "big" block of code can be expanded from a "small" sequence of characters. Macros often allow positional or keyword parameters that dictate what the conditional assembler program generates and have been used to create entire programs or program suites according to such variables as operating system, platform or other factors. The term derives from "macro instruction", and such expansions were originally used in generating assembly language code.

Keyboard and mouse macros
Keyboard macros and mouse macros allow short sequences of keystrokes and mouse actions to transform into other, usually more time-consuming, sequences of keystrokes and mouse actions. In this way, frequently used or repetitive sequences of keystrokes and mouse movements can be automated. Separate programs for creating these macros are called macro recorders.

During the 1980s, macro programs – originally SmartKey, then SuperKey, KeyWorks, Prokey – were very popular, first as a means to automatically format screenplays, then for a variety of user input tasks. These programs were based on the terminate-and-stay-resident mode of operation and applied to all keyboard input, no matter in which context it occurred. They have to some extent fallen into obsolescence following the advent of mouse-driven user interfaces and the availability of keyboard and mouse macros in applications, such as word processors and spreadsheets, making it possible to create application-sensitive keyboard macros.

Keyboard macros can be used in massively multiplayer online role-playing games (MMORPGs) to perform repetitive, but lucrative tasks, thus accumulating resources. As this is done without human effort, it can skew the economy of the game. For this reason, use of macros is a violation of the TOS or EULA of most MMORPGs, and their administrators spend considerable effort to suppress them.

Application macros and scripting
Keyboard and mouse macros that are created using an application's built-in macro features are sometimes called application macros. They are created by carrying out the sequence once and letting the application record the actions. An underlying macro programming language, most commonly a scripting language, with direct access to the features of the application may also exist.

The programmers' text editor, Emacs, (short for "editing macros") follows this idea to a conclusion. In effect, most of the editor is made of macros. Emacs was originally devised as a set of macros in the editing language TECO; it was later ported to dialects of Lisp.

Another programmers' text editor, Vim (a descendant of vi), also has an implementation of keyboard macros. It can record into a register (macro) what a person types on the keyboard and it can be replayed or edited just like VBA macros for Microsoft Office. Vim also has a scripting language called Vimscript to create macros.

Visual Basic for Applications (VBA) is a programming language included in Microsoft Office from Office 97 through Office 2019 (although it was available in some components of Office prior to Office 97). However, its function has evolved from and replaced the macro languages that were originally included in some of these applications.

XEDIT, running on the Conversational Monitor System (CMS) component of VM, supports macros written in EXEC, EXEC2 and REXX, and some CMS commands were actually wrappers around XEDIT macros. The Hessling Editor (THE), a partial clone of XEDIT, supports Rexx macros using Regina and Open Object REXX (oorexx). Many common applications, and some on PCs, use Rexx as a scripting language.

Macro virus
VBA has access to most Microsoft Windows system calls and executes when documents are opened. This makes it relatively easy to write computer viruses in VBA, commonly known as macro viruses. In the mid-to-late 1990s, this became one of the most common types of computer virus. However, during the late 1990s and to date, Microsoft has been patching and updating its programs. In addition, current anti-virus programs immediately counteract such attacks.

Parameterized and parameterless macro
A parameterized macro is a macro that is able to insert given objects into its expansion. This gives the macro some of the power of a function.

As a simple example, in the C programming language, this is a typical macro that is not a parameterized macro, i.e., a parameterless macro: #define PI  3.14159 This causes  to always be replaced with   wherever it occurs. An example of a parameterized macro, on the other hand, is this: #define pred(x) ((x)-1) What this macro expands to depends on what argument x is passed to it. Here are some possible expansions: pred(2)   →  ((2)   -1) pred(y+2) →  ((y+2) -1) pred(f(5)) → ((f(5))-1) Parameterized macros are a useful source-level mechanism for performing in-line expansion, but in languages such as C where they use simple textual substitution, they have a number of severe disadvantages over other mechanisms for performing in-line expansion, such as inline functions.

The parameterized macros used in languages such as Lisp, PL/I and Scheme, on the other hand, are much more powerful, able to make decisions about what code to produce based on their arguments; thus, they can effectively be used to perform run-time code generation.

Text-substitution macros
Languages such as C and some assembly languages have rudimentary macro systems, implemented as preprocessors to the compiler or assembler. C preprocessor macros work by simple textual substitution at the token, rather than the character level. However, the macro facilities of more sophisticated assemblers, e.g., IBM High Level Assembler (HLASM) can't be implemented with a preprocessor; the code for assembling instructions and data is interspersed with the code for assembling macro invocations. A classic use of macros is in the computer typesetting system TeX and its derivatives, where most of the functionality is based on macros.

MacroML is an experimental system that seeks to reconcile static typing and macro systems. Nemerle has typed syntax macros, and one productive way to think of these syntax macros is as a multi-stage computation.

Other examples:
 * m4 is a sophisticated stand-alone macro processor.
 * TRAC
 * Macro Extension TAL, accompanying Template Attribute Language
 * SMX: for web pages
 * ML/1 (Macro Language One)
 * The General Purpose Macrogenerator is a contextual pattern-matching macro processor, which could be described as a combination of regular expressions, EBNF and AWK
 * troff and nroff: for typesetting and formatting Unix manpages.
 * CMS EXEC: for command-line macros and application macros
 * EXEC 2 in Conversational Monitor System (CMS): for command-line macros and application macros
 * CLIST in IBM's Time Sharing Option (TSO): for command-line macros and application macros
 * REXX: for command-line macros and application macros in, e.g., AmigaOS, CMS, OS/2, TSO
 * SCRIPT: for formatting documents
 * Various shells for, e.g., Linux

Some major applications have been written as text macro invoked by other applications, e.g., by XEDIT in CMS.

Embeddable languages
Some languages, such as PHP, can be embedded in free-format text, or the source code of other languages. The mechanism by which the code fragments are recognised (for instance, being bracketed by  and  ) is similar to a textual macro language, but they are much more powerful, fully featured languages.

Procedural macros
Macros in the PL/I language are written in a subset of PL/I itself: the compiler executes "preprocessor statements" at compilation time, and the output of this execution forms part of the code that is compiled. The ability to use a familiar procedural language as the macro language gives power much greater than that of text substitution macros, at the expense of a larger and slower compiler. Macros in PL/I, as well as in many assemblers, may have side effects, e.g., setting variables that other macros can access.

Frame technology's frame macros have their own command syntax but can also contain text in any language. Each frame is both a generic component in a hierarchy of nested subassemblies, and a procedure for integrating itself with its subassembly frames (a recursive process that resolves integration conflicts in favor of higher level subassemblies). The outputs are custom documents, typically compilable source modules. Frame technology can avoid the proliferation of similar but subtly different components, an issue that has plagued software development since the invention of macros and subroutines.

Most assembly languages have less powerful procedural macro facilities, for example allowing a block of code to be repeated N times for loop unrolling; but these have a completely different syntax from the actual assembly language.

Syntactic macros
Macro systems—such as the C preprocessor described earlier—that work at the level of lexical tokens cannot preserve the lexical structure reliably. Syntactic macro systems work instead at the level of abstract syntax trees, and preserve the lexical structure of the original program. The most widely used implementations of syntactic macro systems are found in Lisp-like languages. These languages are especially suited for this style of macro due to their uniform, parenthesized syntax (known as S-expressions). In particular, uniform syntax makes it easier to determine the invocations of macros. Lisp macros transform the program structure itself, with the full language available to express such transformations. While syntactic macros are often found in Lisp-like languages, they are also available in other languages such as Prolog, Erlang, Dylan, Scala, Nemerle, Rust, Elixir, Nim, Haxe, and Julia. They are also available as third-party extensions to JavaScript and C#.

Early Lisp macros
Before Lisp had macros, it had so-called FEXPRs, function-like operators whose inputs were not the values computed by the arguments but rather the syntactic forms of the arguments, and whose output were values to be used in the computation. In other words, FEXPRs were implemented at the same level as EVAL, and provided a window into the meta-evaluation layer. This was generally found to be a difficult model to reason about effectively.

In 1963, Timothy Hart proposed adding macros to Lisp 1.5 in AI Memo 57: MACRO Definitions for LISP.

Anaphoric macros
An anaphoric macro is a type of programming macro that deliberately captures some form supplied to the macro which may be referred to by an anaphor (an expression referring to another). Anaphoric macros first appeared in Paul Graham's On Lisp and their name is a reference to linguistic anaphora—the use of words as a substitute for preceding words.

Hygienic macros
In the mid-eighties, a number of papers introduced the notion of hygienic macro expansion, a pattern-based system where the syntactic environments of the macro definition and the macro use are distinct, allowing macro definers and users not to worry about inadvertent variable capture (cf. referential transparency). Hygienic macros have been standardized for Scheme in the R5RS, R6RS, and R7RS standards. A number of competing implementations of hygienic macros exist such as,  , explicit renaming, and syntactic closures. Both  and   have been standardized in the Scheme standards.

Recently, Racket has combined the notions of hygienic macros with a "tower of evaluators", so that the syntactic expansion time of one macro system is the ordinary runtime of another block of code, and showed how to apply interleaved expansion and parsing in a non-parenthesized language.

A number of languages other than Scheme either implement hygienic macros or implement partially hygienic systems. Examples include Scala, Rust, Elixir, Julia, Dylan, Nim, and Nemerle.

Applications

 * Evaluation order
 * Macro systems have a range of uses. Being able to choose the order of evaluation (see lazy evaluation and non-strict functions) enables the creation of new syntactic constructs (e.g. control structures) indistinguishable from those built into the language. For instance, in a Lisp dialect that has  but lacks , it is possible to define the latter in terms of the former using macros. For example, Scheme has both continuations and hygienic macros, which enables a programmer to design their own control abstractions, such as looping and early exit constructs, without the need to build them into the language.


 * Data sub-languages and domain-specific languages
 * Next, macros make it possible to define data languages that are immediately compiled into code, which means that constructs such as state machines can be implemented in a way that is both natural and efficient.


 * Binding constructs
 * Macros can also be used to introduce new binding constructs. The most well-known example is the transformation of  into the application of a function to a set of arguments.

Felleisen conjectures that these three categories make up the primary legitimate uses of macros in such a system. Others have proposed alternative uses of macros, such as anaphoric macros in macro systems that are unhygienic or allow selective unhygienic transformation.

The interaction of macros and other language features has been a productive area of research. For example, components and modules are useful for large-scale programming, but the interaction of macros and these other constructs must be defined for their use together. Module and component-systems that can interact with macros have been proposed for Scheme and other languages with macros. For example, the Racket language extends the notion of a macro system to a syntactic tower, where macros can be written in languages including macros, using hygiene to ensure that syntactic layers are distinct and allowing modules to export macros to other modules.

Macros for machine-independent software
Macros are normally used to map a short string (macro invocation) to a longer sequence of instructions. Another, less common, use of macros is to do the reverse: to map a sequence of instructions to a macro string. This was the approach taken by the STAGE2 Mobile Programming System, which used a rudimentary macro compiler (called SIMCMP) to map the specific instruction set of a given computer into machine-independent macros. Applications (notably compilers) written in these machine-independent macros can then be run without change on any computer equipped with the rudimentary macro compiler. The first application run in such a context is a more sophisticated and powerful macro compiler, written in the machine-independent macro language. This macro compiler is applied to itself, in a bootstrap fashion, to produce a compiled and much more efficient version of itself. The advantage of this approach is that complex applications can be ported from one computer to a very different computer with very little effort (for each target machine architecture, just the writing of the rudimentary macro compiler). The advent of modern programming languages, notably C, for which compilers are available on virtually all computers, has rendered such an approach superfluous. This was, however, one of the first instances (if not the first) of compiler bootstrapping.

Assembly language
While macro instructions can be defined by a programmer for any set of native assembler program instructions, typically macros are associated with macro libraries delivered with the operating system allowing access to operating system functions such as
 * peripheral access by access methods (including macros such as OPEN, CLOSE, READ and WRITE)
 * operating system functions such as ATTACH, WAIT and POST for subtask creation and synchronization. Typically such macros expand into executable code, e.g., for the EXIT macroinstruction,
 * a list of define constant instructions, e.g., for the DCB macro—DTF (Define The File) for DOS —or a combination of code and constants, with the details of the expansion depending on the parameters of the macro instruction (such as a reference to a file and a data area for a READ instruction);
 * the executable code often terminated in either a branch and link register instruction to call a routine, or a supervisor call instruction to call an operating system function directly.
 * Generating a Stage 2 job stream for system generation in, e.g., OS/360. Unlike typical macros, sysgen stage 1 macros do not generate data or code to be loaded into storage, but rather use the PUNCH statement to output JCL and associated data.

In older operating systems such as those used on IBM mainframes, full operating system functionality was only available to assembler language programs, not to high level language programs (unless assembly language subroutines were used, of course), as the standard macro instructions did not always have counterparts in routines available to high-level languages.

History
In the mid-1950s, when assembly language programming was commonly used to write programs for digital computers, the use of macro instructions was initiated for two main purposes: to reduce the amount of program coding that had to be written by generating several assembly language statements from one macro instruction and to enforce program writing standards, e.g. specifying input/output commands in standard ways. Macro instructions were effectively a middle step between assembly language programming and the high-level programming languages that followed, such as FORTRAN and COBOL. Two of the earliest programming installations to develop "macro languages" for the IBM 705 computer were at Dow Chemical Corp. in Delaware and the Air Material Command, Ballistics Missile Logistics Office in California. A macro instruction written in the format of the target assembly language would be processed by a macro compiler, which was a pre-processor to the assembler, to generate one or more assembly language instructions to be processed next by the assembler program that would translate the assembly language instructions into machine language instructions.

By the late 1950s the macro language was followed by the Macro Assemblers. This was a combination of both where one program served both functions, that of a macro pre-processor and an assembler in the same package.

In 1959, Douglas E. Eastwood and Douglas McIlroy of Bell Labs introduced conditional and recursive macros into the popular SAP assembler, creating what is known as Macro SAP. McIlroy's 1960 paper was seminal in the area of extending any (including high-level) programming languages through macro processors.

Macro Assemblers allowed assembly language programmers to implement their own macro-language and allowed limited portability of code between two machines running the same CPU but different operating systems, for example, early versions of MS-DOS and CP/M-86. The macro library would need to be written for each target machine but not the overall assembly language program. Note that more powerful macro assemblers allowed use of conditional assembly constructs in macro instructions that could generate different code on different machines or different operating systems, reducing the need for multiple libraries.

In the 1980s and early 1990s, desktop PCs were only running at a few MHz and assembly language routines were commonly used to speed up programs written in C, Fortran, Pascal and others. These languages, at the time, used different calling conventions. Macros could be used to interface routines written in assembly language to the front end of applications written in almost any language. Again, the basic assembly language code remained the same, only the macro libraries needed to be written for each target language.

In modern operating systems such as Unix and its derivatives, operating system access is provided through subroutines, usually provided by dynamic libraries. High-level languages such as C offer comprehensive access to operating system functions, obviating the need for assembler language programs for such functionality.

Moreover, standard libraries of several newer programming languages, such as Go, actively discourage the use of syscalls in favor of platform-agnostic libraries as well if not necessary, to improve portability and security.