Comparison of programming languages (syntax)

This comparison of programming languages compares the features of language syntax (format) for over 50 computer programming languages.

Expressions
Programming language expressions can be broadly classified into four syntax structures:


 * prefix notation: Lisp
 * infix notation: Fortran
 * suffix, postfix, or Reverse Polish notation: Forth
 * math-like notation: TUTOR
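
To make the difference between the notations concrete, the following is a minimal Python sketch that evaluates the same arithmetic expression written in postfix and in prefix token order. It is not taken from any of the languages listed above; the function names `eval_postfix` and `eval_prefix`, and the whitespace-separated token format, are assumptions made for illustration.

```python
import operator

# Binary operators supported by this sketch.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_postfix(text):
    """Evaluate a postfix (Reverse Polish) expression with a stack."""
    stack = []
    for tok in text.split():
        if tok in OPS:
            right = stack.pop()  # the most recent operand is the right one
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(float(tok))
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]


def eval_prefix(text):
    """Evaluate a prefix expression by scanning the tokens right to left."""
    stack = []
    for tok in reversed(text.split()):
        if tok in OPS:
            left = stack.pop()  # scanning backwards, the left operand is on top
            right = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(float(tok))
    if len(stack) != 1:
        raise ValueError("malformed prefix expression")
    return stack[0]


# The infix expression (1 + 2) * 3 in the two other notations.
postfix_result = eval_postfix("1 2 + 3 *")
prefix_result = eval_prefix("* + 1 2 3")
```

Neither evaluator needs parentheses or precedence rules, because the token order alone fixes the grouping; that is the main practical difference from infix notation.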

Statements
When a programming language has statements, it typically has conventions for:


 * statement separators;
 * statement terminators; and
 * line continuation

A statement separator demarcates the boundary between two separate statements. A statement terminator defines the end of an individual statement. Languages that interpret the end of line to be the end of a statement are called "line-oriented" languages.

"Line continuation" is a convention in line-oriented languages, where a newline character would otherwise be interpreted as a statement terminator. It allows a single statement to span more than one line.

Line continuation
Line continuation is generally done as part of lexical analysis: a newline normally results in a token being added to the token stream, unless line continuation is detected.
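
As a rough illustration of the idea, the Python sketch below joins physical lines into logical lines before any tokens would be produced. It assumes the "backslash as last character of line" convention; the function name `logical_lines` is hypothetical and the sketch ignores strings and comments, which a real lexer would have to handle.

```python
def logical_lines(source):
    """Yield logical lines, joining physical lines that end in a backslash."""
    pending = ""
    for physical in source.splitlines():
        if physical.endswith("\\"):
            # Continuation detected: drop the backslash, emit nothing yet.
            pending += physical[:-1]
            continue
        # An ordinary newline ends the logical line.
        yield pending + physical
        pending = ""
    if pending:
        # Trailing continuation with no following line.
        yield pending


joined = list(logical_lines("a = 1 + \\\n2\nb = 3"))
```

Here the first two physical lines form one logical line, so only two statement boundaries are reported for three physical lines.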


 * Whitespace – Languages that do not need continuations
 * Ada – Lines terminate with semicolon
 * C# – Lines terminate with semicolon
 * JavaScript – Lines terminate with semicolon (which may be inferred)
 * Lua
 * OCaml


 * Ampersand as last character of line
 * Fortran 90, Fortran 95, Fortran 2003, Fortran 2008


 * Backslash as last character of line
 * bash and other Unix shells
 * C, C++ preprocessor
 * Mathematica, Wolfram Language
 * Python
 * Ruby
 * JavaScript – only within single- or double-quoted strings


 * Backtick as last character of line
 * PowerShell


 * Hyphen as last character of line
 * SQL*Plus


 * Underscore as last character of line
 * AutoIt
 * Cobra
 * Visual Basic
 * Xojo


 * Ellipsis (as three periods–not one special character)
 * MATLAB: The ellipsis token need not be the last characters on the line, but any following it will be ignored. (In essence, it begins a comment that extends through (i.e. including) the first subsequent newline character. Contrast this with an inline comment, which extends until the first subsequent newline.)


 * Comma delimiter as last character of line
 * Ruby (comment may follow delimiter)


 * Left bracket delimiter as last character of line
 * Batch file: starting a parenthetical block can allow line continuation
 * Ruby: left parenthesis, left square bracket, or left curly bracket


 * Operator as last object of line
 * Ruby (comment may follow operator)


 * Operator as first character of continued line
 * AutoHotkey: Any expression operators except ++ and --, and a comma or a period


 * Backslash as first character of continued line
 * Vimscript


 * Some form of inline comment serves as line continuation
 * Turbo Assembler: \
 * m4: dnl
 * TeX: %


 * Character position
 * Fortran 77: A non-comment line is a continuation of the prior non-comment line if any non-space character appears in column 6. Comment lines cannot be continued.
 * COBOL: String constants may be continued by not ending the original string in a PICTURE clause with ', then inserting a - in column 7 (the same position where the * for a comment is used).
 * TUTOR: Lines starting with a tab (after any indentation required by the context) continue the prior command.


 * [End and Begin] using normal quotes
 * C, C++ preprocessor: The string is ended normally and continues by starting with a quote on the next line.

Libraries
Importing a library is a way to read external, possibly compiled, routines, programs, or packages. Imports can be classified by level (module, package, class, procedure, ...) and by syntax (directive name, attributes, ...).


 * File import
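 * addpath(directory) MATLAB
 * COPY filename. COBOL
 * :-include("filename"). Prolog
 * #include file="filename" ASP
 * #include <filename> AutoHotkey, AutoIt, C, C++
 * #include "filename" AutoHotkey, AutoIt, C, C++
 * #import "filename" Objective-C
 * #import <filename> Objective-C
 * Import["filename"] Mathematica, Wolfram Language
 * include 'filename' Fortran
 * include "filename"; PHP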
 * , Pick Basic
 * Pick Basic
 * include!("filename"); Rust
 * load "filename" Ruby
 * load %filename Red
 * require('filename') Lua
 * require "filename"; Perl, PHP
 * require "filename" Ruby
 * source("filename") R
 * @import("filename"); Zig


 * Package import
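 * #include filename C, C++
 * #[path = "filename"] mod altname; Rust
 * @import module; Objective-C
 * <<name Mathematica, Wolfram Language
 * :-use_module(module). Prolog
 * from module import * Python
 * extern crate libname; Rust
 * extern crate libname as altname; Rust
 * mod modname; Rust
 * library("package") R
 * IMPORT module Oberon
 * import altname "package/name" Go
 * import package.module; D
 * import altname = package.module; D
 * import Module Haskell
 * import qualified Module as M Haskell
 * import package.* Java, MATLAB, Kotlin
 * import "modname"; JavaScript
 * import altname from "modname"; JavaScript
 * import package Scala
 * import package._ Scala
 * import module Swift
 * import module V (Vlang)
 * import module Python
 * require('modname') Lua
 * require "gem" Ruby
 * use module Fortran 90+
 * use module, only : identifier Fortran 90+
 * use Module; Perl
 * use Module qw(import options); Perl
 * use Package.Name Cobra
 * uses unit Pascal
 * with package Ada
 * @import("pkgname"); Zig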


 * Class import
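 * from module import class Python
 * import package.class Java, MATLAB, Kotlin
 * import class from "modname"; JavaScript
 * import {class} from "modname"; JavaScript
 * import {class as altname} from "modname"; JavaScript
 * import package.class Scala
 * import package.{ class1 => alternativeName, class2 } Scala
 * import package._ Scala
 * use Namespace\ClassName; PHP
 * use Namespace\ClassName as AliasName; PHP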


 * Procedure/function import
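 * from module import function Python
 * import package.module : symbol; D
 * import package.module : altsymbolname = symbol; D
 * import Module (function) Haskell
 * import function from "modname"; JavaScript
 * import {function} from "modname"; JavaScript
 * import {function as altname} from "modname"; JavaScript
 * import package.function MATLAB
 * import package.class.function Scala
 * import package.class.{ function => alternativeName, otherFunction } Scala
 * use Module ('symbol'); Perl
 * use function Namespace\function_name; PHP
 * use Namespace\function_name as function_alias_name; PHP
 * use module::submodule::symbol; Rust
 * use module::submodule::{symbol1, symbol2}; Rust
 * use module::submodule::symbol as altname; Rust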


 * Constant import
 * use const Namespace\CONST_NAME; PHP

The above statements can also be classified by whether they are a syntactic convenience (allowing things to be referred to by a shorter name, but they can still be referred to by some fully qualified name without import), or whether they are actually required to access the code (without which it is impossible to access the code, even with fully qualified names).


 * Syntactic convenience
 * import package.* Java
 * import package.class Java
 * open module OCaml


 * Required to access code
 * import altname "package/name" Go
 * import altname from "modname" JavaScript
 * import module Python
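
As an illustration of the levels described in this section, the short Python sketch below imports at module, function, and constant level using only the standard library. The variable names are chosen for the example. In Python an import is required to access the code: the name `math` is not defined at all until the import statement has run.

```python
import math                 # module import: members are reached as math.<name>
import os.path as osp       # module import bound to a shorter alternative name
from math import sqrt       # function import: sqrt can be used unqualified
from math import pi as PI   # constant import under an alias

qualified = math.sqrt(16.0)   # access through the module name
unqualified = sqrt(16.0)      # access through the imported function name
joined = osp.join("a", "b")   # access through the alias
```

Both calls reach the same function; the import form only changes which names are bound in the importing module.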

Blocks
A block is a notation for a group of two or more statements, expressions or other units of code that are related in such a way as to comprise a whole.


 * Braces (a.k.a. curly brackets) { ... }:
 * Curly bracket programming languages: C, C++, Objective-C, Go, Java, JavaScript/ECMAScript, V (Vlang), C#, D, Perl, PHP ( &   loops, or pass a block as argument), R, Rust, Scala, S-Lang, Swift, PowerShell, Haskell (in do-notation), AutoHotkey, Zig


 * Parentheses ( ... ):
 * Batchfile, F# (lightweight syntax), OCaml, Prolog, Standard ML
 * Square brackets [ ... ]:
 * Rebol, Red, Self, Smalltalk (blocks are first-class objects, a.k.a. closures)
 * begin ... end:
 * Ada, ALGOL, F# (verbose syntax), Pascal, Ruby (for, do/while & do/until loops), OCaml, SCL, Simula, Erlang
 * do ... end:
 * PL/I, REXX
 * do ... done:
 * Bash (for & while loops), F# (verbose syntax), Visual Basic, Fortran, TUTOR (with mandatory indenting of block body), Visual Prolog
 * do ... end:
 * Lua, Ruby (pass blocks as arguments, for loop), Seed7 (encloses loop bodies between do and end)
 * X ... end (e.g. if ... end):
 * Ruby (if, while, until, def, class, module statements), OCaml (for & while loops), MATLAB (if & switch conditionals, for & while loops, try clause, package, classdef, properties, methods, events, & function blocks), Lua (then / else & function)
 * (begin ...):
 * Scheme
 * (progn ...):
 * Lisp
 * (do ...):
 * Clojure


 * Indentation
 * Off-side rule languages: Boo, Cobra, CoffeeScript, F#, Haskell (in do-notation when braces are omitted), LiveScript, occam, Python, Nemerle (Optional; the user may use white-space sensitive syntax instead of the curly-brace syntax if they so desire), Nim, Scala (Optional, as in Nemerle)
 * Free-form languages: most descendants from ALGOL (including C, Pascal, and Perl); Lisp languages


 * Others
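 * Ada, Visual Basic, Seed7: if ... end if
 * APL: :If ... :EndIf or :If ... :End
 * Bash, sh, and ksh: if ... fi, do ... done, case ... esac
 * ALGOL 68: begin ... end, ( ... ), if ... fi, do ... od
 * Lua, Pascal, Modula-2, Seed7: repeat ... until
 * COBOL: IF ... END-IF, PERFORM ... END-PERFORM, etc. for statements; ... . for sentences.
 * Visual Basic .Net: If ... End If, For ... Next, Do ... Loop
 * Small Basic: If ... EndIf, For ... EndFor, While ... EndWhile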

Comments
Comments can be classified by:
 * style (inline/block)
 * parse rules (ignored/interpolated/stored in memory)
 * recursivity (nestable/non-nestable)
 * uses (docstrings/throwaway comments/other)

Inline comments
Inline comments are generally those that use a newline character to indicate the end of a comment, and an arbitrary delimiter or sequence of tokens to indicate the beginning of a comment.

Examples:
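
As one illustration of the definition above, the following Python sketch removes an inline comment from a single line, given the delimiter that starts it. The function name `strip_inline_comment` is hypothetical, and the sketch is deliberately naive: it assumes the delimiter never occurs inside a string literal, which a real lexer cannot assume.

```python
def strip_inline_comment(line, delimiter="#"):
    """Return the line with everything from the delimiter onwards removed.

    Naive: a delimiter inside a string literal is also treated as a comment.
    """
    code, _sep, _comment = line.partition(delimiter)
    return code.rstrip()


hash_style = strip_inline_comment("x = 1  # set x")
slash_style = strip_inline_comment("y = 2 // note", delimiter="//")
no_comment = strip_inline_comment("z = 3")
```

Because the comment always ends at the newline, no closing delimiter has to be searched for; only the start of the comment needs to be recognised.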

Block comments
Block comments are generally those that use a delimiter to indicate the beginning of a comment, and another delimiter to indicate the end of a comment. In this context, whitespace and newline characters are not counted as delimiters. In the examples, the symbol ~ represents the comment; and, the symbols surrounding it are understood by the interpreters/compilers as the delimiters.

Examples:
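
As one illustration, the Python sketch below removes block comments that use the C-style delimiters /* and */. The name `strip_block_comments` is hypothetical, the choice of delimiters is an assumption for the example, and string literals are not handled. The comments are treated as non-nestable: the first closing delimiter ends the comment.

```python
import re

# Non-greedy match from an opening delimiter to the first closing delimiter.
# re.DOTALL lets the comment span several lines.
BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def strip_block_comments(source):
    """Remove non-nestable /* ... */ comments from the source text."""
    return BLOCK_COMMENT.sub("", source)


multi_line = strip_block_comments("a /* one\ntwo */ b")
nested_attempt = strip_block_comments("/* a /* b */ c */")
```

The second call shows why nesting matters: with non-nestable comments the inner closing delimiter ends the comment early, leaving ` c */` behind as code.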

Unique variants

 * Fortran
 * Indenting lines in Fortran 66/77 is significant. The actual statement is in columns 7 through 72 of a line. Any non-space character in column 6 indicates that this line is a continuation of the prior line. A 'C' in column 1 indicates that this entire line is a comment. Columns 1 through 5 may contain a number which serves as a label. Columns 73 through 80 are ignored and may be used for comments; in the days of punched cards, these columns often contained a sequence number so that the deck of cards could be sorted into the correct order if someone accidentally dropped the cards. Fortran 90 removed the need for the indentation rule and added inline comments, using the ! character as the comment delimiter.
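
The column rules above can be sketched in Python as follows. This is an illustration rather than a conforming Fortran reader: the names `FixedFormLine` and `parse_fixed_form` are hypothetical, and treating `C` or `*` in column 1 as the comment marker is an assumption of the sketch.

```python
from typing import NamedTuple


class FixedFormLine(NamedTuple):
    is_comment: bool
    label: str             # columns 1-5
    is_continuation: bool  # non-space character in column 6
    statement: str         # columns 7-72


def parse_fixed_form(line):
    """Split one fixed-form source line into its column fields."""
    padded = line.ljust(72)  # short lines behave as if padded with spaces
    if padded[0] in "C*":    # assumed comment markers in column 1
        return FixedFormLine(True, "", False, "")
    label = padded[0:5].strip()
    is_continuation = padded[5] != " "
    statement = padded[6:72].rstrip()  # columns 73 onwards are ignored
    return FixedFormLine(False, label, is_continuation, statement)


first = parse_fixed_form("   10 X = Y +")
second = parse_fixed_form("     &Z")
```

Here `first` carries the label `10`, and `second` is recognised as a continuation of it because of the `&` in column 6.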


 * COBOL
 * In fixed format code, line indentation is significant. Columns 1–6 and columns from 73 onwards are ignored. If a * or / is in column 7, then that line is a comment. Until COBOL 2002, if a D or d was in column 7, it would define a "debugging line" which would be ignored unless the compiler was instructed to compile it.


 * Cobra
 * Cobra supports block comments with "/# ... #/", which is like the "/* ... */" often found in C-based languages, but with two differences. The # character is reused from the single-line comment form "# ...", and the block comments can be nested, which is convenient for commenting out large blocks of code.


 * Curl
 * Curl supports block comments with user-defined tags, as in |foo# ... #foo|.


 * Lua
 * Like raw strings, there can be any number of equals signs between the square brackets, provided both the opening and closing tags have a matching number of equals signs; this allows nesting as long as nested block comments/raw strings use a different number of equals signs than their enclosing comment: --[[comment --[=[ nested comment ]=] ]]. Lua discards the first newline (if present) that directly follows the opening tag.


 * Perl
 * Block comments in Perl are considered part of the documentation, and are given the name Plain Old Documentation (POD). Technically, Perl does not have a convention for including block comments in source code, but POD is routinely used as a workaround.


 * PHP


 * PHP supports standard C/C++ style comments, but supports Perl style as well.


 * Python
 * Using triple quotes to comment out lines of source does not actually form a comment. The enclosed text becomes a string literal, which Python usually ignores (except when it is the first statement in the body of a module, class or function; see docstring).
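
The distinction can be checked directly in Python. In the sketch below (function names chosen for the example), the same kind of triple-quoted string is a docstring in one function and a discarded expression in the other.

```python
def documented():
    """This string is the first statement, so it becomes the docstring."""
    return 1


def not_documented():
    x = 1
    """This string is evaluated and discarded; it is not a docstring."""
    return x


first_doc = documented.__doc__
second_doc = not_documented.__doc__
```

Only the first function ends up with a `__doc__` value; in the second, the string has no effect on the function's behaviour or metadata.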


 * Elixir
 * The above trick used in Python also works in Elixir, but the compiler will throw a warning if it spots this. To suppress the warning, one would need to prepend the sigil ~S (which prevents string interpolation) to the triple-quoted string, leading to the final construct ~S""" ... """. In addition, Elixir supports a limited form of block comments as an official language feature, but as in Perl, this construct is entirely intended to write documentation. Unlike in Perl, it cannot be used as a workaround, being limited to certain parts of the code and throwing errors or even suppressing functions if used elsewhere.


 * Raku
 * Raku uses #`(...) to denote block comments. Raku actually allows the use of any "right" and "left" paired brackets after #` (i.e. #`(...), #`[...], #`{...}, #`<...>, and even the more complicated #`{{...}} are all valid block comments). Brackets are also allowed to be nested inside comments (i.e. #`{ a { b } c } goes to the last closing brace).


 * Ruby
 * A block comment in Ruby opens at a =begin line and closes at an =end line.


 * S-Lang
 * The region of lines enclosed by the #<tag> and #</tag> delimiters is ignored by the interpreter. The tag name can be any sequence of alphanumeric characters that may be used to indicate how the enclosed block is to be deciphered. For example, #<latex> could indicate the start of a block of LaTeX formatted documentation.


 * Scheme and Racket
 * The next complete syntactic component (s-expression) can be commented out with #;.


 * ABAP
 * ABAP supports two different kinds of comments. If the first character of a line, including indentation, is an asterisk (*), the whole line is considered a comment, while a single double quote (") begins an inline comment which runs until the end of the line. ABAP comments are not possible between the statements EXEC SQL and ENDEXEC because Native SQL has other uses for these characters. In most SQL dialects the double dash (--) can be used instead.


 * Esoteric languages
 * Many esoteric programming languages follow the convention that any text not executed by the instruction pointer (e.g., Befunge) or otherwise assigned a meaning (e.g., Brainfuck), is considered a "comment".

Comment comparison
There is a wide variety of syntax styles for declaring comments in source code. BlockComment in italics is used here to indicate block comment style. InlineComment in italics is used here to indicate inline comment style.