S (programming language)

S is a statistical programming language developed primarily by John Chambers and (in earlier versions) Rick Becker, Trevor Hastie, William Cleveland and Allan Wilks of Bell Laboratories. The aim of the language, as expressed by John Chambers, is "to turn ideas into software, quickly and faithfully". It is widely used by academic researchers.

A major implementation of S is S-PLUS, a commercial product that was formerly sold by TIBCO Software.

R, a part of the GNU free software project, is based on S and can run many S programs, although it is not fully backwards compatible.

"Old S"
S is one of several statistical computing languages that were designed at Bell Laboratories, and first took form between 1975 and 1976. Up to that time, much of the statistical computing was done by directly calling Fortran subroutines; S was designed to offer an alternative, more interactive approach, motivated in part by the exploratory data analysis advocated by John Tukey. Early design decisions that hold even today include interactive graphics devices (printers and character terminals at the time) and easily accessible documentation for the functions.

Development of the project was led by John Chambers and Trevor Hastie, and included developers Richard Becker, Allan Wilks, and William Cleveland, all of whom were then employees of AT&T. Of the developers who contributed to S, Chambers is generally agreed to be the most significant. In 1998, Chambers received the Software System Award from the Association for Computing Machinery for his work on S.

The first working version of S was built in 1976, and ran on the GCOS operating system. At this time, S was unnamed, and suggestions included ISCS (Interactive SCS), SCS (Statistical Computing System), and SAS (Statistical Analysis System), the last of which was already taken by the SAS System. The name 'S' (used with single quotation marks until 1979) was chosen, as it was a common letter in the suggestions and consistent with other programming languages designed at the same institution at the time (namely the C programming language). It stands for the word "statistics".

When UNIX/32V was ported to the (then new) 32-bit DEC VAX, computing on the Unix platform became feasible for S. In late 1979, S2 was ported from GCOS to UNIX, which would become the new primary platform.

In 1980, the first version of S was distributed outside Bell Laboratories, and in 1981 source versions were made available. S was distributed freely in academic circles, and became popular among academic statisticians. In 1984, two books were published by the research team at Bell Laboratories: S: An Interactive Environment for Data Analysis and Graphics (1984 Brown Book) and Extending the S System. Also in 1984, the source code for S became licensed through AT&T Software Sales for educational and commercial purposes.

"New S"
The first version of S-PLUS was released by Statistical Sciences, Inc. in 1988; S-PLUS was later sold to TIBCO Software. By then, the release of S3 had brought many changes to S and to the syntax of the language. The New S Language (1988 Blue Book) was published to introduce the new features, such as the transition from macros to functions and the ability to pass functions as arguments to other functions. Many other changes to the S language extended the concept of "objects" and made the syntax more consistent (and strict). However, many users found the transition to New S difficult, since their macros needed to be rewritten. Other changes also took hold, such as the use of X11 and PostScript graphics devices, the rewriting of many internal functions from Fortran to C, and the use of double-precision (only) arithmetic. The New S language is very similar to that used in modern versions of S-PLUS and R.
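As a minimal sketch of what passing functions to other functions looks like, the following is written in R syntax, which closely follows New S. The use of apply here is an illustrative choice, and the helper twice is a hypothetical name that is not part of S or R.

```r
# A small numeric matrix: 2 rows, 3 columns, filled column by column.
m <- matrix(1:6, nrow = 2)

# apply() receives a function as its third argument and calls it on each
# row (MARGIN = 1) or each column (MARGIN = 2) of the matrix.
row_totals <- apply(m, 1, sum)
col_ranges <- apply(m, 2, function(x) max(x) - min(x))

# A user-defined function that accepts another function as an argument.
# `twice` is a hypothetical helper made up for this illustration.
twice <- function(f, x) f(f(x))
result <- twice(function(v) v * 2, 5)
```

Because functions are ordinary objects, the same mechanism works for built-in functions such as sum and for anonymous functions written inline.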

The graphical user interface of S was also updated with interactive graphical features after integration with Axum.

In 1991, Statistical Models in S (1991 White Book) was published, which introduced the use of formula notation (which uses the ~ operator), data frame objects, and modifications to the use of object methods and classes.
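A short sketch of a formula used together with a data frame, written in R syntax, may make these ideas concrete. The column names and values are invented for illustration, and lm is used only as one familiar model-fitting function that accepts a formula.

```r
# A small data frame; column names and values are made up for illustration.
d <- data.frame(
  dose     = c(1, 2, 3, 4, 5),
  response = c(2.1, 3.9, 6.2, 7.8, 10.1)
)

# The formula reads "response is modelled by dose".
f <- response ~ dose

# lm() fits a linear model, looking up the formula's variables in `d`.
fit <- lm(f, data = d)

class(f)    # "formula"
class(fit)  # "lm"
coef(fit)   # named vector holding the intercept and the slope for dose
```

The formula is itself an object, so it can be stored in a variable, inspected, and passed to different modelling functions.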

S4
The latest version of the S standard is S4, released in 1998. It provides advanced object-oriented features. S4 classes differ markedly from S3 classes; S4 formally defines the representation and inheritance for each class, and has multiple dispatch: the generic function can be dispatched to a method based on the class of any number of arguments, not just one.
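The following sketch, written in R using its methods package, illustrates formal class definitions and dispatch on two arguments at once. The class names Shape, Circle and Square and the generic overlap are hypothetical names chosen for this example.

```r
# Formal class definitions: slots and inheritance are declared up front.
setClass("Shape", representation("VIRTUAL"))
setClass("Circle", contains = "Shape",
         representation = representation(r = "numeric"))
setClass("Square", contains = "Shape",
         representation = representation(side = "numeric"))

# A generic with two arguments; both take part in method selection.
setGeneric("overlap", function(a, b) standardGeneric("overlap"))

setMethod("overlap", signature("Circle", "Circle"),
          function(a, b) "circle-circle")
setMethod("overlap", signature("Circle", "Square"),
          function(a, b) "circle-square")
# Fallback for any other pair of shapes, reached through inheritance.
setMethod("overlap", signature("Shape", "Shape"),
          function(a, b) "generic shape-shape")

c1 <- new("Circle", r = 1)
s1 <- new("Square", side = 2)

r1 <- overlap(c1, c1)  # selects the Circle, Circle method
r2 <- overlap(c1, s1)  # selects the Circle, Square method
r3 <- overlap(s1, c1)  # no Square, Circle method, so Shape, Shape is used
```

With S3-style dispatch, only the class of the first argument would decide which method runs; here the classes of both arguments are considered together.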