Talk:SAS (software)/proposed revision

SAS is a software suite developed by SAS Institute for advanced analytics, business intelligence, data management, and predictive analytics. It is the largest market-share holder for advanced analytics.

SAS was developed at North Carolina State University from 1966 until 1976, when SAS Institute was incorporated. SAS was further developed in the 1980s and 1990s with the addition of new statistical procedures, additional components and the introduction of JMP. A point-and-click interface was added in version 9 in 2004. A social media analytics product was added in 2010.

Overview
SAS is a software suite that can mine, alter, manage and retrieve data from a variety of sources and perform statistical analysis on it. It is widely used in insurance, public health, scientific research, finance, human resources, IT, utilities, and retail, and is used for operations research, project management, quality improvement, forecasting and decision-making. It is the standard statistical analysis software for submitting clinical pharmaceutical trials to the US Food and Drug Administration. SAS provides a graphical point-and-click user interface for non-technical users and more advanced options through the SAS programming language. SAS programs have a DATA step, which retrieves and manipulates data, and a PROC step, which analyzes data.

Technical description and terminology
SAS programs have two main components called the DATA step and the PROC step. In most cases, a DATA step creates a SAS data set and passes the data to the PROC step for processing. Each step consists of a series of statements.

The DATA step has executable statements that result in the software taking an action, and declarative statements that provide instructions to read a data set or alter the data's appearance. The DATA step has two phases, compilation and execution. In the compilation phase, declarative statements are processed and syntax errors are identified. Afterwards, the execution phase processes each executable statement sequentially. Data sets are organized into tables with rows called "observations" and columns called "variables". Additionally, each piece of data has a descriptor and a value.
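The difference between declarative and executable statements can be sketched with a short DATA step. This is a minimal illustration only: the data set name runners, its variables and its values are made up for the example and do not come from SAS documentation.

```sas
/* Hypothetical data set and values, for illustration only. */
DATA runners;
    /* Declarative statements: processed during compilation,
       they set variable attributes rather than perform an action. */
    LENGTH Name $ 12;
    FORMAT Kilometers 6.2;

    /* Executable statements: run once for each iteration of the step. */
    INPUT Name $ Miles;
    Kilometers = 1.61 * Miles;

    /* Each data line below becomes one observation (row). */
    DATALINES;
Alice 26.22
Bob 13.11
;
RUN;
```

In this sketch the resulting data set has two observations and three variables (Name, Miles and Kilometers).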

The PROC step consists of PROC statements that call upon named procedures. Procedures perform analysis and reporting on data sets to produce statistics, analyses and graphics. There are more than 300 procedures and each one contains a substantial body of programming and statistical work. PROC statements can also display results, sort data or perform other operations. SAS Macros are pieces of code or variables that are coded once and referenced to perform repetitive tasks.
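As a rough sketch of how procedures and macros fit together, the following code sorts a data set and then reuses one macro to summarize two variables. It assumes the distance data set from the code examples section already exists. The macro name summarize and its parameters ds and var are hypothetical and chosen only for illustration.

```sas
/* Hypothetical macro: the name "summarize" and its parameters
   are illustrative and not part of SAS itself. */
%MACRO summarize(ds, var);
    PROC MEANS DATA = &ds MEAN MIN MAX;
        VAR &var;
    RUN;
%MEND summarize;

/* A procedure that performs an operation other than analysis:
   sort the data set in place by Miles. */
PROC SORT DATA = distance;
    BY Miles;
RUN;

/* The macro is coded once and referenced twice. */
%summarize(distance, Miles)
%summarize(distance, Kilometers)
```

Each macro call expands into a complete PROC MEANS step, so the repeated code is written only once.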

SAS data can be published in HTML, PDF, Excel and other formats using the Output Delivery System, which was first introduced in version 7. SAS Enterprise Guide is SAS' point-and-click interface. It automatically generates code to manipulate data or perform analysis and does not require SAS programming experience.
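Publishing through the Output Delivery System might look like the following sketch, which routes the output of a PROC step to an HTML file. It assumes the distance data set from the code examples section already exists, and the file name distance.html is a placeholder.

```sas
/* Open an HTML destination; the file name is a placeholder. */
ODS HTML FILE = "distance.html";

/* Any procedure output produced while the destination is open
   is written to the file. */
PROC PRINT DATA = distance;
RUN;

/* Close the destination so the file is finalized. */
ODS HTML CLOSE;
```

Other destinations, such as ODS PDF, follow the same open, run, close pattern.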

The SAS software suite has a number of individual components, including:

• Base SAS - Basic procedures and data management

• SAS/STAT - Statistical analysis

• SAS/GRAPH - Graphics and presentation

• SAS/OR - Operations research

• SAS/ETS - Econometrics and time series analysis

• SAS/IML - Interactive matrix language

• SAS/AF - Applications facility

• SAS/QC - Quality control

• SAS/INSIGHT - Data mining

• SAS/PH - Clinical trial analysis

• Enterprise Miner - Data mining

Code examples
DATA step:
DATA distance;
    Miles = 26.22;
    Kilometers = 1.61 * Miles;
RUN;

PROC step:
PROC PRINT DATA = distance;
RUN;

Origins
Anthony J. Barr began developing the structure and language of SAS at North Carolina State University in 1966 by placing statistical procedures into a formatted file framework. He was joined by student James Goodnight, who developed the software's statistical routines. In 1968, Barr and Goodnight integrated new multiple regression and analysis of variance routines. John Sall joined the project in 1973 and contributed to the software's econometrics, time series, and matrix algebra. Another early participant, Caroll G. Perkins, contributed to SAS' early programming. Jolayne W. Service and Jane T. Helwig created SAS' first documentation. The software was originally intended to increase the output of crops by analyzing agricultural data from the United States Department of Agriculture.

The first versions of SAS were named after the year in which they were released. In 1971, SAS 71 was published as a limited release. It was used only on IBM mainframes and had the main elements of SAS programming, such as the DATA step and the most common procedures in the PROC step. The following year a full version was released as SAS 72, which introduced the MERGE statement and added features for handling missing data and combining data sets. In 1976, Barr, Goodnight, Sall, and Helwig took the project out of North Carolina State and incorporated SAS Institute, Inc.

Development
SAS was re-designed in SAS 76 with an open architecture that allowed for compilers and procedures. The INPUT and INFILE statements were improved so they could read most data formats used by IBM mainframes. Report generation was added through the PUT and FILE statements. The ability to analyze general linear models was also added, as was the FORMAT procedure, which allowed developers to customize the appearance of data. In 1979, SAS 79 added support for the CMS operating system and introduced the DATASETS procedure. A few years later, SAS 82 introduced an early macro language and the APPEND procedure.

SAS version 4 had limited features. Version 5 introduced a complete macro language, array subscripts, and a full-screen interactive user interface called Display Manager. In 1985 SAS, which was previously written in PL/I, Fortran, and assembly language, was re-written in C. This allowed for SAS' Multivendor Architecture and for it to run on UNIX, MS-DOS, and Windows.

In the 1980s and 1990s, SAS released a number of components to complement Base SAS. SAS/GRAPH, which produces graphics, was released in 1980, as was the SAS/ETS component, which supports econometric and time series analysis. A component intended for pharmaceutical users, SAS/PH-Clinical, was released in the 1990s. The Food and Drug Administration standardized on SAS/PH-Clinical for new drug applications in 2002. Vertical products like SAS Financial Management and SAS Human Capital Management (then called CFO Vision and HR Vision respectively) were also introduced. JMP was developed by SAS co-founder John Sall and a team of developers to take advantage of the graphical user interface introduced in the 1984 Apple Macintosh. It shipped for the first time in 1989. Additional versions of JMP were released in 2000, 2002, 2005, 2007, 2008, 2009, 2010, and 2012.

SAS version 6 was used throughout the 1990s and was available on a wider range of operating systems, including Macintosh, OS/2, Silicon Graphics, and Primos. SAS introduced new features through dot-releases. From 6.06 to 6.09, a user interface based on the windows paradigm was introduced and support for SQL was added. Version 7 introduced the Output Delivery System (ODS) and an improved text editor. ODS was improved upon in successive releases. For example, more output options were added in version 8. The number of operating systems that were supported was reduced to UNIX, Windows and z/OS, and Linux was added. SAS version 8 and SAS Enterprise Miner were released in 1999.

Recent history
In 2004 SAS released Version 9.0, which was dubbed “Project Mercury” and was designed to make SAS accessible to a broader range of business users. Version 9.0 added custom user interfaces based on the user’s role and established the point-and-click user interface of SAS Enterprise Guide as the primary GUI. The CRM features were improved in 2004 with SAS Interaction Management. In 2008 SAS announced Project Unity, a project to integrate data quality, data integration and master data management.

SAS sued World Programming, the developers of a competing implementation, World Programming System, alleging that they had infringed SAS's copyright in part by implementing the same functionality. This case was referred from the United Kingdom's High Court of Justice to the European Court of Justice on 11 August 2010. In May 2012, the European Court of Justice ruled in favor of World Programming, finding that "the functionality of a computer program and the programming language cannot be protected by copyright."

SAS Social Media Analytics, a tool for social media monitoring, engagement and sentiment analysis, was released in 2010. SAS Rapid Predictive Modeler (RPM), which creates basic analytical models using Microsoft Excel, was introduced that same year. The release of JMP 9 in 2010 added a Microsoft Excel add-in, mapping features, integration with R and improvements to the creation and distribution of custom JMP applications. A High Performance Computing appliance was made available in a partnership with Teradata and EMC Greenplum. In 2011 the company released Enterprise Miner 7.1.

Market share
SAS is the largest market-share holder in advanced analytics with 36.2 percent of the market as of 2012. It is the fifth largest market-share holder for BI software with a 6.9 percent share and the largest independent vendor. It competes in the BI market against conglomerates, such as SAP BusinessObjects, IBM Cognos, SPSS Modeler, Oracle Hyperion, and Microsoft BI. SAS has been named in the Gartner Leader's Quadrant for Data Integration Tools and for Business Intelligence and Analytical Platforms. SAS was given the strongest position out of all the vendors evaluated in the Forrester Wave for Big Data Predictive Analytics Solutions.