User:Anil1iitr/sandbox

In artificial intelligence, genetic programming (GP) is a technique whereby computer programs are encoded as a set of genes that are then modified (evolved) using an evolutionary algorithm (often a genetic algorithm, "GA") – it is an application of (for example) genetic algorithms where the space of solutions consists of computer programs. The results are computer programs that are able to perform well in a predefined task. The methods used to encode a computer program in an artificial chromosome and to evaluate its fitness with respect to the predefined task are central in the GP technique and still the subject of active research.

History
The first record of the proposal to evolve programs is probably that of Alan Turing in the 1950s. However, there was a gap of some thirty years before Richard Forsyth demonstrated the successful evolution of small programs, represented as trees, to perform classification of crime scene evidence for the UK Home Office.

Although the idea of evolving programs, initially in the computer language Lisp, was current amongst John Holland’s students, it was not until they organised the first Genetic Algorithms conference in Pittsburgh that Nichael Cramer published evolved programs in two specially designed languages. In 1988 John Koza (also a PhD student of John Holland) patented his invention of a GA for program evolution. This was followed by publication in the International Joint Conference on Artificial Intelligence IJCAI-89.

Koza followed this with 205 publications on “Genetic Programming” (GP), name coined by David Goldberg, also a PhD student of John Holland. However, it is the series of 4 books by Koza, starting in 1992 with accompanying videos, that really established GP. Subsequently, there was an enormous expansion of the number of publications with the Genetic Programming Bibliography, surpassing 10,000 entries. In 2010, Koza listed 77 results where Genetic Programming was human competitive.

In 1996 Koza started the annual Genetic Programming conference which was followed in 1998 by the annual EuroGP conference, and the first book in a GP series edited by Koza. 1998 also saw the first GP textbook. GP continued to flourish, leading to the first specialist GP journal and three years later (2003) the annual Genetic Programming Theory and Practice (GPTP) workshop was established by Rick Riolo. Genetic Programming papers continue to be published at a diversity of conferences and associated journals. Today there are nineteen GP books including several for students.

Foundational Work in GP
Early work that set the stage for current genetic programming research topics and applications is diverse, and includes software synthesis and repair, predictive modelling, data mining, financial modeling , soft sensors , design , and image processing. Applications in some areas, such as design, often make use of intermediate representations, such as Fred Gruau’s cellular encoding. Industrial uptake has been significant in several areas including finance, the chemical industry, bioinformatics and the steel industry.

Program representation


GP evolves computer programs, traditionally represented in memory as tree structures. Trees can be easily evaluated in a recursive manner. Every tree node has an operator function and every terminal node has an operand, making mathematical expressions easy to evolve and evaluate. Thus traditionally GP favors the use of programming languages that naturally embody tree structures (for example, Lisp; other functional programming languages are also suitable).

Non-tree representations have been suggested and successfully implemented, such as linear genetic programming which suits the more traditional imperative languages [see, for example, Banzhaf et al. (1998)]. The commercial GP software Discipulus uses automatic induction of binary machine code ("AIM") to achieve better performance. µGP uses directed multigraphs to generate programs that fully exploit the syntax of a given assembly language. Other program representations on which significant research and development have been conducted include programs for stack-based virtual machines , and sequences of integers that are mapped to arbitrary programming languages via grammars. Cartesian genetic programming is another form of GP, which uses a graph representation instead of the usual tree based representation to encode computer programs.

Most representations have structurally noneffective code (introns). Such non-coding genes may seem to be useless, because they have no effect on the performance of any one individual. However, they alter the probabilities of generating different offspring under the variation operators, and thus alter the individual's variational properties. Experiments seem to show faster convergence when using program representations that allow such non-coding genes, compared to program representations that do not have any non-coding genes.

Selection
Selection is a process whereby certain individuals are selected from the current generation that would serve as parents for the next generation. The individuals are selected probabilistically such that the better performing individuals have a higher chance of getting selected. The most commonly used selection method in GP is Tournament selection, although other methods such as Fitness proportionate selection, Lexicase Selection, and many others have been demonstrated to perform better for many GP problems.

Variation
Various genetic operators (i.e., crossover and mutation) are applied to the individuals selected in the selection step described above to breed new individuals. The rate at which these operators are applied determine the diversity in the population.

Applications
GP has been successfully used as an automatic programming tool, a machine learning tool and an automatic problem-solving engine. GP is especially useful in the domains where the exact form of the solution is not known in advance or an approximates solution is acceptable (possibly because finding the exact solution is very difficult). Some of the applications of GP are curve fitting, data modeling, Symbolic regression, feature selection, classification, etc. John R. Koza mentions 76 instances where Genetic Programming has been able to produce results that are competitive with human-produced results (called Human-competitive results). Since 2004, the annual Genetic and Evolutionary Computation Conference (GECCO) holds Human Competitive Awards (called Humies) competition, where cash awards are presented to human-competitive results produced by any form of genetic and evolutionary computation. GP has won many awards in this competition over the years.

Meta-genetic programming
Meta-genetic programming is the proposed meta learning technique of evolving a genetic programming system using genetic programming itself. It suggests that chromosomes, crossover, and mutation were themselves evolved, therefore like their real life counterparts should be allowed to change on their own rather than being determined by a human programmer. Meta-GP was formally proposed by Jürgen Schmidhuber in 1987. Doug Lenat's Eurisko is an earlier effort that may be the same technique. It is a recursive but terminating algorithm, allowing it to avoid infinite recursion. In the "autoconstructive evolution" approach to meta-genetic programming, the methods for the production and variation of offspring are encoded within the evolving programs themselves, and programs are executed to produce new programs to be added to the population.

Critics of this idea often say this approach is overly broad in scope. However, it might be possible to constrain the fitness criterion onto a general class of results, and so obtain an evolved GP that would more efficiently produce results for sub-classes. This might take the form of a meta evolved GP for producing human walking algorithms which is then used to evolve human running, jumping, etc. The fitness criterion applied to the meta GP would simply be one of efficiency.