User talk:Mchailie

Mchailie's article on the C++ programming language

C++
Paradigm: multi-paradigm (generic programming, object-oriented, procedural)
Appeared in: 1985, last revised 2003
Designed by: Bjarne Stroustrup
Typing discipline: static, weak, unsafe, nominative
Major implementations: GNU Compiler Collection, Microsoft Visual C++, Borland C++ Builder
Dialects: ANSI C++ 1998, ANSI C++ 2003
Influenced by: C, Simula, Ada 83, CLU
Influenced: Ada 95, C#, Java

C++ (generally pronounced "see plus plus") is a general-purpose programming language. It is a statically-typed free-form multi-paradigm language supporting procedural programming, data abstraction, object-oriented programming, and generic programming. Since the 1990s, C++ has been one of the most popular commercial programming languages.

Bjarne Stroustrup began developing C++ (originally named "C with Classes") in 1979 at Bell Labs as an enhancement to the C programming language; the language was renamed C++ in 1983. Enhancements started with the addition of classes, followed by, among other features, virtual functions, operator overloading, multiple inheritance, templates, and exception handling. The C++ programming language standard was ratified in 1998 as ISO/IEC 14882:1998; the current version is the 2003 revision, ISO/IEC 14882:2003. A new version of the standard (known informally as C++0x) is being developed.

Contents 1 History 1.1 The name "C++" 1.2 Old problems 1.3 Future development 2 Philosophy 3 Standard library 4 Features introduced in C++ 5 Incompatibility with C 6 Sample code 6.1 Minimal program 6.2 Hello world program 7 Language features 7.1 Operators 7.2 Objects 7.2.1 Encapsulation 7.2.2 Inheritance 7.3 Polymorphism 7.3.1 Static polymorphism 7.3.1.1 Function overloading 7.3.1.2 Operator overloading 7.3.1.3 Template functions and classes 7.3.2 Dynamic polymorphism 7.3.2.1 Polymorphism through inheritance 7.3.2.2 Virtual member functions 7.3.2.3 An example 8 Criticism 9 See also 10 References 11 External links 11.1 Materials 11.1.1 Tutorials 11.1.2 Electronic Books 11.2 Support 11.3 Other 11.3.1 Libraries and code repositories

History

Stroustrup began work on C with Classes in 1979. The idea of creating a new language originated from Stroustrup's experience in programming for his Ph.D. thesis. Stroustrup found that Simula had features that were very helpful for large software development, but the language was too slow for practical use, while BCPL was fast but too low-level and unsuitable for large software development. When Stroustrup started working at Bell Labs, he had the problem of analyzing the UNIX kernel with respect to distributed computing. Remembering his Ph.D. experience, Stroustrup set out to enhance the C language with Simula-like features. C was chosen because it is general-purpose, fast, and portable. Besides C and Simula, some other languages which inspired him were ALGOL 68, Ada, CLU and ML. At first, the class, derived class, strong type checking, inlining, and default argument features were added to C via Cfront. The first commercial release occurred in October 1985[1].

In 1983, the name of the language was changed from C with Classes to C++. New features that were added to the language included virtual functions, function name and operator overloading, references, constants, user-controlled free-store memory control, improved type checking, and a new comment style (//). In 1985, the first edition of The C++ Programming Language was released, providing an important reference to the language, as there was not yet an official standard. Release 2.0 of C++ followed in 1989. Its new features included multiple inheritance, abstract classes, static member functions, const member functions, and protected members. In 1990, The Annotated C++ Reference Manual was released and provided the basis for the future standard. Later additions included templates, exceptions, namespaces, new casts, and a Boolean type.

As the C++ language evolved, a standard library also evolved with it. The first addition to the C++ standard library was the stream I/O library, which provided facilities to replace the traditional C functions such as printf and scanf. Among the most significant later additions to the standard library was the Standard Template Library.

After years of work, a joint ANSI-ISO committee standardized C++ in 1998 (ISO/IEC 14882:1998). For some years after the official release of the standard in 1998, the committee processed defect reports, and published a corrected version of the C++ standard in 2003. In 2005, a technical report, called the "Library Technical Report 1" (often known as TR1 for short) was released. While not an official part of the standard, it gives a number of extensions to the standard library which are expected to be included in the next version of C++. Support for TR1 is growing in almost all currently maintained C++ compilers.

No one owns the C++ language; it is royalty-free. However, the standard document itself is not freely available.

The name "C++"

This name is credited to Rick Mascitti (mid-1983) and was first used in December 1983. Earlier, during the research period, the developing language had been referred to as "C with Classes". The final name stems from C's "++" operator (which increments the value of a variable) and a common naming convention of using "+" to indicate an enhanced computer program. According to Stroustrup: "the name signifies the evolutionary nature of the changes from C". C+ was the name of an earlier, unrelated programming language.

Stroustrup addressed the origin of the name in the preface of later editions of his book, The C++ Programming Language, adding that "C++" might be inferred from the appendix of George Orwell's Nineteen Eighty-Four. Of the three segments of the fictional language Newspeak, the "C vocabulary" is the one dedicated to technical terms and jargon. "Doubleplus" is the superlative modifier for Newspeak adjectives. Thus, "C++" might hold the meaning "most extremely technical or jargonous" in Newspeak.

When Rick Mascitti was questioned informally in 1992 about the naming, he indicated that it was given in a tongue-in-cheek spirit. He never thought that it would become the formal name of the language.

Old problems

Traditionally, C++ compilers have had a range of problems. The C++ standard does not cover implementation of name decoration, exception handling, and other implementation-specific features, making object code produced by different compilers incompatible; there are, however, third-party standards for particular machines or OSs which attempt to standardize compilers on those platforms, for example C++ ABI, and now many compilers have standardized these items.

For many years, different C++ compilers implemented the C++ standard to different levels of correctness, in particular partial template specialisation. Recent releases of most popular C++ compilers support almost all of the C++ 1998 standard [2]. One particular point of contention is the export keyword, intended to allow template definitions to be separated from their declarations. The first compiler to implement export was Comeau C++, in early 2003 (5 years after the release of the standard); in 2004, a beta compiler of Borland C++ Builder X was also released with export. Both of these compilers are based on the EDG C++ front end. Other compilers such as Microsoft Visual C++ and GCC do not support it at all. Many C++ books nevertheless provide example code using the export keyword (for example Ivor Horton's Beginning ANSI C++, pg. 827) which will not compile on such compilers, without mentioning the problem. Herb Sutter, secretary of the C++ standards committee, has recommended that export be removed from future versions of the C++ standard [3], but finally the decision was made to leave it in the C++ standard.

Future development

C++ continues to evolve to meet future requirements. One group in particular, Boost.org, works to make the most of C++ in its current form and advises the C++ standards committee as to which features work well and which need improving. Current work indicates that C++ will capitalize on its multi-paradigm nature more and more. The work at Boost, for example, is greatly expanding C++'s functional and metaprogramming capabilities. A new version of the C++ standard, informally called "C++0x" (denoting the fact that it is expected to be released before 2010), is currently being worked on and will include a number of new features.

Philosophy

In The Design and Evolution of C++ (1994), Bjarne Stroustrup describes some rules that he uses for the design of C++. Knowing the rules helps to understand why C++ is the way it is. The following is a summary of the rules. Much more detail can be found in The Design and Evolution of C++.

* C++ is designed to be a statically typed, general-purpose language that is as efficient and portable as C
* C++ is designed to directly and comprehensively support multiple programming styles (procedural programming, data abstraction, object-oriented programming, and generic programming)
* C++ is designed to give the programmer choice, even if this makes it possible for the programmer to choose incorrectly
* C++ is designed to be as compatible with C as possible, therefore providing a smooth transition from C
* C++ avoids features that are platform specific or not general purpose
* C++ does not incur overhead for features that are not used
* C++ is designed to function without a sophisticated programming environment

Stanley B. Lippman documents in his in-depth book "Inside the C++ Object Model" (1996) how compilers convert C++ program statements into an in-memory layout. Lippman worked on implementing and maintaining Cfront, the original C++ implementation at Bell Labs.

Standard library

The 1998 C++ standard consists of two parts: the core language and the C++ standard library; the latter includes most of the Standard Template Library and a slightly modified version of the C standard library. Many C++ libraries exist which are not part of the standard and, using external linkage, these can even be written in C.

The C++ standard library incorporates the C standard library with some small modifications to make it work better with the C++ language. Another large part of the C++ library is based on the Standard Template Library (STL). This provides such useful tools as containers (for example vectors and lists), iterators (generalized pointers) to provide these containers with array-like access and algorithms to perform operations such as searching and sorting. Furthermore (multi)maps (associative arrays) and (multi)sets are provided, all of which export compatible interfaces. Therefore it is possible, using templates, to write generic algorithms that work with any container or on any sequence defined by iterators. As in C, the features of the library are accessed by using the #include directive to include a standard header. C++ provides sixty-nine standard headers, of which nineteen are deprecated.
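The combination of containers, iterators and algorithms described above can be sketched as follows. This is only an illustration: sum_range, contains and usage_example are hypothetical names invented here, while std::vector, std::list, std::sort and std::find are the standard facilities.

```cpp
#include <algorithm>  // std::sort, std::find
#include <cassert>
#include <list>
#include <vector>

// Generic algorithm: adds up any sequence delimited by two iterators.
// sum_range is a hypothetical helper, not part of the standard library.
template <typename Iterator, typename T>
T sum_range(Iterator first, Iterator last, T total)
{
    for (; first != last; ++first)
        total = total + *first;
    return total;
}

// Returns true if value occurs in the list, using the standard std::find.
bool contains(const std::list<int>& items, int value)
{
    return std::find(items.begin(), items.end(), value) != items.end();
}

void usage_example()
{
    std::vector<int> numbers;
    numbers.push_back(3);
    numbers.push_back(1);
    numbers.push_back(2);

    std::sort(numbers.begin(), numbers.end());  // standard algorithm on a vector
    assert(numbers.front() == 1 && numbers.back() == 3);

    std::list<int> copy(numbers.begin(), numbers.end());
    assert(contains(copy, 2));

    // The same generic function works on both container types.
    assert(sum_range(numbers.begin(), numbers.end(), 0) == 6);
    assert(sum_range(copy.begin(), copy.end(), 0) == 6);
}
```

Because sum_range only uses the iterator operations !=, ++ and *, it accepts iterators from either container without knowing which one it was given.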

Using the standard library--for example, using std::vector or std::string instead of a C-style array--can help lead to safer and more scalable software.
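A minimal sketch of the safety argument, assuming the helper names access_is_rejected and repeat (both invented for this illustration): std::vector::at checks its index, and std::string manages its own storage.

```cpp
#include <cassert>
#include <cstddef>    // std::size_t
#include <stdexcept>  // std::out_of_range
#include <string>
#include <vector>

// Returns true if reading index i of v is rejected by bounds checking.
// With a C-style array the same out-of-range read would be undefined behaviour.
bool access_is_rejected(const std::vector<int>& v, std::size_t i)
{
    try {
        (void)v.at(i);   // at() checks the index and throws on failure
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// std::string grows as needed; there is no manual buffer-size bookkeeping
// as there would be with a char array.
std::string repeat(const std::string& piece, int times)
{
    std::string result;
    for (int i = 0; i < times; ++i)
        result += piece;
    return result;
}

void usage_example()
{
    std::vector<int> values(3, 0);           // three elements, indices 0..2
    assert(!access_is_rejected(values, 2));  // last valid index
    assert(access_is_rejected(values, 3));   // one past the end is caught
    assert(repeat("ab", 3) == "ababab");
}
```

Note that the unchecked operator[] of std::vector gives no such guarantee; the check comes from at().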

The STL was originally a third-party library from HP and later SGI, before its incorporation into the C++ standard. The standard does not refer to it as "STL", as it is merely a part of the standard library, but many people still use that term to distinguish it from the rest of the library (input/output streams [known as IOstreams], internationalization, diagnostics, the C library subset, etc.).

Most C++ compilers provide an implementation of the C++ standard library, including the STL. Compiler-independent implementations of the STL, such as STLPort, also exist. Other projects also produce various custom implementations of the C++ standard library and the STL with various design goals.

Features introduced in C++

Compared to the C language, C++ introduced extra features, including declarations as statements, function-like casts, new/delete, bool, reference types, inline functions, default arguments, function overloading, namespaces, classes (including all class-related features such as inheritance, member functions, virtual functions, abstract classes, and constructors), operator overloading, templates, the :: operator, exception handling, and runtime type identification.
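Several of the smaller features in this list can be seen together in a short sketch. The namespace geometry and the functions scale, swap_values and usage_example are hypothetical names chosen for illustration.

```cpp
#include <cassert>

namespace geometry  // namespace: groups related names
{
    // Inline function with a default argument: scale(3) means scale(3, 2).
    inline int scale(int value, int factor = 2) { return value * factor; }

    // Reference parameters: the function modifies the caller's variables.
    void swap_values(int& a, int& b)
    {
        int tmp = a;
        a = b;
        b = tmp;
    }
}

void usage_example()
{
    int x = 1, y = 2;
    geometry::swap_values(x, y);           // :: selects a name from the namespace
    bool swapped = (x == 2 && y == 1);     // built-in bool type
    assert(swapped);

    int* p = new int(geometry::scale(3));  // new/delete instead of malloc/free
    assert(*p == 6);
    delete p;

    double half = double(7) / 2;           // function-like cast; declared mid-block
    assert(half == 3.5);
}
```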

The const keyword is shared by both languages: it was developed as part of C++ and was adopted by C during ANSI standardization, with slightly different semantics.

C++ also performs more type checking than C in several cases (see "Incompatibility with C" below).

Comments starting with two slashes ("//") were originally part of C's predecessor, BCPL, and were reintroduced in C++.

Several features of C++ were later adopted by C, including declarations in for loops, C++-style comments (using the // symbol), and inline, though the C99 definition of the inline keyword is not compatible with its C++ definition. However, C99 also introduced features that do not exist in C++, such as variadic macros and better handling of arrays as parameters; some C++ compilers may implement some of these features as extensions, but others are incompatible with existing C++ features.

A very common source of confusion is a subtle terminology issue: because of its derivation from C, in C++ the term object means memory area, just like in C, and not class instance, which is what it means in most other object oriented languages. For example in both C and C++ the line

    int i;

defines an object of type int, that is, the memory area where the value of the variable i will be stored on assignment.

Incompatibility with C

For more details on this topic, see Compatibility of C and C++.

C++ is often considered a superset of C, but this is not strictly true. Most C code can easily be made to compile correctly in C++, but there are a few differences that cause some valid C code to be invalid in C++, or to behave differently in C++.

Perhaps the most commonly encountered difference is that C allows implicit conversion from void* to other pointer types, but C++ does not. So, the following is valid C code:

    int *i = malloc(sizeof(int) * 5);    /* Implicit conversion from void* to int* */

but to make it work in both C and C++ one would need to use an explicit cast:

    int *i = (int *) malloc(sizeof(int) * 5);

Another common portability issue is that C++ defines many new keywords, such as this and class, that may be used as identifiers (e.g. variable names) in a C program.

Some incompatibilities have been removed by the latest (C99) C standard, which now supports C++ features such as // comments and mixed declarations and code. However, C99 introduced a number of new features that conflict with C++ (such as variable-length arrays, native complex-number types, and compound literals), so the languages may be diverging more than they are converging.

In order to intermix C and C++ code, any C++ functions which are to be called from C-compiled code must be declared as extern "C".
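A minimal sketch of such declarations follows; the function names c_visible_length and c_visible_add are hypothetical. The same extern "C" syntax is also what C++ code uses to declare functions that are defined in C source files.

```cpp
#include <cassert>
#include <cstring>  // std::strlen

// extern "C" gives the function C linkage: its name is not decorated,
// so code compiled as C can call it.
extern "C" int c_visible_length(const char* text)
{
    // The body is ordinary C++ and may use any C++ feature internally.
    return static_cast<int>(std::strlen(text));
}

// The block form declares several C-linkage functions at once.
extern "C" {
    int c_visible_add(int a, int b);
}

// This definition keeps the C linkage declared in the block above.
int c_visible_add(int a, int b)
{
    return a + b;
}

void usage_example()
{
    // From C++ the functions are called like any other function.
    assert(c_visible_length("abc") == 3);
    assert(c_visible_add(2, 3) == 5);
}
```

Functions with C linkage cannot be overloaded with one another, since the undecorated name no longer carries the parameter types.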

Sample code

Minimal program

This is an example of a program which does nothing. It begins executing and immediately terminates. It consists of one thing: a main function. The function main is the designated start of a C++ program.

    int main() { }

The C++ Standard requires that main returns type int. A program which uses any other return type for main is technically not Standard C++, although many compilers do not enforce this strictly. Traditionally, the return value of main is interpreted as the return value of the program itself. The Standard guarantees that returning zero (or EXIT_SUCCESS) from main indicates successful termination and that returning EXIT_FAILURE indicates unsuccessful termination; it does not say what other return values mean. If, as in this example, execution reaches the end of main without encountering a return statement, then zero is returned implicitly.

Hello world program

This is an example of a Hello world program, which uses the C++ standard library (not STL) cout facility to display a message, then terminates.

    #include <iostream> // for std::cout and std::endl

    int main()
    {
        std::cout << "Hello World!" << std::endl;
        return 0;
    }

Note: std is the namespace in which the standard library declares its names. The :: operator selects a name from that namespace, so std::cout refers to the standard output stream object cout (an object, not a function), which is declared in the header <iostream>.

For more examples, see C++ examples.

Language features

Operators

Main article: Operators in C and C++

Objects

C++ introduces some object-oriented (OO) features to C. It offers classes, which provide the four features commonly present in OO (and some non-OO) languages: abstraction, encapsulation, inheritance and polymorphism.

Encapsulation

C++ implements encapsulation by allowing all members of a class to be declared as either public, private, or protected. A public member of the class will be accessible to any function. A private member will only be accessible to functions that are members of that class and to functions and classes explicitly granted access permission by the class ("friends"). A protected member will be accessible to members of classes that inherit from the class in addition to the class itself and any friends.

The OO principle is that all and only the functions that can access the internal representation of a type should be encapsulated within the type definition. C++ supports this (via member functions and friend functions), but does not enforce it: the programmer can declare parts or all of the representation of a type to be public, and is also allowed to make public entities that are not part of the representation of the type. Because of this C++ supports not just OO programming but other weaker decomposition paradigms, like modular programming.

It is generally considered good practice to make all data private, or at least protected, and to make public only those functions that are part of a minimal interface for users of the class that hides implementation details.
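The three access levels and the friend mechanism can be sketched as follows. The classes Counter and ResettableCounter and the function peek are hypothetical examples, not taken from any library.

```cpp
#include <cassert>

class Counter
{
public:
    Counter() : count_(0) {}
    void increment() { ++count_; }        // part of the public interface
    int value() const { return count_; }  // read-only access to the state

protected:
    void reset() { count_ = 0; }          // for derived classes and friends only

private:
    int count_;                           // representation hidden from users

    friend int peek(const Counter& c);    // explicitly granted access
};

// A friend function may read private members directly.
int peek(const Counter& c) { return c.count_; }

// A derived class may call protected members but cannot touch private ones.
class ResettableCounter : public Counter
{
public:
    void clear() { reset(); }             // allowed: reset() is protected
    // void cheat() { count_ = 99; }      // would not compile: count_ is private
};

void usage_example()
{
    ResettableCounter c;
    c.increment();
    c.increment();
    assert(c.value() == 2);
    assert(peek(c) == 2);
    c.clear();
    assert(c.value() == 0);
}
```

Here all data is private and the public interface is limited to increment and value, in line with the practice recommended above.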

Inheritance

Inheritance from a base class may be declared as public, protected, or private. This access specifier determines whether unrelated and derived classes can access the inherited public and protected members of the base class. Only public inheritance corresponds to what is usually meant by "inheritance". The other two forms are much less frequently used. If the access specifier is omitted, inheritance is assumed to be private when the derived class is declared with the class keyword and public when it is declared with the struct keyword. Base classes may be declared as virtual; this is called virtual inheritance. Virtual inheritance ensures that only one instance of a base class exists in the inheritance graph, avoiding some of the ambiguity problems of multiple inheritance.

Multiple inheritance is another controversial C++ feature. Multiple inheritance allows a class to derive from more than one base class; this can result in a complicated graph of inheritance relationships. For example, a "Flying Cat" class can inherit from both "Cat" and "Flying Mammal". Some other languages, such as Java, accomplish something similar by allowing inheritance of multiple interfaces while restricting the number of base classes to one (interfaces, unlike classes, provide no implementation of function members).
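The "Flying Cat" example can be sketched in code as follows. The common base class Mammal and all member functions are assumptions added here to show where virtual inheritance matters.

```cpp
#include <cassert>
#include <string>

class Mammal
{
public:
    Mammal() : legs_(4) {}
    virtual ~Mammal() {}
    int legs() const { return legs_; }
private:
    int legs_;
};

// "virtual" makes Cat and FlyingMammal share a single Mammal subobject.
class Cat : public virtual Mammal
{
public:
    std::string sound() const { return "meow"; }
};

class FlyingMammal : public virtual Mammal
{
public:
    bool can_fly() const { return true; }
};

// FlyingCat derives from both. Without virtual inheritance the call to
// legs() below would be ambiguous, because two Mammal subobjects would exist.
class FlyingCat : public Cat, public FlyingMammal
{
};

void usage_example()
{
    FlyingCat pet;
    assert(pet.sound() == "meow");  // inherited from Cat
    assert(pet.can_fly());          // inherited from FlyingMammal
    assert(pet.legs() == 4);        // unambiguous: one shared Mammal
}
```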

Polymorphism

C++ supports several kinds of static (compile-time) and dynamic (run-time) polymorphism. Compile-time polymorphism does not allow for certain run-time decisions, while run-time polymorphism typically incurs more of a performance penalty.

Static polymorphism

Function overloading

Function overloading allows programs to declare multiple functions with the same name. The functions are distinguished by the number and types of their formal parameters. Thus, the same function name can refer to different functions depending on the context in which it is used.
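A small sketch of overload resolution, using a hypothetical function name describe:

```cpp
#include <cassert>
#include <string>

// Four functions share the name describe; the compiler picks one
// from the number and types of the arguments at each call site.
std::string describe(int)                { return "int"; }
std::string describe(double)             { return "double"; }
std::string describe(int, int)           { return "two ints"; }
std::string describe(const std::string&) { return "string"; }

void usage_example()
{
    assert(describe(1) == "int");
    assert(describe(1.5) == "double");
    assert(describe(1, 2) == "two ints");
    assert(describe(std::string("x")) == "string");
}
```

The choice is made entirely at compile time, which is why overloading counts as static polymorphism.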

Operator overloading

Similarly, operator overloading allows programs to define certain operators (such as +, !=, <, or &) to result in a function call that depends on the types of the operands they are used on.
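As an illustration, the following sketch overloads +, == and != for a hypothetical two-component vector type Vec2:

```cpp
#include <cassert>

struct Vec2
{
    int x;
    int y;
    Vec2(int x_, int y_) : x(x_), y(y_) {}
};

// a + b on two Vec2 operands is translated into a call to this function.
Vec2 operator+(const Vec2& a, const Vec2& b)
{
    return Vec2(a.x + b.x, a.y + b.y);
}

bool operator==(const Vec2& a, const Vec2& b)
{
    return a.x == b.x && a.y == b.y;
}

bool operator!=(const Vec2& a, const Vec2& b)
{
    return !(a == b);
}

void usage_example()
{
    Vec2 sum = Vec2(1, 2) + Vec2(3, 4);
    assert(sum == Vec2(4, 6));
    assert(sum != Vec2(0, 0));
}
```

The built-in meaning of + for int operands is unaffected; only expressions whose operands are Vec2 objects use the new functions.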

Template functions and classes

Templates in C++ provide a sophisticated mechanism for writing generic, polymorphic code.
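A minimal sketch of one function template and one class template; the names larger and FixedStack are hypothetical and not part of the standard library.

```cpp
#include <cassert>
#include <string>

// Function template: one definition works for any type that has operator<.
template <typename T>
const T& larger(const T& a, const T& b)
{
    return (a < b) ? b : a;
}

// Class template: a fixed-capacity stack of any default-constructible type.
template <typename T, int Capacity>
class FixedStack
{
public:
    FixedStack() : size_(0) {}

    // Returns false instead of overflowing when the stack is full.
    bool push(const T& item)
    {
        if (size_ == Capacity)
            return false;
        items_[size_++] = item;
        return true;
    }

    T pop()
    {
        assert(size_ > 0);  // popping an empty stack is a programming error
        return items_[--size_];
    }

    int size() const { return size_; }

private:
    T items_[Capacity];
    int size_;
};
```

The compiler generates a separate function or class for each set of template arguments actually used, for example larger applied to int arguments or FixedStack instantiated for int with a capacity of 2, so the genericity is resolved at compile time.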

Dynamic polymorphism

Polymorphism through inheritance

Pointer variables (and references) of a base class type in C++ can refer to objects of any derived classes of that type in addition to objects exactly matching the variable type. This allows arrays or other containers of pointers to a given type of object to refer to objects of multiple types, which cannot be done otherwise in C++. Because assignment of values to variables usually occurs at run-time, this is necessarily a run-time phenomenon.

C++ also provides a dynamic_cast operator, which allows the program to safely attempt conversion of a pointer or reference to an object into a pointer or reference to a more specific object type (as opposed to conversion to a more general type, which is always allowed). This feature relies on run-time type information. Objects known to be of a certain specific type can also be cast to that type without dynamic_cast (for example with static_cast), which is less safe but does not require compiler support for run-time type information.
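A sketch of a checked downcast follows; the classes Shape, Circle and Square and the function radius_or_negative are hypothetical.

```cpp
#include <cassert>

class Shape
{
public:
    virtual ~Shape() {}  // dynamic_cast needs at least one virtual function
};

class Circle : public Shape
{
public:
    explicit Circle(double r) : radius(r) {}
    double radius;
};

class Square : public Shape
{
};

// Returns the radius if shape really is a Circle, or -1.0 otherwise.
double radius_or_negative(Shape* shape)
{
    // dynamic_cast checks the run-time type and yields a null pointer on mismatch.
    if (Circle* circle = dynamic_cast<Circle*>(shape))
        return circle->radius;
    return -1.0;
}

void usage_example()
{
    Circle circle(2.0);
    Square square;
    assert(radius_or_negative(&circle) == 2.0);
    assert(radius_or_negative(&square) == -1.0);
}
```

When applied to references instead of pointers, a failed dynamic_cast throws std::bad_cast rather than returning null.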

Virtual member functions

Through virtual member functions, different objects that share a common base class may all support an operation in different ways. The member functions implemented by the derived class are said to override the same member functions of the base class. In contrast with function overloading, the parameters for a given member function are always exactly the same number and type. Only the type of the object for which this method is called varies. In addition to standard member functions, operator overloads and destructors can also be virtual.

By virtue of inherited objects being polymorphic, it may not be possible for the compiler to determine the type of the object at compile time. The decision is therefore put off until runtime, and is called dynamic dispatch. In this way, the most specific implementation of the function is called, according to the actual run-time type of the object. In C++, this is commonly done using virtual function tables. Dynamic dispatch can be bypassed by qualifying the function name with a class name in the call, but calls to virtual functions are otherwise in general resolved at run time.
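The difference between a dynamically dispatched call and a class-qualified call can be sketched as follows, with hypothetical classes Base and Derived:

```cpp
#include <cassert>
#include <string>

class Base
{
public:
    virtual ~Base() {}
    virtual std::string name() const { return "Base"; }
};

class Derived : public Base
{
public:
    std::string name() const { return "Derived"; }  // overrides Base::name
};

void usage_example()
{
    Derived object;
    Base& ref = object;

    // Ordinary call: dispatched on the run-time type of the object.
    assert(ref.name() == "Derived");

    // Qualified call: Base::name is selected at compile time,
    // regardless of the run-time type.
    assert(ref.Base::name() == "Base");
}
```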

An example

    #include <iostream>

    class Bird                // the "generic" base class
    {
    public:
        virtual ~Bird() {}    // virtual destructor, so that delete through a Bird* is well defined
        virtual void OutputName() { std::cout << "a bird"; }
    };

    class Swan : public Bird  // Swan derives from Bird
    {
    public:
        void OutputName() { std::cout << "a swan"; } // overrides virtual function
    };

    int main()
    {
        Bird* myBird = new Swan; // Declares a pointer to a generic Bird,
                                 // and sets it pointing to a newly-created Swan.

        myBird->OutputName();    // This will output "a swan", not "a bird".

        delete myBird;

        return 0;
    }

This example program makes use of virtual functions, polymorphism, and inheritance to derive new, more specific objects from a base class. In this case the base class is a Bird, and the more specific Swan is made.

Criticism

C++ has received a large amount of criticism and is the subject of much debate:

* Since C++ is based on, and largely compatible with, C, it also inherits most of the criticisms levelled at that language.
* Taken as a whole, C++ is a large and complicated language, and so is difficult to fully master. This is somewhat mitigated by the fact that programmers are free to use a small subset of C++ features that they are comfortable with, adding new features to their repertoire only as required and at their own pace.
* It is extremely difficult to write a good C++ parser. Because of that, there are very few tools that help programmers analyze existing code and perform non-trivial transformations of it (e.g., refactoring).
* C++ libraries dealing with some useful programming tasks are not easily available, for example:
** Windows (or non-Windows) user interface
** Networking: sockets, HTTP, peer-to-peer, SMTP, POP3
* C++ is sometimes compared unfavourably with 'pure' object-oriented languages such as Java, on the basis that it allows programmers to mix and match object-oriented and procedural programming, rather than strictly enforcing a single style. This is part of a wider debate on the relative merits of the two programming styles.
* The abundance of language features can encourage overkill programming in over-enthusiastic programmers, where advanced features of C++ are unnecessarily brought to bear on simple problems.

See also

* C++ structure
* Comparison of Java and C++
* Design pattern
* List of C++ compilers and integrated development environments
* Template metaprogramming
* Name mangling
* OpenC++
* Operators in C and C++
* Programming paradigm
* Significantly Prettier and Easier C++ Syntax

References

* Abrahams, David; Aleksey Gurtovoy. C++ Template Metaprogramming: Concepts, Tools, and Techniques from Boost and Beyond, Addison-Wesley. ISBN 0-321-22725-5.
* Alexandrescu, Andrei (2001). Modern C++ Design: Generic Programming and Design Patterns Applied, Addison-Wesley. ISBN 0-201-70431-5.
* Coplien, James O. (1992, reprinted with corrections 1994). Advanced C++: Programming Styles and Idioms. ISBN 0-201-54855-0.
* Dewhurst, Stephen C. (2005). C++ Common Knowledge: Essential Intermediate Programming, Addison-Wesley. ISBN 0-321-32192-8.
* Information Technology Industry Council (2003-10-15). Programming languages — C++, Second edition, Geneva: ISO/IEC. 14882:2003(E).
* Josuttis, Nicolai M. The C++ Standard Library, Addison-Wesley. ISBN 0-201-37926-0.
* Koenig, Andrew; Barbara E. Moo (2000). Accelerated C++ - Practical Programming by Example, Addison-Wesley. ISBN 0-201-70353-X.
* Malik, D. S. C++ Programming: From Problem Analysis to Program Design, Course Technology. ISBN 0-619-06213-4.
* Oualline, Steve. How Not to Program in C++, No Starch Press. ISBN 1-886411-95-6.
* Oualline, Steve. Practical C++ Programming, O'Reilly. ISBN 0-596-00419-2.
* Stroustrup, Bjarne (1994). The Design and Evolution of C++, Addison-Wesley. ISBN 0-201-54330-3.
* Stroustrup, Bjarne (2000). The C++ Programming Language, Special Edition, Addison-Wesley. ISBN 0-201-70073-5.
* Sutter, Herb. Exceptional C++ Style, Addison-Wesley. ISBN 0-201-76042-8.
* Vandevoorde, David; Nicolai M. Josuttis (2003). C++ Templates: The complete Guide, Addison-Wesley. ISBN 0-201-73484-2.
* Joyner, Ian. A Critique of C++. URL accessed on March 31, 2006.

External links

Materials

* Dinkumware's C++ Library Reference Manual
* C/C++ Reference
* Standards Committee Page: JTC1/SC22/WG21 - C++

Tutorials

* C++ Programming Tutorial, About.com
* The C++ Annotations
* Cplusplus.com tutorial, with complete code examples along with the results from each example shown side by side. Also includes a section on using C++ compilers from different vendors
* A Tutorial from C (1), A Tutorial to C++ (2)

Electronic Books

* Free book "C++ In Action" by Bartosz Milewski
* Free book "Thinking in C++" by Bruce Eckel
* Free book "C++: A Dialog" by Steve Heller
* Computer-Books.us: collection of online C++ books
* C++Course: the well-known book of A.B. Downey as an HTMLHelp based eBook

Support

* C++ FAQ Lite by Marshall Cline
* Newsgroups "comp.lang.c++", "comp.lang.c++.moderated", "comp.std.c++"
* C++ Forum at Cprogramming.com
* C and C++ at Daniweb

Other

* Links to C++ tools
* Internet sites and files of interest to C++ users: a categorised list of C++ related links

Libraries and code repositories

* Portable foundation classes from GNU
* Boost.org: C++ high quality libraries
* Planet Source Code, with several thousand code samples


Hardy-Weinberg-Castle Law
Hardy-Weinberg principle

From Wikipedia, the free encyclopedia

[Figure: Hardy–Weinberg principle for two alleles. The horizontal axis shows the two allele frequencies p and q, the vertical axis shows the genotype frequencies, and the three possible genotypes are represented by the different glyphs.]

The Hardy–Weinberg principle (HWP) (also Hardy–Weinberg equilibrium (HWE), or Hardy–Weinberg law) states that, under certain conditions, after one generation of random mating, the genotype frequencies at a single gene locus will become fixed at a particular equilibrium value. It also specifies that those equilibrium frequencies can be represented as a simple function of the allele frequencies at that locus.

In the simplest case of a single locus with two alleles A and a with allele frequencies of p and q, respectively, the HWP predicts the genotypic frequencies to be $p^2$ for the AA homozygote, $2pq$ for the Aa heterozygote and $q^2$ for the other homozygote, aa. The Hardy–Weinberg principle is an expression of the notion of a population in "genetic equilibrium" and is a basic principle of population genetics.

Contents 1 Assumptions 2 Derivation 3 Deviations from Hardy-Weinberg equilibrium 4 Sex linkage 5 Generalizations 5.1 Generalization for more than two alleles 5.2 Generalization for polyploidy 5.3 Complete generalization 6 Applications 6.1 Application to cases of complete dominance 7 Significance tests for deviation 7.1 Example χ2 test for deviation 7.2 Fisher's exact test (probability test) 8 Inbreeding Coefficient 9 History 10 References

Assumptions

The original assumptions for Hardy–Weinberg equilibrium (HWE) were that the organism under consideration:

* is diploid, and the trait under consideration is not on a chromosome that has different copy numbers for different sexes, such as the X chromosome in humans
* reproduces sexually, and is either monoecious or dioecious
* has discrete generations

In addition, the population under consideration is idealised, that is:

* mating is random within a single population
* the population is infinite (or sufficiently large so as to minimise the effect of genetic drift)

and it experiences:

* no selection
* no mutation
* no migration (gene flow)

The first group of assumptions is required for the mathematics involved. It is relatively easy to expand the definition of HWE to include modifications of these, such as for sex-linked traits. The other assumptions are inherent in the Hardy-Weinberg principle.

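A Hardy-Weinberg population is used as a reference population when discussing various factors. Because none of the forces listed above acts on it, such a population is static: its allele and genotype frequencies do not change from one generation to the next.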

Derivation

A better, but equivalent, probabilistic description for the HWP is that the alleles for the next generation for any given individual are chosen randomly and independently of each other. Consider two alleles, A and a, with frequencies p and q, respectively, in the population. The different ways to form new genotypes can be derived using a Punnett square, where the fraction in each cell is equal to the product of the row and column probabilities.

Table 1: Punnett square for Hardy–Weinberg equilibrium

                      Females
                      A (p)          a (q)
    Males   A (p)     AA ($p^2$)     Aa ($pq$)
            a (q)     aA ($qp$)      aa ($q^2$)

The final three possible genotypic frequencies in the offspring become:

$$f(AA) = p^2$$
$$f(Aa) = pq + qp = 2pq$$
$$f(aa) = q^2$$

This is achieved in one generation. Sometimes, a population is created by bringing together males and females with different allele frequencies. In this case, the assumption of a single population is violated until after the first generation, so the first generation will not have Hardy-Weinberg equilibrium. Successive generations will have Hardy-Weinberg equilibrium.
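The calculation can be sketched numerically as follows. The structure GenotypeFrequencies and the functions hardy_weinberg and allele_frequency_A are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <cmath>

struct GenotypeFrequencies
{
    double AA;  // homozygote for A
    double Aa;  // heterozygote
    double aa;  // homozygote for a
};

// Expected genotype frequencies after one generation of random mating,
// given the frequency p of allele A (so q = 1 - p is the frequency of a).
GenotypeFrequencies hardy_weinberg(double p)
{
    assert(p >= 0.0 && p <= 1.0);
    const double q = 1.0 - p;
    GenotypeFrequencies f;
    f.AA = p * p;
    f.Aa = 2.0 * p * q;
    f.aa = q * q;
    return f;
}

// Frequency of allele A recovered from the genotype frequencies:
// every AA individual carries two copies of A, every Aa individual one.
double allele_frequency_A(const GenotypeFrequencies& f)
{
    return f.AA + 0.5 * f.Aa;
}
```

For p = 0.6 this gives frequencies of 0.36, 0.48 and 0.16 for AA, Aa and aa, and the allele frequency recovered from them is again 0.6, so a further generation of random mating reproduces the same proportions.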

Deviations from Hardy-Weinberg equilibrium

Violations of the Hardy–Weinberg assumptions can cause deviations from expectation. How this affects the population depends on the assumptions that are violated.

Random mating. The HWP states the population will have the given genotypic frequencies (called Hardy-Weinberg proportions) after a single generation of random mating within the population. When violations of this provision occur, the population will not have Hardy-Weinberg proportions. Three such violations are:

* Inbreeding, which causes an increase in homozygosity for all genes.
* Assortative mating, which causes an increase in homozygosity only for those genes involved with the trait that is assortatively mated (and genes in linkage disequilibrium with them).
* Small population size, which causes a random change in genotypic frequencies, particularly if the population is very small. This is due to a sampling effect, and is called genetic drift.

The remaining assumptions affect the allele frequencies, but do not, in themselves, affect random mating. If a population violates one of these, the population will continue to have Hardy-Weinberg proportions each generation, but the allele frequencies will change with that force.

* Selection, in general, causes allele frequencies to change, often quite rapidly. While directional selection eventually leads to the loss of all alleles except the favored one, some forms of selection, such as balancing selection, lead to equilibrium without loss of alleles.
* Mutation will have a very subtle effect on allele frequencies. Mutation rates are of the order 10⁻⁴ to 10⁻⁸, and the change in allele frequency will be, at most, the same order. Recurrent mutation will maintain alleles in the population, even if there is strong selection against them.
* Migration genetically links two or more populations together. In general, allele frequencies will become more homogeneous among the populations. Some models for migration inherently include nonrandom mating (Wahlund effect, for example). For those models, the Hardy-Weinberg proportions will normally not be valid.

How these violations affect formal statistical tests for HWE is discussed later.

Unfortunately, a violation of the assumptions of the Hardy-Weinberg principle does not necessarily mean that the population will violate HWE. For example, balancing selection leads to an equilibrium population with Hardy-Weinberg proportions.

Sex linkage

Where the A gene is sex-linked, the heterogametic sex (e.g., human males) has only one copy of the gene (and is termed hemizygous), while the homogametic sex (e.g., human females) has two copies. The genotype frequencies at equilibrium are p and q for the heterogametic sex but p², 2pq and q² for the homogametic sex.

For example, in humans red-green colorblindness is an X-linked recessive trait. In western European males, the trait affects about 1 in 12 (q = 0.083), whereas it affects about 1 in 200 females (0.005, compared to q² = 0.0069), very close to Hardy-Weinberg proportions.

If a population is brought together with males and females with different allele frequencies, the allele frequency of the male population in each generation follows that of the female population in the preceding generation, because each male receives its X chromosome from its mother. The population converges on equilibrium very quickly.
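The convergence for a sex-linked gene can be sketched as a simple recurrence. The Python sketch below is a hedged illustration, not taken from the article: it assumes an XY system in which sons take their single X from their mother and daughters take one X from each parent, and the starting frequencies (0.2 in females, 0.8 in males) are hypothetical.

```python
def x_linked_generations(p_female, p_male, generations):
    """Track the frequency of an X-linked allele in each sex.

    Sons receive their single X from their mother; daughters receive one X
    from each parent. Names and starting values are illustrative.
    """
    history = [(p_female, p_male)]
    for _ in range(generations):
        p_female, p_male = (p_female + p_male) / 2, p_female
        history.append((p_female, p_male))
    return history


history = x_linked_generations(p_female=0.2, p_male=0.8, generations=10)
for generation, (pf, pm) in enumerate(history):
    print(f"{generation:2d}  females {pf:.4f}  males {pm:.4f}")
```

Under these assumptions the difference between the two sexes halves and changes sign every generation, so both frequencies oscillate towards the weighted mean (2 × female + male) / 3 of the starting values.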

Generalizations

The simple derivation above can be generalized for more than two alleles and polyploidy.

Generalization for more than two alleles

Consider an extra allele frequency, r. The two-allele case is the binomial expansion of (p + q)², and thus the three-allele case is the trinomial expansion of (p + q + r)².

$$(p + q + r)^2 = p^2 + q^2 + r^2 + 2pq + 2pr + 2qr$$

More generally, consider the alleles A₁, ..., Aₙ given by the allele frequencies p₁ to pₙ:

$$(p_1 + \cdots + p_n)^2$$

giving for all homozygotes:

$$f(A_i A_i) = p_i^2$$

and for all heterozygotes:

$$f(A_i A_j) = 2 p_i p_j$$

Generalization for polyploidy

The Hardy–Weinberg principle may also be generalized to polyploid systems, that is, for organisms that have more than two copies of each chromosome. Consider again only two alleles. The diploid case is the binomial expansion of:

$$(p + q)^2$$

and therefore the polyploid case is the binomial expansion of:

$$(p + q)^c$$

where c is the ploidy, for example with tetraploid (c = 4):

Table 2: Expected genotype frequencies for tetraploidy

Genotype    Frequency
AAAA        p⁴
AAAa        4p³q
AAaa        6p²q²
Aaaa        4pq³
aaaa        q⁴

Whether the organism is a 'true' tetraploid or an amphidiploid determines how long it will take for the population to reach Hardy-Weinberg equilibrium.

Complete generalization

The completely generalized formula, for n alleles with frequencies p₁ to pₙ and ploidy c, is the multinomial expansion of:

$$(p_1 + \cdots + p_n)^c$$
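As an illustration of the general case, the terms of the multinomial expansion of (p₁ + ... + pₙ)^c can be enumerated directly. The Python sketch below is only one way to do this; the function name, the dictionary representation of allele frequencies and the example values (0.6 and 0.4) are assumptions made for the example.

```python
from itertools import combinations_with_replacement
from math import factorial, prod


def genotype_frequencies(allele_freqs, ploidy):
    """Expected genotype frequencies for any number of alleles and ploidy.

    allele_freqs maps allele names to frequencies (summing to 1). Each
    genotype is an unordered combination of `ploidy` alleles, and its
    frequency is the matching term of the multinomial expansion.
    """
    alleles = sorted(allele_freqs)
    result = {}
    for genotype in combinations_with_replacement(alleles, ploidy):
        counts = [genotype.count(a) for a in alleles]
        coefficient = factorial(ploidy) // prod(factorial(k) for k in counts)
        frequency = coefficient * prod(
            allele_freqs[a] ** k for a, k in zip(alleles, counts)
        )
        result["".join(genotype)] = frequency
    return result


diploid = genotype_frequencies({"A": 0.6, "a": 0.4}, ploidy=2)
tetraploid = genotype_frequencies({"A": 0.6, "a": 0.4}, ploidy=4)
print(diploid)
print(tetraploid)
```

With two alleles and a ploidy of 2 this reduces to p², 2pq and q²; with a ploidy of 4 it gives the five tetraploid terms p⁴, 4p³q, 6p²q², 4pq³ and q⁴.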

Applications

The Hardy–Weinberg principle may be applied in two ways. Either a population is assumed to be in Hardy–Weinberg proportions, in which case the genotype frequencies can be calculated, or, if the genotype frequencies of all three genotypes are known, they can be tested for deviations that are statistically significant.

Application to cases of complete dominance

Suppose that the phenotypes of AA and Aa are indistinguishable, i.e. that there is complete dominance. Assuming that the Hardy–Weinberg principle applies to the population, then q can still be calculated from f(aa):

$$q = \sqrt{f(aa)}$$

and p can be calculated from q. Estimates of f(AA) and f(Aa) can thus be derived from p² and 2pq respectively. Note, however, that such a population cannot be tested for equilibrium using the significance tests below, because equilibrium is assumed a priori.
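A short Python sketch of this estimate follows. It assumes Hardy–Weinberg proportions, so that f(aa) = q²; the function name and the 4% recessive fraction are hypothetical values chosen only for illustration.

```python
from math import sqrt


def frequencies_under_dominance(recessive_fraction):
    """Estimate allele and genotype frequencies from the recessive phenotype.

    Assumes Hardy-Weinberg proportions, so f(aa) = q**2.
    """
    q = sqrt(recessive_fraction)
    p = 1 - q
    return {"p": p, "q": q, "AA": p ** 2, "Aa": 2 * p * q, "aa": q ** 2}


# Hypothetical population in which 4% show the recessive phenotype.
estimate = frequencies_under_dominance(0.04)
print(estimate)
```

In this hypothetical case q is 0.2, so about a third of the individuals showing the dominant phenotype (0.32 out of 0.96) would be carriers of the recessive allele.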

Significance tests for deviation

Testing deviation from the HWP is generally performed using Pearson's chi-squared test, using the observed genotype frequencies obtained from the data and the expected genotype frequencies obtained using the HWP. For systems where there are large numbers of alleles, this may result in data with many empty possible genotypes and low genotype counts, because there are often not enough individuals present in the sample to adequately represent all genotype classes. If this is the case, then the asymptotic assumption of the chi-square distribution will no longer hold, and it may be necessary to use a form of Fisher's exact test, which requires a computer to solve.

Example χ² test for deviation

These data are from E.B. Ford (1971) on the Scarlet tiger moth, for which the phenotypes of a sample of the population were recorded. Genotype-phenotype distinction is assumed to be negligibly small. The null hypothesis is that the population is in Hardy–Weinberg proportions, and the alternative hypothesis is that the population is not in Hardy–Weinberg proportions.

Table 3: Example Hardy–Weinberg principle calculation

Genotype    White-spotted (AA)    Intermediate (Aa)    Little spotting (aa)    Total
Number      1469                  138                  5                       1612

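From which allele frequencies can be calculated:

$$p = \frac{2 \times \mathrm{obs}(AA) + \mathrm{obs}(Aa)}{2 \times (\mathrm{obs}(AA) + \mathrm{obs}(Aa) + \mathrm{obs}(aa))} = \frac{2 \times 1469 + 138}{2 \times 1612} = \frac{3076}{3224} = 0.954$$

and

$$q = 1 - p = 1 - 0.954 = 0.046$$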

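So the Hardy–Weinberg expectation, for a sample of n = 1612 individuals, is:

$$E(AA) = p^2 n \approx 1467.4, \qquad E(Aa) = 2pqn \approx 141.2, \qquad E(aa) = q^2 n \approx 3.4$$

Pearson's chi-square test states:

$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(1469 - 1467.4)^2}{1467.4} + \frac{(138 - 141.2)^2}{141.2} + \frac{(5 - 3.4)^2}{3.4}$$

$$\chi^2 = 0.002 + 0.073 + 0.756 = 0.83$$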

There is 1 degree of freedom. (Degrees of freedom for χ² tests are normally n − 1, where n is the number of genotype classes. However, an extra degree of freedom is lost because the expected values were estimated from the observed values.) The 5% significance level for 1 degree of freedom is 3.84, and since the χ² value is less than this, the null hypothesis that the population is in Hardy–Weinberg equilibrium is not rejected.
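The calculation for Ford's data can be repeated in a few lines of Python. This is a sketch rather than a general-purpose test: the function name is an assumption, the critical value 3.84 is taken from the text above, and no statistics library is used.

```python
def hardy_weinberg_chi_square(n_AA, n_Aa, n_aa):
    """Pearson chi-square statistic for deviation from Hardy-Weinberg."""
    n = n_AA + n_Aa + n_aa
    # Allele frequency of A estimated by counting alleles.
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1 - p
    observed = [n_AA, n_Aa, n_aa]
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return p, expected, chi2


p, expected, chi2 = hardy_weinberg_chi_square(1469, 138, 5)
print(f"p = {p:.3f}")
print("expected:", [round(e, 1) for e in expected])
print(f"chi-square = {chi2:.2f}")

CRITICAL_5_PERCENT_1_DF = 3.84  # value quoted in the text
print("reject HWE" if chi2 > CRITICAL_5_PERCENT_1_DF else "do not reject HWE")
```

Because the expected counts are computed from the unrounded allele frequency, the individual terms may differ in the last digit from a hand calculation that uses p = 0.954.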

Fisher's exact test (probability test)

Fisher's exact test can be applied to testing for Hardy-Weinberg proportions. Since the test is conditional on the allele frequencies, p and q, the problem can be viewed as testing for the proper number of heterozygotes. In this way, the hypothesis of Hardy-Weinberg proportions is rejected if the number of heterozygotes is too large or too small. The conditional probabilities for the heterozygote, given the allele frequencies, are given in Emigh (1980) as

$$\operatorname{prob}[n_{12} \mid n_1] = \frac{\binom{n}{n_{11},\, n_{12},\, n_{22}}}{\binom{2n}{n_1}} \, 2^{n_{12}}$$

where n₁₁, n₁₂, n₂₂ are the observed numbers of the three genotypes, AA, Aa, and aa, respectively, n = n₁₁ + n₁₂ + n₂₂ is the sample size, and n₁ is the number of A alleles.

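An Example

Using one of the examples from Emigh (1980), we can consider the case where n = 100 and the number of A alleles is n₁ = 34 (that is, p = 0.17); these are the values with which the significance levels in Table 4 agree. The possible observed numbers of heterozygotes and their exact significance levels are given in Table 4.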

Table 4: Example of Fisher's Exact Test

Number of heterozygotes    Significance level
0                          0.000
2                          0.000
4                          0.000
6                          0.000
8                          0.000
10                         0.000
12                         0.000
14                         0.000
16                         0.000
18                         0.001
20                         0.007
22                         0.034
34                         0.067
24                         0.151
32                         0.291
26                         0.474
30                         0.730
28                         1.000

Using this table, you look up the significance level of the test based on the observed number of heterozygotes. For example, if you observed 20 heterozygotes, the significance level for the test is 0.007. As is typical for Fisher's exact test for small samples, the gradation of significance levels is quite coarse.

Unfortunately, you have to create a table like this for every experiment, since the tables are dependent on both n and p.
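Such a table can be generated by computer. The Python sketch below is an illustration under stated assumptions: it uses the standard conditional distribution of the heterozygote count given the allele counts, takes the significance level of an outcome to be the total probability of all outcomes no more probable than it, and assumes n = 100 individuals with n₁ = 34 copies of allele A (34 of the 200 alleles), values chosen because they yield the levels tabulated above. The function names are hypothetical.

```python
from math import comb, factorial


def heterozygote_probability(n, n1, n12):
    """Probability of n12 heterozygotes given n individuals and n1 A alleles."""
    n11 = (n1 - n12) // 2
    n22 = n - n11 - n12
    multinomial = factorial(n) // (
        factorial(n11) * factorial(n12) * factorial(n22)
    )
    return multinomial * 2 ** n12 / comb(2 * n, n1)


def significance_table(n, n1):
    """Exact significance level for each feasible number of heterozygotes."""
    n2 = 2 * n - n1
    # n12 has the same parity as n1 and cannot exceed either allele count.
    feasible = range(n1 % 2, min(n1, n2) + 1, 2)
    probs = {n12: heterozygote_probability(n, n1, n12) for n12 in feasible}
    # Sum the probabilities of all outcomes no more likely than the observed one.
    return {
        n12: sum(pr for pr in probs.values() if pr <= probs[n12])
        for n12 in probs
    }


table = significance_table(n=100, n1=34)
for n12, level in sorted(table.items(), key=lambda item: item[1]):
    print(f"{n12:3d}  {level:.3f}")
```

Calling `significance_table` with other values of n and n₁ produces the corresponding table for a different experiment.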

Inbreeding Coefficient

The inbreeding coefficient, F (see also F-statistics), is one minus the observed frequency of heterozygotes over that expected from Hardy–Weinberg equilibrium.

$$F = \frac{E(f(Aa)) - O(f(Aa))}{E(f(Aa))} = 1 - \frac{O(f(Aa))}{E(f(Aa))}$$

where the expected value from Hardy–Weinberg equilibrium is given by

$$E(f(Aa)) = 2pq$$

For example, for Ford's data above:

$$F = 1 - \frac{138}{141.2} = 0.023$$

For two alleles, the chi-square goodness-of-fit test for Hardy-Weinberg proportions is equivalent to the test for inbreeding, F = 0.
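The equivalence can be illustrated with Ford's counts (1469, 138, 5). The Python sketch below computes F and then uses the standard two-allele identity χ² = nF², which is stated here as an assumption rather than derived in the text; the function name is hypothetical.

```python
def inbreeding_coefficient(n_AA, n_Aa, n_aa):
    """F = 1 - observed heterozygote frequency / expected (2pq)."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)
    observed_het = n_Aa / n
    expected_het = 2 * p * (1 - p)
    return 1 - observed_het / expected_het


n = 1469 + 138 + 5
F = inbreeding_coefficient(1469, 138, 5)
# For two alleles the chi-square statistic equals n * F**2.
chi2_from_F = n * F ** 2
print(f"F = {F:.3f}, n * F^2 = {chi2_from_F:.2f}")
```

A positive F, as here, corresponds to a deficit of heterozygotes relative to Hardy–Weinberg expectation, and n·F² agrees with the χ² value obtained in the example above.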

History

Mendelian genetics was rediscovered in 1900. However, it remained somewhat controversial for several years, as it was not then known how it could give rise to continuous characters. Udny Yule (1902) argued against Mendelism because he thought that dominant alleles would increase in the population. The American William E. Castle (1903) showed that without selection, the genotype frequencies would remain stable. Karl Pearson (1903) found one equilibrium position with values of p = q = 0.5. Reginald Punnett, unable to counter Yule's point, introduced the problem to G. H. Hardy, a British mathematician, with whom he played cricket. Hardy was a pure mathematician and held applied mathematics in some contempt; his view of biologists' use of mathematics comes across in his 1908 paper, where he describes this as "very simple".

"To the Editor of Science: I am reluctant to intrude in a discussion concerning matters of which I have no expert knowledge, and I should have expected the very simple point which I wish to make to have been familiar to biologists. However, some remarks of Mr. Udny Yule, to which Mr. R. C. Punnett has called my attention, suggest that it may still be worth making... Suppose that Aa is a pair of Mendelian characters, A being dominant, and that in any given generation the number of pure dominants (AA), heterozygotes (Aa), and pure recessives (aa) are as p:2q:r. Finally, suppose that the numbers are fairly large, so that mating may be regarded as random, that the sexes are evenly distributed among the three varieties, and that all are equally fertile. A little mathematics of the multiplication-table type is enough to show that in the next generation the numbers will be as (p+q)²:2(p+q)(q+r):(q+r)², or as p₁:2q₁:r₁, say. The interesting question is — in what circumstances will this distribution be the same as that in the generation before? It is easy to see that the condition for this is q² = pr. And since q₁² = p₁r₁, whatever the values of p, q, and r may be, the distribution will in any case continue unchanged after the second generation."

The principle was thus known as Hardy's law in the English-speaking world until Curt Stern (1943) pointed out that it had first been formulated independently in 1908 by the German physician Wilhelm Weinberg (see Crow 1999). Others have tried to associate Castle's name with the Law because of his work in 1903, but it is only rarely seen as the Hardy-Weinberg-Castle Law.

References

* Castle, W. E. (1903). The laws of Galton and Mendel and some laws governing race improvement by selection. Proc. Amer. Acad. Arts Sci. 35: 233–242.
* Crow, J. F. (1999). Hardy, Weinberg and language impediments. Genetics 152: 821–825.
* Emigh, T. H. (1980). A comparison of tests for Hardy-Weinberg equilibrium. Biometrics 36: 627–642.
* Ford, E. B. (1971). Ecological Genetics, London.
* Hardy, G. H. (1908). "Mendelian proportions in a mixed population". Science 28: 49–50.
* Pearson, K. (1904). Mathematical contributions to the theory of evolution. XI. On the influence of natural selection on the variability and correlation of organs. Philosophical Transactions of the Royal Society of London, Ser. A 200: 1–66.
* Stern, C. (1943). "The Hardy–Weinberg law". Science 97: 137–138.
* Weinberg, W. (1908). "Über den Nachweis der Vererbung beim Menschen". Jahreshefte des Vereins für vaterländische Naturkunde in Württemberg 64: 368–382.
* Yule, G. U. (1902). Mendel's laws and their probable relation to intra-racial heredity. New Phytol. 1: 193–207, 222–238.