Lexical hypothesis

In personality psychology, the lexical hypothesis (also known as the fundamental lexical hypothesis, lexical approach, or sedimentation hypothesis) generally includes two postulates:

1. Those personality characteristics that are important to a group of people will eventually become a part of that group's language.

and, therefore:

2. More important personality characteristics are more likely to be encoded into language as a single word.

The lexical hypothesis originated in the late 19th century, and its use began to flourish in English and German psychology during the early 20th century. It is a major basis of the study of the Big Five personality traits, the HEXACO model of personality structure, and the 16PF Questionnaire, and it has been used to study the structure of personality traits in a number of cultural and linguistic settings.

Early estimates
Sir Francis Galton was one of the first scientists to apply the lexical hypothesis to the study of personality, stating:

"I tried to gain an idea of the number of the more conspicuous aspects of the character by counting in an appropriate dictionary the words used to express them... I examined many pages of its index here and there as samples of the whole, and estimated that it contained fully one thousand words expressive of character, each of which has a separate shade of meaning, while each shares a large part of its meaning with some of the rest."

Despite Galton's early ventures into the lexical study of personality, more than two decades passed before English-language scholars continued his work. A 1910 study by George E. Partridge listed approximately 750 English adjectives used to describe mental states, while a 1926 study of Webster's New International Dictionary by M. L. Perkins provided an estimate of 3,000 such terms. These early explorations and estimates were not limited to the English-speaking world: in 1929, the philosopher and psychologist Ludwig Klages stated that the German language contains approximately 4,000 words to describe inner states.

Allport & Odbert
Nearly half a century after Galton first investigated the lexical hypothesis, Franziska Baumgarten published the first psycholexical classification of personality-descriptive terms. Using dictionaries and characterology publications, Baumgarten identified 1,093 separate terms in the German language used for the description of personality and mental states. Although this number is similar in size to the German and English estimates offered by earlier researchers, a 1936 study by Gordon Allport and Henry S. Odbert revealed it to be a severe underestimate. Like M. L. Perkins before them, they used Webster's New International Dictionary as their source. From the dictionary's approximately 400,000 words, Allport and Odbert identified 17,953 unique terms used to describe personality or behavior.

Allport and Odbert's study is one of the most influential psycholexical studies in the history of trait psychology. Not only was it the longest, most exhaustive list of personality-descriptive words at the time, it was also one of the earliest attempts at classifying English-language terms with the use of psychological principles. Allport and Odbert separated their list of nearly 18,000 terms into four categories or "columns":


 * Column I: This group contains 4,504 terms that describe or are related to personality traits. It was the most important of the four columns to Allport and Odbert and to later psychologists, and its terms most closely relate to those used by modern personality psychologists (e.g., aggressive, introverted, sociable). Allport and Odbert suggested that this column represented a minimum rather than a final list of trait terms. Because of this, they recommended that other researchers consult the remaining three columns in their studies.


 * Column II: In contrast with the more stable dispositions described by terms in Column I, this group includes terms describing present states, attitudes, emotions, and moods (e.g., rejoicing, frantic). As a result of this emphasis on temporary states, present participles represent the majority of the 4,541 terms in Column II.


 * Column III: The largest of the four groups, Column III contains 5,226 words related to social evaluations of a person's character (e.g., worthy, insignificant). Unlike the previous two columns, this group does not refer to internal psychological attributes of a person. As such, Allport and Odbert acknowledged that Column III did not meet their definition of trait-related terms. Their study predated the person-situation debate by more than 30 years, and they included this group to appease researchers of social psychology, sociology, and ethics.


 * Column IV: The last of Allport and Odbert's four columns contains 3,682 words. Termed the "miscellaneous column" by the authors, Column IV contains important personality-descriptive terms that did not seem appropriate for the other three columns. Allport and Odbert offered potential subgroups for terms describing behaviors (e.g., pampered, crazed), physical qualities associated with psychological traits (e.g., lean, roly-poly), and talents or abilities (e.g., gifted, prolific). However, they noted that these subdivisions were not necessarily accurate, as: (i) innumerable subgroups were possible, (ii) these subgroups would not incorporate all of the miscellaneous terms, and (iii) further editing might reveal that these terms could be used for the other three columns.
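
The four column sizes listed above can be checked against the reported total of 17,953 terms. The following is a minimal sketch in Python; the dictionary labels and the helper name column_shares are illustrative assumptions, not Allport and Odbert's own terminology, and only the four counts come from the text.

```python
# Column sizes as given in the list above.
# The key labels are illustrative shorthand, not the authors' wording.
column_counts = {
    "I (personality traits)": 4504,
    "II (temporary states)": 4541,
    "III (social evaluations)": 5226,
    "IV (miscellaneous)": 3682,
}


def column_shares(counts):
    """Return each column's share of all terms, as a percentage rounded to 0.1."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# The sum should match the 17,953 unique terms reported for the study.
total_terms = sum(column_counts.values())

# Column III is described as the largest of the four groups.
largest_column = max(column_counts, key=column_counts.get)

shares = column_shares(column_counts)

if __name__ == "__main__":
    print(f"Total terms: {total_terms}")
    print(f"Largest column: {largest_column}")
    for name, pct in shares.items():
        print(f"Column {name}: {pct}% of all terms")
```

The tally shows that the four columns are of broadly comparable size, with the trait terms of Column I making up roughly a quarter of the full list.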

Allport and Odbert did not present these four columns as representing orthogonal concepts. Many of their nearly 18,000 terms could have been classified differently or put into multiple categories, particularly those in Columns I and II. Although the authors attempted to remedy this with the aid of three other editors, the average degree of agreement between these independent reviewers was approximately 47%. Noting that each outside reviewer seemed to have a preferred column, the authors decided to present the classifications performed by Odbert. Rather than try to rationalize this decision, Allport and Odbert presented the results of their study as somewhat arbitrary and unfinished.

Warren Norman
Throughout the 1940s, researchers such as Raymond Cattell and Donald Fiske used factor analysis to explore the more general structure of the trait terms in Allport and Odbert's Column I. Rather than rely on the factors obtained by these researchers, Warren Norman performed an independent analysis of Allport and Odbert's terms in 1963. Despite finding a five-factor structure similar to Fiske's, Norman decided to use Allport and Odbert's original list to create a more precise and better-structured taxonomy of terms. Using the 1961 edition of Webster's International Dictionary, Norman added relevant terms and removed those from Allport and Odbert's list that were no longer in use. This resulted in a source list of approximately 40,000 potential trait-descriptive terms. From this list, Norman then removed terms that were deemed archaic or obsolete, solely evaluative, overly obscure, dialect-specific, loosely related to personality, or purely physical. By doing so, Norman reduced his original list to 2,797 unique trait-descriptive terms. Norman's work would eventually serve as the basis for Dean Peabody and Lewis Goldberg's explorations of the "Big Five" personality traits.

Juri Apresjan and the Moscow Semantic School
During the 1970s, Juri Apresjan, a founder of the Moscow Semantic School, developed the systemic, or systematic, method of lexicography, which uses the concept of the language picture of the world. This concept is also termed the naive picture of the world in order to stress the non-scientific description of the world that is found in natural language. In his book "Systematic Lexicography", which was published in English in 2000, J. D. Apresjan puts forward the idea of building dictionaries on the basis of "reconstructing the so-called naive picture of the world, or the 'world-view', underlying the partly universal and partly language specific pattern of conceptualizations inherent in any natural language". In his opinion, the general world-view can be divided into more local pictures of reality, such as naive geometry, naive physics, naive psychology, and so forth. In particular, Apresjan devotes one chapter of the book to the lexicographic reconstruction of the language picture of the human being in the Russian language. Later, Apresjan's work was the basis for Sergey Golubkov's further attempts to build "the language personality theory", which would differ from other lexically based personality theories (e.g., those of Allport, Cattell, and Eysenck) in its meronomic (partonomic) nature, as opposed to the taxonomic nature of those theories.

Psycholexical studies of values
In addition to research on personality, the psycholexical method has also been applied to the study of values in multiple languages, providing a contrast with theory-driven approaches such as Schwartz's Theory of Basic Human Values.

Philosophy
Concepts similar to the lexical hypothesis are basic to ordinary language philosophy. Similar to the use of the lexical hypothesis to understand personality, ordinary language philosophers propose that philosophical problems can be solved or better understood by an examination of everyday language. In his essay "A Plea for Excuses," J. L. Austin cited three main justifications for this method: words are tools, words are not only facts or objects, and commonly used words "embod[y] all the distinctions men have found worth drawing...we are using a sharpened awareness of words to sharpen our perception of, though not as the final arbiter of, the phenomena".

Criticism
Despite its widespread use for the study of personality, the lexical hypothesis has been challenged for a number of reasons. The following list describes some of the major critiques of the lexical hypothesis and personality models based on psycholexical studies.
 * The use of verbal descriptors as material for analysis brings a pro-social bias of language into the resulting models. Experiments using the lexical hypothesis demonstrated that the use of lexical material skews the resulting dimensionality according to a sociability bias of language and a negativity bias of emotionality, grouping all evaluations around these two dimensions. This means that the two largest dimensions in the Big Five model of personality (i.e., Extraversion and Neuroticism) might be merely an artifact of the lexical method that this model employed.
 * Many traits of psychological importance are too complex to be encoded into single terms or used in everyday language. In fact, an entire text may be the only way to accurately capture and reflect some important personality characteristics.
 * Laypeople use personality-descriptive terms in an ambiguous manner. Similarly, many of the terms used in psycholexical studies are too ambiguous to be useful in a psychological context.
 * The lexical hypothesis relies on terms that were not developed by experts. As such, any models developed with the lexical hypothesis represent lay perceptions rather than expert psychological knowledge.
 * Language accounts for a minority of communication and is inadequate to describe much of human experience.
 * The mechanisms that resulted in the development of personality lexicons are poorly understood.
 * Personality-descriptive terms change over time and differ in meaning across dialects, languages, and cultures.
 * The methods used to test the lexical hypothesis are unscientific.
 * Personality-descriptive language is too general to be represented by a single word class, yet psycholexical studies of personality largely rely on adjectives.