User:Domer64/sandbox

There are three overarching forms of validity: content, criterion, and construct validity. Construct validity refers to the validity of inferences that observations or measurement tools actually represent or measure the construct being investigated. In lay terms, construct validity examines the question: does the measure behave the way the theory says a measure of that construct should behave? Constructs are abstractions that researchers deliberately create in order to conceptualize the latent variable, which is the cause of scores on a given measure (although it is not directly observable). Construct validity is essential to the perceived overall validity of a test. It is particularly important in the social sciences, psychology, psychometrics and language studies. In modern psychometrics, the validity of a psychological test is interpreted within the framework of construct validity. Psychologists such as Samuel Messick (1989) have pushed for a unified view of construct validity “…as an integrated evaluative judgment of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores…” Key to construct validity are the theoretical ideas behind the trait under consideration, i.e. the concepts that organize how aspects of personality, intelligence, etc. are viewed. Paul Meehl states that "The best construct is the one around which we can build the greatest number of inferences, in the most direct fashion."

Convergent and Discriminant Validity
Convergent and discriminant validity are the two subtypes of validity that make up construct validity. Convergent validity refers to the degree to which two measures of constructs that theoretically should be related are in fact related. In contrast, discriminant validity tests whether concepts or measurements that are supposed to be unrelated are in fact unrelated. Take, for example, a construct of general happiness. If a measure of general happiness has convergent validity, then constructs similar to happiness (satisfaction, contentment, cheerfulness, etc.) should relate closely to the measure of general happiness. If the measure has discriminant validity, then constructs that are not supposed to be related to general happiness (sadness, depression, despair, etc.) should not relate to the measure of general happiness. A measure can have one of the subtypes of construct validity and not the other. Using the example of general happiness, a researcher could create an inventory in which there is a very high correlation between general happiness and contentment, but if there is also a significant correlation between happiness and depression, then the measure's construct validity is called into question. The test has convergent validity but not discriminant validity.
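The happiness example above can be sketched as a simple correlation check. This is a minimal illustration and not a standard procedure: the scores are invented, and the cut-offs `CONVERGENT_MIN` and `DISCRIMINANT_MAX` are assumptions chosen for the example, since the text gives no numeric thresholds.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical scores for six respondents (illustrative only, not real data).
happiness = [2, 4, 5, 7, 8, 9]
contentment = [1, 5, 4, 8, 7, 9]   # theoretically related construct
depression = [5, 3, 6, 4, 6, 4]    # treated as unrelated, following the text

# Assumed cut-offs for this sketch; real studies justify these from theory.
CONVERGENT_MIN = 0.7
DISCRIMINANT_MAX = 0.3

r_convergent = pearson_r(happiness, contentment)
r_discriminant = pearson_r(happiness, depression)

# Convergent evidence: the related measures correlate strongly.
has_convergent = r_convergent >= CONVERGENT_MIN
# Discriminant evidence: the unrelated measures correlate weakly.
has_discriminant = abs(r_discriminant) <= DISCRIMINANT_MAX

print(f"convergent r = {r_convergent:.2f}, supported: {has_convergent}")
print(f"discriminant r = {r_discriminant:.2f}, supported: {has_discriminant}")
```

A measure that passed the first check but failed the second would match the case described above: convergent validity without discriminant validity.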

History
The term construct validity was first coined by Paul Meehl and Lee Cronbach in their seminal article Construct Validity in Psychological Tests. They noted that the idea of construct validity was not new at that point. Rather, it was a combination of many different theorized types of validity dealing with theoretical concepts. They proposed the following three steps to evaluate construct validity:

1.) articulating a set of theoretical concepts and their interrelations
2.) developing ways to measure the hypothetical constructs proposed by the theory
3.) empirically testing the hypothesized relations

Many psychologists note that an important role of construct validation in psychometrics was that it placed more emphasis on theory as opposed to validation. The core issue with validation was that a test could be validated, but that did not necessarily show that it measured the theoretical construct it purported to measure. Construct validity has three aspects or components: the substantive component, the structural component, and the external component. These are closely related to three stages in the test construction process: constitution of the pool of items, analysis and selection of the internal structure of the pool of items, and correlation of test scores with criteria and other variables.

In the 1970s there was growing debate between theorists who began to see construct validity as the dominant model pushing towards a more unified theory of validity and those who continued to work from multiple validity frameworks. Many psychologists and education researchers saw “predictive, concurrent, and content validities as essentially ad hoc, construct validity was the whole of validity from a scientific point of view”. In the 1974 version of The Standards for Educational and Psychological Testing, the inter-relatedness of the three different aspects of validity was recognized: "These aspects of validity can be discussed independently, but only for convenience. They are interrelated operationally and logically; only rarely is one of them alone important in a particular situation". In 1989 Messick presented a new conceptualization of construct validity as a unified and multi-faceted concept. Under this framework, all forms of validity are connected to and dependent on the quality of the construct. He claimed that a unified theory was not his own idea, but rather the culmination of debate and discussion within the scientific community over the preceding decades. There are six aspects of construct validity in Messick’s Unified Theory of Construct Validity. They examine six questions that measure the quality of a test’s construct validity:

1.) Consequential- What are the potential risks if the scores are, in actuality, invalid or inappropriately interpreted? Is the test still worthwhile given the risks?
2.) Content- Do test items appear to be measuring the construct of interest?
3.) Substantive- Is the theoretical foundation underlying the construct of interest sound?
4.) Structural- Do the interrelationships of dimensions measured by the test correlate with the construct of interest and test scores?
5.) External- Does the test have convergent, discriminant, and predictive qualities?
6.) Generalizability- Does the test generalize across different groups, settings and tasks?

How construct validity should be properly viewed is still a subject of debate for validity theorists. The core of the difference lies in an epistemological difference between Positivist and Postpositivist theorists.

Evaluation
Evaluating construct validity requires examining the correlations of the measure with variables that are known to be related to the construct (the one purportedly measured by the instrument being evaluated, or one for which there are theoretical grounds for expecting a relationship). This is consistent with the multitrait-multimethod matrix (MTMM) approach to examining construct validity described in Campbell and Fiske's landmark paper (1959). There are other methods of evaluating construct validity besides MTMM: it can be evaluated through different forms of factor analysis, structural equation modeling (SEM), and other statistical evaluations. It is important to note that a single study does not prove construct validity. Rather, validation is a continuous process of evaluation, reevaluation, refinement, and development. Correlations that fit the expected pattern contribute evidence of construct validity. Construct validity is a judgment based on the accumulation of correlations from numerous studies using the instrument being evaluated.

Most researchers attempt to test construct validity before the main research. To do this, pilot studies may be used. Pilot studies are small-scale preliminary studies aimed at testing the feasibility of a full-scale test. They establish the strength of the research and allow researchers to make any necessary adjustments. Another method is the known-groups technique, which involves administering the measurement instrument to groups expected to differ due to known characteristics. Hypothesized relationship testing involves logical analysis based on theory or prior research. Intervention studies are yet another method of evaluating construct validity. Intervention studies in which a group with low scores on the construct is tested, taught the construct, and then re-measured can demonstrate a test's construct validity. If statistical tests show a significant difference between pre-test and post-test scores, this may demonstrate good construct validity.
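The known-groups technique and the pre-test/post-test intervention design described above can each be reduced to a single statistic. The following is a hedged sketch rather than a prescribed analysis: Cohen's d and the paired t statistic are common choices that the text does not name, the group labels are hypothetical, and all scores are invented.

```python
from math import sqrt
from statistics import mean, stdev, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference between two groups, using the pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def paired_t(pre, post):
    """Paired-samples t statistic for post-test minus pre-test scores."""
    diffs = [after - before for before, after in zip(pre, post)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Known groups (hypothetical): scores on the instrument for a group expected
# to score high and a group expected to score low on the construct.
expected_high = [30, 34, 28, 35, 33]
expected_low = [20, 22, 19, 25, 24]
d = cohens_d(expected_high, expected_low)

# Intervention (hypothetical): the same five people before and after
# being taught the construct.
pre_test = [10, 12, 9, 11, 13]
post_test = [15, 16, 14, 15, 19]
t = paired_t(pre_test, post_test)

print(f"known-groups effect size d = {d:.2f}")
print(f"pre/post paired t = {t:.2f} with {len(pre_test) - 1} degrees of freedom")
```

A large d in the predicted direction, or a t statistic beyond the critical value for the chosen significance level, would count as one piece of evidence for construct validity, not as proof of it.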

Nomological Network
Paul Meehl and Lee Cronbach (1955) proposed that the development of a nomological net was essential to measuring a test's construct validity. A nomological network defines a construct by illustrating its relation to other constructs and behaviors. It is a representation of the concepts (constructs) of interest in a study, their observable manifestations, and the interrelationships among them. It examines whether the relationships between similar constructs are consistent with the relationships between the observed measures of the constructs. Thorough observation of constructs' relationships to each other can generate new constructs. For example, intelligence and working memory are considered highly related constructs. Through the observation of their underlying components, psychologists developed new theoretical constructs such as controlled attention and short term loading. Creating a nomological net can also make the observation and measurement of existing constructs more efficient by pinpointing errors. Researchers have found that the dimensions of the human skull (phrenology) are not indicators of intelligence. By removing the theory of phrenology from the nomological net of intelligence, testing constructs of intelligence is made more efficient. The weaving of all of these interrelated concepts and their observable traits creates a “net” that supports their theoretical concept. For example, in the nomological network for academic achievement, we would expect observable traits of academic achievement (e.g. GPA, SAT, and ACT scores) to relate to the observable traits for studiousness (hours spent studying, attentiveness in class, detail of notes). If they do not, then there is a problem with the measurement of academic achievement or studiousness. If they are indicators of one another, then the nomological network, and therefore the constructed theory, of academic achievement is strengthened.
Although the nomological network proposes a theory of how to strengthen constructs, it does not tell us how to assess construct validity in a study.
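The academic achievement example can be written down as a small data structure: a list of links that the theory predicts, each checked against observed scores. This is only an illustrative sketch. The indicator values are invented, `predicted_links` and the `MIN_R` cut-off are assumptions made for the example, and the check covers only the observable end of the net.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical observations for six students (illustrative only).
observed = {
    "gpa": [2.1, 2.8, 3.0, 3.4, 3.7, 3.9],         # academic achievement
    "sat": [900, 1050, 1000, 1200, 1250, 1400],    # academic achievement
    "study_hours": [4, 9, 8, 14, 13, 18],          # studiousness
}

# Links the theory predicts between indicators of the two constructs.
predicted_links = [("gpa", "study_hours"), ("sat", "study_hours")]

MIN_R = 0.5  # assumed cut-off for "related" in this sketch

link_results = {}
for a, b in predicted_links:
    r = pearson_r(observed[a], observed[b])
    link_results[(a, b)] = (r, r >= MIN_R)

# The net is supported only if every predicted link shows up in the data.
network_supported = all(ok for _, ok in link_results.values())

for (a, b), (r, ok) in link_results.items():
    print(f"{a} ~ {b}: r = {r:.2f}, as predicted: {ok}")
print("network supported:", network_supported)
```

A predicted link that fails the check points to a problem with the measurement of one of the two constructs, as described above.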

Multitrait-Multimethod Matrix
The multitrait-multimethod matrix (MTMM) is an approach to examining construct validity developed by Campbell and Fiske (1959). This model examines convergence (evidence that different measurement methods of a construct give similar results) and discriminability (the ability to differentiate the construct from other related constructs). It measures six traits: the evaluation of convergent validity, the evaluation of discriminant (divergent) validity, trait-method units, multitrait-multimethods, truly different methodologies, and trait characteristics. This design allows investigators to test for “convergence across different measures…of the same ‘thing’…and for divergence between measures…of related but conceptually distinct ‘things’”.
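A minimal sketch of the idea, assuming two traits each measured by two methods: every pair of measures is correlated and labeled by whether the pair shares a trait or a method. The trait and method names and all scores are hypothetical, and the final check is a simplified reading of the convergence and discriminability comparison, not Campbell and Fiske's full set of criteria.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical scores for six respondents, keyed by (trait, method).
scores = {
    ("happiness", "self_report"): [2, 4, 5, 7, 8, 9],
    ("happiness", "peer_rating"): [1, 5, 4, 8, 7, 9],
    ("anxiety", "self_report"): [5, 3, 6, 4, 6, 4],
    ("anxiety", "peer_rating"): [6, 3, 7, 4, 6, 5],
}


def classify(a, b):
    """Label a pair of (trait, method) measures by what they share."""
    (trait_a, method_a), (trait_b, method_b) = a, b
    if trait_a == trait_b:
        return "monotrait-heteromethod"    # convergence evidence
    if method_a == method_b:
        return "heterotrait-monomethod"    # shared method only
    return "heterotrait-heteromethod"      # nothing shared


# Lower triangle of the matrix: one (label, r) entry per pair of measures.
mtmm = {
    (a, b): (classify(a, b), pearson_r(scores[a], scores[b]))
    for a, b in combinations(scores, 2)
}

convergent = [r for kind, r in mtmm.values() if kind == "monotrait-heteromethod"]
heterotrait = [abs(r) for kind, r in mtmm.values() if kind != "monotrait-heteromethod"]

# Simplified check: same-trait correlations exceed all different-trait ones.
pattern_holds = min(convergent) > max(heterotrait)

for (a, b), (kind, r) in mtmm.items():
    print(f"{a} vs {b}: {kind}, r = {r:.2f}")
print("convergent > heterotrait:", pattern_holds)
```

Keying the scores by `(trait, method)` keeps the classification step trivial; adding a third trait or method only requires another dictionary entry.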

Threats to Construct Validity
Since construct validity attempts to create a universal cohesion, there are many possible threats to it. Construct validity is threatened by participant reactivity to the study situation (e.g. the Hawthorne effect), altered behavior due to the novelty of a new treatment, researcher expectations, and diffusion or contamination of the treatment conditions. Developing a poor construct can also be an issue. If a construct is too broad or too narrow, it can invalidate an entire experiment. For example, a researcher might try to use job satisfaction to define overall happiness. This is too narrow, as somebody may love their job but have an unhappy life outside the workplace. Likewise, using general happiness to measure happiness at work is too broad. Construct confounding is another threat. It occurs when other constructs affect the measured construct. For example, self-worth is confounded by self-esteem and self-confidence: one’s perception of self-worth is affected by one’s state of self-esteem or self-confidence. Another threat is hypothesis guessing. If subjects make assumptions about the aims of the research, their behavior changes, and this affects construct validity. Similarly, if an individual becomes apprehensive during an experiment, it could affect his or her performance. Researchers themselves can be threats to construct validity. Researchers’ expectancies and biases, whether overt or unintentional, can lower construct validity by clouding the effect of the research variable. To avoid these effects, interaction should be minimized and double-blind experiments should be used. Variance in scores can show weak construct validity. For example, if native English speakers score higher than individuals who speak English as a second language on a test written in English that is trying to measure intelligence, then the test has poor construct validity. It measures language ability rather than intelligence.