User:Sdraaijer/standard setting

Standard setting is the process of defining the level of knowledge and skill required and then identifying the score on the examination score scale that corresponds to that level; in other words, determining the passing score on an examination. It is the general term for establishing cut scores for a test. A cut score (also written cutscore), also known as cut-off score, passing score, pass-fail score or passing point, is a point on a score continuum that differentiates between classifications along the continuum. The most common cut score is a score that differentiates between the classifications of "pass" and "fail" on a professional or educational test.

Many low-stakes tests set cut scores arbitrarily; for example, an elementary school teacher may require students to answer 60% of the items on a test correctly to pass, or may decide that 5 incorrect answers lead to a fail, regardless of the number of items on the test.

However, there are a number of other methods to determine or set cut scores. These methods vary in complexity and in the arguments given for their appropriateness or applicability.

Methods are mainly categorized as absolute methods, relative methods and compromise methods.

Absolute methods
In absolute methods, cut scores are set at a certain pre-determined point on the score scale. Everyone below that point fails, everyone above passes, irrespective of the percentage of students that pass or fail. The assumption behind absolute methods is that the subject matter covered by a test does not differ from test to test and that there is full (or at least reasonable) control over the level of difficulty of the items in a test. Absolute methods are related to criterion-referenced testing.


 * 60% Method (absolute method): In the 60% method (common in higher education in the Netherlands) the cut score is set to a fixed percentage of the maximum score for a test. Typically this cut score is set to 60% (hence the name of the method). The main reasoning behind this standard is that a teacher believes that, to qualify for a pass, students have to demonstrate a little more proficiency than just answering half of the items of a test correctly.
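The arithmetic of the 60% method can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the function names, the rounding up to whole points, and the convention that a score equal to the cut score counts as a pass are all assumptions that the description above does not specify.

```python
def fixed_percentage_cut_score(max_score, percent=60):
    """Lowest whole-point score that reaches `percent` of `max_score`.

    Both arguments are integers. Rounding up is an assumption: it ensures
    that a passing score is never below the required percentage.
    """
    # Integer ceiling division avoids floating-point rounding surprises.
    return (max_score * percent + 99) // 100


def passes(score, cut_score):
    # Assumption: a score equal to the cut score counts as a pass.
    return score >= cut_score
```

For a test with a maximum score of 40 points, this sketch gives a cut score of 24 points; with 33 points it rounds 19.8 up to 20.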

Relative methods
In relative methods, cut scores are set in such a way that a pre-determined percentage of test-takers passes the test, irrespective of where the cut score falls on the score scale. The argument for using such methods is that, in educational practice, the scores of a group of students are influenced by the incidental quality of their previous education, the incidental level of difficulty of the test, or other incidental circumstances. The assumption is that each student group is in principle equally able and that the standard should therefore be set in relation to the distribution of the scores the group achieves. Relative methods are related to norm-referenced testing.


 * Core Item (relative method): In the core-item method, a teacher identifies a limited set of items that are regarded as the most important for the subject matter. This set is smaller than the set of items for the whole test. The average score that students achieve on the items of the core set (often adjusted for guessing) is then used as the average score needed to pass the whole test.


 * Wijnen (relative method): In the Wijnen method, the cut score is set at the average score of all students on a test, compensated for the unreliability of the test. A teacher can choose to lower the cut score by (arbitrarily) one or two times the standard error of the test. The reasoning behind this method is that the number of students who fail incorrectly because of measurement error (false negative error) should be minimised.


 * Grading on the Curve (relative method): Grading on a curve (also known as curved grading, bell curving, or simply curving) is a statistical method of assigning grades designed to yield a pre-determined distribution of grades among the student population. For example, the pass rate is fixed at 60% of the student population and the cut score is set to match that rate. Grades are then calculated accordingly. The rate can be adapted on the basis of the reliability of the test (see the Wijnen method).
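The three relative methods above can be sketched in a few lines each. The code below is an illustration under stated assumptions rather than a reference implementation: all function and parameter names are hypothetical, the core-item sketch ignores the optional correction for guessing, the "standard error of the test" is taken to be the standard error of measurement computed as SD × √(1 − reliability) with the population standard deviation, and the fixed pass rate is applied to the observed scores only.

```python
import math
from statistics import fmean, pstdev


def core_item_cut_score(core_correct, n_core_items, n_total_items):
    """Scale the mean number correct on the core items up to the whole test."""
    # core_correct: number of core items each student answered correctly.
    # Assumption: no correction for guessing is applied.
    return fmean(core_correct) * n_total_items / n_core_items


def wijnen_cut_score(scores, reliability, k=1.0):
    """Mean test score lowered by k standard errors of measurement."""
    # Assumption: SEM = SD * sqrt(1 - reliability), using the population SD.
    sem = pstdev(scores) * math.sqrt(1.0 - reliability)
    return fmean(scores) - k * sem


def pass_rate_cut_score(scores, pass_rate):
    """Lowest observed score at which at least `pass_rate` of the group passes."""
    ranked = sorted(scores, reverse=True)
    # Small tolerance guards against floating-point noise before rounding up.
    n_pass = math.ceil(pass_rate * len(ranked) - 1e-9)
    n_pass = min(len(ranked), max(1, n_pass))
    return ranked[n_pass - 1]
```

Note that when several students share the score at the cut point, `pass_rate_cut_score` lets all of them pass, so the realised pass rate can be somewhat higher than the target.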

Compromise methods
Compromise methods combine the judgment of the standard setters or absolute setting with information about the realities and consequences of different pass rates or score ranges. For example, when it turns out that given a certain cutscore, the percentage of failing students is very high, a correction for the cutscore can be proposed.


 * Beuk (compromise method): In the Beuk method, each judge in the cut score study is asked to estimate what passing rate should be expected for the exam. This question is posed and answered only after the judges have provided their estimates of the difficulty level for each test question.


 * Hofstee (compromise method): In the Hofstee method, raters or judges define the highest acceptable cut score, the lowest acceptable cut score, highest acceptable fail rate, and the lowest acceptable fail rate. These are plotted against a curve of participants’ score data, and the intersection is used as a cut score.


 * Cohen and Van der Vleuten (compromise method): The Cohen and Van der Vleuten method is a standard setting method that takes the best-performing students as its point of reference. The score of the best student, or the average score of, for example, the best 5 students, serves as the score that earns the maximum grade. This score is generally lower than the maximum score that can be obtained on the test. Relative to this lower maximum, the cut score for the test is lowered as well. The reasoning behind the method is that the education a student receives is imperfect (an imperfect teacher, imperfect instructional materials), that the test is imperfect (which is always true for educational tests), or that the teacher has (implicitly) set standards that are too high. Yet it should be possible for at least some students in a group to achieve the maximum grade. Grades and the cut score should therefore be adapted accordingly.


 * De Gruijter:


 * Others

For some discussion, see: http://www.act.org/research/researchers/reports/pdf/ACT_RR89-2.pdf
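As a rough illustration of two of the compromise methods described above, the sketch below computes a Hofstee and a Cohen and Van der Vleuten cut score. Several things here are assumptions and not part of the descriptions above: all function and parameter names are hypothetical; the Hofstee sketch uses the conventional construction of a straight line from (lowest cut score, highest fail rate) to (highest cut score, lowest fail rate), restricts itself to whole-point cut scores, and picks the point where the observed fail rate is closest to that line; and the Cohen and Van der Vleuten sketch uses a `pass_fraction` of 0.6 of the reference score purely as an example value.

```python
from statistics import fmean


def fail_rate(scores, cut_score):
    """Fraction of test-takers scoring below the cut score."""
    return sum(s < cut_score for s in scores) / len(scores)


def hofstee_cut_score(scores, min_cut, max_cut, min_fail, max_fail):
    """Whole-point cut score where the fail-rate curve meets the judges' line.

    min_cut and max_cut are integers with min_cut < max_cut;
    min_fail and max_fail are fractions between 0 and 1.
    """
    if max_cut <= min_cut:
        raise ValueError("max_cut must be greater than min_cut")
    best_cut, best_gap = min_cut, float("inf")
    for cut in range(min_cut, max_cut + 1):
        # Judges' line: max_fail at min_cut, falling linearly to min_fail at max_cut.
        line = max_fail + (min_fail - max_fail) * (cut - min_cut) / (max_cut - min_cut)
        gap = abs(fail_rate(scores, cut) - line)
        if gap < best_gap:
            best_cut, best_gap = cut, gap
    return best_cut


def cohen_cut_score(scores, top_n=5, pass_fraction=0.6):
    """Cut score as a fraction of the mean score of the top_n students."""
    # Assumption: pass_fraction = 0.6 is an example value only.
    reference = fmean(sorted(scores, reverse=True)[:top_n])
    return pass_fraction * reference
```

If the fail-rate curve never crosses the judges' line within the acceptable range, this sketch simply returns the cut score at which the two are closest, which is a simplification.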

Discussion
There is no single 'correct' method for standard setting. Standard setting can be done with great rigour in the form of standard setting studies or with less rigour to provide practical methods for everyday use by teachers in secondary or higher education.

Standard setting is related to grading, which involves translating a position on a score continuum into a grade. The cut score therefore influences the grade that results from a given score.