ERIC - Search Results

Showing 1 to 15 of 24 results

Screening Test Items for Differential Item Functioning

Peer reviewed

Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2014

A method for medical screening is adapted to differential item functioning (DIF). Its essential elements are explicit declarations of the level of DIF that is acceptable and of the loss function that quantifies the consequences of the two kinds of inappropriate classification of an item. Instead of a single level and a single function, sets of…

Descriptors: Test Items, Test Bias, Simulation, Hypothesis Testing

Test Equity for People Who Are Deaf or Hard-of-Hearing: Commission on Rehabilitation Counselor Certification Steps for Implementation

Peer reviewed

Saladin, Shawn P.; Reid, Christine; Shiels, John – Rehabilitation Research, Policy, and Education, 2011

The Commission on Rehabilitation Counselor Certification (CRCC) has taken a proactive stance on perceived test inequities of the Certified Rehabilitation Counselor (CRC) exam as it relates to people who are prelingually deaf and hard of hearing. This article describes the process developed and implemented by the CRCC to help maximize test equity…

Descriptors: Test Items, Rehabilitation Counseling, Counselor Certification, Deafness

Theory of Test Translation Error

Peer reviewed

Solano-Flores, Guillermo; Backhoff, Eduardo; Contreras-Nino, Luis Angel – International Journal of Testing, 2009

In this article, we present a theory of test translation whose intent is to provide the conceptual foundation for effective, systematic work in the process of test translation and test translation review. According to the theory, translation error is multidimensional; it is not simply the consequence of defective translation but an inevitable fact…

Descriptors: Test Items, Investigations, Semantics, Translation

Detecting Differential Speededness in Multistage Testing

Peer reviewed

van der Linden, Wim J.; Breithaupt, Krista; Chuah, Siang Chee; Zhang, Yanwei – Journal of Educational Measurement, 2007

A potential undesirable effect of multistage testing is differential speededness, which happens if some of the test takers run out of time because they receive subtests with items that are more time intensive than others. This article shows how a probabilistic response-time model can be used for estimating differences in time intensities and speed…

Descriptors: Adaptive Testing, Evaluation Methods, Test Items, Reaction Time

On the Direct Measurement of Face Validity: A Comment on Nevo.

Peer reviewed

Secolsky, Charles – Journal of Educational Measurement, 1987

For measuring the face validity of a test, Nevo suggested that test takers and nonprofessional users rate items on a five point scale. This article questions the ability of those raters and the credibility of the aggregated judgment as evidence of the validity of the test. (JAZ)

Descriptors: Content Validity, Measurement Techniques, Rating Scales, Test Items

The Attenuation Paradox of Traditional Test Theory as a Breakdown of Local Independence in Person-Item Response Theory.

Andrich, David – 1984

Both the attenuation paradox of traditional test theory and the assumption of local independence in person-item response theory have caused problems in interpretation. This paper demonstrates that the two are related concepts, and, through this demonstration, both are clarified. It is demonstrated that the breakdown of local independence leads to…

Descriptors: Latent Trait Theory, Test Interpretation, Test Items, Test Reliability

An Alternative Interpretation of Three Stability Models. Measurement and Methodology, Work Unit 2: Technical Adequacy of Tests.

Wilcox, Rand R. – 1978

Two fundamental problems in mental test theory are to estimate true score and to estimate the amount of error when testing an examinee. In this report, three probability models which characterize a single test item in terms of a population of examinees are described. How these models may be modified to characterize a single examinee in terms of an…

Descriptors: Achievement Tests, Comparative Analysis, Error of Measurement, Mathematical Models

Searching for Better Scoring of Multiple-Choice Tests: Proper Treatment of Misinformation, Guessing and Partial Knowledge.

Zin, Than Than; Williams, John – 1991

Brief explanations are presented of some of the different methods used to score multiple-choice tests; and some studies of partial information, guessing strategies, and test-taking behaviors are reviewed. Studies are grouped in three categories of effort to improve scoring: (1) those that require extra effort from the examinee to answer…

Descriptors: Educational Research, Estimation (Mathematics), Guessing (Tests), Literature Reviews

A Study of Hypotheses Basic to the Use of Rights and Formula Scores. Phase I--Based on Experimental Administration of College Board Tests [and] Phase II--Based on Operational Administration of the GMAT.

Angoff, William H.; Schrader, William B. – 1982

In a study to determine whether a shift from Formula scoring to Rights scoring can be made without causing a discontinuity in the test scale, the analysis of special administrations of the Scholastic Aptitude Test and Chemistry Achievement Test and the variable section of an operational form of the Graduate Management Admission Test (GMAT) is…

Descriptors: Comparative Analysis, Equated Scores, Guessing (Tests), Higher Education

Test Length and Validity: An Application of Test Theory to a Finite World.

Myers, Charles T. – 1978

The viewpoint is expressed that adding to test reliability by either selecting a more homogeneous set of items, restricting the range of item difficulty as closely as possible to the most efficient level, or increasing the number of items will not add to test validity and that there is considerable danger that efforts to increase reliability may…

Descriptors: Achievement Tests, Item Analysis, Multiple Choice Tests, Test Construction

Test Design Project: Studies in Test Adequacy. Annual Report.

Wilcox, Rand R. – 1981

These studies in test adequacy focus on two problems: procedures for estimating reliability, and techniques for identifying ineffective distractors. Fourteen papers are presented on recent advances in measuring achievement (a response to Molenaar); "an extension of the Dirichlet-multinomial model that allows true score and guessing to be…

Descriptors: Achievement Tests, Criterion Referenced Tests, Guessing (Tests), Mathematical Models

Using Cognitive Science to Assign Test Weights.

Peer reviewed

Bhaskar, R.; Dillard, Jesse F. – Instructional Science, 1983

Description of an objective method for assigning weights to questions on examinations includes discussions of classical test theory, knowledge organization, and how task analysis can be used to identify knowledge elements required to solve specific problems, rank them, and assign objective weights to exam questions using a Pareto distribution (7…

Descriptors: Accounting, Epistemology, Evaluation Methods, Item Analysis

What Counts as Evidence of Educational Achievement? The Role of Constructs in the Pursuit of Equity in Assessment

Peer reviewed

Wiliam, Dylan – Review of Research in Education, 2010

The idea that validity should be considered a property of inferences, rather than of assessments, has developed slowly over the past century. In early writings about the validity of educational assessments, validity was defined as a property of an assessment. The most common definition was that an assessment was valid to the extent that it…

Descriptors: Educational Assessment, Validity, Inferences, Construct Validity

Adjusting Scores on Examinations Offering a Choice of Questions.

Livingston, Samuel A. – 1986

This paper deals with test fairness regarding a test consisting of two parts: (1) a "common" section, taken by all students; and (2) a "variable" section, in which some students may answer a different set of questions from other students. For example, a test taken by several thousand students each year contains a common multiple-choice portion and…

Descriptors: Difficulty Level, Error of Measurement, Essay Tests, Mathematical Models

A Discussion of the Expressive One-Word Picture Vocabulary Test.

Peer reviewed

Altepeter, Tom – School Psychology Review, 1983

A critical review of the Expressive One-Word Picture Vocabulary Test (Gardner) is offered. The reviewer feels that the instrument cannot be recommended in its present form. Further research concerning the manual and theoretical issues (particularly test-retest stability) is strongly recommended. (Author/PN)

Descriptors: Error of Measurement, Intelligence Tests, Item Analysis, Pictorial Stimuli
