ERIC - Search Results

Showing all 14 results

Reliability and Validity of International Large-Scale Assessment: Understanding IEA's Comparative Studies of Student Achievement. IEA Research for Education. Volume 10

Wagemaker, Hans, Ed. – International Association for the Evaluation of Educational Achievement, 2020

Although International Association for the Evaluation of Educational Achievement-pioneered international large-scale assessment (ILSA) of education is now a well-established science, non-practitioners and many users often substantially misunderstand how large-scale assessments are conducted, what questions and challenges they are designed to…

Descriptors: International Assessment, Achievement Tests, Educational Assessment, Comparative Analysis

Aligning English Language Testing with Curriculum

Peer reviewed

Palacio, Marcela; Gaviria, Sandra; Brown, James Dean – PROFILE: Issues in Teachers' Professional Development, 2016

Frustrations with traditional testing led a group of teachers at the English for adults program at Universidad EAFIT (Colombia) to design tests aligned with the institutional teaching philosophy and classroom practices. This article reports on a study of an item-by-item evaluation of a series of English exams for validity and reliability in an…

Descriptors: Foreign Countries, English (Second Language), Second Language Learning, Second Language Instruction

Developing Form Assembly Specifications for Exams with Multiple Choice and Constructed Response Items: Balancing Reliability and Validity Concerns

Hendrickson, Amy; Patterson, Brian; Ewing, Maureen – College Board, 2010

The psychometric considerations and challenges associated with including constructed response items on tests are discussed along with how these issues affect the form assembly specifications for mixed-format exams. Reliability and validity, security and fairness, pretesting, content and skills coverage, test length and timing, weights, statistical…

Descriptors: Multiple Choice Tests, Test Format, Test Construction, Test Validity

Biserial Weights: A New Approach to Test Item Option Weighting

Peer reviewed

Claudy, John G. – Applied Psychological Measurement, 1978

Option weighting is an alternative to increasing test length as a means of improving the reliability of a test. The effects on test reliability of option weighting procedures were compared in two empirical studies using four independent sets of items. Biserial weights were found to be superior. (Author/CTM)

Descriptors: Higher Education, Item Analysis, Scoring Formulas, Test Items

Biweight Estimates of Latent Ability.

Peer reviewed

Mislevy, Robert J.; Bock, R. Darrell – Educational and Psychological Measurement, 1982

An alternative biweight estimator based on Tukey's is examined in which (1) test disturbances are not assumed to be the same for all subjects, (2) each response is utilized proportional to its value, and (3) the biweight and maximum likelihood estimate agree when no disturbances are present. Smaller mean-squared errors are shown. (Author/CM)

Descriptors: Error of Measurement, Estimation (Mathematics), Guessing (Tests), Latent Trait Theory

The Effect of Guessing on Item Reliability under Answer-Until-Correct Scoring

Peer reviewed

Kane, Michael; Moloney, James – Applied Psychological Measurement, 1978

The answer-until-correct (AUC) procedure requires that examinees respond to a multi-choice item until they answer it correctly. Using a modified version of Horst's model for examinee behavior, this paper compares the effect of guessing on item reliability for the AUC procedure and the zero-one scoring procedure. (Author/CTM)

Descriptors: Guessing (Tests), Item Analysis, Mathematical Models, Multiple Choice Tests

An Evaluation of "Polyweighting" in Domain-Referenced Testing.

Sympson, J. Bradford; Haladyna, Thomas M. – 1988

A new approach to polychotomous scoring of test items, similar to "max-alpha" scaling (MAS) and known as polyweighting, has been developed. Unlike MAS, this new method of polychotomous scoring provides scoring weights for a given item that are independent of the difficulty of other items in the analysis. Moreover, the scoring weights are…

Descriptors: Computer Software, Difficulty Level, Item Analysis, Latent Trait Theory

Weighting Constructed-Response Items in IRT-Based Exams

Peer reviewed

Sykes, Robert C.; Hou, Liling – Applied Measurement in Education, 2003

Weighting responses to Constructed-Response (CR) items has been proposed as a way to increase the contribution these items make to the test score when there is insufficient testing time to administer additional CR items. The effect of various types of weighting items of an IRT-based mixed-format writing examination was investigated.…

Descriptors: Item Response Theory, Weighted Scores, Responses, Scores

Evaluating Student Multiple-Choice Responses: Effects of Coded and Free Formats.

Peer reviewed

Harasym, P. H.; And Others – Evaluation and the Health Professions, 1980

Coded, as opposed to free response items, in a multiple choice physiology test had a cueing effect which raised students' scores, especially for lower achievers. Reliability of coded items was also lower. Item format and scoring method had an effect on test results. (GDC)

Descriptors: Achievement Tests, Comparative Testing, Cues, Higher Education

Item Option Weighting of Achievement Tests: Comparative Study of Methods.

Downey, Ronald G.

Previous research has studied the effects of different methods of item option weighting on the reliability and concurrent and predictive validity of achievement tests. Increases in reliability are generally found, but with mixed results for validity. Several methods of producing option weights, (i.e., Guttman internal and external weights and…

Descriptors: Achievement Tests, Comparative Analysis, Correlation, Grade Point Average

The Effect of Using Different Weights for Multiple-Choice and Free-Response Item Sections

Hendrickson, Amy; Patterson, Brian; Melican, Gerald – College Board, 2008

Presented at the annual meeting of the National Council on Measurement in Education (NCME) in New York in March 2008. This presentation explores how different item weighting can affect the effective weights, validity coefficients, and test reliability of composite scores among test takers.

Descriptors: Multiple Choice Tests, Test Format, Test Validity, Test Reliability

Item-Option Weighting of Achievement Tests: Comparative Study of Methods.

Peer reviewed

Downey, Ronald G. – Applied Psychological Measurement, 1979

This research attempted to interrelate several methods of producing option weights (i.e., Guttman internal and external weights and judges' weights) and examined their effects on reliability and on concurrent, predictive, and face validity. It was concluded that option weighting offered limited, if any, improvement over unit weighting. (Author/CTM)

Descriptors: Achievement Tests, Answer Keys, Comparative Testing, High Schools

Content Specifications of a Test and Generalizability Theory.

Gonzalez-Tamayo, Eulogio – 1987

The concepts of universe of admissible observation and universe of generalization from the generalizability theory were applied to calculate the intraclass correlation coefficient of a licensure test. The internal consistency coefficient of a dichotomously scored test is identical to the intraclass correlation coefficient of a two-facet design.…

Descriptors: Adults, Analysis of Variance, Content Validity, Criterion Referenced Tests

Scoring and Analyzing Confidence Tests. Final Report.

Rippey, Robert M. – 1971

Technical improvements, which may be made in the reliability and validity of tests through confidence scores, are discussed. However, studies indicate that subjects do not handle their confidence uniformly. (MS)

Descriptors: Computer Programs, Confidence Testing, Correlation, Difficulty Level
