ERIC - Search Results

Publication Date

In 2026	0
Since 2025	0
Since 2022 (last 5 years)	0
Since 2017 (last 10 years)	1
Since 2007 (last 20 years)	6

Descriptor

Comparative Testing	12
Item Analysis	12
Test Reliability	12
Test Construction	5
Test Validity	5
Difficulty Level	4
Test Interpretation	4
Test Items	4
Aptitude Tests	3
Multiple Choice Tests	3
Test Format	3
Test Theory	3
Academic Standards	2
Achievement Tests	2
Adults	2
Analysis of Variance	2
Cognitive Ability	2
Comparative Analysis	2
Error of Measurement	2
Foreign Countries	2
Higher Education	2
Measurement Techniques	2
Psychometrics	2
Reading Tests	2
Replication (Evaluation)	2

Source

Advances in Health Sciences…	2
Applied Measurement in…	1
Applied Psychological…	1
GED Testing Service	1
International Journal of…	1
Journal of Education for…	1

Publication Type

Reports - Research	9
Journal Articles	6
Reports - Evaluative	2
Speeches/Meeting Papers	2
Tests/Questionnaires	1

Education Level

Higher Education	3
Elementary Secondary Education	2
Grade 10	1
Grade 4	1
Grade 7	1
Postsecondary Education	1

Audience

Researchers

Location

Finland

Laws, Policies, & Programs

Assessments and Surveys

General Educational…

Showing all 12 results

Multiple True-False Items: A Comparison of Scoring Algorithms

Peer reviewed

Lahner, Felicitas-Maria; Lörwald, Andrea Carolin; Bauer, Daniel; Nouns, Zineb Miriam; Krebs, René; Guttormsen, Sissel; Fischer, Martin R.; Huwendiek, Sören – Advances in Health Sciences Education, 2018

Multiple true-false (MTF) items are a widely used supplement to the commonly used single-best answer (Type A) multiple choice format. However, an optimal scoring algorithm for MTF items has not yet been established, as existing studies yielded conflicting results. Therefore, this study analyzes two questions: What is the optimal scoring algorithm…

Descriptors: Scoring Formulas, Scoring Rubrics, Objective Tests, Multiple Choice Tests

Does MTV Really Do a Good Job of Evaluating Professors? An Empirical Test of the Internet Site Ratemyprofessors.com

Peer reviewed

Murray, Keith B.; Zdravkovic, Srdan – Journal of Education for Business, 2016

Considerable debate continues regarding the efficacy of the website RateMyProfessors.com (RMP). To date, however, virtually no direct, experimental research has been reported which directly bears on questions relating to sampling adequacy or item adequacy in producing what favorable correlations have been reported. The authors compare the data…

Descriptors: Computer Assisted Testing, Computer Software Evaluation, Student Evaluation of Teacher Performance, Item Analysis

Are Multiple Choice Tests Fair to Medical Students with Specific Learning Disabilities?

Peer reviewed

Ricketts, Chris; Brice, Julie; Coombes, Lee – Advances in Health Sciences Education, 2010

The purpose of multiple choice tests of medical knowledge is to estimate as accurately as possible a candidate's level of knowledge. However, concern is sometimes expressed that multiple choice tests may also discriminate in undesirable and irrelevant ways, such as between minority ethnic groups or by sex of candidates. There is little literature…

Descriptors: Medical Students, Testing Accommodations, Ethnic Groups, Learning Disabilities

Stability of Rasch Scales over Time

Peer reviewed

Taylor, Catherine S.; Lee, Yoonsun – Applied Measurement in Education, 2010

Item response theory (IRT) methods are generally used to create score scales for large-scale tests. Research has shown that IRT scales are stable across groups and over time. Most studies have focused on items that are dichotomously scored. Now Rasch and other IRT models are used to create scales for tests that include polytomously scored items.…

Descriptors: Measures (Individuals), Item Response Theory, Robustness (Statistics), Item Analysis

Tests in Europe: Where We Are and Where We Should Go

Peer reviewed

Elosua, Paula; Iliescu, Dragos – International Journal of Testing, 2012

Psychometric practice does not always converge with the advances of psychometric theory. In order to investigate this gap, the authors focus on the 10 most used psychological tests in Europe, as identified by recent surveys. The article analyzes test manuals published in 6 different European countries for these 10 most used tests. A total of 32…

Descriptors: Psychological Testing, Personality Measures, Error of Measurement, Foreign Countries

Reliability Analysis for the Internationally Administered 2002 Series GED Tests. GED Testing Service® Research Studies, 2009-3

Setzer, J. Carl; He, Yi – GED Testing Service, 2009

Reliability refers to the consistency, or stability, of test scores when the authors administer the measurement procedure repeatedly to groups of examinees (American Educational Research Association [AERA], American Psychological…

Descriptors: Educational Research, Error of Measurement, Scores, Test Reliability

Internal-Structure Analysis of Analytical Reasoning Worksamples 244 D and E and Development of Form H. Technical Report 1992-1.

Bethscheider, Janine K. – 1992

Standard and experimental forms of the Johnson O'Connor Research Foundation's Analytical Reasoning test were administered to 1,496 clients of the Foundation (persons seeking information about aptitude for educational and career decisions). The objectives were to develop a new form of the test and to better understand what makes some items more…

Descriptors: Adults, Aptitude Tests, Career Choice, Comparative Testing

An Investigation of the Relationship between Item Arrangement and Test Performance.

Chissom, Brad; Chukabarah, Prince C. O. – 1985

The comparative effects of various sequences of test items were examined for over 900 graduate students enrolled in an educational research course at The University of Alabama, Tuscaloosa. The experiment, which was conducted a total of four times using four separate tests, presented three different arrangements of 50 multiple-choice items: (1)…

Descriptors: Analysis of Variance, Comparative Testing, Difficulty Level, Graduate Students

Item-Option Weighting of Achievement Tests: Comparative Study of Methods.

Peer reviewed

Downey, Ronald G. – Applied Psychological Measurement, 1979

This research attempted to interrelate several methods of producing option weights (i.e., Guttman internal and external weights and judges' weights) and examined their effects on reliability and on concurrent, predictive, and face validity. It was concluded that option weighting offered limited, if any, improvement over unit weighting. (Author/CTM)

Descriptors: Achievement Tests, Answer Keys, Comparative Testing, High Schools

A Comparison of Traditional Approaches and Item Response Approaches to the Problem of Item Selection for Criterion-Referenced Measurement.

Silva, Sharron J. – 1985

Test item selection techniques based on traditional item analysis methods were compared to techniques based on item response theory. The consistency of mastery classifications in criterion referenced reading tests was examined. Pretest and posttest data were available for 945 first and second grade students and for 1796 fourth to sixth grade…

Descriptors: Analysis of Variance, Comparative Testing, Criterion Referenced Tests, Elementary Education

The Ability to Structure Acoustic Material as a Measure of Musical Aptitude. 4. Experiences with Modifications of the Acoustic Structuring Test. Research Bulletin. No. 51.

Karma, Kai – 1978

Four new versions of an acoustic structuring test were developed, administered, and analyzed in order to produce better tests and to contribute to better understanding of the abilities measured by these tests. The tests consist of tape recordings of patterns of musical notes played on an electric organ or an acoustic guitar. Item analyses and…

Descriptors: Adults, Aptitude, Aptitude Tests, Cognitive Ability

Language Testing: The Construction and Use of Foreign Language Tests. A Teacher's Book.

Lado, Robert – 1961

Intended as a comprehensive introduction to the construction and use of foreign language tests, this book utilizes modern linguistic knowledge as a base for scientific language testing. Major attention in testing is focused on such integrated language skills as auditory and reading comprehension, speaking, writing, translation, and over-all…

Descriptors: Achievement Tests, Aptitude Tests, Comparative Testing, Cultural Education

Author

Bauer, Daniel	1
Bethscheider, Janine K.	1
Brice, Julie	1
Chissom, Brad	1
Chukabarah, Prince C. O.	1
Coombes, Lee	1
Downey, Ronald G.	1
Elosua, Paula	1
Fischer, Martin R.	1
Guttormsen, Sissel	1
He, Yi	1
Huwendiek, Sören	1
Iliescu, Dragos	1
Karma, Kai	1
Krebs, René	1
Lado, Robert	1
Lahner, Felicitas-Maria	1
Lee, Yoonsun	1
Lörwald, Andrea Carolin	1
Murray, Keith B.	1
Nouns, Zineb Miriam	1
Ricketts, Chris	1
Setzer, J. Carl	1
Silva, Sharron J.	1
Taylor, Catherine S.	1