Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 2 |
Since 2016 (last 10 years) | 5 |
Since 2006 (last 20 years) | 10 |
Descriptor
Computation | 12 |
Test Items | 12 |
Test Theory | 12 |
Item Response Theory | 6 |
Difficulty Level | 4 |
Scores | 4 |
Statistical Analysis | 4 |
College Entrance Examinations | 3 |
Comparative Analysis | 3 |
Multiple Choice Tests | 3 |
Reliability | 3 |
Author
Ketterlin-Geller, Leanne R. | 2 |
Liu, Kimy | 2 |
Tindal, Gerald | 2 |
Almehrizi, Rashid S. | 1 |
Andrich, David | 1 |
DeCarlo, Lawrence T. | 1 |
Deng, Nina | 1 |
Dimitrov, Dimiter M. | 1 |
Haberman, Shelby J. | 1 |
Harrison, Michael | 1 |
Jung, Eunju | 1 |
Publication Type
Journal Articles | 8 |
Reports - Research | 8 |
Numerical/Quantitative Data | 2 |
Reports - Evaluative | 2 |
Dissertations/Theses -… | 1 |
Reports - Descriptive | 1 |
Speeches/Meeting Papers | 1 |
Education Level
Elementary Education | 4 |
Higher Education | 4 |
Postsecondary Education | 4 |
Grade 3 | 2 |
Grade 4 | 2 |
Grade 5 | 2 |
Grade 6 | 2 |
Grade 7 | 2 |
Grade 8 | 2 |
Middle Schools | 2 |
Early Childhood Education | 1 |
Audience
Practitioners | 1 |
Researchers | 1 |
Location
Australia | 1 |
Sweden | 1 |
United States | 1 |
Assessments and Surveys
Law School Admission Test | 1 |
National Assessment of… | 1 |
SAT (College Admission Test) | 1 |
Stemler, Steven E.; Naples, Adam – Practical Assessment, Research & Evaluation, 2021
When students receive the same score on a test, does that mean they know the same amount about the topic? The answer to this question is more complex than it may first appear. This paper compares classical and modern test theories in terms of how they estimate student ability. Crucial distinctions between the aims of Rasch Measurement and IRT are…
Descriptors: Item Response Theory, Test Theory, Ability, Computation
DeCarlo, Lawrence T. – Journal of Educational Measurement, 2023
A conceptualization of multiple-choice exams in terms of signal detection theory (SDT) leads to simple measures of item difficulty and item discrimination that are closely related to, but also distinct from, those used in classical item analysis (CIA). The theory defines a "true split," depending on whether or not examinees know an item,…
Descriptors: Multiple Choice Tests, Test Items, Item Analysis, Test Wiseness
Raykov, Tenko; Dimitrov, Dimiter M.; Marcoulides, George A.; Harrison, Michael – Educational and Psychological Measurement, 2019
Building on prior research on the relationships between key concepts in item response theory and classical test theory, this note contributes to highlighting their important and useful links. A readily and widely applicable latent variable modeling procedure is discussed that can be used for point and interval estimation of the individual person…
Descriptors: True Scores, Item Response Theory, Test Items, Test Theory
Kogar, Hakan – International Journal of Assessment Tools in Education, 2018
The aim of this simulation study was to determine the relationship between true latent scores and estimated latent scores by including various control variables and different statistical models. The study also aimed to compare the statistical models and determine the effects of different distribution types, response formats and sample sizes on latent…
Descriptors: Simulation, Context Effect, Computation, Statistical Analysis
Ramsay, James O.; Wiberg, Marie – Journal of Educational and Behavioral Statistics, 2017
This article promotes the use of modern test theory in testing situations where sum scores for binary responses are now used. It directly compares the efficiencies and biases of classical and modern test analyses and finds an improvement in the root mean squared error of ability estimates of about 5% for two designed multiple-choice tests and…
Descriptors: Scoring, Test Theory, Computation, Maximum Likelihood Statistics
Almehrizi, Rashid S. – Applied Psychological Measurement, 2013
The majority of large-scale assessments develop various score scales that are either linear or nonlinear transformations of raw scores for better interpretations and uses of assessment results. The current formula for coefficient alpha (α; the commonly used reliability coefficient) only provides internal consistency reliability estimates of raw…
Descriptors: Raw Scores, Scaling, Reliability, Computation
Deng, Nina – ProQuest LLC, 2011
Three decision consistency and accuracy (DC/DA) methods, the Livingston and Lewis (LL) method, LEE method, and the Hambleton and Han (HH) method, were evaluated. The purposes of the study were: (1) to evaluate the accuracy and robustness of these methods, especially when their assumptions were not well satisfied, (2) to investigate the "true"…
Descriptors: Item Response Theory, Test Theory, Computation, Classification
Andrich, David; Kreiner, Svend – Applied Psychological Measurement, 2010
Models of modern test theory imply statistical independence among responses, generally referred to as "local independence." One violation of local independence occurs when the response to one item governs the response to a subsequent item. Expanding on a formulation of this kind of violation as a process in the dichotomous Rasch model,…
Descriptors: Test Theory, Item Response Theory, Test Items, Correlation
Jung, Eunju; Liu, Kimy; Ketterlin-Geller, Leanne R.; Tindal, Gerald – Behavioral Research and Teaching, 2008
The purpose of this study was to develop general outcome measures (GOM) in mathematics so that teachers could focus their instruction on needed prerequisite skills. We describe in detail the manner in which content-related evidence was established and then present a number of statistical analyses conducted to evaluate the technical adequacy of…
Descriptors: Item Analysis, Test Construction, Test Theory, Mathematics Tests
Liu, Kimy; Sundstrom-Hebert, Krystal; Ketterlin-Geller, Leanne R.; Tindal, Gerald – Behavioral Research and Teaching, 2008
The purpose of this study was to document the instrument development of maze measures for grades 3-8. Each maze passage contained twelve omitted words that students filled in by choosing the best-fit word from among the provided options. In this technical report, we describe the process of creating, reviewing, and pilot testing the maze measures.…
Descriptors: Test Construction, Cloze Procedure, Multiple Choice Tests, Reading Tests
Haberman, Shelby J. – ETS Research Report Series, 2005
In educational tests, subscores are often generated from a portion of the items in a larger test. Guidelines based on mean-squared error are proposed to indicate whether subscores are worth reporting. Alternatives considered are direct reports of subscores, estimates of subscores based on total score, combined estimates based on subscores and…
Descriptors: Scores, Test Items, Error of Measurement, Computation
Schmidt, Hans-Jurgen – 1988
This study assumes that multiple choice test items generally provide the testee with several solutions, one of which is correct and the others of which are wrong. If pupils are unable to answer a question, one would expect that the wrong choices have equal chances of being selected. In many multiple choice items on stoichiometric calculation which…
Descriptors: Behavior Patterns, Chemistry, Computation, Performance