Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 1 |
Since 2006 (last 20 years) | 8 |
Descriptor
Comparative Analysis | 8 |
Computation | 8 |
Test Theory | 8 |
Item Response Theory | 5 |
Reliability | 4 |
Scores | 4 |
Scaling | 3 |
Test Items | 3 |
Bias | 2 |
College Entrance Examinations | 2 |
Error of Measurement | 2 |
More ▼ |
Source
Applied Measurement in… | 2 |
Applied Psychological… | 2 |
ACT, Inc. | 1 |
Educational and Psychological… | 1 |
Journal of Educational and… | 1 |
ProQuest LLC | 1 |
Author
Almehrizi, Rashid S. | 1 |
Beretvas, S. Natasha | 1 |
Cui, Zhongmin | 1 |
Culpepper, Steven Andrew | 1 |
Deng, Nina | 1 |
Fang, Yu | 1 |
Haberman, Shelby | 1 |
Larkin, Kevin | 1 |
Murphy, Daniel L. | 1 |
Puhan, Gautam | 1 |
Ramsay, James O. | 1 |
More ▼ |
Publication Type
Journal Articles | 6 |
Reports - Research | 4 |
Reports - Evaluative | 2 |
Dissertations/Theses -… | 1 |
Reports - Descriptive | 1 |
Education Level
Higher Education | 3 |
Postsecondary Education | 3 |
Elementary Education | 2 |
Early Childhood Education | 1 |
Grade 2 | 1 |
Grade 4 | 1 |
Grade 5 | 1 |
Grade 6 | 1 |
Grade 7 | 1 |
Grade 8 | 1 |
Intermediate Grades | 1 |
More ▼ |
Audience
Laws, Policies, & Programs
Assessments and Surveys
ACT Assessment | 1 |
National Assessment of… | 1 |
What Works Clearinghouse Rating
Ramsay, James O.; Wiberg, Marie – Journal of Educational and Behavioral Statistics, 2017
This article promotes the use of modern test theory in testing situations where sum scores for binary responses are now used. It directly compares the efficiencies and biases of classical and modern test analyses and finds an improvement in the root mean squared error of ability estimates of about 5% for two designed multiple-choice tests and…
Descriptors: Scoring, Test Theory, Computation, Maximum Likelihood Statistics
Culpepper, Steven Andrew – Applied Psychological Measurement, 2013
A classic topic in the fields of psychometrics and measurement has been the impact of the number of scale categories on test score reliability. This study builds on previous research by further articulating the relationship between item response theory (IRT) and classical test theory (CTT). Equations are presented for comparing the reliability and…
Descriptors: Item Response Theory, Reliability, Scores, Error of Measurement
Woodruff, David; Traynor, Anne; Cui, Zhongmin; Fang, Yu – ACT, Inc., 2013
Professional standards for educational testing recommend that both the overall standard error of measurement and the conditional standard error of measurement (CSEM) be computed on the score scale used to report scores to examinees. Several methods have been developed to compute scale score CSEMs. This paper compares three methods, based on…
Descriptors: Comparative Analysis, Error of Measurement, Scores, Scaling
Murphy, Daniel L.; Beretvas, S. Natasha – Applied Measurement in Education, 2015
This study examines the use of cross-classified random effects models (CCrem) and cross-classified multiple membership random effects models (CCMMrem) to model rater bias and estimate teacher effectiveness. Effect estimates are compared using CTT versus item response theory (IRT) scaling methods and three models (i.e., conventional multilevel…
Descriptors: Teacher Effectiveness, Comparative Analysis, Hierarchical Linear Modeling, Test Theory
Xu, Ting; Stone, Clement A. – Educational and Psychological Measurement, 2012
It has been argued that item response theory trait estimates should be used in analyses rather than number right (NR) or summated scale (SS) scores. Thissen and Orlando postulated that IRT scaling tends to produce trait estimates that are linearly related to the underlying trait being measured. Therefore, IRT trait estimates can be more useful…
Descriptors: Educational Research, Monte Carlo Methods, Measures (Individuals), Item Response Theory
Almehrizi, Rashid S. – Applied Psychological Measurement, 2013
The majority of large-scale assessments develop various score scales that are either linear or nonlinear transformations of raw scores for better interpretations and uses of assessment results. The current formula for coefficient alpha (a; the commonly used reliability coefficient) only provides internal consistency reliability estimates of raw…
Descriptors: Raw Scores, Scaling, Reliability, Computation
Puhan, Gautam; Sinharay, Sandip; Haberman, Shelby; Larkin, Kevin – Applied Measurement in Education, 2010
Will subscores provide additional information than what is provided by the total score? Is there a method that can estimate more trustworthy subscores than observed subscores? To answer the first question, this study evaluated whether the true subscore was more accurately predicted by the observed subscore or total score. To answer the second…
Descriptors: Licensing Examinations (Professions), Scores, Computation, Methods
Deng, Nina – ProQuest LLC, 2011
Three decision consistency and accuracy (DC/DA) methods, the Livingston and Lewis (LL) method, LEE method, and the Hambleton and Han (HH) method, were evaluated. The purposes of the study were: (1) to evaluate the accuracy and robustness of these methods, especially when their assumptions were not well satisfied, (2) to investigate the "true"…
Descriptors: Item Response Theory, Test Theory, Computation, Classification