ERIC - Search Results

Showing 1 to 15 of 16 results

New Tests of Rater Drift in Trend Scoring

Peer reviewed

John R. Donoghue; Carol Eckerly – Applied Measurement in Education, 2024

Trend scoring constructed response items (i.e. rescoring Time A responses at Time B) gives rise to two-way data that follow a product multinomial distribution rather than the multinomial distribution that is usually assumed. Recent work has shown that the difference in sampling model can have profound negative effects on statistics usually used to…

Descriptors: Scoring, Error of Measurement, Reliability, Scoring Rubrics

Improving Methods for Propensity Score Analysis with Mismeasured Variables by Incorporating Background Variables with Moderated Nonlinear Factor Analysis

Greifer, Noah – ProQuest LLC, 2018

There has been some research in the use of propensity scores in the context of measurement error in the confounding variables; one recommended method is to generate estimates of the mis-measured covariate using a latent variable model, and to use those estimates (i.e., factor scores) in place of the covariate. I describe a simulation study…

Descriptors: Evaluation Methods, Probability, Scores, Statistical Analysis

Accuracy of a Classical Test Theory-Based Procedure for Estimating the Reliability of a Multistage Test. Research Report. ETS RR-17-02

Peer reviewed

Kim, Sooyeon; Livingston, Samuel A. – ETS Research Report Series, 2017

The purpose of this simulation study was to assess the accuracy of a classical test theory (CTT)-based procedure for estimating the alternate-forms reliability of scores on a multistage test (MST) having 3 stages. We generated item difficulty and discrimination parameters for 10 parallel, nonoverlapping forms of the complete 3-stage test and…

Descriptors: Accuracy, Test Theory, Test Reliability, Adaptive Testing

Choosing among Tucker or Chained Linear Equating in Two Testing Situations: Rater Comparability Scoring and Randomly Equivalent Groups with an Anchor

Peer reviewed

Puhan, Gautam – Journal of Educational Measurement, 2012

Tucker and chained linear equatings were evaluated in two testing scenarios. In Scenario 1, referred to as rater comparability scoring and equating, the anchor-to-total correlation is often very high for the new form but moderate for the reference form. This may adversely affect the results of Tucker equating, especially if the new and reference…

Descriptors: Testing, Scoring, Equated Scores, Statistical Analysis

Robust Structural Equation Modeling with Missing Data and Auxiliary Variables

Peer reviewed

Yuan, Ke-Hai; Zhang, Zhiyong – Psychometrika, 2012

The paper develops a two-stage robust procedure for structural equation modeling (SEM) and an R package "rsem" to facilitate the use of the procedure by applied researchers. In the first stage, M-estimates of the saturated mean vector and covariance matrix of all variables are obtained. Those corresponding to the substantive variables…

Descriptors: Structural Equation Models, Tests, Federal Aid, Psychometrics

Oral Performance Scoring Using Generalizability Theory and Many-Facet Rasch Measurement: A Comparison Study

Alkahtani, Saif F. – ProQuest LLC, 2012

The principal aim of the present study was to better guide the Quranic recitation appraisal practice by presenting an application of Generalizability theory and Many-facet Rasch Measurement Model for assessing the dependability and fit of two suggested rubrics. Recitations of 93 students were rated holistically and analytically by 3 independent…

Descriptors: Generalizability Theory, Item Response Theory, Verbal Tests, Islam

Administration and Scoring Errors of Graduate Students Learning the WISC-IV: Issues and Controversies

Peer reviewed

Mrazik, Martin; Janzen, Troy M.; Dombrowski, Stefan C.; Barford, Sean W.; Krawchuk, Lindsey L. – Canadian Journal of School Psychology, 2012

A total of 19 graduate students enrolled in a graduate course conducted 6 consecutive administrations of the Wechsler Intelligence Scale for Children, 4th edition (WISC-IV, Canadian version). Test protocols were examined to obtain data describing the frequency of examiner errors, including administration and scoring errors. Results identified 511…

Descriptors: Intelligence Tests, Intelligence, Statistical Analysis, Scoring

DIF Trees: Using Classification Trees to Detect Differential Item Functioning

Peer reviewed

Vaughn, Brandon K.; Wang, Qiu – Educational and Psychological Measurement, 2010

A nonparametric tree classification procedure is used to detect differential item functioning for items that are dichotomously scored. Classification trees are shown to be an alternative procedure to detect differential item functioning other than the use of traditional Mantel-Haenszel and logistic regression analysis. A nonparametric…

Descriptors: Test Bias, Classification, Nonparametric Statistics, Regression (Statistics)

Subscores for Institutions. Research Report. ETS RR-06-13

Peer reviewed

Haberman, Shelby J.; Sinharay, Sadip; Puhan, Gautam – ETS Research Report Series, 2006

Recently, there has been an increasing level of interest in reporting subscores. This paper examines the issue of reporting subscores at an aggregate level, especially at the level of institutions that the examinees belong to. A series of statistical analyses is suggested to determine when subscores at the institutional level have any added value…

Descriptors: Scores, Statistical Analysis, Error of Measurement, Reliability

A Reassessment of Standard Error of Measurement.

Klaas, Alan C. – 1975

Current usage and theory of standard error of measurement calls for one standard error of measurement figure to be used across all levels of scoring. The study revealed that scoring variance across scoring levels is not constant. As scoring ability increases scoring variance decreases. The assertion that low and high scoring subjects will…

Descriptors: Error of Measurement, Guessing (Tests), Scoring, Statistical Analysis

Cut Scores and Testing: Statistics, Judgment, Truth, and Error.

Peer reviewed

Dwyer, Carol Anne – Psychological Assessment, 1996

The uses and abuses of cut scores are examined. The article demonstrates (1) that cut scores always entail judgment; (2) that cut scores inherently result in misclassification; (3) that cut scores impose an artificial dichotomy on an essentially continuous distribution of knowledge, skill, or ability; and (4) that no true cut scores exist. (SLD)

Descriptors: Classification, Cutting Scores, Educational Testing, Error of Measurement

Five Common Misuses of Tests. ERIC Digest No. 108.

Gardner, Eric – 1989

Five of the common misuses of tests are reviewed: (1) acceptance of the test title as an accurate and complete description of the variable being measured (failure to examine the manual and the items carefully to know the specific aspects to be tested can result in misuse through selection of an inappropriate test for a particular purpose or…

Descriptors: Error of Measurement, Evaluation Problems, Examiners, Scoring

Statistics for the Non-statistical

Simpson, J. D. – Audio-Visual Language Journal, 1974

Some basic statistical concepts relevant to the teacher--mean scores, standard deviation, normal and skewed distributions, z scores, item analysis, standard error of measurement, reliability--and their use by the teacher are explained. (RM)

Descriptors: Error of Measurement, Evaluation Methods, Norm Referenced Tests, Scoring

Criterion-Referenced Testing: Comments on Reliability

Peer reviewed

Shavelson, Richard J.; And Others – Journal of Educational Measurement, 1972

In this comment a recent attempt by Samuel A. Livingston to develop a theory of reliability for criterion-referenced measures is critiqued. For Livingston's rejoinder see TM 500 560. (Authors/MB)

Descriptors: Criterion Referenced Tests, Error of Measurement, Measurement Techniques, Response Style (Tests)

Obtaining Maximum Likelihood Trait Estimates from Number-Correct Scores for the Three-Parameter Logistic Model.

Peer reviewed

Yen, Wendy M. – Journal of Educational Measurement, 1984

A procedure for obtaining maximum likelihood trait estimates from number-correct (NC) scores for the three-parameter logistic model is presented. It produces an NC score to trait estimate conversion table. Analyses in the estimated true score metric confirm the conclusions made in the trait metric. (Author/DWH)

Descriptors: Achievement Tests, Error of Measurement, Estimation (Mathematics), Latent Trait Theory
