ERIC - Search Results

Showing 1 to 15 of 16 results

Misclassification Error, Binary Regression Bias, and Reliability in Multidimensional Poverty Measurement: An Estimation Approach Based on Bayesian Modelling

Peer reviewed

Najera, Hector – Measurement: Interdisciplinary Research and Perspectives, 2023

Measurement error affects the quality of population orderings of an index and, hence, increases the misclassification of the poor and the non-poor groups and affects statistical inferences from binary regression models. Hence, the conclusions about the extent, profile, and distribution of poverty are likely to be misleading. However, the size and…

Descriptors: Poverty, Error of Measurement, Classification, Statistical Inference

IRT Approaches to Modeling Scores on Mixed-Format Tests

Peer reviewed

Lee, Won-Chan; Kim, Stella Y.; Choi, Jiwon; Kang, Yujin – Journal of Educational Measurement, 2020

This article considers psychometric properties of composite raw scores and transformed scale scores on mixed-format tests that consist of a mixture of multiple-choice and free-response items. Test scores on several mixed-format tests are evaluated with respect to conditional and overall standard errors of measurement, score reliability, and…

Descriptors: Raw Scores, Item Response Theory, Test Format, Multiple Choice Tests

Benchmark Keystroke Biometrics Accuracy from High-Stakes Writing Tasks. Research Report. ETS RR-21-15

Peer reviewed

Choi, Ikkyu; Hao, Jiangang; Deane, Paul; Zhang, Mo – ETS Research Report Series, 2021

"Biometrics" are physical or behavioral human characteristics that can be used to identify a person. It is widely known that keystroke or typing dynamics for short, fixed texts (e.g., passwords) could serve as a behavioral biometric. In this study, we investigate whether keystroke data from essay responses can lead to a reliable…

Descriptors: Accuracy, High Stakes Tests, Writing Tests, Benchmarking

Development and Monte Carlo Study of a Procedure for Correcting the Standardized Mean Difference for Measurement Error in the Independent Variable

Peer reviewed

Nugent, William Robert; Moore, Matthew; Story, Erin – Educational and Psychological Measurement, 2015

The standardized mean difference (SMD) is perhaps the most important meta-analytic effect size. It is typically used to represent the difference between treatment and control population means in treatment efficacy research. It is also used to represent differences between populations with different characteristics, such as persons who are…

Descriptors: Error of Measurement, Error Correction, Predictor Variables, Monte Carlo Methods

ACT Reporting Category Interpretation Guide: Version 1.0. ACT Working Paper 2016 (05)

Powers, Sonya; Li, Dongmei; Suh, Hongwook; Harris, Deborah J. – ACT, Inc., 2016

ACT reporting categories and ACT Readiness Ranges are new features added to the ACT score reports starting in fall 2016. For each reporting category, the number correct score, the maximum points possible, the percent correct, and the ACT Readiness Range, along with an indicator of whether the reporting category score falls within the Readiness…

Descriptors: Scores, Classification, College Entrance Examinations, Error of Measurement

An Investigation of Measurement Invariance of the Key Stage 2 National Curriculum Science Sampling Test in England

Peer reviewed

He, Qingping; Anwyll, Steve; Glanville, Matthew; Opposs, Dennis – Research Papers in Education, 2014

Since 2010, the whole national cohort Key Stage 2 (KS2) National Curriculum test in science in England has been replaced with a sampling test taken by pupils at the age of 11 from a nationally representative sample of schools annually. The study reported in this paper compares the performance of different subgroups of the samples (classified by…

Descriptors: National Curriculum, Sampling, Foreign Countries, Factor Analysis

Correcting Fallacies in Validity, Reliability, and Classification

Peer reviewed

Sijtsma, Klaas – International Journal of Testing, 2009

This article reviews three topics from test theory that continue to raise discussion and controversy and capture test theorists' and constructors' interest. The first topic concerns the discussion of the methodology of investigating and establishing construct validity; the second topic concerns reliability and its misuse, alternative definitions…

Descriptors: Construct Validity, Reliability, Classification, Test Theory

A Response to an Article Published in "Educational Research"'s Special Issue on Assessment (June 2009). What Can Be Inferred about Classification Accuracy from Classification Consistency?

Peer reviewed

Bramley, Tom – Educational Research, 2010

Background: A recent article published in "Educational Research" on the reliability of results in National Curriculum testing in England (Newton, "The reliability of results from national curriculum testing in England," "Educational Research" 51, no. 2: 181-212, 2009) suggested that: (1) classification accuracy can be…

Descriptors: National Curriculum, Educational Research, Testing, Measurement

Rating Scales for Dystonia in Cerebral Palsy: Reliability and Validity

Peer reviewed

Monbaliu, E.; Ortibus, E.; Roelens, F.; Desloovere, K.; Deklerck, J.; Prinzie, P.; De Cock, P.; Feys, H. – Developmental Medicine & Child Neurology, 2010

Aim: This study investigated the reliability and validity of the Barry-Albright Dystonia Scale (BADS), the Burke-Fahn-Marsden Movement Scale (BFMMS), and the Unified Dystonia Rating Scale (UDRS) in patients with bilateral dystonic cerebral palsy (CP). Method: Three raters independently scored videotapes of 10 patients (five males, five females;…

Descriptors: Content Validity, Cerebral Palsy, Validity, Interrater Reliability

The Impact of Performance Level Misclassification on the Accuracy and Precision of Percent at Performance Level Measures

Peer reviewed

Betebenner, Damian W.; Shang, Yi; Xiang, Yun; Zhao, Yan; Yue, Xiaohui – Journal of Educational Measurement, 2008

No Child Left Behind (NCLB) performance mandates, embedded within state accountability systems, focus school AYP (adequate yearly progress) compliance squarely on the percentage of students at or above proficient. The singular importance of this quantity for decision-making purposes has initiated extensive research into percent proficient as a…

Descriptors: Classification, Error of Measurement, Statistics, Reliability

On the Consistency of Individual Classification Using Short Scales

Peer reviewed

Emons, Wilco H. M.; Sijtsma, Klaas; Meijer, Rob R. – Psychological Methods, 2007

Short tests containing at most 15 items are used in clinical and health psychology, medicine, and psychiatry for making decisions about patients. Because short tests have large measurement error, the authors ask whether they are reliable enough for classifying patients into a treatment and a nontreatment group. For a given certainty level,…

Descriptors: Psychiatry, Patients, Error of Measurement, Test Length

Measurement Error or Meaningful Change? The Consistency of School Achievement in Two School-Based Performance Award Programs.

Peer reviewed

Milanowski, Anthony T. – Journal of Personnel Evaluation in Education, 1999

Describes the temporal consistency of school classification observed in the Kentucky, and secondarily in the Charlotte-Mecklenburg (North Carolina), school-based performance award programs. Data from the Kentucky Department of Education show the extent to which temporal inconsistency could be due to measurement error. (SLD)

Descriptors: Academic Achievement, Achievement Gains, Classification, Error of Measurement

Estimating the Consistency and Accuracy of Classifications Based on Test Scores.

Livingston, Samuel A.; Lewis, Charles – 1993

This paper presents a method for estimating the accuracy and consistency of classifications based on test scores. The scores can be produced by any scoring method, including the formation of a weighted composite. The estimates use data from a single form. The reliability of the score is used to estimate its effective test length in terms of…

Descriptors: Classification, Error of Measurement, Estimation (Mathematics), Reliability

Psychometric Properties of Scale Scores and Performance Levels for Performance Assessments Using Polytomous IRT.

Peer reviewed

Wang, Tianyou; Kolen, Michael J.; Harris, Deborah J. – Journal of Educational Measurement, 2000

Describes procedures for calculating conditional standard error of measurement (CSEM) and reliability of scale scores and classification consistency of performance levels. Applied these procedures to data from the American College Testing Program's Work Keys Writing Assessment with sample sizes of 7,097, 1,035, and 1,793. Results show that the…

Descriptors: Adults, Classification, Error of Measurement, Item Response Theory

The Criterion-Referenced Reliability of a Single Score. Report 76-01.

Livingston, Samuel A. – 1976

A distinction is made between reliability of measurement and reliability of classification; the "criterion-referenced reliability coefficient" describes the former. Application of this coefficient to the probability distribution of possible scores for a single student yields a meaningful way to describe the reliability of a single score. (Author)

Descriptors: Classification, Criterion Referenced Tests, Error of Measurement, Measurement
