ERIC - Search Results

Showing 1 to 15 of 23 results

Digital Module 18: Automated Scoring

Peer reviewed

Lottridge, Sue; Burkhardt, Amy; Boyer, Michelle – Educational Measurement: Issues and Practice, 2020

In this digital ITEMS module, Dr. Sue Lottridge, Amy Burkhardt, and Dr. Michelle Boyer provide an overview of automated scoring. Automated scoring is the use of computer algorithms to score unconstrained open-ended test items by mimicking human scoring. The use of automated scoring is increasing in educational assessment programs because it allows…

Descriptors: Computer Assisted Testing, Scoring, Automation, Educational Assessment

Research on Psychometric Modeling, Analysis, and Reporting of the National Assessment of Educational Progress

Peer reviewed

Oranje, Andreas; Kolstad, Andrew – Journal of Educational and Behavioral Statistics, 2019

The design and psychometric methodology of the National Assessment of Educational Progress (NAEP) is constantly evolving to meet the changing interests and demands stemming from a rapidly shifting educational landscape. NAEP has been built on strong research foundations that include conducting extensive evaluations and comparisons before new…

Descriptors: National Competency Tests, Psychometrics, Statistical Analysis, Computation

Evaluating Evidence Regarding Relationships with Criteria

Peer reviewed

Balkin, Richard S. – Measurement and Evaluation in Counseling and Development, 2017

An overview of standards related to demonstrating evidence regarding relationships with criteria as it pertains to instrument development was presented, along with heuristic examples. Additional measures and a comprehensive design are necessary to establish evidence related to the use and interpretation of test scores for the validation of a…

Descriptors: Evidence, Academic Standards, Test Construction, Evaluation Criteria

The Cut-Score Operating Function: A New Tool to Aid in Standard Setting

Peer reviewed

Grabovsky, Irina; Wainer, Howard – Journal of Educational and Behavioral Statistics, 2017

In this essay, we describe the construction and use of the Cut-Score Operating Function in aiding standard setting decisions. The Cut-Score Operating Function shows the relation between the cut-score chosen and the consequent error rate. It allows error rates to be defined by multiple loss functions and will show the behavior of each loss…

Descriptors: Cutting Scores, Standard Setting (Scoring), Decision Making, Error Patterns

The Effect of Error Correlation on Interfactor Correlation in Psychometric Measurement

Peer reviewed

Westfall, Peter H.; Henning, Kevin S. S.; Howell, Roy D. – Structural Equation Modeling: A Multidisciplinary Journal, 2012

This article shows how interfactor correlation is affected by error correlations. Theoretical and practical justifications for error correlations are given, and a new equivalence class of models is presented to explain the relationship between interfactor correlation and error correlations. The class allows simple, parsimonious modeling of error…

Descriptors: Psychometrics, Correlation, Error of Measurement, Structural Equation Models

Psychometric Properties of Raw and Scale Scores on Mixed-Format Tests

Peer reviewed

Kolen, Michael J.; Lee, Won-Chan – Educational Measurement: Issues and Practice, 2011

This paper illustrates that the psychometric properties of scores and scales that are used with mixed-format educational tests can impact the use and interpretation of the scores that are reported to examinees. Psychometric properties that include reliability and conditional standard errors of measurement are considered in this paper. The focus is…

Descriptors: Test Use, Test Format, Error of Measurement, Raw Scores

On the Conceptualisation of Measurement Error

Peer reviewed

Hutchison, Dougal – Oxford Review of Education, 2008

There is a degree of instability in any measurement, so that if it is repeated, it is possible that a different result may be obtained. Such instability, generally described as "measurement error", may affect the conclusions drawn from an investigation, and methods exist for allowing for it. It is less widely known that different disciplines, and…

Descriptors: Measurement Techniques, Data Analysis, Error of Measurement, Test Reliability

Validating Clusters with the Lower Bound for Sum-of-Squares Error

Peer reviewed

Steinley, Douglas – Psychometrika, 2007

Given that a minor condition holds (e.g., the number of variables is greater than the number of clusters), a nontrivial lower bound for the sum-of-squares error criterion in K-means clustering is derived. By calculating the lower bound for several different situations, a method is developed to determine the adequacy of cluster solution based on…

Descriptors: Multivariate Analysis, Least Squares Statistics, Error of Measurement, Psychometrics

Same-Form Retest Effects on Credentialing Examinations

Peer reviewed

Raymond, Mark R.; Neustel, Sandra; Anderson, Dan – Educational Measurement: Issues and Practice, 2009

Examinees who take high-stakes assessments are usually given an opportunity to repeat the test if they are unsuccessful on their initial attempt. To prevent examinees from obtaining unfair score increases by memorizing the content of specific test items, testing agencies usually assign a different test form to repeat examinees. The use of multiple…

Descriptors: Test Results, Test Items, Testing, Aptitude Tests

Reliability and Validity Evidence for the GED® English as a Second Language Test. GED Testing Service® Research Studies, 2009-4

Setzer, J. Carl – GED Testing Service, 2009

The GED® English as a Second Language (GED ESL) Test was designed to serve as an adjunct to the GED test battery when an examinee takes either the Spanish- or French-language version of the tests. The GED ESL Test is a criterion-referenced, multiple-choice instrument that assesses the functional, English reading skills of adults whose first…

Descriptors: Language Tests, High School Equivalency Programs, Psychometrics, Reading Skills

Some Standard Errors in Item Response Theory.

Peer reviewed

Thissen, David; Wainer, Howard – Psychometrika, 1982

The mathematics required to calculate the asymptotic standard errors of the parameters of three commonly used logistic item response models is described and used to generate values for common situations. Difficulties in using maximum likelihood estimation with the three parameter model are discussed. (Author/JKS)

Descriptors: Error of Measurement, Item Analysis, Latent Trait Theory, Maximum Likelihood Statistics

The Applicability of Deadline Models: Comment on Glickman, Gray, and Morales (2005)

Peer reviewed

Rouder, Jeffrey N. – Psychometrika, 2005

Glickman, Gray, and Morales (this issue) propose a statistical model for measuring the unobserved latency of stimulus-controlled processes. The model accounts for both speed and accuracy and does so by assuming that participants set an internal deadline. If a stimulus-controlled response is not produced by the deadline, the participant then…

Descriptors: Models, Statistical Analysis, Stimuli, Response Style (Tests)

Estimating Generalizability to a Latent Variable Common to All of a Scale's Indicators: A Comparison of Estimators for Omega[subscript h]

Peer reviewed

Zinbarg, Richard E.; Yovel, Iftah; Revelle, William; McDonald, Roderick P. – Applied Psychological Measurement, 2006

The extent to which a scale score generalizes to a latent variable common to all of the scale's indicators is indexed by the scale's general factor saturation. Seven techniques for estimating this parameter--omega[hierarchical] (omega[subscript h])--are compared in a series of simulated data sets. Primary comparisons were based on 160 artificial…

Descriptors: Computation, Factor Analysis, Reliability, Correlation

Language, Dialect, and Register: Sociolinguistics and the Estimation of Measurement Error in the Testing of English Language Learners

Peer reviewed

Solano-Flores, Guillermo – Teachers College Record, 2006

This article examines the intersection of psychometrics and sociolinguistics in the testing of English language learners (ELLs); it discusses language, dialect, and register as sources of measurement error. Research findings show that the dialect of the language in which students are tested (e.g., local or standard English) is as important as…

Descriptors: Second Language Learning, Test Construction, Sociolinguistics, Psychometrics

Improving Measurement in Health Education and Health Behavior Research Using Item Response Modeling: Introducing Item Response Modeling

Peer reviewed

Wilson, Mark; Allen, Diane D.; Li, Jun Corser – Health Education Research, 2006

This paper is the first of several papers designed to demonstrate how the application of item response models in the behavioral sciences can be used to enhance the conceptual and technical toolkit of researchers and developers and to understand better the psychometric properties of psychosocial measures. The papers all use baseline data from the…

Descriptors: Health Education, Self Efficacy, Health Behavior, Behavior Modification
