Showing 1 to 15 of 21 results
Peer reviewed
Stefanie A. Wind; Yangmeng Xu – Educational Assessment, 2024
We explored three approaches to resolving or re-scoring constructed-response items in mixed-format assessments: rater agreement, person fit, and targeted double scoring (TDS). We used a simulation study to consider how the three approaches impact the psychometric properties of student achievement estimates, with an emphasis on person fit. We found…
Descriptors: Interrater Reliability, Error of Measurement, Evaluation Methods, Examiners
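A minimal sketch of the first resolution approach mentioned above (rater agreement): when two human ratings of a constructed-response item disagree by more than a tolerance, the response is routed to a third, resolving score. The function names, tolerance, and averaging rule are assumptions for illustration, not the procedures evaluated in the study.

def needs_resolution(rating_a, rating_b, tolerance=1):
    # Flag the response when the two ratings differ by more than the tolerance.
    return abs(rating_a - rating_b) > tolerance

def resolve(rating_a, rating_b, third_rating=None):
    # Adjacent ratings are averaged; discrepant ratings defer to a third score.
    if not needs_resolution(rating_a, rating_b):
        return (rating_a + rating_b) / 2
    if third_rating is None:
        raise ValueError("discrepant ratings require a third, resolving score")
    return third_rating

print(resolve(2, 4, third_rating=3))   # discrepant pair -> 3
print(resolve(3, 4))                   # adjacent pair   -> 3.5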
Peer reviewed
Lottridge, Sue; Burkhardt, Amy; Boyer, Michelle – Educational Measurement: Issues and Practice, 2020
In this digital ITEMS module, Dr. Sue Lottridge, Amy Burkhardt, and Dr. Michelle Boyer provide an overview of automated scoring. Automated scoring is the use of computer algorithms to score unconstrained open-ended test items by mimicking human scoring. The use of automated scoring is increasing in educational assessment programs because it allows…
Descriptors: Computer Assisted Testing, Scoring, Automation, Educational Assessment
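As a rough, hedged illustration of "mimicking human scoring," the sketch below fits a simple linear model that maps two surface features of a response (word count and rubric-keyword count, both assumptions chosen for illustration) to human-assigned scores and then scores a new response; the operational engines described in the module use far richer features and models.

import numpy as np

# Toy automated-scoring sketch: predict human scores from two surface features
# of each response (word count and count of rubric keywords). Features, data,
# and the linear model are illustrative assumptions only.
KEYWORDS = {"evidence", "because", "therefore"}

def features(response):
    words = response.lower().split()
    return [len(words), sum(w.strip(".,") in KEYWORDS for w in words)]

responses = [
    "Plants grow because sunlight provides energy, therefore leaves stay green.",
    "Plants are green.",
    "The evidence shows plants need light because photosynthesis requires it.",
]
human_scores = np.array([2.0, 0.0, 2.0])

X = np.column_stack([np.ones(len(responses)),
                     [features(r) for r in responses]])
coef, *_ = np.linalg.lstsq(X, human_scores, rcond=None)

new_response = "Leaves are green because chlorophyll absorbs light."
predicted = np.array([1.0] + features(new_response)) @ coef
print(round(float(predicted), 2))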
Peer reviewed
Dimitrov, Dimiter M. – Educational and Psychological Measurement, 2020
This study presents new models for item response functions (IRFs) in the framework of the D-scoring method (DSM) that is gaining attention in the field of educational and psychological measurement and large-scale assessments. In a previous work on DSM, the IRFs of binary items were estimated using a logistic regression model (LRM). However, the LRM…
Descriptors: Item Response Theory, Scoring, True Scores, Scaling
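The abstract above notes that, in earlier DSM work, the item response function of a binary item was estimated with a logistic regression model; a minimal sketch of that kind of IRF is below. The intercept and slope values are invented for illustration, and the sketch does not reproduce the new models the article proposes.

import math

# Sketch of a logistic-regression item response function: the probability of
# a correct (1) response as a function of an examinee's score d, with
# illustrative (made-up) intercept and slope parameters.
def logistic_irf(d, intercept=-2.0, slope=4.0):
    return 1.0 / (1.0 + math.exp(-(intercept + slope * d)))

# D-scores are typically scaled 0..1; print the IRF at a few points.
for d in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"d = {d:.2f}  P(correct) = {logistic_irf(d):.3f}")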
Beula M. Magimairaj; Philip Capin; Sandra L. Gillam; Sharon Vaughn; Greg Roberts; Anna-Maria Fall; Ronald B. Gillam – Grantee Submission, 2022
Purpose: Our aim was to evaluate the psychometric properties of the online administered format of the Test of Narrative Language--Second Edition (TNL-2; Gillam & Pearson, 2017), given the importance of assessing children's narrative ability and the considerable absence of psychometric studies of spoken language assessments administered online.…
Descriptors: Computer Assisted Testing, Language Tests, Story Telling, Language Impairments
Beula M. Magimairaj; Philip Capin; Sandra L. Gillam; Sharon Vaughn; Greg Roberts; Anna-Maria Fall; Ronald B. Gillam – Language, Speech, and Hearing Services in Schools, 2022
Purpose: Our aim was to evaluate the psychometric properties of the online administered format of the Test of Narrative Language--Second Edition (TNL-2; Gillam & Pearson, 2017), given the importance of assessing children's narrative ability and the considerable absence of psychometric studies of spoken language assessments administered online.…
Descriptors: Computer Assisted Testing, Language Tests, Story Telling, Language Impairments
Peer reviewed
Grabovsky, Irina; Wainer, Howard – Journal of Educational and Behavioral Statistics, 2017
In this essay, we describe the construction and use of the Cut-Score Operating Function in aiding standard setting decisions. The Cut-Score Operating Function shows the relation between the cut-score chosen and the consequent error rate. It allows error rates to be defined by multiple loss functions and will show the behavior of each loss…
Descriptors: Cutting Scores, Standard Setting (Scoring), Decision Making, Error Patterns
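A hedged sketch of the idea: for each candidate cut score, classify simulated examinees as passing or failing, compare that decision against their true status, and combine false-pass and false-fail rates under chosen loss weights. The simulated score distribution, true standard, and loss weights below are assumptions for illustration, not the article's formulation.

import random

random.seed(0)

# Sketch of a cut-score operating curve: expected loss as a function of the
# chosen cut score, where loss = W_FP * P(false pass) + W_FN * P(false fail).
# True abilities, observed scores, and loss weights are illustrative assumptions.
examinees = []
for _ in range(5000):
    true_ability = random.gauss(0.0, 1.0)
    observed = true_ability + random.gauss(0.0, 0.5)   # score with measurement error
    examinees.append((true_ability, observed))

TRUE_STANDARD = 0.0      # examinees with true ability >= 0 "deserve" to pass
W_FP, W_FN = 1.0, 2.0    # failing a deserving examinee is weighted twice as heavily

def expected_loss(cut):
    false_pass = sum(1 for t, x in examinees if x >= cut and t < TRUE_STANDARD)
    false_fail = sum(1 for t, x in examinees if x < cut and t >= TRUE_STANDARD)
    return (W_FP * false_pass + W_FN * false_fail) / len(examinees)

for cut in (-0.5, -0.25, 0.0, 0.25, 0.5):
    print(f"cut = {cut:+.2f}   expected loss = {expected_loss(cut):.3f}")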
Schoen, Robert C.; Yang, Xiaotong; Paek, Insu – Grantee Submission, 2018
This report provides evidence of the substantive and structural validity of the Knowledge for Teaching Elementary Fractions Test. Field-test data were gathered with a sample of 241 elementary educators, including teachers, administrators, and instructional support personnel, in spring 2017, as part of a larger study involving a multisite…
Descriptors: Psychometrics, Pedagogical Content Knowledge, Mathematics Tests, Mathematics Instruction
Schoen, Robert C.; Yang, Xiaotong; Liu, Sicong; Paek, Insu – Grantee Submission, 2017
The Early Fractions Test v2.2 is a paper-pencil test designed to measure mathematics achievement of third- and fourth-grade students in the domain of fractions. The purpose, or intended use, of the Early Fractions Test v2.2 is to serve as a measure of student outcomes in a randomized trial designed to estimate the effect of an educational…
Descriptors: Psychometrics, Mathematics Tests, Mathematics Achievement, Fractions
Peer reviewed
Methe, Scott A.; Briesch, Amy M.; Hulac, David – Assessment for Effective Intervention, 2015
At present, it is unclear whether math curriculum-based measurement (M-CBM) procedures provide a dependable measure of student progress in math computation because support for its technical properties is based largely upon a body of correlational research. Recent investigations into the dependability of M-CBM scores have found that evaluating…
Descriptors: Measurement Techniques, Error of Measurement, Mathematics Curriculum, Curriculum Based Assessment
Peer reviewed
Yarnell, Jordy B.; Pfeiffer, Steven I. – Journal of Psychoeducational Assessment, 2015
The present study examined the psychometric equivalence of administering a computer-based version of the Gifted Rating Scale (GRS) compared with the traditional paper-and-pencil GRS-School Form (GRS-S). The GRS-S is a teacher-completed rating scale used in gifted assessment. The GRS-Electronic Form provides an alternative method of administering…
Descriptors: Gifted, Psychometrics, Rating Scales, Computer Assisted Testing
Peer reviewed
Yuan, Ke-Hai; Zhang, Zhiyong – Psychometrika, 2012
The paper develops a two-stage robust procedure for structural equation modeling (SEM) and an R package "rsem" to facilitate the use of the procedure by applied researchers. In the first stage, M-estimates of the saturated mean vector and covariance matrix of all variables are obtained. Those corresponding to the substantive variables…
Descriptors: Structural Equation Models, Tests, Federal Aid, Psychometrics
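rsem itself is an R package; as a language-neutral illustration of the first stage described above, the sketch below computes Huber-type M-estimates of a saturated mean vector and covariance matrix by iteratively downweighting cases with large Mahalanobis distances. The weighting scheme and tuning constant are assumptions, not the package's implementation, and the resulting estimates would be passed to an SEM fitting routine in the second stage.

import numpy as np

# First-stage sketch: robust (Huber-type) M-estimates of the saturated mean
# vector and covariance matrix, obtained by iteratively downweighting cases
# with large Mahalanobis distances. Tuning constant and weights are
# illustrative assumptions, not rsem's implementation.
def robust_mean_cov(data, c=2.5, n_iter=20):
    x = np.asarray(data, dtype=float)
    mu, cov = x.mean(axis=0), np.cov(x, rowvar=False)
    for _ in range(n_iter):
        diff = x - mu
        d = np.sqrt(np.sum(diff @ np.linalg.inv(cov) * diff, axis=1))
        w = np.where(d <= c, 1.0, c / d)            # Huber-type case weights
        mu = (w[:, None] * x).sum(axis=0) / w.sum()
        diff = x - mu
        cov = (w[:, None] * diff).T @ diff / w.sum()
    return mu, cov

rng = np.random.default_rng(0)
clean = rng.multivariate_normal([0, 0], [[1.0, 0.5], [0.5, 1.0]], size=200)
outliers = rng.multivariate_normal([6, -6], np.eye(2), size=10)
mu, cov = robust_mean_cov(np.vstack([clean, outliers]))
print(np.round(mu, 2))
print(np.round(cov, 2))
# In the second stage, these robust estimates would feed an SEM fitting routine.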
Peer reviewed
Gagnon, Robert; Lubarsky, Stuart; Lambert, Carole; Charlin, Bernard – Advances in Health Sciences Education, 2011
The Script Concordance Test (SCT) uses a panel-based, aggregate scoring method that aims to capture the variability of responses of experienced practitioners to particular clinical situations. The use of this type of scoring method is a key determinant of the tool's discriminatory power, but deviant answers could potentially diminish the…
Descriptors: Expertise, Oncology, Scoring, Error of Measurement
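A minimal sketch of panel-based aggregate scoring as commonly described for the SCT: credit for a response is proportional to the number of panelists who chose it, with the modal response earning full credit. The panel answers and normalization below are illustrative assumptions, not the study's data.

from collections import Counter

# Sketch of aggregate (panel-based) SCT scoring: credit for a response equals
# the number of panelists choosing it divided by the count for the modal
# response. Panel answers below are made up for illustration.
def sct_item_key(panel_answers):
    counts = Counter(panel_answers)
    modal = max(counts.values())
    return {resp: n / modal for resp, n in counts.items()}

panel = [+1, +1, +1, 0, 0, -1]        # six panelists on a -2..+2 Likert item
key = sct_item_key(panel)

for examinee_response in (+1, 0, -2):
    print(examinee_response, "->", round(key.get(examinee_response, 0.0), 3))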
Peer reviewed
Raymond, Mark R.; Neustel, Sandra; Anderson, Dan – Educational Measurement: Issues and Practice, 2009
Examinees who take high-stakes assessments are usually given an opportunity to repeat the test if they are unsuccessful on their initial attempt. To prevent examinees from obtaining unfair score increases by memorizing the content of specific test items, testing agencies usually assign a different test form to repeat examinees. The use of multiple…
Descriptors: Test Results, Test Items, Testing, Aptitude Tests
Kolakowski, Donald – 1972
Empirical results are presented as regards the implementation of a latent-trait psychometric model by means of conditional maximum likelihood estimation. Items are scored polychotomously into varying numbers of nominal categories and the test and item characteristic curves and information functions are examined. It is concluded that scoring items…
Descriptors: Error of Measurement, Item Analysis, Item Sampling, Measurement Techniques
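As a sketch of what polychotomous scoring into nominal categories can look like, the code below computes softmax-style category response curves for one item (a Bock-type nominal model) at a few ability values. The slope and intercept parameters are made up for illustration, and the sketch does not implement the conditional maximum likelihood estimation the report examines.

import math

# Sketch of category response curves for an item scored polychotomously into
# nominal categories: P(category k | theta) is a softmax over linear logits.
# Slope and intercept parameters are illustrative assumptions.
def category_probs(theta, slopes, intercepts):
    logits = [a * theta + c for a, c in zip(slopes, intercepts)]
    m = max(logits)                              # stabilize the exponentials
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

slopes = [0.0, 1.0, 2.0]       # one slope per nominal category
intercepts = [0.0, 0.5, -1.0]
for theta in (-2, 0, 2):
    probs = category_probs(theta, slopes, intercepts)
    print(theta, [round(p, 3) for p in probs])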
Drasgow, Fritz; Levine, Michael V. – 1985
Optimal appropriateness indices, recently introduced by Levine and Drasgow (1984), provide the highest rates of detection of aberrant response patterns that can be obtained from item responses. These optimal appropriateness indices are used to study three important problems in appropriateness measurement. First, the maximum detection rates of two…
Descriptors: Error of Measurement, Latent Trait Theory, Mathematical Models, Maximum Likelihood Statistics