ERIC - Search Results

Showing 1 to 15 of 26 results

Maintaining Score Scales over Time: A Comparison of Five Scoring Methods

Peer reviewed

Kim, Stella Yun; Lee, Won-Chan – Applied Measurement in Education, 2023

This study evaluates various scoring methods including number-correct scoring, IRT theta scoring, and hybrid scoring in terms of scale-score stability over time. A simulation study was conducted to examine the relative performance of five scoring methods in terms of preserving the first two moments of scale scores for a population in a chain of…

Descriptors: Scoring, Comparative Analysis, Item Response Theory, Simulation

A Multidimensional Item Response Theory Model for Continuous and Graded Responses with Error in Persons and Items

Peer reviewed

Ferrando, Pere J.; Navarro-González, David – Educational and Psychological Measurement, 2021

Item response theory "dual" models (DMs) in which both items and individuals are viewed as sources of differential measurement error so far have been proposed only for unidimensional measures. This article proposes two multidimensional extensions of existing DMs: the M-DTCRM (dual Thurstonian continuous response model), intended for…

Descriptors: Item Response Theory, Error of Measurement, Models, Factor Analysis

Modeling of Item Response Functions under the D-Scoring Method

Peer reviewed

Dimitrov, Dimiter M. – Educational and Psychological Measurement, 2020

This study presents new models for item response functions (IRFs) in the framework of the D-scoring method (DSM) that is gaining attention in the field of educational and psychological measurement and large-scale assessments. In a previous work on DSM, the IRFs of binary items were estimated using a logistic regression model (LRM). However, the LRM…

Descriptors: Item Response Theory, Scoring, True Scores, Scaling

A Polytomous Scoring Approach to Handle Not-Reached Items in Low-Stakes Assessments

Peer reviewed

Gorgun, Guher; Bulut, Okan – Educational and Psychological Measurement, 2021

In low-stakes assessments, some students may not reach the end of the test and leave some items unanswered due to various reasons (e.g., lack of test-taking motivation, poor time management, and test speededness). Not-reached items are often treated as incorrect or not-administered in the scoring process. However, when the proportion of…

Descriptors: Scoring, Test Items, Response Style (Tests), Mathematics Tests

Imputation Methods to Deal with Missing Responses in Computerized Adaptive Multistage Testing

Peer reviewed

Cetin-Berber, Dee Duygu; Sari, Halil Ibrahim; Huggins-Manley, Anne Corinne – Educational and Psychological Measurement, 2019

Routing examinees to modules based on their ability level is a very important aspect in computerized adaptive multistage testing. However, the presence of missing responses may complicate estimation of examinee ability, which may result in misrouting of individuals. Therefore, missing responses should be handled carefully. This study investigated…

Descriptors: Computer Assisted Testing, Adaptive Testing, Error of Measurement, Research Problems

A Fair Comparison of the Performance of Computerized Adaptive Testing and Multistage Adaptive Testing

Wang, Keyin – ProQuest LLC, 2017

The comparison of item-level computerized adaptive testing (CAT) and multistage adaptive testing (MST) has been researched extensively (e.g., Kim & Plake, 1993; Luecht et al., 1996; Patsula, 1999; Jodoin, 2003; Hambleton & Xing, 2006; Keng, 2008; Zheng, 2012). Various CAT and MST designs have been investigated and compared under the same…

Descriptors: Comparative Analysis, Computer Assisted Testing, Adaptive Testing, Test Items

Development and Initial Field Test of the 2016 K-TEEM (Knowledge for Teaching Early Elementary Mathematics) Test. Research Report No. 2019-01

Schoen, Robert C.; Yang, Xiaotong; Tazaz, Amanda M.; Bray, Wendy S.; Farina, Kristy – Grantee Submission, 2019

The "2016 Knowledge for Teaching Early Elementary Mathematics" (2016 K-TEEM) test measures teachers' mathematical knowledge for teaching early elementary mathematics. The 2016 K-TEEM is the third version of the K-TEEM (Schoen, Bray, Wolfe, Tazaz, & Nielsen, 2017). In this report, we present results of the first large-scale field test…

Descriptors: Test Construction, Elementary School Mathematics, Elementary School Teachers, Knowledge Base for Teaching

Psychometric Report on the Knowledge for Teaching Elementary Fractions Test Administered to Elementary Educators in Six States in Spring 2017. Research Report No. 2018-13

Schoen, Robert C.; Yang, Xiaotong; Paek, Insu – Grantee Submission, 2018

This report provides evidence of the substantive and structural validity of the Knowledge for Teaching Elementary Fractions Test. Field-test data were gathered with a sample of 241 elementary educators, including teachers, administrators, and instructional support personnel, in spring 2017, as part of a larger study involving a multisite…

Descriptors: Psychometrics, Pedagogical Content Knowledge, Mathematics Tests, Mathematics Instruction

Psychometric Report for the Early Fractions Test (Version 2.2) Administered with Third- and Fourth-Grade Students in Spring 2017. Research Report No. 2017-11

Schoen, Robert C.; Yang, Xiaotong; Liu, Sicong; Paek, Insu – Grantee Submission, 2017

The Early Fractions Test v2.2 is a paper-pencil test designed to measure mathematics achievement of third- and fourth-grade students in the domain of fractions. The purpose, or intended use, of the Early Fractions Test v2.2 is to serve as a measure of student outcomes in a randomized trial designed to estimate the effect of an educational…

Descriptors: Psychometrics, Mathematics Tests, Mathematics Achievement, Fractions

Rater Language Background as a Source of Measurement Error in the Testing of English Language Learners

Peer reviewed

Kachchaf, Rachel; Solano-Flores, Guillermo – Applied Measurement in Education, 2012

We examined how rater language background affects the scoring of short-answer, open-ended test items in the assessment of English language learners (ELLs). Four native English and four native Spanish-speaking certified bilingual teachers scored 107 responses of fourth- and fifth-grade Spanish-speaking ELLs to mathematics items administered in…

Descriptors: Error of Measurement, English Language Learners, Scoring, Bilingual Teachers

Single- versus Double-Scoring of Trend Responses in Trend Score Equating with Constructed-Response Tests. Research Report. ETS RR-10-12

Tan, Xuan; Ricker, Kathryn L.; Puhan, Gautam – Educational Testing Service, 2010

This study examines the differences in equating outcomes between two trend score equating designs resulting from two different scoring strategies for trend scoring when operational constructed-response (CR) items are double-scored--the single group (SG) design, where each trend CR item is double-scored, and the nonequivalent groups with anchor…

Descriptors: Equated Scores, Scoring, Responses, Test Items

DIF Trees: Using Classification Trees to Detect Differential Item Functioning

Peer reviewed

Vaughn, Brandon K.; Wang, Qiu – Educational and Psychological Measurement, 2010

A nonparametric tree classification procedure is used to detect differential item functioning for items that are dichotomously scored. Classification trees are shown to be an alternative procedure to detect differential item functioning other than the use of traditional Mantel-Haenszel and logistic regression analysis. A nonparametric…

Descriptors: Test Bias, Classification, Nonparametric Statistics, Regression (Statistics)

Methods of Linking with Small Samples in a Common-Item Design: An Empirical Comparison. Research Report. ETS RR-09-38

Peer reviewed

Kim, Sooyeon; Livingston, Samuel A. – ETS Research Report Series, 2009

A series of resampling studies was conducted to compare the accuracy of equating in a common-item design using four different methods: chained equipercentile equating of smoothed distributions, chained linear equating, chained mean equating, and the circle-arc method. Four operational test forms, each containing more than 100 items, were used for…

Descriptors: Sampling, Sample Size, Accuracy, Test Items

Same-Form Retest Effects on Credentialing Examinations

Peer reviewed

Raymond, Mark R.; Neustel, Sandra; Anderson, Dan – Educational Measurement: Issues and Practice, 2009

Examinees who take high-stakes assessments are usually given an opportunity to repeat the test if they are unsuccessful on their initial attempt. To prevent examinees from obtaining unfair score increases by memorizing the content of specific test items, testing agencies usually assign a different test form to repeat examinees. The use of multiple…

Descriptors: Test Results, Test Items, Testing, Aptitude Tests

Effects of Assigning Raters to Items

Peer reviewed

Sykes, Robert C.; Ito, Kyoko; Wang, Zhen – Educational Measurement: Issues and Practice, 2008

Student responses to a large number of constructed response items in three Math and three Reading tests were scored on two occasions using three ways of assigning raters: single reader scoring, a different reader for each response (item-specific), and three readers each scoring a rater item block (RIB) containing approximately one-third of a…

Descriptors: Test Items, Mathematics Tests, Reading Tests, Scoring
