ERIC - Search Results

Showing all 11 results

Comparing Measurement Reliability Estimation Techniques: Correlation Coefficient vs. Bland-Altman Plot

Peer reviewed

Tülin Otbiçer Acar – Measurement: Interdisciplinary Research and Perspectives, 2024

The aim of this study is to compare the results of correlation coefficient estimation of reliability with those obtained through the Bland-Altman plot technique. The scale was first divided into two halves using three different approaches. A linear and high-level relationship was found between the scale scores obtained from the halved forms.…

Descriptors: High School Students, Measurement Techniques, Psychometrics, Comparative Testing

Growth across Grades and Common Item Grade Alignment in Vertical Scaling Using the Rasch Model

Peer reviewed

Sanford R. Student; Derek C. Briggs; Laurie Davis – Educational Measurement: Issues and Practice, 2025

Vertical scales are frequently developed using common item nonequivalent group linking. In this design, one can use upper-grade, lower-grade, or mixed-grade common items to estimate the linking constants that underlie the absolute measurement of growth. Using the Rasch model and a dataset from Curriculum Associates' i-Ready Diagnostic in math in…

Descriptors: Elementary School Mathematics, Elementary School Students, Middle School Mathematics, Middle School Students

Effect of Violating Unidimensional Item Response Theory Vertical Scaling Assumptions on Developmental Score Scales

Topczewski, Anna Marie – ProQuest LLC, 2013

Developmental score scales represent the performance of students along a continuum, where as students learn more they move higher along that continuum. Unidimensional item response theory (UIRT) vertical scaling has become a commonly used method to create developmental score scales. Research has shown that UIRT vertical scaling methods can be…

Descriptors: Item Response Theory, Scaling, Scores, Student Development

Multidimensional Computerized Adaptive Testing for Indonesia Junior High School Biology

Peer reviewed

Kuo, Bor-Chen; Daud, Muslem; Yang, Chih-Wei – EURASIA Journal of Mathematics, Science & Technology Education, 2015

This paper describes a curriculum-based multidimensional computerized adaptive test that was developed for Indonesia junior high school Biology. In adherence to the Indonesian curriculum of different Biology dimensions, 300 items were constructed and then administered to 2238 students. A multidimensional random coefficients multinomial logit model was…

Descriptors: Secondary School Science, Science Education, Science Tests, Computer Assisted Testing

New York State Testing Program 2016: English Language Arts and Mathematics Grades 3-8. Technical Report

New York State Education Department, 2016

This technical report provides detailed information regarding the technical, statistical, and measurement attributes of the New York State Testing Program (NYSTP) for the Grades 3-8 Common Core English Language Arts (ELA) and Mathematics 2016 Operational Tests. This report includes information about test content and test development, item (i.e.,…

Descriptors: Testing Programs, English, Language Arts, Mathematics Tests

New York State Testing Program 2015: English Language Arts and Mathematics Grades 3-8. Technical Report

New York State Education Department, 2015

This technical report provides detailed information regarding the technical, statistical, and measurement attributes of the New York State Testing Program (NYSTP) for the Grades 3-8 Common Core English Language Arts (ELA) and Mathematics 2015 Operational Tests. This report includes information about test content and test development, item (i.e.,…

Descriptors: Testing Programs, English, Language Arts, Mathematics Tests

New York State Testing Program 2014: English Language Arts and Mathematics Grades 3-8. Technical Report

New York State Education Department, 2014

This technical report provides detailed information regarding the technical, statistical, and measurement attributes of the New York State Testing Program (NYSTP) for the Grades 3-8 Common Core English Language Arts (ELA) and Mathematics 2014 Operational Tests. This report includes information about test content and test development, item (i.e.,…

Descriptors: Testing Programs, English, Language Arts, Mathematics Tests

Scaling: An Items Module

Peer reviewed

Tong, Ye; Kolen, Michael J. – Educational Measurement: Issues and Practice, 2010

"Scaling" is the process of constructing a score scale that associates numbers or other ordered indicators with the performance of examinees. Scaling typically is conducted to aid users in interpreting test results. This module describes different types of raw scores and scale scores, illustrates how to incorporate various sources of…

Descriptors: Test Results, Scaling, Measures (Individuals), Raw Scores

Tests in Europe: Where We Are and Where We Should Go

Peer reviewed

Elosua, Paula; Iliescu, Dragos – International Journal of Testing, 2012

Psychometric practice does not always converge with the advances of psychometric theory. In order to investigate this gap, the authors focus on the 10 most used psychological tests in Europe, as identified by recent surveys. The article analyzes test manuals published in 6 different European countries for these 10 most used tests. A total of 32…

Descriptors: Psychological Testing, Personality Measures, Error of Measurement, Foreign Countries

A Process for Testing a Mathematical Model for the Solution of a Practical Problem: Application to Test Equating. LES Paper on Learning and Teaching. Paper #79.

Douglass, James B. – 1979

A general process for testing the feasibility of applying alternative mathematical or statistical models to the solution of a practical problem is presented and flowcharted. The system is used to develop a plan to compare models for test equating. The five alternative models to be considered for equating are: (1) anchor test equating using…

Descriptors: Equated Scores, Error of Measurement, Latent Trait Theory, Mathematical Models

Out-of-Level Testing for Special Education Students with Mild Learning Handicaps.

Jones, Eric D.; And Others – 1983

The purpose of this study was to evaluate the utility of out-of-level testing (OLT) when it is applied to the assessment of special education students with mild learning handicaps. This evaluation of OLT involved testing hypotheses related to: (1) the adequacy of vertical scaling, (2) the reliability and (3) the validity of OLT scores. Fifty-eight…

Descriptors: Educational Diagnosis, Error of Measurement, Guessing (Tests), Intermediate Grades