ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	0
Since 2016 (last 10 years)	2
Since 2006 (last 20 years)	8

Descriptor

Achievement Tests	18
Item Response Theory	18
Testing Programs	18
Test Construction	9
Test Validity	9
Equated Scores	8
Scaling	8
Mathematics Tests	6
Academic Achievement	5
Elementary Secondary Education	5
Scoring	5
Test Reliability	5
Data Collection	4
Educational Assessment	4
Grade 8	4
Language Tests	4
National Programs	4
State Programs	4
Test Bias	4
Testing Problems	4
Common Core State Standards	3
Comparative Analysis	3
English	3
Error of Measurement	3
Grade 3	3

Source

New York State Education…	3
Applied Psychological…	2
Educational Measurement:…	2
Educational and Psychological…	1
Grantee Submission	1
Journal of Applied Testing…	1
Journal of Educational…	1

Publication Type

Journal Articles	7
Reports - Evaluative	7
Numerical/Quantitative Data	5
Reports - Descriptive	4
Reports - Research	4
Books	2
Information Analyses	2
Speeches/Meeting Papers	2
Collected Works - General	1
Opinion Papers	1
Tests/Questionnaires	1

Education Level

Early Childhood Education	3
Elementary Education	3
Grade 3	3
Grade 4	3
Grade 5	3
Grade 6	3
Grade 7	3
Grade 8	3
Intermediate Grades	3
Junior High Schools	3
Middle Schools	3
Primary Education	3
Secondary Education	3
Elementary Secondary Education	2
Higher Education	2

Location

New York

Assessments and Surveys

National Assessment of…	3
National Longitudinal Study…	1
North Carolina End of Course…	1
Trends in International…	1

Showing 1 to 15 of 18 results

Limited-Information Goodness-of-Fit Testing of Diagnostic Classification Item Response Models

Peer reviewed

Hansen, Mark; Cai, Li; Monroe, Scott; Li, Zhen – Grantee Submission, 2016

Despite the growing popularity of diagnostic classification models (e.g., Rupp, Templin, & Henson, 2010) in educational and psychological measurement, methods for testing their absolute goodness-of-fit to real data remain relatively underdeveloped. For tests of reasonable length and for realistic sample size, full-information test statistics…

Descriptors: Goodness of Fit, Item Response Theory, Classification, Maximum Likelihood Statistics

New York State Testing Program 2016: English Language Arts and Mathematics Grades 3-8. Technical Report

New York State Education Department, 2016

This technical report provides detailed information regarding the technical, statistical, and measurement attributes of the New York State Testing Program (NYSTP) for the Grades 3-8 Common Core English Language Arts (ELA) and Mathematics 2016 Operational Tests. This report includes information about test content and test development, item (i.e.,…

Descriptors: Testing Programs, English, Language Arts, Mathematics Tests

Applying Multidimensional Item Response Theory Models in Validating Test Dimensionality: An Example of K-12 Large-Scale Science Assessment

Peer reviewed

Li, Ying; Jiao, Hong; Lissitz, Robert W. – Journal of Applied Testing Technology, 2012

This study investigated the application of multidimensional item response theory (IRT) models to validate test structure and dimensionality. Multiple content areas or domains within a single subject often exist in large-scale achievement tests. Such areas or domains may cause multidimensionality or local item dependence, which both violate the…

Descriptors: Achievement Tests, Science Tests, Item Response Theory, Measures (Individuals)

New York State Testing Program 2015: English Language Arts and Mathematics Grades 3-8. Technical Report

New York State Education Department, 2015

This technical report provides detailed information regarding the technical, statistical, and measurement attributes of the New York State Testing Program (NYSTP) for the Grades 3-8 Common Core English Language Arts (ELA) and Mathematics 2015 Operational Tests. This report includes information about test content and test development, item (i.e.,…

Descriptors: Testing Programs, English, Language Arts, Mathematics Tests

New York State Testing Program 2014: English Language Arts and Mathematics Grades 3-8. Technical Report

New York State Education Department, 2014

This technical report provides detailed information regarding the technical, statistical, and measurement attributes of the New York State Testing Program (NYSTP) for the Grades 3-8 Common Core English Language Arts (ELA) and Mathematics 2014 Operational Tests. This report includes information about test content and test development, item (i.e.,…

Descriptors: Testing Programs, English, Language Arts, Mathematics Tests

A Comparison of Approaches for Improving the Reliability of Objective Level Scores

Peer reviewed

Skorupski, William P.; Carvajal, Jorge – Educational and Psychological Measurement, 2010

This study is an evaluation of the psychometric issues associated with estimating objective level scores, often referred to as "subscores." The article begins by introducing the concepts of reliability and validity for subscores from statewide achievement tests. These issues are discussed with reference to popular scaling techniques, classical…

Descriptors: Testing Programs, Test Validity, Achievement Tests, Scores

A Discussion of Population Invariance

Peer reviewed

Brennan, Robert L. – Applied Psychological Measurement, 2008

The discussion here covers five articles that are linked in the sense that they all treat population invariance. This discussion of population invariance is a somewhat broader treatment of the subject than simply a discussion of these five articles. In particular, occasional reference is made to publications other than those in this issue. The…

Descriptors: Advanced Placement, Law Schools, Science Achievement, Achievement Tests

A Discussion of Population Invariance of Equating

Peer reviewed

Petersen, Nancy S. – Applied Psychological Measurement, 2008

This article discusses the five studies included in this issue. Each article addressed the same topic, population invariance of equating. They all used data from major standardized testing programs, and they all used essentially the same statistics to evaluate their results, namely, the root mean square difference and root expected mean square…

Descriptors: Testing Programs, Standardized Tests, Equated Scores, Evaluation Methods

A Comparison of Developmental Scales Based on Thurstone Methods and Item Response Theory.

Peer reviewed

Williams, Valerie S. L.; Pommerich, Mary; Thissen, David – Journal of Educational Measurement, 1998

Created a developmental scale for the North Carolina End-of-Grade Mathematics Tests using a subset of identical test forms administered to adjacent grade levels with Thurstone scaling and Item Response Theory methods. Discusses differences in patterns produced. (Author/SLD)

Descriptors: Achievement Tests, Child Development, Comparative Analysis, Elementary Secondary Education

Technical Issues in Linking Assessments across Languages.

Sireci, Stephen G. – 1996

Test developers continue to struggle with the technical and logistical problems inherent in assessing achievement across different languages. Many testing programs offer separate language versions of a test to evaluate the achievement of examinees in different language groups. However, comparison of individuals who took different language versions…

Descriptors: Achievement Tests, Bilingual Education, Comparative Analysis, Educational Assessment

The NAEP Proficiency Scales: Do They Yield Valid Criterion-Referenced Interpretations? Iowa Testing Programs Occasional Papers Number 35, May 1990.

Forsyth, Robert A. – 1990

The validity of criterion-referenced interpretations of the proficiency scales of the National Assessment of Educational Progress (NAEP) is discussed. A major goal of the NAEP scales is to describe student achievement in specific content areas from grade 3 (age 9 years) through grade 11 (age 17 years). The numerical values for NAEP scales are…

Descriptors: Academic Achievement, Achievement Tests, Criterion Referenced Tests, Elementary Secondary Education

Probability in the Measure of Achievement.

Ingebo, George S. – 1997

This book shows the advantages of Rasch measurement (G. Rasch) for school district testing programs. The results of Rasch methods are contrasted with conventional statistics for assessing student responses to basic skills testing. Chapter 1 shows how the Rasch probability-based method produces measures that are more useful for students, parents,…

Descriptors: Academic Achievement, Achievement Tests, Elementary Secondary Education, Item Banks

Do NAEP Scales Yield Valid Criterion-Referenced Interpretations?

Peer reviewed

Forsyth, Robert A. – Educational Measurement: Issues and Practice, 1991

The scales of the National Assessment of Educational Progress (NAEP), as constructed, do not yield meaningful criterion-referenced interpretations. Poorly defined NAEP goals and the present knowledge base do not allow the measurement of what examinees can and cannot do. Inappropriate interpretations of NAEP data are discussed, with specific…

Descriptors: Achievement Tests, Criterion Referenced Tests, Educational Assessment, Item Response Theory

Effects of Item Order and Context on Estimation of NAEP Reading Proficiency.

Peer reviewed

Zwick, Rebecca – Educational Measurement: Issues and Practice, 1991

Item parameter estimates derived through item response theory methods have been considered relatively robust to changes in item position and context, but the anomaly in reading scores from the 1986 National Assessment of Educational Progress (NAEP) illustrates problems with common population equating procedures when there are test form changes.…

Descriptors: Achievement Tests, Context Effect, Equated Scores, Estimation (Mathematics)

Setting Standards for Performance Levels Using the Student-Based Constructed-Response Method.

Kahl, Stuart R.; And Others – 1995

The assessment instruments of the Maine Educational Assessment emphasize extended constructed-response questions. The results from these assessments are reported in terms of percentages of students at four performance levels. The Student-Based Constructed-Response Method was used to establish performance standards for these levels on the…

Descriptors: Academic Standards, Achievement Tests, Constructed Response, Cutting Scores

Author

Forsyth, Robert A.	2
Brennan, Robert L.	1
Cai, Li	1
Carvajal, Jorge	1
Hansen, Mark	1
Ingebo, George S.	1
Jiao, Hong	1
Kahl, Stuart R.	1
Klein, Thomas W.	1
Li, Ying	1
Li, Zhen	1
Linn, Robert L., Ed.	1
Lissitz, Robert W.	1
Monroe, Scott	1
Petersen, Nancy S.	1
Pommerich, Mary	1
Rock, Donald A.	1
Sireci, Stephen G.	1
Skorupski, William P.	1
Thissen, David	1
Williams, Valerie S. L.	1
Zwick, Rebecca	1