Hansen, Mark; Cai, Li; Monroe, Scott; Li, Zhen – Grantee Submission, 2016
Despite the growing popularity of diagnostic classification models (e.g., Rupp, Templin, & Henson, 2010) in educational and psychological measurement, methods for testing their absolute goodness-of-fit to real data remain relatively underdeveloped. For tests of reasonable length and for realistic sample size, full-information test statistics…
Descriptors: Goodness of Fit, Item Response Theory, Classification, Maximum Likelihood Statistics
New York State Education Department, 2016
This technical report provides detailed information regarding the technical, statistical, and measurement attributes of the New York State Testing Program (NYSTP) for the Grades 3-8 Common Core English Language Arts (ELA) and Mathematics 2016 Operational Tests. This report includes information about test content and test development, item (i.e.,…
Descriptors: Testing Programs, English, Language Arts, Mathematics Tests
Li, Ying; Jiao, Hong; Lissitz, Robert W. – Journal of Applied Testing Technology, 2012
This study investigated the application of multidimensional item response theory (IRT) models to validate test structure and dimensionality. Multiple content areas or domains within a single subject often exist in large-scale achievement tests. Such areas or domains may cause multidimensionality or local item dependence, which both violate the…
Descriptors: Achievement Tests, Science Tests, Item Response Theory, Measures (Individuals)
New York State Education Department, 2015
This technical report provides detailed information regarding the technical, statistical, and measurement attributes of the New York State Testing Program (NYSTP) for the Grades 3-8 Common Core English Language Arts (ELA) and Mathematics 2015 Operational Tests. This report includes information about test content and test development, item (i.e.,…
Descriptors: Testing Programs, English, Language Arts, Mathematics Tests
New York State Education Department, 2014
This technical report provides detailed information regarding the technical, statistical, and measurement attributes of the New York State Testing Program (NYSTP) for the Grades 3-8 Common Core English Language Arts (ELA) and Mathematics 2014 Operational Tests. This report includes information about test content and test development, item (i.e.,…
Descriptors: Testing Programs, English, Language Arts, Mathematics Tests
Skorupski, William P.; Carvajal, Jorge – Educational and Psychological Measurement, 2010
This study is an evaluation of the psychometric issues associated with estimating objective level scores, often referred to as "subscores." The article begins by introducing the concepts of reliability and validity for subscores from statewide achievement tests. These issues are discussed with reference to popular scaling techniques, classical…
Descriptors: Testing Programs, Test Validity, Achievement Tests, Scores
Brennan, Robert L. – Applied Psychological Measurement, 2008
The discussion here covers five articles that are linked in the sense that they all treat population invariance. This discussion of population invariance is a somewhat broader treatment of the subject than simply a discussion of these five articles. In particular, occasional reference is made to publications other than those in this issue. The…
Descriptors: Advanced Placement, Law Schools, Science Achievement, Achievement Tests
Petersen, Nancy S. – Applied Psychological Measurement, 2008
This article discusses the five studies included in this issue. Each article addressed the same topic, population invariance of equating. They all used data from major standardized testing programs, and they all used essentially the same statistics to evaluate their results, namely, the root mean square difference and root expected mean square…
Descriptors: Testing Programs, Standardized Tests, Equated Scores, Evaluation Methods

Williams, Valerie S. L.; Pommerich, Mary; Thissen, David – Journal of Educational Measurement, 1998
Created a developmental scale for the North Carolina End-of-Grade Mathematics Tests using a subset of identical test forms administered to adjacent grade levels with Thurstone scaling and Item Response Theory methods. Discusses differences in patterns produced. (Author/SLD)
Descriptors: Achievement Tests, Child Development, Comparative Analysis, Elementary Secondary Education
Sireci, Stephen G. – 1996
Test developers continue to struggle with the technical and logistical problems inherent in assessing achievement across different languages. Many testing programs offer separate language versions of a test to evaluate the achievement of examinees in different language groups. However, comparison of individuals who took different language versions…
Descriptors: Achievement Tests, Bilingual Education, Comparative Analysis, Educational Assessment
Forsyth, Robert A. – 1990
The validity of criterion-referenced interpretations of the proficiency scales of the National Assessment of Educational Progress (NAEP) is discussed. A major goal of the NAEP scales is to describe student achievement in specific content areas from grade 3 (age 9 years) through grade 11 (age 17 years). The numerical values for NAEP scales are…
Descriptors: Academic Achievement, Achievement Tests, Criterion Referenced Tests, Elementary Secondary Education
Ingebo, George S. – 1997
This book shows the advantages of Rasch measurement (G. Rasch) for school district testing programs. The results of Rasch methods are contrasted with conventional statistics for assessing student responses to basic skills testing. Chapter 1 shows how the Rasch probability-based method produces measures that are more useful for students, parents,…
Descriptors: Academic Achievement, Achievement Tests, Elementary Secondary Education, Item Banks

Forsyth, Robert A. – Educational Measurement: Issues and Practice, 1991
The scales of the National Assessment of Educational Progress (NAEP), as constructed, do not yield meaningful criterion-referenced interpretations. Poorly defined NAEP goals and the present knowledge base do not allow the measurement of what examinees can and cannot do. Inappropriate interpretations of NAEP data are discussed, with specific…
Descriptors: Achievement Tests, Criterion Referenced Tests, Educational Assessment, Item Response Theory

Zwick, Rebecca – Educational Measurement: Issues and Practice, 1991
Item parameter estimates derived through item response theory methods have been considered relatively robust to changes in item position and context, but the anomaly in reading scores from the 1986 National Assessment of Educational Progress (NAEP) illustrates problems with common population equating procedures when there are test form changes.…
Descriptors: Achievement Tests, Context Effect, Equated Scores, Estimation (Mathematics)
Kahl, Stuart R.; And Others – 1995
The assessment instruments of the Maine Educational Assessment emphasize extended constructed-response questions. The results from these assessments are reported in terms of percentages of students at four performance levels. The Student-Based Constructed-Response Method was used to establish performance standards for these levels on the…
Descriptors: Academic Standards, Achievement Tests, Constructed Response, Cutting Scores