Showing 1 to 15 of 35 results
Peer reviewed
Bjermo, Jonas; Miller, Frank – Applied Measurement in Education, 2021
In recent years, interest in measuring growth in student ability in various subjects between different grades in school has increased. Good precision in the estimated growth is therefore important. This paper aims to compare estimation methods and test designs in terms of the precision and bias of the estimated growth of mean ability…
Descriptors: Scaling, Ability, Computation, Test Items
Peer reviewed
Carlson, James E. – ETS Research Report Series, 2017
In this paper, I consider a set of test items that are located in a multidimensional space, S_M, but lie along a curved line in S_M and can be scaled unidimensionally. Furthermore, I demonstrate a case in which the test items are administered across 6 levels, such as occurs in K-12 assessment across 6 grade…
Descriptors: Test Items, Item Response Theory, Difficulty Level, Scoring
Peer reviewed
Andrich, David; Marais, Ida; Humphry, Stephen Mark – Educational and Psychological Measurement, 2016
Recent research has shown how the statistical bias in Rasch model difficulty estimates induced by guessing in multiple-choice items can be eliminated. Using vertical scaling of a high-profile national reading test, it is shown that the dominant effect of removing such bias is a nonlinear change in the unit of scale across the continuum. The…
Descriptors: Guessing (Tests), Statistical Bias, Item Response Theory, Multiple Choice Tests
Peer reviewed
Attali, Yigal; Saldivia, Luis; Jackson, Carol; Schuppan, Fred; Wanamaker, Wilbur – ETS Research Report Series, 2014
Previous investigations of the ability of content experts and test developers to estimate item difficulty have, for the most part, produced disappointing results. These investigations were based on a noncomparative method of independently rating the difficulty of items. In this article, we argue that, by eliciting comparative judgments of…
Descriptors: Test Items, Difficulty Level, Comparative Analysis, College Entrance Examinations
Peer reviewed
France, Stephen L.; Batchelder, William H. – Educational and Psychological Measurement, 2015
Cultural consensus theory (CCT) is a data aggregation technique with many applications in the social and behavioral sciences. We describe the intuition and theory behind a set of CCT models for continuous type data using maximum likelihood inference methodology. We describe how bias parameters can be incorporated into these models. We introduce…
Descriptors: Maximum Likelihood Statistics, Test Items, Difficulty Level, Test Theory
Peer reviewed
Hopfenbeck, Therese N.; Lenkeit, Jenny; El Masri, Yasmine; Cantrell, Kate; Ryan, Jeanne; Baird, Jo-Anne – Scandinavian Journal of Educational Research, 2018
International large-scale assessments are on the rise, with the Programme for International Student Assessment (PISA) seen by many as having strategic prominence in education policy debates. The present article reviews PISA-related English-language peer-reviewed articles from the programme's first cycle in 2000 to its most current in 2015. Five…
Descriptors: Foreign Countries, Achievement Tests, International Assessment, Secondary School Students
Peer reviewed
Ye, Meng; Xin, Tao – Educational and Psychological Measurement, 2014
The authors explored the effects of drifting common items on vertical scaling within the higher order framework of item parameter drift (IPD). The results showed that if IPD occurred between a pair of test levels, the scaling performance started to deviate from the ideal state, as indicated by bias of scaling. When there were two items drifting…
Descriptors: Scaling, Test Items, Equated Scores, Achievement Gains
Peer reviewed
Lee, Hee-Sun; Liu, Ou Lydia; Pallant, Amy; Roohr, Katrina Crotts; Pryputniewicz, Sarah; Buck, Zoë E. – Journal of Research in Science Teaching, 2014
Though addressing sources of uncertainty is an important part of doing science, it has largely been neglected in assessing students' scientific argumentation. In this study, we initially defined a scientific argumentation construct in four structural elements consisting of claim, justification, uncertainty qualifier, and uncertainty…
Descriptors: Persuasive Discourse, Student Evaluation, High School Students, Science Tests
Peer reviewed
Arce, Alvaro J.; Wang, Ze – International Journal of Testing, 2012
The traditional approach to scale modified-Angoff cut scores transfers the raw cuts to an existing raw-to-scale score conversion table. Under the traditional approach, cut scores and conversion table raw scores are not only seen as interchangeable but also as originating from a common scaling process. In this article, we propose an alternative…
Descriptors: Generalizability Theory, Item Response Theory, Cutting Scores, Scaling
Peer reviewed
Hartig, Johannes; Frey, Andreas; Nold, Gunter; Klieme, Eckhard – Educational and Psychological Measurement, 2012
The article compares three different methods to estimate effects of task characteristics and to use these estimates for model-based proficiency scaling: prediction of item difficulties from the Rasch model, the linear logistic test model (LLTM), and an LLTM including random item effects (LLTM+e). The methods are applied to empirical data from a…
Descriptors: Item Response Theory, Models, Methods, Computation
Irvin, P. Shawn; Saven, Jessica L.; Alonzo, Julie; Park, Bitnara Jasmine; Anderson, Daniel; Tindal, Gerald – Behavioral Research and Teaching, 2012
The results of formative assessments are regularly used to inform important instructional decisions (e.g., targeted intervention) within a response to intervention (RTI) system of teaching and learning. The validity of such instructional decision-making depends, in part, on the alignment between formative measures and the academic content…
Descriptors: Elementary School Mathematics, Curriculum Based Assessment, Mathematics Tests, Academic Standards
Irvin, P. Shawn; Saven, Jessica L.; Alonzo, Julie; Park, Bitnara Jasmine; Anderson, Daniel; Tindal, Gerald – Behavioral Research and Teaching, 2012
The results of formative assessments are regularly used to inform important instructional decisions (e.g., targeted intervention) within a response to intervention (RTI) system of teaching and learning. The validity of such instructional decision-making depends, in part, on the alignment between formative measures and the academic content…
Descriptors: Elementary School Mathematics, Curriculum Based Assessment, Mathematics Tests, Academic Standards
Saven, Jessica L.; Irvin, P. Shawn; Park, Bitnara Jasmine; Alonzo, Julie; Anderson, Daniel; Tindal, Gerald – Behavioral Research and Teaching, 2012
The results of formative assessments are regularly used to inform important instructional decisions (e.g., targeted intervention) within a response to intervention (RTI) system of teaching and learning. The validity of such instructional decision-making depends, in part, on the alignment between formative measures and the academic content…
Descriptors: Elementary School Mathematics, Curriculum Based Assessment, Mathematics Tests, Academic Standards
Saven, Jessica L.; Irvin, P. Shawn; Park, Bitnara Jasmine; Alonzo, Julie; Anderson, Daniel; Tindal, Gerald – Behavioral Research and Teaching, 2012
The results of formative assessments are regularly used to inform important instructional decisions (e.g., targeted intervention) within a response to intervention (RTI) system of teaching and learning. The validity of such instructional decision-making depends, in part, on the alignment between formative measures and the academic content…
Descriptors: Elementary School Mathematics, Curriculum Based Assessment, Mathematics Tests, Academic Standards
Irvin, P. Shawn; Saven, Jessica L.; Alonzo, Julie; Park, Bitnara Jasmine; Anderson, Daniel; Tindal, Gerald – Behavioral Research and Teaching, 2012
The results of formative assessments are regularly used to inform important instructional decisions (e.g., targeted intervention) within a response to intervention (RTI) system of teaching and learning. The validity of such instructional decision-making depends, in part, on the alignment between formative measures and the academic content…
Descriptors: Elementary School Mathematics, Curriculum Based Assessment, Mathematics Tests, Academic Standards