NotesFAQContact Us
Collection
Advanced
Search Tips
What Works Clearinghouse Rating
Showing 16 to 30 of 1,166 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Alison M. Hardy; J. E. Miller; Sarah R. Powell; Nancy Scammacca – Psychology in the Schools, 2025
Students encounter hundreds of word problems throughout the elementary grades and on standardized assessments through high school. To demonstrate proficiency on these measures of mathematics competency, students must be skilled in solving word problems. Early detection of word-problem difficulty is essential, and screeners play an important role…
Descriptors: Elementary School Students, Word Problems (Mathematics), Mathematics Education, Secondary School Students
Yvette Jackson – ProQuest LLC, 2023
Rater-mediated activities in educational research occur when an expert judge or rater utilizes an instrument to judge persons or items and generates scale scores. Scale scores are from a subjective judgment and must undergo a quality control measure called rating quality. Rating quality in this study is broadly defined as the extent to which…
Descriptors: Educational Research, Evaluators, Test Theory, Item Response Theory
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Basman, Munevver – International Journal of Assessment Tools in Education, 2023
To ensure the validity of the tests is to check that all items have similar results across different groups of individuals. However, differential item functioning (DIF) occurs when the results of individuals with equal ability levels from different groups differ from each other on the same test item. Based on Item Response Theory and Classic Test…
Descriptors: Test Bias, Test Items, Test Validity, Item Response Theory
Peer reviewed Peer reviewed
Direct linkDirect link
Xian Wu; Yaoguang Li; Sanjay N. Rebello – Physical Review Physics Education Research, 2025
The Energy and Momentum Conceptual Survey (EMCS) is an instrument designed to assess students' understanding of key physics concepts in introductory mechanics. In this study, we applied a combination of classical test theory (CTT), item response theory (IRT), and exploratory factor analysis (EFA) to a dataset of over 10 000 students collected…
Descriptors: Physics, Scientific Concepts, Concept Formation, Energy
Peer reviewed Peer reviewed
Direct linkDirect link
Becker, Kirk A.; Kao, Shu-chuan – Journal of Applied Testing Technology, 2022
Natural Language Processing (NLP) offers methods for understanding and quantifying the similarity between written documents. Within the testing industry these methods have been used for automatic item generation, automated scoring of text and speech, modeling item characteristics, automatic question answering, machine translation, and automated…
Descriptors: Item Banks, Natural Language Processing, Computer Assisted Testing, Scoring
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Mehmet Fatih Doguyurt; Seref Tan – International Journal of Assessment Tools in Education, 2025
This study investigates the impact of violating the local item independence assumption by loading certain items onto a second dimension on test equating errors in unidimensional and dichotomous tests. The research was designed as a simulation study, using data generated based on the PISA 2018 mathematics exam. Analyses were conducted under 36…
Descriptors: Equated Scores, Test Items, Mathematics Tests, International Assessment
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Kartianom Kartianom; Heri Retnawati; Kana Hidayati – Journal of Pedagogical Research, 2024
Conducting a fair test is important for educational research. Unfair assessments can lead to gender disparities in academic achievement, ultimately resulting in disparities in opportunities, wages, and career choice. Differential Item Function [DIF] analysis is presented to provide evidence of whether the test is truly fair, where it does not harm…
Descriptors: Foreign Countries, Test Bias, Item Response Theory, Test Theory
Peer reviewed Peer reviewed
Direct linkDirect link
Diana Muela-Bermejo; Irene Mendoza-Cercadillo; Lucía Hernández-Heras – Journal of Adolescent & Adult Literacy, 2024
This study involves translating, cross-culturally adapting, and validating the "Literary Response Questionnaire" (LRQ) for 413 Spanish adolescents. It explores the evolution of literary education in Spain and its alignment with the Reading Responses paradigm. The LRQ, adapted across various locations, is validated in Spanish through…
Descriptors: Reader Response, Adolescents, Questionnaires, Translation
Peer reviewed Peer reviewed
Direct linkDirect link
Richards, Adam S. – Communication Teacher, 2021
Course: Communication Research Methods. Objectives: This activity provides students with an experiential introduction to measurement theory and the methods for assessing measurement reliability. First, multiple measurements of a person's height are interpreted according to classical test theory. Second, the measurement of human height is used as…
Descriptors: Body Height, Measurement, Communication Research, Test Theory
Peer reviewed Peer reviewed
Direct linkDirect link
Gayle Geschwind; Michael Vignal; Marcos D. Caballero; H.? J. Lewandowski – Physical Review Physics Education Research, 2024
The Survey of Physics Reasoning on Uncertainty Concepts in Experiments (SPRUCE) was designed to measure students' proficiency with measurement uncertainty concepts and practices across ten different assessment objectives to help facilitate the improvement of laboratory instruction focused on this important topic. To ensure the reliability and…
Descriptors: Measurement, Ambiguity (Context), Scientific Concepts, Physics
Peer reviewed Peer reviewed
Direct linkDirect link
Walker, Cindy M.; Göçer Sahin, Sakine – Educational and Psychological Measurement, 2020
The purpose of this study was to investigate a new way of evaluating interrater reliability that can allow one to determine if two raters differ with respect to their rating on a polytomous rating scale or constructed response item. Specifically, differential item functioning (DIF) analyses were used to assess interrater reliability and compared…
Descriptors: Test Bias, Interrater Reliability, Responses, Correlation
Peer reviewed Peer reviewed
Direct linkDirect link
Nicolas Rochat; Laurent Lima; Pascal Bressoux – Journal of Psychoeducational Assessment, 2025
Inference is considered an important factor in comprehension models and has been described as a causal factor in predicting comprehension. To date, specific tests for inference are rare and often rely on specific thematic texts. This reliance on thematic inference may raise some concerns as inference is related to prior text-specific knowledge.…
Descriptors: Inferences, Reading Comprehension, Reading Tests, Test Reliability
Peer reviewed Peer reviewed
Direct linkDirect link
Ser Ming Mark Lee; Wei Cheng Liu – Asia Pacific Journal of Education, 2024
Programme evaluation has developed tremendously over the past 50 years, with a proliferation of evaluation research, an increase in the institutionalization of evaluation, and growth in the professionalization of evaluation. However, existing research and developments are still largely in North America, Europe, Australia, and New Zealand, with…
Descriptors: Foreign Countries, Evaluation Research, Evaluation Methods, Evaluation Criteria
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Polat, Murat; Turhan, Nihan S.; Toraman, Cetin – Pegem Journal of Education and Instruction, 2022
Testing English writing skills could be multi-dimensional; thus, the study aimed to compare students' writing scores calculated according to Classical Test Theory (CTT) and Multi-Facet Rasch Model (MFRM). The research was carried out in 2019 with 100 university students studying at a foreign language preparatory class and four experienced…
Descriptors: Comparative Analysis, Test Theory, Item Response Theory, Student Evaluation
Peer reviewed Peer reviewed
Direct linkDirect link
DeCarlo, Lawrence T. – Journal of Educational Measurement, 2023
A conceptualization of multiple-choice exams in terms of signal detection theory (SDT) leads to simple measures of item difficulty and item discrimination that are closely related to, but also distinct from, those used in classical item analysis (CIA). The theory defines a "true split," depending on whether or not examinees know an item,…
Descriptors: Multiple Choice Tests, Test Items, Item Analysis, Test Wiseness
Pages: 1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  |  10  |  11  |  ...  |  78