Showing 1 to 15 of 46 results
Peer reviewed
Alqarni, Abdulelah Mohammed – Journal on Educational Psychology, 2019
This study compares the psychometric properties of reliability in Classical Test Theory (CTT), item information in Item Response Theory (IRT), and validation from the perspective of modern validity theory for the purpose of bringing attention to potential issues that might exist when testing organizations use both test theories in the same testing…
Descriptors: Test Theory, Item Response Theory, Test Construction, Scoring
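The contrast the Alqarni abstract draws between CTT reliability and IRT item information can be made concrete with a small numerical sketch. The snippet below is illustrative only: it uses simulated dichotomous responses and hypothetical 2PL item parameters, not anything from the study, and computes Cronbach's alpha as the CTT reliability index alongside the 2PL item information function at a few ability levels.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- CTT side: Cronbach's alpha from simulated 0/1 responses (hypothetical data) ---
n_persons, n_items = 500, 10
theta = rng.normal(size=n_persons)                      # latent ability
b = np.linspace(-1.5, 1.5, n_items)                     # hypothetical difficulties
a = np.full(n_items, 1.2)                               # hypothetical discriminations
p = 1 / (1 + np.exp(-a * (theta[:, None] - b)))         # 2PL response probabilities
x = (rng.random((n_persons, n_items)) < p).astype(int)  # scored item responses

k = n_items
item_var = x.var(axis=0, ddof=1).sum()
total_var = x.sum(axis=1).var(ddof=1)
alpha = k / (k - 1) * (1 - item_var / total_var)
print(f"Cronbach's alpha (CTT reliability): {alpha:.3f}")

# --- IRT side: 2PL item information I(theta) = a^2 * P * (1 - P) ---
grid = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
for i in (0, n_items // 2, n_items - 1):
    p_i = 1 / (1 + np.exp(-a[i] * (grid - b[i])))
    info = a[i] ** 2 * p_i * (1 - p_i)
    print(f"item {i + 1}: information at theta {grid.tolist()} -> {np.round(info, 3).tolist()}")
```

The contrast this illustrates is that alpha summarizes precision in a single sample-dependent number, whereas item information varies across the ability scale, which is one source of the tension that can arise when both frameworks are applied to the same testing program.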
Peer reviewed
von Davier, Matthias; Tyack, Lillian; Khorramdel, Lale – Educational and Psychological Measurement, 2023
Automated scoring of free drawings or images as responses has yet to be used in large-scale assessments of student achievement. In this study, we propose artificial neural networks to classify these types of graphical responses from a TIMSS 2019 item. We compare the classification accuracy of convolutional and feed-forward approaches. Our…
Descriptors: Scoring, Networks, Artificial Intelligence, Elementary Secondary Education
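As a rough illustration of the comparison described in the von Davier et al. abstract (not the authors' models or the TIMSS 2019 data), the sketch below defines a small convolutional classifier and a flattened feed-forward classifier for hypothetical 64x64 grayscale drawings scored into three categories, using PyTorch.

```python
import torch
import torch.nn as nn

NUM_CLASSES = 3  # hypothetical score categories for a graphical response item

# Convolutional classifier: learns local spatial features of the drawing.
conv_model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(16 * 16 * 16, NUM_CLASSES),  # 64x64 input -> 16x16 feature maps
)

# Feed-forward classifier: treats the flattened pixels as unordered features.
ff_model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(64 * 64, 128), nn.ReLU(),
    nn.Linear(128, NUM_CLASSES),
)

batch = torch.randn(4, 1, 64, 64)  # four hypothetical grayscale drawings
print(conv_model(batch).shape, ff_model(batch).shape)  # both: torch.Size([4, 3])
```

In practice either network would be trained with a cross-entropy loss against human-assigned scores, and classification accuracy on held-out responses is the kind of criterion the abstract's comparison refers to.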
Peer reviewed
Kelleher, Leila K.; Beach, Tyson A. C.; Frost, David M.; Johnson, Andrew M.; Dickey, James P. – Measurement in Physical Education and Exercise Science, 2018
The scoring scheme for the functional movement screen implicitly assumes that the factor structure is consistent, stable, and congruent across different populations. To determine if this is the case, we compared principal components analyses of three samples: a healthy, general population (n = 100), a group of varsity athletes (n = 101), and a…
Descriptors: Factor Structure, Test Reliability, Screening Tests, Motion
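A hedged sketch of the kind of comparison the Kelleher et al. abstract describes: run a principal components analysis separately in two samples and quantify how similar the leading component loadings are with Tucker's congruence coefficient. The data below are simulated, and the seven-item structure is only a stand-in for the functional movement screen's component tests.

```python
import numpy as np

rng = np.random.default_rng(1)

def first_pc_loadings(data):
    """Loadings of the first principal component of the correlation matrix."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    return eigvecs[:, -1]  # eigenvector for the largest eigenvalue

def tucker_phi(x, y):
    """Tucker's congruence coefficient between two loading vectors."""
    return float(np.abs(np.dot(x, y)) / np.sqrt(np.dot(x, x) * np.dot(y, y)))

n_items = 7  # stand-in for the screen's component tests
factor_a = rng.normal(size=(100, 1))                      # shared factor, sample A
sample_a = factor_a + 0.8 * rng.normal(size=(100, n_items))
factor_b = rng.normal(size=(101, 1))                      # shared factor, sample B
sample_b = factor_b + 0.8 * rng.normal(size=(101, n_items))

phi = tucker_phi(first_pc_loadings(sample_a), first_pc_loadings(sample_b))
print(f"Congruence of first components across samples: {phi:.3f}")
```

Congruence coefficients near 1 are conventionally read as evidence that the same component structure holds in both samples; the abstract's question is whether that holds across the general, varsity-athlete, and third samples.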
Peer reviewed
Coniam, David; Lee, Tony; Milanovic, Michael; Pike, Nigel; Zhao, Wen – Language Education & Assessment, 2022
The calibration of test materials generally involves the interaction between empirical analysis and expert judgement. This paper explores the extent to which scale familiarity might affect expert judgement as a component of test validation in the calibration process. It forms part of a larger study that investigates the alignment of the…
Descriptors: Specialists, Language Tests, Test Validity, College Faculty
Peer reviewed
Han, Jing; Koenig, Kathleen; Cui, Lili; Fritchman, Joseph; Li, Dan; Sun, Wanyi; Fu, Zhao; Bao, Lei – Physical Review Physics Education Research, 2016
In a recent study, the 30-question Force Concept Inventory (FCI) was theoretically split into two 14-question "half-length" tests (HFCIs) covering the same set of concepts and producing mean scores that can be equated to those of the original FCI. The HFCIs require less administration time and reduce test-retest issues when different…
Descriptors: Physics, Scientific Concepts, Science Instruction, College Science
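The equating step mentioned in the Han et al. abstract can be sketched with a simple linear equating of half-length test scores onto the full-test scale. The numbers below are simulated rather than FCI data, and the study's actual equating procedure may differ.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated percentage-correct scores (hypothetical, not FCI data)
full = np.clip(rng.normal(60, 15, size=300), 0, 100)             # 30-item full test
half = np.clip(0.9 * full + rng.normal(0, 8, size=300), 0, 100)  # 14-item half test

# Linear equating: match the half-test mean and SD to the full-test scale
equated = full.mean() + full.std(ddof=1) / half.std(ddof=1) * (half - half.mean())
print(f"full-test mean {full.mean():.1f}, equated half-test mean {equated.mean():.1f}")
```

Mean equating alone would only shift the half-test scores; the linear form also matches the spread, which matters if scores on the shorter form are more or less variable than on the full test.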
Peer reviewed
Yan, Xun; Maeda, Yukiko; Lv, Jing; Ginther, April – Language Testing, 2016
Elicited imitation (EI) has been widely used to examine second language (L2) proficiency and development and was an especially popular method in the 1970s and early 1980s. However, as the field embraced more communicative approaches to both instruction and assessment, the use of EI diminished, and the construct-related validity of EI scores as a…
Descriptors: Second Language Learning, Language Proficiency, Meta Analysis, Effect Size
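Because the Yan et al. piece is a meta-analysis of elicited imitation validity evidence, a minimal random-effects pooling sketch may help show what aggregating effect sizes involves. The correlations and sample sizes below are invented for illustration, and this is the standard Fisher-z / DerSimonian-Laird recipe, not necessarily the authors' exact procedure.

```python
import numpy as np

# Hypothetical study-level correlations between EI scores and a proficiency criterion
r = np.array([0.55, 0.62, 0.48, 0.70, 0.58])
n = np.array([40, 85, 60, 120, 50])

z = np.arctanh(r)            # Fisher's z transform
v = 1.0 / (n - 3)            # sampling variance of z
w = 1.0 / v                  # fixed-effect weights

# DerSimonian-Laird estimate of between-study variance tau^2
z_fixed = np.sum(w * z) / np.sum(w)
q = np.sum(w * (z - z_fixed) ** 2)
df = len(r) - 1
c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
tau2 = max(0.0, (q - df) / c)

w_star = 1.0 / (v + tau2)                    # random-effects weights
z_pooled = np.sum(w_star * z) / np.sum(w_star)
print(f"pooled correlation: {np.tanh(z_pooled):.3f}, tau^2 = {tau2:.4f}")
```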
Wagemaker, Hans, Ed. – International Association for the Evaluation of Educational Achievement, 2020
Although international large-scale assessment (ILSA) of education, pioneered by the International Association for the Evaluation of Educational Achievement, is now a well-established science, non-practitioners and many users often substantially misunderstand how large-scale assessments are conducted, what questions and challenges they are designed to…
Descriptors: International Assessment, Achievement Tests, Educational Assessment, Comparative Analysis
Mitchell, Alison M.; Truckenmiller, Adrea; Petscher, Yaacov – Communique, 2015
As part of the Race to the Top initiative, the United States Department of Education made nearly 1 billion dollars available in State Educational Technology grants with the goal of ramping up school technology. One result of this effort is that states, districts, and schools across the country are using computerized assessments to measure their…
Descriptors: Computer Assisted Testing, Educational Technology, Testing, Efficiency
Peer reviewed
Li, Hui-Chuan – International Journal of Mathematical Education in Science and Technology, 2014
This study examines students' procedural and conceptual achievement in fraction addition in England and Taiwan. A total of 1209 participants (561 British students and 648 Taiwanese students), aged 12 and 13, took part in the study. A quantitative design based on a self-designed written test was adopted…
Descriptors: Comparative Analysis, Addition, Mathematics Instruction, Foreign Countries
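For readers outside mathematics education, the procedural skill the Li study tests is ordinary common-denominator fraction addition. A worked example follows; the specific item is invented, not taken from the study's written test.

```python
from fractions import Fraction

# Invented item: 1/4 + 1/6 -> common denominator 12 -> 3/12 + 2/12 = 5/12
print(Fraction(3, 12) + Fraction(2, 12))   # 5/12, the rewritten form summed by hand
print(Fraction(1, 4) + Fraction(1, 6))     # Fraction performs the conversion: 5/12
```

Conceptual achievement, by contrast, concerns understanding why a common denominator is needed, which a correct procedural answer alone does not demonstrate.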
Peer reviewed
Hirai, Akiyo; Koizumi, Rie – Language Assessment Quarterly, 2013
In recognition of the rating scale as a crucial tool of performance assessment, this study aims to establish a rating scale suitable for a Story Retelling Speaking Test (SRST), which is a semidirect test of speaking ability in English as a foreign language for classroom use. To identify an appropriate scale, three rating scales, all of which have…
Descriptors: Test Validity, Rating Scales, Story Telling, Speech Tests
Peer reviewed
Carmichael, Jessica A.; Fraccaro, Rebecca L.; Nordstokke, David W. – Canadian Journal of School Psychology, 2014
Oral language skills are important to consider in school psychology practice, as they are directly tied to many areas of academic functioning. For example, research has demonstrated that oral language skills in early elementary school predict reading comprehension in later grades (Kendeou, van den Broek, White, & Lynch, 2009). With a…
Descriptors: Language Tests, Oral Language, Language Skills, School Psychology
Peer reviewed
Newhouse, C. Paul – Technology, Pedagogy and Education, 2015
This paper reports on the outcomes of a three-year study investigating the use of digital technologies to increase the authenticity of high-stakes summative assessment in four Western Australian senior secondary courses. The study involved 82 teachers and 1015 students and a range of digital forms of assessment using computer-based exams, digital…
Descriptors: Educational Technology, High Stakes Tests, Summative Evaluation, Secondary School Students
Peer reviewed
Sussman, Joshua; Beaujean, A. Alexander; Worrell, Frank C.; Watson, Stevie – Measurement and Evaluation in Counseling and Development, 2013
Item response models (IRMs) were used to analyze Cross Racial Identity Scale (CRIS) scores. Rasch analysis scores were compared with classical test theory (CTT) scores. The partial credit model demonstrated a high goodness of fit and correlations between Rasch and CTT scores ranged from 0.91 to 0.99. CRIS scores are supported by both methods.…
Descriptors: Item Response Theory, Test Theory, Measures (Individuals), Racial Identification
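A minimal sketch of the Rasch-versus-CTT comparison that the Sussman et al. abstract reports, here with simulated dichotomous data and a plain Rasch model rather than the partial credit model used for the CRIS: estimate person measures by maximum likelihood and correlate them with CTT sum scores.

```python
import numpy as np

rng = np.random.default_rng(3)

n_persons, n_items = 400, 12
theta = rng.normal(size=n_persons)
b = np.linspace(-2, 2, n_items)                  # hypothetical item difficulties
p = 1 / (1 + np.exp(-(theta[:, None] - b)))      # Rasch response probabilities
x = (rng.random((n_persons, n_items)) < p).astype(int)

sum_scores = x.sum(axis=1)                       # the CTT score is the raw sum

def rasch_theta(raw, difficulties, iters=50):
    """ML person estimate for a given raw score, item difficulties fixed."""
    est = 0.0
    for _ in range(iters):
        pr = 1 / (1 + np.exp(-(est - difficulties)))
        grad = raw - pr.sum()              # first derivative of the log-likelihood
        hess = -np.sum(pr * (1 - pr))      # second derivative (always negative)
        est -= grad / hess                 # Newton-Raphson update
    return est

# Extreme raw scores (0 or all correct) have no finite ML estimate; drop them here.
mask = (sum_scores > 0) & (sum_scores < n_items)
rasch_scores = np.array([rasch_theta(r, b) for r in sum_scores[mask]])
r = np.corrcoef(rasch_scores, sum_scores[mask])[0, 1]
print(f"correlation between Rasch and CTT scores: {r:.3f}")
```

Because Rasch person estimates are a monotone function of the raw score, this correlation is necessarily high; the 0.91 to 0.99 range the abstract reports for the partial credit model is analogous, though that model handles polytomous CRIS items.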
Havlin, Patricia J. – ProQuest LLC, 2013
Writing assessments have taken two primary forms in the past two decades: direct and indirect. Irrespective of type, either form needs to be anchored to classroom decision making and to predicting performance on high-stakes tests, particularly in an environment with serious consequences. In this study, 11th-grade students were…
Descriptors: Writing Evaluation, Grade 11, High School Students, Writing Assignments
Peer reviewed
Heldsinger, Sandra A.; Humphry, Stephen M. – Educational Research, 2013
Background: Many in education argue for the importance of incorporating teacher judgements in the assessment and reporting of student performance. Advocates of such an approach are cognisant, though, that obtaining a satisfactory level of consistency in teacher judgements poses a challenge. Purpose: This study investigates the extent to which the…
Descriptors: Evaluation Methods, Student Evaluation, Teacher Attitudes, Comparative Analysis