Showing 1 to 15 of 42 results
Peer reviewed
Ji, Xuejun Ryan; Wu, Amery D. – Educational Measurement: Issues and Practice, 2023
Measurement specialists have demonstrated that the Cross-Classified Mixed Effects Model (CCMEM) is a flexible framework for evaluating reliability. Reliability can be estimated based on the variance components of the test scores. Building on that work, this study extends the CCMEM to be used for evaluating validity evidence.…
Descriptors: Measurement, Validity, Reliability, Models
Peer reviewed
Gane, Brian D.; Israel, Maya; Elagha, Noor; Yan, Wei; Luo, Feiya; Pellegrino, James W. – Computer Science Education, 2021
Background & Context: We describe the rationale, design, and initial validation of computational thinking (CT) assessments to pair with curricular lessons that integrate fractions and CT. Objective: We used cognitive models of CT (learning trajectories; LTs) to design assessments and obtained evidence to support a validity argument. Method: We…
Descriptors: Test Construction, Test Validity, Evaluation Methods, Student Evaluation
Edward J. Kim – Annenberg Institute for School Reform at Brown University, 2022
This study introduces the signal weighted teacher value-added model (SW VAM), a value-added model that weights student-level observations based on each student's capacity to signal their assigned teacher's quality. Specifically, the model leverages the repeated appearance of a given student to estimate student reliability and sensitivity…
Descriptors: Value Added Models, Student Evaluation, Reliability, Simulation
Peer reviewed
Hautala, Jarkko; Heikkilä, Riikka; Nieminen, Lea; Rantanen, Vesa; Latvala, Juha-Matti; Richardson, Ulla – Journal of Educational Computing Research, 2020
A computerized game-based assessment (GBA) system for screening reading difficulties may provide substantial time and cost benefits over traditional paper-and-pencil assessment while also providing a means to individually adapt learning content in educational games. To study the reliability and validity of a GBA system to identify struggling readers…
Descriptors: Reading Difficulties, Ability Identification, Evaluation Methods, Reliability
Peer reviewed
L. Hannah; E. E. Jang; M. Shah; V. Gupta – Language Assessment Quarterly, 2023
Machines have a long-demonstrated ability to find statistical relationships between qualities of texts and surface-level linguistic indicators of writing. More recently, unlocked by artificial intelligence, the potential of using machines to identify content-related writing trait criteria has been uncovered. This development is significant,…
Descriptors: Validity, Automation, Scoring, Writing Assignments
Peer reviewed
Desstya, Anatri; Prasetyo, Zuhdan Kun; Suyanta; Susila, Ihwan; Irwanto – International Journal of Instruction, 2019
This study aims to report the development of a standardized instrument (reviewed for validity, reliability, and difficulty index) to detect science misconceptions in elementary school teachers. This study used a 4-D model: defining, designing, developing, and disseminating. First, it was prepared with 47 open-ended questions, and then it…
Descriptors: Elementary School Teachers, Misconceptions, Evaluation Methods, Teacher Evaluation
Peer reviewed
Bailes, Lauren P.; Nandakumar, Ratna – International Journal of Education Policy and Leadership, 2020
High-quality measurement tools are critical to school improvement efforts. Education researchers frequently employ surveys in order to assess a host of variables associated with school improvement. This article asserts that Rasch modeling techniques enhance the quality of a measurement tool because they comprise elements of both qualitative and…
Descriptors: Surveys, Evaluation Methods, Item Response Theory, Administrator Role
Peer reviewed
Dentzau, Michael W.; Martínez, Alejandro José Gallard – Environmental Education Research, 2016
A drawing assessment to gauge changes in fourth grade students' understanding of the essential components of the longleaf pine ecosystem was developed to support an out-of-school environmental education program. Pre- and post-attendance drawings were scored with a rubric that was determined to have content validity and reliability among users. In…
Descriptors: Grade 4, Elementary School Students, Environmental Education, Ecology
Peer reviewed
Shin, Youngjoon; Seo, Hae-Ae; Hong, Jun-Euy – International Baltic Symposium on Science and Technology Education, 2019
This research aimed to develop an assessment tool for students' Positive Experiences about Science (PES). A preliminary version of the PES was developed through literature review, consisting of academic emotion, self-concept, learning motivation, career aspiration, and attitude in science. A pilot test was conducted with 198 students and a main test…
Descriptors: Positive Attitudes, Student Experience, Science Education, Evaluation Methods
Peer reviewed
Petersen, Douglas B.; Gragg, Shelbi L.; Spencer, Trina D. – Language, Speech, and Hearing Services in Schools, 2018
Purpose: The purpose of this study was to examine how well a kindergarten dynamic assessment of decoding predicts future reading difficulty at 2nd, 3rd, 4th, and 5th grade and to determine whether the dynamic assessment improves the predictive validity of traditional static kindergarten reading measures. Method: With a small variation in sample…
Descriptors: Kindergarten, Grade 2, Grade 3, Grade 4
Peer reviewed
Sandilands, Debra; Oliveri, Maria Elena; Zumbo, Bruno D.; Ercikan, Kadriye – International Journal of Testing, 2013
International large-scale assessments of achievement often have a large degree of differential item functioning (DIF) between countries, which can threaten score equivalence and reduce the validity of inferences based on comparisons of group performances. It is important to understand potential sources of DIF to improve the validity of future…
Descriptors: Validity, Measures (Individuals), International Studies, Foreign Countries
Peer reviewed
Rutkowski, David – Assessment in Education: Principles, Policy & Practice, 2018
In this article I advocate for a new discussion in the field of international large-scale assessments (ILSAs), one that calls for a reexamination of ILSAs and their use. Expanding on the high-quality work in this special issue, I focus on three inherent limitations to international large-scale assessments noted by…
Descriptors: Grade 4, Foreign Countries, Achievement Tests, Reading Achievement
Peer reviewed
Anderson, Daniel; Farley, Dan; Tindal, Gerald – Journal of Special Education, 2015
Students with significant cognitive disabilities present an assessment dilemma that centers on access and validity in large-scale testing programs. Typically, access is improved by eliminating construct-irrelevant barriers, while validity is improved, in part, through test standardization. In this article, one state's alternate assessment data…
Descriptors: Mental Retardation, Evaluation Methods, Student Evaluation, Standardized Tests
Peer reviewed
Wallen, Victoria; Cunningham-Sabo, Leslie; Auld, Garry; Romaniello, Cathy – Journal of Nutrition Education and Behavior, 2011
Objective: Determine validity of Day in the Life Questionnaire-Colorado (DILQ-CO) as a dietary assessment tool for classroom-administered use. Methods: Agreement between DILQ-CO responses and weighed plate waste measured in 125 fourth-grade students in 2 low-income schools. Validity assessed by comparing reported school lunch items and portion…
Descriptors: Nutrition, Test Validity, Questionnaires, Low Income Groups
Peer reviewed
Johansson, Stefan; Myrberg, Eva; Rosen, Monica – Educational Research and Evaluation, 2012
The purpose of the present study was to examine validity aspects of teachers' judgements of pupils' reading skills. Data come from Sweden's participation in the Progress in International Reading Literacy Study (PIRLS) 2001, for Grades 3 and 4. For pupils at the same achievement levels, as measured by PIRLS 2001 test, teachers' judgements of…
Descriptors: Reading Achievement, Reading Skills, Foreign Countries, Grade 3