Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 4 |
Since 2016 (last 10 years) | 11 |
Since 2006 (last 20 years) | 33 |
Descriptor
Evaluation Methods | 42 |
Grade 4 | 32 |
Test Validity | 22 |
Student Evaluation | 21 |
Elementary School Students | 16 |
Grade 5 | 16 |
Validity | 16 |
Grade 3 | 12 |
State Standards | 12 |
Foreign Countries | 11 |
Grade 6 | 9 |
Author
Cronin, John | 5 |
Bowen, Natasha K. | 2 |
Sarouphim, Ketty M. | 2 |
Anderson, Daniel | 1 |
Archwamety, Teara | 1 |
Auld, Garry | 1 |
Bailes, Lauren P. | 1 |
Bain, Sherry K. | 1 |
Banbury, Mary M. | 1 |
Bergin, Christi | 1 |
Bergin, David A. | 1 |
Publication Type
Journal Articles | 28 |
Reports - Research | 23 |
Reports - Evaluative | 13 |
Numerical/Quantitative Data | 5 |
Reports - Descriptive | 5 |
Speeches/Meeting Papers | 3 |
Dissertations/Theses -… | 1 |
Tests/Questionnaires | 1 |
Education Level
Grade 4 | 42 |
Elementary Education | 28 |
Grade 5 | 23 |
Grade 3 | 19 |
Grade 8 | 16 |
Grade 6 | 14 |
Intermediate Grades | 12 |
Grade 7 | 11 |
Elementary Secondary Education | 8 |
Grade 10 | 7 |
Middle Schools | 7 |
Audience
Teachers | 2 |
Location
Arizona | 4 |
Canada | 2 |
Lebanon | 2 |
Colombia | 1 |
Colorado | 1 |
Finland | 1 |
Indonesia | 1 |
Maine | 1 |
Maryland | 1 |
Michigan | 1 |
Montana | 1 |
Laws, Policies, & Programs
No Child Left Behind Act 2001 | 8 |
Assessments and Surveys
Progress in International… | 4 |
Iowa Tests of Basic Skills | 2 |
Raven Progressive Matrices | 2 |
Program for International… | 1 |
Trends in International… | 1 |
Ji, Xuejun Ryan; Wu, Amery D. – Educational Measurement: Issues and Practice, 2023
The Cross-Classified Mixed Effects Model (CCMEM) has been demonstrated to be a flexible framework for evaluating reliability by measurement specialists. Reliability can be estimated based on the variance components of the test scores. Built upon their accomplishment, this study extends the CCMEM to be used for evaluating validity evidence.…
Descriptors: Measurement, Validity, Reliability, Models
Gane, Brian D.; Israel, Maya; Elagha, Noor; Yan, Wei; Luo, Feiya; Pellegrino, James W. – Computer Science Education, 2021
Background & Context: We describe the rationale, design, and initial validation of computational thinking (CT) assessments to pair with curricular lessons that integrate fractions and CT. Objective: We used cognitive models of CT (learning trajectories; LTs) to design assessments and obtained evidence to support a validity argument. Method: We…
Descriptors: Test Construction, Test Validity, Evaluation Methods, Student Evaluation
Edward J. Kim – Annenberg Institute for School Reform at Brown University, 2022
This study introduces the signal weighted teacher value-added model (SW VAM), a value-added model that weights student-level observations based on each student's capacity to signal their assigned teacher's quality. Specifically, the model leverages the repeated appearance of a given student to estimate student reliability and sensitivity…
Descriptors: Value Added Models, Student Evaluation, Reliability, Simulation
Hautala, Jarkko; Heikkilä, Riikka; Nieminen, Lea; Rantanen, Vesa; Latvala, Juha-Matti; Richardson, Ulla – Journal of Educational Computing Research, 2020
A computerized game-based assessment (GBA) system for screening reading difficulties may provide substantial time and cost benefits over traditional paper-and-pencil assessment while also providing a means to individually adapt learning content in educational games. To study the reliability and validity of a GBA system to identify struggling readers…
Descriptors: Reading Difficulties, Ability Identification, Evaluation Methods, Reliability
L. Hannah; E. E. Jang; M. Shah; V. Gupta – Language Assessment Quarterly, 2023
Machines have a long-demonstrated ability to find statistical relationships between qualities of texts and surface-level linguistic indicators of writing. More recently, unlocked by artificial intelligence, the potential of using machines to identify content-related writing trait criteria has been uncovered. This development is significant,…
Descriptors: Validity, Automation, Scoring, Writing Assignments
Desstya, Anatri; Prasetyo, Zuhdan Kun; Suyanta; Susila, Ihwan; Irwanto – International Journal of Instruction, 2019
This study reports the development of a standardized instrument (reviewed for validity, reliability, and difficulty index) to detect science misconceptions in elementary school teachers. This study used a 4-D model: defining, designing, developing, and disseminating. First, it was prepared with 47 open-ended questions, and then it…
Descriptors: Elementary School Teachers, Misconceptions, Evaluation Methods, Teacher Evaluation
Bailes, Lauren P.; Nandakumar, Ratna – International Journal of Education Policy and Leadership, 2020
High-quality measurement tools are critical to school improvement efforts. Education researchers frequently employ surveys in order to assess a host of variables associated with school improvement. This article asserts that Rasch modeling techniques enhance the quality of a measurement tool because they comprise elements of both qualitative and…
Descriptors: Surveys, Evaluation Methods, Item Response Theory, Administrator Role
Dentzau, Michael W.; Martínez, Alejandro José Gallard – Environmental Education Research, 2016
A drawing assessment to gauge changes in fourth grade students' understanding of the essential components of the longleaf pine ecosystem was developed to support an out-of-school environmental education program. Pre- and post-attendance drawings were scored with a rubric that was determined to have content validity and reliability among users. In…
Descriptors: Grade 4, Elementary School Students, Environmental Education, Ecology
Shin, Youngjoon; Seo, Hae-Ae; Hong, Jun-Euy – International Baltic Symposium on Science and Technology Education, 2019
This research aimed to develop an assessment tool for students' Positive Experiences about Science (PES). A preliminary version of PES was developed through literature review, consisting of academic emotion, self-concept, learning motivation, career aspiration, and attitude in science. A pilot test was conducted with 198 students and a main test…
Descriptors: Positive Attitudes, Student Experience, Science Education, Evaluation Methods
Petersen, Douglas B.; Gragg, Shelbi L.; Spencer, Trina D. – Language, Speech, and Hearing Services in Schools, 2018
Purpose: The purpose of this study was to examine how well a kindergarten dynamic assessment of decoding predicts future reading difficulty at 2nd, 3rd, 4th, and 5th grade and to determine whether the dynamic assessment improves the predictive validity of traditional static kindergarten reading measures. Method: With a small variation in sample…
Descriptors: Kindergarten, Grade 2, Grade 3, Grade 4
Sandilands, Debra; Oliveri, Maria Elena; Zumbo, Bruno D.; Ercikan, Kadriye – International Journal of Testing, 2013
International large-scale assessments of achievement often have a large degree of differential item functioning (DIF) between countries, which can threaten score equivalence and reduce the validity of inferences based on comparisons of group performances. It is important to understand potential sources of DIF to improve the validity of future…
Descriptors: Validity, Measures (Individuals), International Studies, Foreign Countries
Rutkowski, David – Assessment in Education: Principles, Policy & Practice, 2018
In this article I advocate for a new discussion in the field of international large-scale assessments; one that calls for a reexamination of international large-scale assessments (ILSAs) and their use. Expanding on the high-quality work in this special issue I focus on three inherent limitations to international large-scale assessments noted by…
Descriptors: Grade 4, Foreign Countries, Achievement Tests, Reading Achievement
Anderson, Daniel; Farley, Dan; Tindal, Gerald – Journal of Special Education, 2015
Students with significant cognitive disabilities present an assessment dilemma that centers on access and validity in large-scale testing programs. Typically, access is improved by eliminating construct-irrelevant barriers, while validity is improved, in part, through test standardization. In this article, one state's alternate assessment data…
Descriptors: Mental Retardation, Evaluation Methods, Student Evaluation, Standardized Tests
Wallen, Victoria; Cunningham-Sabo, Leslie; Auld, Garry; Romaniello, Cathy – Journal of Nutrition Education and Behavior, 2011
Objective: Determine validity of Day in the Life Questionnaire-Colorado (DILQ-CO) as a dietary assessment tool for classroom-administered use. Methods: Agreement between DILQ-CO responses and weighed plate waste measured in 125 fourth-grade students in 2 low-income schools. Validity assessed by comparing reported school lunch items and portion…
Descriptors: Nutrition, Test Validity, Questionnaires, Low Income Groups
Johansson, Stefan; Myrberg, Eva; Rosen, Monica – Educational Research and Evaluation, 2012
The purpose of the present study was to examine validity aspects of teachers' judgements of pupils' reading skills. Data come from Sweden's participation in the Progress in International Reading Literacy Study (PIRLS) 2001, for Grades 3 and 4. For pupils at the same achievement levels, as measured by the PIRLS 2001 test, teachers' judgements of…
Descriptors: Reading Achievement, Reading Skills, Foreign Countries, Grade 3