NotesFAQContact Us
Collection
Advanced
Search Tips
Laws, Policies, & Programs
Pell Grant Program1
What Works Clearinghouse Rating
Does not meet standards1
Showing 1 to 15 of 71 results Save | Export
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Parker, Mark A. J.; Hedgeland, Holly; Jordan, Sally E.; Braithwaite, Nicholas St. J. – European Journal of Science and Mathematics Education, 2023
The study covers the development and testing of the alternative mechanics survey (AMS), a modified force concept inventory (FCI), which used automatically marked free-response questions. Data were collected over a period of three academic years from 611 participants who were taking physics classes at high school and university level. A total of…
Descriptors: Test Construction, Scientific Concepts, Physics, Test Reliability
Peer reviewed Peer reviewed
Direct linkDirect link
Weingarden, Merav; Heyd-Metzuyanim, Einat – Journal of Mathematics Teacher Education, 2023
In this study, we examine "what went wrong" in our professional development program for encouraging cognitively demanding instruction, focusing on the difficulties we encountered in using an observational tool for evaluating this type of instruction and reaching inter-rater reliability. We do so through the lens of a discursive theory of…
Descriptors: Mathematics Instruction, Interrater Reliability, Cognitive Processes, Difficulty Level
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Uyar, Seyma; Yayla, Onur; Zunber, Hidayet – International Journal of Assessment Tools in Education, 2022
The purpose of the current study is to examine the map reading skills of Social Studies pre-service teachers with orienteering, which is an activity-based and more active practice. To this end, a total of 10 students attending the Department of Social Studies Teaching in the Education Faculty of Burdur Mehmet Akif Ersoy University and taking the…
Descriptors: Map Skills, Navigation, Item Response Theory, Social Studies
Peer reviewed Peer reviewed
Direct linkDirect link
Walker, Grant M.; Basilakos, Alexandra; Fridriksson, Julius; Hickok, Gregory – Journal of Speech, Language, and Hearing Research, 2022
Purpose: Meaningful changes in picture naming responses may be obscured when measuring accuracy instead of quality. A statistic that incorporates information about the severity and nature of impairments may be more sensitive to the effects of treatment. Method: We analyzed data from repeated administrations of a naming test to 72 participants with…
Descriptors: Naming, Change, Aphasia, Severity (of Disability)
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Dempster, Edith R.; Kirby, Nicki F. – South African Journal of Education, 2018
Public perception of "declining standards" in school-leaving examinations often accompanies increases in pass rates in schoolleaving examinations. "Declining standards" to the public means easier examination papers. The present study evaluates a South African attempt to estimate the level of difficulty, as distinct from…
Descriptors: Foreign Countries, Interrater Reliability, Difficulty Level, Science Tests
Peer reviewed Peer reviewed
Direct linkDirect link
Smolinsky, Lawrence; Marx, Brian D.; Olafsson, Gestur; Ma, Yanxia A. – Journal of Educational Computing Research, 2020
Computer-based testing is an expanding use of technology offering advantages to teachers and students. We studied Calculus II classes for science, technology, engineering, and mathematics majors using different testing modes. Three sections with 324 students employed: paper-and-pencil testing, computer-based testing, and both. Computer tests gave…
Descriptors: Test Format, Computer Assisted Testing, Paper (Material), Calculus
Benton, Tom; Leech, Tony; Hughes, Sarah – Cambridge Assessment, 2020
In the context of examinations, the phrase "maintaining standards" usually refers to any activity designed to ensure that it is no easier (or harder) to achieve a given grade in one year than in another. Specifically, it tends to mean activities associated with setting examination grade boundaries. Benton et al (2020) describes a method…
Descriptors: Mathematics Tests, Equated Scores, Comparative Analysis, Difficulty Level
Peer reviewed Peer reviewed
Direct linkDirect link
Isbell, Daniel R.; Son, Young-A – Studies in Second Language Acquisition, 2022
Elicited Imitation Tests (EITs) are commonly used in second language acquisition (SLA)/bilingualism research contexts to assess the general oral proficiency of study participants. While previous studies have provided valuable EIT construct-related validity evidence, some key gaps remain. This study uses an integrative data analysis to further…
Descriptors: Bilingualism, Imitation, Language Tests, Second Language Learning
Peer reviewed Peer reviewed
Direct linkDirect link
Tengberg, Michael – Language Assessment Quarterly, 2018
Reading comprehension is often treated as a multidimensional construct. In many reading tests, items are distributed over reading process categories to represent the subskills expected to constitute comprehension. This study explores (a) the extent to which specified subskills of reading comprehension tests are conceptually conceivable to…
Descriptors: Reading Tests, Reading Comprehension, Scores, Test Results
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Derek N. Canning; Stuart McLean; Joseph P. Vitta – Vocabulary Learning and Instruction, 2022
The substantive component of construct validity requires a confrontation between empirical test results and content relevance. The Vocabulary Size Test (VST) has been extensively validated in terms of empirical results. Less is known, however, about expert judgments of content relevance. The VST was constructed and validated according to the…
Descriptors: Foreign Countries, Undergraduate Students, College Faculty, Vocabulary Skills
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Lyness, Scott A.; Peterson, Kent; Yates, Kenneth – Education Sciences, 2021
The Performance Assessment for California Teachers (PACT) is a high stakes summative assessment that was designed to measure pre-service teacher readiness. We examined the inter-rater reliability (IRR) of trained PACT evaluators who rated 19 candidates. As measured by Cohen's weighted kappa, the overall IRR estimate was 0.17 (poor strength of…
Descriptors: High Stakes Tests, Performance Based Assessment, Teacher Effectiveness, Academic Language
Smarter Balanced Assessment Consortium, 2016
The goal of this study was to gather comprehensive evidence about the alignment of the Smarter Balanced summative assessments to the Common Core State Standards (CCSS). Alignment of the Smarter Balanced summative assessments to the CCSS is a critical piece of evidence regarding the validity of inferences students, teachers and policy makers can…
Descriptors: Alignment (Education), Summative Evaluation, Common Core State Standards, Test Content
Peer reviewed Peer reviewed
Direct linkDirect link
Hung, Yu-Chen; Chan, Yi-Chih – Deafness & Education International, 2020
Unlike their peers with typical hearing, reading and speech challenges observed among children with hearing loss may not only be caused by developmental issues but also hearing-related problems. Although conventional oral reading assessments are useful for identifying children at risk of reading difficulties, they do not help examiners identify…
Descriptors: Test Construction, Test Validity, Oral Reading, Reading Tests
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Desstya, Anatri; Prasetyo, Zuhdan Kun; Suyanta; Susila, Ihwan; Irwanto – International Journal of Instruction, 2019
This study aims to report the development an instrument that is standardized (reviewed by validity, reliability, and difficulty index) to detect science misconception in an elementary school teacher. This study used a 4-D model; defining, designing, developing, and disseminating. First, it was prepared with 47 opened-ended questions, and then it…
Descriptors: Elementary School Teachers, Misconceptions, Evaluation Methods, Teacher Evaluation
Peer reviewed Peer reviewed
Direct linkDirect link
Wyse, Adam E. – Applied Measurement in Education, 2018
This article discusses regression effects that are commonly observed in Angoff ratings where panelists tend to think that hard items are easier than they are and easy items are more difficult than they are in comparison to estimated item difficulties. Analyses of data from two credentialing exams illustrate these regression effects and the…
Descriptors: Regression (Statistics), Test Items, Difficulty Level, Licensing Examinations (Professions)
Previous Page | Next Page ยป
Pages: 1  |  2  |  3  |  4  |  5