NotesFAQContact Us
Collection
Advanced
Search Tips
Publication Date
In 20250
Since 20240
Since 2021 (last 5 years)0
Since 2016 (last 10 years)5
Since 2006 (last 20 years)14
Audience
Policymakers1
Laws, Policies, & Programs
What Works Clearinghouse Rating
Showing 1 to 15 of 28 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Braumoeller, Bear F. – Sociological Methods & Research, 2017
Fuzzy-set qualitative comparative analysis (fsQCA) has become one of the most prominent methods in the social sciences for capturing causal complexity, especially for scholars with small- and medium-"N" data sets. This research note explores two key assumptions in fsQCA's methodology for testing for necessary and sufficient…
Descriptors: Qualitative Research, Comparative Analysis, Social Science Research, Research Methodology
Peer reviewed Peer reviewed
Direct linkDirect link
Zaidi, Nikki L.; Swoboda, Christopher M.; Kelcey, Benjamin M.; Manuel, R. Stephen – Advances in Health Sciences Education, 2017
The extant literature has largely ignored a potentially significant source of variance in multiple mini-interview (MMI) scores by "hiding" the variance attributable to the sample of attributes used on an evaluation form. This potential source of hidden variance can be defined as rating items, which typically comprise an MMI evaluation…
Descriptors: Interviews, Scores, Generalizability Theory, Monte Carlo Methods
Peer reviewed Peer reviewed
Direct linkDirect link
Lahner, Felicitas-Maria; Lörwald, Andrea Carolin; Bauer, Daniel; Nouns, Zineb Miriam; Krebs, René; Guttormsen, Sissel; Fischer, Martin R.; Huwendiek, Sören – Advances in Health Sciences Education, 2018
Multiple true-false (MTF) items are a widely used supplement to the commonly used single-best answer (Type A) multiple choice format. However, an optimal scoring algorithm for MTF items has not yet been established, as existing studies yielded conflicting results. Therefore, this study analyzes two questions: What is the optimal scoring algorithm…
Descriptors: Scoring Formulas, Scoring Rubrics, Objective Tests, Multiple Choice Tests
Peer reviewed Peer reviewed
Direct linkDirect link
Lundqvist, Lars-Olov; Lindner, Helen – Journal of Autism and Developmental Disorders, 2017
The Autism-Spectrum Quotient (AQ) is among the most widely used scales assessing autistic traits in the general population. However, some aspects of the AQ are questionable. To test its scale properties, the AQ was translated into Swedish, and data were collected from 349 adults, 130 with autism spectrum disorder (ASD) and 219 without ASD, and…
Descriptors: Autism, Pervasive Developmental Disorders, Adults, Comparative Analysis
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Jin, Ying; Eason, Hershel – Journal of Educational Issues, 2016
The effects of mean ability difference (MAD) and short tests on the performance of various DIF methods have been studied extensively in previous simulation studies. Their effects, however, have not been studied under multilevel data structure. MAD was frequently observed in large-scale cross-country comparison studies where the primary sampling…
Descriptors: Test Bias, Simulation, Hierarchical Linear Modeling, Comparative Analysis
Peer reviewed Peer reviewed
Direct linkDirect link
Rios, Joseph A.; Sireci, Stephen G. – International Journal of Testing, 2014
The International Test Commission's "Guidelines for Translating and Adapting Tests" (2010) provide important guidance on developing and evaluating tests for use across languages. These guidelines are widely applauded, but the degree to which they are followed in practice is unknown. The objective of this study was to perform a…
Descriptors: Guidelines, Translation, Adaptive Testing, Second Languages
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Wang, Zhen; Yao, Lihua – ETS Research Report Series, 2013
The current study used simulated data to investigate the properties of a newly proposed method (Yao's rater model) for modeling rater severity and its distribution under different conditions. Our study examined the effects of rater severity, distributions of rater severity, the difference between item response theory (IRT) models with rater effect…
Descriptors: Test Format, Test Items, Responses, Computation
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Styles, Irene; Wildy, Helen; Pepper, Vivienne; Faulkner, Joanne; Berman, Ye'Elah – International Research in Early Childhood Education, 2014
The assessment of literacy and numeracy skills of students as they enter school for the first time is not yet established nation-wide in Australia. However, a large proportion of primary schools have chosen to assess their starting students on the Performance Indicators in Primary Schools-Baseline Assessment (PIPS-BLA). This series of three…
Descriptors: Foreign Countries, Indigenous Knowledge, Performance Based Assessment, Test Bias
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Lee, Yi-Hsuan; Zhang, Jinming – ETS Research Report Series, 2010
This report examines the consequences of differential item functioning (DIF) using simulated data. Its impact on total score, item response theory (IRT) ability estimate, and test reliability was evaluated in various testing scenarios created by manipulating the following four factors: test length, percentage of DIF items per form, sample sizes of…
Descriptors: Test Bias, Item Response Theory, Test Items, Scores
Peoples, Shelagh – ProQuest LLC, 2012
The purpose of this study was to determine which of three competing models will provide, reliable, interpretable, and responsive measures of elementary students' understanding of the nature of science (NOS). The Nature of Science Instrument-Elementary (NOSI-E), a 28-item Rasch-based instrument, was used to assess students' NOS…
Descriptors: Scientific Principles, Science Tests, Elementary School Students, Item Response Theory
Peer reviewed Peer reviewed
Direct linkDirect link
Jiang, Bo; Xu, Xiaoying; Garcia, Alicia; Lewis, Jennifer E. – Journal of Chemical Education, 2010
The Test of Logical Thinking (TOLT) and the Group Assessment of Logical Thinking (GALT) are two of the instruments most widely used by science educators and researchers to measure students' formal reasoning abilities. Based on Piaget's cognitive development theory, formal thinking ability has been shown to be essential for student achievement in…
Descriptors: Test Bias, Test Reliability, Chemistry, Logical Thinking
Peer reviewed Peer reviewed
Direct linkDirect link
Wuang, Yee-Pay; Wang, Li-Chen; Su, Chwen-Yng – Research in Developmental Disabilities: A Multidisciplinary Journal, 2010
The aim of this study was to examine the validation of the Hooper Visual Organization Test (HVOT) for use in children by testing for item fit, unidimensionality, item hierarchy, reliability, and screening capacity. A modified scoring system was devised for the HVOT so that children received some credit for being able to describe the function of…
Descriptors: Test Bias, Down Syndrome, Scoring, Item Response Theory
Peer reviewed Peer reviewed
Direct linkDirect link
Kim, Sooyeon; von Davier, Alina A.; Haberman, Shelby – Journal of Educational Measurement, 2008
This study addressed the sampling error and linking bias that occur with small samples in a nonequivalent groups anchor test design. We proposed a linking method called the synthetic function, which is a weighted average of the identity function and a traditional equating function (in this case, the chained linear equating function). Specifically,…
Descriptors: Equated Scores, Sample Size, Test Reliability, Comparative Analysis
Herman, Geoffrey Lindsay – ProQuest LLC, 2011
Instructors in electrical and computer engineering and in computer science have developed innovative methods to teach digital logic circuits. These methods attempt to increase student learning, satisfaction, and retention. Although there are readily accessible and accepted means for measuring satisfaction and retention, there are no widely…
Descriptors: Grounded Theory, Delphi Technique, Concept Formation, Misconceptions
Angstadt, Al; And Others – Southern Journal of Educational Research, 1979
Seeking to compare the original Wechler Intelligence Scale (WISC) with its revised version, the WISC-R, this study compared WISC-R scores of 50 Black children with their WISC scores taken two years previously. Mean scores on the WISC-R were lower on the Verbal Scale, Performance Scale, and Full Scale. (DS)
Descriptors: Black Education, Black Students, Comparative Analysis, Elementary Education
Previous Page | Next Page »
Pages: 1  |  2