Showing 1 to 15 of 49 results
Peer reviewed | PDF on ERIC (full text available)
Ozsoy, Seyma Nur; Kilmen, Sevilay – International Journal of Assessment Tools in Education, 2023
In this study, Kernel test equating methods were compared under NEAT and NEC designs. In the NEAT design, Kernel post-stratification and chain equating methods using optimal and large bandwidths were compared. In the NEC design, gender and/or computer/tablet use was treated as a covariate, and Kernel test equating methods were…
Descriptors: Equated Scores, Testing, Test Items, Statistical Analysis
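The kernel equating machinery compared in the entry above can be illustrated with a short, hedged sketch: Gaussian kernel continuization of two discrete score distributions followed by an equipercentile mapping under an equivalent-groups layout. The fixed bandwidths and toy score distributions below are assumptions for illustration, not values or designs from the study.

```python
# Hypothetical sketch of Gaussian kernel continuization + equipercentile equating.
# Bandwidths are fixed by hand here; the study compares optimal vs. large bandwidths.
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

def kernel_cdf(x, scores, probs, h):
    """Continuized CDF of a discrete score distribution (Gaussian kernel)."""
    mu = np.sum(probs * scores)
    var = np.sum(probs * (scores - mu) ** 2)
    a = np.sqrt(var / (var + h ** 2))      # constant that preserves mean and variance
    return np.sum(probs * norm.cdf((x - a * scores - (1 - a) * mu) / (a * h)))

def equate(x, scores_x, probs_x, scores_y, probs_y, hx=0.6, hy=0.6):
    """Map a form-X score to the form-Y scale: e_Y(x) = G^{-1}(F(x))."""
    p = kernel_cdf(x, scores_x, probs_x, hx)
    return brentq(lambda y: kernel_cdf(y, scores_y, probs_y, hy) - p,
                  scores_y.min() - 5, scores_y.max() + 5)

scores = np.arange(11.0)
probs_x = np.full(11, 1 / 11)                               # toy form-X distribution
probs_y = np.array([1, 1, 2, 3, 4, 5, 4, 3, 2, 1, 1], float)
probs_y /= probs_y.sum()                                    # toy form-Y distribution
print(equate(5, scores, probs_x, scores, probs_y))          # form-X score 5 on the Y scale
```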
Peer reviewed | PDF on ERIC (full text available)
Metsämuuronen, Jari – International Journal of Educational Methodology, 2020
A new index of item discrimination power (IDP), dimension-corrected Somers' D (D2), is proposed. Somers' D is one of the superior alternatives to the item-total (Rit) and item-rest (Rir) correlations in reflecting the real IDP of items scored 0/1 and 0/1/2, that is, with up to three categories. D also reaches the extreme values +1 and -1 correctly…
Descriptors: Item Analysis, Correlation, Test Items, Simulation
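For readers unfamiliar with Somers' D as an index of item discrimination power, the sketch below computes it by brute-force pairwise comparison, treating the item as the independent variable and the total score as the dependent one. The direction convention and the toy data are assumptions rather than details from the article, and the proposed dimension correction (D2) is not shown.

```python
# Minimal sketch (not the article's code): Somers' D as an item discrimination index,
# computed over all examinee pairs; ties on the item are excluded from the denominator.
from itertools import combinations

def somers_d(item, total):
    """Somers' D of `total` given `item` (item = independent variable)."""
    concordant = discordant = tied_on_item = 0
    for (g1, x1), (g2, x2) in combinations(zip(item, total), 2):
        if g1 == g2:
            tied_on_item += 1          # pair carries no information about the item
            continue
        if (g1 - g2) * (x1 - x2) > 0:
            concordant += 1
        elif (g1 - g2) * (x1 - x2) < 0:
            discordant += 1
        # pairs tied only on the total score stay in the denominator
    n_pairs = len(item) * (len(item) - 1) // 2
    return (concordant - discordant) / (n_pairs - tied_on_item)

item  = [0, 0, 1, 1, 2, 2, 1, 0]        # item scored 0/1/2
total = [3, 5, 8, 9, 12, 11, 7, 4]      # total test scores
print(somers_d(item, total))            # +1 here: every informative pair is concordant
```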
Peer reviewed | Direct link
Heine, Jörg-Henrik; Robitzsch, Alexander – Large-scale Assessments in Education, 2022
Research Question: This paper examines the overarching question of the extent to which different analytic choices may influence inferences about country-specific cross-sectional and trend estimates in international large-scale assessments. We take data from the assessment of PISA mathematics proficiency from the four rounds between 2003 and 2012 as a…
Descriptors: Foreign Countries, International Assessment, Achievement Tests, Secondary School Students
Peer reviewed | PDF on ERIC (full text available)
Ganzfried, Sam; Yusuf, Farzana – Education Sciences, 2018
A problem faced by many instructors is that of designing exams that accurately assess the abilities of the students. Typically, these exams are prepared several days in advance, and generic question scores are used based on a rough approximation of question difficulty and length. For example, for a recent class taught by the author, there were…
Descriptors: Weighted Scores, Test Construction, Student Evaluation, Multiple Choice Tests
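As a toy illustration of the weighting problem described in the entry above, the sketch below rescales question weights in inverse proportion to observed proportion-correct so that harder questions carry more weight. This particular rule and the data are purely hypothetical and are not the approach developed in the article.

```python
# Purely illustrative weighting rule (not the article's method): weight each
# question in inverse proportion to its proportion-correct, then rescale so the
# weights sum to the total number of points on the exam.
import numpy as np

responses = np.array([            # rows = students, cols = questions (1 = correct)
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 1, 0, 1],
])
p_correct = responses.mean(axis=0)            # rough difficulty proxy per question
raw = 1.0 / p_correct                         # harder question -> larger raw weight
weights = raw / raw.sum() * 100               # scale to a 100-point exam
scores = responses @ weights                  # weighted total per student
print(np.round(weights, 1), np.round(scores, 1))
```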
Peer reviewed | PDF on ERIC (full text available)
Lu, Ying – ETS Research Report Series, 2017
For standard- or criterion-based assessments, the use of cut scores to indicate mastery, nonmastery, or different levels of skill mastery is very common. As part of performance summary, it is of interest to examine the percentage of examinees at or above the cut scores (PAC) and how PAC evolves across administrations. This paper shows that…
Descriptors: Cutting Scores, Evaluation Methods, Mastery Learning, Performance Based Assessment
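The PAC statistic discussed in the entry above is simple to compute. The snippet below is a minimal sketch with invented administration data and a hypothetical cut score; it does not reproduce the report's analysis of how PAC behaves across administrations.

```python
# Minimal illustration (not the report's method): percentage of examinees at or
# above a cut score (PAC), tracked across administrations.
import numpy as np

def pac(scores, cut):
    """Percentage of examinees scoring at or above the cut score."""
    return 100.0 * np.mean(np.asarray(scores) >= cut)

administrations = {                 # invented score sets, one per administration
    "2015": [12, 18, 22, 25, 30, 14, 27],
    "2016": [15, 19, 23, 28, 31, 17, 26],
}
CUT = 20                            # hypothetical mastery cut score
for admin, scores in administrations.items():
    print(admin, f"PAC = {pac(scores, CUT):.1f}%")
```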
Peer reviewed | Direct link
Huggins-Manley, Anne Corinne – Educational and Psychological Measurement, 2017
This study defines subpopulation item parameter drift (SIPD) as a change in item parameters over time that is dependent on subpopulations of examinees, and hypothesizes that the presence of SIPD in anchor items is associated with bias and/or lack of invariance in three psychometric outcomes. Results show that SIPD in anchor items is associated…
Descriptors: Psychometrics, Test Items, Item Response Theory, Hypothesis Testing
Peer reviewed | Direct link
Undersander, Molly A.; Lund, Travis J.; Langdon, Laurie S.; Stains, Marilyne – Chemistry Education Research and Practice, 2017
The design of assessment tools is critical to accurately evaluate students' understanding of chemistry. Although extensive research has been conducted on various aspects of assessment tool design, few studies in chemistry have focused on the impact of the order in which questions are presented to students on the measurement of students'…
Descriptors: Test Construction, Scientific Concepts, Concept Formation, Science Education
Peer reviewed | Direct link
Wesolowski, Brian C. – International Journal of Music Education, 2017
The purpose of this study was to develop a valid and reliable rating scale to assess jazz rhythm sections in the context of jazz big band performance. The research questions that guided this study included: (a) what central factors contribute to the assessment of a jazz rhythm section? (b) what items should be used to describe and assess a jazz…
Descriptors: Test Construction, Rating Scales, Music, Evaluation Methods
Peer reviewed | Direct link
George, Ann Cathrice; Robitzsch, Alexander – Applied Measurement in Education, 2018
This article presents a new perspective on measuring gender differences in the large-scale assessment study Trends in International Mathematics and Science Study (TIMSS). The suggested empirical model is directly based on the theoretical competence model of the mathematics domain and thus includes the interaction between content and cognitive sub-competencies.…
Descriptors: Achievement Tests, Elementary Secondary Education, Mathematics Achievement, Mathematics Tests
Peer reviewed | Direct link
Todd, Amber; Romine, William L.; Cook Whitt, Katahdin – Science Education, 2017
We describe the development, validation, and use of the "Learning Progression-Based Assessment of Modern Genetics" (LPA-MG) in a high school biology context. Items were constructed based on a current learning progression framework for genetics (Shea & Duncan, 2013; Todd & Kenyon, 2015). The 34-item instrument, which was tied to…
Descriptors: Genetics, Science Instruction, High School Students, Evaluation Methods
Peer reviewed | Direct link
Finch, Holmes; Edwards, Julianne M. – Educational and Psychological Measurement, 2016
Standard approaches for estimating item response theory (IRT) model parameters generally work under the assumption that the latent trait being measured by a set of items follows the normal distribution. Estimation of IRT parameters in the presence of nonnormal latent traits has been shown to generate biased person and item parameter estimates. A…
Descriptors: Item Response Theory, Computation, Nonparametric Statistics, Bayesian Statistics
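The estimation issue studied in the entry above is easy to reproduce in simulation. The sketch below generates 2PL responses from a deliberately skewed (standardized gamma) latent trait distribution; the trait distribution, item parameters, and sample sizes are assumptions for illustration, and no estimator from the article is implemented.

```python
# Toy simulation (not the article's estimator): 2PL responses generated from a
# skewed latent trait, the setting in which normal-theory IRT estimation is biased.
import numpy as np

rng = np.random.default_rng(1)

def p_2pl(theta, a, b):
    """2PL probability of a correct response for each person-item pair."""
    return 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))

n_persons, n_items = 2000, 20
theta = rng.gamma(shape=2.0, scale=1.0, size=n_persons)
theta = (theta - theta.mean()) / theta.std()          # mean 0, SD 1, but right-skewed
a = rng.uniform(0.8, 2.0, n_items)                    # discrimination parameters
b = rng.normal(0.0, 1.0, n_items)                     # difficulty parameters
responses = rng.binomial(1, p_2pl(theta, a, b))       # persons x items response matrix
print(responses.shape, responses.mean())
```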
Peer reviewed | Direct link
Socha, Alan; DeMars, Christine E.; Zilberberg, Anna; Phan, Ha – International Journal of Testing, 2015
The Mantel-Haenszel (MH) procedure is commonly used to detect items that function differentially for groups of examinees from various demographic and linguistic backgrounds--for example, in international assessments. As in some other DIF methods, the total score is used to match examinees on ability. In thin matching, each of the total score…
Descriptors: Test Items, Educational Testing, Evaluation Methods, Ability Grouping
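A compact sketch of the Mantel-Haenszel common odds ratio underlying the procedure in the entry above is given below. Examinees are matched on total score with one stratum per score point (thin matching); the group labels and toy data are invented, and the thick-versus-thin matching comparison made in the article is not reproduced.

```python
# Hedged sketch of the Mantel-Haenszel common odds ratio for DIF detection,
# matching examinees on total score with one stratum per score point.
from collections import defaultdict
from math import log

def mh_odds_ratio(total, group, item):
    """total: matching score; group: 'ref' or 'focal'; item: 0/1 response to the studied item."""
    strata = defaultdict(list)
    for s, g, u in zip(total, group, item):
        strata[s].append((g, u))
    num = den = 0.0
    for members in strata.values():
        A = sum(g == 'ref' and u == 1 for g, u in members)    # reference group, correct
        B = sum(g == 'ref' and u == 0 for g, u in members)    # reference group, incorrect
        C = sum(g == 'focal' and u == 1 for g, u in members)  # focal group, correct
        D = sum(g == 'focal' and u == 0 for g, u in members)  # focal group, incorrect
        T = len(members)
        num += A * D / T
        den += B * C / T
    return num / den

total = [10, 10, 10, 10, 12, 12, 12, 12]
group = ['ref', 'ref', 'focal', 'focal'] * 2
item  = [1, 0, 1, 0, 1, 1, 1, 0]
alpha = mh_odds_ratio(total, group, item)
print(alpha, -2.35 * log(alpha))    # MH D-DIF on the ETS delta scale; alpha near 1 suggests no DIF
```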
Peer reviewed | Direct link
Kim, Do-Hong; Lambert, Richard G.; Durham, Sean; Burts, Diane C. – Early Education and Development, 2018
Research Findings: This study builds on prior work related to the assessment of young dual language learners (DLLs). The purposes of the study were to (a) determine whether latent subgroups of preschool DLLs would replicate those found previously and (b) examine the validity of GOLD® by Teaching Strategies with empirically derived subgroups.…
Descriptors: Preschool Education, Teaching Methods, Bilingualism, Bilingual Education
Thummaphan, Phonraphee – ProQuest LLC, 2017
The present study aimed to present innovative assessments that support students' learning in STEM education, using an integrative framework for Cognitive Diagnostic Modeling (CDM). This framework is based on three components: cognition, observation, and interpretation (National Research Council, 2001). Specifically, this dissertation…
Descriptors: STEM Education, Cognitive Processes, Observation, Psychometrics
Peer reviewed | PDF on ERIC (full text available)
Golovachyova, Viktoriya N.; Menlibekova, Gulbakhyt Zh.; Abayeva, Nella F.; Ten, Tatyana L.; Kogaya, Galina D. – International Journal of Environmental and Science Education, 2016
Using computer-based monitoring systems that rely on tests could be the most effective way to evaluate knowledge. The problem of objective knowledge assessment by means of testing takes on a new dimension in the context of new paradigms in education. The analysis of the existing test methods enabled us to conclude that tests with selected…
Descriptors: Expertise, Computer Assisted Testing, Student Evaluation, Knowledge Level