Showing 1 to 15 of 22 results
Peer reviewed
Kim, Stella Yun; Lee, Won-Chan – Applied Measurement in Education, 2023
This study evaluates various scoring methods including number-correct scoring, IRT theta scoring, and hybrid scoring in terms of scale-score stability over time. A simulation study was conducted to examine the relative performance of five scoring methods in terms of preserving the first two moments of scale scores for a population in a chain of…
Descriptors: Scoring, Comparative Analysis, Item Response Theory, Simulation
Peer reviewed
Kárász, Judit T.; Széll, Krisztián; Takács, Szabolcs – Quality Assurance in Education: An International Perspective, 2023
Purpose: Based on the general formula, which depends on the length and difficulty of the test, the number of respondents and the number of ability levels, this study aims to provide a closed formula for the adaptive tests with medium difficulty (probability of solution is p = 1/2) to determine the accuracy of the parameters for each item and in…
Descriptors: Test Length, Probability, Comparative Analysis, Difficulty Level
Peer reviewed
Akbari, Alireza; Shahnazari, Mohammadtaghi – Language Testing in Asia, 2019
The present research paper introduces a translation evaluation method called Calibrated Parsing Items Evaluation (CPIE hereafter). This evaluation method maximizes translators' performance through identifying the parsing items with an optimal p-docimology and d-index (item discrimination). This method checks all the possible parses (annotations)…
Descriptors: Test Items, Translation, Computer Software, Evaluators
Peer reviewed
Maeda, Hotaka; Zhang, Bo – International Journal of Testing, 2017
The omega (ω) statistic is reputed to be one of the best indices for detecting answer copying on multiple choice tests, but its performance relies on the accurate estimation of copier ability, which is challenging because responses from the copiers may have been contaminated. We propose an algorithm that aims to identify and delete the suspected…
Descriptors: Cheating, Test Items, Mathematics, Statistics
Peer reviewed
Oliveri, Maria Elena; Lawless, Rene; Robin, Frederic; Bridgeman, Brent – Applied Measurement in Education, 2018
We analyzed a pool of items from an admissions test for differential item functioning (DIF) for groups based on age, socioeconomic status, citizenship, or English language status using Mantel-Haenszel and item response theory. DIF items were systematically examined to identify their possible sources by item type, content, and wording. DIF was…
Descriptors: Test Bias, Comparative Analysis, Item Banks, Item Response Theory
Peer reviewed
Liu, Yan; Zumbo, Bruno D.; Gustafson, Paul; Huang, Yi; Kroc, Edward; Wu, Amery D. – Practical Assessment, Research & Evaluation, 2016
A variety of differential item functioning (DIF) methods have been proposed and used for ensuring that a test is fair to all test takers in a target population in the situations of, for example, a test being translated to other languages. However, once a method flags an item as DIF, it is difficult to conclude that the grouping variable (e.g.,…
Descriptors: Test Items, Test Bias, Probability, Scores
Peer reviewed
Çetin, Sevda; Gelbal, Selahattin – Educational Sciences: Theory and Practice, 2013
In this research, the cut score of a foundation university was re-calculated with bookmark method and with Angoff method, each of which is a standard setting method; and the cut scores found were compared with the current proficiency score. Thus, the final cut score was found to be 27.87 with the cooperative work of 17 experts through the Angoff…
Descriptors: Standard Setting (Scoring), Comparative Analysis, Cutting Scores, Correlation
Peer reviewed
Özyurt, Hacer; Özyurt, Özcan – Eurasian Journal of Educational Research, 2015
Problem Statement: Learning-teaching activities bring along the need to determine whether they achieve their goals. Thus, multiple choice tests addressing the same set of questions to all are frequently used. However, this traditional assessment and evaluation form contrasts with modern education, where individual learning characteristics are…
Descriptors: Probability, Adaptive Testing, Computer Assisted Testing, Item Response Theory
Peer reviewed
Demars, Christine E. – Applied Measurement in Education, 2011
Three types of effect sizes for DIF are described in this exposition: log of the odds-ratio (differences in log-odds), differences in probability-correct, and proportion of variance accounted for. Using these indices involves conceptualizing the degree of DIF in different ways. This integrative review discusses how these measures are impacted in…
Descriptors: Effect Size, Test Bias, Probability, Difficulty Level
Peer reviewed
Paek, Insu; Wilson, Mark – Educational and Psychological Measurement, 2011
This study elaborates the Rasch differential item functioning (DIF) model formulation under the marginal maximum likelihood estimation context. Also, the Rasch DIF model performance was examined and compared with the Mantel-Haenszel (MH) procedure in small sample and short test length conditions through simulations. The theoretically known…
Descriptors: Test Bias, Test Length, Statistical Inference, Geometric Concepts
Peer reviewed
Thompson, Nathan A. – Practical Assessment, Research & Evaluation, 2011
Computerized classification testing (CCT) is an approach to designing tests with intelligent algorithms, similar to adaptive testing, but specifically designed for the purpose of classifying examinees into categories such as "pass" and "fail." Like adaptive testing for point estimation of ability, the key component is the…
Descriptors: Adaptive Testing, Computer Assisted Testing, Classification, Probability
Peer reviewed
Wang, Wen-Chung; Huang, Sheng-Yun – Educational and Psychological Measurement, 2011
The one-parameter logistic model with ability-based guessing (1PL-AG) has been recently developed to account for effect of ability on guessing behavior in multiple-choice items. In this study, the authors developed algorithms for computerized classification testing under the 1PL-AG and conducted a series of simulations to evaluate their…
Descriptors: Computer Assisted Testing, Classification, Item Analysis, Probability
Peer reviewed
Atar, Burcu; Kamata, Akihito – Hacettepe University Journal of Education, 2011
The Type I error rates and the power of IRT likelihood ratio test and cumulative logit ordinal logistic regression procedures in detecting differential item functioning (DIF) for polytomously scored items were investigated in this Monte Carlo simulation study. For this purpose, 54 simulation conditions (combinations of 3 sample sizes, 2 sample…
Descriptors: Test Bias, Sample Size, Monte Carlo Methods, Item Response Theory
Peer reviewed
Wyse, Adam E.; Mapuranga, Raymond – International Journal of Testing, 2009
Differential item functioning (DIF) analysis is a statistical technique used for ensuring the equity and fairness of educational assessments. This study formulates a new DIF analysis method using the information similarity index (ISI). ISI compares item information functions when data fits the Rasch model. Through simulations and an international…
Descriptors: Test Bias, Evaluation Methods, Test Items, Educational Assessment
Peer reviewed
Xu, Xueli; von Davier, Matthias – ETS Research Report Series, 2008
Three strategies for linking two consecutive assessments are investigated and compared by analyzing reading data for the National Assessment of Educational Progress (NAEP) using the general diagnostic model. These strategies are compared in terms of marginal and joint expectations of skills, joint probabilities of skill patterns, and item…
Descriptors: National Competency Tests, Probability, Reading Achievement, Test Items