Brysbaert, Marc – Cognitive Research: Principles and Implications, 2024
Experimental psychology is witnessing an increase in research on individual differences, which requires the development of new tasks that can reliably assess variations among participants. To do this, cognitive researchers need statistical methods that many researchers have not learned during their training. The lack of expertise can pose…
Descriptors: Experimental Psychology, Individual Differences, Statistical Analysis, Task Analysis
Heine, Jörg-Henrik; Robitzsch, Alexander – Large-scale Assessments in Education, 2022
Research Question: This paper examines the overarching question of to what extent different analytic choices may influence the inference about country-specific cross-sectional and trend estimates in international large-scale assessments. We take data from the assessment of PISA mathematics proficiency from the four rounds from 2003 to 2012 as a…
Descriptors: Foreign Countries, International Assessment, Achievement Tests, Secondary School Students
Cappaert, Kevin J.; Wen, Yao; Chang, Yu-Feng – Measurement: Interdisciplinary Research and Perspectives, 2018
Events such as curriculum changes or practice effects can lead to item parameter drift (IPD) in computer adaptive testing (CAT). The current investigation introduced a point- and weight-adjusted D² method for IPD detection for use in a CAT environment when items are suspected of drifting across test administrations. Type I error and…
Descriptors: Adaptive Testing, Computer Assisted Testing, Test Items, Identification
Bashkov, Bozhidar M.; Clauser, Jerome C. – Practical Assessment, Research & Evaluation, 2019
Successful testing programs rely on high-quality test items to produce reliable scores and defensible exams. However, determining what statistical screening criteria are most appropriate to support these goals can be daunting. This study describes and demonstrates cost-benefit analysis as an empirical approach to determining appropriate screening…
Descriptors: Test Items, Test Reliability, Evaluation Criteria, Accuracy
Maeda, Hotaka; Zhang, Bo – International Journal of Testing, 2017
The omega (ω) statistic is reputed to be one of the best indices for detecting answer copying on multiple choice tests, but its performance relies on the accurate estimation of copier ability, which is challenging because responses from the copiers may have been contaminated. We propose an algorithm that aims to identify and delete the suspected…
Descriptors: Cheating, Test Items, Mathematics, Statistics
Khaksefidi, Saman – International Education Studies, 2017
This study investigates the psychological effect of a wrong question with wrong items on answering the next question in a test of structure. Forty students, selected through stratified random sampling, are given 15 questions from a standardized test, namely a TOEFL structure test, in which questions number 7 and number 11 are wrong and their answers…
Descriptors: Language Tests, English (Second Language), Second Language Learning, Statistical Analysis
Chalmers, R. Philip; Counsell, Alyssa; Flora, David B. – Educational and Psychological Measurement, 2016
Differential test functioning, or DTF, occurs when one or more items in a test demonstrate differential item functioning (DIF) and the aggregate of these effects is witnessed at the test level. In many applications, DTF can be more important than DIF when the overall effects of DIF at the test level can be quantified. However, optimal statistical…
Descriptors: Test Bias, Sampling, Test Items, Statistical Analysis
Hopfenbeck, Therese N.; Lenkeit, Jenny; El Masri, Yasmine; Cantrell, Kate; Ryan, Jeanne; Baird, Jo-Anne – Scandinavian Journal of Educational Research, 2018
International large-scale assessments are on the rise, with the Programme for International Student Assessment (PISA) seen by many as having strategic prominence in education policy debates. The present article reviews PISA-related English-language peer-reviewed articles from the programme's first cycle in 2000 to its most current in 2015. Five…
Descriptors: Foreign Countries, Achievement Tests, International Assessment, Secondary School Students
Padilla, Miguel A.; Divers, Jasmin – Educational and Psychological Measurement, 2013
The performance of the normal theory bootstrap (NTB), the percentile bootstrap (PB), and the bias-corrected and accelerated (BCa) bootstrap confidence intervals (CIs) for coefficient omega was assessed through a Monte Carlo simulation under conditions not previously investigated. Of particular interest were nonnormal Likert-type and binary items.…
Descriptors: Sampling, Statistical Inference, Computation, Statistical Analysis
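As a rough illustration of the percentile bootstrap (PB) named in the record above, the following sketch resamples examinees with replacement and takes empirical quantiles of a reliability estimate. It is not the authors' procedure. Coefficient omega requires fitting a factor model, so coefficient alpha is swapped in here as a simpler stand-in statistic. The function names, the default of 2000 resamples, and the quantile indexing rule are all assumptions made for this example.

```python
import random
from statistics import pvariance


def coefficient_alpha(rows):
    """Coefficient alpha for a list of examinee rows of item scores.

    Stand-in statistic for this sketch; the record above studies omega.
    """
    k = len(rows[0])
    item_vars = [pvariance([r[j] for r in rows]) for j in range(k)]
    total_var = pvariance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def percentile_bootstrap_ci(rows, stat, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap CI: resample examinees, take empirical quantiles."""
    rng = random.Random(seed)
    n = len(rows)
    estimates = []
    for _ in range(n_boot):
        # Draw n examinees with replacement from the observed sample.
        sample = [rows[rng.randrange(n)] for _ in range(n)]
        try:
            estimates.append(stat(sample))
        except ZeroDivisionError:
            # Degenerate resample (zero total-score variance); skip it.
            continue
    if not estimates:
        raise ValueError("every bootstrap resample was degenerate")
    estimates.sort()
    tail = (1 - level) / 2
    lo_idx = int(tail * (len(estimates) - 1))
    hi_idx = int((1 - tail) * (len(estimates) - 1))
    return estimates[lo_idx], estimates[hi_idx]
```

Calling `percentile_bootstrap_ci(data, coefficient_alpha)` on a list of 0/1 or Likert-type rows returns a `(lower, upper)` pair. The normal theory and BCa intervals mentioned in the abstract would need additional steps that are not shown here.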
Adeleke, A. A.; Joshua, E. O. – Journal of Education and Practice, 2015
Physics literacy plays a crucial part in global technological development, as several aspects of science and technology apply concepts and principles of physics in their operations. However, the acquisition of scientific literacy in physics in our society today does not yet meet the desirable standard. Therefore, this study focuses on…
Descriptors: Physics, Secondary School Students, Scientific Literacy, Foreign Countries
Qian, Jiahe; Jiang, Yanming; von Davier, Alina A. – ETS Research Report Series, 2013
Several factors could cause variability in item response theory (IRT) linking and equating procedures, such as the variability across examinee samples and/or test items, seasonality, regional differences, native language diversity, gender, and other demographic variables. Hence, the following question arises: Is it possible to select optimal…
Descriptors: Item Response Theory, Test Items, Sampling, True Scores
Koyama, Dennis; Sun, Angela; Ockey, Gary J. – Language Learning & Technology, 2016
Multiple-choice formats remain a popular design for assessing listening comprehension, yet no consensus has been reached on how multiple-choice formats should be employed. Some researchers argue that test takers must be provided with a preview of the items prior to the input (Buck, 1995; Sherman, 1997); others argue that a preview may decrease the…
Descriptors: Multiple Choice Tests, Listening Comprehension Tests, Statistical Analysis, Language Proficiency
Babcock, Ben – Applied Psychological Measurement, 2011
Relatively little research has been conducted with the noncompensatory class of multidimensional item response theory (MIRT) models. A Monte Carlo simulation study was conducted exploring the estimation of a two-parameter noncompensatory item response theory (IRT) model. The estimation method used was a Metropolis-Hastings within Gibbs algorithm…
Descriptors: Item Response Theory, Sampling, Computation, Statistical Analysis
Lorié, William A. – Online Submission, 2013
A reverse engineering approach to automatic item generation (AIG) was applied to a figure-based publicly released test item from the Organisation for Economic Cooperation and Development (OECD) Programme for International Student Assessment (PISA) mathematical literacy cognitive instrument as part of a proof of concept. The author created an item…
Descriptors: Numeracy, Mathematical Concepts, Mathematical Logic, Difficulty Level
Waller, Niels G. – Applied Psychological Measurement, 2008
Reliability is a property of test scores from individuals who have been sampled from a well-defined population. Reliability indices, such as coefficient alpha and related formulas for internal consistency reliability (KR-20, Hoyt's reliability), yield lower bound reliability estimates when (a) subjects have been sampled from a single population and when…
Descriptors: Test Items, Reliability, Scores, Psychometrics
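For readers unfamiliar with KR-20, one of the internal consistency formulas the record above mentions, a minimal sketch follows. It assumes dichotomous (0/1) item scores and uses population variances (dividing by n) throughout. The function name `kr20` and the toy data are illustrative only and do not come from the article.

```python
def kr20(responses):
    """KR-20 reliability for a list of examinee rows of 0/1 item scores."""
    n = len(responses)
    k = len(responses[0])
    # Proportion of examinees answering each item correctly.
    p = [sum(row[j] for row in responses) / n for j in range(k)]
    # Sum of item variances p * (1 - p).
    pq_sum = sum(pj * (1 - pj) for pj in p)
    # Population variance of the total scores.
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    return (k / (k - 1)) * (1 - pq_sum / var_total)


# Toy data (hypothetical): 4 examinees, 3 items.
scores = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(scores))  # 0.75
```

In the toy data the item variances sum to 0.625 and the total-score variance is 1.25, which gives (3/2) × (1 − 0.5) = 0.75.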