ERIC - Search Results

Showing 1 to 15 of 36 results

Designing and Evaluating Tasks to Measure Individual Differences in Experimental Psychology: A Tutorial

Peer reviewed

Marc Brysbaert – Cognitive Research: Principles and Implications, 2024

Experimental psychology is witnessing an increase in research on individual differences, which requires the development of new tasks that can reliably assess variations among participants. To do this, cognitive researchers need statistical methods that many researchers have not learned during their training. The lack of expertise can pose…

Descriptors: Experimental Psychology, Individual Differences, Statistical Analysis, Task Analysis

Evaluating the Effects of Analytical Decisions in Large-Scale Assessments: Analyzing PISA Mathematics 2003-2012

Peer reviewed

Heine, Jörg-Henrik; Robitzsch, Alexander – Large-scale Assessments in Education, 2022

Research Question: This paper examines the overarching question of to what extent different analytic choices may influence the inference about country-specific cross-sectional and trend estimates in international large-scale assessments. We take data from the assessment of PISA mathematics proficiency from the four rounds from 2003 to 2012 as a…

Descriptors: Foreign Countries, International Assessment, Achievement Tests, Secondary School Students

Evaluating CAT-Adjusted Approaches for Suspected Item Parameter Drift Detection

Peer reviewed

Cappaert, Kevin J.; Wen, Yao; Chang, Yu-Feng – Measurement: Interdisciplinary Research and Perspectives, 2018

Events such as curriculum changes or practice effects can lead to item parameter drift (IPD) in computer adaptive testing (CAT). The current investigation introduced a point- and weight-adjusted D² method for IPD detection for use in a CAT environment when items are suspected of drifting across test administrations. Type I error and…

Descriptors: Adaptive Testing, Computer Assisted Testing, Test Items, Identification

Determining Item Screening Criteria Using Cost-Benefit Analysis

Peer reviewed
PDF on ERIC

Bashkov, Bozhidar M.; Clauser, Jerome C. – Practical Assessment, Research & Evaluation, 2019

Successful testing programs rely on high-quality test items to produce reliable scores and defensible exams. However, determining what statistical screening criteria are most appropriate to support these goals can be daunting. This study describes and demonstrates cost-benefit analysis as an empirical approach to determining appropriate screening…

Descriptors: Test Items, Test Reliability, Evaluation Criteria, Accuracy

An Algorithm to Improve Test Answer Copying Detection Using the Omega Statistic

Peer reviewed

Maeda, Hotaka; Zhang, Bo – International Journal of Testing, 2017

The omega (ω) statistic is reputed to be one of the best indices for detecting answer copying on multiple choice tests, but its performance relies on the accurate estimation of copier ability, which is challenging because responses from the copiers may have been contaminated. We propose an algorithm that aims to identify and delete the suspected…

Descriptors: Cheating, Test Items, Mathematics, Statistics

The Psychological Effect of Errors in Standardized Language Test Items on EFL Students' Responses to the Following Item

Peer reviewed
PDF on ERIC

Khaksefidi, Saman – International Education Studies, 2017

This study investigates the psychological effect of a wrong question with wrong items on answering the next question in a test of structure. Forty students selected through stratified random sampling are given 15 questions of a standardized test, namely a TOEFL structure test, in which questions number 7 and number 11 are wrong and their answers…

Descriptors: Language Tests, English (Second Language), Second Language Learning, Statistical Analysis

It Might Not Make a Big DIF: Improved Differential Test Functioning Statistics That Account for Sampling Variability

Peer reviewed

Chalmers, R. Philip; Counsell, Alyssa; Flora, David B. – Educational and Psychological Measurement, 2016

Differential test functioning, or DTF, occurs when one or more items in a test demonstrate differential item functioning (DIF) and the aggregate of these effects is witnessed at the test level. In many applications, DTF can be more important than DIF when the overall effects of DIF at the test level can be quantified. However, optimal statistical…

Descriptors: Test Bias, Sampling, Test Items, Statistical Analysis

Lessons Learned from PISA: A Systematic Review of Peer-Reviewed Articles on the Programme for International Student Assessment

Peer reviewed

Hopfenbeck, Therese N.; Lenkeit, Jenny; El Masri, Yasmine; Cantrell, Kate; Ryan, Jeanne; Baird, Jo-Anne – Scandinavian Journal of Educational Research, 2018

International large-scale assessments are on the rise, with the Programme for International Student Assessment (PISA) seen by many as having strategic prominence in education policy debates. The present article reviews PISA-related English-language peer-reviewed articles from the programme's first cycle in 2000 to its most current in 2015. Five…

Descriptors: Foreign Countries, Achievement Tests, International Assessment, Secondary School Students

Coefficient Omega Bootstrap Confidence Intervals: Nonnormal Distributions

Peer reviewed

Padilla, Miguel A.; Divers, Jasmin – Educational and Psychological Measurement, 2013

The performance of the normal theory bootstrap (NTB), the percentile bootstrap (PB), and the bias-corrected and accelerated (BCa) bootstrap confidence intervals (CIs) for coefficient omega was assessed through a Monte Carlo simulation under conditions not previously investigated. Of particular interest were nonnormal Likert-type and binary items.…

Descriptors: Sampling, Statistical Inference, Computation, Statistical Analysis

Development and Validation of Scientific Literacy Achievement Test to Assess Senior Secondary School Students' Literacy Acquisition in Physics

Peer reviewed
PDF on ERIC

Adeleke, A. A.; Joshua, E. O. – Journal of Education and Practice, 2015

Physics literacy plays a crucial part in global technological development as several aspects of science and technology apply concepts and principles of physics in their operations. However, the acquisition of scientific literacy in physics in our society today is not encouraging enough to the desirable standard. Therefore, this study focuses on…

Descriptors: Physics, Secondary School Students, Scientific Literacy, Foreign Countries

Weighting Test Samples in IRT Linking and Equating: Toward an Improved Sampling Design for Complex Equating. Research Report. ETS RR-13-39

Peer reviewed
PDF on ERIC

Qian, Jiahe; Jiang, Yanming; von Davier, Alina A. – ETS Research Report Series, 2013

Several factors could cause variability in item response theory (IRT) linking and equating procedures, such as the variability across examinee samples and/or test items, seasonality, regional differences, native language diversity, gender, and other demographic variables. Hence, the following question arises: Is it possible to select optimal…

Descriptors: Item Response Theory, Test Items, Sampling, True Scores

The Effects of Item Preview on Video-Based Multiple-Choice Listening Assessments

Peer reviewed

Koyama, Dennis; Sun, Angela; Ockey, Gary J. – Language Learning & Technology, 2016

Multiple-choice formats remain a popular design for assessing listening comprehension, yet no consensus has been reached on how multiple-choice formats should be employed. Some researchers argue that test takers must be provided with a preview of the items prior to the input (Buck, 1995; Sherman, 1997); others argue that a preview may decrease the…

Descriptors: Multiple Choice Tests, Listening Comprehension Tests, Statistical Analysis, Language Proficiency

Estimating a Noncompensatory IRT Model Using Metropolis within Gibbs Sampling

Peer reviewed

Babcock, Ben – Applied Psychological Measurement, 2011

Relatively little research has been conducted with the noncompensatory class of multidimensional item response theory (MIRT) models. A Monte Carlo simulation study was conducted exploring the estimation of a two-parameter noncompensatory item response theory (IRT) model. The estimation method used was a Metropolis-Hastings within Gibbs algorithm…

Descriptors: Item Response Theory, Sampling, Computation, Statistical Analysis

An Application of Reverse Engineering to Automatic Item Generation: A Proof of Concept Using Automatically Generated Figures

Lorié, William A. – Online Submission, 2013

A reverse engineering approach to automatic item generation (AIG) was applied to a figure-based publicly released test item from the Organisation for Economic Cooperation and Development (OECD) Programme for International Student Assessment (PISA) mathematical literacy cognitive instrument as part of a proof of concept. The author created an item…

Descriptors: Numeracy, Mathematical Concepts, Mathematical Logic, Difficulty Level

Commingled Samples: A Neglected Source of Bias in Reliability Analysis

Peer reviewed

Waller, Niels G. – Applied Psychological Measurement, 2008

Reliability is a property of test scores from individuals who have been sampled from a well-defined population. Reliability indices, such as coefficient alpha and related formulas for internal consistency reliability (KR-20, Hoyt's reliability), yield lower bound reliability estimates when (a) subjects have been sampled from a single population and when…

Descriptors: Test Items, Reliability, Scores, Psychometrics
