Showing 1 to 15 of 17 results
Peer reviewed
Direct link
Ricardo Colaço; Pedro Freitas; Luis Catela Nunes; Ana Balcão Reis – Large-scale Assessments in Education, 2025
We analyse the PISA-reported convergence in the performance of private and public schools in Portugal. When PISA sampling weights are used, the number of students enrolled in those types of schools and specific grades/tracks of study differs significantly from official population figures. To account for those differences, we apply a…
Descriptors: Foreign Countries, Achievement Tests, International Assessment, Secondary School Students
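The abstract truncates before naming the adjustment method, so the following is only a generic illustration: post-stratification is one standard way to rescale survey weights so that weighted counts match official enrolment figures. A minimal Python sketch, with all strata and numbers invented:

import numpy as np

# Hypothetical post-stratification: rescale sampling weights so that the
# weighted student count in each stratum (e.g., school type x track)
# matches the official enrolment figure. All numbers are invented.
strata = np.array([0, 0, 1, 1, 2, 2])                   # stratum per student
weights = np.array([10.0, 12.0, 8.0, 9.0, 15.0, 14.0])  # original weights
official = {0: 30.0, 1: 20.0, 2: 25.0}                  # official counts

adjusted = weights.copy()
for s, target in official.items():
    mask = strata == s
    adjusted[mask] *= target / weights[mask].sum()  # scale to official total

for s, target in official.items():            # weighted counts now match
    assert np.isclose(adjusted[strata == s].sum(), target)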
Peer reviewed
Direct link
Paul A. Jewsbury; Matthew S. Johnson – Large-scale Assessments in Education, 2025
The standard methodology for many large-scale assessments in education involves regressing latent variables on numerous contextual variables to estimate proficiency distributions. To reduce the number of contextual variables used in the regression and improve estimation, we propose and evaluate principal component analysis on the covariance matrix…
Descriptors: Factor Analysis, Matrices, Regression (Statistics), Educational Assessment
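As a rough illustration of the approach described (principal component analysis on the covariance matrix of the contextual variables, then regression on the leading components), here is a sketch on simulated data. In operational latent-regression models the proficiency variable is latent rather than observed, so this observed-outcome version is only an analogy, not the authors' implementation:

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 50))    # simulated contextual variables
y = X[:, :3] @ np.array([0.5, -0.3, 0.2]) + rng.normal(size=1000)  # proxy outcome

# PCA on the covariance matrix of the contextual variables.
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))  # ascending order
order = np.argsort(eigvals)[::-1]
k = 10                                  # retain the top-k components
scores = Xc @ eigvecs[:, order[:k]]

# Regress the outcome on the component scores instead of all 50 variables.
Z = np.column_stack([np.ones(len(scores)), scores])
beta, *_ = np.linalg.lstsq(Z, y, rcond=None)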
Peer reviewed
PDF on ERIC (download full text)
Ayse Bilicioglu Gunes; Bayram Bicak – International Journal of Assessment Tools in Education, 2023
The main purpose of this study is to examine the Type I error and statistical power rates of Differential Item Functioning (DIF) techniques based on different theories under different conditions. For this purpose, a simulation study was conducted using the Mantel-Haenszel (MH), Logistic Regression (LR), Lord's χ², and Raju's Areas…
Descriptors: Test Items, Item Response Theory, Error of Measurement, Test Bias
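Of the DIF techniques named, the Mantel-Haenszel procedure is the most compact to illustrate. A generic textbook sketch of the MH common odds ratio and the ETS delta transform, assuming a dichotomous item and a discrete matching variable (not the simulation code used in the study):

import numpy as np

def mantel_haenszel_dif(correct, group, matching):
    """MH common odds ratio for one dichotomous item.
    correct: 0/1 responses; group: 0 = reference, 1 = focal;
    matching: stratum of the matching variable (e.g., rest score).
    Assumes each contributing stratum contains both groups."""
    num = den = 0.0
    for s in np.unique(matching):
        m = matching == s
        a = np.sum(m & (group == 0) & (correct == 1))  # reference right
        b = np.sum(m & (group == 0) & (correct == 0))  # reference wrong
        c = np.sum(m & (group == 1) & (correct == 1))  # focal right
        d = np.sum(m & (group == 1) & (correct == 0))  # focal wrong
        n = a + b + c + d
        if n > 0:
            num += a * d / n
            den += b * c / n
    alpha = num / den                    # 1.0 means no DIF
    return alpha, -2.35 * np.log(alpha)  # ETS delta scale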
Peer reviewed
Direct link
Kim, Hyung Jin; Lee, Won-Chan – Journal of Educational Measurement, 2022
Orlando and Thissen (2000) introduced the "S-X²" item-fit index for testing goodness-of-fit with dichotomous item response theory (IRT) models. This study considers and evaluates an alternative approach for computing "S-X²" values and other factors associated with collapsing tables of observed…
Descriptors: Goodness of Fit, Test Items, Item Response Theory, Computation
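The full S-X² statistic requires the Lord-Wingersky recursion to obtain model-expected proportions at each summed score; the sketch below keeps only the observed-versus-expected chi-square skeleton, with a caller-supplied (here hypothetical) expected_fn standing in for the recursion:

import numpy as np

def item_fit_chi2(correct, rest_score, expected_fn):
    """Observed-vs-expected item-fit chi-square, grouped by rest score.
    expected_fn(s) must return the model-implied P(correct) for score
    group s; the real S-X^2 derives these via the Lord-Wingersky
    recursion over summed scores, which is omitted here."""
    chi2 = 0.0
    for s in np.unique(rest_score):
        m = rest_score == s
        obs, exp = correct[m].mean(), expected_fn(s)
        chi2 += m.sum() * (obs - exp) ** 2 / (exp * (1 - exp))
    return chi2

# Toy stand-in for a fitted IRT model's expected proportions:
expected = lambda s: 1 / (1 + np.exp(-(s - 10) / 3))
rng = np.random.default_rng(1)
rest = rng.integers(0, 21, size=500)
resp = (rng.random(500) < expected(rest)).astype(int)
print(item_fit_chi2(resp, rest, expected))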
Peer reviewed
PDF on ERIC (download full text)
Basman, Munevver – International Journal of Assessment Tools in Education, 2023
One way to ensure the validity of a test is to check that all items yield similar results across different groups of individuals. However, differential item functioning (DIF) occurs when individuals with equal ability levels from different groups perform differently on the same test item. Based on Item Response Theory and Classical Test…
Descriptors: Test Bias, Test Items, Test Validity, Item Response Theory
Peer reviewed
Direct link
Wolkowitz, Amanda A.; Wright, Keith D. – Journal of Educational Measurement, 2019
This article explores the amount of equating error at a passing score when equating scores from exams with small sample sizes. It focuses on the classical test theory methods of Tucker linear, Levine linear, frequency estimation, and chained equipercentile equating. Both simulation and real data studies were used in the…
Descriptors: Error Patterns, Sample Size, Test Theory, Test Bias
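Tucker and Levine equating adjust score moments using an anchor test; the simpler anchor-free linear equating below shows the core moment-matching step and why small samples make an equated passing score noisy (a generic sketch, not the study's code):

import numpy as np

def linear_equate(x, scores_x, scores_y):
    """Anchor-free linear equating: map a Form X score onto the Form Y
    scale by matching means and standard deviations. Tucker and Levine
    methods adjust these moments using an anchor test; either way, small
    samples make the moment estimates (and the equated cut score) noisy."""
    mx, sx = scores_x.mean(), scores_x.std(ddof=1)
    my, sy = scores_y.mean(), scores_y.std(ddof=1)
    return my + (sy / sx) * (x - mx)

rng = np.random.default_rng(2)
form_x = rng.normal(70, 10, size=50)   # small Form X sample
form_y = rng.normal(72, 11, size=50)   # small Form Y sample
print(linear_equate(65.0, form_x, form_y))  # equated passing score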
Abdalla, Widad – ProQuest LLC, 2019
Trend scoring is often used in large-scale assessments to monitor for rater drift when the same constructed-response items are administered in multiple test administrations. In trend scoring, a set of responses from Time "A" is rescored by raters at Time "B." The purpose of this study is to examine the ability of…
Descriptors: Scoring, Interrater Reliability, Test Items, Error Patterns
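A common way to quantify rescore agreement in trend scoring is chance-corrected agreement between the Time A and Time B scores. A minimal Cohen's kappa sketch with invented ratings (the study's actual agreement criteria may differ):

import numpy as np

def cohens_kappa(time_a, time_b):
    """Chance-corrected agreement between original (Time A) scores and
    Time B rescores; a falling kappa across administrations is one
    symptom of rater drift."""
    cats = np.unique(np.concatenate([time_a, time_b]))
    po = np.mean(time_a == time_b)                        # observed agreement
    pe = sum(np.mean(time_a == c) * np.mean(time_b == c)  # chance agreement
             for c in cats)
    return (po - pe) / (1 - pe)

a = np.array([0, 1, 2, 2, 1, 0, 1])   # invented Time A scores
b = np.array([0, 1, 2, 1, 1, 0, 2])   # invented Time B rescores
print(cohens_kappa(a, b))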
Peer reviewed
Direct link
Rickard, Timothy C.; Pan, Steven C.; Gupta, Mohan W. – Journal of Experimental Psychology: Learning, Memory, and Cognition, 2022
We explored the possibility of publication bias in the sleep and explicit motor sequence learning literature by applying precision-effect test (PET) and precision-effect estimate with standard error (PEESE) weighted regression analyses to the 88 effect sizes from a recent comprehensive literature review (Pan & Rickard, 2015). Basic PET analysis…
Descriptors: Publications, Bias, Sleep, Psychomotor Skills
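PET and PEESE are weighted least-squares regressions of effect sizes on their standard errors (PET) or variances (PEESE), with the intercept serving as the bias-adjusted effect estimate. A self-contained sketch on simulated data (the 88 effects below are invented, not the review's data):

import numpy as np

def pet_peese(effects, se):
    """Weighted least squares with weights 1/SE^2. PET regresses effects
    on standard errors, PEESE on variances; each intercept estimates the
    effect of an ideally precise (SE = 0) study, i.e. adjusted for
    small-study/publication bias."""
    w = 1.0 / se**2
    def wls_intercept(predictor):
        X = np.column_stack([np.ones_like(se), predictor])
        beta = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (w * effects))
        return beta[0]
    return wls_intercept(se), wls_intercept(se**2)

rng = np.random.default_rng(3)
se = rng.uniform(0.05, 0.5, size=88)    # simulated standard errors
d = 0.2 + rng.normal(0.0, se)           # true effect 0.2, no bias
print(pet_peese(d, se))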
Peer reviewed
Direct link
Berrío, Ángela I.; Herrera, Aura N.; Gómez-Benito, Juana – Journal of Experimental Education, 2019
This study examined the effect of sample size ratio and model misfit on the Type I error rates and power of the Difficulty Parameter Differences procedure using Winsteps. A unidimensional 30-item test with responses from 130,000 examinees was simulated and four independent variables were manipulated: sample size ratio (20/100/250/500/1000); model…
Descriptors: Sample Size, Test Bias, Goodness of Fit, Statistical Analysis
Peer reviewed
Direct link
Park, Sunyoung; Beretvas, S. Natasha – Journal of Experimental Education, 2019
The log-odds ratio (ln[OR]) is commonly used to quantify treatments' effects on dichotomous outcomes and then pooled across studies using inverse-variance (1/v) weights. Calculation of the ln[OR]'s variance requires four cell frequencies for two groups crossed with values for dichotomous outcomes. While primary studies report the total sample size…
Descriptors: Sample Size, Meta Analysis, Statistical Analysis, Efficiency
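The quantities involved are easy to show concretely: each 2x2 table yields ln[OR] = ln(ad/bc) with variance 1/a + 1/b + 1/c + 1/d, and studies are pooled with 1/v weights. A worked sketch with invented tables:

import numpy as np

def ln_or_and_var(a, b, c, d):
    """Log-odds ratio and its variance from one 2x2 table
    (a, b = treatment events/non-events; c, d = control)."""
    return np.log((a * d) / (b * c)), 1/a + 1/b + 1/c + 1/d

# Two invented studies pooled with inverse-variance (1/v) weights.
tables = [(20, 80, 10, 90), (15, 35, 9, 41)]
est = np.array([ln_or_and_var(*t) for t in tables])
lnors, variances = est[:, 0], est[:, 1]
w = 1.0 / variances
pooled = np.sum(w * lnors) / np.sum(w)
print(pooled, np.sqrt(1.0 / np.sum(w)))   # pooled ln[OR] and its SE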
Peer reviewed
Direct link
Hood, Audrey V. B.; Whillock, Summer R.; Meade, Michelle L.; Hutchison, Keith A. – Journal of Experimental Psychology: Learning, Memory, and Cognition, 2023
Collaborative inhibition (reduced recall in collaborative vs. nominal groups) is a robust phenomenon. However, not everyone may be equally susceptible to collaborative inhibition; those higher in working memory capacity (WMC) may be less affected. In the current study, we examined the relationship between WMC and collaborative inhibition…
Descriptors: Short Term Memory, Recall (Psychology), Task Analysis, Error Patterns
Peer reviewed
Direct link
Zhang, Zhonghua; Zhao, Mingren – Journal of Educational Measurement, 2019
The present study evaluated the multiple imputation method, a procedure that is similar to the one suggested by Li and Lissitz (2004), and compared the performance of this method with that of the bootstrap method and the delta method in obtaining the standard errors for the estimates of the parameter scale transformation coefficients in item…
Descriptors: Item Response Theory, Error Patterns, Item Analysis, Simulation
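For the simplest linking case, the mean-sigma method gives the scale transformation coefficients in closed form, and a bootstrap standard error follows from resampling the anchor items. A sketch with invented difficulty estimates (the study additionally covers multiple imputation and the delta method, not shown here):

import numpy as np

def mean_sigma(b_x, b_y):
    """Mean-sigma transformation coefficients linking two calibrations of
    the same anchor items: Y scale = A * X scale + B."""
    A = b_y.std(ddof=1) / b_x.std(ddof=1)
    return A, b_y.mean() - A * b_x.mean()

def bootstrap_se(b_x, b_y, reps=2000, seed=4):
    """Bootstrap SEs of (A, B): resample anchor items with replacement
    and recompute the coefficients each time."""
    rng = np.random.default_rng(seed)
    n, draws = len(b_x), []
    for _ in range(reps):
        i = rng.integers(0, n, n)
        draws.append(mean_sigma(b_x[i], b_y[i]))
    return np.array(draws).std(axis=0, ddof=1)

# Invented anchor-item difficulty estimates from two calibrations.
b_x = np.array([-1.6, -1.2, -0.7, -0.3, 0.0, 0.2, 0.6, 0.9, 1.3, 1.8])
b_y = np.array([-1.4, -1.0, -0.5, -0.1, 0.3, 0.4, 0.9, 1.1, 1.6, 2.1])
print(bootstrap_se(b_x, b_y))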
Peer reviewed
Direct link
Doleman, Brett; Freeman, Suzanne C.; Lund, Jonathan N.; Williams, John P.; Sutton, Alex J. – Research Synthesis Methods, 2020
This study aimed to determine, for continuous outcomes dependent on baseline risk, whether funnel plot asymmetry may be due to statistical artefact rather than publication bias, and to evaluate a novel test to resolve this. First, we assessed publication bias in nine meta-analyses of postoperative analgesics (344 trials with 25,348…
Descriptors: Outcomes of Treatment, Risk, Publications, Bias
Peer reviewed
Direct link
Cormier, Damien C.; Van Norman, Ethan R.; Cheong, Clarissa; Kennedy, Kathleen E.; Bulut, Okan; Mrazik, Martin – Canadian Journal of School Psychology, 2019
This study aims to systematically evaluate the scoring errors made by psychologists in training, in the hopes of providing strong, empirically based guidelines to training programs. Survival analysis was used to determine the number of attempts required for graduate students to achieve proficiency in scoring standardized record forms from the…
Descriptors: Cognitive Measurement, Assessment Literacy, Scoring, Psychologists
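Survival analysis of attempts-to-proficiency can be illustrated with a bare-bones Kaplan-Meier estimator; all data below are invented, and the study's covariate-adjusted models go beyond this:

import numpy as np

def kaplan_meier(attempts, reached):
    """Kaplan-Meier estimate of the proportion of trainees not yet scoring
    proficiently after each attempt. attempts: attempt number at
    proficiency (or at last observation); reached: 1 if proficiency was
    achieved, 0 if the trainee was censored."""
    surv, s = {}, 1.0
    for t in np.sort(np.unique(attempts)):
        at_risk = np.sum(attempts >= t)
        events = np.sum((attempts == t) & (reached == 1))
        s *= 1.0 - events / at_risk
        surv[int(t)] = s
    return surv

attempts = np.array([2, 3, 3, 4, 5, 5, 6])   # invented training records
reached = np.array([1, 1, 1, 1, 1, 0, 1])    # one censored trainee
print(kaplan_meier(attempts, reached))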
Cai, Zhiqiang; Siebert-Evenstone, Amanda; Eagan, Brendan; Shaffer, David Williamson; Hu, Xiangen; Graesser, Arthur C. – Grantee Submission, 2019
Coding is a process of assigning meaning to a given piece of evidence. Evidence may be found in a variety of data types, including documents, research interviews, posts from social media, conversations from learning platforms, or any source of data that may provide insights for the questions under qualitative study. In this study, we focus on text…
Descriptors: Semantics, Computational Linguistics, Evidence, Coding
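In its simplest form, automated coding assigns a code when any of its patterns matches the text; the study above extends this kind of matching with semantic information. A toy regex-based coder (codebook entries invented):

import re

# Toy codebook: a code applies when any of its patterns matches the text.
# Both codes and patterns are invented for illustration.
CODEBOOK = {
    "data": [r"\bdata\b", r"\bevidence\b"],
    "modeling": [r"\bmodel(s|ing|ed)?\b", r"\bsimulat\w+"],
}

def code_text(snippet, codebook=CODEBOOK):
    return [code for code, patterns in codebook.items()
            if any(re.search(p, snippet, re.IGNORECASE) for p in patterns)]

print(code_text("We simulated the model using interview data."))
# -> ['data', 'modeling']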