Showing 1 to 15 of 32 results
Peer reviewed
Direct link
van der Linden, Wim J. – Journal of Educational and Behavioral Statistics, 2022
The current literature on test equating generally defines it as the process necessary to obtain score comparability between different test forms. This definition contrasts with Lord's foundational paper, which viewed equating as the process required to obtain comparability of the measurement scale between forms. The distinction between the notions…
Descriptors: Equated Scores, Test Items, Scores, Probability
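The distinction drawn above is between comparability of scores and comparability of measurement scales. As a minimal illustration of the score-comparability side, the sketch below applies classical linear equating (matching means and standard deviations) to place Form X scores on the Form Y scale; the data are simulated and the rule is a generic textbook one, not the procedure developed in the article.

```python
import numpy as np

def linear_equate(x_scores, y_scores):
    """Map Form X scores onto the Form Y scale by matching means and SDs.

    Standard linear-equating rule: e(x) = mu_y + (sigma_y / sigma_x) * (x - mu_x).
    """
    mu_x, sigma_x = np.mean(x_scores), np.std(x_scores, ddof=1)
    mu_y, sigma_y = np.mean(y_scores), np.std(y_scores, ddof=1)
    return lambda x: mu_y + (sigma_y / sigma_x) * (np.asarray(x) - mu_x)

# Hypothetical raw-score samples from two forms given to equivalent groups.
rng = np.random.default_rng(0)
form_x = rng.normal(30, 6, size=500)
form_y = rng.normal(33, 5, size=500)

equate = linear_equate(form_x, form_y)
print(equate([20, 30, 40]))   # Form X scores expressed on the Form Y scale
```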
Peer reviewed
PDF on ERIC Download full text
Metsämuuronen, Jari – Practical Assessment, Research & Evaluation, 2022
The reliability of a test score is usually underestimated and the deflation may be profound, 0.40 - 0.60 units of reliability or 46 - 71%. Eight root sources of the deflation are discussed and quantified by a simulation with 1,440 real-world datasets: (1) errors in the measurement modelling, (2) inefficiency in the estimator of reliability within…
Descriptors: Test Reliability, Scores, Test Items, Correlation
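Reliability deflation can be illustrated with one of its simplest mechanisms: coarsening continuous item scores deflates coefficient alpha relative to the same data before coarsening. The simulation below is a toy example in that spirit only; it does not reproduce the eight deflation sources or the 1,440-dataset simulation described in the abstract.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_persons, n_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

rng = np.random.default_rng(1)
n, k = 2000, 20
theta = rng.normal(size=(n, 1))                              # latent trait
continuous = theta + rng.normal(scale=1.0, size=(n, k))      # continuous item scores
binary = (continuous > rng.normal(size=(1, k))).astype(int)  # same data, dichotomized

print("alpha, continuous items  :", round(cronbach_alpha(continuous), 3))
print("alpha, dichotomized items:", round(cronbach_alpha(binary), 3))
```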
Peer reviewed
PDF on ERIC Download full text
Metsämuuronen, Jari – International Journal of Educational Methodology, 2021
Although Goodman-Kruskal gamma (G) is used relatively rarely, it has promising potential as a coefficient of association in educational settings. Characteristics of G are studied in three sub-studies related to educational measurement settings. G appears to be unexpectedly appealing as an estimator of association between an item and a score because…
Descriptors: Educational Assessment, Measurement, Item Analysis, Correlation
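Goodman-Kruskal gamma is the difference between concordant and discordant pairs divided by their sum, ignoring ties. The sketch below computes it for a hypothetical dichotomous item against a total score; the O(n^2) pair count is for illustration only and the data are made up.

```python
from itertools import combinations

def goodman_kruskal_gamma(x, y):
    """Gamma = (concordant - discordant) / (concordant + discordant), ties ignored."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    if concordant + discordant == 0:
        return float("nan")
    return (concordant - discordant) / (concordant + discordant)

# Hypothetical dichotomous item responses and the corresponding total scores.
item  = [0, 0, 1, 0, 1, 1, 1, 0, 1, 1]
score = [3, 5, 6, 4, 8, 7, 9, 2, 6, 10]
print(goodman_kruskal_gamma(item, score))
```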
Peer reviewed
Direct link
Abu-Ghazalah, Rashid M.; Dubins, David N.; Poon, Gregory M. K. – Applied Measurement in Education, 2023
Multiple choice results are inherently probabilistic outcomes, as correct responses reflect a combination of knowledge and guessing, while incorrect responses additionally reflect blunder, a confidently committed mistake. To objectively resolve knowledge from responses in an MC test structure, we evaluated probabilistic models that explicitly…
Descriptors: Guessing (Tests), Multiple Choice Tests, Probability, Models
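A minimal version of the knowledge/guessing/blunder idea treats each response as known with probability k, blundered with probability b, or otherwise guessed at random among m options, so that P(correct) = k + (1 - k - b)/m. The simulation below uses that toy decomposition with made-up parameters; it is a sketch of the general idea, not the authors' model.

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate_mc(n_items, p_know, p_blunder, n_options):
    """Simulate one examinee: each item is known, blundered, or guessed at random."""
    u = rng.random(n_items)
    known = u < p_know
    blunder = (u >= p_know) & (u < p_know + p_blunder)
    guessed = ~known & ~blunder
    correct = known | (guessed & (rng.random(n_items) < 1.0 / n_options))
    return correct.mean()

# Under this toy model, E[proportion correct] = k + (1 - k - b) / m, so knowledge
# can be backed out when the blunder rate is known or modelled separately.
m = 4
p_correct = simulate_mc(10_000, p_know=0.55, p_blunder=0.10, n_options=m)
p_know_hat = (p_correct - (1 - 0.10) / m) / (1 - 1.0 / m)
print(round(p_correct, 3), round(p_know_hat, 3))
```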
Peer reviewed
Direct link
Chen, Michelle Y.; Liu, Yan; Zumbo, Bruno D. – Educational and Psychological Measurement, 2020
This study introduces a novel differential item functioning (DIF) method based on propensity score matching that tackles two challenges in analyzing performance assessment data, that is, continuous task scores and lack of a reliable internal variable as a proxy for ability or aptitude. The proposed DIF method consists of two main stages. First,…
Descriptors: Probability, Scores, Evaluation Methods, Test Items
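In the spirit of the two stages described above, the sketch below (simulated data, scikit-learn logistic regression) estimates propensity scores for focal-group membership from background covariates, forms coarse propensity strata, and compares the continuous task score across groups within strata. The authors' matching and testing procedure is more elaborate; this is only an outline of the idea.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n = 1000
group = rng.integers(0, 2, n)                                  # focal (1) vs reference (0)
covariates = rng.normal(size=(n, 3)) + group[:, None] * 0.4    # background variables
task = 0.8 * covariates.sum(axis=1) + rng.normal(size=n)       # continuous task score, no true DIF

# Stage 1: propensity of focal-group membership given covariates, then coarse strata.
ps = LogisticRegression().fit(covariates, group).predict_proba(covariates)[:, 1]
strata = np.digitize(ps, np.quantile(ps, [0.2, 0.4, 0.6, 0.8]))

# Stage 2: within-stratum group differences in the task score; consistently nonzero
# pooled differences would flag the task for DIF follow-up.
diffs = [task[(strata == s) & (group == 1)].mean() - task[(strata == s) & (group == 0)].mean()
         for s in np.unique(strata)]
print(np.round(diffs, 3))
```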
Peer reviewed
PDF on ERIC Download full text
Sandoval-Bravo, Salvador; Celso-Arellano, Pedro Luis; Gualajara, Victor; Coronado, Semei – European Journal of Contemporary Education, 2019
The objective of this study is to analyze the ability of students from different economic-administrative undergraduate programs at the University Center for the Economic Administrative Sciences, which forms part of the University of Guadalajara, to solve distinct problems in the area of probability, applying a multiple-choice instrument aligned to…
Descriptors: Probability, Undergraduate Students, Economics Education, Problem Solving
Peer reviewed
PDF on ERIC Download full text
Tsubaki, Michiko; Ogawara, Wataru; Tanaka, Kenta – International Electronic Journal of Mathematics Education, 2020
This study proposes and examines an analytical method aimed at improving the quality of education and learning by using answers to full descriptive questions in probability and statistics to derive variables that capture learners' comprehension of the studied content as answer characteristics, based on actual student mistakes. First, we proposed…
Descriptors: Probability, Statistics, Comprehension, Learning Strategies
Peer reviewed
Direct link
Culpepper, Steven Andrew – Journal of Educational and Behavioral Statistics, 2017
In the absence of clear incentives, achievement tests may be subject to the effect of slipping where item response functions have upper asymptotes below one. Slipping reduces score precision for higher latent scores and distorts test developers' understandings of item and test information. A multidimensional four-parameter normal ogive model was…
Descriptors: Measurement, Achievement Tests, Item Response Theory, National Competency Tests
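Slipping corresponds to an upper asymptote d below one in the item response function. The sketch below uses a four-parameter logistic form for simplicity (the article works with the normal ogive, but the shape is analogous) and illustrative parameter values.

```python
import numpy as np

def irf_4p(theta, a, b, c, d):
    """Four-parameter logistic item response function.
    c = lower asymptote (guessing), d = upper asymptote (< 1 when slipping occurs)."""
    return c + (d - c) / (1.0 + np.exp(-a * (theta - b)))

theta = np.linspace(-3, 3, 7)
print(np.round(irf_4p(theta, a=1.5, b=0.0, c=0.20, d=0.90), 3))
# Even very able examinees top out near d = 0.90: the "slipping" effect that
# reduces score precision at the high end of the latent scale.
```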
Peer reviewed
Direct link
Tijmstra, Jesper; Hessen, David J.; van der Heijden, Peter G. M.; Sijtsma, Klaas – Psychometrika, 2013
Most dichotomous item response models share the assumption of latent monotonicity, which states that the probability of a positive response to an item is a nondecreasing function of a latent variable intended to be measured. Latent monotonicity cannot be evaluated directly, but it implies manifest monotonicity across a variety of observed scores,…
Descriptors: Item Response Theory, Statistical Inference, Probability, Psychometrics
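Manifest monotonicity can be inspected by tabulating the proportion correct on an item within rest-score groups and checking that it does not decrease. The sketch below does exactly that on simulated data; sampling noise in sparse groups can produce apparent violations even when latent monotonicity holds, which is why formal inference procedures such as those studied in the article are needed.

```python
import numpy as np

def manifest_monotonicity_check(item, rest_score):
    """Proportion correct within each observed rest-score group; manifest
    monotonicity requires these proportions to be nondecreasing in the rest score."""
    groups = np.unique(rest_score)
    props = np.array([item[rest_score == g].mean() for g in groups])
    return groups, props, bool(np.all(np.diff(props) >= 0))

rng = np.random.default_rng(4)
theta = rng.normal(size=2000)
difficulties = np.linspace(-1, 1, 10)
p = 1 / (1 + np.exp(-(theta[:, None] - difficulties)))   # monotone IRFs by construction
items = (rng.random((2000, 10)) < p).astype(int)

item0 = items[:, 0]
rest = items[:, 1:].sum(axis=1)          # rest score: total over the other items
print(manifest_monotonicity_check(item0, rest))
```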
Peer reviewed
Direct link
Banks, Kathleen; Jeddeeni, Ahmad; Walker, Cindy M. – International Journal of Testing, 2016
Differential bundle functioning (DBF) analyses were conducted to determine whether seventh- and eighth-grade second language learners (SLLs) had lower probabilities of correctly answering bundles of math word problems with heavy language demands, when compared to non-SLLs of equal math proficiency. Math word problems on each of four test forms…
Descriptors: Middle School Students, English Language Learners, Second Language Learning, Grade 7
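A rough sketch of the bundle-level comparison: sum the language-heavy items into a bundle score, match examinees on the score over the remaining items, and compare SLL and non-SLL bundle means within matching strata. The data and the extra "language demand" penalty below are hypothetical, and the procedure is a simplification of a full DBF analysis.

```python
import numpy as np

rng = np.random.default_rng(5)
n, n_items = 1500, 30
bundle = np.arange(0, 6)                 # indices of the language-heavy word problems
sll = rng.integers(0, 2, n)              # 1 = second language learner

theta = rng.normal(size=n)
p = 1 / (1 + np.exp(-(theta[:, None] - rng.normal(0, 0.7, n_items))))
p[:, bundle] -= 0.08 * sll[:, None]      # hypothetical extra language demand for SLLs
responses = (rng.random((n, n_items)) < np.clip(p, 0, 1)).astype(int)

bundle_score = responses[:, bundle].sum(axis=1)
rest_score = np.delete(responses, bundle, axis=1).sum(axis=1)

# Compare bundle performance between groups within matching rest-score strata.
strata = np.digitize(rest_score, np.quantile(rest_score, [0.25, 0.5, 0.75]))
diffs = [bundle_score[(strata == s) & (sll == 1)].mean()
         - bundle_score[(strata == s) & (sll == 0)].mean() for s in np.unique(strata)]
print(np.round(diffs, 2))                # consistently negative values suggest DBF against SLLs
```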
Peer reviewed
PDF on ERIC Download full text
Liu, Yan; Zumbo, Bruno D.; Gustafson, Paul; Huang, Yi; Kroc, Edward; Wu, Amery D. – Practical Assessment, Research & Evaluation, 2016
A variety of differential item functioning (DIF) methods have been proposed and used for ensuring that a test is fair to all test takers in a target population in the situations of, for example, a test being translated to other languages. However, once a method flags an item as DIF, it is difficult to conclude that the grouping variable (e.g.,…
Descriptors: Test Items, Test Bias, Probability, Scores
Peer reviewed
PDF on ERIC Download full text
Papaieronymou, Irini – Athens Journal of Education, 2017
This paper presents the results of a study which examined the role of particular tasks implemented through two instructional methods on college students' achievement in probability. A mixed methods design that utilized a pre-test (with multiple-choice items) and post-test (with multiple-choice and open-ended items) in treatment and control groups…
Descriptors: Teaching Methods, College Students, Academic Achievement, Probability
National Assessment Governing Board, 2017
The National Assessment of Educational Progress (NAEP) is the only continuing and nationally representative measure of trends in academic achievement of U.S. elementary and secondary school students in various subjects. For more than four decades, NAEP assessments have been conducted periodically in reading, mathematics, science, writing, U.S.…
Descriptors: Mathematics Achievement, Multiple Choice Tests, National Competency Tests, Educational Trends
Peer reviewed
Direct link
Andjelic, Svetlana; Cekerevac, Zoran – Education and Information Technologies, 2014
This article presents an original model of computer adaptive testing and grade formation, based on scientifically recognized theories. The basis of the model is a personalized algorithm for selecting questions depending on the accuracy of the answer to the previous question. The test is divided into three basic levels of difficulty, and the…
Descriptors: Computer Assisted Testing, Educational Technology, Grades (Scholastic), Test Construction
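As a toy version of difficulty-adaptive selection, the sketch below steps up a three-level item pool after a correct answer and down after an incorrect one. The pool, answer probabilities, and up/down rule are hypothetical and far simpler than the personalized selection algorithm and grading model described in the article.

```python
import random

# Hypothetical item pool with the three difficulty levels mentioned in the abstract.
POOL = {level: [f"{level}-Q{i}" for i in range(1, 11)] for level in ("easy", "medium", "hard")}
ORDER = ["easy", "medium", "hard"]

def adaptive_test(answer_fn, n_questions=10, start="medium"):
    """Minimal adaptive loop: step up after a correct answer, down after an incorrect one."""
    level, administered = start, []
    for _ in range(n_questions):
        question = random.choice(POOL[level])
        correct = answer_fn(question, level)
        administered.append((question, level, correct))
        idx = ORDER.index(level)
        level = ORDER[min(idx + 1, 2)] if correct else ORDER[max(idx - 1, 0)]
    return administered

# Simulated examinee who handles easy/medium items well and hard items poorly.
examinee = lambda q, level: random.random() < {"easy": 0.9, "medium": 0.7, "hard": 0.3}[level]
for step in adaptive_test(examinee):
    print(step)
```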
Peer reviewed
PDF on ERIC Download full text
Long, Caroline; Bansilal, Sarah; Debba, Rajan – Pythagoras, 2014
Mathematical Literacy (ML) is a relatively new school subject that learners study in the final 3 years of high school and is examined as a matric subject. An investigation of a 2009 provincial examination written by matric pupils was conducted on both the curriculum elements of the test and learner performance. In this study we supplement the…
Descriptors: Numeracy, Item Response Theory, Mathematics Instruction, Test Items