van der Linden, Wim J. – Journal of Educational and Behavioral Statistics, 2022
The current literature on test equating generally defines it as the process necessary to obtain score comparability between different test forms. This definition contrasts with Lord's foundational paper, which viewed equating as the process required to obtain comparability of the measurement scale between forms. The distinction between the notions…
Descriptors: Equated Scores, Test Items, Scores, Probability
Raykov, Tenko; Marcoulides, George A.; Pusic, Martin – Measurement: Interdisciplinary Research and Perspectives, 2021
An interval estimation procedure is discussed that can be used to evaluate the probability of a particular response for a binary or binary-scored item at a pre-specified point along an underlying latent continuum. The item is assumed to: (a) be part of a unidimensional multi-component measuring instrument that may also contain polytomous items,…
Descriptors: Item Response Theory, Computation, Probability, Test Items
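The quantity this entry concerns — the probability of a given response at a pre-specified point on the latent continuum — is, under a standard two-parameter logistic (2PL) IRT model, the value of the item response function at that point. A minimal sketch of that function (not the authors' interval-estimation procedure; the item parameters below are hypothetical):

```python
import math

def irf_2pl(theta: float, a: float, b: float) -> float:
    """2PL item response function: probability of a correct (1) response
    for an item with discrimination a and difficulty b, evaluated at
    latent trait level theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item: discrimination a = 1.2, difficulty b = 0.5,
# evaluated at the pre-specified point theta = 1.0.
p = irf_2pl(theta=1.0, a=1.2, b=0.5)
```

The interval-estimation procedure in the article would place a confidence interval around this point probability; the sketch only shows the point value itself.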
The Reliability of the Posterior Probability of Skill Attainment in Diagnostic Classification Models
Johnson, Matthew S.; Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2020
One common score reported from diagnostic classification assessments is the vector of posterior means of the skill mastery indicators. As with any assessment, it is important to derive and report estimates of the reliability of the reported scores. After reviewing a reliability measure suggested by Templin and Bradshaw, this article suggests three…
Descriptors: Reliability, Probability, Skill Development, Classification
Wallin, Gabriel; Wiberg, Marie – Journal of Educational and Behavioral Statistics, 2019
When equating two test forms, the equated scores will be biased if the test groups differ in ability. To adjust for the ability imbalance between nonequivalent groups, a set of common items is often used. When no common items are available, it has been suggested to use covariates correlated with the test scores instead. In this article, we reduce…
Descriptors: Equated Scores, Test Items, Probability, College Entrance Examinations
Solano-Flores, Guillermo – Applied Measurement in Education, 2014
This article addresses validity and fairness in the testing of English language learners (ELLs)--students in the United States who are developing English as a second language. It discusses limitations of current approaches to examining the linguistic features of items and their effect on the performance of ELL students. The article submits that…
Descriptors: English Language Learners, Test Items, Probability, Test Bias
Raykov, Tenko; Marcoulides, George A.; Lee, Chun-Lung; Chang, Chi – Educational and Psychological Measurement, 2013
This note is concerned with a latent variable modeling approach for the study of differential item functioning in a multigroup setting. A multiple-testing procedure that can be used to evaluate group differences in response probabilities on individual items is discussed. The method is readily employed when the aim is also to locate possible…
Descriptors: Test Bias, Statistical Analysis, Models, Hypothesis Testing
Zu, Jiyun; Yuan, Ke-Hai – Journal of Educational Measurement, 2012
In the nonequivalent groups with anchor test (NEAT) design, the standard error of linear observed-score equating is commonly estimated by an estimator derived assuming multivariate normality. However, real data are seldom normally distributed, causing this normal estimator to be inconsistent. A general estimator, which does not rely on the…
Descriptors: Sample Size, Equated Scores, Test Items, Error of Measurement
Ferrando, Pere J. – Psicologica: International Journal of Methodology and Experimental Psychology, 2012
Model-based attempts to rigorously study the broad and imprecise concept of "discriminating power" are scarce, and generally limited to nonlinear models for binary responses. This paper proposes a comprehensive framework for assessing the discriminating power of item and test scores which are analyzed or obtained using Spearman's…
Descriptors: Student Evaluation, Psychometrics, Test Items, Scores
Demars, Christine E. – Applied Measurement in Education, 2011
Three types of effect sizes for DIF are described in this exposition: log of the odds-ratio (differences in log-odds), differences in probability-correct, and proportion of variance accounted for. Using these indices involves conceptualizing the degree of DIF in different ways. This integrative review discusses how these measures are impacted in…
Descriptors: Effect Size, Test Bias, Probability, Difficulty Level
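The first two effect sizes named in this entry can be computed directly from the proportion answering an item correctly in each group (in practice, conditional on matched ability). A minimal sketch with hypothetical proportions, not taken from the article:

```python
import math

def dif_effect_sizes(p_ref: float, p_focal: float) -> tuple[float, float]:
    """Two DIF effect sizes: the log odds-ratio (difference in log-odds)
    and the difference in probability-correct, reference minus focal."""
    log_odds_ratio = (math.log(p_ref / (1.0 - p_ref))
                      - math.log(p_focal / (1.0 - p_focal)))
    prob_difference = p_ref - p_focal
    return log_odds_ratio, prob_difference

# Hypothetical matched-group proportions correct on one item.
lor, dp = dif_effect_sizes(p_ref=0.70, p_focal=0.60)
```

A log odds-ratio near zero and a probability difference near zero both indicate negligible DIF; the two indices can rank items differently because the log-odds scale stretches differences near 0 and 1.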
Thompson, Nathan A. – Practical Assessment, Research & Evaluation, 2011
Computerized classification testing (CCT) is an approach to designing tests with intelligent algorithms, similar to adaptive testing, but specifically designed for the purpose of classifying examinees into categories such as "pass" and "fail." Like adaptive testing for point estimation of ability, the key component is the…
Descriptors: Adaptive Testing, Computer Assisted Testing, Classification, Probability
Tseng, Mei-Hui; Fu, Chung-Pei; Wilson, Brenda N.; Hu, Fu-Chang – Research in Developmental Disabilities: A Multidisciplinary Journal, 2010
The aim of this study was to adapt and evaluate the Developmental Coordination Disorder Questionnaire (DCDQ) for use in Chinese-speaking countries. A total of 1082 parents completed the DCDQ and 35 parents repeated it after 2 weeks for test-retest reliability. Two items were deleted after examination of test consistency. Cronbach's α for the…
Descriptors: Test Validity, Measures (Individuals), Psychometrics, Probability
Arndt, Jason – Journal of Experimental Psychology: Learning, Memory, and Cognition, 2010
Using 3 experiments, I examined false memory for encoding context by presenting Deese-Roediger-McDermott themes (Deese, 1959; Roediger & McDermott, 1995) in unusual-looking fonts and by testing related, but unstudied, lure items in a font that was shown during encoding. In 2 of the experiments, testing lure items in the font used to study their…
Descriptors: Testing, Recognition (Psychology), Experiments, Memory
Gierl, Mark J.; Zheng, Yinggan; Cui, Ying – Journal of Educational Measurement, 2008
The purpose of this study is to describe how the attribute hierarchy method (AHM) can be used to evaluate differential group performance at the cognitive attribute level. The AHM is a psychometric method for classifying examinees' test item responses into a set of attribute-mastery patterns associated with different components in a cognitive model…
Descriptors: Test Items, Student Reaction, Pattern Recognition, Psychometrics
Chandler, Steve – Journal of Psycholinguistic Research, 2008
Skousen's (1989, Analogical modeling of language, Kluwer Academic Publishers, Dordrecht) Analogical Model (AM) predicts behavior such as spelling pronunciation by comparing the characteristics of a test item (a given input word) to those of individual exemplars in a data set of previously encountered items. While AM and other exemplar-based models…
Descriptors: Test Items, Reaction Time, Psycholinguistics, Probability
Clauser, Brian E.; Harik, Polina; Margolis, Melissa J.; McManus, I. C.; Mollon, Jennifer; Chis, Liliana; Williams, Simon – Applied Measurement in Education, 2009
Numerous studies have compared the Angoff standard-setting procedure to other standard-setting methods, but relatively few studies have evaluated the procedure based on internal criteria. This study uses a generalizability theory framework to evaluate the stability of the estimated cut score. To provide a measure of internal consistency, this…
Descriptors: Generalizability Theory, Group Discussion, Standard Setting (Scoring), Scoring