ERIC - Search Results

Showing 1 to 15 of 29 results

Utilizing Real-Time Test Data to Solve Attenuation Paradox in Computerized Adaptive Testing to Enhance Optimal Design

Peer reviewed

Jyun-Hong Chen; Hsiu-Yi Chao – Journal of Educational and Behavioral Statistics, 2024

To solve the attenuation paradox in computerized adaptive testing (CAT), this study proposes an item selection method, the integer programming approach based on real-time test data (IPRD), to improve test efficiency. The IPRD method turns information regarding the ability distribution of the population from real-time test data into feasible test…

Descriptors: Data Use, Computer Assisted Testing, Adaptive Testing, Design

Improving Balance in Educational Measurement: A Legacy of E. F. Lindquist

Peer reviewed

Daniel Koretz – Journal of Educational and Behavioral Statistics, 2024

A critically important balance in educational measurement between practical concerns and matters of technique has atrophied in recent decades, and as a result, some important issues in the field have not been adequately addressed. I start with the work of E. F. Lindquist, who exemplified the balance that is now wanting. Lindquist was arguably the…

Descriptors: Educational Assessment, Evaluation Methods, Achievement Tests, Educational History

A Two-Level Adaptive Test Battery

Peer reviewed

Wim J. van der Linden; Luping Niu; Seung W. Choi – Journal of Educational and Behavioral Statistics, 2024

A test battery with two different levels of adaptation is presented: a within-subtest level for the selection of the items in the subtests and a between-subtest level to move from one subtest to the next. The battery runs on a two-level model consisting of a regular response model for each of the subtests extended with a second level for the joint…

Descriptors: Adaptive Testing, Test Construction, Test Format, Test Reliability

Disentangling Person-Dependent and Item-Dependent Causal Effects: Applications of Item Response Theory to the Estimation of Treatment Effect Heterogeneity

Peer reviewed

Joshua B. Gilbert; Luke W. Miratrix; Mridul Joshi; Benjamin W. Domingue – Journal of Educational and Behavioral Statistics, 2025

Analyzing heterogeneous treatment effects (HTEs) plays a crucial role in understanding the impacts of educational interventions. A standard practice for HTE analysis is to examine interactions between treatment status and preintervention participant characteristics, such as pretest scores, to identify how different groups respond to treatment.…

Descriptors: Causal Models, Item Response Theory, Statistical Inference, Psychometrics

Chance-Constrained Automated Test Assembly

Peer reviewed

Giada Spaccapanico Proietti; Mariagiulia Matteucci; Stefania Mignani; Bernard P. Veldkamp – Journal of Educational and Behavioral Statistics, 2024

Classical automated test assembly (ATA) methods assume fixed and known coefficients for the constraints and the objective function. This hypothesis is not true for the estimates of item response theory parameters, which are crucial elements in test assembly classical models. To account for uncertainty in ATA, we propose a chance-constrained…

Descriptors: Automation, Computer Assisted Testing, Ambiguity (Context), Item Response Theory

The Cut-Score Operating Function: A New Tool to Aid in Standard Setting

Peer reviewed

Grabovsky, Irina; Wainer, Howard – Journal of Educational and Behavioral Statistics, 2017

In this essay, we describe the construction and use of the Cut-Score Operating Function in aiding standard setting decisions. The Cut-Score Operating Function shows the relation between the cut-score chosen and the consequent error rate. It allows error rates to be defined by multiple loss functions and will show the behavior of each loss…

Descriptors: Cutting Scores, Standard Setting (Scoring), Decision Making, Error Patterns

A Comparative Study of Online Item Calibration Methods in Multidimensional Computerized Adaptive Testing

Peer reviewed

Chen, Ping – Journal of Educational and Behavioral Statistics, 2017

Calibration of new items online has been an important topic in item replenishment for multidimensional computerized adaptive testing (MCAT). Several online calibration methods have been proposed for MCAT, such as multidimensional "one expectation-maximization (EM) cycle" (M-OEM) and multidimensional "multiple EM cycles"…

Descriptors: Test Items, Item Response Theory, Test Construction, Adaptive Testing

Bad Questions: An Essay Involving Item Response Theory

Peer reviewed

Thissen, David – Journal of Educational and Behavioral Statistics, 2016

David Thissen, a professor in the Department of Psychology and Neuroscience, Quantitative Program at the University of North Carolina, has consulted and served on technical advisory committees for assessment programs that use item response theory (IRT) over the past couple decades. He has come to the conclusion that there are usually two purposes…

Descriptors: Item Response Theory, Test Construction, Testing Problems, Student Evaluation

Bayesian Network Models for Local Dependence among Observable Outcome Variables

Peer reviewed

Almond, Russell G.; Mulder, Joris; Hemat, Lisa A.; Yan, Duanli – Journal of Educational and Behavioral Statistics, 2009

Bayesian network models offer a large degree of flexibility for modeling dependence among observables (item outcome variables) from the same task, which may be dependent. This article explores four design patterns for modeling locally dependent observations: (a) no context--ignores dependence among observables; (b) compensatory context--introduces…

Descriptors: Bayesian Statistics, Models, Observation, Experiments

14 Conversations about Three Things

Peer reviewed

Wainer, Howard – Journal of Educational and Behavioral Statistics, 2010

In this essay, the author tries to look forward into the 21st century to divine three things: (i) What skills will researchers in the future need to solve the most pressing problems? (ii) What are some of the most likely candidates to be those problems? and (iii) What are some current areas of research that seem mined out and should not distract…

Descriptors: Research Skills, Researchers, Internet, Access to Information

Profiles in Research: Susan E. Embretson

Peer reviewed

Wainer, Howard; Robinson, Daniel H. – Journal of Educational and Behavioral Statistics, 2007

This article presents an interview with Susan E. Embretson. Embretson attended the University of Minnesota where she received her bachelor's degree in 1967 and earned a PhD in 1973 in psychology. She became an assistant professor at the University of Kansas in 1974 and was promoted to associate professor and full professor. In 2004, she accepted a…

Descriptors: Educational Research, Psychometrics, Cognitive Psychology, Item Response Theory

A New Computer Algorithm for Simultaneous Test Construction of Two-Stage and Multistage Testing.

Peer reviewed

Wu, Ing-Long – Journal of Educational and Behavioral Statistics, 2001

Presents two binary programming models with a special network structure that can be explored computationally for simultaneous test construction. Uses an efficient special purpose network algorithm to solve these models. An empirical study illustrates the approach. (SLD)

Descriptors: Algorithms, Computer Software, Networks, Test Construction

Sample Size Requirements for Testing and Estimating Coefficient Alpha.

Peer reviewed

Bonett, Douglas G. – Journal of Educational and Behavioral Statistics, 2002

Derived an approximate test and confidence interval for coefficient alpha and used them to derive closed-form sample size formulas that can be used to determine the sample size needed to test coefficient alpha with desired power or to estimate coefficient alpha with desired precision. (SLD)

Descriptors: Estimation (Mathematics), Reliability, Sample Size, Test Construction

Uniform DIF and DIF Defined by Difference in Item Response Functions.

Peer reviewed

Hanson, Bradley A. – Journal of Educational and Behavioral Statistics, 1998

Presents precise definitions of uniform differential item functioning (DIF), unidirectional DIF, and parallel DIF. Shows that these three types of DIF are not equivalent, and discusses the theoretical relationships among them. Results demonstrate that cases of unidirectional and parallel DIF that have been considered uniform are not, in fact,…

Descriptors: Item Bias, Item Response Theory, Test Construction

An Assessment of Basic Computer Proficiency among Active Internet Users: Test Construction, Calibration, Antecedents and Consequences.

Peer reviewed

Bradlow, Eric T.; Hoch, Stephen J.; Hutchinson, J. Wesley – Journal of Educational and Behavioral Statistics, 2002

Developed a test of basic computer proficiency, examined its properties using parametric test scoring methods, and identified antecedents and consequences of differences in performance. Data from 1,520 Internet users suggest that the test yields an approximately unidimensional measure of basic computer proficiency. (SLD)

Descriptors: Computer Literacy, Internet, Measures (Individuals), Scoring
