Showing 1 to 15 of 21 results
Peer reviewed
Veldkamp, Bernard P.; Matteucci, Mariagiulia; de Jong, Martijn G. – Applied Psychological Measurement, 2013
Item response theory parameters have to be estimated, and the estimation process introduces uncertainty into them. In most large-scale testing programs, the parameters are stored in item banks, and automated test assembly algorithms are applied to assemble operational test forms. These algorithms treat item parameters as fixed values,…
Descriptors: Test Construction, Test Items, Item Banks, Automation
Peer reviewed
Wang, Wei; Tay, Louis; Drasgow, Fritz – Applied Psychological Measurement, 2013
There has been growing use of ideal point models to develop scales measuring important psychological constructs. For meaningful comparisons across groups, it is important to identify items on such scales that exhibit differential item functioning (DIF). In this study, the authors examined several methods for assessing DIF on polytomous items…
Descriptors: Test Bias, Effect Size, Item Response Theory, Statistical Analysis
Peer reviewed
Huebner, Alan; Li, Zhushan – Applied Psychological Measurement, 2012
Computerized classification tests (CCTs) classify examinees into categories such as pass/fail, master/nonmaster, and so on. This article proposes the use of stochastic methods from sequential analysis to address item overexposure, a practical concern in operational CCTs. Item overexposure is traditionally dealt with in CCTs by the Sympson-Hetter…
Descriptors: Computer Assisted Testing, Classification, Statistical Analysis, Test Items
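The Sympson-Hetter procedure named in the abstract can be sketched as a probabilistic administration filter: a selected item is actually administered only with its exposure-control probability. A minimal illustration, assuming a best-first candidate list and a simple fallback rule (the names and fallback are illustrative, not from the article):

```python
import random

def administer(candidates, exposure_probs, rng=None):
    """Sympson-Hetter-style filter: candidates are item ids sorted
    best-first; exposure_probs maps item -> P(administer | selected)."""
    rng = rng or random.Random(0)
    for item in candidates:
        # administer the item only with its exposure-control probability
        if rng.random() < exposure_probs.get(item, 1.0):
            return item
    return candidates[-1]  # fall back if every candidate is suppressed
```

An item with exposure probability 0 is always skipped in favor of the next candidate, which is how the procedure caps how often the most informative items are seen.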
Peer reviewed
Deng, Nina; Han, Kyung T.; Hambleton, Ronald K. – Applied Psychological Measurement, 2013
DIMPACK Version 1.0 for assessing test dimensionality based on a nonparametric conditional covariance approach is reviewed. This software was originally distributed by Assessment Systems Corporation and now can be freely accessed online. The software consists of Windows-based interfaces of three components: DIMTEST, DETECT, and CCPROX/HAC, which…
Descriptors: Item Response Theory, Nonparametric Statistics, Statistical Analysis, Computer Software
Peer reviewed
Finkelman, Matthew David – Applied Psychological Measurement, 2010
In sequential mastery testing (SMT), assessment via computer is used to classify examinees into one of two mutually exclusive categories. Unlike paper-and-pencil tests, SMT has the capability to use variable-length stopping rules. One approach to shortening variable-length tests is stochastic curtailment, which halts examination if the probability…
Descriptors: Mastery Tests, Computer Assisted Testing, Adaptive Testing, Test Length
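The stochastic-curtailment idea described above can be sketched for a number-correct mastery test: stop once the eventual pass/fail decision is certain, or nearly certain under a response model. The binomial model, the threshold `gamma`, and all names below are illustrative assumptions, not the article's formulation:

```python
from math import comb

def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

def curtail(correct, given, total, cut, p_est, gamma=0.95):
    """Stop early when the pass/fail decision is (near) certain.
    correct: number-correct so far; given: items administered;
    cut: passing number-correct on the full test; p_est: estimated
    per-item probability of a correct response."""
    remaining = total - given
    if correct >= cut:                   # pass already guaranteed
        return True
    if correct + remaining < cut:        # fail already guaranteed
        return True
    p_pass = binom_tail(cut - correct, remaining, p_est)
    return p_pass >= gamma or p_pass <= 1 - gamma  # near-certain either way
```

The first two branches are deterministic curtailment; the final line is the stochastic extension that trades a small error probability for a shorter test.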
Peer reviewed
Choi, Seung W.; Swartz, Richard J. – Applied Psychological Measurement, 2009
Item selection is a core component in computerized adaptive testing (CAT). Several studies have evaluated new and classical selection methods; however, the few that have applied such methods to the use of polytomous items have reported conflicting results. To clarify these discrepancies and further investigate selection method properties, six…
Descriptors: Adaptive Testing, Item Analysis, Comparative Analysis, Test Items
Peer reviewed
Paek, Insu – Applied Psychological Measurement, 2010
Conservative bias in rejection of a null hypothesis from using the continuity correction in the Mantel-Haenszel (MH) procedure was examined through simulation in a differential item functioning (DIF) investigation context in which statistical testing uses a prespecified level [alpha] for the decision on an item with respect to DIF. The standard MH…
Descriptors: Test Bias, Statistical Analysis, Sample Size, Error of Measurement
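The statistic under study, the Mantel-Haenszel chi-square with its 0.5 continuity correction, can be sketched directly. This is the textbook MH formula, not the article's simulation code, and the variable names are illustrative:

```python
def mh_chi_square(strata, continuity_correction=True):
    """Mantel-Haenszel chi-square over 2x2 tables, one per matching stratum.
    Each stratum is (a, b, c, d): a = reference-group correct,
    b = reference-group incorrect, c = focal correct, d = focal incorrect."""
    sum_a = sum_ea = sum_var = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        row1, col1 = a + b, a + c
        sum_a += a
        sum_ea += row1 * col1 / n                      # E(a) under H0
        sum_var += row1 * (c + d) * col1 * (b + d) / (n * n * (n - 1))
    diff = abs(sum_a - sum_ea)
    if continuity_correction:
        diff = max(diff - 0.5, 0.0)                    # the 0.5 correction
    return diff * diff / sum_var
```

Because the correction shrinks the numerator, the corrected statistic is never larger than the uncorrected one, which is exactly the source of the conservative bias the abstract examines.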
Peer reviewed
Sotaridona, Leonardo S.; van der Linden, Wim J.; Meijer, Rob R. – Applied Psychological Measurement, 2006
A statistical test for detecting answer copying on multiple-choice tests based on Cohen's kappa is proposed. The test is free of any assumptions on the response processes of the examinees suspected of copying and having served as the source, except for the usual assumption that these processes are probabilistic. Because the asymptotic null and…
Descriptors: Cheating, Test Items, Simulation, Statistical Analysis
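Cohen's kappa, on which the proposed copy-detection test builds, measures answer agreement beyond chance. A minimal sketch; the integer answer coding and the marginal chance model are assumptions here, and the article's actual test statistic differs in detail:

```python
def cohens_kappa(answers_x, answers_y, n_options):
    """Agreement beyond chance between two examinees' answer vectors,
    with chance agreement from each examinee's own option frequencies."""
    assert len(answers_x) == len(answers_y)
    n = len(answers_x)
    p_obs = sum(x == y for x, y in zip(answers_x, answers_y)) / n
    p_chance = 0.0
    for opt in range(n_options):
        # product of marginal frequencies of choosing this option
        p_chance += (answers_x.count(opt) / n) * (answers_y.count(opt) / n)
    return (p_obs - p_chance) / (1.0 - p_chance)
```

Identical answer strings give kappa 1, and agreement at exactly the chance level gives kappa 0, so large positive values flag suspiciously similar response patterns.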
Peer reviewed
Stocking, Martha L.; Swanson, Len – Applied Psychological Measurement, 1993
A method is presented for incorporating a large number of constraints on adaptive item selection in the construction of computerized adaptive tests. The method, which emulates practices of expert test specialists, is illustrated for verbal and quantitative measures. Its foundation is application of a weighted deviations model and algorithm. (SLD)
Descriptors: Adaptive Testing, Algorithms, Computer Assisted Testing, Expert Systems
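The weighted-deviations idea can be sketched as a greedy pick: choose the item maximizing information minus a weighted penalty for projected violations of content-constraint bounds. This is a heavily simplified illustration of the spirit of the model, with made-up names and a linear penalty, not the published algorithm:

```python
def pick_item(pool, selected, constraints):
    """pool: {item_id: {"info": float, "attrs": set of content labels}}
    constraints: list of (label, lower, upper, weight) on final counts."""
    def penalty(counts):
        total = 0.0
        for label, lo, hi, w in constraints:
            c = counts.get(label, 0)
            total += w * (lo - c) if c < lo else w * max(c - hi, 0)
        return total

    base = {}
    for item in selected:  # current content counts
        for a in pool[item]["attrs"]:
            base[a] = base.get(a, 0) + 1

    best, best_score = None, float("-inf")
    for item_id, item in pool.items():
        if item_id in selected:
            continue
        counts = dict(base)
        for a in item["attrs"]:
            counts[a] = counts.get(a, 0) + 1
        score = item["info"] - penalty(counts)  # info vs. weighted deviations
        if score > best_score:
            best, best_score = item_id, score
    return best
```

With no constraints this reduces to maximum-information selection; with them, a less informative item can win if it keeps the form closer to its content targets.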
Peer reviewed
Millsap, Roger E.; Everson, Howard T. – Applied Psychological Measurement, 1993
This review employs a conceptual framework that distinguishes methods of detecting measurement bias based on either observed or unobserved conditional invariance models. The primary interest is in group-level measurement bias. Methods for bias detection in both continuous and ordered-categorical measures are reviewed, as are methods for testlets…
Descriptors: Educational Assessment, Educational Testing, Evaluation Methods, Identification
Peer reviewed
Christensen, Karl Bang; Kreiner, Svend – Applied Psychological Measurement, 2007
Many statistical tests are designed to test the different assumptions of the Rasch model, but only a few are directed at detecting multidimensionality. The Martin-Löf test is an attractive approach, the disadvantage being that its null distribution deviates strongly from the asymptotic chi-square distribution for most realistic sample sizes. A Monte…
Descriptors: Item Response Theory, Monte Carlo Methods, Testing, Models
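When a statistic's null distribution is poorly approximated by its asymptotic chi-square, the generic Monte Carlo remedy is to simulate the statistic under the fitted null model and use the empirical tail. A minimal sketch; `simulate_stat` is a placeholder assumption standing in for "fit the Rasch model, generate data from it, recompute the test statistic":

```python
import random

def monte_carlo_p(observed, simulate_stat, n_sim=999, seed=0):
    """Empirical p-value: share of simulated null statistics at least as
    extreme as the observed one, with the +1 correction so p > 0."""
    rng = random.Random(seed)
    exceed = sum(simulate_stat(rng) >= observed for _ in range(n_sim))
    return (1 + exceed) / (n_sim + 1)
```

The `(1 + exceed) / (n_sim + 1)` form counts the observed statistic among the simulations, which keeps the test exact-level rather than anti-conservative.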
Peer reviewed
Brennan, Robert L.; Lockwood, Robert E. – Applied Psychological Measurement, 1980
Generalizability theory is used to characterize and quantify expected variance in cutting scores and to compare the Nedelsky and Angoff procedures for establishing a cutting score. Results suggest that the restricted nature of the Nedelsky (inferred) probability scale may limit its applicability in certain contexts. (Author/BW)
Descriptors: Cutting Scores, Generalization, Statistical Analysis, Test Reliability
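The two standard-setting rules compared above can be sketched side by side. Angoff raters state, per item, the probability that a minimally competent examinee answers correctly; Nedelsky raters instead eliminate options a borderline examinee would rule out, so each item probability is 1 over the options remaining. The data shapes below are illustrative assumptions:

```python
def angoff_cut(ratings):
    """ratings[r][i]: rater r's probability for item i; cut = mean sum."""
    return sum(sum(item_probs) for item_probs in ratings) / len(ratings)

def nedelsky_cut(remaining_options):
    """remaining_options[r][i]: options rater r leaves for item i."""
    return sum(sum(1.0 / k for k in rater)
               for rater in remaining_options) / len(remaining_options)
```

Note that Nedelsky item probabilities are restricted to the values 1/m (e.g., 0.2, 0.25, 1/3, 0.5, 1 for five-option items), which is the restricted inferred probability scale the abstract flags as a limitation.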
Peer reviewed
Lam, Tony C. M.; Kolic, Mary – Applied Psychological Measurement, 2008
Semantic incompatibility, an error in constructing measuring instruments for rating oneself, others, or objects, refers to the extent to which item wordings are incongruent with, and hence inappropriate for, scale labels and vice versa. This study examines the effects of semantic incompatibility on rating responses. Using a 2 x 2 factorial design…
Descriptors: Semantics, Rating Scales, Statistical Analysis, Academic Ability
Peer reviewed
Forsyth, Robert A. – Applied Psychological Measurement, 1978
This note shows that, under conditions specified by Levin and Subkoviak (TM 503 420), it is not necessary to specify the reliabilities of observed scores when comparing completely randomized designs with randomized block designs. Certain errors in their illustrative example are also discussed. (Author/CTM)
Descriptors: Analysis of Variance, Error of Measurement, Hypothesis Testing, Reliability
Peer reviewed
Levin, Joel R.; Subkoviak, Michael J. – Applied Psychological Measurement, 1978
Comments (TM 503 706) on an earlier article (TM 503 420) concerning the comparison of the completely randomized design and the randomized block design are acknowledged and appreciated. In addition, potentially misleading notions arising from these comments are addressed and clarified. (See also TM 503 708). (Author/CTM)
Descriptors: Analysis of Variance, Error of Measurement, Hypothesis Testing, Reliability