Wendt, Heike; Bos, Wilfried; Goy, Martin – Educational Research and Evaluation, 2011
Several current international comparative large-scale assessments of educational achievement (ICLSA) make use of "Rasch models" to address functions essential for valid cross-cultural comparisons. From a historical perspective, ICLSA and Georg Rasch's "models for measurement" emerged at about the same time, half a century ago. However, the…
Descriptors: Measures (Individuals), Test Theory, Group Testing, Educational Testing
Liow, Jong-Leng – European Journal of Engineering Education, 2008
Peer assessment has been studied in various situations and actively pursued as a means by which students are given more control over their learning and assessment achievement. This study investigated the reliability of staff and student assessments in two oral presentations with limited feedback for a school-based thesis course in engineering…
Descriptors: Feedback (Response), Student Evaluation, Grade Point Average, Peer Evaluation
Clauser, Brian E.; And Others – 1991
Item bias has been a major concern for test developers during recent years. The Mantel-Haenszel statistic has been among the preferred methods for identifying biased items. The statistic's performance in identifying uniform bias in simulated data modeled by producing various levels of difference in the (item difficulty) b-parameter for reference…
Descriptors: Comparative Testing, Difficulty Level, Item Bias, Item Response Theory
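The Mantel-Haenszel statistic mentioned in this entry rests on a common odds ratio pooled over score strata. The following is a minimal sketch of that ratio, not the authors' code; the function name and the example strata are hypothetical:

```python
def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio (alpha_MH) over K score strata.

    Each stratum is a 2x2 table (A, B, C, D):
      A = reference group correct,  B = reference group incorrect,
      C = focal group correct,      D = focal group incorrect.
    alpha_MH = sum_k(A_k * D_k / N_k) / sum_k(B_k * C_k / N_k),
    where N_k is the stratum total. Values near 1 suggest no uniform DIF.
    """
    num = den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    return num / den

# Two strata in which both groups answer identically: ratio is 1 (no DIF).
print(mantel_haenszel_odds_ratio([(40, 10, 40, 10), (30, 20, 30, 20)]))
```

In practice this ratio is often reported on the ETS delta scale as MH D-DIF = -2.35 ln(alpha_MH), so that 0 indicates no uniform DIF.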
Kong, Xiaojing J.; Wise, Steven L.; Bhola, Dennison S. – Educational and Psychological Measurement, 2007
This study compared four methods for setting item response time thresholds to differentiate rapid-guessing behavior from solution behavior. Thresholds were either (a) common for all test items, (b) based on item surface features such as the amount of reading required, (c) based on visually inspecting response time frequency distributions, or (d)…
Descriptors: Test Items, Reaction Time, Timed Tests, Item Response Theory
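A common-threshold method such as option (a) above reduces to a single cutoff on response time. This is an illustrative sketch only; the threshold value and names are ours, not from the study:

```python
def flag_rapid_guesses(response_times, threshold_seconds=3.0):
    """Classify each response as rapid guessing (True) or solution
    behavior (False) using one common time threshold for all items."""
    return [t < threshold_seconds for t in response_times]

# Responses of 1.2 s and 2.9 s fall under the 3-second cutoff.
print(flag_rapid_guesses([1.2, 14.8, 2.9, 30.5]))
```

The item-specific variants the abstract lists (options b through d) would replace the single cutoff with a per-item threshold derived from surface features such as reading load, or from visual inspection of each item's response-time distribution.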
Shin, Tacksoo – Asia Pacific Education Review, 2007
This study introduces three growth modeling techniques: latent growth modeling (LGM), hierarchical linear modeling (HLM), and longitudinal profile analysis via multidimensional scaling (LPAMS). It compares the multilevel growth parameter estimates and potential predictor effects obtained using LGM, HLM, and LPAMS. The purpose of this multilevel…
Descriptors: Multidimensional Scaling, Academic Achievement, Structural Equation Models, Causal Models
Mazor, Kathleen M.; And Others – 1993
The Mantel-Haenszel (MH) procedure has become one of the most popular procedures for detecting differential item functioning (DIF). One of the most troublesome criticisms of this procedure is that while detection rates for uniform DIF are very good, the procedure is not sensitive to non-uniform DIF. In this study, examinee responses were generated…
Descriptors: Comparative Testing, Computer Simulation, Item Bias, Item Response Theory
Ang, Cheng; Miller, M. David – 1993
The power of the procedure of W. Stout to detect deviations from essential unidimensionality in two-dimensional data was investigated for minor, moderate, and large deviations from unidimensionality using criteria for deviations from unidimensionality based on prior research. Test lengths of 20 and 40 items and sample sizes of 700 and 1,500 were…
Descriptors: Ability, Comparative Testing, Correlation, Item Response Theory
Meijer, Rob R. – Journal of Educational Measurement, 2004
Two new methods have been proposed to determine unexpected sum scores on sub-tests (testlets) both for paper-and-pencil tests and computer adaptive tests. A method based on a conservative bound using the hypergeometric distribution, denoted p, was compared with a method where the probability for each score combination was calculated using a…
Descriptors: Probability, Adaptive Testing, Item Response Theory, Scores
van der Linden, Wim J. – Applied Psychological Measurement, 2006
Two local methods for observed-score equating are applied to the problem of equating an adaptive test to a linear test. In an empirical study, the methods were evaluated against a method based on the test characteristic function (TCF) of the linear test and traditional equipercentile equating applied to the ability estimates on the adaptive test…
Descriptors: Adaptive Testing, Computer Assisted Testing, Test Format, Equated Scores
Chang, Yu-Wen; Davison, Mark L. – 1992
Standard errors and bias of unidimensional and multidimensional ability estimates were compared in a factorial, simulation design with two item response theory (IRT) approaches, two levels of test correlation (0.42 and 0.63), two sample sizes (500 and 1,000), and a hierarchical test content structure. Bias and standard errors of subtest scores…
Descriptors: Comparative Testing, Computer Simulation, Correlation, Error of Measurement
Nandakumar, Ratna – 1992
The performance of the following four methodologies for assessing unidimensionality was examined: (1) DIMTEST; (2) the approach of P. W. Holland and P. R. Rosenbaum; (3) linear factor analysis; and (4) non-linear factor analysis. Each method is examined and compared with other methods using simulated data sets and real data sets. Seven data sets,…
Descriptors: Ability, Comparative Testing, Correlation, Equations (Mathematics)
Sykes, Robert C.; And Others – 1992
A part-form methodology was used to study the effect of varying degrees of multidimensionality on the consistency of pass/fail classification decisions obtained from simulated unidimensional item response theory (IRT) based licensure examinations. A control on the degree of form multidimensionality permitted an assessment throughout the range of…
Descriptors: Classification, Comparative Testing, Computer Simulation, Decision Making
DeMars, Christine E. – Online Submission, 2005
Several methods for estimating item response theory scores for multiple subtests were compared. These methods included two multidimensional item response theory models: a bi-factor model where each subtest was a composite score based on the primary trait measured by the set of tests and a secondary trait measured by the individual subtest, and a…
Descriptors: Item Response Theory, Multidimensional Scaling, Correlation, Scoring Rubrics

De Ayala, R. J. – Applied Psychological Measurement, 1992
A computerized adaptive test (CAT) based on the nominal response model (NR CAT) was implemented, and the performance of the NR CAT and a CAT based on the three-parameter logistic model was compared. The NR CAT produced trait estimates comparable to those of the three-parameter test. (SLD)
Descriptors: Adaptive Testing, Comparative Testing, Computer Assisted Testing, Equations (Mathematics)
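For context, the three-parameter logistic (3PL) model compared against the nominal response model here gives the probability of a correct response as P(theta) = c + (1 - c) / (1 + exp(-D a (theta - b))). A minimal sketch follows; the scaling constant D = 1.7 is one common convention, and the function name is ours:

```python
import math

def p_3pl(theta, a, b, c, D=1.7):
    """3PL item response function: probability of a correct answer.

    theta: examinee trait level; a: discrimination; b: difficulty;
    c: pseudo-guessing lower asymptote; D: logistic scaling constant.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

# At theta == b the probability is halfway between c and 1.
print(p_3pl(0.0, a=1.0, b=0.0, c=0.2))  # approximately 0.6
```

In a CAT, items are selected to maximize information at the current trait estimate, which this response function determines.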
De Ayala, R. J. – 1992
One important and promising application of item response theory (IRT) is computerized adaptive testing (CAT). The implementation of a nominal response model-based CAT (NRCAT) was studied. Item pool characteristics for the NRCAT as well as the comparative performance of the NRCAT and a CAT based on the three-parameter logistic (3PL) model were…
Descriptors: Adaptive Testing, Comparative Testing, Computer Assisted Testing, Computer Simulation