Publication Date
In 2025 (0)
Since 2024 (1)
Since 2021 (last 5 years) (2)
Since 2016 (last 10 years) (9)
Since 2006 (last 20 years) (26)
Descriptor
Psychometrics (54)
Test Items (22)
Item Response Theory (13)
Test Construction (13)
Computer Assisted Testing (10)
Models (8)
Scores (7)
Difficulty Level (6)
Educational Assessment (6)
Equated Scores (6)
Foreign Countries (6)
Source
Applied Measurement in Education (54)
Author
Angoff, William H. (2)
Beck, Michael D. (2)
Brennan, Robert L. (2)
Gierl, Mark J. (2)
Hambleton, Ronald K. (2)
Moshinsky, Avital (2)
Puhan, Gautam (2)
Alves, Cecilia B. (1)
Anastasi, Anne (1)
Antal, Judit (1)
Rasooli, Amirhossein (1)
Publication Type
Journal Articles (54)
Reports - Research (23)
Reports - Evaluative (21)
Reports - Descriptive (6)
Information Analyses (5)
Speeches/Meeting Papers (5)
Opinion Papers (3)
Collected Works - General (1)
Education Level
Higher Education (3)
Elementary Education (2)
Grade 10 (2)
High Schools (2)
Secondary Education (2)
Elementary Secondary Education (1)
Grade 3 (1)
Grade 5 (1)
Grade 7 (1)
Junior High Schools (1)
Middle Schools (1)
Audience
Researchers (1)
Location
Canada (3)
Connecticut (1)
Georgia (1)
Germany (1)
Israel (1)
New York (1)
United States (1)
Rasooli, Amirhossein; DeLuca, Christopher – Applied Measurement in Education, 2024
Inspired by recent 21st-century social and educational movements toward equity, diversity, and inclusion for disadvantaged groups, educational researchers have sought to conceptualize fairness in classroom assessment contexts. These efforts have yielded promising theoretical foundations and empirical investigations to examine fairness…
Descriptors: Test Bias, Student Evaluation, Social Justice, Equal Education
O'Neill, Thomas R.; Gregg, Justin L.; Peabody, Michael R. – Applied Measurement in Education, 2020
This study addresses equating issues with varying sample sizes using the Rasch model by examining how sample size affects the stability of item calibrations and person ability estimates. A resampling design was used to create 9 sample size conditions (200, 100, 50, 45, 40, 35, 30, 25, and 20), each replicated 10 times. Items were recalibrated…
Descriptors: Sample Size, Equated Scores, Item Response Theory, Raw Scores
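The resampling design described in this abstract (multiple sample-size conditions, each replicated several times, with items recalibrated in every draw) can be sketched as follows. This is an illustrative outline only, not the authors' code: the `crude_difficulty` function uses the logit of the proportion incorrect as a stand-in for a full Rasch calibration, and the toy data are invented.

```python
import math
import random

def crude_difficulty(responses):
    """Logit of the proportion incorrect -- a crude stand-in for a Rasch
    item calibration (higher value = harder item)."""
    p_correct = sum(responses) / len(responses)
    p_correct = min(max(p_correct, 0.01), 0.99)  # clamp to avoid infinite logits
    return math.log((1 - p_correct) / p_correct)

def resample_calibrations(data, sample_sizes, replications=10, seed=1):
    """For each sample-size condition, draw examinees with replacement and
    recalibrate every item, mirroring a conditions-by-replications design."""
    rng = random.Random(seed)
    results = {}
    for n in sample_sizes:
        per_condition = []
        for _ in range(replications):
            sample = [data[rng.randrange(len(data))] for _ in range(n)]
            items = list(zip(*sample))  # transpose: one response tuple per item
            per_condition.append([crude_difficulty(r) for r in items])
        results[n] = per_condition
    return results

# Toy data: 200 examinees x 3 items with true pass rates 0.8, 0.5, 0.2.
gen = random.Random(0)
data = [[int(gen.random() < p) for p in (0.8, 0.5, 0.2)] for _ in range(200)]
stability = resample_calibrations(data, [200, 100, 50, 25, 20])
print(sorted(stability), len(stability[20]))
```

Comparing the spread of the recalibrated difficulties across replications, condition by condition, is one simple way to quantify how calibration stability degrades as the sample shrinks.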
Furter, Robert T.; Dwyer, Andrew C. – Applied Measurement in Education, 2020
Maintaining equivalent performance standards across forms is a psychometric challenge exacerbated by small samples. In this study, the accuracy of two equating methods (Rasch anchored calibration and nominal weights mean) and four anchor item selection methods were investigated in the context of very small samples (N = 10). Overall, nominal…
Descriptors: Classification, Accuracy, Item Response Theory, Equated Scores
Visser, Linda; Cartschau, Friederike; von Goldammer, Ariane; Brandenburg, Janin; Timmerman, Marieke; Hasselhorn, Marcus; Mähler, Claudia – Applied Measurement in Education, 2023
The growing number of children in primary schools in Germany who have German as their second language (L2) has raised questions about the fairness of performance assessment. Fair tests are a prerequisite for distinguishing between L2 learning delay and a specific learning disability. We evaluated five commonly used reading and spelling tests for…
Descriptors: Foreign Countries, Error of Measurement, Second Language Learning, German
McGill, Ryan J.; Dombrowski, Stefan C. – Applied Measurement in Education, 2019
The Cattell-Horn-Carroll (CHC) model presently serves as both a blueprint for test development and a taxonomy for the clinical interpretation of modern tests of cognitive ability. Accordingly, the trend among test publishers has been toward creating tests that provide users with an ever-increasing array of scores that comport with CHC. However, an…
Descriptors: Models, Cognitive Ability, Intelligence Tests, Intelligence
Dynamic Bayesian Networks in Educational Measurement: Reviewing and Advancing the State of the Field
Reichenberg, Ray – Applied Measurement in Education, 2018
As the popularity of rich assessment scenarios increases so must the availability of psychometric models capable of handling the resulting data. Dynamic Bayesian networks (DBNs) offer a fast, flexible option for characterizing student ability across time under psychometrically complex conditions. In this article, a brief introduction to DBNs is…
Descriptors: Bayesian Statistics, Measurement, Student Evaluation, Psychometrics
Kopp, Jason P.; Jones, Andrew T. – Applied Measurement in Education, 2020
Traditional psychometric guidelines suggest that at least several hundred respondents are needed to obtain accurate parameter estimates under the Rasch model. However, recent research indicates that Rasch equating results in accurate parameter estimates with sample sizes as small as 25. Item parameter drift under the Rasch model has been…
Descriptors: Item Response Theory, Psychometrics, Sample Size, Sampling
Beaujean, A. Alexander; Benson, Nicholas F. – Applied Measurement in Education, 2019
Charles Spearman and L. L. Thurstone were pioneers in the field of intelligence. They not only developed methods to assess and understand intelligence, but also developed theories about its structure and function. Methodologically, their approaches were not that distinct, but their theories of intelligence were philosophically very different --…
Descriptors: Psychologists, Intelligence Tests, Scores, Theories
Antal, Judit; Proctor, Thomas P.; Melican, Gerald J. – Applied Measurement in Education, 2014
In common-item equating the anchor block is generally built to represent a miniature form of the total test in terms of content and statistical specifications. The statistical properties frequently reflect equal mean and spread of item difficulty. Sinharay and Holland (2007) suggested that the requirement for equal spread of difficulty may be too…
Descriptors: Test Items, Equated Scores, Difficulty Level, Item Response Theory
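The "miniature form" idea in this abstract — an anchor block whose difficulty mean and spread match the total test — can be illustrated with a small brute-force sketch. The item pool, the loss function, and the `pick_anchor` helper are all invented for illustration; operational anchor selection also weighs content coverage, which is omitted here.

```python
import itertools
import statistics

def pick_anchor(items, k):
    """Choose a k-item anchor whose difficulty mean and spread best match
    the full test (the 'mini version' heuristic). `items` maps item id ->
    difficulty; exhaustive search is fine for small pools."""
    target_mean = statistics.mean(items.values())
    target_sd = statistics.pstdev(items.values())
    best, best_loss = None, float("inf")
    for combo in itertools.combinations(items, k):
        diffs = [items[i] for i in combo]
        loss = (abs(statistics.mean(diffs) - target_mean)
                + abs(statistics.pstdev(diffs) - target_sd))
        if loss < best_loss:
            best, best_loss = combo, loss
    return best

pool = {"i1": -1.2, "i2": -0.4, "i3": 0.0, "i4": 0.5, "i5": 1.1}
print(pick_anchor(pool, 3))
```

Relaxing the spread term of the loss (keeping only the mean match) is one way to explore the Sinharay and Holland (2007) suggestion that equal spread of difficulty may be an unnecessarily strict requirement.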
Boyd, Aimee M.; Dodd, Barbara; Fitzpatrick, Steven – Applied Measurement in Education, 2013
This study compared several exposure control procedures for CAT systems based on the three-parameter logistic testlet response theory model (Wang, Bradlow, & Wainer, 2002) and Masters' (1982) partial credit model when applied to a pool consisting entirely of testlets. The exposure control procedures studied were the modified within 0.10 logits…
Descriptors: Computer Assisted Testing, Item Response Theory, Test Construction, Models
Gierl, Mark J.; Lai, Hollis; Pugh, Debra; Touchie, Claire; Boulais, André-Philippe; De Champlain, André – Applied Measurement in Education, 2016
Item development is a time- and resource-intensive process. Automatic item generation integrates cognitive modeling with computer technology to systematically generate test items. To date, however, items generated using cognitive modeling procedures have received limited use in operational testing situations. As a result, the psychometric…
Descriptors: Psychometrics, Multiple Choice Tests, Test Items, Item Analysis
Roduta Roberts, Mary; Alves, Cecilia B.; Chu, Man-Wai; Thompson, Margaret; Bahry, Louise M.; Gotzmann, Andrea – Applied Measurement in Education, 2014
The purpose of this study was to evaluate the adequacy of three cognitive models, one developed by content experts and two generated from student verbal reports for explaining examinee performance on a grade 3 diagnostic mathematics test. For this study, the items were developed to directly measure the attributes in the cognitive model. The…
Descriptors: Foreign Countries, Mathematics Tests, Cognitive Processes, Models
Edwards, Michael C.; Flora, David B.; Thissen, David – Applied Measurement in Education, 2012
This article describes a computerized adaptive test (CAT) based on the uniform item exposure multi-form structure (uMFS). The uMFS is a specialization of the multi-form structure (MFS) idea described by Armstrong, Jones, Berliner, and Pashley (1998). In an MFS CAT, the examinee first responds to a small fixed block of items. The items comprising…
Descriptors: Adaptive Testing, Computer Assisted Testing, Test Format, Test Items
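The multi-form-structure idea in this abstract — a small fixed block of items followed by routing to pre-assembled material — can be sketched as below. This is a hypothetical illustration of the routing step only, not the uMFS algorithm itself: the score-band rule and the form labels are invented for the example.

```python
def route(fixed_block_responses, forms):
    """Route an examinee to a second-stage form based on the number-correct
    score from a small fixed block. `forms` is ordered easy -> hard."""
    score = sum(fixed_block_responses)
    n_items = len(fixed_block_responses)
    # Map the score proportionally onto the available forms.
    band = min(score * len(forms) // (n_items + 1), len(forms) - 1)
    return forms[band]

# An examinee who answers 4 of 5 fixed-block items correctly
# is routed to the hardest of three hypothetical forms.
print(route([1, 1, 0, 1, 1], ["easy", "medium", "hard"]))
```

A real uMFS design additionally constrains which forms share items so that every item's exposure rate stays uniform; that bookkeeping is beyond this sketch.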
Kahraman, Nilufer; De Champlain, Andre; Raymond, Mark – Applied Measurement in Education, 2012
Item-level information, such as difficulty and discrimination, is invaluable to test assembly, equating, and scoring practices. Estimating these parameters within the context of large-scale performance assessments is often hindered by the use of unbalanced designs for assigning examinees to tasks and raters because such designs result in very…
Descriptors: Performance Based Assessment, Medicine, Factor Analysis, Test Items
Brennan, Robert L. – Applied Measurement in Education, 2010
This paper provides an overview of evidence-centered assessment design (ECD) and some general information about the Advanced Placement (AP®) Program. Then the papers in this special issue are discussed as they relate to the use of ECD in the revision of various AP tests. The paper concludes with some observations about the need to validate…
Descriptors: Advanced Placement Programs, Equivalency Tests, Evidence, Test Construction