Aviad-Levitzky, Tami; Laufer, Batia; Goldstein, Zahava – Language Assessment Quarterly, 2019
This article describes the development and validation of the new CATSS (Computer Adaptive Test of Size and Strength), which measures vocabulary knowledge in four modalities -- productive recall, receptive recall, productive recognition, and receptive recognition. In the first part of the paper we present the assumptions that underlie the test --…
Descriptors: Foreign Countries, Test Construction, Test Validity, Test Reliability
Kim, Sooyeon; Moses, Tim; Yoo, Hanwook Henry – ETS Research Report Series, 2015
The purpose of this inquiry was to investigate the effectiveness of item response theory (IRT) proficiency estimators in terms of estimation bias and error under multistage testing (MST). We chose a 2-stage MST design in which 1 adaptation to the examinees' ability levels takes place. It includes 4 modules (1 at Stage 1, 3 at Stage 2) and 3 paths…
Descriptors: Item Response Theory, Computation, Statistical Bias, Error of Measurement
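Proficiency estimators of the kind compared in studies like the one above are typically variants of likelihood-based scoring. As background only, here is a minimal grid-search maximum-likelihood sketch under the two-parameter logistic (2PL) IRT model; the item parameters and response pattern are invented for illustration and are not taken from the report:

```python
import math

def p_correct(theta, a, b):
    """2PL IRT model: probability of a correct response at ability
    theta, for an item with discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def mle_theta(responses, items):
    """Grid-search maximum-likelihood ability estimate.
    responses: list of 0/1 scores; items: list of (a, b) pairs."""
    grid = [g / 100.0 for g in range(-400, 401)]  # theta in [-4, 4]
    def loglik(theta):
        ll = 0.0
        for u, (a, b) in zip(responses, items):
            p = p_correct(theta, a, b)
            ll += math.log(p if u == 1 else 1.0 - p)
        return ll
    return max(grid, key=loglik)

# Hypothetical 4-item module: (discrimination, difficulty) pairs
items = [(1.0, -1.0), (1.2, 0.0), (0.8, 0.5), (1.1, 1.0)]
theta_hat = mle_theta([1, 1, 0, 0], items)
```

With only a handful of items per module, such estimates are noticeably biased at the extremes, which is one reason MST studies compare alternative estimators.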
Makransky, Guido; Mortensen, Erik Lykke; Glas, Cees A. W. – Assessment, 2013
Narrowly defined personality facet scores are commonly reported and used for making decisions in clinical and organizational settings. Although these facets are typically related, scoring is usually carried out for a single facet at a time. This method can be ineffective and time consuming when personality tests contain many highly correlated…
Descriptors: Computer Assisted Testing, Adaptive Testing, Personality Measures, Accuracy
Partnership for Assessment of Readiness for College and Careers, 2016
The Partnership for Assessment of Readiness for College and Careers (PARCC) is a state-led consortium designed to create next-generation assessments that, compared to traditional K-12 assessments, more accurately measure student progress toward college and career readiness. The PARCC assessments are aligned to the Common Core State Standards…
Descriptors: Standardized Tests, Career Readiness, College Readiness, Test Validity
Judd, Wallace – Practical Assessment, Research & Evaluation, 2009
Over the past twenty years in performance testing a specific item type with distinguishing characteristics has arisen time and time again. It's been invented independently by dozens of test development teams. And yet this item type is not recognized in the research literature. This article is an invitation to investigate the item type, evaluate…
Descriptors: Test Items, Test Format, Evaluation, Item Analysis
Lau, Paul Ngee Kiong; Lau, Sie Hoe; Hong, Kian Sam; Usop, Hasbee – Educational Technology & Society, 2011
The number right (NR) method, in which students pick one option as the answer, is the conventional method for scoring multiple-choice tests that is heavily criticized for encouraging students to guess and failing to credit partial knowledge. In addition, computer technology is increasingly used in classroom assessment. This paper investigates the…
Descriptors: Guessing (Tests), Multiple Choice Tests, Computers, Scoring
Rock, Donald A. – ETS Research Report Series, 2012
This paper provides a history of ETS's role in developing assessment instruments and psychometric procedures for measuring change in large-scale national assessments funded by the Longitudinal Studies branch of the National Center for Education Statistics. It documents the innovations developed during more than 30 years of working with…
Descriptors: Models, Educational Change, Longitudinal Studies, Educational Development
Wise, Steven L. – 1999
Outside of large-scale testing programs, the computerized adaptive test (CAT) has thus far had only limited impact on measurement practice. In smaller-scale testing contexts, limited data are often available, which precludes the establishment of calibrated item pools for use by traditional (i.e., item response theory (IRT) based) CATs. This paper…
Descriptors: Adaptive Testing, Computer Assisted Testing, Item Response Theory, Scores

Stocking, Martha L. – Journal of Educational and Behavioral Statistics, 1996
An alternative method for scoring adaptive tests, based on number-correct scores, is explored and compared with a method that relies more directly on item response theory. Using the number-correct score with necessary adjustment for intentional differences in adaptive test difficulty is a statistically viable scoring method. (SLD)
Descriptors: Adaptive Testing, Computer Assisted Testing, Difficulty Level, Item Response Theory
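One standard way to adjust number-correct scores for intentional differences in adaptive-test difficulty is to invert each form's test characteristic curve (TCC). The sketch below illustrates the general idea under a 2PL model with invented item parameters; it is not the specific procedure evaluated in the article:

```python
import math

def p_correct(theta, a, b):
    # 2PL item response function
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def expected_number_correct(theta, items):
    """Test characteristic curve: expected raw score at ability theta."""
    return sum(p_correct(theta, a, b) for a, b in items)

def theta_from_raw(raw, items, lo=-4.0, hi=4.0, tol=1e-6):
    """Invert the TCC by bisection: find the theta whose expected
    number-correct equals the observed raw score."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if expected_number_correct(mid, items) < raw:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Two hypothetical 4-item forms of different difficulty
easy_form = [(1.0, -1.5), (1.0, -1.0), (1.0, -0.5), (1.0, 0.0)]
hard_form = [(1.0, 0.0), (1.0, 0.5), (1.0, 1.0), (1.0, 1.5)]
theta_easy = theta_from_raw(3.0, easy_form)
theta_hard = theta_from_raw(3.0, hard_form)
```

The same raw score of 3 maps to a higher ability on the harder form, which is the adjustment the number-correct approach needs in order to be viable.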

Wheeler, Patricia H. – 1995
When individuals are given tests that are too hard or too easy, the resulting scores are likely to be poor estimates of their performance. To get valid and accurate test scores that provide meaningful results, one should use functional-level testing (FLT). FLT is the practice of administering to an individual a version of a test with a difficulty…
Descriptors: Adaptive Testing, Difficulty Level, Educational Assessment, Performance
Slater, Sharon C.; Schaeffer, Gary A. – 1996
The General Computer Adaptive Test (CAT) of the Graduate Record Examinations (GRE) includes three operational sections that are separately timed and scored. A "no score" is reported if the examinee answers fewer than 80% of the items or if the examinee does not answer all of the items and leaves the section before time expires. The 80%…
Descriptors: Adaptive Testing, College Students, Computer Assisted Testing, Equal Education
Lord, Frederic M. – 1980
The purpose of this book is to make it possible for measurement specialists to solve practical testing problems through the use of item response theory (IRT). The topics, organization, and presentation are those used in a 4-week seminar held each summer for the past several years. The material is organized to facilitate understanding; all related…
Descriptors: Adaptive Testing, Estimation (Mathematics), Evaluation Problems, Item Analysis
Bock, R. Darrell; Zimowski, Michele F. – National Center for Education Statistics, 2003
This report examines the potential of adaptive testing, two-stage testing in particular, for improving the data quality of the National Assessment of Educational Progress (NAEP). Following a discussion of the rationale for adaptive testing in assessment and a review of previous studies of two-stage testing, this report describes a 1993 Ohio…
Descriptors: National Competency Tests, Test Validity, Feasibility Studies, Educational Assessment
DeAyala, R. J.; Koch, William R. – 1986
A computerized flexilevel test was implemented and its ability estimates were compared with those of a Bayesian estimation based computerized adaptive test (CAT) as well as with known true ability estimates. Results showed that when the flexilevel test was terminated according to Lord's criterion, its ability estimates were highly and…
Descriptors: Ability, Adaptive Testing, Bayesian Statistics, Comparative Analysis
Lord, Frederic M. – 1971
Some stochastic approximation procedures are considered in relation to the problem of choosing a sequence of test questions to accurately estimate a given examinee's standing on a psychological dimension. Illustrations are given evaluating certain procedures in a specific context. (Author/CK)
Descriptors: Academic Ability, Adaptive Testing, Computer Programs, Difficulty Level
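Stochastic approximation of the Robbins-Monro type is the classic basis for such item-selection procedures: present an item at the current ability estimate, step up after a correct response and down after an incorrect one, with a step size that shrinks as items accumulate. A toy simulation of the general idea (the Rasch-style examinee model and all parameter values are invented for illustration):

```python
import math
import random

def simulated_response(theta_true, b, rng):
    """Rasch-style simulated examinee: 1 (correct) with probability
    depending on true ability minus item difficulty."""
    p = 1.0 / (1.0 + math.exp(-(theta_true - b)))
    return 1 if rng.random() < p else 0

def robbins_monro_estimate(theta_true, n_items=200, step=1.0, seed=0):
    """Robbins-Monro sequence: the next item's difficulty equals the
    current ability estimate; the step size step/k shrinks so the
    sequence settles near the ability where P(correct) = 0.5."""
    rng = random.Random(seed)
    b = 0.0  # start at average difficulty
    for k in range(1, n_items + 1):
        u = simulated_response(theta_true, b, rng)
        # correct -> next item harder; incorrect -> next item easier
        b += (step / k) * (2 * u - 1)
    return b

theta_hat = robbins_monro_estimate(theta_true=1.0)
```

The shrinking step size is what distinguishes this from a simple up-and-down rule: early items move the difficulty quickly, later items refine the estimate.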