Aviad-Levitzky, Tami; Laufer, Batia; Goldstein, Zahava – Language Assessment Quarterly, 2019
This article describes the development and validation of the new CATSS (Computer Adaptive Test of Size and Strength), which measures vocabulary knowledge in four modalities -- productive recall, receptive recall, productive recognition, and receptive recognition. In the first part of the paper we present the assumptions that underlie the test --…
Descriptors: Foreign Countries, Test Construction, Test Validity, Test Reliability
Kim, Sooyeon; Moses, Tim; Yoo, Hanwook Henry – ETS Research Report Series, 2015
The purpose of this inquiry was to investigate the effectiveness of item response theory (IRT) proficiency estimators in terms of estimation bias and error under multistage testing (MST). We chose a 2-stage MST design in which 1 adaptation to the examinees' ability levels takes place. It includes 4 modules (1 at Stage 1, 3 at Stage 2) and 3 paths…
Descriptors: Item Response Theory, Computation, Statistical Bias, Error of Measurement
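Proficiency estimators of the kind compared in studies like the one above are typically variants of likelihood-based scoring. As background only, here is a minimal grid-search maximum-likelihood sketch under the two-parameter logistic (2PL) IRT model; the item parameters and response pattern are invented for illustration and are not taken from the report:

```python
import math

def p_correct(theta, a, b):
    """2PL IRT model: probability of a correct response at ability
    theta, for an item with discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def mle_theta(responses, items):
    """Grid-search maximum-likelihood ability estimate.
    responses: list of 0/1 scores; items: list of (a, b) pairs."""
    grid = [g / 100.0 for g in range(-400, 401)]  # theta in [-4, 4]
    def loglik(theta):
        ll = 0.0
        for u, (a, b) in zip(responses, items):
            p = p_correct(theta, a, b)
            ll += math.log(p if u == 1 else 1.0 - p)
        return ll
    return max(grid, key=loglik)

# Hypothetical 4-item module: (discrimination, difficulty) pairs
items = [(1.0, -1.0), (1.2, 0.0), (0.8, 0.5), (1.1, 1.0)]
theta_hat = mle_theta([1, 1, 0, 0], items)
```

With only a handful of items per module, such estimates are noticeably biased at the extremes, which is one reason MST studies compare alternative estimators.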
Makransky, Guido; Mortensen, Erik Lykke; Glas, Cees A. W. – Assessment, 2013
Narrowly defined personality facet scores are commonly reported and used for making decisions in clinical and organizational settings. Although these facets are typically related, scoring is usually carried out for a single facet at a time. This method can be ineffective and time consuming when personality tests contain many highly correlated…
Descriptors: Computer Assisted Testing, Adaptive Testing, Personality Measures, Accuracy
Partnership for Assessment of Readiness for College and Careers, 2016
The Partnership for Assessment of Readiness for College and Careers (PARCC) is a state-led consortium designed to create next-generation assessments that, compared to traditional K-12 assessments, more accurately measure student progress toward college and career readiness. The PARCC assessments are aligned to the Common Core State Standards…
Descriptors: Standardized Tests, Career Readiness, College Readiness, Test Validity
Judd, Wallace – Practical Assessment, Research & Evaluation, 2009
Over the past twenty years in performance testing a specific item type with distinguishing characteristics has arisen time and time again. It's been invented independently by dozens of test development teams. And yet this item type is not recognized in the research literature. This article is an invitation to investigate the item type, evaluate…
Descriptors: Test Items, Test Format, Evaluation, Item Analysis
Lau, Paul Ngee Kiong; Lau, Sie Hoe; Hong, Kian Sam; Usop, Hasbee – Educational Technology & Society, 2011
The number right (NR) method, in which students pick one option as the answer, is the conventional method for scoring multiple-choice tests that is heavily criticized for encouraging students to guess and failing to credit partial knowledge. In addition, computer technology is increasingly used in classroom assessment. This paper investigates the…
Descriptors: Guessing (Tests), Multiple Choice Tests, Computers, Scoring
Rock, Donald A. – ETS Research Report Series, 2012
This paper provides a history of ETS's role in developing assessment instruments and psychometric procedures for measuring change in large-scale national assessments funded by the Longitudinal Studies branch of the National Center for Education Statistics. It documents the innovations developed during more than 30 years of working with…
Descriptors: Models, Educational Change, Longitudinal Studies, Educational Development
Wise, Steven L. – 1999
Outside of large-scale testing programs, the computerized adaptive test (CAT) has thus far had only limited impact on measurement practice. In smaller-scale testing contexts, limited data are often available, which precludes the establishment of calibrated item pools for use by traditional (i.e., item response theory (IRT) based) CATs. This paper…
Descriptors: Adaptive Testing, Computer Assisted Testing, Item Response Theory, Scores

Stocking, Martha L. – Journal of Educational and Behavioral Statistics, 1996
An alternative method for scoring adaptive tests, based on number-correct scores, is explored and compared with a method that relies more directly on item response theory. Using the number-correct score with necessary adjustment for intentional differences in adaptive test difficulty is a statistically viable scoring method. (SLD)
Descriptors: Adaptive Testing, Computer Assisted Testing, Difficulty Level, Item Response Theory
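One standard way to adjust number-correct scores for intentional differences in adaptive-test difficulty is to invert each form's test characteristic curve (TCC). The sketch below illustrates the general idea under a 2PL model with invented item parameters; it is not the specific procedure evaluated in the article:

```python
import math

def p_correct(theta, a, b):
    # 2PL item response function
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def expected_number_correct(theta, items):
    """Test characteristic curve: expected raw score at ability theta."""
    return sum(p_correct(theta, a, b) for a, b in items)

def theta_from_raw(raw, items, lo=-4.0, hi=4.0, tol=1e-6):
    """Invert the TCC by bisection: find the theta whose expected
    number-correct equals the observed raw score."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if expected_number_correct(mid, items) < raw:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Two hypothetical 4-item forms of different difficulty
easy_form = [(1.0, -1.5), (1.0, -1.0), (1.0, -0.5), (1.0, 0.0)]
hard_form = [(1.0, 0.0), (1.0, 0.5), (1.0, 1.0), (1.0, 1.5)]
theta_easy = theta_from_raw(3.0, easy_form)
theta_hard = theta_from_raw(3.0, hard_form)
```

The same raw score of 3 maps to a higher ability on the harder form, which is the adjustment the number-correct approach needs in order to be viable.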

Wheeler, Patricia H. – 1995
When individuals are given tests that are too hard or too easy, the resulting scores are likely to be poor estimates of their performance. To get valid and accurate test scores that provide meaningful results, one should use functional-level testing (FLT). FLT is the practice of administering to an individual a version of a test with a difficulty…
Descriptors: Adaptive Testing, Difficulty Level, Educational Assessment, Performance
Slater, Sharon C.; Schaeffer, Gary A. – 1996
The General Computer Adaptive Test (CAT) of the Graduate Record Examinations (GRE) includes three operational sections that are separately timed and scored. A "no score" is reported if the examinee answers fewer than 80% of the items or if the examinee does not answer all of the items and leaves the section before time expires. The 80%…
Descriptors: Adaptive Testing, College Students, Computer Assisted Testing, Equal Education
Lord, Frederic M. – 1980
The purpose of this book is to make it possible for measurement specialists to solve practical testing problems through the use of item response theory (IRT). The topics, organization, and presentation are those used in a 4-week seminar held each summer for the past several years. The material is organized to facilitate understanding; all related…
Descriptors: Adaptive Testing, Estimation (Mathematics), Evaluation Problems, Item Analysis
Bock, R. Darrell; Zimowski, Michele F. – National Center for Education Statistics, 2003
This report examines the potential of adaptive testing, two-stage testing in particular, for improving the data quality of the National Assessment of Educational Progress (NAEP). Following a discussion of the rationale for adaptive testing in assessment and a review of previous studies of two-stage testing, this report describes a 1993 Ohio…
Descriptors: National Competency Tests, Test Validity, Feasibility Studies, Educational Assessment
DeAyala, R. J.; Koch, William R. – 1986
A computerized flexilevel test was implemented and its ability estimates were compared with those of a Bayesian estimation based computerized adaptive test (CAT) as well as with known true ability estimates. Results showed that when the flexilevel test was terminated according to Lord's criterion, its ability estimates were highly and…
Descriptors: Ability, Adaptive Testing, Bayesian Statistics, Comparative Analysis
Lord, Frederic M. – 1971
Some stochastic approximation procedures are considered in relation to the problem of choosing a sequence of test questions to accurately estimate a given examinee's standing on a psychological dimension. Illustrations are given evaluating certain procedures in a specific context. (Author/CK)
Descriptors: Academic Ability, Adaptive Testing, Computer Programs, Difficulty Level
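Stochastic approximation of the Robbins-Monro type is the classic basis for such item-selection procedures: present an item at the current ability estimate, step up after a correct response and down after an incorrect one, with a step size that shrinks as items accumulate. A toy simulation of the general idea (the Rasch-style examinee model and all parameter values are invented for illustration):

```python
import math
import random

def simulated_response(theta_true, b, rng):
    """Rasch-style simulated examinee: 1 (correct) with probability
    depending on true ability minus item difficulty."""
    p = 1.0 / (1.0 + math.exp(-(theta_true - b)))
    return 1 if rng.random() < p else 0

def robbins_monro_estimate(theta_true, n_items=200, step=1.0, seed=0):
    """Robbins-Monro sequence: the next item's difficulty equals the
    current ability estimate; the step size step/k shrinks so the
    sequence settles near the ability where P(correct) = 0.5."""
    rng = random.Random(seed)
    b = 0.0  # start at average difficulty
    for k in range(1, n_items + 1):
        u = simulated_response(theta_true, b, rng)
        # correct -> next item harder; incorrect -> next item easier
        b += (step / k) * (2 * u - 1)
    return b

theta_hat = robbins_monro_estimate(theta_true=1.0)
```

The shrinking step size is what distinguishes this from a simple up-and-down rule: early items move the difficulty quickly, later items refine the estimate.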