ERIC - Search Results

Publication Date

In 2025	1
Since 2024	1
Since 2021 (last 5 years)	3
Since 2016 (last 10 years)	4
Since 2006 (last 20 years)	4

Descriptor

Computer Assisted Testing	13
Test Validity	6
Validity	6
Adaptive Testing	5
Test Construction	5
Admission (School)	3
College Entrance Examinations	3
College Students	3
Comparative Analysis	3
Computer Simulation	3
Educational Assessment	3
Graduate Study	3
Higher Education	3
Test Items	3
Artificial Intelligence	2
Comparative Testing	2
Evaluation Methods	2
Generalizability Theory	2
Item Banks	2
Models	2
Scoring	2
Test Reliability	2
Test Use	2
Academic Achievement	1
Adults	1

Source

Journal of Educational…

Publication Type

Journal Articles	13
Reports - Research	8
Reports - Evaluative	4
Information Analyses	1
Speeches/Meeting Papers	1

Education Level

Higher Education	1
Postsecondary Education	1

Audience

Researchers

Location

United Kingdom

Assessments and Surveys

Graduate Record Examinations

Showing all 13 results

Using Automated Procedures to Score Educational Essays Written in Three Languages

Peer reviewed

Tahereh Firoozi; Hamid Mohammadi; Mark J. Gierl – Journal of Educational Measurement, 2025

The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two different sentence embedding models were evaluated within the AES system, multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were…

Descriptors: College Students, Slavic Languages, German, Italian

Toward Argument-Based Fairness with an Application to AI-Enhanced Educational Assessments

Peer reviewed

A. Corinne Huggins-Manley; Brandon M. Booth; Sidney K. D'Mello – Journal of Educational Measurement, 2022

The field of educational measurement places validity and fairness as central concepts of assessment quality. Prior research has proposed embedding fairness arguments within argument-based validity processes, particularly when fairness is conceived as comparability in assessment properties across groups. However, we argue that a more flexible…

Descriptors: Educational Assessment, Persuasive Discourse, Validity, Artificial Intelligence

Validity Arguments Meet Artificial Intelligence in Innovative Educational Assessment

Peer reviewed

Dorsey, David W.; Michaels, Hillary R. – Journal of Educational Measurement, 2022

We have dramatically advanced our ability to create rich, complex, and effective assessments across a range of uses through technology advancement. Artificial Intelligence (AI) enabled assessments represent one such area of advancement--one that has captured our collective interest and imagination. Scientists and practitioners within the domains…

Descriptors: Validity, Ethics, Artificial Intelligence, Evaluation Methods

Hybrid Computerized Adaptive Testing: From Group Sequential Design to Fully Sequential Design

Peer reviewed

Wang, Shiyu; Lin, Haiyan; Chang, Hua-Hua; Douglas, Jeff – Journal of Educational Measurement, 2016

Computerized adaptive testing (CAT) and multistage testing (MST) have become two of the most popular modes in large-scale computer-based sequential testing. Though most designs of CAT and MST exhibit strength and weakness in recent large-scale implementations, there is no simple answer to the question of which design is better because different…

Descriptors: Computer Assisted Testing, Adaptive Testing, Test Format, Sequential Approach

"Mental Model" Comparison of Automated and Human Scoring.

Peer reviewed

Williamson, David M.; Bejar, Isaac I.; Hone, Anne S. – Journal of Educational Measurement, 1999

Contrasts "mental models" used by automated scoring for the simulation division of the computerized Architect Registration Examination with those used by experienced human graders for 3,613 candidate solutions. Discusses differences in the models used and the potential of automated scoring to enhance the validity evidence of scores. (SLD)

Descriptors: Architects, Comparative Analysis, Computer Assisted Testing, Judges

Evaluating Comparability in Computerized Adaptive Testing: Issues, Criteria and an Example.

Peer reviewed

Wang, Tianyou; Kolen, Michael J. – Journal of Educational Measurement, 2001

Reviews research literature on comparability issues in computerized adaptive testing (CAT) and synthesizes issues specific to comparability and test security. Develops a framework for evaluating comparability that contains three categories of criteria: (1) validity; (2) psychometric property/reliability; and (3) statistical assumption/test…

Descriptors: Adaptive Testing, Comparative Analysis, Computer Assisted Testing, Criteria

Generalizability, Validity, and Examinee Perceptions of a Computer-Delivered Formulating-Hypotheses Test.

Peer reviewed

Bennett, Randy Elliot; Rock, Donald A. – Journal of Educational Measurement, 1995

Examined the generalizability and validity and examinee perceptions of a computer-delivered version of 8 formulating-hypotheses tasks administered to 192 graduate students. Results support previous research that has suggested that formulating-hypotheses items can broaden the abilities measured by graduate admissions measures. (SLD)

Descriptors: Admission (School), College Entrance Examinations, Computer Assisted Testing, Generalizability Theory

Improving Measurement for Graduate Admissions.

Peer reviewed

Enright, Mary K.; Rock, Donald A.; Bennett, Randy Elliot – Journal of Educational Measurement, 1998

Examined alternative-item types and section configurations for improving the discriminant and convergent validity of the Graduate Record Examination (GRE) general test using a computer-based test given to 388 examinees who had taken the GRE previously. Adding new variations of logical meaning appeared to decrease discriminant validity. (SLD)

Descriptors: Admission (School), College Entrance Examinations, College Students, Computer Assisted Testing

Computerized Cognitive Diagnostic Adaptive Testing: Effect on Remedial Instruction as Empirical Validation.

Peer reviewed

Tatsuoka, Kikumi K.; Tatsuoka, Maurice M. – Journal of Educational Measurement, 1997

Results of studies involving 478 junior high school students in two years using cognitive diagnoses done through computerized adaptive testing indicate that knowing students' knowledge states before remediation is effective, and that the rule-space method can diagnose these knowledge states effectively. (SLD)

Descriptors: Adaptive Testing, Cognitive Tests, Computer Assisted Testing, Diagnostic Tests

A Comparison of the Performance of Simulated Hierarchical and Linear Testlets.

Peer reviewed

Wainer, Howard; And Others – Journal of Educational Measurement, 1992

Computer simulations were run to measure the relationship between testlet validity and factors of item pool size and testlet length for both adaptive and linearly constructed testlets. Making a testlet adaptive yields only modest increases in aggregate validity because of the peakedness of the typical proficiency distribution. (Author/SLD)

Descriptors: Adaptive Testing, Comparative Testing, Computer Assisted Testing, Computer Simulation

Computerized Adaptive and Fixed-Item Testing of Music Listening Skill: A Comparison of Efficiency, Precision, and Concurrent Validity.

Peer reviewed

Vispoel, Walter P.; And Others – Journal of Educational Measurement, 1997

Efficiency, precision, and concurrent validity of results from adaptive and fixed-item music listening tests were studied using: (1) 2,200 simulated examinees; (2) 204 live examinees; and (3) 172 live examinees. Results support the usefulness of adaptive tests for measuring skills that require aurally produced items. (SLD)

Descriptors: Adaptive Testing, Adults, College Students, Comparative Analysis

A Computer-Based Task for Measuring the Representational Component of Quantitative Proficiency.

Peer reviewed

Bennett, Randy Elliot; Sebrechts, Marc M. – Journal of Educational Measurement, 1997

A computer-delivered problem-solving task based on cognitive research literature was developed and its validity for graduate admissions assessment was studied with 107 undergraduates. Use of the test, which asked examinees to sort word-problem stems by prototypes, was supported by the findings. (SLD)

Descriptors: Admission (School), College Entrance Examinations, Computer Assisted Testing, Graduate Study

Evaluating and Predicting Survey Efficiency Using Generalizability Theory.

Peer reviewed

Johnson, Sandra; Bell, John F. – Journal of Educational Measurement, 1985

The assessment framework underlying a science performance monitoring program is process-oriented and intended to appeal to generalizability theory for a suitable estimation paradigm. Preliminary applications are described. Results suggest that computerized question-banking, domain-sampling of questions, and generalizability theory together provide…

Descriptors: Academic Achievement, Computer Assisted Testing, Educational Assessment, Foreign Countries

Author

Bennett, Randy Elliot	3
Rock, Donald A.	2
A. Corinne Huggins-Manley	1
Bejar, Isaac I.	1
Bell, John F.	1
Brandon M. Booth	1
Chang, Hua-Hua	1
Dorsey, David W.	1
Douglas, Jeff	1
Enright, Mary K.	1
Hamid Mohammadi	1
Hone, Anne S.	1
Johnson, Sandra	1
Kolen, Michael J.	1
Lin, Haiyan	1
Mark J. Gierl	1
Michaels, Hillary R.	1
Sebrechts, Marc M.	1
Sidney K. D'Mello	1
Tahereh Firoozi	1
Tatsuoka, Kikumi K.	1
Tatsuoka, Maurice M.	1
Vispoel, Walter P.	1
Wainer, Howard	1