NotesFAQContact Us
Collection
Advanced
Search Tips
Showing all 11 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Tahereh Firoozi; Hamid Mohammadi; Mark J. Gierl – Journal of Educational Measurement, 2025
The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two different sentence embedding models were evaluated within the AES system, multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were…
Descriptors: College Students, Slavic Languages, German, Italian
Peer reviewed Peer reviewed
Vispoel, Walter P.; Rocklin, Thomas R.; Wang, Tianyou; Bleiler, Timothy – Journal of Educational Measurement, 1999
Investigated the effectiveness of H. Wainer's (1993) strategy for obtaining positively biased ability estimates when examinees can review and change answers on computerized adaptive tests. Results, based on simulation and testing data from 87 college students, show that the Wainer strategy sometimes yields inflated ability estimates and sometimes…
Descriptors: Ability, College Students, Computer Assisted Testing, Higher Education
Peer reviewed Peer reviewed
Vispoel, Walter P. – Journal of Educational Measurement, 1998
Compared results from computer-adaptive and self-adaptive tests under conditions in which item review was and was not permitted for 379 college students. Results suggest that, when given the opportunity, most examinees will change answers, but usually only to a small portion of items, resulting in some benefit to the test taker. (SLD)
Descriptors: Adaptive Testing, College Students, Computer Assisted Testing, Higher Education
Peer reviewed Peer reviewed
Vispoel, Walter P.; Hendrickson, Amy B.; Bleiler, Timothy – Journal of Educational Measurement, 2000
Evaluated the effectiveness of vocabulary computerized adaptive tests (CATs) with restricted review in a live testing setting involving 242 college students in which special efforts were made to increase test efficiency and reduce the possibility of obtaining positively biased proficiency estimates. Results suggest the efficacy of allowing limited…
Descriptors: Adaptive Testing, Attitudes, College Students, Computer Assisted Testing
Peer reviewed Peer reviewed
Vispoel, Walter P. – Journal of Educational Measurement, 1998
Studied effects of administration mode [computer adaptive test (CAT) versus self-adaptive test (SAT)], item-by-item answer feedback, and test anxiety on results from computerized vocabulary tests taken by 293 college students. CATs were more reliable than SATs, and administration time was less when feedback was provided. (SLD)
Descriptors: Adaptive Testing, College Students, Computer Assisted Testing, Feedback
Peer reviewed Peer reviewed
Enright, Mary K.; Rock, Donald A.; Bennett, Randy Elliot – Journal of Educational Measurement, 1998
Examined alternative-item types and section configurations for improving the discriminant and convergent validity of the Graduate Record Examination (GRE) general test using a computer-based test given to 388 examinees who had taken the GRE previously. Adding new variations of logical meaning appeared to decrease discriminant validity. (SLD)
Descriptors: Admission (School), College Entrance Examinations, College Students, Computer Assisted Testing
Peer reviewed Peer reviewed
Bridgeman, Brent; Rock, Donald A. – Journal of Educational Measurement, 1993
Exploratory and confirmatory factor analyses were used to explore relationships among existing item types and three new computer-administered item types for the analytical scale of the Graduate Record Examination General Test. Results with 349 students indicate constructs the item types are measuring. (SLD)
Descriptors: College Entrance Examinations, College Students, Comparative Testing, Computer Assisted Testing
Peer reviewed Peer reviewed
Vispoel, Walter P.; And Others – Journal of Educational Measurement, 1997
Efficiency, precision, and concurrent validity of results from adaptive and fixed-item music listening tests were studied using: (1) 2,200 simulated examinees; (2) 204 live examinees; and (3) 172 live examinees. Results support the usefulness of adaptive tests for measuring skills that require aurally produced items. (SLD)
Descriptors: Adaptive Testing, Adults, College Students, Comparative Analysis
Peer reviewed Peer reviewed
Hirsch, Thomas M. – Journal of Educational Measurement, 1989
Equatings were performed on both simulated and real data sets using common-examinee design and two abilities for each examinee. Results indicate that effective equating, as measured by comparability of true scores, is possible with the techniques used in this study. However, the stability of the ability estimates proved unsatisfactory. (TJH)
Descriptors: Academic Ability, College Students, Comparative Analysis, Computer Assisted Testing
Peer reviewed Peer reviewed
Bridgeman, Brent – Journal of Educational Measurement, 1992
Examinees in a regular administration of the quantitative portion of the Graduate Record Examination responded to particular items in a machine-scannable multiple-choice format. Volunteers (n=364) used a computer to answer open-ended counterparts of these items. Scores for both formats demonstrated similar correlational patterns. (SLD)
Descriptors: Answer Sheets, College Entrance Examinations, College Students, Comparative Testing
Peer reviewed Peer reviewed
Cohen, Allan S.; And Others – Journal of Educational Measurement, 1991
Detecting differential item functioning (DIF) on test items constructed to favor 1 group over another was investigated on parameter estimates from 2 item response theory-based computer programs--BILOG and LOGIST--using data for 1,000 White and 1,000 Black college students. Use of prior distributions and marginal-maximum a posteriori estimation is…
Descriptors: Black Students, College Students, Computer Assisted Testing, Equations (Mathematics)