Showing all 12 results
Peer reviewed
Flodén, Jonas – British Educational Research Journal, 2025
This study compares how the generative AI (GenAI) large language model (LLM) ChatGPT and human teachers perform in grading university exams. Aspects investigated include consistency, large discrepancies, and answer length. Implications for higher education, including the role of teachers and ethics, are also discussed. Three…
Descriptors: College Faculty, Artificial Intelligence, Comparative Testing, Scoring
Peer reviewed
Ventouras, Errikos; Triantis, Dimos; Tsiakas, Panagiotis; Stergiopoulos, Charalampos – Computers & Education, 2011
The aim of the present research was to compare the use of multiple-choice questions (MCQs) as an examination method against the oral examination (OE) method. MCQs are widely used and their importance seems likely to grow due to their inherent suitability for electronic assessment. However, MCQs are influenced by the tendency of examinees to guess…
Descriptors: Grades (Scholastic), Scoring, Multiple Choice Tests, Test Format
Peer reviewed
Attali, Yigal; Bridgeman, Brent; Trapani, Catherine – Journal of Technology, Learning, and Assessment, 2010
A generic approach in automated essay scoring produces scores that have the same meaning across all prompts, existing or new, of a writing assessment. This is accomplished by using a single set of linguistic indicators (or features), a consistent way of combining and weighting these features into essay scores, and a focus on features that are not…
Descriptors: Writing Evaluation, Writing Tests, Scoring, Test Scoring Machines
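The generic approach described in this abstract amounts to one fixed set of prompt-independent features combined by one fixed weight vector. Below is a minimal sketch of that idea, assuming hypothetical features and made-up weights; it is not the actual e-rater feature set or model studied by Attali, Bridgeman, and Trapani.

```python
# A minimal sketch of generic automated essay scoring: a single fixed set of
# prompt-independent linguistic features, combined by a single fixed weight
# vector, so scores carry the same meaning across existing and new prompts.
# Features and weights below are hypothetical placeholders for illustration.

def extract_features(essay: str) -> dict[str, float]:
    """Compute illustrative prompt-independent linguistic indicators."""
    words = essay.split()
    sentences = [s for s in essay.replace("!", ".").replace("?", ".").split(".")
                 if s.strip()]
    return {
        "word_count": float(len(words)),
        "avg_word_length": sum(len(w) for w in words) / max(len(words), 1),
        "avg_sentence_length": len(words) / max(len(sentences), 1),
    }

# Reusing one weight vector for every prompt is what makes the resulting
# scores comparable across prompts. (Values are invented for this sketch.)
WEIGHTS = {"word_count": 0.01, "avg_word_length": 0.5, "avg_sentence_length": 0.1}

def score_essay(essay: str) -> float:
    """Combine the fixed features with the fixed weights into one score."""
    features = extract_features(essay)
    return sum(WEIGHTS[name] * value for name, value in features.items())

print(round(score_essay("Generic scoring reuses one feature set. "
                        "The weights never change across prompts."), 2))
```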
Peer reviewed
Lissitz, Robert W.; Hou, Xiaodong; Slater, Sharon Cadman – Journal of Applied Testing Technology, 2012
This article investigates several questions regarding the impact of different item formats on measurement characteristics. Constructed response (CR) items and multiple choice (MC) items obviously differ in their formats and in the resources needed to score them. As such, they have been the subject of considerable discussion regarding the impact of…
Descriptors: Computer Assisted Testing, Scoring, Evaluation Problems, Psychometrics
Peer reviewed
Miller, Mark J.; Cowger, Ernest, Jr.; Young, Tony; Tobacyk, Jerome; Sheets, Tillman; Loftus, Christina – College Student Journal, 2008
This study examined the degree of similarity between scores on the Self-Directed Search and an online instrument measuring Holland types. A relatively high congruency score was found between the two measures. Implications for career counselors are discussed.
Descriptors: Career Counseling, Personality Assessment, Congruence (Psychology), Personality Traits
Wang, Jinhao; Brown, Michelle Stallone – Journal of Technology, Learning, and Assessment, 2007
The current research was conducted to investigate the validity of automated essay scoring (AES) by comparing group mean scores assigned by an AES tool, IntelliMetric [TM], and by human raters. Data collection included administering the Texas version of the WritePlacer "Plus" test and obtaining scores assigned by IntelliMetric [TM] and by…
Descriptors: Test Scoring Machines, Scoring, Comparative Testing, Intermode Differences
Peer reviewed
Bennett, Randy Elliot; And Others – Applied Psychological Measurement, 1990
The relationship of an expert-system-scored constrained free-response item type to multiple-choice and free-response items was studied using data for 614 students on the College Board's Advanced Placement Computer Science (APCS) Examination. Implications for testing and the APCS test are discussed. (SLD)
Descriptors: College Students, Comparative Testing, Computer Assisted Testing, Computer Science
Anderson, Paul S.; Hyers, Albert D. – 1991
Three descriptive statistics (difficulty, discrimination, and reliability) of multiple-choice (MC) test items were compared to those of a new (1980s) format of machine-scored questions. The new method, answer-bank multi-digit testing (MDT), uses alphabetized lists of up to 1,000 alternatives and approximates the completion style of assessment…
Descriptors: College Students, Comparative Testing, Computer Assisted Testing, Correlation
Peer reviewed
Endler, Norman S.; Parker, James D. A. – Educational and Psychological Measurement, 1990
C. Davis and M. Cowles (1989) analyzed a total trait anxiety score on the Endler Multidimensional Anxiety Scales (EMAS), a unidimensional construct that this multidimensional measure does not assess. Data are reanalyzed using the appropriate scoring procedure for the EMAS. Subjects included 145 undergraduates in 1 of 4 testing conditions. (SLD)
Descriptors: Anxiety, Comparative Testing, Computer Assisted Testing, Construct Validity
Bergstrom, Betty A.; Lunz, Mary E. – 1991
The level of confidence in pass/fail decisions obtained with computer adaptive tests (CATs) was compared to decisions based on paper-and-pencil tests. Subjects included 645 medical technology students from 238 educational programs across the country. The tests used in this study constituted part of the subjects' review for the certification…
Descriptors: Adaptive Testing, Certification, Comparative Testing, Computer Assisted Testing
Hyers, Albert D.; Anderson, Paul S. – 1991
Using matched pairs of geography questions, a new testing method for machine-scored fill-in-the-blank, multiple-digit testing (MDT) questions was compared to the traditional multiple-choice (MC) style. Data came from 118 matched or parallel test items across 4 tests administered to 764 college students of geography. The new method produced superior results when…
Descriptors: College Students, Comparative Testing, Computer Assisted Testing, Difficulty Level
Bejar, Isaac I.; And Others – 1977
Information provided by typical and improved conventional classroom achievement tests was compared with information provided by an adaptive test covering the same subject matter. Both tests were administered to over 700 college students in a general biology course. Using the same scoring method, adaptive testing was found to yield substantially…
Descriptors: Academic Achievement, Achievement Tests, Adaptive Testing, Biology