Han, Yong; Wu, Wenjun; Ji, Suozhao; Zhang, Lijun; Zhang, Hui – International Educational Data Mining Society, 2019
Peer-grading is commonly adopted by instructors as an effective assessment method for MOOCs (Massive Open Online Courses) and SPOCs (Small Private Online Courses). For solving the problems brought by varied skill levels and attitudes of online students, statistical models have been proposed to improve the fairness and accuracy of peer-grading.…
Descriptors: Peer Evaluation, Grading, Online Courses, Computer Assisted Testing
Cohen, Yoav; Levi, Effi; Ben-Simon, Anat – Applied Measurement in Education, 2018
In the current study, two pools of 250 essays, all written as a response to the same prompt, were rated by two groups of raters (14 or 15 raters per group), thereby providing an approximation to the essay's true score. An automated essay scoring (AES) system was trained on the datasets and then scored the essays using a cross-validation scheme. By…
Descriptors: Test Validity, Automation, Scoring, Computer Assisted Testing
Yao, Lili; Haberman, Shelby J.; Zhang, Mo – ETS Research Report Series, 2019
Many assessments of writing proficiency that aid in making high-stakes decisions consist of several essay tasks evaluated by a combination of human holistic scores and computer-generated scores for essay features such as the rate of grammatical errors per word. Under typical conditions, a summary writing score is provided by a linear combination…
Descriptors: Prediction, True Scores, Computer Assisted Testing, Scoring

Zwick, Rebecca; And Others – Journal of Educational Measurement, 1995
In a simulation study of ability and estimation of differential item functioning (DIF) in computerized adaptive tests, Rasch-based DIF statistics were highly correlated with generating DIF, but DIF statistics tended to be slightly smaller than in the three-parameter logistic model analyses. (SLD)
Descriptors: Ability, Adaptive Testing, Computer Assisted Testing, Computer Simulation
Wang, Jinhao; Brown, Michelle Stallone – Journal of Technology, Learning, and Assessment, 2007
The current research was conducted to investigate the validity of automated essay scoring (AES) by comparing group mean scores assigned by an AES tool, IntelliMetric [TM], and human raters. Data collection included administering the Texas version of the WriterPlacer "Plus" test and obtaining scores assigned by IntelliMetric [TM] and by…
Descriptors: Test Scoring Machines, Scoring, Comparative Testing, Intermode Differences
Attali, Yigal – ETS Research Report Series, 2007
This study examined the construct validity of the "e-rater"® automated essay scoring engine as an alternative to human scoring in the context of TOEFL® essay writing. Analyses were based on a sample of students who repeated the TOEFL within a short time period. Two "e-rater" scores were investigated in this study, the first…
Descriptors: Construct Validity, Computer Assisted Testing, Scoring, English (Second Language)
Wang, Xiang-Bo; Harris, Vincent; Roussos, Louis – 2002
Multidimensionality is known to affect the accuracy of item parameter and ability estimations, which subsequently influences the computation of item characteristic curves (ICCs) and true scores. By judiciously combining sections of a Law School Admission Test (LSAT), 11 sections of varying degrees of uni- and multidimensional structures are used…
Descriptors: Ability, College Entrance Examinations, Computer Assisted Testing, Estimation (Mathematics)

Zwick, Rebecca; And Others – 1994
A previous simulation study of methods for assessing differential item functioning (DIF) in computer-adaptive tests (CATs) showed that modified versions of the Mantel-Haenszel and standardization methods work well with CAT data. In that study, data were generated using the three-parameter logistic (3PL) model, and this same model was assumed in obtaining item…
Descriptors: Ability, Adaptive Testing, Computer Assisted Testing, Computer Simulation

Lord, Frederic M. – Applied Psychological Measurement, 1977
Under given conditions, conventional testing and computer-generated repeatable testing (CGRT) are equally effective for estimating examinee ability; CGRT is more effective for estimating the mean ability level of a group and less effective for estimating ability differences among individuals. These conclusions are drawn from domain-referenced test…
Descriptors: Career Development, Computer Assisted Testing, Difficulty Level, Group Norms

Hirsch, Thomas M. – Journal of Educational Measurement, 1989
Equatings were performed on both simulated and real data sets using common-examinee design and two abilities for each examinee. Results indicate that effective equating, as measured by comparability of true scores, is possible with the techniques used in this study. However, the stability of the ability estimates proved unsatisfactory. (TJH)
Descriptors: Academic Ability, College Students, Comparative Analysis, Computer Assisted Testing
Hicks, Marilyn M. – 1989
Methods of computerized adaptive testing using conventional scoring methods in order to develop a computerized placement test for the Test of English as a Foreign Language (TOEFL) were studied. As a consequence of simulation studies during the first phase of the study, the multilevel testing paradigm was adopted to produce three test levels…
Descriptors: Adaptive Testing, Adults, Algorithms, Computer Assisted Testing