Tahereh Firoozi; Hamid Mohammadi; Mark J. Gierl – Journal of Educational Measurement, 2025
The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two different sentence embedding models were evaluated within the AES system, multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were…
Descriptors: College Students, Slavic Languages, German, Italian

Wainer, Howard; And Others – Journal of Educational Measurement, 1992
Computer simulations were run to measure the relationship between testlet validity and factors of item pool size and testlet length for both adaptive and linearly constructed testlets. Making a testlet adaptive yields only modest increases in aggregate validity because of the peakedness of the typical proficiency distribution. (Author/SLD)
Descriptors: Adaptive Testing, Comparative Testing, Computer Assisted Testing, Computer Simulation

Wainer, Howard; And Others – Journal of Educational Measurement, 1991
Hierarchical (adaptive) and linear methods of testlet construction were compared. The performance of 2,080 ninth and tenth graders on a 4-item testlet was used to predict performance on the entire test. The adaptive test was slightly superior as a predictor, but the cost of obtaining that superiority was considerable. (SLD)
Descriptors: Adaptive Testing, Algebra, Comparative Testing, High School Students

Breland, Hunter M.; Gaynor, Judith L. – Journal of Educational Measurement, 1979
Over 2,000 writing samples were collected from four undergraduate institutions and compared, where possible, with scores on a multiple-choice test. High correlations between ratings of the writing samples and multiple-choice test scores were obtained. Samples contributed substantially to the prediction of both college grades and writing…
Descriptors: Achievement Tests, Comparative Testing, Correlation, Essay Tests

Stricker, Lawrence J. – Journal of Educational Measurement, 1991
To study whether different forms of the Scholastic Aptitude Test (SAT) used since the mid-1970s varied in their correlations with academic performance criteria, 1975 and 1985 forms were administered to 1,554 and 1,753 high school juniors, respectively. The 1975 form did not have greater validity than the 1985 form. (SLD)
Descriptors: Class Rank, College Entrance Examinations, Comparative Testing, Correlation