Showing all 10 results
Peer reviewed
Attali, Yigal; Lewis, Will; Steier, Michael – Language Testing, 2013
Automated essay scoring can produce reliable scores that are highly correlated with human scores, but is limited in its evaluation of content and other higher-order aspects of writing. The increased use of automated essay scoring in high-stakes testing underscores the need for human scoring that is focused on higher-order aspects of writing. This…
Descriptors: Scoring, Essay Tests, Reliability, High Stakes Tests
Peer reviewed
Zhang, Mo – ETS Research Report Series, 2013
Many testing programs use automated scoring to grade essays. One issue in automated essay scoring that has not been examined adequately is population invariance and its causes. The primary purpose of this study was to investigate the impact of sampling in model calibration on population invariance of automated scores. This study analyzed scores…
Descriptors: Automation, Scoring, Essay Tests, Sampling
Peer reviewed
Attali, Yigal; Sinharay, Sandip – ETS Research Report Series, 2015
The "e-rater"® automated essay scoring system is used operationally in the scoring of the argument and issue tasks that form the Analytical Writing measure of the "GRE"® General Test. For each of these tasks, this study explored the value added of reporting 4 trait scores for each of these 2 tasks over the total e-rater score.…
Descriptors: Scores, Computer Assisted Testing, Computer Software, Grammar
Peer reviewed
Semmes, Robert; Davison, Mark L.; Close, Catherine – Applied Psychological Measurement, 2011
If numerical reasoning items are administered under time limits, will two dimensions be required to account for the responses, a numerical ability dimension and a speed dimension? A total of 182 college students answered 74 numerical reasoning items. Every item was taken with and without time limits by half the students. Three psychometric models…
Descriptors: Individual Differences, Logical Thinking, Timed Tests, College Students
Peer reviewed
Ramineni, Chaitanya; Trapani, Catherine S.; Williamson, David M.; Davey, Tim; Bridgeman, Brent – ETS Research Report Series, 2012
Automated scoring models for the "e-rater"® scoring engine were built and evaluated for the "GRE"® argument and issue-writing tasks. Prompt-specific, generic, and generic with prompt-specific intercept scoring models were built and evaluation statistics such as weighted kappas, Pearson correlations, standardized difference in…
Descriptors: Scoring, Test Scoring Machines, Automation, Models
Peer reviewed
Attali, Yigal; Powers, Don; Hawthorn, John – ETS Research Report Series, 2008
Registered examinees for the GRE® General Test answered open-ended sentence-completion items. For half of the items, participants received immediate feedback on the correctness of their answers and up to two opportunities to revise their answers. A significant feedback-and-revision effect was found. Participants were able to correct many of their…
Descriptors: College Entrance Examinations, Graduate Study, Sentences, Psychometrics
Reilly, Richard R.; Jackson, Rex – 1972
Item options of shortened forms of the Graduate Record Examination Verbal and Quantitative tests were empirically weighted by two variants of a method originally attributed to Guttman. The first method assigned to each option of an item the mean standard score on the remaining items of all subjects choosing that option. The second procedure…
Descriptors: Correlation, Factor Analysis, Graduate Study, Scoring
Carlson, Sybil B.; And Others – 1985
Four writing samples were obtained from 638 foreign college applicants who represented three major foreign language groups (Arabic, Chinese, and Spanish), and from 60 native English speakers. All four samples were scored holistically, two were also scored for sentence-level and discourse-level skills, and some were scored by the Writer's Workbench…
Descriptors: Arabic, Chinese, College Entrance Examinations, Computer Software
Reilly, Richard R. – 1972
Because previous reports have suggested that the lowered validity of tests scored with empirical option weights might be explained by a capitalization of the keying procedures on omitting tendencies, a procedure was devised to key options empirically with a "correction-for-guessing" constraint. Use of the new procedure with Graduate…
Descriptors: Correlation, Data Analysis, Guessing (Tests), Mathematical Applications
Peer reviewed
Gorin, Joanna S.; Embretson, Susan E. – Applied Psychological Measurement, 2006
Recent assessment research joining cognitive psychology and psychometric theory has introduced a new technology, item generation. In algorithmic item generation, items are systematically created based on specific combinations of features that underlie the processing required to correctly solve a problem. Reading comprehension items have been more…
Descriptors: Difficulty Level, Test Items, Modeling (Psychology), Paragraph Composition