ERIC - Search Results

Publication Date

In 2025	0
Since 2024	1
Since 2021 (last 5 years)	4
Since 2016 (last 10 years)	7
Since 2006 (last 20 years)	19

Descriptor

Weighted Scores	19
Computer Assisted Testing	7
Scoring	7
Statistical Analysis	7
Correlation	6
Comparative Analysis	5
Sampling	5
Scores	5
Test Items	5
College Entrance Examinations	4
Computation	4
Error of Measurement	4
Essay Tests	4
Item Response Theory	4
Prediction	4
Regression (Statistics)	4
Writing Evaluation	4
Automation	3
Construct Validity	3
Equated Scores	3
Essays	3
Factor Analysis	3
Language Tests	3
Models	3
Testing	3
More ▼

Source

ETS Research Report Series

Publication Type

Journal Articles	19
Reports - Research	19

Education Level

Higher Education	5
Elementary Education	3
Postsecondary Education	3
Grade 4	2
Grade 8	2
Intermediate Grades	2
Junior High Schools	2
Middle Schools	2
Secondary Education	2
Elementary Secondary Education	1
Grade 10	1
High Schools	1
More ▼

Audience

Location

Kentucky

Laws, Policies, & Programs

No Child Left Behind Act 2001

Assessments and Surveys

Graduate Record Examinations	3
National Assessment of…	2
Test of English as a Foreign…	2

What Works Clearinghouse Rating

Showing 1 to 15 of 19 results Save | Export

Detecting the Impact of Remote Proctored At-Home Testing Using Propensity Score Weighting. Research Report. ETS RR-24-11

Peer reviewed
PDF on ERIC

Download full text

Jing Miao; Yi Cao; Michael E. Walker – ETS Research Report Series, 2024

Studies of test score comparability have been conducted at different stages in the history of testing to ensure that test results carry the same meaning regardless of test conditions. The expansion of at-home testing via remote proctoring sparked another round of interest. This study uses data from three licensure tests to assess potential mode…

Descriptors: Testing, Test Format, Computer Assisted Testing, Home Study

Effect of Statistically Matching Equating Samples for Common-Item Equating. Research Report. ETS RR-21-02

Peer reviewed
PDF on ERIC

Download full text

Lu, Ru; Kim, Sooyeon – ETS Research Report Series, 2021

This study evaluated the impact of subgroup weighting for equating through a common-item anchor. We used data from a single test form to create two research forms for which the equating relationship was known. The results showed that equating was most accurate when the new form and reference form samples were weighted to be similar to the target…

Descriptors: Equated Scores, Weighted Scores, Raw Scores, Test Items

Comparisons among Approaches to Link Tests Using Random Samples Selected under Suboptimal Conditions. Research Report. ETS RR-21-14

Peer reviewed
PDF on ERIC

Download full text

Kim, Sooyeon; Walker, Michael E. – ETS Research Report Series, 2021

Equating the scores from different forms of a test requires collecting data that link the forms. Problems arise when the test forms to be linked are given to groups that are not equivalent and the forms share no common items by which to measure or adjust for this group nonequivalence. We compared three approaches to adjusting for group…

Descriptors: Equated Scores, Weighted Scores, Sampling, Multiple Choice Tests

A Note on Using Weighted Sum Scores in the P-DIF Statistic. Research Report. ETS RR-19-32

Peer reviewed
PDF on ERIC

Download full text

Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2019

The Mantel-Haenszel delta difference (MH D-DIF) and the standardized proportion difference (STD P-DIF) are two observed-score methods that have been used to assess differential item functioning (DIF) at Educational Testing Service since the early 1990s. Latentvariable approaches to assessing measurement invariance at the item level have been…

Descriptors: Test Bias, Educational Testing, Statistical Analysis, Item Response Theory

Robustness of Weighted Differential Item Functioning (DIF) Analysis: The Case of Mantel-Haenszel DIF Statistics. Research Report. ETS RR-21-12

Peer reviewed
PDF on ERIC

Download full text

Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021

Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel-Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from…

Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis

Implementing a Contributory Scoring Approach for the "GRE"® Analytical Writing Section: A Comprehensive Empirical Investigation. Research Report. ETS RR-17-14

Peer reviewed
PDF on ERIC

Download full text

Breyer, F. Jay; Rupp, André A.; Bridgeman, Brent – ETS Research Report Series, 2017

In this research report, we present an empirical argument for the use of a contributory scoring approach for the 2-essay writing assessment of the analytical writing section of the "GRE"® test in which human and machine scores are combined for score creation at the task and section levels. The approach was designed to replace a currently…

Descriptors: College Entrance Examinations, Scoring, Essay Tests, Writing Evaluation

What Differential Weighting of Subsets of Items Does and Does Not Accomplish: Geometric Explanation. Research Report. ETS RR-14-20

Peer reviewed
PDF on ERIC

Download full text

Carlson, James E. – ETS Research Report Series, 2014

A little-known theorem, a generalization of Pythagoras's theorem, due to Pappus, is used to present a geometric explanation of various definitions of the contribution of component tests to their composite. I show that an unambiguous definition of the unique contribution of a component to the composite score variance is present if and only if the…

Descriptors: Geometric Concepts, Scores, Validity, Reliability

An Investigation of the "e-rater"® Automated Scoring Engine's Grammar, Usage, Mechanics, and Style Microfeatures and Their Aggregation Model. Research Report. ETS RR-17-04

Peer reviewed
PDF on ERIC

Download full text

Chen, Jing; Zhang, Mo; Bejar, Isaac I. – ETS Research Report Series, 2017

Automated essay scoring (AES) generally computes essay scores as a function of macrofeatures derived from a set of microfeatures extracted from the text using natural language processing (NLP). In the "e-rater"® automated scoring engine, developed at "Educational Testing Service" (ETS) for the automated scoring of essays, each…

Descriptors: Computer Assisted Testing, Scoring, Automation, Essay Tests

Weighting Test Samples in IRT Linking and Equating: Toward an Improved Sampling Design for Complex Equating. Research Report. ETS RR-13-39

Peer reviewed
PDF on ERIC

Download full text

Qian, Jiahe; Jiang, Yanming; von Davier, Alina A. – ETS Research Report Series, 2013

Several factors could cause variability in item response theory (IRT) linking and equating procedures, such as the variability across examinee samples and/or test items, seasonality, regional differences, native language diversity, gender, and other demographic variables. Hence, the following question arises: Is it possible to select optimal…

Descriptors: Item Response Theory, Test Items, Sampling, True Scores

Automated Trait Scores for "TOEFL"® Writing Tasks. Research Report. ETS RR-15-14

Peer reviewed
PDF on ERIC

Download full text

Attali, Yigal; Sinharay, Sandip – ETS Research Report Series, 2015

The "e-rater"® automated essay scoring system is used operationally in the scoring of "TOEFL iBT"® independent and integrated tasks. In this study we explored the psychometric added value of reporting four trait scores for each of these two tasks, beyond the total e-rater score.The four trait scores are word choice, grammatical…

Descriptors: Writing Tests, Scores, Language Tests, English (Second Language)

Examining the Impact of Drifted Polytomous Anchor Items on Test Characteristic Curve (TCC) Linking and IRT True Score Equating. Research Report. ETS RR-12-09

Peer reviewed
PDF on ERIC

Download full text

Li, Yanmei – ETS Research Report Series, 2012

In a common-item (anchor) equating design, the common items should be evaluated for item parameter drift. Drifted items are often removed. For a test that contains mostly dichotomous items and only a small number of polytomous items, removing some drifted polytomous anchor items may result in anchor sets that no longer resemble mini-versions of…

Descriptors: Scores, Item Response Theory, Equated Scores, Simulation

Automated Trait Scores for "GRE"® Writing Tasks. Research Report. ETS RR-15-15

Peer reviewed
PDF on ERIC

Download full text

Attali, Yigal; Sinharay, Sandip – ETS Research Report Series, 2015

The "e-rater"® automated essay scoring system is used operationally in the scoring of the argument and issue tasks that form the Analytical Writing measure of the "GRE"® General Test. For each of these tasks, this study explored the value added of reporting 4 trait scores for each of these 2 tasks over the total e-rater score.…

Descriptors: Scores, Computer Assisted Testing, Computer Software, Grammar

Evaluation of the "e-rater"® Scoring Engine for the "GRE"® Issue and Argument Prompts. Research Report. ETS RR-12-02

Peer reviewed
PDF on ERIC

Download full text

Ramineni, Chaitanya; Trapani, Catherine S.; Williamson, David M.; Davey, Tim; Bridgeman, Brent – ETS Research Report Series, 2012

Automated scoring models for the "e-rater"® scoring engine were built and evaluated for the "GRE"® argument and issue-writing tasks. Prompt-specific, generic, and generic with prompt-specific intercept scoring models were built and evaluation statistics such as weighted kappas, Pearson correlations, standardized difference in…

Descriptors: Scoring, Test Scoring Machines, Automation, Models

Model-Based Weighting and Comparisons: Research Report. ETS RR-08-17

Peer reviewed
PDF on ERIC

Download full text

Qian, Jiahe – ETS Research Report Series, 2008

In survey research, sometimes the formation of groupings, or aggregations of cases on which to make an inference, are of importance. Of particular interest are the situations where the cases aggregated carry useful information that has been transferred from a sample employed in a previous study. For example, a school to be included in the sample…

Descriptors: Surveys, Models, High Schools, School Effectiveness

One Approach to Detecting the Invariance of Proficiency Standards over Time. Research Report. ETS RR-08-15

Peer reviewed
PDF on ERIC

Download full text

Qian, Jiahe – ETS Research Report Series, 2008

This study explores the use of a mapping technique to test the invariance of proficiency standards over time for state performance tests. First, the state proficiency standards are mapped onto the National Assessment of Educational Progress (NAEP) scale. Then, rather than looking at whether there is a deviation in proficiency standards directly,…

Descriptors: National Competency Tests, State Standards, Scores, Achievement Tests

Previous Page | Next Page »

Pages: 1 | 2

Qian, Jiahe	5
Attali, Yigal	4
Bridgeman, Brent	2
Dorans, Neil J.	2
Guo, Hongwen	2
Kim, Sooyeon	2
Lu, Ru	2
Sinharay, Sandip	2
Bejar, Isaac I.	1
Braun, Henry	1
Breyer, F. Jay	1
Carlson, James E.	1
Chen, Jing	1
Davey, Tim	1
Jiang, Yanming	1
Jing Miao	1
Li, Yanmei	1
Michael E. Walker	1
Ramineni, Chaitanya	1
Rupp, André A.	1
Trapani, Catherine S.	1
Walker, Michael E.	1
Williamson, David M.	1
Yi Cao	1
Zhang, Mo	1
More ▼