ERIC - Search Results

Showing 1 to 15 of 18 results

A Method for Displaying Incremental Validity with Expectancy Charts

Peer reviewed

Lee, Samuel David; Walmsley, Philip T.; Sackett, Paul R.; Kuncel, Nathan – Applied Measurement in Education, 2021

Providing assessment validity information to decision makers in a clear and useful format is an ongoing challenge for the educational and psychological measurement community. We identify issues with a previous approach to a graphical presentation, noting that it is mislabeled as presenting incremental validity, when in fact it displays the effects…

Descriptors: Test Validity, Predictor Variables, Charts

Understanding and Interpreting Human Scoring

Peer reviewed

Glazer, Nancy; Wolfe, Edward W. – Applied Measurement in Education, 2020

This introductory article describes how constructed response scoring is carried out, particularly the rater monitoring processes, and illustrates three potential designs for conducting rater monitoring in an operational scoring project. The introduction also presents a framework for interpreting research conducted by those who study the constructed…

Descriptors: Scoring, Test Format, Responses, Predictor Variables

Predictive Modeling of Rater Behavior: Implications for Quality Assurance in Essay Scoring

Peer reviewed

Bejar, Isaac I.; Li, Chen; McCaffrey, Daniel – Applied Measurement in Education, 2020

We evaluate the feasibility of developing predictive models of rater behavior, that is, "rater-specific" models for predicting the scores produced by a rater under operational conditions. In the present study, the dependent variable is the score assigned to essays by a rater, and the predictors are linguistic attributes of the essays…

Descriptors: Scoring, Essays, Behavior, Predictive Measurement

Response Demands of Reading Comprehension Test Items: A Review of Item Difficulty Modeling Studies

Peer reviewed

Ferrara, Steve; Steedle, Jeffrey T.; Frantz, Roger S. – Applied Measurement in Education, 2022

Item difficulty modeling studies involve (a) hypothesizing item features, or item response demands, that are likely to predict item difficulty with some degree of accuracy; and (b) entering the features as independent variables into a regression equation or other statistical model to predict difficulty. In this review, we report findings from 13…

Descriptors: Reading Comprehension, Reading Tests, Test Items, Item Response Theory

Can Culture Be a Salient Predictor of Test-Taking Engagement? An Analysis of Differential Noneffortful Responding on an International College-Level Assessment of Critical Thinking

Peer reviewed

Rios, Joseph A.; Guo, Hongwen – Applied Measurement in Education, 2020

The objective of this study was to evaluate whether differential noneffortful responding (identified via response latencies) was present in four countries administered a low-stakes college-level critical thinking assessment. Results indicated significant differences (as large as 0.90 "SD") between nearly all country pairings in the…

Descriptors: Response Style (Tests), Cultural Differences, Critical Thinking, Cognitive Tests

Applying Cognitive Theory to the Human Essay Rating Process

Peer reviewed

Finn, Bridgid; Arslan, Burcu; Walsh, Matthew – Applied Measurement in Education, 2020

To score an essay response, raters draw on previously trained skills and knowledge about the underlying rubric and score criterion. Cognitive processes such as remembering, forgetting, and skill decay likely influence rater performance. To investigate how forgetting influences scoring, we evaluated raters' scoring accuracy on TOEFL and GRE essays.…

Descriptors: Epistemology, Essay Tests, Evaluators, Cognitive Processes

Item Parameter Drift in a Time-Varying Predictor

Peer reviewed

Lee, HyeSun – Applied Measurement in Education, 2018

The current simulation study examined the effects of Item Parameter Drift (IPD) occurring in a short scale on parameter estimates in multilevel models where scores from a scale were employed as a time-varying predictor to account for outcome scores. Five factors, including three decisions about IPD, were considered for simulation conditions. It…

Descriptors: Test Items, Hierarchical Linear Modeling, Predictor Variables, Scores

Bi-Factor MIRT Observed-Score Equating for Mixed-Format Tests

Peer reviewed

Lee, Guemin; Lee, Won-Chan – Applied Measurement in Education, 2016

The main purposes of this study were to develop bi-factor multidimensional item response theory (BF-MIRT) observed-score equating procedures for mixed-format tests and to investigate relative appropriateness of the proposed procedures. Using data from a large-scale testing program, three types of pseudo data sets were formulated: matched samples,…

Descriptors: Test Format, Multidimensional Scaling, Item Response Theory, Equated Scores

Parameter Recovery and Classification Accuracy under Conditions of Testlet Dependency: A Comparison of the Traditional 2PL, Testlet, and Bi-Factor Models

Peer reviewed

Koziol, Natalie A. – Applied Measurement in Education, 2016

Testlets, or groups of related items, are commonly included in educational assessments due to their many logistical and conceptual advantages. Despite their advantages, testlets introduce complications into the theory and practice of educational measurement. Responses to items within a testlet tend to be correlated even after controlling for…

Descriptors: Classification, Accuracy, Comparative Analysis, Models

The Effects of Test Accommodations for English Language Learners: A Meta-Analysis

Peer reviewed

Li, Hongli; Suen, Hoi K. – Applied Measurement in Education, 2012

A meta-analysis using Hierarchical Linear Modeling (HLM) was conducted to examine the effects of test accommodations on the test performance of English language learners (ELLs). The results indicated that test accommodations improve ELLs' test performance by about 0.157 standard deviations--a relatively small but statistically significant…

Descriptors: Testing Accommodations, English (Second Language), Second Language Learning, Limited English Speaking

Criterion-Focused Approach to Reducing Adverse Impact in College Admissions

Peer reviewed

Sinha, Ruchi; Oswald, Frederick; Imus, Anna; Schmitt, Neal – Applied Measurement in Education, 2011

The current study examines how using a multidimensional battery of predictors (high-school grade point average (GPA), SAT/ACT, and biodata), and weighting the predictors based on the different values institutions place on various student performance dimensions (college GPA, organizational citizenship behaviors (OCBs), and behaviorally anchored…

Descriptors: Grade Point Average, Interrater Reliability, Rating Scales, College Admission

Correlates of Rapid-Guessing Behavior in Low-Stakes Testing: Implications for Test Development and Measurement Practice

Peer reviewed

Wise, Steven L.; Pastor, Dena A.; Kong, Xiaojing J. – Applied Measurement in Education, 2009

Previous research has shown that rapid-guessing behavior can degrade the validity of test scores from low-stakes proficiency tests. This study examined, using hierarchical generalized linear modeling, examinee and item characteristics for predicting rapid-guessing behavior. Several item characteristics were found significant; items with more text…

Descriptors: Guessing (Tests), Achievement Tests, Correlation, Test Items

Modeling Group Differences in OLS and Orthogonal Regression: Implications for Differential Validity Studies

Peer reviewed

Kane, Michael T.; Mroch, Andrew A. – Applied Measurement in Education, 2010

In evaluating the relationship between two measures across different groups (i.e., in evaluating "differential validity") it is necessary to examine differences in correlation coefficients and in regression lines. Ordinary least squares (OLS) regression is the standard method for fitting lines to data, but its criterion for optimal fit…

Descriptors: Least Squares Statistics, Regression (Statistics), Differences, Validity

Can Measuring Psychosocial Factors Promote College Success?

Peer reviewed

Allen, Jeff; Robbins, Steven B.; Sawyer, Richard – Applied Measurement in Education, 2010

Research on the validity of psychosocial factors (PSFs) and other noncognitive predictors of college outcomes has largely ignored the practical benefits implied by the validity. We summarize evidence of the validity of PSF measures as predictors of college outcomes and then explain how this validity directly translates into improved identification…

Descriptors: Institutional Research, Academic Persistence, Validity, At Risk Students

On the Impact of Formative Assessment on Student Motivation, Achievement, and Conceptual Change

Peer reviewed

Yin, Yue; Shavelson, Richard J.; Ayala, Carlos C.; Ruiz-Primo, Maria Araceli; Brandon, Paul R.; Furtak, Erin Marie; Tomita, Miki K.; Young, Donald B. – Applied Measurement in Education, 2008

Formative assessment was hypothesized to have a beneficial impact on students' science achievement and conceptual change, either directly or indirectly by enhancing motivation. We designed and embedded formative assessments within an inquiry science unit. Twelve middle-school science teachers with their students were randomly assigned either to…

Descriptors: Classroom Techniques, Experimental Groups, Control Groups, Formative Evaluation
