ERIC - Search Results

Showing all 10 results

Sample Size and Item Parameter Estimation Precision When Utilizing the Masters' Partial Credit Model

Custer, Michael; Kim, Jongpil – Online Submission, 2023

This study utilizes an analysis of diminishing returns to examine the relationship between sample size and item parameter estimation precision when utilizing the Masters' Partial Credit Model for polytomous items. Item data from the standardization of the Battelle Developmental Inventory, 3rd Edition were used. Each item was scored with a…

Descriptors: Sample Size, Item Response Theory, Test Items, Computation

A Ranking Method for Evaluating Constructed Responses

Peer reviewed

Attali, Yigal – Educational and Psychological Measurement, 2014

This article presents a comparative judgment approach for holistically scored constructed response tasks. In this approach, the grader rank-orders (rather than rates) the quality of a small set of responses. A prior automated evaluation of responses guides both set formation and scaling of rankings. Sets are formed to have similar prior scores and…

Descriptors: Responses, Item Response Theory, Scores, Rating Scales

On an Approach to Testing and Modeling Competence

Peer reviewed

Shavelson, Richard J. – Educational Psychologist, 2013

E. L. Thorndike contributed significantly to the field of educational and psychological testing as well as more broadly to psychological studies in education. This article follows in his testing legacy. I address the escalating demand, across societal sectors, to measure individual and group competencies. In formulating an approach to measuring…

Descriptors: Competence, Psychology, Psychological Testing, Psychological Studies

A Comparison of the Approaches of Generalizability Theory and Item Response Theory in Estimating the Reliability of Test Scores for Testlet-Composed Tests

Peer reviewed

Lee, Guemin; Park, In-Yong – Asia Pacific Education Review, 2012

Previous assessments of the reliability of test scores for testlet-composed tests have indicated that item-based estimation methods overestimate reliability. This study was designed to address issues related to the extent to which item-based estimation methods overestimate the reliability of test scores composed of testlets and to compare several…

Descriptors: Generalizability Theory, Simulation, Computation, Item Response Theory

Using Multilevel Modeling in Language Assessment Research: A Conceptual Introduction

Peer reviewed

Barkaoui, Khaled – Language Assessment Quarterly, 2013

This article critiques traditional single-level statistical approaches (e.g., multiple regression analysis) to examining relationships between language test scores and variables in the assessment setting. It highlights the conceptual, methodological, and statistical problems associated with these techniques in dealing with multilevel or nested…

Descriptors: Hierarchical Linear Modeling, Statistical Analysis, Multiple Regression Analysis, Generalizability Theory

Quantitative Differences in Retest Effects across Different Methods Used to Construct Alternate Test Forms

Peer reviewed

Arendasy, Martin E.; Sommer, Markus – Intelligence, 2013

Allowing respondents to retake a cognitive ability test has been shown to increase their test scores. Several theoretical models have been proposed to explain this effect, which make distinct assumptions regarding the measurement invariance of psychometric tests across test administration sessions with regard to narrower cognitive abilities and general…

Descriptors: Cognitive Tests, Testing, Repetition, Scores

Measurement of Classroom Teaching Quality with Item Response Theory

Peer reviewed

Kelcey, Ben; McGinn, Daniel; Hill, Heather – Society for Research on Educational Effectiveness, 2013

Recent policy has charged schools and districts with maintaining highly qualified teachers and differentiating among teachers in terms of their effectiveness (U.S. Department of Education, 2009). This emphasis has driven the development and implementation of teacher quality measures which are increasingly being used to evaluate teachers with…

Descriptors: Teacher Effectiveness, Measures (Individuals), Observation, Teacher Evaluation

Select Psychometric Properties and Predictive Validity of Scores on the SAT Writing Section

Proctor, Thomas P.; Kim, YoungKoung Rachel – College Board, 2009

Presented at the national conference for the American Educational Research Association (AERA) in April 2009. This study examined the utility of scores on the SAT writing test, specifically examining the reliability of scores using generalizability and item response theories. The study also provides an overview of current predictive validity…

Descriptors: College Entrance Examinations, Writing Tests, Psychometrics, Predictive Validity

Generalizability Theory and Many-Facet Rasch Measurement.

Linacre, John M. – 1993

Generalizability theory (G-theory) and many-facet Rasch measurement (Rasch) manage the variability inherent when raters rate examinees on test items. The purpose of G-theory is to estimate test reliability in a raw score metric. Unadjusted examinee raw scores are reported as measures. A variance component is estimated for the examinee…

Descriptors: Comparative Analysis, Equations (Mathematics), Estimation (Mathematics), Evaluators

Score Reliability of a Test Composed of Passage-Based Testlets: A Generalizability Theory Perspective.

Lee, Yong-Won – 2002

The purpose of this study was to investigate the impact of local item dependence (LID) in passage-based testlets on the test score reliability of an English as a Foreign Language (EFL) reading comprehension test from the perspective of generalizability (G) theory. Definitions and causes of LID in passage-based testlets are reviewed within the…

Descriptors: English (Second Language), Foreign Countries, Generalizability Theory, High School Students