ERIC - Search Results

Showing all 6 results

What Different Kinds of Stratification Can Reveal about the Generalizability of Data-Mined Skill Assessment Models

Peer reviewed

Sao Pedro, Michael A.; Baker, Ryan S. J. d.; Gobert, Janice D. – Grantee Submission, 2013

When validating assessment models built with data mining, generalization is typically tested at the student-level, where models are tested on new students. This approach, though, may fail to find cases where model performance suffers if other aspects of those cases relevant to prediction are not well represented. We explore this here by testing if…

Descriptors: Educational Research, Data Collection, Data Analysis, Generalizability Theory

The Impact of Statistically Adjusting for Rater Effects on Conditional Standard Errors of Performance Ratings

Peer reviewed

Raymond, Mark R.; Harik, Polina; Clauser, Brian E. – Applied Psychological Measurement, 2011

Prior research indicates that the overall reliability of performance ratings can be improved by using ordinary least squares (OLS) regression to adjust for rater effects. The present investigation extends previous work by evaluating the impact of OLS adjustment on standard errors of measurement (SEM) at specific score levels. In…

Descriptors: Performance Based Assessment, Licensing Examinations (Professions), Least Squares Statistics, Item Response Theory

An Examination of Rater Drift within a Generalizability Theory Framework

Peer reviewed

Harik, Polina; Clauser, Brian E.; Grabovsky, Irina; Nungester, Ronald J.; Swanson, Dave; Nandakumar, Ratna – Journal of Educational Measurement, 2009

The present study examined the long-term usefulness of estimated parameters used to adjust the scores from a performance assessment to account for differences in rater stringency. Ratings from four components of the USMLE® Step 2 Clinical Skills Examination data were analyzed. A generalizability-theory framework was used to examine the extent to…

Descriptors: Generalizability Theory, Performance Based Assessment, Performance Tests, Clinical Experience

The Case for Performance-Based Tasks without Equating

Way, Walter D.; Murphy, Daniel; Powers, Sonya; Keng, Leslie – Pearson, 2012

Significant momentum exists for next-generation assessments to increasingly utilize technology to develop and deliver performance-based assessments. Many traditional challenges with this assessment approach still apply, including psychometric concerns related to performance-based tasks (PBTs), which include low reliability, efficiency of…

Descriptors: Task Analysis, Performance Based Assessment, Technology Uses in Education, Models

Toward a Science Performance Assessment Technology

Peer reviewed

Shavelson, Richard J.; Solano-Flores, Guillermo; Ruiz-Primo, Maria Araceli – Evaluation and Program Planning, 1998

Research on developing technology for large-scale performance assessments in science is reported briefly, and a conceptual framework is presented for defining, generating, and evaluating science performance assessments. Types of tasks are discussed, and the technical qualities of performance assessments are discussed in the context of…

Descriptors: Educational Technology, Generalizability Theory, Models, Performance Based Assessment

A Latent-Variable Modeling Approach to Assessing Interrater Reliability, Topic Generalizability, and Validity of a Content Assessment Scoring Rubric

Peer reviewed

Abedi, Jamal; Baker, Eva L. – Educational and Psychological Measurement, 1995

Results from a performance assessment in which 68 high school students wrote essays support the use of latent variable modeling for estimating reliability, concurrent validity, and generalizability of a scoring rubric. The latent variable modeling approach overcomes the limitations of certain conventional statistical techniques in handling…

Descriptors: Criteria, Essays, Estimation (Mathematics), Generalizability Theory
