Sao Pedro, Michael A.; Baker, Ryan S. J. d.; Gobert, Janice D. – Grantee Submission, 2013
When validating assessment models built with data mining, generalization is typically tested at the student level, where models are tested on new students. This approach, though, may fail to find cases where model performance suffers if other aspects of those cases relevant to prediction are not well represented. We explore this here by testing if…
Descriptors: Educational Research, Data Collection, Data Analysis, Generalizability Theory
Raymond, Mark R.; Harik, Polina; Clauser, Brian E. – Applied Psychological Measurement, 2011
Prior research indicates that the overall reliability of performance ratings can be improved by using ordinary least squares (OLS) regression to adjust for rater effects. The present investigation extends previous work by evaluating the impact of OLS adjustment on standard errors of measurement (SEM) at specific score levels. In…
Descriptors: Performance Based Assessment, Licensing Examinations (Professions), Least Squares Statistics, Item Response Theory
Harik, Polina; Clauser, Brian E.; Grabovsky, Irina; Nungester, Ronald J.; Swanson, Dave; Nandakumar, Ratna – Journal of Educational Measurement, 2009
The present study examined the long-term usefulness of estimated parameters used to adjust the scores from a performance assessment to account for differences in rater stringency. Ratings from four components of the USMLE® Step 2 Clinical Skills Examination data were analyzed. A generalizability-theory framework was used to examine the extent to…
Descriptors: Generalizability Theory, Performance Based Assessment, Performance Tests, Clinical Experience
Way, Walter D.; Murphy, Daniel; Powers, Sonya; Keng, Leslie – Pearson, 2012
Significant momentum exists for next-generation assessments to increasingly utilize technology to develop and deliver performance-based assessments. Many traditional challenges with this assessment approach still apply, including psychometric concerns related to performance-based tasks (PBTs), which include low reliability, efficiency of…
Descriptors: Task Analysis, Performance Based Assessment, Technology Uses in Education, Models

Shavelson, Richard J.; Solano-Flores, Guillermo; Ruiz-Primo, Maria Araceli – Evaluation and Program Planning, 1998
Research on developing technology for large-scale performance assessments in science is reported briefly, and a conceptual framework is presented for defining, generating, and evaluating science performance assessments. Types of tasks are discussed, and the technical qualities of performance assessments are discussed in the context of…
Descriptors: Educational Technology, Generalizability Theory, Models, Performance Based Assessment

Abedi, Jamal; Baker, Eva L. – Educational and Psychological Measurement, 1995
Results from a performance assessment in which 68 high school students wrote essays support the use of latent variable modeling for estimating reliability, concurrent validity, and generalizability of a scoring rubric. The latent variable modeling approach overcomes the limitations of certain conventional statistical techniques in handling…
Descriptors: Criteria, Essays, Estimation (Mathematics), Generalizability Theory