Burton, A. Mike; Kramer, Robin S. S.; Ritchie, Kay L.; Jenkins, Rob – Cognitive Science, 2016
Research in face recognition has tended to focus on discriminating between individuals, or "telling people apart." It has recently become clear that it is also necessary to understand how images of the same person can vary, or "telling people together." Learning a new face, and tracking its representation as it changes from…
Descriptors: Recognition (Psychology), Human Body, Individual Differences, Familiarity
Fosnacht, Kevin; Gonyea, Robert M. – Research & Practice in Assessment, 2018
This study utilized generalizability theory to assess the context where the National Survey of Student Engagement's (NSSE) summary measures, the Engagement Indicators, produce dependable group-level means. The dependability of NSSE group means is an important topic for the higher education assessment community given its wide utilization and usage…
Descriptors: College Freshmen, College Seniors, Learner Engagement, National Surveys
Menéndez-Varela, José-Luis; Gregori-Giralt, Eva – Assessment & Evaluation in Higher Education, 2018
Rubrics are widely used in higher education to assess performance in project-based learning environments. To date, the sources of error that may affect their reliability have not been studied in depth. Using generalisability theory as its starting-point, this article analyses the influence of the assessors and the criteria of the rubrics on the…
Descriptors: Scoring Rubrics, Student Projects, Active Learning, Reliability
Wilson, Joshua; Chen, Dandan; Sandbank, Micheal P.; Hebert, Michael – Journal of Educational Psychology, 2019
The present study examined issues pertaining to the reliability of writing assessment in the elementary grades, and among samples of struggling and nonstruggling writers. The present study also extended nascent research on the reliability and the practical applications of automated essay scoring (AES) systems in Response to Intervention frameworks…
Descriptors: Computer Assisted Testing, Automation, Scores, Writing Tests
Johnson, Austin H.; Chafouleas, Sandra M.; Briesch, Amy M. – School Psychology Quarterly, 2017
In this study, generalizability theory was used to examine the extent to which (a) time-sampling methodology, (b) number of simultaneous behavior targets, and (c) individual raters influenced variance in ratings of academic engagement for an elementary-aged student. Ten graduate-student raters, with an average of 7.20 hr of previous training in…
Descriptors: Generalizability Theory, Sampling, Elementary School Students, Learner Engagement
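Several of the entries above apply generalizability theory, which partitions observed-score variance into components (e.g., persons, raters, their interaction) and summarizes rating dependability with a generalizability coefficient. As a rough single-facet illustration only (not the analysis of any cited study; the function name and data are invented for this sketch), a crossed persons-by-raters G-study can be computed from the two-way ANOVA mean squares:

```python
def g_study(scores):
    """Estimate variance components for a fully crossed persons-by-raters
    design (one score per cell) from two-way ANOVA mean squares, then
    return the generalizability (G) coefficient for relative decisions.

    scores: list of rows, one row per person, one column per rater.
    """
    n_p, n_r = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_p * n_r)
    p_means = [sum(row) / n_r for row in scores]
    r_means = [sum(scores[i][j] for i in range(n_p)) / n_p for j in range(n_r)]

    # Mean squares for persons, raters, and the residual (interaction).
    ms_p = n_r * sum((m - grand) ** 2 for m in p_means) / (n_p - 1)
    ms_r = n_p * sum((m - grand) ** 2 for m in r_means) / (n_r - 1)
    ss_res = sum(
        (scores[i][j] - p_means[i] - r_means[j] + grand) ** 2
        for i in range(n_p) for j in range(n_r)
    )
    ms_pr = ss_res / ((n_p - 1) * (n_r - 1))

    # Solve the expected mean squares for the variance components.
    var_pr = ms_pr                            # interaction confounded with error
    var_p = max((ms_p - ms_pr) / n_r, 0.0)    # universe-score (person) variance
    var_r = max((ms_r - ms_pr) / n_p, 0.0)    # rater main effect

    # Relative-decision G coefficient for a mean over n_r raters.
    g_coef = var_p / (var_p + var_pr / n_r)
    return {"person": var_p, "rater": var_r, "residual": var_pr, "g": g_coef}
```

A high person component relative to the residual yields a G coefficient near 1, meaning the rater facet contributes little error to relative decisions.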
Irby, Sarah M.; Floyd, Randy G. – Psychology in the Schools, 2017
This study examined the exchangeability of total scores (i.e., intelligence quotients [IQs]) from three brief intelligence tests. Tests were administered to 36 children with intellectual giftedness, scored live by one set of primary examiners and later scored by a secondary examiner. For each student, six IQs were calculated, and all 216 values…
Descriptors: Intelligence Tests, Gifted, Error of Measurement, Scores
Keller, Lena; Preckel, Franzis; Brunner, Martin – Journal of Educational Psychology, 2021
It is well-documented that academic achievement is associated with students' self-perceptions of their academic abilities, that is, their academic self-concepts. However, low-achieving students may apply self-protective strategies to maintain a favorable academic self-concept when evaluating their academic abilities. Consequently, the relation…
Descriptors: Correlation, Academic Achievement, High Achievement, Low Achievement
Zaidi, Nikki L.; Swoboda, Christopher M.; Kelcey, Benjamin M.; Manuel, R. Stephen – Advances in Health Sciences Education, 2017
The extant literature has largely ignored a potentially significant source of variance in multiple mini-interview (MMI) scores by "hiding" the variance attributable to the sample of attributes used on an evaluation form. This potential source of hidden variance can be defined as rating items, which typically comprise an MMI evaluation…
Descriptors: Interviews, Scores, Generalizability Theory, Monte Carlo Methods
Byram, Jessica N.; Seifert, Mark F.; Brooks, William S.; Fraser-Cotlin, Laura; Thorp, Laura E.; Williams, James M.; Wilson, Adam B. – Anatomical Sciences Education, 2017
With integrated curricula and multidisciplinary assessments becoming more prevalent in medical education, there is a continued need for educational research to explore the advantages, consequences, and challenges of integration practices. This retrospective analysis investigated the number of items needed to reliably assess anatomical knowledge in…
Descriptors: Anatomy, Science Tests, Test Items, Test Reliability
Uzun, N. Bilge; Aktas, Mehtap; Asiret, Semih; Yormaz, Seha – Asian Journal of Education and Training, 2018
The goal of this study is to determine the reliability of the performance points of dentistry students regarding communication skills and to examine the scoring reliability by generalizability theory in balanced random and fixed facet (mixed design) data, considering also the interactions of student, rater and duty. The study group of the research…
Descriptors: Foreign Countries, Generalizability Theory, Scores, Test Reliability
Matcha, Wannisa; Gasevic, Dragan; Uzir, Nora'ayu Ahmad; Jovanovic, Jelena; Pardo, Abelardo; Lim, Lisa; Maldonado-Mahauad, Jorge; Gentili, Sheridan; Perez-Sanagustin, Mar; Tsai, Yi-Shan – Journal of Learning Analytics, 2020
Generalizability of the value of methods based on learning analytics remains one of the big challenges in the field of learning analytics. One approach to testing generalizability of a method is to apply it consistently in different learning contexts. This study extends a previously published work by examining the generalizability of a learning…
Descriptors: Learning Analytics, Learning Strategies, Instructional Design, Delivery Systems
Martínez, José Felipe; Kloser, Matt; Srinivasan, Jayashri; Stecher, Brian; Edelman, Amanda – Educational and Psychological Measurement, 2022
Adoption of new instructional standards in science demands high-quality information about classroom practice. Teacher portfolios can be used to assess instructional practice and support teacher self-reflection anchored in authentic evidence from classrooms. This study investigated a new type of electronic portfolio tool that allows efficient…
Descriptors: Science Instruction, Academic Standards, Instructional Innovation, Electronic Publishing
Harrison, George M. – Journal of Educational Measurement, 2015
The credibility of standard-setting cut scores depends in part on two sources of consistency evidence: intrajudge and interjudge consistency. Although intrajudge consistency feedback has often been provided to Angoff judges in practice, more evidence is needed to determine whether it achieves its intended effect. In this randomized experiment with…
Descriptors: Interrater Reliability, Standard Setting (Scoring), Cutting Scores, Feedback (Response)
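For readers unfamiliar with the Angoff procedure referenced above: each judge estimates, per item, the probability that a minimally competent examinee answers correctly, and the cut score is typically the mean across judges of each judge's summed item estimates. A minimal sketch (function name and data are illustrative assumptions, not drawn from the cited experiment):

```python
def angoff_cut_score(ratings):
    """Compute a simple Angoff cut score.

    ratings[j][i]: judge j's estimated probability that a minimally
    competent examinee answers item i correctly. Each judge's summed
    probabilities form a recommended raw cut score; the panel cut score
    is the mean of those per-judge sums.
    """
    per_judge = [sum(judge_ratings) for judge_ratings in ratings]
    return sum(per_judge) / len(per_judge)
```

Intrajudge consistency feedback, as studied by Harrison, concerns how well each judge's item-level estimates align with empirical item difficulties; interjudge consistency concerns the spread of the per-judge sums above.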
Volpe, Robert J.; Briesch, Amy M. – School Psychology Review, 2016
This study examines the dependability of two scaling approaches for using a five-item Direct Behavior Rating multi-item scale to assess student disruptive behavior. A series of generalizability theory studies were used to compare a traditional frequency-based scaling approach with an approach wherein the informant compares a target student's…
Descriptors: Scaling, Behavior Rating Scales, Behavior Problems, Student Behavior
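Dependability studies like this one typically follow the G-study with a D-study: holding the estimated variance components fixed and projecting the coefficient as the number of raters (or observations) averaged over changes. A minimal single-facet sketch, with invented inputs rather than the study's actual estimates:

```python
def d_study(var_person, var_residual, max_raters=10):
    """Project the relative-decision G coefficient for 1..max_raters raters.

    var_person:   estimated universe-score (person) variance component.
    var_residual: estimated person-by-rater residual variance component.
    Averaging over n raters divides the residual error by n.
    """
    return {
        n: var_person / (var_person + var_residual / n)
        for n in range(1, max_raters + 1)
    }
```

Such projections answer the practical question these dependability papers pose: how many raters or occasions are needed before group- or student-level scores reach an acceptable coefficient (often 0.70 or 0.80).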
Clauser, Jerome C.; Margolis, Melissa J.; Clauser, Brian E. – Journal of Educational Measurement, 2014
Evidence of stable standard setting results over panels or occasions is an important part of the validity argument for an established cut score. Unfortunately, due to the high cost of convening multiple panels of content experts, standards often are based on the recommendation from a single panel of judges. This approach implicitly assumes that…
Descriptors: Standard Setting (Scoring), Generalizability Theory, Replication (Evaluation), Cutting Scores