Showing 31 to 45 of 452 results
Peer reviewed
Sinharay, Sandip – Journal of Educational Measurement, 2018
The value-added method of Haberman is arguably one of the most popular methods to evaluate the quality of subscores. The method is based on classical test theory and deems a subscore to be of added value if the subscore predicts the corresponding true subscore better than does the total score. Sinharay provided an interpretation of the added…
Descriptors: Scores, Value Added Models, Raw Scores, Item Response Theory
Peer reviewed
Stoner, James C. – Journal of College and University Student Housing, 2019
Hiring the most capable students to serve in the RA role should be a top priority for housing departments due to their critical role as front-line student success employees. The effort to identify and hire RAs typically includes a substantial investment of personnel resources in hiring processes, where candidates are typically evaluated across…
Descriptors: Resident Advisers, College Housing, Personnel Selection, Job Performance
Peer reviewed
Brann, Kristy L.; Boone, William J.; Splett, Joni W.; Clemons, Courtney; Bidwell, Sarah L. – Journal of Psychoeducational Assessment, 2021
Given the important role that teachers play in supporting student mental health, it is critical teachers feel confident in their ability to fill such roles. To inform strategies intended to improve teacher confidence in supporting student mental health, a psychometrically sound tool assessing teacher school mental health self-efficacy is needed.…
Descriptors: Teacher Surveys, Test Construction, Psychometrics, Mental Health
Wang, Shichao; Li, Dongmei; Steedle, Jeffrey – ACT, Inc., 2021
Speeded tests set time limits so that few examinees can reach all items, and power tests allow most test-takers sufficient time to attempt all items. Educational achievement tests are sometimes described as "timed power tests" because the amount of time provided is intended to allow nearly all students to complete the test, yet this…
Descriptors: Timed Tests, Test Items, Achievement Tests, Testing
Peer reviewed
Srisopha, Kwanklao – Education Quarterly Reviews, 2022
The objectives of this research were 1) to study the relationship between student factors, English language instructor factors, and environment factors with students' achievement in English language learning and 2) to develop equations to forecast factors affecting students' achievement in English language learning. The population used in this…
Descriptors: Second Language Learning, Second Language Instruction, Foreign Countries, Correlation
Peer reviewed
Wesolowski, Brian C. – Journal of Educational Measurement, 2019
The purpose of this study was to build a Random Forest supervised machine learning model in order to predict musical rater-type classifications based upon a Rasch analysis of raters' differential severity/leniency related to item use. Raw scores (N = 1,704) from 142 raters across nine high school solo and ensemble festivals (grades 9-12) were…
Descriptors: Item Response Theory, Prediction, Classification, Artificial Intelligence
Peer reviewed
Albano, Anthony D.; Christ, Theodore J.; Cai, Liuhan – Measurement: Interdisciplinary Research and Perspectives, 2018
Traditional psychometric methods have primarily been developed and applied in the context of high-stakes, large-scale testing. However, these methods are increasingly being used with classroom assessments, including progress monitoring measures where numerous test forms are administered over the course of an academic year. This article provides an…
Descriptors: Progress Monitoring, Hierarchical Linear Modeling, Equated Scores, Raw Scores
Peer reviewed
Fu, Jianbin; Feng, Yuling – ETS Research Report Series, 2018
In this study, we propose aggregating test scores with unidimensional within-test structure and multidimensional across-test structure based on a 2-level, 1-factor model. In particular, we compare 6 score aggregation methods: average of standardized test raw scores (M1), regression factor score estimate of the 1-factor model based on the…
Descriptors: Comparative Analysis, Scores, Correlation, Standardized Tests
Peer reviewed
Wei, Youhua; Morgan, Rick – ETS Research Report Series, 2016
As an alternative to common-item equating when common items do not function as expected, the single-group growth model (SGGM) scaling uses common examinees or repeaters to link test scores on different forms. The SGGM scaling assumes that, for repeaters taking adjacent administrations, the conditional distribution of scale scores in later…
Descriptors: Equated Scores, Growth Models, Scaling, Computation
Peer reviewed
Fujimoto, Ken A.; Gordon, Rachel A.; Peng, Fang; Hofer, Kerry G. – AERA Open, 2018
Classroom quality measures, such as the Early Childhood Environment Rating Scale, Revised (ECERS-R), are widely used in research, practice, and policy. Increasingly, these uses have been for purposes not originally intended, such as contributing to consequential policy decisions. The current study adds to the recent evidence of problems with the…
Descriptors: Rating Scales, Early Childhood Education, Educational Quality, Preschool Curriculum
Peer reviewed
Ssemakula, Mukasa E.; Liao, Gene Y.; Sawilowsky, Shlomo – American Journal of Engineering Education, 2018
There is a major trend in engineering education to provide students with realistic hands-on learning experiences. This paper reports on the results of work done to develop standardized test instruments to use for student learning outcomes assessment in an experiential hands-on manufacturing engineering and technology environment. The specific…
Descriptors: Test Construction, Psychometrics, Test Validity, Standardized Tests
Fujimoto, Ken A.; Gordon, Rachel A.; Peng, Fang; Hofer, Kerry G. – Grantee Submission, 2018
Classroom quality measures, such as the Early Childhood Environment Rating Scale, Revised (ECERS-R), are widely used in research, practice, and policy. Increasingly, these uses have been for purposes not originally intended, such as contributing to consequential policy decisions. The current study adds to recent evidence of problems with the…
Descriptors: Rating Scales, Educational Quality, Early Childhood Education, Preschool Curriculum
Peer reviewed
Hessel, Annina K.; Schroeder, Sascha – Discourse Processes: A Multidisciplinary Journal, 2020
This experiment investigated interactions between lower- and higher-level processing when reading in a second language (L2). We conducted an eye-tracking experiment with a within-subject manipulation of inconsistency (to tap higher-level coherence-building) crossed with a within-subject manipulation of word-processing difficulty (to alter the ease…
Descriptors: Second Language Learning, Second Language Instruction, Reading Processes, Eye Movements
Peer reviewed
Terao, Takahiro; Ishii, Hidetoki – SAGE Open, 2020
This study aimed to compare selection patterns of distractors (incorrect options) according to test taker proficiency regarding Japanese students' summarization skills of an English paragraph. Participants included 414 undergraduate students, and the test comprised three summarization process types--deletion, generalization, and integration.…
Descriptors: Comparative Analysis, English (Second Language), Second Language Instruction, Second Language Learning
Peer reviewed
Kim, Sooyeon; Livingston, Samuel A. – ETS Research Report Series, 2017
The purpose of this simulation study was to assess the accuracy of a classical test theory (CTT)-based procedure for estimating the alternate-forms reliability of scores on a multistage test (MST) having 3 stages. We generated item difficulty and discrimination parameters for 10 parallel, nonoverlapping forms of the complete 3-stage test and…
Descriptors: Accuracy, Test Theory, Test Reliability, Adaptive Testing