Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 2 |
Since 2006 (last 20 years) | 3 |
Descriptor
Error of Measurement | 3 |
Evaluators | 3 |
Test Items | 3 |
Interrater Reliability | 2 |
Bias | 1 |
Cutting Scores | 1 |
Elementary School Students | 1 |
English (Second Language) | 1 |
Factor Analysis | 1 |
Foreign Countries | 1 |
Goodness of Fit | 1 |
Author
Cox, Kyle | 1 |
Ito, Kyoko | 1 |
Kelcey, Ben | 1 |
Sykes, Robert C. | 1 |
Wang, Shanshan | 1 |
Wang, Zhen | 1 |
Wudthayagorn, Jirada | 1 |
Publication Type
Journal Articles | 2 |
Reports - Research | 2 |
Reports - Descriptive | 1 |
Education Level
Elementary Education | 1 |
Higher Education | 1 |
Postsecondary Education | 1 |
Assessments and Surveys
Test of English as a Foreign… | 1 |
Test of English for… | 1 |
Wudthayagorn, Jirada – LEARN Journal: Language Education and Acquisition Research Network, 2018
The purpose of this study was to map the Chulalongkorn University Test of English Proficiency (CU-TEP) to the Common European Framework of Reference (CEFR) by employing a standard-setting methodology. Thirteen experts judged 120 items of the CU-TEP using the Yes/No Angoff technique. The experts decided whether a borderline student at…
Descriptors: Guidelines, Rating Scales, English (Second Language), Language Tests
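The Yes/No Angoff procedure named in the abstract above can be sketched in a few lines. In this minimal, hypothetical illustration (the judgments, panel size, and item count are invented, not taken from the CU-TEP study), each expert marks Yes or No for every item depending on whether a borderline examinee would answer it correctly; an expert's implied cut score is the count of Yes judgments, and the panel cut score is the mean across experts.

```python
# Hypothetical sketch of the Yes/No Angoff technique; all data invented.

def yes_no_angoff_cut_score(judgments):
    """Compute a raw panel cut score from Yes/No Angoff judgments.

    `judgments` maps each expert to a list of booleans, one per item:
    True means the expert judged that a borderline examinee would answer
    that item correctly. Each expert's implied cut score is the number
    of "yes" items; the panel cut score is the mean of those counts.
    """
    per_expert = [sum(items) for items in judgments.values()]
    return sum(per_expert) / len(per_expert)

# Three hypothetical experts judging a five-item test.
panel = {
    "expert_a": [True, True, False, True, False],  # 3 yes judgments
    "expert_b": [True, False, False, True, True],  # 3 yes judgments
    "expert_c": [True, True, True, False, False],  # 3 yes judgments
}
print(yes_no_angoff_cut_score(panel))  # 3.0
```

In practice the raw panel cut score would then be mapped onto the reporting scale and aligned with a CEFR level, steps this sketch does not attempt.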
Kelcey, Ben; Wang, Shanshan; Cox, Kyle – Society for Research on Educational Effectiveness, 2016
Valid and reliable measurement of unobserved latent variables is essential to understanding and improving education. A common and persistent approach to assessing latent constructs in education is the use of rater inferential judgment. The purpose of this study is to develop high-dimensional explanatory random item effects models designed for…
Descriptors: Test Items, Models, Evaluators, Longitudinal Studies
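The random item effects framing in the abstract above can be illustrated with a toy generative model: an observed rating decomposes into the examinee's latent trait plus random item and rater (evaluator) effects and residual noise. This is only a minimal simulation sketch; the sample sizes and variance components are invented and do not reflect the models developed in the paper.

```python
import numpy as np

# Toy generative sketch of a crossed random item/rater effects model.
# All dimensions and variance components below are invented.
rng = np.random.default_rng(0)
n_persons, n_items, n_raters = 50, 10, 5

theta = rng.normal(0.0, 1.0, n_persons)     # latent person abilities
item_eff = rng.normal(0.0, 0.5, n_items)    # random item easiness
rater_eff = rng.normal(0.0, 0.3, n_raters)  # random rater severity

# Fully crossed ratings: person p on item i scored by rater r.
ratings = (
    theta[:, None, None]
    + item_eff[None, :, None]
    + rater_eff[None, None, :]
    + rng.normal(0.0, 0.2, (n_persons, n_items, n_raters))
)
print(ratings.shape)  # (50, 10, 5)
```

An explanatory model would go on to regress these random effects on item and rater covariates and estimate the variance components, which this simulation-only sketch leaves out.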
Sykes, Robert C.; Ito, Kyoko; Wang, Zhen – Educational Measurement: Issues and Practice, 2008
Student responses to a large number of constructed-response items in three Math and three Reading tests were scored on two occasions using three ways of assigning raters: single-reader scoring; a different reader for each response (item-specific); and three readers each scoring a rater item block (RIB) containing approximately one-third of a…
Descriptors: Test Items, Mathematics Tests, Reading Tests, Scoring
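The third rater-assignment design named in the abstract above, and the kind of two-occasion comparison it supports, can be sketched as follows. The item counts, reader labels, and scores here are hypothetical and do not come from the study; the sketch just partitions items into contiguous rater item blocks (RIBs) of roughly one-third each and computes exact score agreement between two occasions.

```python
# Hypothetical sketch of RIB assignment and exact agreement; data invented.

def rib_assignment(n_items, readers):
    """Give each reader a contiguous block of ~n_items/len(readers) items."""
    block = -(-n_items // len(readers))  # ceiling division
    return {
        reader: list(range(k * block, min((k + 1) * block, n_items)))
        for k, reader in enumerate(readers)
    }

def exact_agreement(scores_a, scores_b):
    """Proportion of responses receiving identical scores on two occasions."""
    matches = sum(a == b for a, b in zip(scores_a, scores_b))
    return matches / len(scores_a)

print(rib_assignment(9, ["reader1", "reader2", "reader3"]))
# {'reader1': [0, 1, 2], 'reader2': [3, 4, 5], 'reader3': [6, 7, 8]}
print(exact_agreement([2, 3, 1, 0], [2, 3, 2, 0]))  # 0.75
```

A full reliability analysis would compare such agreement rates across the three assignment designs; the sketch shows only the bookkeeping.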