Showing all 4 results
Peer reviewed
Zechner, Klaus; Chen, Lei; Davis, Larry; Evanini, Keelan; Lee, Chong Min; Leong, Chee Wee; Wang, Xinhao; Yoon, Su-Youn – ETS Research Report Series, 2015
This research report presents a summary of research and development efforts devoted to creating scoring models for automatically scoring spoken item responses of a pilot administration of the Test of English-for-Teaching ("TEFT"™) within the "ELTeach"™ framework. The test consists of items for all four language modalities:…
Descriptors: Scoring, Scoring Formulas, Speech Communication, Task Analysis
Peer reviewed
Taskinen, Päivi H.; Steimel, Jochen; Gräfe, Linda; Engell, Sebastian; Frey, Andreas – Peabody Journal of Education, 2015
This study examined students' competencies in engineering education at the university level. First, we developed a competency model in one specific field of engineering: process dynamics and control. Then, the theoretical model was used as a frame to construct test items to measure students' competencies comprehensively. In the empirical…
Descriptors: Models, Engineering Education, Test Items, Outcome Measures
Peer reviewed
Burton, Richard F. – Assessment & Evaluation in Higher Education, 2004
The standard error of measurement usefully provides confidence limits for scores in a given test, but is it possible to quantify the reliability of a test with just a single number that allows comparison of tests of different format? Reliability coefficients do not do this, being dependent on the spread of examinee attainment. Better in this…
Descriptors: Multiple Choice Tests, Error of Measurement, Test Reliability, Test Items
Smith, Richard M. – 1982
There have been many attempts to formulate a procedure for extracting information from incorrect responses to multiple choice items, i.e., the assessment of partial knowledge. The results of these attempts can be described as inconsistent at best. It is hypothesized that these inconsistencies arise from three methodological problems: the…
Descriptors: Difficulty Level, Evaluation Methods, Goodness of Fit, Guessing (Tests)