Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 2 |
Since 2006 (last 20 years) | 11 |
Descriptor
Models | 15 |
Scoring Formulas | 15 |
Evaluation Methods | 5 |
Scores | 4 |
Foreign Countries | 3 |
Item Response Theory | 3 |
Scoring | 3 |
Statistical Analysis | 3 |
Test Items | 3 |
Test Reliability | 3 |
Test Scoring Machines | 3 |
Author
Attali, Yigal | 1 |
Bridgeman, Brent | 1 |
Burton, Richard F. | 1 |
Chen, Lei | 1 |
Cohen, Allan | 1 |
Davey, Tim | 1 |
Davis, Larry | 1 |
Engell, Sebastian | 1 |
Evanini, Keelan | 1 |
Frey, Andreas | 1 |
Gottfredson, Linda S. | 1 |
Publication Type
Journal Articles | 15 |
Reports - Research | 8 |
Reports - Evaluative | 4 |
Reports - Descriptive | 2 |
Opinion Papers | 1 |
Education Level
Higher Education | 5 |
Postsecondary Education | 3 |
Adult Education | 1 |
Grade 11 | 1 |
Grade 7 | 1 |
High Schools | 1 |
Secondary Education | 1 |
Location
Denmark | 1 |
Germany | 1 |
Thailand | 1 |
United Kingdom (England) | 1 |
Assessments and Surveys
General Aptitude Test Battery | 1 |
Graduate Record Examinations | 1 |
Test of English as a Foreign… | 1 |
Raczynski, Kevin; Cohen, Allan – Applied Measurement in Education, 2018
The literature on Automated Essay Scoring (AES) systems has provided useful validation frameworks for any assessment that includes AES scoring. Furthermore, evidence for the scoring fidelity of AES systems is accumulating. Yet questions remain when appraising the scoring performance of AES systems. These questions include: (a) which essays are…
Descriptors: Essay Tests, Test Scoring Machines, Test Validity, Evaluators
Wise, Steven L.; Kingsbury, G. Gage – Journal of Educational Measurement, 2016
This study examined the utility of response time-based analyses in understanding the behavior of unmotivated test takers. For the data from an adaptive achievement test, patterns of observed rapid-guessing behavior and item response accuracy were compared to the behavior expected under several types of models that have been proposed to represent…
Descriptors: Achievement Tests, Student Motivation, Test Wiseness, Adaptive Testing
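The abstract above refers to response time-based analyses of rapid-guessing behavior. Below is a minimal sketch of one common approach, flagging responses faster than an item-level time threshold and summarizing each examinee's response-time effort; the 10-second threshold and the data are illustrative assumptions, not the authors' procedure.

```python
import numpy as np

def rapid_guess_flags(resp_times, thresholds):
    """Flag responses whose time (in seconds) falls below the item's threshold."""
    return resp_times < thresholds  # boolean matrix, examinees x items

def response_time_effort(flags):
    """Proportion of non-rapid (solution-behavior) responses per examinee."""
    return 1.0 - flags.mean(axis=1)

# Illustrative data: 3 examinees x 4 items, response times in seconds.
times = np.array([[35.0, 42.0, 3.0, 50.0],
                  [4.0, 2.5, 3.0, 5.0],
                  [28.0, 31.0, 45.0, 39.0]])
thresholds = np.full(4, 10.0)  # assumed per-item rapid-guessing thresholds

flags = rapid_guess_flags(times, thresholds)
print(response_time_effort(flags))  # [0.75 0.   1.  ]
```

Low effort values (here the second examinee) mark test takers whose item accuracy can then be compared with chance-level guessing.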
Zechner, Klaus; Chen, Lei; Davis, Larry; Evanini, Keelan; Lee, Chong Min; Leong, Chee Wee; Wang, Xinhao; Yoon, Su-Youn – ETS Research Report Series, 2015
This research report presents a summary of research and development efforts devoted to creating scoring models for automatically scoring spoken item responses of a pilot administration of the Test of English-for-Teaching ("TEFT"™) within the "ELTeach"™ framework. The test consists of items for all four language modalities:…
Descriptors: Scoring, Scoring Formulas, Speech Communication, Task Analysis
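The report describes building scoring models from automatically extracted features of spoken responses. As a rough sketch under assumed inputs (the three features and the data are hypothetical, not the TEFT feature set), a linear model fit to human scores and checked against held-out ratings:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# Hypothetical delivery/language-use features, e.g. speaking rate, pause ratio, grammar score.
X = rng.normal(size=(200, 3))
human = X @ np.array([0.6, -0.4, 0.8]) + rng.normal(scale=0.5, size=200)

train, test = slice(0, 150), slice(150, 200)
model = LinearRegression().fit(X[train], human[train])
machine = model.predict(X[test])

r, _ = pearsonr(machine, human[test])
print(f"machine-human correlation on held-out responses: {r:.2f}")
```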
Pollio, Marty; Hochbein, Craig – Teachers College Record, 2015
Background/Context: From two decades of research on the grading practices of teachers in secondary schools, researchers discovered that teachers evaluated students on numerous factors that do not validly assess a student's achievement level in a specific content area. These consistent findings suggested that traditional grading practices evolved…
Descriptors: Standardized Tests, Academic Standards, Grading, Scores
Taskinen, Päivi H.; Steimel, Jochen; Gräfe, Linda; Engell, Sebastian; Frey, Andreas – Peabody Journal of Education, 2015
This study examined students' competencies in engineering education at the university level. First, we developed a competency model in one specific field of engineering: process dynamics and control. Then, the theoretical model was used as a frame to construct test items to measure students' competencies comprehensively. In the empirical…
Descriptors: Models, Engineering Education, Test Items, Outcome Measures
Wang, Tsung Juang – Teaching in Higher Education, 2011
Virtual world technology is now being incorporated into various higher education programs, often with enthusiastic claims about the improvement of students' abilities to experience learning problems and tasks in computer-mediated virtual reality through the use of computer-generated personal agents or avatars. The interactivity of the avatars with…
Descriptors: Constructivism (Learning), Learning Problems, Computer Simulation, Scoring Formulas
Tiantong, Monchai; Teemuangsai, Sanit – International Education Studies, 2013
One benefit of using collaborative learning is that it enhances learning achievement and increases social skills; a second is that the more students work together in collaborative groups, the more they understand, retain, and feel better about themselves and their peers. Moreover, working together in a collaborative environment…
Descriptors: Foreign Countries, Cooperative Learning, Teamwork, Integrated Learning Systems
Attali, Yigal – Applied Psychological Measurement, 2011
Recently, Attali and Powers investigated the usefulness of providing immediate feedback on the correctness of answers to constructed response questions and the opportunity to revise incorrect answers. This article introduces an item response theory (IRT) model for scoring revised responses to questions when several attempts are allowed. The model…
Descriptors: Feedback (Response), Item Response Theory, Models, Error Correction
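The abstract names an IRT model for responses revised over several attempts but does not reproduce its equations in this excerpt. Purely as an illustration, the sketch below scores an item polytomously by how early it was answered correctly and computes category probabilities under a partial credit model; this is a generic setup, not Attali's parameterization.

```python
import numpy as np

def pcm_probs(theta, deltas):
    """Partial credit model category probabilities for a single item.

    Higher categories stand for success on an earlier attempt (more credit);
    deltas are the step difficulties, one per category boundary.
    """
    steps = np.concatenate(([0.0], theta - np.asarray(deltas, dtype=float)))
    logits = np.cumsum(steps)              # cumulative step logits for categories 0..K
    probs = np.exp(logits - logits.max())  # stabilized exponentiation
    return probs / probs.sum()

# Illustrative: ability 0.5, two steps -> three score categories.
print(pcm_probs(theta=0.5, deltas=[-0.2, 0.4]))
```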
Kreiner, Svend – Applied Psychological Measurement, 2011
To rule out the need for a two-parameter item response theory (IRT) model during item analysis by Rasch models, it is important to check the Rasch model's assumption that all items have the same item discrimination. Biserial and polyserial correlation coefficients measuring the association between items and restscores are often used in an informal…
Descriptors: Item Analysis, Correlation, Item Response Theory, Models
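The abstract points to biserial and polyserial correlations between items and restscores as an informal check that all items discriminate equally under the Rasch model. A minimal point-biserial version for dichotomous items is sketched below with made-up data; Kreiner's paper develops formal tests beyond this informal screen.

```python
import numpy as np
from scipy.stats import pointbiserialr

def item_restscore_correlations(responses):
    """Point-biserial correlation of each 0/1 item with its restscore
    (the total score with that item removed)."""
    responses = np.asarray(responses)
    total = responses.sum(axis=1)
    corrs = []
    for j in range(responses.shape[1]):
        rest = total - responses[:, j]
        r, _ = pointbiserialr(responses[:, j], rest)
        corrs.append(r)
    return np.array(corrs)

rng = np.random.default_rng(1)
data = (rng.random((100, 5)) > 0.4).astype(int)  # illustrative 0/1 response matrix
print(item_restscore_correlations(data))
```

Roughly equal item-restscore correlations are consistent with the Rasch assumption; an item standing well above or below the rest suggests unequal discrimination.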
Ramineni, Chaitanya; Trapani, Catherine S.; Williamson, David M.; Davey, Tim; Bridgeman, Brent – ETS Research Report Series, 2012
Automated scoring models for the "e-rater"® scoring engine were built and evaluated for the "GRE"® argument and issue-writing tasks. Prompt-specific, generic, and generic with prompt-specific intercept scoring models were built and evaluation statistics such as weighted kappas, Pearson correlations, standardized difference in…
Descriptors: Scoring, Test Scoring Machines, Automation, Models
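The abstract lists weighted kappa, Pearson correlation, and standardized differences as the evaluation statistics for comparing automated and human scores. A minimal sketch with made-up score vectors follows; the operational agreement thresholds are not part of this excerpt, and dividing the mean difference by the human-score standard deviation is one common convention, assumed here.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score
from scipy.stats import pearsonr

human = np.array([3, 4, 4, 2, 5, 3, 4, 1, 3, 5])
machine = np.array([3, 4, 3, 2, 5, 4, 4, 2, 3, 4])

qwk = cohen_kappa_score(human, machine, weights="quadratic")  # quadratic weighted kappa
r, _ = pearsonr(human, machine)
std_diff = (machine.mean() - human.mean()) / human.std(ddof=1)

print(f"quadratic weighted kappa: {qwk:.3f}")
print(f"Pearson r: {r:.3f}")
print(f"standardized mean difference: {std_diff:.3f}")
```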

Green, J. R.; And Others – British Journal of Educational Psychology, 1981
A simple unbalanced block model is proposed for examination marks, as an improvement on the usual implicit model. The new model is applied to some real data and is found, by the usual normal linear theory F test, to give a highly significant improvement. Some alternative models are also considered. (Author)
Descriptors: Achievement Rating, Achievement Tests, Models, Scoring Formulas
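The abstract compares a block model for examination marks against the usual implicit model with a normal linear theory F test. The sketch below shows that kind of nested-model F test on an artificially unbalanced data set; the candidate and question effects are assumed for illustration and are not necessarily Green's blocks.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
candidates, questions = 30, 4
df = pd.DataFrame({
    "candidate": np.repeat(np.arange(candidates), questions),
    "question": np.tile(np.arange(questions), candidates),
})
df["mark"] = (rng.normal(60, 8, size=candidates)[df["candidate"]]
              + np.array([0.0, -5.0, 3.0, 1.0])[df["question"]]
              + rng.normal(0, 4, size=len(df)))
df = df.sample(frac=0.8, random_state=0)  # drop rows so the design is unbalanced

m0 = smf.ols("mark ~ C(candidate)", data=df).fit()                # implicit model
m1 = smf.ols("mark ~ C(candidate) + C(question)", data=df).fit()  # block model
print(sm.stats.anova_lm(m0, m1))  # F test for the added block effects
```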
Stricker, Lawrence J.; Rock, Donald A. – ETS Research Report Series, 2008
This study assessed the invariance in the factor structure of the "Test of English as a Foreign Language"™ Internet-based test (TOEFL® iBT) across subgroups of test takers who differed in native language and exposure to the English language. The subgroups were defined by (a) Indo-European and Non-Indo-European language family, (b)…
Descriptors: Factor Structure, English (Second Language), Language Tests, Computer Assisted Testing
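The study tests whether the factor structure is invariant across language-background subgroups using confirmatory methods. As a much lighter-weight illustration of comparing factor solutions between two groups, the sketch below computes Tucker's congruence coefficient for each factor from made-up loading matrices; this is not the report's own analysis.

```python
import numpy as np

def tucker_congruence(A, B):
    """Columnwise Tucker congruence coefficients between two loading matrices."""
    A, B = np.asarray(A), np.asarray(B)
    num = (A * B).sum(axis=0)
    den = np.sqrt((A ** 2).sum(axis=0) * (B ** 2).sum(axis=0))
    return num / den

loadings_group1 = np.array([[0.78, 0.10], [0.71, 0.05], [0.12, 0.80], [0.08, 0.75]])
loadings_group2 = np.array([[0.74, 0.12], [0.69, 0.09], [0.15, 0.77], [0.11, 0.72]])
print(tucker_congruence(loadings_group1, loadings_group2))  # values near 1: similar structure
```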
van den Brink, Wulfert – Evaluation in Education: International Progress, 1982
Binomial models for domain-referenced testing are compared, emphasizing the assumptions underlying the beta-binomial model. Advantages and disadvantages are discussed. A proposed item sampling model is presented which takes the effect of guessing into account. (Author/CM)
Descriptors: Comparative Analysis, Criterion Referenced Tests, Item Sampling, Measurement Techniques
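The abstract contrasts the binomial and beta-binomial models for domain-referenced test scores and mentions an item sampling model that allows for guessing. The sketch below only illustrates the two baseline score distributions plus the standard knowledge-or-random-guessing adjustment; it is not the author's proposed model.

```python
import numpy as np
from scipy.stats import binom, betabinom

n = 20            # items sampled from the domain
pi = 0.7          # one examinee's domain score (proportion of items known)
a, b = 7.0, 3.0   # beta parameters for the domain-score distribution (mean 0.7)

k = np.arange(n + 1)
binomial_pmf = binom.pmf(k, n, pi)             # ability fixed for one examinee
beta_binomial_pmf = betabinom.pmf(k, n, a, b)  # ability varying across examinees

# Knowledge-or-random-guessing: an unknown item is guessed correctly with probability 1/m.
m = 4                                  # response options per item
p_correct = pi + (1 - pi) / m          # marginal probability of a correct answer
guessing_pmf = binom.pmf(k, n, p_correct)

print(binomial_pmf.sum(), beta_binomial_pmf.sum(), guessing_pmf.sum())  # each sums to ~1.0
```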
Multiple Choice and True/False Tests: Reliability Measures and Some Implications of Negative Marking
Burton, Richard F. – Assessment & Evaluation in Higher Education, 2004
The standard error of measurement usefully provides confidence limits for scores in a given test, but is it possible to quantify the reliability of a test with just a single number that allows comparison of tests of different formats? Reliability coefficients do not do this, being dependent on the spread of examinee attainment. Better in this…
Descriptors: Multiple Choice Tests, Error of Measurement, Test Reliability, Test Items
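The abstract hinges on the relation between the standard error of measurement and the reliability coefficient. Below is a small worked sketch of the standard formula SEM = SD * sqrt(1 - reliability), the confidence band it gives a score, and the expected value of blind guessing under negative marking on a true/false item; the numbers are illustrative, not Burton's.

```python
import math

sd, reliability = 10.0, 0.85           # illustrative score SD and reliability coefficient
sem = sd * math.sqrt(1 - reliability)  # standard error of measurement
observed = 62.0
lo, hi = observed - 1.96 * sem, observed + 1.96 * sem
print(f"SEM = {sem:.2f}; approximate 95% band: {lo:.1f} to {hi:.1f}")

# Negative marking on a true/false item: +1 for a correct answer, -1 for a wrong one.
p_guess = 0.5
expected_gain = p_guess * 1 + (1 - p_guess) * (-1)
print(expected_gain)  # 0.0 -> blind guessing gains nothing on average
```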

Gottfredson, Linda S. – American Psychologist, 1994
Focuses on score adjustment by racial or ethnic group (race norming) in employment testing, and provides a history of the original controversy. The author analyzes race-based adjustments in test scores and discusses how personnel-selection science is being compromised in an effort to reconcile contradictory legal demands. (GLR)
Descriptors: Compliance (Legal), Court Litigation, Employment Practices, Equal Opportunities (Jobs)