Showing all 11 results
Peer reviewed
Tack, Anaïs; Piech, Chris – International Educational Data Mining Society, 2022
How can we test whether state-of-the-art generative models, such as Blender and GPT-3, are good AI teachers, capable of replying to a student in an educational dialogue? Designing an AI teacher test is challenging: although evaluation methods are much-needed, there is no off-the-shelf solution to measuring pedagogical ability. This paper reports…
Descriptors: Artificial Intelligence, Dialogs (Language), Bayesian Statistics, Decision Making
Getman, Edward Paul – Online Submission, 2020
Despite calls for engaging assessments targeting young language learners (YLLs) between 8 and 13 years old, what makes assessment tasks engaging and how such task characteristics affect measurement quality have not been well studied empirically. Furthermore, there has been a dearth of validity research about technology-enhanced speaking tests for…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Learner Engagement
Peer reviewed
Cook, Robert J.; Durning, Steven J. – AERA Online Paper Repository, 2016
In an effort to better align item development to goals of assessing higher-order tasks and decision making, complex decision trees were developed to follow clinical reasoning scripts and used as models on which multiple-choice questions could be built. This approach is compatible with best-practice assessment frameworks like Evidence Centered…
Descriptors: Multiple Choice Tests, Decision Making, Models, Task Analysis
Peer reviewed
Ketterlin-Geller, Leanne R.; Yovanoff, Paul; Jung, EunJu; Liu, Kimy; Geller, Josh – Educational Assessment, 2013
In this article, we highlight the need for a precisely defined construct in score-based validation and discuss the contribution of cognitive theories to accurately and comprehensively defining the construct. We propose a framework for integrating cognitively based theoretical and empirical evidence to specify and evaluate the construct. We apply…
Descriptors: Test Validity, Construct Validity, Scores, Evidence
Jia, Yujie – ProQuest LLC, 2013
This study employed Bachman and Palmer's (2010) Assessment Use Argument framework to investigate to what extent the use of a second language oral test as an exit test in a Hong Kong university can be justified. It also aimed to help test developers of this oral test identify the most critical areas in the current test design that might need…
Descriptors: Test Use, Language Tests, Oral Language, Second Language Learning
Peer reviewed
Wilson, E. Vance; Sheetz, Steven D. – Computers & Education, 2010
This paper presents an initial test of the group task demands-resources (GTD-R) model of group task performance among IT students. We theorize that demands and resources in group work influence formation of perceived group work pressure (GWP) and that heightened levels of GWP inhibit group task performance. A prior study identified 11 factors…
Descriptors: Burnout, Group Dynamics, Models, Group Activities
Stephenson, Robert W.; And Others – 1973
A new, more specific language for describing work activities, based upon the duty module (clusters of tasks that tend to go together occupationally and organizationally in meaningful ways) is being designed for the Army. The purpose is to improve communications between resource and requirement planners and program operators. The paper proposes two…
Descriptors: Evaluation Methods, Individual Testing, Military Personnel, Models
Peer reviewed
Butler, Timothy; Waldroop, James – Journal of Career Assessment, 2004
The authors argue that an effective way to describe the manifestation of interest patterns within a particular work domain is through a nuanced description of interests in terms of the essential functional activities common to that domain. Focusing on the domain of business work and studying a large sample of business professionals over a 15-year…
Descriptors: Psychometrics, Interest Inventories, Business, Vocational Interests
Peer reviewed
Bhaskar, R.; Dillard, Jesse F. – Instructional Science, 1983
Description of an objective method for assigning weights to questions on examinations includes discussions of classical test theory, knowledge organization, and how task analysis can be used to identify knowledge elements required to solve specific problems, rank them, and assign objective weights to exam questions using a Pareto distribution (7…
Descriptors: Accounting, Epistemology, Evaluation Methods, Item Analysis
Southern Association of Colleges and Schools, Atlanta, GA. – 1983
This volume contains the initial draft of a model for assessing students in vocational education programs in Georgia. Addressed in the first section of the draft are some of the components that are believed to be critical in the development of a model for assessing vocational student achievement, including selecting a program for use in developing…
Descriptors: Academic Achievement, Behavioral Objectives, Criterion Referenced Tests, Guidelines
Bormuth, John R. – 1979
A procedure is demonstrated for constructing tables showing, for each score on a commercial reading achievement test, the percentage of real-world materials that the testee is likely to comprehend with at least a criterion level of proficiency, the percentages of students in a local or national sample who can competently comprehend a given…
Descriptors: Criterion Referenced Tests, Elementary Secondary Education, Equivalency Tests, Expectancy Tables