Andres De Los Reyes; Mo Wang; Matthew D. Lerner; Bridget A. Makol; Olivia M. Fitzpatrick; John R. Weisz – Grantee Submission, 2022
Researchers strategically assess youth mental health by soliciting reports from multiple informants. Typically, these informants (e.g., parents, teachers, youth themselves) vary in the social contexts where they observe youth. Decades of research reveal that the most common data conditions produced with this approach consist of discrepancies…
Descriptors: Mental Health, Measurement Techniques, Evaluation Methods, Research
The AI Teacher Test: Measuring the Pedagogical Ability of Blender and GPT-3 in Educational Dialogues
Tack, Anaïs; Piech, Chris – International Educational Data Mining Society, 2022
How can we test whether state-of-the-art generative models, such as Blender and GPT-3, are good AI teachers, capable of replying to a student in an educational dialogue? Designing an AI teacher test is challenging: although evaluation methods are much-needed, there is no off-the-shelf solution to measuring pedagogical ability. This paper reports…
Descriptors: Artificial Intelligence, Dialogs (Language), Bayesian Statistics, Decision Making
Pullen, Reyne; Thickett, Stuart C.; Bissember, Alex C. – Chemistry Education Research and Practice, 2018
In chemistry curricula, both the role of the laboratory program and the method of assessment used are subject to scrutiny and debate. The ability to identify clearly defined competencies for the chemistry laboratory program is crucial, given the numerous other disciplines that rely on foundation-level chemistry knowledge and practical skills. In…
Descriptors: Undergraduate Study, College Science, Chemistry, Science Laboratories
Paz, Jennica; Kim, Eui Kyung; Dowdy, Erin; Furlong, Michael J.; Hinton, Tameisha; Piqueras, José A.; Rodríguez-Jiménez, Tíscar; Marzo, Juan C.; Coates, Susan – Grantee Submission, 2020
The assessment of psychosocial strengths in children and adolescents has predominately focused on the measurement of single traits and constructs, such as grit (Christopoulou, Lakioti, Pezirkianidis, Karakasidou, & Stalikas, 2018), optimism (Oberle, Guhn, Gadermann, Thomson, & Schonert-Reichl, 2018), hope (Pedrotti, 2018), and gratitude…
Descriptors: Psychological Patterns, Student Characteristics, Screening Tests, Holistic Approach
Aydin, Selami; Harputlu, Leyla; Çelik, Seyda Savran; Ustuk, Özgehan; Güzel, Serhat; Genç, Deniz – Online Submission, 2016
Measurement of children's behaviors in an educational and research context is a problematic and complex area. It is also evident that adapting scales to measure children's behaviors in an educational and research context is a complex process due to several reasons. First, cultural elements constitute a considerable problem. Second, it is difficult…
Descriptors: Child Behavior, Models, Test Construction, Test Validity
Lewis, Todd F. – Measurement and Evaluation in Counseling and Development, 2017
American Educational Research Association (AERA) standards stipulate that researchers show evidence of the internal structure of instruments. Confirmatory factor analysis (CFA) is one structural equation modeling procedure designed to assess construct validity of assessments that has broad applicability for counselors interested in instrument…
Descriptors: Educational Research, Factor Analysis, Structural Equation Models, Construct Validity
Varela, Otmar; Mead, Esther – Journal of Education for Business, 2018
Popular teamwork assessments have been strongly criticized on the grounds of poor psychometric properties and their disconnect with conceptual models of teamwork. These issues raise concerns with respect to our ability to evaluate efforts devoted to advancing teamwork in academia. We report the development of a teamwork assessment that builds on…
Descriptors: Teamwork, Evaluation Methods, Test Validity, Psychometrics
Hughes, John; Petscher, Yaacov – Regional Educational Laboratory Southeast, 2016
The high rate of students taking developmental education courses suggests that many students graduate from high school unready to meet college expectations. A college readiness screener can help colleges and school districts better identify students who are not ready for college credit courses. The primary audience for this guide is leaders and…
Descriptors: College Readiness, Screening Tests, Test Construction, Predictor Variables
DiCerbo, Kristen E. – Journal of Applied Testing Technology, 2017
While game-based assessment offers new potential for understanding the processes students use to solve problems, it also presents new challenges in uncovering which player actions provide evidence that contributes to understanding about students' knowledge, skill, and attributes that we are interested in assessing. A development process that…
Descriptors: Educational Games, Evaluation Methods, Educational Technology, Technology Uses in Education
von Wangenheim, Christiane G.; Petri, Giani; Zibertti, André W.; Borgatto, Adriano F.; Hauck, Jean C. R.; Pacheco, Fernando S.; Filho, Raul Missfeldt – Informatics in Education, 2017
The objective of this article is to present the development and evaluation of dETECT (Evaluating TEaching CompuTing), a model for the evaluation of the quality of instructional units for teaching computing in middle school based on the students' perception collected through a measurement instrument. The dETECT model was systematically developed…
Descriptors: Units of Study, Course Evaluation, Case Studies, Evaluation Methods
van den Berg, M.; Harskamp, E. G.; Suhre, C. J. M. – Educational Studies, 2016
In the last two decades Dutch primary school students scored below expectation in international mathematics tests. An explanation for this may be that teachers fail to adequately assess their students' understanding of learning goals and provide timely feedback. To improve the teachers' formative assessment practice, researchers, curriculum…
Descriptors: Foreign Countries, Elementary School Students, Scores, Mathematics Achievement
Huffman, Tanner; Burke, Barry – Technology and Engineering Teacher, 2015
Engineering byDesign™ (EbD) provides teachers and students with reflective, formative, and comprehensive summative assessment tools throughout the curriculum. Each lesson, unit, and course is created with a list of essential questions that are meant to guide students to a deeper understanding of national standards across the STEM domains (STL,…
Descriptors: Technological Literacy, Engineering Education, Student Evaluation, Evaluation Methods
Andjelic, Svetlana; Cekerevac, Zoran – Education and Information Technologies, 2014
This article presents the original model of the computer adaptive testing and grade formation, based on scientifically recognized theories. The base of the model is a personalized algorithm for selection of questions depending on the accuracy of the answer to the previous question. The test is divided into three basic levels of difficulty, and the…
Descriptors: Computer Assisted Testing, Educational Technology, Grades (Scholastic), Test Construction
Ercikan, Kadriye; Oliveri, María Elena – Applied Measurement in Education, 2016
Assessing complex constructs such as those discussed under the umbrella of 21st century constructs highlights the need for a principled assessment design and validation approach. In our discussion, we made a case for three considerations: (a) taking construct complexity into account across various stages of assessment development such as the…
Descriptors: Evaluation Methods, Test Construction, Design, Scaling
Perry, John L.; Clough, Peter J.; Crust, Lee; Nabb, Sam L.; Nicholls, Adam R. – Research Quarterly for Exercise and Sport, 2015
Purpose: A new measure of sportspersonship, which differentiates between compliance and principled approaches, was developed and initially validated in 3 studies. Method: Study 1 developed items, assessed content validity, and proposed a model. Study 2 tested the factorial validity of the model on an independent sample. Study 3 further tested the…
Descriptors: Program Development, Program Validation, Physical Education, Compliance (Legal)