Showing 1 to 15 of 28 results
Peer reviewed
PDF on ERIC Download full text
Deschênes, Marie-France; Dionne, Éric; Dorion, Michelle; Grondin, Julie – Practical Assessment, Research & Evaluation, 2023
The use of the aggregate scoring method for scoring concordance tests requires the weighting of test items to be derived from the performance of a group of experts who take the test under the same conditions as the examinees. However, the average score of experts constituting the reference panel remains a critical issue in the use of these tests.…
Descriptors: Scoring, Tests, Evaluation Methods, Test Items
Peer reviewed
Direct link
Carolyn Clarke – in education, 2024
This ethnographic case study, situated in Newfoundland and Labrador, Canada, examined the effects of full-scale provincial testing on families, its influences on homework, and familial accountability for teaching and learning. Data were drawn from family interviews, as well as letters and documents regarding homework. Teachers sensed a significant…
Descriptors: Academic Standards, Accountability, Testing, Homework
Peer reviewed
PDF on ERIC Download full text
Guo, Hongwen; Rios, Joseph A.; Ling, Guangming; Wang, Zhen; Gu, Lin; Yang, Zhitong; Liu, Lydia O. – ETS Research Report Series, 2022
Different variants of the selected-response (SR) item type have been developed for various reasons (e.g., simulating realistic situations, examining critical-thinking and/or problem-solving skills). Generally, the variants of SR item format are more complex than the traditional multiple-choice (MC) items, which may be more challenging to test…
Descriptors: Test Format, Test Wiseness, Test Items, Item Response Theory
Peer reviewed
Direct link
Abu-Ghazalah, Rashid M.; Dubins, David N.; Poon, Gregory M. K. – Applied Measurement in Education, 2023
Multiple choice results are inherently probabilistic outcomes, as correct responses reflect a combination of knowledge and guessing, while incorrect responses additionally reflect blunder, a confidently committed mistake. To objectively resolve knowledge from responses in an MC test structure, we evaluated probabilistic models that explicitly…
Descriptors: Guessing (Tests), Multiple Choice Tests, Probability, Models
Peer reviewed
Direct link
Slepkov, A. D.; Van Bussel, M. L.; Fitze, K. M.; Burr, W. S. – SAGE Open, 2021
There is a broad literature in multiple-choice test development, both in terms of item-writing guidelines, and psychometric functionality as a measurement tool. However, most of the published literature concerns multiple-choice testing in the context of expert-designed high-stakes standardized assessments, with little attention being paid to the…
Descriptors: Foreign Countries, Undergraduate Students, Student Evaluation, Multiple Choice Tests
Nazli Uygun Emil – ProQuest LLC, 2020
Validity of a measurement refers to appropriate test score meanings, uses, and interpretations (Messick, 1989; Kane, 1992). There are different approaches to validity: an evidentiary aspect of validity is one requiring gathering statistical evidence to evaluate test score meaning. A common approach to validation is comparisons of test score equity…
Descriptors: Educational Quality, Mathematics Tests, Test Validity, Test Reliability
Peer reviewed
PDF on ERIC Download full text
Chen, Michelle Y.; Flasko, Jennifer J. – Canadian Journal of Applied Linguistics / Revue canadienne de linguistique appliquée, 2020
Seeking evidence to support content validity is essential to test validation. This is especially the case in contexts where test scores are interpreted in relation to external proficiency standards and where new test content is constantly being produced to meet test administration and security demands. In this paper, we describe a modified…
Descriptors: Foreign Countries, Reading Tests, Language Tests, English (Second Language)
Peer reviewed
Direct link
Flynn, Alison B.; Featherstone, Ryan B. – Chemistry Education Research and Practice, 2017
This study investigated students' successes, strategies, and common errors in their answers to questions that involved the electron-pushing (curved arrow) formalism (EPF), part of organic chemistry's language. We analyzed students' answers to two question types on midterms and final exams: (1) draw the electron-pushing arrows of a reaction step,…
Descriptors: Organic Chemistry, Error Patterns, Science Tests, Test Items
Peer reviewed
Direct link
Buono, Stephanie; Jang, Eunice Eunhee – Educational Assessment, 2021
Increasing linguistic diversity in classrooms has led researchers to examine the validity and fairness of standardized achievement tests, specifically concerning whether test score interpretations are free of bias and score use is fair for all students. This study examined whether mathematics achievement test items that contain complex language…
Descriptors: English Language Learners, Standardized Tests, Achievement Tests, Culture Fair Tests
Peer reviewed
Direct link
Abashidze, Dato; McDonough, Kim; Gao, Yang – Second Language Research, 2022
Recent research that explored how input exposure and learner characteristics influence novel L2 morphosyntactic pattern learning has exposed participants to either text or static images rather than dynamic visual events. Furthermore, it is not known whether incorporating eye gaze cues into dynamic visual events enhances dual pattern learning.…
Descriptors: Second Language Learning, Second Language Instruction, Language Patterns, Morphology (Languages)
Peer reviewed
Direct link
McIntosh, James – Scandinavian Journal of Educational Research, 2019
This article examines whether the way that PISA models item outcomes in mathematics affects the validity of its country rankings. As an alternative to PISA methodology a two-parameter model is applied to PISA mathematics item data from Canada and Finland for the year 2012. In the estimation procedure item difficulty and dispersion parameters are…
Descriptors: Foreign Countries, Achievement Tests, Secondary School Students, International Assessment
Peer reviewed
Direct link
Wolkowitz, Amanda A.; Davis-Becker, Susan L.; Gerrow, Jack D. – Journal of Applied Testing Technology, 2016
The purpose of this study was to investigate the impact of a cheating prevention strategy employed for a professional credentialing exam that involved releasing over 7,000 active and retired exam items. This study evaluated: 1) If any significant differences existed between examinee performance on released versus non-released items; 2) If item…
Descriptors: Cheating, Test Content, Test Items, Foreign Countries
Peer reviewed
Direct link
Kam, Chester Chun Seng – Educational and Psychological Measurement, 2016
To measure the response style of acquiescence, researchers recommend the use of at least 15 items with heterogeneous content. Such an approach is consistent with its theoretical definition and is a substantial improvement over traditional methods. Nevertheless, measurement of acquiescence can be enhanced by two additional considerations: first, to…
Descriptors: Test Items, Response Style (Tests), Test Content, Measurement
Peer reviewed
Direct link
Roduta Roberts, Mary; Alves, Cecilia B.; Chu, Man-Wai; Thompson, Margaret; Bahry, Louise M.; Gotzmann, Andrea – Applied Measurement in Education, 2014
The purpose of this study was to evaluate the adequacy of three cognitive models, one developed by content experts and two generated from student verbal reports for explaining examinee performance on a grade 3 diagnostic mathematics test. For this study, the items were developed to directly measure the attributes in the cognitive model. The…
Descriptors: Foreign Countries, Mathematics Tests, Cognitive Processes, Models
Peer reviewed
PDF on ERIC Download full text
Liu, Yan; Zumbo, Bruno D.; Gustafson, Paul; Huang, Yi; Kroc, Edward; Wu, Amery D. – Practical Assessment, Research & Evaluation, 2016
A variety of differential item functioning (DIF) methods have been proposed and used for ensuring that a test is fair to all test takers in a target population in the situations of, for example, a test being translated to other languages. However, once a method flags an item as DIF, it is difficult to conclude that the grouping variable (e.g.,…
Descriptors: Test Items, Test Bias, Probability, Scores