Publication Date
In 2025 | 2 |
Since 2024 | 11 |
Since 2021 (last 5 years) | 24 |
Since 2016 (last 10 years) | 40 |
Since 2006 (last 20 years) | 65 |
Descriptor
Source
Author
Publication Type
Education Level
Location
Canada | 4 |
Illinois | 4 |
Iran | 4 |
China | 3 |
Japan | 3 |
Ohio | 3 |
Pennsylvania | 3 |
Turkey | 3 |
United Kingdom | 3 |
Australia | 2 |
Bangladesh | 2 |
More ▼ |
Laws, Policies, & Programs
Elementary and Secondary… | 2 |
Bilingual Education Act 1968 | 1 |
Elementary and Secondary… | 1 |
Elementary and Secondary… | 1 |
Elementary and Secondary… | 1 |
No Child Left Behind Act 2001 | 1 |
Assessments and Surveys
What Works Clearinghouse Rating
Augustin Mutak; Robert Krause; Esther Ulitzsch; Sören Much; Jochen Ranger; Steffi Pohl – Journal of Educational Measurement, 2024
Understanding the intraindividual relation between an individual's speed and ability in testing scenarios is essential to assure a fair assessment. Different approaches exist for estimating this relationship, that either rely on specific study designs or on specific assumptions. This paper aims to add to the toolbox of approaches for estimating…
Descriptors: Testing, Academic Ability, Time on Task, Correlation
Gökhan Iskifoglu – Turkish Online Journal of Educational Technology - TOJET, 2024
This research paper investigated the importance of conducting measurement invariance analysis in developing measurement tools for assessing differences between and among study variables. Most of the studies, which tended to develop an inventory to assess the existence of an attitude, behavior, belief, IQ, or an intuition in a person's…
Descriptors: Testing, Testing Problems, Error of Measurement, Attitude Measures
Amanda A. Wolkowitz; Russell Smith – Practical Assessment, Research & Evaluation, 2024
A decision consistency (DC) index is an estimate of the consistency of a classification decision on an exam. More specifically, DC estimates the percentage of examinees that would have the same classification decision on an exam if they were to retake the same or a parallel form of the exam again without memory of taking the exam the first time.…
Descriptors: Testing, Test Reliability, Replication (Evaluation), Decision Making
Hurford, David P.; Wines, Autumn – Australian Journal of Learning Difficulties, 2022
The purpose of the present study was to examine the potential that parents could effectively administer an online dyslexia evaluation tool (ODET) to their children. To this end, four groups consisting of parents and trained staff were compared. Sixty-three children (36 females and 27 males) participated. The children in each group were assessed…
Descriptors: Test Reliability, Computer Assisted Testing, Dyslexia, Screening Tests
Mehmet Kanik – International Journal of Assessment Tools in Education, 2024
ChatGPT has surged interest to cause people to look for its use in different tasks. However, before allowing it to replace humans, its capabilities should be investigated. As ChatGPT has potential for use in testing and assessment, this study aims to investigate the questions generated by ChatGPT by comparing them to those written by a course…
Descriptors: Artificial Intelligence, Testing, Multiple Choice Tests, Test Construction
Catherine Mata; Katharine Meyer; Lindsay Page – Annenberg Institute for School Reform at Brown University, 2024
This article examines the risk of crossover contamination in individual-level randomization, a common concern in experimental research, in the context of a large-enrollment college course. While individual-level randomization is more efficient for assessing program effectiveness, it also increases the potential for control group students to cross…
Descriptors: Chemistry, Science Instruction, Undergraduate Students, Large Group Instruction
Jeff Allen; Ty Cruce – ACT Education Corp., 2025
This report summarizes some of the evidence supporting interpretations of scores from the enhanced ACT, focusing on reliability, concurrent validity, predictive validity, and score comparability. The authors argue that the evidence presented in this report supports the interpretation of scores from the enhanced ACT as measures of high school…
Descriptors: College Entrance Examinations, Testing, Change, Scores
Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
Makaruk, Hubert; Porter, Jared M.; Cieslinski, Igor – Measurement in Physical Education and Exercise Science, 2021
This study examined the test-retest reliability of the standing long jump (SLJ) and the countermovement jump (CMJ) following consistent and non-consistent attentional focus cuing instructions in physically active young adults (n = 30). The systematic error (as standardize change in mean), random error (as typical error), the Bland and Altman…
Descriptors: Attention Control, Test Reliability, Performance Tests, Physical Activities
Mansooreh Hosseinnia; Zahra Kafi – Language Testing in Asia, 2024
As testing involves various aspects of education as well as the ones who are involved like instructors, students, managers, teacher trainers, testers, and decision-makers, it comes to be highly crucial to develop ethical tests. In addition, as some methods of testing are more favored and practiced compared to others without considering the ethical…
Descriptors: Test Construction, Test Validity, Ethics, Testing
Mücahit Öztürk – Open Praxis, 2024
This study examined the problems that pre-service teachers face in the online assessment process and their suggestions for solutions to these problems. The participants were 136 pre-service teachers who have been experiencing online assessment for a long time and who took the Foundations of Open and Distance Learning course. This research is a…
Descriptors: Foreign Countries, Preservice Teacher Education, Preservice Teachers, Distance Education
Yousuf, Mustafa S.; Miles, Katherine; Harvey, Heather; Al-Tamimi, Mohammad; Badran, Darwish – Journal of University Teaching and Learning Practice, 2022
Exams should be valid, reliable, and discriminative. Multiple informative methods are used for exam analysis. Displaying analysis results numerically, however, may not be easily comprehended. Using graphical analysis tools could be better for the perception of analysis results. Two such methods were employed: standardized x-bar control charts with…
Descriptors: Multiple Choice Tests, Testing, Test Reliability, Test Validity
Practices in Instrument Use and Development in "Chemistry Education Research and Practice" 2010-2021
Lazenby, Katherine; Tenney, Kristin; Marcroft, Tina A.; Komperda, Regis – Chemistry Education Research and Practice, 2023
Assessment instruments that generate quantitative data on attributes (cognitive, affective, behavioral, "etc.") of participants are commonly used in the chemistry education community to draw conclusions in research studies or inform practice. Recently, articles and editorials have stressed the importance of providing evidence for the…
Descriptors: Chemistry, Periodicals, Journal Articles, Science Education
Alper Gülay; Emre Cumali; Damla Cumali – International Journal of Contemporary Educational Research, 2024
This qualitative phenomenological study explores the experiences of parents of children with special needs in Turkey, specifically their encounters with Guidance and Research Centers (GRCs) during the process of obtaining educational assessment reports. Through semi-structured interviews with 25 parents, the study reveals complex emotions and…
Descriptors: Foreign Countries, Special Needs Students, Parent Attitudes, Parent Participation
F. Alethea Marti; Nadereh Pourat; Christopher Lee; Bonnie T. Zima – Administration and Policy in Mental Health and Mental Health Services Research, 2022
While many standardized assessment measures exist to track child mental health treatment outcomes, the degree to which such tools have been adequately tested for reliability and validity across race, ethnicity, and class is uneven. This paper examines the corpus of published tests of psychometric properties for the ten standardized measures used…
Descriptors: Mental Health, Outcome Measures, Psychometrics, Standardized Tests