Silvia Testa; Renato Miceli – Educational Measurement: Issues and Practice, 2025
Random Equating (RE) and Heuristic Approach (HA) are two linking procedures that may be used to compare the scores of individuals in two tests that measure the same latent trait, in conditions where there are no common items or individuals. In this study, RE--that may only be used when the individuals taking the two tests come from the same…
Descriptors: Comparative Testing, Heuristics, Problem Solving, Personality Traits
Jing Miao; Yi Cao; Michael E. Walker – ETS Research Report Series, 2024
Studies of test score comparability have been conducted at different stages in the history of testing to ensure that test results carry the same meaning regardless of test conditions. The expansion of at-home testing via remote proctoring sparked another round of interest. This study uses data from three licensure tests to assess potential mode…
Descriptors: Testing, Test Format, Computer Assisted Testing, Home Study
Jared S. Soileau; Gregory P. Tapis; Spencer C. Usrey; Thomas Z. Webb – Accounting Education, 2025
While the CPA exam is uniform for all jurisdictions, individual jurisdictions are allowed to set the requirements to sit for the exam. These requirements vary by jurisdiction, with some being more restrictive than others. We analyze the decision process of international candidates to proxy for candidates that can choose jurisdictions with less…
Descriptors: Accounting, Foreign Students, Licensing Examinations (Professions), Certification
Jeff Coon; Paulina N. Silva; Alexander Etz; Barbara W. Sarnecka – Journal of Cognition and Development, 2025
Bayesian methods offer many advantages when applied to psychological research, yet they may seem esoteric to researchers who are accustomed to traditional methods. This paper aims to lower the barrier of entry for developmental psychologists who are interested in using Bayesian methods. We provide worked examples of how to analyze common study…
Descriptors: Developmental Psychology, Bayesian Statistics, Research Methodology, Psychological Studies
Karyssa A. Courey; Frederick L. Oswald; Steven A. Culpepper – Practical Assessment, Research & Evaluation, 2024
Historically, organizational researchers have fully embraced frequentist statistics and null hypothesis significance testing (NHST). Bayesian statistics is an underused alternative paradigm offering numerous benefits for organizational researchers and practitioners: e.g., accumulating direct evidence for the null hypothesis (vs. 'fail to reject…
Descriptors: Bayesian Statistics, Statistical Distributions, Researchers, Institutional Research
Jonas Flodén – British Educational Research Journal, 2025
This study compares how the generative AI (GenAI) large language model (LLM) ChatGPT performs in grading university exams compared to human teachers. Aspects investigated include consistency, large discrepancies and length of answer. Implications for higher education, including the role of teachers and ethics, are also discussed. Three…
Descriptors: College Faculty, Artificial Intelligence, Comparative Testing, Scoring
Amanda A. Wolkowitz; Russell Smith – Practical Assessment, Research & Evaluation, 2024
A decision consistency (DC) index is an estimate of the consistency of a classification decision on an exam. More specifically, DC estimates the percentage of examinees that would have the same classification decision on an exam if they were to retake the same or a parallel form of the exam again without memory of taking the exam the first time.…
Descriptors: Testing, Test Reliability, Replication (Evaluation), Decision Making
Xuelan Qiu; Jimmy de la Torre; You-Gan Wang; Jinran Wu – Educational Measurement: Issues and Practice, 2024
Multidimensional forced-choice (MFC) items have been found to be useful to reduce response biases in personality assessments. However, conventional scoring methods for the MFC items result in ipsative data, hindering the wider applications of the MFC format. In the last decade, a number of item response theory (IRT) models have been developed,…
Descriptors: Item Response Theory, Personality Traits, Personality Measures, Personality Assessment
Abdou L. J. Jammeh; Claude Karegeya; Savita Ladage – Education and Information Technologies, 2025
Clicker-integrated instruction is the current innovation in teaching and learning. Several studies used this technology to investigate learning processes, while others mainly used it to assess for learning, facilitation of group discussion and students' participation. All applications require creativity and analytical thinking and very much…
Descriptors: Chemistry, Science Instruction, Audience Response Systems, Computer Assisted Instruction
Tahereh Firoozi; Hamid Mohammadi; Mark J. Gierl – Journal of Educational Measurement, 2025
The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two different sentence embedding models were evaluated within the AES system, multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were…
Descriptors: College Students, Slavic Languages, German, Italian
Wen Xin Zhang; John J. H. Lin; Ying-Shao Hsu – Journal of Computer Assisted Learning, 2025
Background Study: Assessing learners' inquiry-based skills is challenging as social, political, and technological dimensions must be considered. The advanced development of artificial intelligence (AI) makes it possible to address these challenges and shape the next generation of science education. Objectives: The present study evaluated the SSI…
Descriptors: Artificial Intelligence, Computer Assisted Testing, Inquiry, Active Learning
Eva Ulrychová; Renata Majovská; Petr Tesar – Journal on Efficiency and Responsibility in Education and Science, 2024
The article deals with the results of mathematics examinations at the University of Finance and Administration in Prague before, during, and immediately after the COVID-19 pandemic-related restrictions. The first objective is to evaluate whether the non-standard forms of testing (correspondence and online), used on an emergency basis during the…
Descriptors: Foreign Countries, COVID-19, Pandemics, Mathematics Tests
Nataly Beribisky; Gregory R. Hancock – Educational and Psychological Measurement, 2024
Fit indices are descriptive measures that can help evaluate how well a confirmatory factor analysis (CFA) model fits a researcher's data. In multigroup models, before between-group comparisons are made, fit indices may be used to evaluate measurement invariance by assessing the degree to which multiple groups' data are consistent with increasingly…
Descriptors: Factor Analysis, Research Methodology, Comparative Testing, Measurement
Catherine Mata; Katharine Meyer; Lindsay Page – Annenberg Institute for School Reform at Brown University, 2024
This article examines the risk of crossover contamination in individual-level randomization, a common concern in experimental research, in the context of a large-enrollment college course. While individual-level randomization is more efficient for assessing program effectiveness, it also increases the potential for control group students to cross…
Descriptors: Chemistry, Science Instruction, Undergraduate Students, Large Group Instruction
Wim J. van der Linden; Luping Niu; Seung W. Choi – Journal of Educational and Behavioral Statistics, 2024
A test battery with two different levels of adaptation is presented: a within-subtest level for the selection of the items in the subtests and a between-subtest level to move from one subtest to the next. The battery runs on a two-level model consisting of a regular response model for each of the subtests extended with a second level for the joint…
Descriptors: Adaptive Testing, Test Construction, Test Format, Test Reliability

