James Soland – Journal of Research on Educational Effectiveness, 2024
When randomized control trials are not possible, quasi-experimental methods often represent the gold standard. One quasi-experimental method is difference-in-difference (DiD), which compares changes in outcomes before and after treatment across groups to estimate a causal effect. DiD researchers often use fairly exhaustive robustness checks to…
Descriptors: Item Response Theory, Testing, Test Validity, Intervention
Uto, Masaki; Aomi, Itsuki; Tsutsumi, Emiko; Ueno, Maomi – IEEE Transactions on Learning Technologies, 2023
In automated essay scoring (AES), essays are automatically graded without human raters. Many AES models based on various manually designed features or various architectures of deep neural networks (DNNs) have been proposed over the past few decades. Each AES model has unique advantages and characteristics. Therefore, rather than using a single-AES…
Descriptors: Prediction, Scores, Computer Assisted Testing, Scoring
Shin, Jinnie; Gierl, Mark J. – Journal of Applied Testing Technology, 2022
Automated Essay Scoring (AES) technologies provide innovative solutions to score written essays in a much shorter time span and at a fraction of the current cost. Traditionally, AES emphasized the importance of capturing the "coherence" of writing because abundant evidence indicated the connection between coherence and the overall…
Descriptors: Computer Assisted Testing, Scoring, Essays, Automation
Markus T. Jansen; Ralf Schulze – Educational and Psychological Measurement, 2024
Thurstonian forced-choice modeling is considered to be a powerful new tool to estimate item and person parameters while simultaneously testing the model fit. This assessment approach is associated with the aim of reducing faking and other response tendencies that plague traditional self-report trait assessments. As a result of major recent…
Descriptors: Factor Analysis, Models, Item Analysis, Evaluation Methods
Eunsook Kim; Nathaniel von der Embse – Journal of Experimental Education, 2024
Using data from multiple informants has long been considered best practice in education. However, multiple informants often disagree on similar constructs, complicating decision-making. Polynomial regression and response-surface analysis (PRA) is often used to test the congruence effect between multiple informants on an outcome. However, PRA…
Descriptors: Congruence (Psychology), Information Sources, Best Practices, Regression (Statistics)
Sami Baral; Eamon Worden; Wen-Chiang Lim; Zhuang Luo; Christopher Santorelli; Ashish Gurung; Neil Heffernan – Grantee Submission, 2024
The effectiveness of feedback in enhancing learning outcomes is well documented within Educational Data Mining (EDM). Prior research has explored various methodologies to enhance the effectiveness of feedback to students. Recent developments in Large Language Models (LLMs) have extended their utility in enhancing automated…
Descriptors: Automation, Scoring, Computer Assisted Testing, Natural Language Processing
Thomas, Gary – British Educational Research Journal, 2021
Natural scientists are relaxed about the multiple forms experiment takes in their various fields. Yet in education we have for many years constrained our notion of experiment. This methodological circumscription has been self-imposed on the grounds that experiment of a particular, well-defined form offers the clearest evidence of a link between…
Descriptors: Educational Experiments, Models, Intervention, Context Effect
Christopher D. Wilson; Kevin C. Haudek; Jonathan F. Osborne; Zoë E. Buck Bracey; Tina Cheuk; Brian M. Donovan; Molly A. M. Stuhlsatz; Marisol M. Santiago; Xiaoming Zhai – Journal of Research in Science Teaching, 2024
Argumentation is fundamental to science education, both as a prominent feature of scientific reasoning and as an effective mode of learning--a perspective reflected in contemporary frameworks and standards. The successful implementation of argumentation in school science, however, requires a paradigm shift in science assessment from the…
Descriptors: Middle School Students, Competence, Science Process Skills, Persuasive Discourse
Uto, Masaki; Okano, Masashi – IEEE Transactions on Learning Technologies, 2021
In automated essay scoring (AES), scores are automatically assigned to essays as an alternative to grading by humans. Traditional AES typically relies on handcrafted features, whereas recent studies have proposed AES models based on deep neural networks to obviate the need for feature engineering. Those AES models generally require training on a…
Descriptors: Essays, Scoring, Writing Evaluation, Item Response Theory
Raykov, Tenko; Marcoulides, George A.; Huber, Chuck – Measurement: Interdisciplinary Research and Perspectives, 2020
It is demonstrated that the popular three-parameter logistic model can lead to markedly inaccurate individual ability level estimates for mixture populations. A theoretically and empirically important setting is initially considered where (a) in one of two subpopulations (latent classes) the two-parameter logistic model holds for each item in a…
Descriptors: Item Response Theory, Models, Measurement Techniques, Item Analysis
Levin, Nathan A. – Journal of Educational Data Mining, 2021
The Big Data for Education Spoke of the NSF Northeast Big Data Innovation Hub and ETS co-sponsored an educational data mining competition in which contestants were asked to predict efficient time use on the NAEP 8th grade mathematics computer-based assessment, based on the log file of a student's actions on a prior portion of the assessment. In…
Descriptors: Learning Analytics, Data Collection, Competition, Prediction
Michael Gilraine; Jeffrey Penney – Annenberg Institute for School Reform at Brown University, 2021
An administrative rule allowed students who failed an exam to retake it shortly after, triggering strong "teach to the test" incentives to raise these students' test scores for the retake. We develop a model that accounts for truncation and find that these students score 0.14 standard deviations higher on the retest. Using a regression…
Descriptors: Tests, Models, Scores, Test Coaching
Ayodele, Alicia Nicole – ProQuest LLC, 2017
Within polytomous items, differential item functioning (DIF) can take on various forms due to the number of response categories. The lack of invariance at this level is referred to as differential step functioning (DSF). The most common DSF methods in the literature are the adjacent category log odds ratio (AC-LOR) estimator and cumulative…
Descriptors: Statistical Analysis, Test Bias, Test Items, Scores
Miciak, Jeremy; Taylor, W. Pat; Stuebing, Karla K.; Fletcher, Jack M. – Journal of Psychoeducational Assessment, 2018
We investigated the classification accuracy of learning disability (LD) identification methods premised on the identification of an intraindividual pattern of processing strengths and weaknesses (PSW) method using multiple indicators for all latent constructs. Known LD status was derived from latent scores; values at the observed level identified…
Descriptors: Accuracy, Learning Disabilities, Classification, Identification
Hsiao, Yu-Yu; Kwok, Oi-Man; Lai, Mark H. C. – Educational and Psychological Measurement, 2018
Path models with observed composites based on multiple items (e.g., mean or sum score of the items) are commonly used to test interaction effects. Under this practice, researchers generally assume that the observed composites are measured without errors. In this study, we reviewed and evaluated two alternative methods within the structural…
Descriptors: Error of Measurement, Testing, Scores, Models

