Showing 1 to 15 of 16 results
Peer reviewed
Robinson, Daniel H. – Educational Psychology Review, 2021
In an article published in an open-access journal, Pennebaker et al. (PLoS One, 8(11), e79774, 2013) reported that an innovative computer-based system that included daily online testing resulted in better student performance in other concurrent courses and a reduction in achievement gaps between lower- and upper-middle-class students…
Descriptors: Computer Assisted Testing, Academic Achievement, Student Evaluation, College Students
Peer reviewed
Yang Jiang; Mo Zhang; Jiangang Hao; Paul Deane; Chen Li – Journal of Educational Measurement, 2024
The emergence of sophisticated AI tools such as ChatGPT, coupled with the transition to remote delivery of educational assessments in the COVID-19 era, has led to increasing concerns about academic integrity and test security. Using AI tools, test takers can produce high-quality texts effortlessly and use them to game assessments. It is thus…
Descriptors: Integrity, Artificial Intelligence, Technology Uses in Education, Ethics
Peer reviewed
Esther Ulitzsch; Steffi Pohl; Lale Khorramdel; Ulf Kroehne; Matthias von Davier – Journal of Educational and Behavioral Statistics, 2024
Questionnaires are by far the most common tool for measuring noncognitive constructs in psychology and educational sciences. Response bias may pose an additional source of variation between respondents that threatens the validity of conclusions drawn from questionnaire data. We present a mixture modeling approach that leverages response time data from…
Descriptors: Item Response Theory, Response Style (Tests), Questionnaires, Secondary School Students
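The mixture modeling idea in the abstract above can be illustrated with a minimal sketch (not the authors' model): fit a two-component Gaussian mixture to log response times with EM, treating the faster component as a candidate careless-responding class. The simulated data and all parameter values below are illustrative assumptions.

```python
# Minimal sketch: two-component Gaussian mixture on log response times,
# fit with EM, to separate a fast "careless" class from regular responding.
import numpy as np

rng = np.random.default_rng(0)
# Simulated log response times: 85% regular responders, 15% fast responders
log_rt = np.concatenate([rng.normal(2.5, 0.4, 850),   # regular
                         rng.normal(0.8, 0.3, 150)])  # fast/careless

pi = np.array([0.5, 0.5])      # mixing weights
mu = np.array([1.0, 3.0])      # component means
sigma = np.array([1.0, 1.0])   # component SDs

def normal_pdf(x, m, s):
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

for _ in range(200):
    # E-step: posterior responsibility of each component for each response
    dens = pi * normal_pdf(log_rt[:, None], mu, sigma)
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: update weights, means, and SDs from the responsibilities
    nk = resp.sum(axis=0)
    pi = nk / len(log_rt)
    mu = (resp * log_rt[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((resp * (log_rt[:, None] - mu) ** 2).sum(axis=0) / nk)

fast = np.argmin(mu)  # the faster component is the candidate careless class
print(f"estimated careless share: {pi[fast]:.2f}, mean log-RT: {mu[fast]:.2f}")
```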
Peer reviewed
Uto, Masaki; Okano, Masashi – IEEE Transactions on Learning Technologies, 2021
In automated essay scoring (AES), scores are automatically assigned to essays as an alternative to grading by humans. Traditional AES typically relies on handcrafted features, whereas recent studies have proposed AES models based on deep neural networks to obviate the need for feature engineering. Those AES models generally require training on a…
Descriptors: Essays, Scoring, Writing Evaluation, Item Response Theory
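As a point of contrast with the deep neural models the abstract above describes, here is a minimal sketch of the traditional handcrafted-feature AES baseline; the features, toy corpus, and scores are illustrative assumptions, not the authors' system.

```python
# Minimal sketch of feature-based AES: handcrafted essay features fed to
# linear regression. Real systems train on large human-rated corpora; the
# three-essay "corpus" here only demonstrates the mechanics.
import numpy as np

def handcrafted_features(essay: str) -> np.ndarray:
    words = essay.split()
    n_words = max(len(words), 1)
    return np.array([
        n_words,                                        # essay length
        sum(len(w) for w in words) / n_words,           # mean word length
        len(set(w.lower() for w in words)) / n_words,   # type-token ratio
        essay.count(",") + essay.count(";"),            # clause punctuation
    ])

# Toy (essay, human score) pairs standing in for a rated training corpus
train = [("Short essay.", 1.0),
         ("A longer, more elaborate essay; it develops several ideas.", 3.0),
         ("An extensive, carefully structured argument, with varied vocabulary; "
          "it weighs evidence, anticipates objections, and concludes firmly.", 5.0)]
X = np.array([handcrafted_features(e) for e, _ in train])
y = np.array([s for _, s in train])

# Fit regression weights by least squares (with an intercept column)
Xb = np.hstack([X, np.ones((len(X), 1))])
w, *_ = np.linalg.lstsq(Xb, y, rcond=None)

new = "A moderately detailed essay, offering some support for its claims."
score = np.append(handcrafted_features(new), 1.0) @ w
print(f"predicted score: {score:.2f}")
```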
Peer reviewed
Schack, Edna O.; Dueber, David; Thomas, Jonathan Norris; Fisher, Molly H.; Jong, Cindy – AERA Online Paper Repository, 2019
Scoring of teachers' noticing responses is typically burdened with rater bias and reliance upon interrater consensus. The authors sought to make the scoring process more objective, equitable, and generalizable. The development process began with a description of response characteristics for each professional noticing component disconnected from…
Descriptors: Models, Teacher Evaluation, Observation, Bias
Peer reviewed
PDF on ERIC
Nygren, Thomas; Guath, Mona – International Association for Development of the Information Society, 2018
In this study we investigate the ability of 532 teenagers to determine the credibility of digital news. Using an online test, we assess to what extent teenagers are able to determine the credibility of different sources, evaluate credible and biased uses of evidence, and corroborate information. Many respondents fail to identify the credibility of…
Descriptors: Credibility, Information Sources, Information Literacy, News Reporting
Nixi Wang – ProQuest LLC, 2022
Measurement errors attributable to cultural issues are complex and challenging for educational assessments. We need assessment tests sensitive to the cultural heterogeneity of populations, and psychometric methods appropriate to address fairness and equity concerns. Built on the research of culturally responsive assessment, this dissertation…
Descriptors: Culturally Relevant Education, Testing, Equal Education, Validity
Peer reviewed
He, Tung-hsien – SAGE Open, 2019
This study employed a mixed-design approach and the Many-Facet Rasch Measurement (MFRM) framework to investigate whether rater bias occurred between the onscreen scoring (OSS) mode and the paper-based scoring (PBS) mode. Nine human raters analytically marked scanned scripts and paper scripts using a six-category (i.e., six-criterion) rating…
Descriptors: Computer Assisted Testing, Scoring, Item Response Theory, Essays
Peer reviewed
Yu, Fu Yun; Sung, Shannon – Educational Technology & Society, 2019
This study examined whether different identity revelation conditions result in different online targeting behavior among peer-assessors through a pretest and posttest quasi-experimental research design. Students from six fifth-grade classes (N = 196) participated in online learning tasks where they generated and selected peer-generated questions…
Descriptors: Peer Evaluation, Grade 5, Elementary School Students, Educational Technology
Peer reviewed
Witt, Jessica K.; Brockmole, James R. – Journal of Experimental Psychology: Human Perception and Performance, 2012
Stereotypes, expectations, and emotions influence an observer's ability to detect and categorize objects as guns. In light of recent work in action-perception interactions, however, there is another unexplored factor that may be critical: The action choices available to the perceiver. In five experiments, participants determined whether another…
Descriptors: Weapons, Identification, Stereotypes, Visual Perception
Peer reviewed
Yen, Yung-Chin; Ho, Rong-Guey; Liao, Wen-Wei; Chen, Li-Ju – Educational Technology & Society, 2012
In a test, the score would be closer to the examinee's actual ability if careless mistakes could be corrected. In computerized adaptive testing (CAT), however, changing the answer to one item might make the following items no longer appropriate for estimating the examinee's ability. These inappropriate items in a reviewable CAT might in turn introduce bias in ability…
Descriptors: Foreign Countries, Adaptive Testing, Computer Assisted Testing, Item Response Theory
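The concern in the abstract above follows from how adaptive testing selects items. A minimal sketch under a 2PL model (an assumption; the study's model may differ): estimate ability by maximum likelihood, pick the next item by maximum Fisher information, and observe how a retroactive answer change shifts the estimate, leaving the already-administered items mismatched. Item parameters and responses are illustrative.

```python
# Minimal 2PL CAT sketch: grid-search MLE for ability, then
# maximum-information item selection at the current estimate.
import numpy as np

a = np.array([1.2, 0.8, 1.5, 1.0, 1.3])    # item discriminations
b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])  # item difficulties

def p_correct(theta, a, b):
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def mle_theta(a, b, responses, grid=np.linspace(-4, 4, 801)):
    # grid-search maximum likelihood estimate of ability
    p = p_correct(grid[:, None], a, b)
    loglik = (responses * np.log(p) + (1 - responses) * np.log(1 - p)).sum(axis=1)
    return grid[np.argmax(loglik)]

def next_item(theta, a, b, administered):
    # Fisher information of each item at the current ability estimate
    info = a**2 * p_correct(theta, a, b) * (1 - p_correct(theta, a, b))
    info[list(administered)] = -np.inf   # exclude items already given
    return int(np.argmax(info))

responses = np.array([1, 1, 0])          # answers to items 0, 1, 2
theta = mle_theta(a[:3], b[:3], responses)
print("theta:", theta, "-> next item:", next_item(theta, a, b, {0, 1, 2}))

# Reviewing and changing the answer to item 0 moves the estimate, so items
# selected under the old estimate are no longer the most informative ones.
changed = np.array([0, 1, 0])
print("after answer change, theta:", mle_theta(a[:3], b[:3], changed))
```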
Peer reviewed
Schmitt, T. A.; Sass, D. A.; Sullivan, J. R.; Walker, C. M. – International Journal of Testing, 2010
Imposed time limits on computer adaptive tests (CATs) can result in examinees having difficulty completing all items, thus compromising the validity and reliability of ability estimates. In this study, the effects of speededness were explored in a simulated CAT environment by varying examinee response patterns to end-of-test items. Expectedly,…
Descriptors: Monte Carlo Methods, Simulation, Computer Assisted Testing, Adaptive Testing
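A minimal sketch of this kind of Monte Carlo setup (illustrative assumptions, not the study's design): generate 2PL responses for a known ability, replace end-of-test responses with random guesses to mimic speededness, and compare the resulting ability estimates.

```python
# Minimal speededness simulation: rapid guessing on end-of-test items
# contaminates the maximum likelihood ability estimate.
import numpy as np

rng = np.random.default_rng(1)
n_items, theta_true = 30, 0.5
a = rng.uniform(0.8, 1.6, n_items)   # item discriminations
b = rng.normal(0.0, 1.0, n_items)    # item difficulties

# Model-consistent responses for the true ability
p = 1.0 / (1.0 + np.exp(-a * (theta_true - b)))
responses = (rng.random(n_items) < p).astype(float)

# Speeded condition: random guessing on the last 6 items
speeded = responses.copy()
speeded[-6:] = (rng.random(6) < 0.25)

def mle_theta(resp, grid=np.linspace(-4, 4, 801)):
    # grid-search maximum likelihood estimate of ability
    pg = 1.0 / (1.0 + np.exp(-a * (grid[:, None] - b)))
    ll = (resp * np.log(pg) + (1 - resp) * np.log(1 - pg)).sum(axis=1)
    return grid[np.argmax(ll)]

print("theta (normal): ", mle_theta(responses))
print("theta (speeded):", mle_theta(speeded))
```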
Kite, Mary E., Ed. – Society for the Teaching of Psychology, 2012
This book compiles several essays about effective evaluation of teaching. Contents of this publication include: (1) Conducting Research on Student Evaluations of Teaching (William E. Addison and Jeffrey R. Stowell); (2) Choosing an Instrument for Student Evaluation of Instruction (Jared W. Keeley); (3) Formative Teaching Evaluations: Is Student…
Descriptors: Feedback (Response), Student Evaluation of Teacher Performance, Online Courses, Teacher Effectiveness
Peer reviewed
Sax, Linda J.; Gilmartin, Shannon K.; Lee, Jenny J.; Hagedorn, Linda Serra – Community College Journal of Research and Practice, 2008
This study was designed to examine response rates and bias among a sample of community college students who received a district-wide survey by standard mail or e-mail. Findings suggest that predictors of response and types of responses are not appreciably different across paper and online mail-out samples when these samples are "matched" in terms…
Descriptors: College Students, Response Style (Tests), Response Rates (Questionnaires), Community Colleges
Peer reviewed
PDF on ERIC
Attali, Yigal – ETS Research Report Series, 2007
This study examined the construct validity of the "e-rater"® automated essay scoring engine as an alternative to human scoring in the context of TOEFL® essay writing. Analyses were based on a sample of students who repeated the TOEFL within a short time period. Two "e-rater" scores were investigated in this study, the first…
Descriptors: Construct Validity, Computer Assisted Testing, Scoring, English (Second Language)