NotesFAQContact Us
Collection
Advanced
Search Tips
Audience
Practitioners1
Laws, Policies, & Programs
What Works Clearinghouse Rating
Showing 1 to 15 of 38 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Ersen, Rabia Karatoprak; Lee, Won-Chan – Journal of Educational Measurement, 2023
The purpose of this study was to compare calibration and linking methods for placing pretest item parameter estimates on the item pool scale in a 1-3 computerized multistage adaptive testing design in terms of item parameter recovery. Two models were used: embedded-section, in which pretest items were administered within a separate module, and…
Descriptors: Pretesting, Test Items, Computer Assisted Testing, Adaptive Testing
Peer reviewed Peer reviewed
Direct linkDirect link
Yuan, Lu; Huang, Yingshi; Li, Shuhang; Chen, Ping – Journal of Educational Measurement, 2023
Online calibration is a key technology for item calibration in computerized adaptive testing (CAT) and has been widely used in various forms of CAT, including unidimensional CAT, multidimensional CAT (MCAT), CAT with polytomously scored items, and cognitive diagnostic CAT. However, as multidimensional and polytomous assessment data become more…
Descriptors: Computer Assisted Testing, Adaptive Testing, Computation, Test Items
Yu Wang – ProQuest LLC, 2024
The multiple-choice (MC) item format has been widely used in educational assessments across diverse content domains. MC items purportedly allow for collecting richer diagnostic information. The effectiveness and economy of administering MC items may have further contributed to their popularity not just in educational assessment. The MC item format…
Descriptors: Multiple Choice Tests, Cognitive Tests, Cognitive Measurement, Educational Diagnosis
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Sahin Kursad, Merve; Cokluk Bokeoglu, Omay; Cikrikci, Rahime Nukhet – International Journal of Assessment Tools in Education, 2022
Item parameter drift (IPD) is the systematic differentiation of parameter values of items over time due to various reasons. If it occurs in computer adaptive tests (CAT), it causes errors in the estimation of item and ability parameters. Identification of the underlying conditions of this situation in CAT is important for estimating item and…
Descriptors: Item Analysis, Computer Assisted Testing, Test Items, Error of Measurement
Peer reviewed Peer reviewed
Direct linkDirect link
Kárász, Judit T.; Széll, Krisztián; Takács, Szabolcs – Quality Assurance in Education: An International Perspective, 2023
Purpose: Based on the general formula, which depends on the length and difficulty of the test, the number of respondents and the number of ability levels, this study aims to provide a closed formula for the adaptive tests with medium difficulty (probability of solution is p = 1/2) to determine the accuracy of the parameters for each item and in…
Descriptors: Test Length, Probability, Comparative Analysis, Difficulty Level
Peer reviewed Peer reviewed
Direct linkDirect link
Han, Areum; Krieger, Florian; Borgonovi, Francesca; Greiff, Samuel – Large-scale Assessments in Education, 2023
Process data are becoming more and more popular in education research. In the field of computer-based assessments of collaborative problem solving (ColPS), process data have been used to identify students' test-taking strategies while working on the assessment, and such data can be used to complement data collected on accuracy and overall…
Descriptors: Behavior Patterns, Cooperative Learning, Problem Solving, Reaction Time
Li, Haiying; Cai, Zhiqiang; Graesser, Arthur – Grantee Submission, 2018
In this study we developed and evaluated a crowdsourcing-based latent semantic analysis (LSA) approach to computerized summary scoring (CSS). LSA is a frequently used mathematical component in CSS, where LSA similarity represents the extent to which the to-be-graded target summary is similar to a model summary or a set of exemplar summaries.…
Descriptors: Computer Assisted Testing, Scoring, Semantics, Evaluation Methods
Yue Huang – ProQuest LLC, 2023
Automated writing evaluation (AWE) is a cutting-edge technology-based intervention designed to help teachers meet their challenges in writing classrooms and improve students' writing proficiency. The fast development of AWE systems, along with the encouragement of technology use in the U.S. K-12 education system by the Common Core State Standards…
Descriptors: Computer Assisted Testing, Writing Tests, Automation, Writing Evaluation
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Eckerly, Carol; Smith, Russell; Sowles, John – Practical Assessment, Research & Evaluation, 2018
The Discrete Option Multiple Choice (DOMC) item format was introduced by Foster and Miller (2009) with the intent of improving the security of test content. However, by changing the amount and order of the content presented, the test taking experience varies by test taker, thereby introducing potential fairness issues. In this paper we…
Descriptors: Culture Fair Tests, Multiple Choice Tests, Testing, Test Items
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Herget, Debbie; Dalton, Ben; Kinney, Saki; Smith, W. Zachary; Wilson, David; Rogers, Jim – National Center for Education Statistics, 2019
The Progress in International Reading Literacy Study (PIRLS) is an international comparative study of student performance in reading literacy at the fourth grade. PIRLS 2016 marks the fourth iteration of the study, which has been conducted every 5 years since 2001. New to the PIRLS assessment in 2016, ePIRLS provides a computer-based extension to…
Descriptors: Achievement Tests, Grade 4, Reading Achievement, Foreign Countries
Peer reviewed Peer reviewed
Direct linkDirect link
Chen, Ping – Journal of Educational and Behavioral Statistics, 2017
Calibration of new items online has been an important topic in item replenishment for multidimensional computerized adaptive testing (MCAT). Several online calibration methods have been proposed for MCAT, such as multidimensional "one expectation-maximization (EM) cycle" (M-OEM) and multidimensional "multiple EM cycles"…
Descriptors: Test Items, Item Response Theory, Test Construction, Adaptive Testing
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Sahin, Alper; Weiss, David J. – Educational Sciences: Theory and Practice, 2015
This study aimed to investigate the effects of calibration sample size and item bank size on examinee ability estimation in computerized adaptive testing (CAT). For this purpose, a 500-item bank pre-calibrated using the three-parameter logistic model with 10,000 examinees was simulated. Calibration samples of varying sizes (150, 250, 350, 500,…
Descriptors: Adaptive Testing, Computer Assisted Testing, Sample Size, Item Banks
Peer reviewed Peer reviewed
Direct linkDirect link
Zehner, Fabian; Sälzer, Christine; Goldhammer, Frank – Educational and Psychological Measurement, 2016
Automatic coding of short text responses opens new doors in assessment. We implemented and integrated baseline methods of natural language processing and statistical modelling by means of software components that are available under open licenses. The accuracy of automatic text coding is demonstrated by using data collected in the "Programme…
Descriptors: Educational Assessment, Coding, Automation, Responses
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Fask, Alan; Englander, Fred; Wang, Zhaobo – Practical Assessment, Research & Evaluation, 2015
There has been a remarkable growth in distance learning courses in higher education. Despite indications that distance learning courses are more vulnerable to cheating behavior than traditional courses, there has been little research studying whether online exams facilitate a relatively greater level of cheating. This article examines this issue…
Descriptors: Distance Education, Introductory Courses, Statistics, Cheating
Peer reviewed Peer reviewed
Direct linkDirect link
Price, Paul C.; Kimura, Nicole M.; Smith, Andrew R.; Marshall, Lindsay D. – Journal of Experimental Psychology: Learning, Memory, and Cognition, 2014
Previous research has shown that people exhibit a sample size bias when judging the average of a set of stimuli on a single dimension. The more stimuli there are in the set, the greater people judge the average to be. This effect has been demonstrated reliably for judgments of the average likelihood that groups of people will experience negative,…
Descriptors: Sample Size, Statistical Bias, Visual Perception, Pictorial Stimuli
Previous Page | Next Page »
Pages: 1  |  2  |  3