Showing all 12 results
Peer reviewed
Choi, Youn-Jeng; Asilkalkan, Abdullah – Measurement: Interdisciplinary Research and Perspectives, 2019
About 45 R packages to analyze data using item response theory (IRT) have been developed over the last decade. This article introduces these 45 R packages with their descriptions and features. It also describes possible advanced IRT models using R packages, as well as dichotomous and polytomous IRT models, and R packages that contain applications…
Descriptors: Item Response Theory, Data Analysis, Computer Software, Test Bias
Peer reviewed
Gorbunova, Tatiana N. – European Journal of Contemporary Education, 2017
The research aims to build methodologies for evaluating student knowledge through testing. The author points to the importance of feedback about the level of mastery in the learning process. Testing is considered as a tool. The object of the study is to create test system models for defence practice problems. Special attention is paid…
Descriptors: Testing, Evaluation Methods, Feedback (Response), Simulation
Peer reviewed
Uto, Masaki; Ueno, Maomi – IEEE Transactions on Learning Technologies, 2016
As an assessment method based on a constructivist approach, peer assessment has become popular in recent years. However, in peer assessment, a problem remains that reliability depends on the rater characteristics. For this reason, some item response models that incorporate rater parameters have been proposed. Those models are expected to improve…
Descriptors: Item Response Theory, Peer Evaluation, Bayesian Statistics, Simulation
Peer reviewed
Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2014
A method for medical screening is adapted to differential item functioning (DIF). Its essential elements are explicit declarations of the level of DIF that is acceptable and of the loss function that quantifies the consequences of the two kinds of inappropriate classification of an item. Instead of a single level and a single function, sets of…
Descriptors: Test Items, Test Bias, Simulation, Hypothesis Testing
Goldhaber, Dan; Chaplin, Duncan – Center for Education Data & Research, 2012
In a provocative and influential paper, Jesse Rothstein (2010) finds that standard value added models (VAMs) suggest implausible future teacher effects on past student achievement, a finding that obviously cannot be viewed as causal. This is the basis of a falsification test (the Rothstein falsification test) that appears to indicate bias in VAM…
Descriptors: School Effectiveness, Teacher Effectiveness, Achievement Gains, Statistical Bias
Peer reviewed
Feldman, Moshe; Lazzara, Elizabeth H.; Vanderbilt, Allison A.; DiazGranados, Deborah – Journal of Continuing Education in the Health Professions, 2012
Competency-based assessment and an emphasis on obtaining higher-level outcomes that reflect physicians' ability to demonstrate their skills have created a need for more advanced assessment practices. Simulation-based assessments provide medical education planners with tools to better evaluate the 6 Accreditation Council for Graduate Medical…
Descriptors: Performance Based Assessment, Physicians, Accuracy, High Stakes Tests
Peer reviewed
Lee, Won-Chan – Applied Psychological Measurement, 2007
This article introduces a multinomial error model, which models an examinee's test scores obtained over repeated measurements of an assessment that consists of polytomously scored items. A compound multinomial error model is also introduced for situations in which items are stratified according to content categories and/or prespecified numbers of…
Descriptors: Simulation, Error of Measurement, Scoring, Test Items
Peer reviewed
Galindo-Garre, Francisca; Vermunt, Jeroen K. – Psychometrika, 2004
This paper presents a row-column (RC) association model in which the estimated row and column scores are forced to be in agreement with a priori specified ordering. Two efficient algorithms for finding the order-restricted maximum likelihood (ML) estimates are proposed and their reliability under different degrees of association is investigated by…
Descriptors: Mathematics, Test Reliability, Computation, Testing
Peer reviewed
Segall, Daniel O. – Journal of Educational and Behavioral Statistics, 2004
A new sharing item response theory (SIRT) model is presented that explicitly models the effects of sharing item content between informants and test takers. This model is used to construct adaptive item selection and scoring rules that provide increased precision and reduced score gains in instances where sharing occurs. The adaptive item selection…
Descriptors: Scoring, Item Analysis, Item Response Theory, Adaptive Testing
Peer reviewed
Atkins, David C.; Bedics, Jamie D.; Mcglinchey, Joseph B.; Beauchaine, Theodore P. – Journal of Consulting and Clinical Psychology, 2005
Measures of clinical significance are frequently used to evaluate client change during therapy. Several alternatives to the original method devised by N. S. Jacobson, W. C. Follette, & D. Revenstorf (1984) have been proposed, each purporting to increase accuracy. However, researchers have had little systematic guidance in choosing among…
Descriptors: Psychotherapy, Statistical Significance, Outcomes of Treatment, Behavior Change
Peer reviewed
Clyman, Stephen G.; Orr, Nancy A. – Academic Medicine, 1990
The process proposed for the development and use of computer-based testing, including simulation and multiple-choice questions, as part of the National Board of Medical Examiners' certification sequence is outlined. Summary reports of first-phase pilot testing in six medical schools are appended. (MSE)
Descriptors: Computer Assisted Testing, Higher Education, Licensing Examinations (Professions), Medical Education
Lomask, Michal S.; And Others – 1993
An experimental Interactive Video Disc (IVD) assessment program, funded partially by the National Science Foundation, was developed to assess science teachers' knowledge of safe management of lab facilities and activities. The IVD program contains two phases: (1) panoramic view of the lab room, including safety equipment and storage of chemicals;…
Descriptors: Evaluation Methods, High Schools, Interactive Video, Junior High School Students