Eray Selçuk; Ergül Demir – International Journal of Assessment Tools in Education, 2024
This research aims to compare the ability and item parameter estimations of Item Response Theory according to maximum likelihood and Bayesian approaches in different Monte Carlo simulation conditions. For this purpose, depending on the changes in the prior distribution type, sample size, test length, and logistic model, the ability and item…
Descriptors: Item Response Theory, Item Analysis, Test Items, Simulation
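The maximum likelihood versus Bayesian contrast this abstract describes can be illustrated with a minimal sketch, assuming a 2PL model, a grid-search estimator, and a normal prior for the Bayesian (MAP) case; the item parameters and response pattern below are hypothetical, not taken from the study:

```python
import math

def p_correct(theta, a, b):
    """2PL item response function: probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, responses, items):
    """Log-likelihood of a scored response pattern at ability theta."""
    ll = 0.0
    for u, (a, b) in zip(responses, items):
        p = p_correct(theta, a, b)
        ll += u * math.log(p) + (1 - u) * math.log(1 - p)
    return ll

def estimate_theta(responses, items, prior_sd=None):
    """Grid-search ability estimate: ML if prior_sd is None, otherwise
    MAP with a normal(0, prior_sd) prior added to the log-likelihood."""
    grid = [g / 100.0 for g in range(-400, 401)]  # theta in [-4, 4]
    def objective(theta):
        ll = log_likelihood(theta, responses, items)
        if prior_sd is not None:
            ll += -0.5 * (theta / prior_sd) ** 2  # log normal prior, up to a constant
        return ll
    return max(grid, key=objective)

# Toy comparison: five 2PL items (a, b), a mostly-correct response pattern.
items = [(1.2, -1.0), (0.8, -0.5), (1.0, 0.0), (1.5, 0.5), (0.9, 1.0)]
responses = [1, 1, 1, 1, 0]
mle = estimate_theta(responses, items)               # maximum likelihood
map_ = estimate_theta(responses, items, prior_sd=1)  # Bayesian MAP
```

With a mostly-correct pattern the ML estimate is positive, and the prior shrinks the MAP estimate toward zero — the kind of difference the simulation conditions above are designed to quantify.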
Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022
The computerized adaptive tests (CAT) apply an adaptive process in which the items are tailored to individuals' ability scores. The multidimensional CAT (MCAT) designs differ in terms of different item selection, ability estimation, and termination methods being used. This study aims at investigating the performance of the MCAT designs used to…
Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency
Luo, Xiao; Kim, Doyoung – Journal of Educational Measurement, 2018
The top-down approach to designing a multistage test is relatively understudied in the literature and underused in research and practice. This study introduced a route-based top-down design approach that directly sets design parameters at the test level and utilizes the advanced automated test assembly algorithm seeking global optimality. The…
Descriptors: Computer Assisted Testing, Test Construction, Decision Making, Simulation
The AI Teacher Test: Measuring the Pedagogical Ability of Blender and GPT-3 in Educational Dialogues
Tack, Anaïs; Piech, Chris – International Educational Data Mining Society, 2022
How can we test whether state-of-the-art generative models, such as Blender and GPT-3, are good AI teachers, capable of replying to a student in an educational dialogue? Designing an AI teacher test is challenging: although evaluation methods are much-needed, there is no off-the-shelf solution to measuring pedagogical ability. This paper reports…
Descriptors: Artificial Intelligence, Dialogs (Language), Bayesian Statistics, Decision Making
Abdullah, Mahmoud M. S.; Abdel-Gawad, Rehab A. El-sayed; Ibrahim, Ibrahim Badry Marwany – Online Submission, 2023
This study investigated the effectiveness of using a social learning programme facilitated by Facebook to develop some creative writing skills and motivation to learn English among secondary-one school students. Seventy students in secondary-one grade in Al-Shahid Hussein A. Abdul-Raouf Mixed Secondary School in Al-Maabda in the second semester of…
Descriptors: Creative Writing, Teaching Methods, English (Second Language), Second Language Learning
Fitzpatrick, Joseph; Skorupski, William P. – Journal of Educational Measurement, 2016
The equating performance of two internal anchor test structures--miditests and minitests--is studied for four IRT equating methods using simulated data. Originally proposed by Sinharay and Holland, miditests are anchors that have the same mean difficulty as the overall test but less variance in item difficulties. Four popular IRT equating methods…
Descriptors: Difficulty Level, Test Items, Comparative Analysis, Test Construction
Hula, William D.; Kellough, Stacey; Fergadiotis, Gerasimos – Journal of Speech, Language, and Hearing Research, 2015
Purpose: The purpose of this study was to develop a computerized adaptive test (CAT) version of the Philadelphia Naming Test (PNT; Roach, Schwartz, Martin, Grewal, & Brecher, 1996), to reduce test length while maximizing measurement precision. This article is a direct extension of a companion article (Fergadiotis, Kellough, & Hula, 2015),…
Descriptors: Computer Assisted Testing, Adaptive Testing, Naming, Test Construction
Green, Jeffrey J.; Stone, Courtenay Clifford; Zegeye, Abera – Journal of Education for Business, 2014
Colleges and universities are being asked by numerous sources to provide assurance of learning assessments of their students and programs. Colleges of business have responded by using a plethora of assessment tools, including the Major Field Test in Business. In this article, the authors show that the use of the Major Field Test in Business for…
Descriptors: Business Administration Education, Student Evaluation, Accreditation (Institutions), Comparative Analysis
Amin, Bunga Dara; Mahmud, Alimuddin; Muris – Journal of Education and Practice, 2016
This research aims to produce a learning instrument based on hypermedia which is valid, interesting, practical, and effective, as well as to determine its influence on the problem-based skills of students of the Faculty of Mathematics and Science, Makassar State University. This research is of the research and development (R&D) type. The development procedure…
Descriptors: Test Construction, Science Tests, Physics, Hypermedia
Chen, Pei-Hua; Chang, Hua-Hua; Wu, Haiyan – Educational and Psychological Measurement, 2012
Two sampling-and-classification-based procedures were developed for automated test assembly: the Cell Only and the Cell and Cube methods. A simulation study based on a 540-item bank was conducted to compare the performance of the procedures with the performance of a mixed-integer programming (MIP) method for assembling multiple parallel test…
Descriptors: Test Items, Selection, Test Construction, Item Response Theory
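As a rough stand-in for the mixed-integer programming (MIP) assembly the abstract compares against, a greedy heuristic conveys the core idea of automated test assembly: rank items by Fisher information at a target ability and fill the form subject to content constraints. The small bank, content areas, and cap below are hypothetical, not drawn from the study's 540-item bank:

```python
import math

def fisher_info(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def greedy_assemble(bank, form_length, theta=0.0, max_per_content=None):
    """Greedily pick items maximizing information at theta, optionally
    capping how many items may come from each content area."""
    chosen, counts = [], {}
    ranked = sorted(range(len(bank)),
                    key=lambda i: -fisher_info(theta, bank[i]["a"], bank[i]["b"]))
    for i in ranked:
        c = bank[i]["content"]
        if max_per_content is not None and counts.get(c, 0) >= max_per_content:
            continue  # content area already at its cap
        chosen.append(i)
        counts[c] = counts.get(c, 0) + 1
        if len(chosen) == form_length:
            break
    return chosen

# Hypothetical 6-item bank spanning two content areas.
bank = [
    {"a": 1.5, "b": 0.0,  "content": "algebra"},
    {"a": 1.4, "b": 0.1,  "content": "algebra"},
    {"a": 1.3, "b": -0.1, "content": "algebra"},
    {"a": 1.0, "b": 0.2,  "content": "geometry"},
    {"a": 0.9, "b": -0.2, "content": "geometry"},
    {"a": 0.5, "b": 1.5,  "content": "geometry"},
]
form = greedy_assemble(bank, form_length=4, max_per_content=2)
```

A true MIP formulation would instead solve for global optimality across all forms at once, which is exactly the property the sampling-and-classification procedures above are benchmarked against.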
Wang, Lin; Qian, Jiahe; Lee, Yi-Hsuan – ETS Research Report Series, 2013
The purpose of this study was to evaluate the combined effects of reduced equating sample size and shortened anchor test length on item response theory (IRT)-based linking and equating results. Data from two independent operational forms of a large-scale testing program were used to establish the baseline results for evaluating the results from…
Descriptors: Test Construction, Item Response Theory, Testing Programs, Simulation
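The anchor-based IRT linking step this report evaluates can be illustrated with a mean/sigma sketch — one classical linking method among several; the anchor-item difficulties below are hypothetical:

```python
import statistics

def mean_sigma_link(anchor_b_new, anchor_b_old):
    """Mean/sigma linking: find slope A and intercept B that place
    new-form difficulties on the old form's scale, b_old ~ A*b_new + B."""
    A = statistics.stdev(anchor_b_old) / statistics.stdev(anchor_b_new)
    B = statistics.mean(anchor_b_old) - A * statistics.mean(anchor_b_new)
    return A, B

# Difficulties of the same anchor items calibrated on each form (hypothetical).
b_new = [-1.0, -0.3, 0.2, 0.8, 1.4]
b_old = [-0.8, -0.1, 0.4, 1.0, 1.6]
A, B = mean_sigma_link(b_new, b_old)
```

Shrinking the anchor set or the calibration samples, as the study does, makes these two summary statistics noisier, which propagates directly into the linking constants and the equated scores.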
Finkelman, Matthew D.; Kim, Wonsuk; Roussos, Louis; Verschoor, Angela – Applied Psychological Measurement, 2010
Automated test assembly (ATA) has been an area of prolific psychometric research. Although ATA methodology is well developed for unidimensional models, its application alongside cognitive diagnosis models (CDMs) is a burgeoning topic. Two suggested procedures for combining ATA and CDMs are to maximize the cognitive diagnostic index and to use a…
Descriptors: Automation, Test Construction, Programming, Models
Tian, Feng – ProQuest LLC, 2011
There has been a steady increase in the use of mixed-format tests, that is, tests consisting of both multiple-choice items and constructed-response items in both classroom and large-scale assessments. This calls for appropriate equating methods for such tests. As Item Response Theory (IRT) has rapidly become mainstream as the theoretical basis for…
Descriptors: Item Response Theory, Comparative Analysis, Equated Scores, Statistical Analysis
Chen, Tzu-An – ProQuest LLC, 2010
This simulation study compared the performance of two multilevel measurement testlet (MMMT) models: Beretvas and Walker's (2008) two-level MMMT model and Jiao, Wang, and Kamata's (2005) three-level model. Several conditions were manipulated (including testlet length, sample size, and the pattern of the testlet effects) to assess the impact on the…
Descriptors: Simulation, Item Response Theory, Comparative Analysis, Models
Murphy, Daniel L.; Dodd, Barbara G.; Vaughn, Brandon K. – Applied Psychological Measurement, 2010
This study examined the performance of the maximum Fisher's information, the maximum posterior weighted information, and the minimum expected posterior variance methods for selecting items in a computerized adaptive testing system when the items were grouped in testlets. A simulation study compared the efficiency of ability estimation among the…
Descriptors: Simulation, Adaptive Testing, Item Analysis, Item Response Theory
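The maximum Fisher information selection rule studied here can be sketched (without the testlet grouping) as a simple CAT loop, assuming a 2PL model, a grid-search ML ability update, and a deterministic toy respondent; the item bank is hypothetical:

```python
import math

def prob(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def fisher_info(theta, a, b):
    """Fisher information of a 2PL item at theta."""
    p = prob(theta, a, b)
    return a * a * p * (1.0 - p)

def ml_theta(responses, items):
    """Grid-search maximum-likelihood ability estimate on [-3, 3]."""
    grid = [g / 50.0 for g in range(-150, 151)]
    def ll(theta):
        total = 0.0
        for u, (a, b) in zip(responses, items):
            p = prob(theta, a, b)
            total += u * math.log(p) + (1 - u) * math.log(1 - p)
        return total
    return max(grid, key=ll)

def run_cat(bank, n_items, answer):
    """Administer n_items adaptively: at each step pick the unused item
    with maximum information at the current ability estimate."""
    theta, used, responses, items = 0.0, set(), [], []
    for _ in range(n_items):
        nxt = max((i for i in range(len(bank)) if i not in used),
                  key=lambda i: fisher_info(theta, *bank[i]))
        used.add(nxt)
        items.append(bank[nxt])
        responses.append(answer(bank[nxt]))
        theta = ml_theta(responses, items)
    return theta, used

# Deterministic toy respondent: correct iff item difficulty is below 0.7.
bank = [(1.0, b / 10.0) for b in range(-20, 21, 4)]  # b from -2.0 to 2.0
theta_hat, used = run_cat(bank, n_items=6, answer=lambda item: 1 if item[1] < 0.7 else 0)
```

The testlet variants compared in the article replace the item-level information criterion with posterior-weighted or posterior-variance criteria evaluated over grouped items, but the administer-update-select loop has this same shape.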