Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 2
Since 2016 (last 10 years): 8
Since 2006 (last 20 years): 20
Descriptor
Simulation: 65
Testing Problems: 65
Test Items: 20
Computer Assisted Testing: 19
Adaptive Testing: 15
Test Construction: 15
Item Response Theory: 14
Evaluation Methods: 12
Scores: 10
Scoring: 9
Testing: 9
Author
Sinharay, Sandip: 5
Davey, Tim: 3
Stocking, Martha L.: 3
Choi, Seung W.: 2
Guarino, Cassandra M.: 2
Kim, Dong-In: 2
Parshall, Cynthia G.: 2
Pommerich, Mary: 2
Reckase, Mark D.: 2
Robitzsch, Alexander: 2
Stacy, Brian W.: 2
Education Level
Elementary Secondary Education: 2
Secondary Education: 2
Audience
Researchers: 1
Assessments and Surveys
Indiana Statewide Testing for…: 2
Program for International…: 2
SAT (College Admission Test): 2
Armed Services Vocational…: 1
Minnesota Teacher Attitude…: 1
National Assessment of…: 1
Test of English as a Foreign…: 1
Lee, Chansoon; Qian, Hong – Educational and Psychological Measurement, 2022
Using classical test theory and item response theory, this study applied sequential procedures to a real operational item pool in variable-length computerized adaptive testing (CAT) to detect items whose security may be compromised. Moreover, this study proposed a hybrid threshold approach to improve the detection power of the sequential…
Descriptors: Computer Assisted Testing, Adaptive Testing, Licensing Examinations (Professions), Item Response Theory
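As a hedged illustration of the general idea behind such item-security monitoring (not the authors' procedure): under an item's calibrated IRT parameters, a compromised item tends to be answered correctly more often than the model predicts, so a running standardized difference between observed and expected correct responses can be checked against a threshold. All parameter values below are invented for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def p_correct(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# Hypothetical calibrated parameters for one operational item.
a, b = 1.2, 0.3

# Simulate a stream of examinees; after index 500 the item is "leaked",
# so examinees answer it correctly more often than the model predicts.
n = 1000
theta = rng.normal(size=n)
p = p_correct(theta, a, b)
p_leaked = np.where(np.arange(n) >= 500, np.minimum(p + 0.25, 0.98), p)
x = rng.binomial(1, p_leaked)

# Sequential monitoring: cumulative standardized excess of observed over
# expected correct responses (a CUSUM-like statistic, for illustration only).
excess = np.cumsum(x - p)
se = np.sqrt(np.cumsum(p * (1 - p)))
z = excess / se

threshold = 3.0   # illustrative flagging threshold
flag_at = np.argmax(z > threshold) if np.any(z > threshold) else None
print("item flagged at examinee index:", flag_at)
```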
Robitzsch, Alexander; Lüdtke, Oliver – Large-scale Assessments in Education, 2023
One major aim of international large-scale assessments (ILSA) like PISA is to monitor changes in student performance over time. To accomplish this task, a set of common items (i.e., link items) is repeatedly administered in each assessment. Linking methods based on item response theory (IRT) models are used to align the results from the different…
Descriptors: Educational Trends, Trend Analysis, International Assessment, Achievement Tests
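As a minimal sketch of common-item linking (one standard approach, not necessarily the specific methods the authors compare), a mean-sigma transformation puts one cycle's item difficulties onto another cycle's scale using the link items; the difficulty values below are invented.

```python
import numpy as np

# Hypothetical 2PL difficulty estimates for the same link items
# calibrated separately in two assessment cycles.
b_old = np.array([-1.2, -0.4, 0.1, 0.8, 1.5])
b_new = np.array([-0.9, -0.1, 0.4, 1.1, 1.9])

# Mean-sigma linking constants: theta_old = A * theta_new + B.
A = b_old.std(ddof=1) / b_new.std(ddof=1)
B = b_old.mean() - A * b_new.mean()

# Transform the new calibration onto the old scale.
b_new_linked = A * b_new + B
print("A =", round(A, 3), "B =", round(B, 3))
print("linked difficulties:", np.round(b_new_linked, 3))
```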
Chun Wang; Ping Chen; Shengyu Jiang – Journal of Educational Measurement, 2020
Many large-scale educational surveys have moved from linear form design to multistage testing (MST) design. One advantage of MST is that it can provide more accurate latent trait [theta] estimates using fewer items than required by linear tests. However, MST generates incomplete response data by design; hence, questions remain as to how to…
Descriptors: Test Construction, Test Items, Adaptive Testing, Maximum Likelihood Statistics
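A minimal sketch of the estimation issue raised here: in MST only a subset of items is administered, so the likelihood for an examinee's latent trait uses just the items they saw. Everything below (item parameters, responses) is invented for illustration and is not the authors' estimator.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def p2pl(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# Hypothetical 2PL item pool; NaN marks items not administered on this
# examinee's MST route, so they simply drop out of the likelihood.
a = np.array([1.0, 1.4, 0.8, 1.2, 1.1, 0.9])
b = np.array([-1.0, -0.3, 0.0, 0.5, 1.0, 1.6])
resp = np.array([1, 1, np.nan, 0, np.nan, 0], dtype=float)

seen = ~np.isnan(resp)

def neg_loglik(theta):
    p = p2pl(theta, a[seen], b[seen])
    x = resp[seen]
    return -np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))

# Maximum likelihood estimate of theta from the administered items only.
res = minimize_scalar(neg_loglik, bounds=(-4, 4), method="bounded")
print("theta_hat =", round(res.x, 3))
```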
Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2018
Wollack, Cohen, and Eckerly suggested the "erasure detection index" (EDI) to detect fraudulent erasures for individual examinees. Wollack and Eckerly extended the EDI to detect fraudulent erasures at the group level. The EDI at the group level was found to be slightly conservative. This article suggests two modifications of the EDI for…
Descriptors: Deception, Identification, Testing Problems, Cheating
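A hedged sketch of the general idea behind standardized erasure indices (illustrative only; not the published EDI formula): compare the observed count of wrong-to-right erasures with its model-implied expectation and standardize, with a continuity correction as one common conservative adjustment. All values below are invented.

```python
import numpy as np

def p2pl(theta, a, b):
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# Hypothetical data for one group: for each erased response we have the
# 2PL probability that the post-erasure answer would be correct anyway.
a = np.array([1.1, 0.9, 1.3, 1.0, 1.2, 0.8, 1.1, 1.0])
b = np.array([0.2, -0.5, 0.7, 0.0, 1.1, -0.2, 0.4, 0.9])
theta = np.array([0.1, 0.1, -0.3, -0.3, 0.5, 0.5, 0.0, 0.0])
wtr = np.array([1, 0, 1, 1, 0, 1, 1, 1])   # observed wrong-to-right erasures

p = p2pl(theta, a, b)

# Standardized excess of wrong-to-right erasures over expectation, with a
# 0.5 continuity correction (illustrative; not the exact EDI).
stat = (wtr.sum() - p.sum() - 0.5) / np.sqrt(np.sum(p * (1 - p)))
print("standardized erasure statistic:", round(stat, 3))
```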
Schneider, W. Joel; Roman, Zachary – Journal of Psychoeducational Assessment, 2018
We used data simulations to test whether composites consisting of cohesive subtest scores are more accurate than composites consisting of divergent subtest scores. We demonstrate that when multivariate normality holds, divergent and cohesive scores are equally accurate. Furthermore, excluding divergent scores results in biased estimates of…
Descriptors: Statistical Data, Simulation, Testing, Scores
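A small simulation sketch of the comparison described, under invented parameters: draw subtest scores that measure a common true score with independent normal errors, split examinees into cohesive and divergent profiles by subtest spread, and check that the composite is about equally accurate in both groups (since the sample mean and variance are independent under normality).

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate 100,000 examinees, each with 4 subtest scores that measure the
# same true score with independent normal errors (invented reliabilities).
n, k = 100_000, 4
true = rng.normal(size=n)
scores = true[:, None] + rng.normal(scale=0.5, size=(n, k))

composite = scores.mean(axis=1)
spread = scores.std(axis=1, ddof=1)

# Split examinees into "cohesive" and "divergent" profiles by subtest spread.
divergent = spread > np.median(spread)

def rmse(err):
    return np.sqrt(np.mean(err ** 2))

print("RMSE, cohesive profiles :", round(rmse(composite[~divergent] - true[~divergent]), 4))
print("RMSE, divergent profiles:", round(rmse(composite[divergent] - true[divergent]), 4))
# Under normality the two RMSEs come out essentially equal.
```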
Sinharay, Sandip – Grantee Submission, 2017
Wollack, Cohen, and Eckerly (2015) suggested the "erasure detection index" (EDI) to detect fraudulent erasures for individual examinees. Wollack and Eckerly (2017) extended the EDI to detect fraudulent erasures at the group level. The EDI at the group level was found to be slightly conservative. This paper suggests two modifications of…
Descriptors: Deception, Identification, Testing Problems, Cheating
Sinharay, Sandip – Applied Measurement in Education, 2017
Karabatsos compared the power of 36 person-fit statistics using receiver operating characteristic curves and found the "H[superscript T]" statistic to be the most powerful in identifying aberrant examinees. He found three statistics, "C", "MCI", and "U3", to be the next most powerful. These four statistics,…
Descriptors: Nonparametric Statistics, Goodness of Fit, Simulation, Comparative Analysis
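As a hedged illustration of the kind of nonparametric person-fit measure compared in this line of work, the sketch below counts Guttman errors, a simpler relative of U3 and H^T rather than the H^T computation itself; the p-values and response patterns are invented.

```python
import numpy as np

def guttman_errors(x, p_values):
    """Count item pairs (easy, hard) where the easier item is missed
    but the harder item is answered correctly."""
    order = np.argsort(-p_values)          # easiest (highest p-value) first
    x_sorted = x[order]
    errors = 0
    for i in range(len(x_sorted)):
        for j in range(i + 1, len(x_sorted)):
            if x_sorted[i] == 0 and x_sorted[j] == 1:
                errors += 1
    return errors

# Hypothetical classical p-values and two response patterns of equal score.
p_vals = np.array([0.9, 0.8, 0.7, 0.5, 0.4, 0.2])
typical_pattern = np.array([1, 1, 1, 1, 0, 0])    # consistent with difficulty
aberrant_pattern = np.array([0, 0, 1, 1, 1, 1])   # misses the easiest items

print("Guttman errors, typical examinee :", guttman_errors(typical_pattern, p_vals))
print("Guttman errors, aberrant examinee:", guttman_errors(aberrant_pattern, p_vals))
```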
Guarino, Cassandra M.; Reckase, Mark D.; Stacy, Brian W.; Wooldridge, Jeffrey M. – Journal of Research on Educational Effectiveness, 2015
We study the properties of two specification tests that have been applied to a variety of estimators in the context of value-added measures (VAMs) of teacher and school quality: the Hausman test for choosing between student-level random and fixed effects, and a test for feedback (sometimes called a "falsification test"). We discuss…
Descriptors: Teacher Effectiveness, Educational Quality, Evaluation Methods, Tests
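A hedged sketch of the Hausman statistic mentioned here, computed from already-estimated fixed-effects and random-effects coefficients and covariance matrices; the numbers are invented and this is not the authors' VAM application.

```python
import numpy as np
from scipy.stats import chi2

# Hypothetical coefficient estimates and covariance matrices from a
# fixed-effects and a random-effects fit of the same value-added model.
b_fe = np.array([0.42, -0.10])
b_re = np.array([0.38, -0.06])
V_fe = np.array([[0.0040, 0.0004],
                 [0.0004, 0.0025]])
V_re = np.array([[0.0025, 0.0002],
                 [0.0002, 0.0016]])

# Hausman statistic: H = d' (V_fe - V_re)^{-1} d, chi-square with k df.
d = b_fe - b_re
H = d @ np.linalg.inv(V_fe - V_re) @ d
p_value = chi2.sf(H, df=len(d))
print("H =", round(H, 3), "p =", round(p_value, 4))
# A small p-value favors fixed effects; a large one permits random effects.
```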
Debeer, Dries; Janssen, Rianne; De Boeck, Paul – Journal of Educational Measurement, 2017
When dealing with missing responses, two types of omissions can be discerned: items can be skipped or not reached by the test taker. When the occurrence of these omissions is related to the proficiency process, the missingness is nonignorable. The purpose of this article is to present a tree-based IRT framework for modeling responses and omissions…
Descriptors: Item Response Theory, Test Items, Responses, Testing Problems
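A minimal sketch of the tree-recoding idea behind IRTree-style models for omissions: each observed response is split into pseudo-items, one for whether the item was attempted and one for correctness given an attempt, while not-reached items stay missing at both nodes. The response codes below are invented for the illustration.

```python
import numpy as np

# Illustrative response codes: 1 = correct, 0 = incorrect,
# -1 = skipped (omitted), -9 = not reached.
responses = np.array([
    [1, 0, -1, 1, -9, -9],
    [1, 1,  0, -1, 0, -9],
])

def irtree_recode(resp):
    """Split each response into two pseudo-items:
    node 1: attempted (1) vs skipped (0);
    node 2: correct (1) vs incorrect (0), defined only when attempted.
    Not-reached responses stay missing at both nodes."""
    attempted = np.where(resp == -9, np.nan,
                         np.where(resp == -1, 0.0, 1.0))
    correct = np.where((resp == 1) | (resp == 0), resp.astype(float), np.nan)
    return attempted, correct

node1, node2 = irtree_recode(responses)
print("node 1 (attempt):\n", node1)
print("node 2 (correct | attempted):\n", node2)
# Each node matrix can then be fit with an ordinary IRT model.
```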
Sinharay, Sandip; Wan, Ping; Choi, Seung W.; Kim, Dong-In – Journal of Educational Measurement, 2015
With an increase in the number of online tests, the number of interruptions during testing due to unexpected technical issues seems to be on the rise. For example, interruptions occurred during several recent state tests. When interruptions occur, it is important to determine the extent of their impact on the examinees' scores. Researchers such as…
Descriptors: Computer Assisted Testing, Testing Problems, Scores, Statistical Analysis
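A hedged sketch of one regression-based way to gauge interruption impact, in the spirit of (but not identical to) the analyses discussed: predict later section scores from earlier ones using the uninterrupted group, then inspect the interrupted examinees' residuals. All data below are simulated for the sketch.

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented data: section scores before and after the point of interruption.
n = 2000
pre = rng.normal(50, 10, size=n)
post = 0.8 * pre + rng.normal(10, 6, size=n)
interrupted = rng.random(size=n) < 0.1
post[interrupted] -= 2.0          # built-in score drop for the sketch

# Fit the prediction line on uninterrupted examinees only.
slope, intercept = np.polyfit(pre[~interrupted], post[~interrupted], 1)
residuals = post - (slope * pre + intercept)

print("mean residual, uninterrupted:", round(residuals[~interrupted].mean(), 2))
print("mean residual, interrupted  :", round(residuals[interrupted].mean(), 2))
# A markedly negative mean residual for the interrupted group suggests impact.
```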
Sinharay, Sandip; Wan, Ping; Whitaker, Mike; Kim, Dong-In; Zhang, Litong; Choi, Seung W. – Journal of Educational Measurement, 2014
With an increase in the number of online tests, interruptions during testing due to unexpected technical issues seem unavoidable. For example, interruptions occurred during several recent state tests. When interruptions occur, it is important to determine the extent of their impact on the examinees' scores. There is a lack of research on this…
Descriptors: Computer Assisted Testing, Testing Problems, Scores, Regression (Statistics)
Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2014
A method for medical screening is adapted to differential item functioning (DIF). Its essential elements are explicit declarations of the level of DIF that is acceptable and of the loss function that quantifies the consequences of the two kinds of inappropriate classification of an item. Instead of a single level and a single function, sets of…
Descriptors: Test Items, Test Bias, Simulation, Hypothesis Testing
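As a hedged illustration of the standard DIF quantification to which such loss-based classification could be attached (a Mantel-Haenszel odds ratio on the ETS delta scale, with a simple size rule standing in for the author's loss-function screening; all counts invented):

```python
import numpy as np

# Invented 2x2 tables per ability stratum: columns are
# [ref_correct, ref_incorrect, focal_correct, focal_incorrect].
strata = np.array([
    [80, 20, 60, 40],
    [70, 30, 55, 45],
    [50, 50, 35, 65],
])

A, B, C, D = strata.T          # reference right/wrong, focal right/wrong
N = strata.sum(axis=1)

# Mantel-Haenszel common odds ratio and the ETS delta-scale effect size.
alpha_mh = np.sum(A * D / N) / np.sum(B * C / N)
delta_mh = -2.35 * np.log(alpha_mh)

# Simple ETS-style size classification (A/B/C) as the decision to which a
# loss function could attach different misclassification costs.
size = abs(delta_mh)
category = "A" if size < 1.0 else ("B" if size < 1.5 else "C")
print("alpha_MH =", round(alpha_mh, 3),
      "delta_MH =", round(delta_mh, 3),
      "class:", category)
```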
Feinberg, Richard A. – ProQuest LLC, 2012
Subscores, also known as domain scores, diagnostic scores, or trait scores, can help determine test-takers' relative strengths and weaknesses and appropriately focus remediation. However, subscores often have poor psychometric properties, particularly reliability and distinctiveness (Folske, Gessaroli, & Swanson, 1999; Monaghan, 2006;…
Descriptors: Simulation, Tests, Testing, Scores
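A minimal sketch of the reliability concern the abstract raises: a subscore built from a few items typically has much lower internal consistency than the full test. The sketch computes Cronbach's alpha on invented simulated data and is not the dissertation's methodology.

```python
import numpy as np

rng = np.random.default_rng(3)

def cronbach_alpha(items):
    """Cronbach's alpha for an examinees-by-items score matrix."""
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

# Simulate 40 dichotomous items driven by one latent trait (invented model).
n, k = 5000, 40
theta = rng.normal(size=n)
b = rng.normal(size=k)
p = 1 / (1 + np.exp(-(theta[:, None] - b[None, :])))
x = rng.binomial(1, p)

print("alpha, full 40-item test:", round(cronbach_alpha(x), 3))
print("alpha, 5-item subscore  :", round(cronbach_alpha(x[:, :5]), 3))
# The subscore's much lower alpha illustrates why subscores are often too
# unreliable to report on their own.
```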
Guarino, Cassandra M.; Reckase, Mark D.; Stacy, Brian W.; Wooldridge, Jeffrey M. – Education Policy Center at Michigan State University, 2014
We study the properties of two specification tests that have been applied to a variety of estimators in the context of value-added measures (VAMs) of teacher and school quality: the Hausman test for choosing between random and fixed effects and a test for feedback (sometimes called a "falsification test"). We discuss theoretical…
Descriptors: Achievement Gains, Evaluation Methods, Teacher Effectiveness, Educational Quality
Zopluoglu, Cengiz; Davenport, Ernest C., Jr. – Online Submission, 2011
The purpose of this study was to examine the effects of answer copying on the ability level estimates of cheater examinees in answer copying pairs. The study generated answer copying pairs for each of 1440 conditions: source ability (12) x cheater ability (12) x amount of copying (10). The average difference between the ability level estimates…
Descriptors: Cheating, Multiple Choice Tests, Ability, High Stakes Tests
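A small simulation in the spirit of this design, for a single source-copier pair under a Rasch model with invented values: copy a fraction of the source's answers into the copier's response vector and compare the copier's ability estimate before and after copying.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(4)

def p_rasch(theta, b):
    return 1 / (1 + np.exp(-(theta - b)))

def mle_theta(x, b):
    """Maximum likelihood ability estimate under the Rasch model."""
    nll = lambda t: -np.sum(x * np.log(p_rasch(t, b)) +
                            (1 - x) * np.log(1 - p_rasch(t, b)))
    return minimize_scalar(nll, bounds=(-4, 4), method="bounded").x

# Invented 60-item test; low-ability copier, high-ability source.
b = rng.normal(size=60)
source_theta, copier_theta = 1.5, -1.0
source = rng.binomial(1, p_rasch(source_theta, b))
copier = rng.binomial(1, p_rasch(copier_theta, b))

# The copier copies 40% of the items from the source.
copied_items = rng.choice(60, size=24, replace=False)
copier_copied = copier.copy()
copier_copied[copied_items] = source[copied_items]

print("theta_hat without copying:", round(mle_theta(copier, b), 3))
print("theta_hat with copying   :", round(mle_theta(copier_copied, b), 3))
# The inflated second estimate shows how copying biases the cheater's score.
```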