Ersen, Rabia Karatoprak; Lee, Won-Chan – Journal of Educational Measurement, 2023
The purpose of this study was to compare calibration and linking methods for placing pretest item parameter estimates on the item pool scale in a 1-3 computerized multistage adaptive testing design in terms of item parameter recovery. Two models were used: embedded-section, in which pretest items were administered within a separate module, and…
Descriptors: Pretesting, Test Items, Computer Assisted Testing, Adaptive Testing
Bayesian Logistic Regression: A New Method to Calibrate Pretest Items in Multistage Adaptive Testing
TsungHan Ho – Applied Measurement in Education, 2023
An operational multistage adaptive test (MST) requires the development of a large item bank and the effort to continuously replenish the item bank due to concerns about test security and validity over the long term. New items should be pretested and linked to the item bank before being used operationally. The linking item volume fluctuations in…
Descriptors: Bayesian Statistics, Regression (Statistics), Test Items, Pretesting
Lim, Hwanggyu; Choe, Edison M. – Journal of Educational Measurement, 2023
The residual differential item functioning (RDIF) detection framework was developed recently under a linear testing context. To explore the potential application of this framework to computerized adaptive testing (CAT), the present study investigated the utility of the RDIF[subscript R] statistic both as an index for detecting uniform DIF of…
Descriptors: Test Items, Computer Assisted Testing, Item Response Theory, Adaptive Testing
Cobern, William W.; Adams, Betty A. J. – International Journal of Assessment Tools in Education, 2020
What follows is a practical guide for establishing the validity of a survey for research purposes. The motivation for providing this guide is our observation that researchers, not necessarily being survey researchers per se, but wanting to use a survey method, lack a concise resource on validity. There is far more to know about surveys and survey…
Descriptors: Surveys, Test Validity, Test Construction, Test Items
Joshua B. Gilbert; Luke W. Miratrix; Mridul Joshi; Benjamin W. Domingue – Journal of Educational and Behavioral Statistics, 2025
Analyzing heterogeneous treatment effects (HTEs) plays a crucial role in understanding the impacts of educational interventions. A standard practice for HTE analysis is to examine interactions between treatment status and preintervention participant characteristics, such as pretest scores, to identify how different groups respond to treatment.…
Descriptors: Causal Models, Item Response Theory, Statistical Inference, Psychometrics
Joshua B. Gilbert; Luke W. Miratrix; Mridul Joshi; Benjamin W. Domingue – Annenberg Institute for School Reform at Brown University, 2024
Analyzing heterogeneous treatment effects (HTE) plays a crucial role in understanding the impacts of educational interventions. A standard practice for HTE analysis is to examine interactions between treatment status and pre-intervention participant characteristics, such as pretest scores, to identify how different groups respond to treatment.…
Descriptors: Causal Models, Item Response Theory, Statistical Inference, Psychometrics
Howard, Matt C. – Practical Assessment, Research & Evaluation, 2018
Scale pretests analyze the suitability of individual scale items for further analysis, whether through judging their face validity, wording concerns, and/or other aspects. The current article reviews scale pretests, separated by qualitative and quantitative methods, in order to identify the differences, similarities, and even existence of the…
Descriptors: Pretesting, Measures (Individuals), Test Items, Statistical Analysis
Hilton, Charlotte Emma – International Journal of Social Research Methodology, 2017
The development of questionnaires, surveys and psychometric scales is an iterative research process that includes a number of carefully planned stages. Pretesting is a method of checking that questions work as intended and are understood by those individuals who are likely to respond to them. However, detailed reports of appropriate methods to…
Descriptors: Questionnaires, Pretesting, Interviews, Test Construction
Herrmann-Abell, Cari F.; Hardcastle, Joseph; DeBoer, George E. – Grantee Submission, 2018
This paper describes the development and validation of a set of three assessment instruments that can be used to assess students' progress on the energy concept (ASPECt) from fourth through twelfth grade. Rasch analysis techniques were used throughout the development process to guide the construction of an item bank and the selection of items for…
Descriptors: Energy, Test Content, Test Items, Program Validation
Kim, Sooyeon; Robin, Frederic – ETS Research Report Series, 2017
In this study, we examined the potential impact of item misfit on the reported scores of an admission test from the subpopulation invariance perspective. The target population of the test consisted of 3 major subgroups with different geographic regions. We used the logistic regression function to estimate item parameters of the operational items…
Descriptors: Scores, Test Items, Test Bias, International Assessment
Wheadon, Jacob; Wright, Geoff A.; West, Richard E.; Skaggs, Paul – Journal of Technology Education, 2017
This study discusses the need, development, and validation of the Innovation Test Instrument (ITI). This article outlines how the researchers identified the content domain of the assessment and created test items. Then, it describes initial validation testing of the instrument. The findings suggest that the ITI is a good first step in creating an…
Descriptors: Innovation, Program Validation, Evaluation Needs, Test Construction
Liu, Jinghua; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2014
Maintaining score interchangeability and scale consistency is crucial for any testing programs that administer multiple forms across years. The use of a multiple linking design, which involves equating a new form to multiple old forms and averaging the conversions, has been proposed to control scale drift. However, the use of multiple linking…
Descriptors: Comparative Analysis, Reliability, Test Construction, Equated Scores
Davey, Tim; Lee, Yi-Hsuan – ETS Research Report Series, 2011
Both theoretical and practical considerations have led the revision of the Graduate Record Examinations® (GRE®) revised General Test, here called the rGRE, to adopt a multistage adaptive design that will be continuously or nearly continuously administered and that can provide immediate score reporting. These circumstances sharply constrain the…
Descriptors: Context Effect, Scoring, Equated Scores, College Entrance Examinations
Masters, James S. – ProQuest LLC, 2010
With the need for larger and larger banks of items to support adaptive testing and to meet security concerns, large-scale item generation is a requirement for many certification and licensure programs. As part of the mass production of items, it is critical that the difficulty and the discrimination of the items be known without the need for…
Descriptors: Test Items, Adaptive Testing, Pretesting, Program Effectiveness
An Investigation of Scale Drift for Arithmetic Assessment of ACCUPLACER®. Research Report No. 2010-2
Deng, Hui; Melican, Gerald – College Board, 2010
The current study was designed to extend the current literature to study scale drift in CAT as part of improving quality control and calibration process for ACCUPLACER, a battery of large-scale adaptive placement tests. The study aims to evaluate item parameter drift using empirical data that span four years from the ACCUPLACER Arithmetic…
Descriptors: Student Placement, Adaptive Testing, Computer Assisted Testing, Mathematics Tests