Publication Date
In 2025 | 0 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 2 |
Since 2016 (last 10 years) | 3 |
Since 2006 (last 20 years) | 6 |
Descriptor
Source
Applied Measurement in… | 7 |
Author
Abulela, Mohammed A. A. | 1 |
Alves, Cecilia B. | 1 |
Bahry, Louise M. | 1 |
Chis, Liliana | 1 |
Chu, Man-Wai | 1 |
Clauser, Brian E. | 1 |
Gotzmann, Andrea | 1 |
Harik, Polina | 1 |
Janssen, Rianne | 1 |
Margolis, Melissa J. | 1 |
McManus, I. C. | 1 |
More ▼ |
Publication Type
Journal Articles | 7 |
Reports - Research | 6 |
Reports - Evaluative | 1 |
Education Level
Secondary Education | 3 |
Elementary Education | 1 |
Grade 3 | 1 |
High Schools | 1 |
Audience
Laws, Policies, & Programs
Assessments and Surveys
Program for International… | 1 |
What Works Clearinghouse Rating
Takahiro Terao – Applied Measurement in Education, 2024
This study aimed to compare item characteristics and response time between stimulus conditions in computer-delivered listening tests. Listening materials had three variants: regular videos, frame-by-frame videos, and only audios without visuals. Participants were 228 Japanese high school students who were requested to complete one of nine…
Descriptors: Computer Assisted Testing, Audiovisual Aids, Reaction Time, High School Students
Abulela, Mohammed A. A.; Rios, Joseph A. – Applied Measurement in Education, 2022
When there are no personal consequences associated with test performance for examinees, rapid guessing (RG) is a concern and can differ between subgroups. To date, the impact of differential RG on item-level measurement invariance has received minimal attention. To that end, a simulation study was conducted to examine the robustness of the…
Descriptors: Comparative Analysis, Robustness (Statistics), Nonparametric Statistics, Item Analysis
Papenberg, Martin; Musch, Jochen – Applied Measurement in Education, 2017
In multiple-choice tests, the quality of distractors may be more important than their number. We therefore examined the joint influence of distractor quality and quantity on test functioning by providing a sample of 5,793 participants with five parallel test sets consisting of items that differed in the number and quality of distractors.…
Descriptors: Multiple Choice Tests, Test Items, Test Validity, Test Reliability
Nijlen, Daniel Van; Janssen, Rianne – Applied Measurement in Education, 2015
In this study it is investigated to what extent contextualized and non-contextualized mathematics test items have a differential impact on examinee effort. Mixture item response theory (IRT) models are applied to two subsets of items from a national assessment on mathematics in the second grade of the pre-vocational track in secondary education in…
Descriptors: Mathematics Tests, Measurement, Item Response Theory, Test Items
Roduta Roberts, Mary; Alves, Cecilia B.; Chu, Man-Wai; Thompson, Margaret; Bahry, Louise M.; Gotzmann, Andrea – Applied Measurement in Education, 2014
The purpose of this study was to evaluate the adequacy of three cognitive models, one developed by content experts and two generated from student verbal reports for explaining examinee performance on a grade 3 diagnostic mathematics test. For this study, the items were developed to directly measure the attributes in the cognitive model. The…
Descriptors: Foreign Countries, Mathematics Tests, Cognitive Processes, Models
Clauser, Brian E.; Harik, Polina; Margolis, Melissa J.; McManus, I. C.; Mollon, Jennifer; Chis, Liliana; Williams, Simon – Applied Measurement in Education, 2009
Numerous studies have compared the Angoff standard-setting procedure to other standard-setting methods, but relatively few studies have evaluated the procedure based on internal criteria. This study uses a generalizability theory framework to evaluate the stability of the estimated cut score. To provide a measure of internal consistency, this…
Descriptors: Generalizability Theory, Group Discussion, Standard Setting (Scoring), Scoring

Ponsoda, Vicente; Olea, Julio; Rodriguez, Maria Soledad; Revuelta, Javier – Applied Measurement in Education, 1999
Compared easy and difficult versions of self-adapted tests (SAT) and computerized adapted tests. No significant differences were found among the tests for estimated ability or posttest state anxiety in studies with 187 Spanish high school students, although other significant differences were found. Discusses implications for interpreting test…
Descriptors: Ability, Adaptive Testing, Comparative Analysis, Computer Assisted Testing