Publication Date
| In 2026 | 0 |
| Since 2025 | 220 |
| Since 2022 (last 5 years) | 1089 |
| Since 2017 (last 10 years) | 2599 |
| Since 2007 (last 20 years) | 4960 |
Audience
| Practitioners | 653 |
| Teachers | 563 |
| Researchers | 250 |
| Students | 201 |
| Administrators | 81 |
| Policymakers | 22 |
| Parents | 17 |
| Counselors | 8 |
| Community | 7 |
| Support Staff | 3 |
| Media Staff | 1 |
Location
| Turkey | 226 |
| Canada | 223 |
| Australia | 155 |
| Germany | 116 |
| United States | 99 |
| China | 90 |
| Florida | 86 |
| Indonesia | 82 |
| Taiwan | 78 |
| United Kingdom | 73 |
| California | 66 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 4 |
| Meets WWC Standards with or without Reservations | 4 |
| Does not meet standards | 1 |
Cipani, Ennio – Charles C. Thomas, Publisher, Ltd, 2011
Challenging behaviors and poor student performance are often attributed to many of society's ills. As a result of the presence of such factors in some students' lives, changing these students' behavior in the classroom is seen as futile unless one can change their nonschool environment. To facilitate the reader's capability to develop intervention…
Descriptors: Behavior Problems, Test Items, Student Behavior, Intervention
Patterson, Margaret Becker; Higgins, Jennifer; Bozman, Martha; Katz, Michael – Adult Basic Education and Literacy Journal, 2011
We conducted a pilot study to see how the GED Mathematics Test could be administered on computer with embedded accessibility tools. We examined test scores and test-taker experience. Nineteen GED test centers across five states and 216 randomly assigned GED Tests candidates participated in the project. GED candidates completed two GED mathematics…
Descriptors: Pilot Projects, Mathematics Tests, High School Equivalency Programs, Test Wiseness
Romhild, Anja; Kenyon, Dorry; MacGregor, David – Language Assessment Quarterly, 2011
This study examined the role of domain-general and domain-specific linguistic knowledge in the assessment of academic English language proficiency using a latent variable modeling approach. The goal of the study was to examine if modeling of domain-specific variance results in improved model fit and well-defined latent factors. Analyses were…
Descriptors: Concept Formation, English (Second Language), Language Proficiency, Second Language Learning
Thurman, Carol – ProQuest LLC, 2009
The increased use of polytomous item formats has led assessment developers to pay greater attention to the detection of differential item functioning (DIF) in these items. DIF occurs when an item performs differently for two contrasting groups of respondents (e.g., males versus females) after controlling for differences in the abilities of the…
Descriptors: Test Items, Monte Carlo Methods, Test Bias, Educational Testing
Hendrickson, Amy; Huff, Kristen; Luecht, Ric – College Board, 2009
[Slides] presented at the Annual Meeting of the National Council on Measurement in Education (NCME) in San Diego, CA, in April 2009. This presentation describes how the vehicles for gathering student evidence--task models and test specifications--are developed.
Descriptors: Test Items, Test Construction, Evidence, Achievement
Suh, Youngsuk; Mroch, Andrew A.; Kane, Michael T.; Ripkey, Douglas R. – Measurement: Interdisciplinary Research and Perspectives, 2009
In this study, a database containing the responses of 40,000 candidates to 90 multiple-choice questions was used to mimic data sets for 50-item tests under the "nonequivalent groups with anchor test" (NEAT) design. Using these smaller data sets, we evaluated the performance of five linear equating methods for the NEAT design with five levels of…
Descriptors: Test Items, Equated Scores, Methods, Differences
Wells, Craig S.; Baldwin, Su; Hambleton, Ronald K.; Sireci, Stephen G.; Karatonis, Ana; Jirka, Stephen – Applied Measurement in Education, 2009
Score equity assessment is an important analysis to ensure inferences drawn from test scores are comparable across subgroups of examinees. The purpose of the present evaluation was to assess the extent to which the Grade 8 NAEP Math and Reading assessments for 2005 were equivalent across selected states. More specifically, the present study…
Descriptors: National Competency Tests, Test Bias, Equated Scores, Grade 8
Bobbio, Tatiana; Gabbard, Carl; Cacola, Priscila – Early Childhood Research & Practice, 2009
Motor development attains landmark significance during early childhood. Although early childhood educators may be familiar with the gross-motor skill category, the subcategory of interlimb coordination needs greater attention than it typically receives from teachers of young children. Interlimb coordination primarily involves movements requiring…
Descriptors: Test Items, Young Children, Psychomotor Skills, Motor Development
Wang, Wen-Chung; Shih, Ching-Lin; Yang, Chih-Chien – Educational and Psychological Measurement, 2009
This study applies a scale purification procedure to the standard MIMIC method for differential item functioning (DIF) detection and assesses its performance through a series of simulations. It is found that the MIMIC method with scale purification (denoted as M-SP) outperforms the standard MIMIC method (denoted as M-ST) in controlling…
Descriptors: Test Items, Measures (Individuals), Test Bias, Evaluation Research
Suto, W. M. Irenka; Nadas, Rita – Research Papers in Education, 2009
It has long been established that marking accuracy in public examinations varies considerably among subjects and markers. This is unsurprising, given the diverse cognitive strategies that the marking process can entail, but what makes some questions harder to mark accurately than others? Are there distinct but subtle features of questions and…
Descriptors: National Curriculum, Physics, Interviews, Examiners
Petry, Katja; Maes, Bea; Vlaskamp, Carla – Research in Developmental Disabilities: A Multidisciplinary Journal, 2009
Because of a shortage of valid instruments to measure the QOL of people with profound multiple disabilities (PMD), the QOL-PMD was developed. In the present study, possibilities for item reduction as well as the psychometric properties of the questionnaire were examined. One hundred and forty-seven informants of people with PMD participated in the…
Descriptors: Multiple Disabilities, Quality of Life, Construct Validity, Questionnaires
van der Linden, Wim J. – Applied Psychological Measurement, 2009
An adaptive testing method is presented that controls the speededness of a test using predictions of the test takers' response times on the candidate items in the pool. Two different types of predictions are investigated: posterior predictions given the actual response times on the items already administered and posterior predictions that use the…
Descriptors: Simulation, Adaptive Testing, Vocational Aptitude, Bayesian Statistics
Klein Entink, R. H.; Fox, J. P.; van der Linden, W. J. – Psychometrika, 2009
Response times on test items are easily collected in modern computerized testing. When collecting both (binary) responses and (continuous) response times on test items, it is possible to measure the accuracy and speed of test takers. To study the relationships between these two constructs, the model is extended with a multivariate multilevel…
Descriptors: Test Items, Markov Processes, Item Response Theory, Measurement Techniques
Miyazaki, Kei; Hoshino, Takahiro; Mayekawa, Shin-ichi; Shigemasu, Kazuo – Psychometrika, 2009
This study proposes a new item parameter linking method for the common-item nonequivalent groups design in item response theory (IRT). Previous studies assumed that examinees are randomly assigned to either test form. However, examinees can frequently select their own test forms and tests often differ according to examinees' abilities. In such…
Descriptors: Test Format, Item Response Theory, Test Items, Test Bias
von Davier, Matthias – Measurement: Interdisciplinary Research and Perspectives, 2009
In this commentary, the author points out a few issues, one being that there are models mislabeled as diagnostic, which deal with linear decompositions of item difficulties rather than estimating multidimensional skill variables. The author discusses the issue that there are many new names for essentially well-known models for multiple simultaneous…
Descriptors: Test Items, Probability, Models, Diagnostic Tests

