Publication Date
| Period | Count |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 197 |
| Since 2022 (last 5 years) | 1067 |
| Since 2017 (last 10 years) | 2577 |
| Since 2007 (last 20 years) | 4938 |
Audience
| Audience | Count |
| --- | --- |
| Practitioners | 653 |
| Teachers | 563 |
| Researchers | 250 |
| Students | 201 |
| Administrators | 81 |
| Policymakers | 22 |
| Parents | 17 |
| Counselors | 8 |
| Community | 7 |
| Support Staff | 3 |
| Media Staff | 1 |
Location
| Location | Count |
| --- | --- |
| Turkey | 225 |
| Canada | 223 |
| Australia | 155 |
| Germany | 116 |
| United States | 99 |
| China | 90 |
| Florida | 86 |
| Indonesia | 82 |
| Taiwan | 78 |
| United Kingdom | 73 |
| California | 65 |
What Works Clearinghouse Rating
| Rating | Count |
| --- | --- |
| Meets WWC Standards without Reservations | 4 |
| Meets WWC Standards with or without Reservations | 4 |
| Does Not Meet Standards | 1 |
Alsubait, Tahani; Parsia, Bijan; Sattler, Uli – Research in Learning Technology, 2012
Different computational models for generating analogies of the form "A is to B as C is to D" have been proposed over the past 35 years. However, analogy generation is a challenging problem that requires further research. In this article, we present a new approach for generating analogies in Multiple Choice Question (MCQ) format that can be used…
Descriptors: Computer Assisted Testing, Programming, Computer Software, Computer Software Evaluation
Shin, Chingwei David; Chien, Yuehmei; Way, Walter Denny – Pearson, 2012
Content balancing is one of the most important components of computerized adaptive testing (CAT), especially in K-12 large-scale tests, where a complex constraint structure is required to cover a broad spectrum of content. The purpose of this study is to compare the weighted penalty model (WPM) and the weighted deviation method (WDM) under…
Descriptors: Computer Assisted Testing, Elementary Secondary Education, Test Content, Models
College Board, 2012
Looking beyond the right or wrong answer is imperative to the development of effective educational environments conducive to Pre-AP work in math. This presentation explores a system of evaluation in math that provides a personalized, student-reflective model correlated to consortia-based assessment. Using examples of students' work that includes…
Descriptors: Student Evaluation, Mathematics Instruction, Correlation, Educational Assessment
Zebehazy, Kim T.; Zigmond, Naomi; Zimmerman, George J. – Journal of Visual Impairment & Blindness, 2012
Introduction: This study investigated differential item functioning (DIF) of test items on Pennsylvania's Alternate System of Assessment (PASA) for students with visual impairments and severe cognitive disabilities and what the reasons for the differences may be. Methods: The Wilcoxon signed ranks test was used to analyze differences in the scores…
Descriptors: Test Bias, Test Items, Alternative Assessment, Visual Impairments
Wu, Johnny; King, Kevin M.; Witkiewitz, Katie; Racz, Sarah Jensen; McMahon, Robert J. – Psychological Assessment, 2012
Research has shown that boys display higher levels of childhood conduct problems than girls, and Black children display higher levels than White children, but few studies have tested for scalar equivalence of conduct problems across gender and race. The authors fit a 2-parameter item response theory (IRT) model to examine item…
Descriptors: Item Analysis, Test Bias, Test Items, Item Response Theory
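The 2-parameter IRT model referenced in the abstract above gives the probability of endorsing an item as a logistic function of a latent trait. A minimal sketch (the function name and parameter values are illustrative, not taken from the study):

```python
import math

def p_endorse_2pl(theta, a, b):
    """2PL IRT model: probability of endorsing an item, given latent
    trait theta, item discrimination a, and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))
```

Under this model, scalar (measurement) equivalence across groups amounts to the item parameters `a` and `b` being the same for each group; group differences in those parameters indicate differential item functioning.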
Gierl, Mark J.; Lai, Hollis – International Journal of Testing, 2012
Automatic item generation represents a relatively new but rapidly evolving research area where cognitive and psychometric theories are used to produce tests that include items generated using computer technology. Automatic item generation requires two steps. First, test development specialists create item models, which are comparable to templates…
Descriptors: Foreign Countries, Psychometrics, Test Construction, Test Items
Ruiz-Primo, Maria Araceli; Li, Min; Wills, Kellie; Giamellaro, Michael; Lan, Ming-Chih; Mason, Hillary; Sands, Deanna – Journal of Research in Science Teaching, 2012
The purpose of this article is to address a major gap in the instructional sensitivity literature on how to develop instructionally sensitive assessments. We propose an approach to developing and evaluating instructionally sensitive assessments in science and test this approach with one elementary life-science module. The assessment we developed…
Descriptors: Effect Size, Inferences, Student Centered Curriculum, Test Construction
Kachchaf, Rachel; Solano-Flores, Guillermo – Applied Measurement in Education, 2012
We examined how rater language background affects the scoring of short-answer, open-ended test items in the assessment of English language learners (ELLs). Four native English and four native Spanish-speaking certified bilingual teachers scored 107 responses of fourth- and fifth-grade Spanish-speaking ELLs to mathematics items administered in…
Descriptors: Error of Measurement, English Language Learners, Scoring, Bilingual Teachers
Tessier, Anne-Michelle – Language Acquisition: A Journal of Developmental Linguistics, 2012
This article provides experimental evidence for the claim in Hayes (2004) and McCarthy (1998) that language learners are biased to assume that morphological paradigms should be phonologically uniform; that is, that derived words should retain all the phonological properties of their bases. The evidence comes from an artificial language…
Descriptors: Test Items, Phonemes, Phonology, Artificial Languages
Huang, Xiaoting – ProQuest LLC, 2010
In recent decades, the use of large-scale standardized international assessments has increased drastically as a way to evaluate and compare the quality of education across countries. In order to make valid international comparisons, the primary requirement is to ensure the measurement equivalence between the different language versions of these…
Descriptors: Test Bias, Comparative Testing, Foreign Countries, Measurement
Frederickx, Sofie; Tuerlinckx, Francis; De Boeck, Paul; Magis, David – Journal of Educational Measurement, 2010
In this paper we present a new methodology for detecting differential item functioning (DIF). We introduce a DIF model, called the random item mixture (RIM), that is based on a Rasch model with random item difficulties (besides the common random person abilities). In addition, a mixture model is assumed for the item difficulties such that the…
Descriptors: Test Bias, Models, Test Items, Difficulty Level
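The random item mixture (RIM) described above builds on the Rasch model, in which the response probability depends only on the difference between person ability and item difficulty, with item difficulties treated as random draws from a mixture. A sketch of those two ingredients (the mixture components' means and weights are assumed for illustration, not from the paper):

```python
import math
import random

def rasch_p(theta, b):
    """Rasch model: probability of a correct response given person
    ability theta and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# Illustrative random item difficulties drawn from a two-component
# normal mixture (weights and parameters are assumptions).
random.seed(0)
difficulties = [
    random.gauss(0.0, 1.0) if random.random() < 0.8 else random.gauss(1.5, 0.5)
    for _ in range(10)
]
```

In a DIF analysis along these lines, items whose difficulties fall in the minority mixture component would be flagged as functioning differently.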
Suto, Irenka; Nadas, Rita – Research in Education, 2010
Our aim was to deepen understanding of public examinations, exploring how marking task demands influence examiners' cognition and ultimately their marking accuracy. To do this, we identified features of examinations that trigger or demand the use of cognitive marking strategies entailing "reflective" judgements. Kelly's Repertory Grid…
Descriptors: Scoring, Accuracy, Examiners, Cognitive Processes
Wang, Lijuan – Journal of Educational and Behavioral Statistics, 2010
This study introduces an item response theory-zero-inflated Poisson (IRT-ZIP) model to investigate psychometric properties of multiple items and predict individuals' latent trait scores for multivariate zero-inflated count data. In the model, two link functions are used to capture two processes of the zero-inflated count data. Item parameters are…
Descriptors: Item Response Theory, Models, Test Items, Psychometrics
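The zero-inflated Poisson component of the IRT-ZIP model above mixes a point mass at zero with an ordinary Poisson distribution, which is why two link functions are needed. A minimal sketch of the ZIP probability mass function (parameter names are illustrative):

```python
import math

def zip_pmf(k, pi_zero, lam):
    """Zero-inflated Poisson pmf: with probability pi_zero the count is
    a structural zero; otherwise the count is Poisson with mean lam."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson
```

In the IRT-ZIP setting, one link function models `pi_zero` (the zero process) and the other models `lam` (the count process), each as a function of the latent trait and item parameters.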
Cheng, Ying – Educational and Psychological Measurement, 2010
This article proposes a new item selection method, namely, the modified maximum global discrimination index (MMGDI) method, for cognitive diagnostic computerized adaptive testing (CD-CAT). The new method captures two aspects of the appeal of an item: (a) the amount of contribution it can make toward adequate coverage of every attribute and (b) the…
Descriptors: Cognitive Tests, Diagnostic Tests, Computer Assisted Testing, Adaptive Testing
Koretz, Daniel; Beguin, Anton – Measurement: Interdisciplinary Research and Perspectives, 2010
Test-based accountability is now the cornerstone of U.S. education policy, and it is becoming more important in many other nations as well. Educators sometimes respond to test-based accountability in ways that produce score inflation. In the past, score inflation has usually been evaluated by comparing trends in scores on a high-stakes test to…
Descriptors: Accountability, High Stakes Tests, Test Construction, Scores