Publication Date
| In 2026 | 0 |
| Since 2025 | 197 |
| Since 2022 (last 5 years) | 1067 |
| Since 2017 (last 10 years) | 2577 |
| Since 2007 (last 20 years) | 4938 |
Audience
| Practitioners | 653 |
| Teachers | 563 |
| Researchers | 250 |
| Students | 201 |
| Administrators | 81 |
| Policymakers | 22 |
| Parents | 17 |
| Counselors | 8 |
| Community | 7 |
| Support Staff | 3 |
| Media Staff | 1 |
Location
| Turkey | 225 |
| Canada | 223 |
| Australia | 155 |
| Germany | 116 |
| United States | 99 |
| China | 90 |
| Florida | 86 |
| Indonesia | 82 |
| Taiwan | 78 |
| United Kingdom | 73 |
| California | 65 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 4 |
| Meets WWC Standards with or without Reservations | 4 |
| Does not meet standards | 1 |
Miller, Michael B.; Guerin, Scott A.; Wolford, George L. – Journal of Experimental Psychology: Learning, Memory, and Cognition, 2011
The false memory effect produced by the Deese/Roediger & McDermott (DRM) paradigm is reportedly impervious to warnings to avoid false alarming to the critical lures (D. A. Gallo, H. L. Roediger III, & K. B. McDermott, 2001). This finding has been used as strong evidence against models that attribute the false alarms to a decision…
Descriptors: Models, Memory, Recognition (Psychology), Test Items
Charlton, Shawn R.; Gossett, Bradley D.; Charlton, Veda A. – Psychological Record, 2011
Temporal discounting, the loss in perceived value associated with delayed outcomes, correlates with a number of personality measures, suggesting that an item-level analysis of trait measures might provide a more detailed understanding of discounting. The current report details two studies that investigate the utility of such an item-level…
Descriptors: Personality Measures, Test Items, Item Analysis, Delay of Gratification
Fukuhara, Hirotaka; Kamata, Akihito – Applied Psychological Measurement, 2011
A differential item functioning (DIF) detection method for testlet-based data was proposed and evaluated in this study. The proposed DIF model is an extension of a bifactor multidimensional item response theory (MIRT) model for testlets. Unlike traditional item response theory (IRT) DIF models, the proposed model takes testlet effects into…
Descriptors: Item Response Theory, Test Bias, Test Items, Bayesian Statistics
Oliveri, Maria E.; Ercikan, Kadriye – Applied Measurement in Education, 2011
In this study, we examine the degree of construct comparability and possible sources of incomparability of the English and French versions of the Programme for International Student Assessment (PISA) 2003 problem-solving measure administered in Canada. Several approaches were used to examine construct comparability at the test- (examination of…
Descriptors: Foreign Countries, English, French, Tests
Jones, Andrew T. – Applied Psychological Measurement, 2011
Practitioners often depend on item analysis to select items for exam forms and have a variety of options available to them. These include the point-biserial correlation, the agreement statistic, the B index, and the phi coefficient. Although research has demonstrated that these statistics can be useful for item selection, no research as of yet has…
Descriptors: Test Items, Item Analysis, Cutting Scores, Statistics
Liu, Jinghua; Sinharay, Sandip; Holland, Paul; Feigenbaum, Miriam; Curley, Edward – Educational and Psychological Measurement, 2011
Two different types of anchors are investigated in this study: a mini-version anchor and an anchor that has less spread of difficulty than the tests to be equated. The latter is referred to as a midi anchor. The impact of these two different types of anchors on observed score equating is evaluated and compared with respect to systematic error…
Descriptors: Equated Scores, Test Items, Difficulty Level, Statistical Bias
Tiffin-Richards, Simon P.; Pant, Hans Anand; Koller, Olaf – Educational Measurement: Issues and Practice, 2013
Cut-scores were set by expert judges on assessments of reading and listening comprehension of English as a foreign language (EFL), using the bookmark standard-setting method to differentiate proficiency levels defined by the Common European Framework of Reference (CEFR). Assessments contained stratified item samples drawn from extensive item…
Descriptors: Foreign Countries, English (Second Language), Language Tests, Standard Setting (Scoring)
Kutluca, Tamer – Educational Research and Reviews, 2013
The aim of this study is to investigate the effect of the dynamic geometry software GeoGebra on the Van Hiele geometry understanding level of students in an 11th grade geometry course. The study was conducted with a pretest-posttest control group quasi-experimental method. The sample of the study was 42 eleventh grade students studying in the spring term of…
Descriptors: Mathematics Instruction, Geometry, Foreign Countries, Computer Software
Berliner, David C. – Teacher Educator, 2013
In the United States, but not only here, the movement to evaluate teachers based on student test scores has received powerful political and parental support. The logic is simple. From one testing occasion to another students should show growth in their knowledge and skill. Similar types of students should show similar patterns of growth. Those…
Descriptors: Teacher Evaluation, Merit Pay, Evaluation Problems, Models
Seo, Dong Gi; Weiss, David J. – Educational and Psychological Measurement, 2013
The usefulness of the l_z person-fit index was investigated with achievement test data from 20 exams given to more than 3,200 college students. Results for three methods of estimating θ showed that the distributions of l_z were not consistent with its theoretical distribution, resulting in general overfit to the item response…
Descriptors: Achievement Tests, College Students, Goodness of Fit, Item Response Theory
Dowdy, Erin; Furlong, Michael J.; Sharkey, Jill D. – Journal of Emotional and Behavioral Disorders, 2013
This study examined the potential utility of adding items that assessed youths' emotional and behavioral disorders to a commonly used surveillance survey. The goal was to evaluate whether the added items could enhance understanding of youths' involvement in high-risk behaviors. A sample of 3,331 adolescents in Grades 8, 10, and 12 from four…
Descriptors: Behavior Disorders, Adolescents, Addictive Behavior, Surveys
Gluga, Richard; Kay, Judy; Lister, Raymond; Kleitman, Simon; Kleitman, Sabina – Computer Science Education, 2013
To design an effective computer science curriculum, educators require a systematic method of classifying the difficulty level of learning activities and assessment tasks. This is important for curriculum design and implementation and for communication between educators. Different educators must be able to use the method consistently, so that…
Descriptors: Computer Science Education, Cognitive Development, Difficulty Level, Test Items
Berk, Ronald A. – Journal of Faculty Development, 2013
One of the simplest indicators of teaching or course effectiveness is student ratings on one or more global items from the entire rating scale. That approach seems intuitively sound and easy to use. Global items have even been recommended by a few researchers to get a quick-read, at-a-glance summary for summative decisions about faculty. The…
Descriptors: Rating Scales, Student Evaluation of Teacher Performance, Item Analysis, Test Items
Brewer, Michael S.; Gardner, Grant E. – American Biology Teacher, 2013
Teaching population genetics provides a bridge between genetics and evolution by using examples of the mechanisms that underlie changes in allele frequencies over time. Existing methods of teaching these concepts often rely on computer simulations or hand calculations, which distract students from the material and are problematic for those with…
Descriptors: Evolution, Teaching Methods, Scientific Concepts, Scientific Principles
Svetina, Dubravka; Rutkowski, Leslie – Large-scale Assessments in Education, 2014
Background: When studying student performance across different countries or cultures, an important aspect for comparisons is that of score comparability. In other words, it is imperative that the latent variable (i.e., construct of interest) is understood and measured equivalently across all participating groups or countries, if our inferences…
Descriptors: Test Items, Item Response Theory, Item Analysis, Regression (Statistics)

