Publication Date
| Publication Date | Count |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 222 |
| Since 2022 (last 5 years) | 1091 |
| Since 2017 (last 10 years) | 2601 |
| Since 2007 (last 20 years) | 4962 |
Audience
| Audience | Count |
| --- | --- |
| Practitioners | 653 |
| Teachers | 563 |
| Researchers | 250 |
| Students | 201 |
| Administrators | 81 |
| Policymakers | 22 |
| Parents | 17 |
| Counselors | 8 |
| Community | 7 |
| Support Staff | 3 |
| Media Staff | 1 |
Location
| Location | Count |
| --- | --- |
| Turkey | 227 |
| Canada | 223 |
| Australia | 155 |
| Germany | 116 |
| United States | 99 |
| China | 90 |
| Florida | 86 |
| Indonesia | 82 |
| Taiwan | 78 |
| United Kingdom | 73 |
| California | 66 |
What Works Clearinghouse Rating
| Rating | Count |
| --- | --- |
| Meets WWC Standards without Reservations | 4 |
| Meets WWC Standards with or without Reservations | 4 |
| Does not meet standards | 1 |
Ferdous, Abdullah A.; Plake, Barbara S. – Educational and Psychological Measurement, 2005
In an Angoff standard-setting procedure, judges estimate the probability that a hypothetical, randomly selected, minimally competent candidate will answer each item on the test correctly. In many cases, these item performance estimates are made twice, with information shared with the judges between the two rounds. Especially for long tests,…
Descriptors: Test Items, Probability, Standard Setting (Scoring)
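Once the judges' item-level probability estimates are collected, the Angoff procedure reduces to a simple computation: average each item across judges, then sum the item averages to obtain a recommended cut score. A minimal Python sketch with made-up ratings for three judges and four items (not data from the study):

```python
# Minimal sketch of an Angoff cut-score computation (illustrative only;
# the judge-by-item probability estimates below are hypothetical).

# rows = judges, columns = items: estimated probability that a minimally
# competent candidate answers each item correctly
ratings = [
    [0.60, 0.75, 0.40, 0.85],  # judge 1
    [0.55, 0.70, 0.50, 0.80],  # judge 2
    [0.65, 0.80, 0.45, 0.90],  # judge 3
]

n_judges = len(ratings)
n_items = len(ratings[0])

# Mean probability per item across judges
item_means = [sum(judge[i] for judge in ratings) / n_judges for i in range(n_items)]

# The cut score is the sum of the item-level means, i.e. the expected
# raw score of the minimally competent candidate.
cut_score = sum(item_means)
print(f"Item means: {item_means}")
print(f"Recommended cut score: {cut_score:.2f} out of {n_items}")
```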
Hagtvet, Knut A.; Solhaug, Trond – Scandinavian Journal of Educational Research, 2005
Recent literature on parcel indicators in measurement models used in covariance structural modelling has mainly been concerned with statistical properties of parameter estimates. Less attention has been paid to measurement properties for inferring the assumed latent construct. The present study illustrates a two-facet measurement model that…
Descriptors: Secondary School Students, Methods, Test Items
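For readers unfamiliar with parcel indicators: a parcel is typically formed by averaging (or summing) a subset of items and using that aggregate, rather than the single items, as an indicator of the latent construct. A minimal sketch under that common definition; the item scores and parcel assignments are hypothetical and do not reproduce the study's two-facet design:

```python
# Minimal sketch of forming parcel indicators by averaging sets of items
# (illustrative only; data and parcel membership are hypothetical).

item_scores = {           # one respondent's scores on nine items
    "i1": 4, "i2": 3, "i3": 5,
    "i4": 2, "i5": 3, "i6": 4,
    "i7": 5, "i8": 4, "i9": 3,
}

parcels = {               # each parcel averages three items
    "p1": ["i1", "i2", "i3"],
    "p2": ["i4", "i5", "i6"],
    "p3": ["i7", "i8", "i9"],
}

# Parcel indicators: mean of the constituent item scores; these means,
# rather than single items, would serve as indicators in the measurement model.
parcel_scores = {
    name: sum(item_scores[i] for i in items) / len(items)
    for name, items in parcels.items()
}
print(parcel_scores)  # {'p1': 4.0, 'p2': 3.0, 'p3': 4.0}
```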
Johnson, Elizabeth K.; Jusczyk, Peter W.; Cutler, Anne; Norris, Dennis – Cognitive Psychology, 2003
The Possible Word Constraint limits the number of lexical candidates considered in speech recognition by stipulating that input should be parsed into a string of lexically viable chunks. For instance, an isolated single consonant is not a feasible word candidate. Any segmentation containing such a chunk is disfavored. Five experiments using the…
Descriptors: Test Items, Infants, Word Recognition, Experiments
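The constraint as described lends itself to a simple check: disfavor any candidate segmentation that leaves behind a chunk with no vowel, such as an isolated consonant. A minimal sketch assuming a toy vowel set and hypothetical function names; it illustrates the stated rule, not the authors' model:

```python
# Minimal sketch of the Possible Word Constraint idea: a segmentation is
# disfavored if any chunk is not a lexically viable word candidate.
# The vowel set and viability rule are simplifying assumptions.

VOWELS = set("aeiou")

def is_viable_chunk(chunk: str) -> bool:
    """A chunk counts as lexically viable here only if it contains a vowel."""
    return any(ch in VOWELS for ch in chunk)

def pwc_allows(segmentation: list[str]) -> bool:
    """Return True if every chunk in the segmentation is viable."""
    return all(is_viable_chunk(chunk) for chunk in segmentation)

# Segmenting "fapple" as ["f", "apple"] strands the consonant "f",
# so this parse of "apple" is disfavored under the constraint.
print(pwc_allows(["f", "apple"]))     # False
print(pwc_allows(["vuff", "apple"]))  # True
```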
Borsboom, Denny; Mellenbergh, Gideon J.; Van Heerden, Jaap – Applied Psychological Measurement, 2002
In this article, a distinction is made between absolute and relative measurement. Absolute measurement refers to the measurement of traits on a group-invariant scale, and relative measurement refers to the within-group measurement of traits, where the scale of measurement is expressed in terms of the within-group position on a trait. Relative…
Descriptors: Test Items, Measures (Individuals), Test Theory
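One way to make the distinction concrete is to compare a score reported on the common, group-invariant scale with the same score expressed as a within-group position (for instance, a within-group z-score). A minimal sketch with invented group data; the use of z-scores is an illustrative assumption, not the authors' formalization:

```python
# Minimal sketch contrasting absolute and relative measurement:
# the same raw scores versus their positions within their own groups.
from statistics import mean, stdev

group_a = [10, 12, 14, 16, 18]
group_b = [20, 22, 24, 26, 28]

def within_group_z(score: float, group: list[float]) -> float:
    """Relative measurement: position of a score within its own group."""
    return (score - mean(group)) / stdev(group)

# Absolute measurement: 14 and 24 differ on the common scale ...
print(14, 24)
# ... but relative measurement places both at the same within-group position.
print(within_group_z(14, group_a), within_group_z(24, group_b))  # 0.0 0.0
```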
Sinharay, Sandip; Lu, Ying – ETS Research Report Series, 2007
Dodeen (2004) studied the correlation between the item parameters of the three-parameter logistic model and two item fit statistics, and found some linear relationships (e.g., a positive correlation between item discrimination parameters and item fit statistics) that have the potential for influencing the work of practitioners who employ item…
Descriptors: Correlation, Test Items, Item Response Theory, Goodness of Fit
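For context, the three-parameter logistic model gives the probability of a correct response as P(θ) = c + (1 − c) / (1 + exp(−a(θ − b))), with discrimination a, difficulty b, and pseudo-guessing c. A minimal sketch of that function plus a Pearson correlation between discrimination parameters and item fit values; all numbers are hypothetical and do not come from Dodeen (2004) or this report:

```python
# Minimal sketch: 3PL response probability and a correlation between
# item discriminations and hypothetical item fit statistics.
import math

def p_3pl(theta: float, a: float, b: float, c: float) -> float:
    """Probability of a correct response under the 3PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def pearson_r(x: list[float], y: list[float]) -> float:
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = math.sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = math.sqrt(sum((yi - my) ** 2 for yi in y))
    return sxy / (sx * sy)

discriminations = [0.6, 0.9, 1.2, 1.5, 1.8]   # a-parameters (hypothetical)
fit_statistics  = [0.8, 1.0, 1.1, 1.3, 1.4]   # item fit values (hypothetical)

print(p_3pl(theta=0.0, a=1.2, b=-0.5, c=0.2))      # one response probability
print(pearson_r(discriminations, fit_statistics))  # a positive correlation
```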
Moses, Tim; Yang, Wen-Ling; Wilson, Christine – Journal of Educational Measurement, 2007
This study explored the use of kernel equating for integrating and extending two procedures proposed for assessing item order effects in test forms that have been administered to randomly equivalent groups. When these procedures are used together, they can provide complementary information about the extent to which item order effects impact test…
Descriptors: Advanced Placement, Equated Scores, Test Items, Item Analysis
Lee, Won-Chan – Applied Psychological Measurement, 2007
This article introduces a multinomial error model, which models an examinee's test scores obtained over repeated measurements of an assessment that consists of polytomously scored items. A compound multinomial error model is also introduced for situations in which items are stratified according to content categories and/or prespecified numbers of…
Descriptors: Simulation, Error of Measurement, Scoring, Test Items
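The basic ingredient of such a model is a categorical (multinomial) draw per polytomously scored item, with items grouped by content category. A minimal simulation sketch under assumed category probabilities and strata; it illustrates the sampling idea only, not the authors' compound multinomial formulation:

```python
# Minimal sketch: simulate repeated test scores from polytomous items,
# stratified by content category. Probabilities and strata are assumptions.
import random

random.seed(1)

# For each content category, the probability of earning 0, 1, or 2 points on an item.
category_probs = {
    "algebra":  [0.2, 0.5, 0.3],
    "geometry": [0.3, 0.4, 0.3],
}
items_per_category = {"algebra": 4, "geometry": 3}

def simulate_total_score() -> int:
    """One replication: draw a polytomous score for every item and sum."""
    total = 0
    for cat, n_items in items_per_category.items():
        probs = category_probs[cat]
        for _ in range(n_items):
            total += random.choices([0, 1, 2], weights=probs, k=1)[0]
    return total

replications = [simulate_total_score() for _ in range(5)]
print(replications)  # test scores over repeated measurements
```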
Davis, Jon D. – Mathematical Thinking and Learning: An International Journal, 2007
One classroom using two units from a "Standards"-based curriculum was the focus of a study designed to examine the effects of real-world contexts, delays in the introduction of formal mathematics terminology, and multiple function representations on student understanding. Students developed their own terminology for y-intercept, which was tightly…
Descriptors: Class Activities, Mathematics Education, Test Items, Learning Activities
Camp, Gino; Pecher, Diane; Schmidt, Henk G. – Journal of Experimental Psychology: Learning, Memory, and Cognition, 2007
Retrieval practice with particular items from memory can impair the recall of related items on a later memory test. This retrieval-induced forgetting effect has been ascribed to inhibitory processes (M. C. Anderson & B. A. Spellman, 1995). A critical finding that distinguishes inhibitory from interference explanations is that forgetting is found…
Descriptors: Memory, Cues, Test Items, Recall (Psychology)
Springuel, R. Padraic; Wittman, Michael C.; Thompson, John R. – Physical Review Special Topics - Physics Education Research, 2007
We use clustering, an analysis method not presently common to the physics education research community, to group and characterize student responses to written questions about two-dimensional kinematics. Previously, clustering has been used to analyze multiple-choice data; we analyze free-response data that includes both sketches of vectors and…
Descriptors: Physics, Statistical Analysis, Multivariate Analysis, College Students
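As an illustration of the general approach, coded free responses can be clustered once each response is expressed as a feature vector. A minimal k-means sketch using scikit-learn; the feature coding and data are hypothetical, and the authors' actual coding scheme and clustering method may differ:

```python
# Minimal sketch: cluster coded student responses with k-means.
# The binary feature coding below is a hypothetical example.
import numpy as np
from sklearn.cluster import KMeans

# Each row is one student's response, coded on three features,
# e.g. [vector sketch correct, sign of x-component correct, uses kinematics language].
responses = np.array([
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
    [1, 0, 1],
    [0, 1, 0],
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(responses)
print(kmeans.labels_)           # cluster assignment for each student
print(kmeans.cluster_centers_)  # characteristic response pattern per cluster
```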
Kasintorn, Tanachit – ProQuest LLC, 2009
The purpose of this study was to develop a test of academic readiness for first-grade instruction in Thailand. The Test of Academic Readiness (TAR) consists of six domains: verbal, visual, memory, math, logical, and general knowledge. Two pilot studies were carried out, and a main study tested items in those domains. The Rasch model was used to assess the…
Descriptors: Content Validity, Reading Readiness Tests, Doctoral Dissertations, Foreign Countries
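For reference, the Rasch model expresses the probability of a correct response as a logistic function of the difference between person ability and item difficulty. A minimal sketch with hypothetical ability and difficulty values; it is not the TAR calibration itself:

```python
# Minimal sketch of the Rasch model: P(correct) depends only on theta - b.
import math

def rasch_probability(theta: float, b: float) -> float:
    """P(correct) = exp(theta - b) / (1 + exp(theta - b))."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

abilities = [-1.0, 0.0, 1.0]          # example person abilities (logits)
difficulties = [-0.5, 0.0, 0.5, 1.0]  # example item difficulties (logits)

for theta in abilities:
    row = [round(rasch_probability(theta, b), 2) for b in difficulties]
    print(theta, row)
```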
Ives, Sarah Elizabeth – ProQuest LLC, 2009
The purposes of this study were to investigate preservice mathematics teachers' orientations, content knowledge, and pedagogical content knowledge of probability; the relationships among these three aspects; and the usefulness of tasks with respect to examining these aspects of knowledge. The design of the study was a multi-case study of five…
Descriptors: Preservice Teachers, Test Items, Mathematics Teachers, Probability
Wang, Jing – ProQuest LLC, 2009
The ultimate goal of physics education research (PER) is to develop a theoretical framework to understand and improve the learning process. In this journey of discovery, assessment serves as our headlamp and alpenstock. It sometimes detects signals in student mental structures, and sometimes presents the difference between expert understanding and…
Descriptors: Test Items, Mathematical Models, Educational Testing, Physics
National Assessment Governing Board, 2009
As the ongoing national indicator of what American students know and can do, the National Assessment of Educational Progress (NAEP) in Reading regularly collects achievement information on representative samples of students in grades 4, 8, and 12. The information that NAEP provides about student achievement helps the public, educators, and…
Descriptors: National Competency Tests, Reading Tests, Test Items, Test Format
Huang, Chiungjung – Educational and Psychological Measurement, 2009
This study examined the percentage of task-sampling variability in performance assessment via a meta-analysis. In total, 50 studies containing 130 independent data sets were analyzed. Overall results indicate that the percentage of variance for (a) differential difficulty of task was roughly 12% and (b) examinee's differential performance of the…
Descriptors: Test Bias, Research Design, Performance Based Assessment, Performance Tests
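The quantity summarized here, the share of score variance attributable to tasks, can be illustrated with a simple persons-by-tasks variance decomposition in the style of generalizability theory. A minimal sketch with a hypothetical score matrix; the meta-analytic procedure used in the study is more involved:

```python
# Minimal sketch: variance components for a crossed persons x tasks design,
# reported as the percentage of total variance due to tasks. Data are hypothetical.
import numpy as np

scores = np.array([      # rows = examinees, columns = tasks
    [4.0, 3.0, 5.0],
    [3.0, 2.0, 4.0],
    [5.0, 4.0, 5.0],
    [2.0, 2.0, 3.0],
])
n_p, n_t = scores.shape

grand = scores.mean()
person_means = scores.mean(axis=1)
task_means = scores.mean(axis=0)

ss_p = n_t * np.sum((person_means - grand) ** 2)
ss_t = n_p * np.sum((task_means - grand) ** 2)
ss_total = np.sum((scores - grand) ** 2)
ss_pt = ss_total - ss_p - ss_t

ms_p = ss_p / (n_p - 1)
ms_t = ss_t / (n_t - 1)
ms_pt = ss_pt / ((n_p - 1) * (n_t - 1))

# Expected-mean-square equations for a crossed p x t design
var_pt = ms_pt
var_p = max((ms_p - ms_pt) / n_t, 0.0)
var_t = max((ms_t - ms_pt) / n_p, 0.0)

total = var_p + var_t + var_pt
print(f"task variance share: {100 * var_t / total:.1f}%")
```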
