| Publication Date | Count |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 215 |
| Since 2022 (last 5 years) | 1084 |
| Since 2017 (last 10 years) | 2594 |
| Since 2007 (last 20 years) | 4955 |
| Audience | Count |
| --- | --- |
| Practitioners | 653 |
| Teachers | 563 |
| Researchers | 250 |
| Students | 201 |
| Administrators | 81 |
| Policymakers | 22 |
| Parents | 17 |
| Counselors | 8 |
| Community | 7 |
| Support Staff | 3 |
| Media Staff | 1 |
| Location | Count |
| --- | --- |
| Turkey | 226 |
| Canada | 223 |
| Australia | 155 |
| Germany | 116 |
| United States | 99 |
| China | 90 |
| Florida | 86 |
| Indonesia | 82 |
| Taiwan | 78 |
| United Kingdom | 73 |
| California | 66 |
| What Works Clearinghouse Rating | Count |
| --- | --- |
| Meets WWC Standards without Reservations | 4 |
| Meets WWC Standards with or without Reservations | 4 |
| Does not meet standards | 1 |
Grunert, Megan L.; Raker, Jeffrey R.; Murphy, Kristen L.; Holme, Thomas A. – Journal of Chemical Education, 2013
The concept of assigning partial credit on multiple-choice test items is considered for items from ACS Exams. Because the items on these exams, particularly the quantitative items, use common student errors to define incorrect answers, it is possible to assign partial credits to some of these incorrect responses. To do so, however, it becomes…
Descriptors: Multiple Choice Tests, Scoring, Scoring Rubrics, Science Tests
Dodonova, Yulia A.; Dodonov, Yury S. – Intelligence, 2013
Using more complex items than those commonly employed within the information-processing approach, but still easier than those used in intelligence tests, this study analyzed how the association between processing speed and accuracy level changes as the difficulty of the items increases. The study involved measuring cognitive ability using Raven's…
Descriptors: Difficulty Level, Intelligence Tests, Cognitive Ability, Accuracy
Hua, Jing; Gu, Guixiong; Meng, Wei; Wu, Zhuochun – Research in Developmental Disabilities: A Multidisciplinary Journal, 2013
The aim of this paper was to examine the validity and reliability of age band 1 of the Movement Assessment Battery for Children-Second Edition (MABC-2) in preparation for its standardization in mainland China. Interrater and test-retest reliability of the MABC-2 was estimated using Intraclass Correlation Coefficient (ICC). Cronbach's alpha for…
Descriptors: Factor Analysis, Test Items, Foreign Countries, Psychometrics
Hong, Eunsook; Peng, Yun; O'Neil, Harold F., Jr.; Wu, Junbin – Journal of Creative Behavior, 2013
The study examined the effects of gender and item content of domain-general and domain-specific creative-thinking tests on four subscale scores of creative-thinking (fluency, flexibility, originality, and elaboration). Chinese tenth-grade students (234 males and 244 females) participated in the study. Domain-general creative thinking was measured…
Descriptors: Creative Thinking, Creativity Tests, Gender Differences, Test Items
Luecht, Richard M. – Journal of Applied Testing Technology, 2013
Assessment engineering is a new way to design and implement scalable, sustainable and ideally lower-cost solutions to the complexities of designing and developing tests. It represents a merger of sorts between cognitive task modeling and engineering design principles--a merger that requires some new thinking about the nature of score scales, item…
Descriptors: Engineering, Test Construction, Test Items, Models
Kim, Jihye; Oshima, T. C. – Educational and Psychological Measurement, 2013
In a typical differential item functioning (DIF) analysis, a significance test is conducted for each item. As a test consists of multiple items, such multiple testing may increase the possibility of making a Type I error at least once. The goal of this study was to investigate how to control a Type I error rate and power using adjustment…
Descriptors: Test Bias, Test Items, Statistical Analysis, Error of Measurement
Han, Kyung T. – Applied Psychological Measurement, 2013
Most computerized adaptive testing (CAT) programs do not allow test takers to review and change their responses because it could seriously deteriorate the efficiency of measurement and make tests vulnerable to manipulative test-taking strategies. Several modified testing methods have been developed that provide restricted review options while…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Testing
Store, Davie – ProQuest LLC, 2013
The impact of particular types of context effects on actual scores remains less well understood, although some research has been carried out on certain types of context effects under the nonequivalent anchor test (NEAT) design. In addition, the issue of the impact of item context effects on scores has not been investigated extensively when item…
Descriptors: Test Items, Equated Scores, Accuracy, Item Response Theory
Maydeu-Olivares, Alberto; Montano, Rosa – Psychometrika, 2013
We investigate the performance of three statistics, R [subscript 1], R [subscript 2] (Glas in "Psychometrika" 53:525-546, 1988), and M [subscript 2] (Maydeu-Olivares & Joe in "J. Am. Stat. Assoc." 100:1009-1020, 2005, "Psychometrika" 71:713-732, 2006) to assess the overall fit of a one-parameter logistic model…
Descriptors: Foreign Countries, Item Response Theory, Statistics, Data Analysis
Tijmstra, Jesper; Hessen, David J.; van der Heijden, Peter G. M.; Sijtsma, Klaas – Psychometrika, 2013
Most dichotomous item response models share the assumption of latent monotonicity, which states that the probability of a positive response to an item is a nondecreasing function of a latent variable intended to be measured. Latent monotonicity cannot be evaluated directly, but it implies manifest monotonicity across a variety of observed scores,…
Descriptors: Item Response Theory, Statistical Inference, Probability, Psychometrics
Shah, Lisa; Hao, Jie; Schneider, Jeremy; Fallin, Rebekah; Cortes, Kimberly Linenberger; Ray, Herman E.; Rushton, Gregory T. – Journal of Chemical Education, 2018
Teachers play a critical role in the preparation of future science, technology, engineering, and mathematics majors and professionals. What teachers know about their discipline (i.e., content knowledge) has been identified as an important aspect of instructional effectiveness; however, studies have not yet assessed the content knowledge of…
Descriptors: Science Teachers, Science Instruction, Chemistry, Introductory Courses
Singer, Judith D., Ed.; Braun, Henry I., Ed.; Chudowsky, Naomi, Ed. – National Academy of Education, 2018
Results from international large-scale assessments (ILSAs) garner considerable attention in the media, academia, and among policy makers. Although there is widespread recognition that ILSAs can provide useful information, there is debate about what types of comparisons are the most meaningful and what could be done to assure more sound…
Descriptors: International Education, Educational Assessment, Educational Policy, Data Interpretation
Kalender, Ilker – International Journal of Higher Education, 2015
Student evaluations of teaching (SET) have been the principal instrument to elicit students' opinions in higher education institutions. Many decisions, including high-stake ones, are made based on SET scores reported by students. In this respect, reliability of SET scores is of considerable importance. This paper has an argument that there are…
Descriptors: Higher Education, Reliability, Test Items, Measurement
Reckase, Mark D.; McCrory, Raven; Floden, Robert E.; Ferrini-Mundy, Joan; Senk, Sharon L. – Educational Assessment, 2015
Numerous researchers have suggested that there are multiple mathematical knowledge and skill areas needed by teachers in order for them to be effective teachers of mathematics: knowledge of the mathematics that are the goals of instruction, advanced mathematics beyond the instructional material, and mathematical knowledge that is specific to what…
Descriptors: Algebra, Knowledge Base for Teaching, Multidimensional Scaling, Psychometrics
Wang, Wen-Chung; Chen, Hui-Fang; Jin, Kuan-Yu – Educational and Psychological Measurement, 2015
Many scales contain both positively and negatively worded items. Reverse recoding of negatively worded items might not be enough for them to function as positively worded items do. In this study, we commented on the drawbacks of existing approaches to wording effect in mixed-format scales and used bi-factor item response theory (IRT) models to…
Descriptors: Item Response Theory, Test Format, Language Usage, Test Items