Shilna, V.; Gafoor, K. Abdul – Online Submission, 2016
Learning chemistry is a hard task for many secondary school students; thus, students find it tough to score better marks in chemistry. Researchers have identified many reasons and suggested lots of alternatives to overcome difficulties in chemistry. This paper focuses on whether the test item construction has any role in the response pattern of…
Descriptors: Cognitive Style, Short Term Memory, Multiple Choice Tests, Science Tests
Liu, Yuanyuan – English Language Teaching, 2020
Writing anxiety is one of the most essential factors influencing language learning. The current study is to explore the effect of sentence-making practice on reducing writing anxiety of two classes of adult EFL learners, one in low-intermediate level (LI learners), the other in high-intermediate level (HI learners). Two classes received two-week…
Descriptors: Writing Apprehension, Second Language Learning, Second Language Instruction, English (Second Language)
Smajic, Adnan; Merritt, Stephanie; Banister, Christina; Blinebry, Amanda – Teaching of Psychology, 2014
Laboratory studies have established a negative relationship between the color red and academic performance. This research examined whether this effect would generalize to classroom performance and whether anxiety and negative affect might mediate the effect. In two studies, students taking classroom exams were randomly assigned an exam color. We…
Descriptors: Color, Anxiety, Performance, Tests
Paek, Insu; Cai, Li – Educational and Psychological Measurement, 2014
The present study was motivated by the recognition that standard errors (SEs) of item response theory (IRT) model parameters are often of immediate interest to practitioners and that there is currently a lack of comparative research on different SE (or error variance-covariance matrix) estimation procedures. The present study investigated item…
Descriptors: Item Response Theory, Comparative Analysis, Error of Measurement, Computation
Oliveri, María Elena; Ercikan, Kadriye; Zumbo, Bruno D.; Lawless, René – International Journal of Testing, 2014
In this study, we contrast results from two differential item functioning (DIF) approaches (manifest and latent class) by the number of items and sources of items identified as DIF using data from an international reading assessment. The latter approach yielded three latent classes, presenting evidence of heterogeneity in examinee response…
Descriptors: Test Bias, Comparative Analysis, Reading Tests, Effect Size
Guo, Hongwen; Puhan, Gautam; Walker, Michael – ETS Research Report Series, 2013
In this study we investigated when an equating conversion line is problematic in terms of gaps and clumps. We suggest using the conditional standard error of measurement (CSEM) to measure the scale scores that are inappropriate in the overall raw-to-scale transformation.
Descriptors: Equated Scores, Test Items, Evaluation Criteria, Error of Measurement
Schulz, E. Matthew – Measurement: Interdisciplinary Research and Perspectives, 2013
In this article, E. Matthew Schulz responds to Adam Wyse's article, "Construct Maps as a Foundation for Standard Setting." In doing so, he asserts that one of the most important ideas in Wyse's work is that information used in standard setting needs to be better represented through the use of graphics. However, he's not…
Descriptors: Standard Setting (Scoring), Maps, Item Response Theory, Test Items
Chen, Jinsong; de la Torre, Jimmy – Applied Psychological Measurement, 2013
Polytomous attributes, particularly those defined as part of the test development process, can provide additional diagnostic information. The present research proposes the polytomous generalized deterministic inputs, noisy, "and" gate (pG-DINA) model to accommodate such attributes. The pG-DINA model allows input from substantive experts…
Descriptors: Models, Cognitive Tests, Diagnostic Tests, Computation
Cheong, Yuk Fai; Kamata, Akihito – Applied Measurement in Education, 2013
In this article, we discuss and illustrate two centering and anchoring options available in differential item functioning (DIF) detection studies based on the hierarchical generalized linear and generalized linear mixed modeling frameworks. We compared and contrasted the assumptions of the two options, and examined the properties of their DIF…
Descriptors: Test Bias, Hierarchical Linear Modeling, Comparative Analysis, Test Items
Grand, James A.; Golubovich, Juliya; Ryan, Ann Marie; Schmitt, Neal – Organizational Behavior and Human Decision Processes, 2013
In organizational and educational practices, sensitivity reviews are commonly advocated techniques for reducing test bias and enhancing fairness. In the present paper, results from two studies are reported which investigate how effective individuals are at detecting problematic test content and the influence such content has on important testing…
Descriptors: Test Items, Test Content, Test Bias, Individual Differences
Meyer, Joseph F.; Faust, Kyle A.; Faust, David; Baker, Aaron M.; Cook, Nathan E. – International Journal of Mental Health and Addiction, 2013
Even when relatively infrequent, careless and random responding (C/RR) can have robust effects on individual and group data and thereby distort clinical evaluations and research outcomes. Given such potential adverse impacts and the broad use of self-report measures when appraising addictions and addictive behavior, the detection of C/RR can…
Descriptors: Addictive Behavior, Response Style (Tests), Test Items, Validity
Geerlings, Hanneke; van der Linden, Wim J.; Glas, Cees A. W. – Applied Psychological Measurement, 2013
Optimal test-design methods are applied to rule-based item generation. Three different cases of automated test design are presented: (a) test assembly from a pool of pregenerated, calibrated items; (b) test generation on the fly from a pool of calibrated item families; and (c) test generation on the fly directly from calibrated features defining…
Descriptors: Test Construction, Test Items, Item Banks, Automation
Cheng, Ying; Chen, Peihua; Qian, Jiahe; Chang, Hua-Hua – Applied Psychological Measurement, 2013
Differential item functioning (DIF) analysis is an important step in the data analysis of large-scale testing programs. Nowadays, many such programs endorse matrix sampling designs to reduce the load on examinees, such as the balanced incomplete block (BIB) design. These designs pose challenges to the traditional DIF analysis methods. For example,…
Descriptors: Test Bias, Equated Scores, Test Items, Effect Size
Koen, Joshua D.; Yonelinas, Andrew P. – Journal of Experimental Psychology: Learning, Memory, and Cognition, 2013
Koen and Yonelinas (2010) contrasted the recollection and encoding variability accounts of the finding that old items are associated with more variable memory strength than new items. The study indicated that (a) increasing encoding variability did not lead to increased measures of old item variance, and (b) old item variance was directly related…
Descriptors: Recall (Psychology), Memory, Cognitive Processes, Models
Matlock, Ki Lynn – ProQuest LLC, 2013
When test forms that have equal total test difficulty and number of items vary in difficulty and length within sub-content areas, an examinee's estimated score may vary across equivalent forms, depending on how well his or her true ability in each sub-content area aligns with the difficulty of items and number of items within these areas.…
Descriptors: Test Items, Difficulty Level, Ability, Test Content

