Gierl, Mark J.; Leighton, Jacqueline P.; Wang, Changjiang; Zhou, Jiawen; Gokiert, Rebecca; Tan, Adele – College Board, 2009
The purpose of the study is to present research focused on validating the four algebra cognitive models in Gierl, Wang, et al., using student response data collected with protocol analysis methods to evaluate the knowledge structures and processing skills used by a sample of SAT test takers.
Descriptors: Algebra, Mathematics Tests, College Entrance Examinations, Student Attitudes
Solano-Flores, Guillermo; Li, Min – Educational Assessment, 2009
We investigated language variation and score variation in the testing of English language learners who were native Spanish speakers. We gave students the same set of National Assessment of Educational Progress mathematics items in both their first language and their second language. We examined the amount of score variation due to the main and interaction…
Descriptors: Scores, Testing, Second Language Learning, English (Second Language)
Turner, Steven L. – Middle School Journal (J3), 2009
Over the last decade, high-stakes test preparation has crept into the inventory of developmentally responsive middle level instructional practices. Amid calls for increased accountability and more rigorous curriculum and academic standards, the middle school movement now finds itself in a spotlight of intense scrutiny. This article examines the…
Descriptors: Standardized Tests, High Stakes Tests, Academic Standards, Accountability
Huang, Yueh-Min; Lin, Yen-Ting; Cheng, Shu-Chen – Computers & Education, 2009
With the rapid growth of computer and mobile technology, it is a challenge to integrate computer-based testing (CBT) with mobile learning (m-learning), especially for formative assessment and self-assessment. For self-assessment, a computer adaptive test (CAT) is a suitable way to enable students to evaluate themselves. In CAT, students are…
Descriptors: Self Evaluation (Individuals), Test Items, Formative Evaluation, Educational Assessment
Hanson, Bradley A.; Feinstein, Zachary S. – 1995
This paper discusses loglinear models for assessing differential item functioning (DIF). Loglinear and logit models that have been suggested for studying DIF are reviewed, and loglinear formulations of the logit models are given. A polynomial loglinear model for assessing DIF is introduced. Two examples using the polynomial loglinear model for…
Descriptors: Equated Scores, Item Bias, Test Format, Test Items
Chang, Hua-Hua; Mazzeo, John – 1993
The item response function (IRF) for a polytomously scored item is defined as a weighted sum of the item category response functions (ICRF, the probability of getting a particular score for a randomly sampled examinee of ability theta). This paper establishes the correspondence between an IRF and a unique set of ICRFs for two of the most commonly…
Descriptors: Classification, Item Response Theory, Scores, Scoring
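Note on the Chang and Mazzeo abstract: the weighted-sum definition it states can be written out as below. This is only a sketch, assuming an item scored in integer categories 0, 1, ..., m with the category scores used as the weights; the symbols T, P_k, and m are notation chosen here for illustration and do not come from the record.

\[
T(\theta) = \sum_{k=0}^{m} k \, P_k(\theta), \qquad \sum_{k=0}^{m} P_k(\theta) = 1
\]

Here P_k(\theta) is the ICRF for score k, that is, the probability that a randomly sampled examinee of ability \theta obtains score k. Under these assumptions, a dichotomous item (m = 1) gives T(\theta) = P_1(\theta).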
Woodruff, David J. – 1995
The one observation per cell two-way items by examinees random effects analysis of variance (ANOVA) with all error components zero is considered. The estimated variance components are expressed as functions of the inter-item covariance matrix and the inter-examinee covariance matrix. These expressions show that under the random effects model if…
Descriptors: Analysis of Variance, Estimation (Mathematics), Matrices, Test Items
Cliff, Norman – Journal of Educational Statistics, 1984 (peer reviewed)
The proposed coefficient is derived by assuming that the average Goodman-Kruskal gamma between items of identical difficulty would be the same for items of different difficulty. An estimate of covariance between items of identical difficulty leads to an estimate of the correlation between two tests with identical distributions of difficulty.…
Descriptors: Difficulty Level, Mathematical Formulas, Test Items, Test Reliability
Veldkamp, Bernard P. – 2002
This paper discusses optimal test construction, which deals with the selection of items from a pool to construct a test that performs optimally with respect to the objective of the test and simultaneously meets all test specifications. Optimal test construction problems can be formulated as mathematical decision models. Algorithms and heuristics…
Descriptors: Algorithms, Item Banks, Selection, Test Construction
Huang, Chi-Yu; Lohss, William E.; Lin, Chuan-Ju; Shin, David – 2002
This study was conducted to compare the usefulness of three item response theory (IRT) calibration packages (BILOG, BILOG-MG, and PIC) for examinations that include common and specialty components. Because small sample sizes and different mean abilities between specialty components are the most frequent problems that licensure/certification…
Descriptors: Item Response Theory, Licensing Examinations (Professions), Test Items
Burstein, Jill; Wolff, Susanne; Lu, Chi – 2001
The research described in this paper shows the use of lexical semantic techniques for automated scoring of short-answer and essay responses from performance-based test items. Researchers used lexical semantic techniques in order to identify the meaningful content of free-text responses for small data sets. One data set involved 172 training…
Descriptors: Essays, Performance Based Assessment, Scoring, Test Items
van der Linden, Wim J.; Veldkamp, Bernard P.; Reese, Lynda M. – 2000
Presented is an integer-programming approach to item pool design that can be used to calculate an optimal blueprint for an item pool to support an existing testing program. The results are optimal in the sense that they minimize the efforts involved in actually producing the items as revealed by current item writing patterns. Also presented is an…
Descriptors: Item Banks, Test Construction, Test Items, Testing Programs
Stout, William; Ackerman, Terry; Bolt, Dan; Froelich, Amy Goodwin; Heck, Dan – 2003
This study evaluated the practical benefit, if any, of using collateral information for one item type when statistically analyzing pretest items of some other item type. The criterion for evaluation of pretest item calibration accuracy was the reduction achieved by the use of collateral information in the number of test takers that must be…
Descriptors: Item Response Theory, Pretesting, Test Construction, Test Items
Luppescu, Stuart – 2002
This study compared the ability of hierarchical linear modeling (HLM) to detect differential item functioning (DIF) with that of standard DIF detection methods, such as Rasch difficulty difference. The main advantages of using HLM for DIF detection are that the person abilities so produced are adjusted for any DIF in the items, and the DIF can then be…
Descriptors: Item Bias, Item Response Theory, Simulation, Test Items

