Hogan, Thomas P.; Murphy, Gavin – Applied Measurement in Education, 2007
We determined the recommendations for preparing and scoring constructed-response (CR) test items in 25 sources (textbooks and chapters) on educational and psychological measurement. The project was similar to Haladyna's (2004) analysis for multiple-choice items. We identified 12 recommendations for preparing CR items given by multiple sources,…
Descriptors: Test Items, Scoring, Test Construction, Educational Indicators
Yang, Xiangdong – Educational and Psychological Measurement, 2007
This article investigates several methods of identifying individual guessers from their response data. Both the posterior probability method and the likelihood ratio method are based on the two-state mixture modeling approach to response times. The accuracy method is based on response accuracy data. Results from the simulation study showed that…
Descriptors: Probability, Simulation, Test Items, Models
Garoff-Eaton, Rachel J.; Kensinger, Elizabeth A.; Schacter, Daniel L. – Learning & Memory, 2007
False recognition, broadly defined as a claim to remember something that was not encountered previously, can arise for multiple reasons. For instance, a distinction can be made between conceptual false recognition (i.e., false alarms resulting from semantic or associative similarities between studied and tested items) and perceptual false…
Descriptors: Semantics, Recognition (Psychology), Correlation, Neurological Organization
Li, Chi-Sing; Trusty, Jerry; Lampe, Richard; Lin, Yu Fen – International Journal of Educational Leadership Preparation, 2008
The researchers of this study investigated how frequently a set of 17 non-academic behavioral indicators was used to determine impairment of master's-level counseling students that resulted in remediation and termination. Thirty-five academic unit leaders of the Council for Accreditation of Counseling and Related Educational Programs (CACREP)…
Descriptors: Remedial Instruction, Counselor Training, Expulsion, Special Needs Students
Wolf, Mikyung Kim; Herman, Joan L.; Kim, Jinok; Abedi, Jamal; Leon, Seth; Griffin, Noelle; Bachman, Patina L.; Chang, Sandy M.; Farnsworth, Tim; Jung, Hyekyung; Nollner, Julie; Shin, Hye Won – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2008
This research project addresses the validity of assessments used to measure the performance of English language learners (ELLs), such as those mandated by the No Child Left Behind Act of 2001 (NCLB, 2002). The goals of the research are to help educators understand and improve ELL performance by investigating the validity of their current…
Descriptors: Validity, Second Language Learning, Researchers, Language Proficiency
Rodebaugh, Thomas L.; Holaway, Robert M.; Heimberg, Richard G. – Assessment, 2008
Despite favorable psychometric properties, the Generalized Anxiety Disorder Questionnaire for the "Diagnostic and Statistical Manual of Mental Disorders" (4th ed.) (GAD-Q-IV) does not have a known factor structure, which calls into question use of its original weighted scoring system (usually referred to as the dimensional score).…
Descriptors: Mental Disorders, Factor Structure, Measures (Individuals), Scoring
Malabonga, Valerie; Kenyon, Dorry M.; Carlo, Maria; August, Diane; Louguit, Mohammed – Language Testing, 2008
This paper describes the development and validation of the Cognate Awareness Test (CAT), which measures cognate awareness in Spanish-speaking English Language Learners (ELLs) in fourth and fifth grade. An investigation of differential performance on the two subtests of the CAT (cognates and noncognates) provides evidence that the instrument is…
Descriptors: Speech Communication, Second Language Learning, Grade 4, Grade 5
Stocking, Martha L. – 1993
In the context of paper and pencil testing, the frequency of the exposure of items is usually controlled through policies that regulate both the reuse of test forms and the frequency with which a candidate may retake the test. In the context of computerized adaptive testing, where item pools are large and expensive to produce and testing can be on…
Descriptors: Adaptive Testing, Computer Assisted Testing, Item Banks, Models
Wang, Tianyou – 1996
In this paper, formulas for computing the weights that maximize the reliability of a test with multiple parts are derived using a congeneric model. A direct derivation for the three-part test case and a two-step derivation for the n-part case are presented, and results for these two approaches are shown to be consistent for the three-part…
Descriptors: Computation, Equations (Mathematics), Matrices, Performance Based Assessment
Johanson, George A.; Gips, Crystal J. – 1993
The decision to use a forced-choice test item format versus an item format where choice is not forced (e.g., a Likert scale) might best be determined by the nature of the information sought since the difficult decisions required for forced-choice formats may result in a different scaling than an unforced method. If a forced choice is desired,…
Descriptors: Administrator Attitudes, Comparative Analysis, Likert Scales, Principals
Holland, Paul W. – 1989
A simple technique, developed by A. Phillips (1987), is used to approximate the covariance between the Mantel-Haenszel log-odds-ratio estimator for a 2 x 2 x k table and the sample marginal proportions. These results are then applied to obtain an approximate variance estimate of an adjusted risk difference based on the Mantel-Haenszel odds-ratio…
Descriptors: Difficulty Level, Estimation (Mathematics), Item Bias, Risk
Shen, Linjun – 1997
Three aspects of the usual approach to assessing local item dependency, Yen's "Q" (H. Huynh, H. Michaels, and S. Ferrara, 1995), deserve further investigation. Pearson correlation coefficients do not distribute normally when the coefficients are large, and thus cannot quantify the dependency well. In the second place, the accuracy of…
Descriptors: Ability, Estimation (Mathematics), Item Response Theory, Reliability
Meijer, Rob R.; Sijtsma, Klaas – 1994
Methods for detecting item score patterns that are unlikely (aberrant) given that a parametric item response theory (IRT) model gives an adequate description of the data or given the responses of the other persons in the group are discussed. The emphasis here is on the latter group of statistics. These statistics can be applied when a…
Descriptors: Foreign Countries, Identification, Item Response Theory, Nonparametric Statistics
Spray, Judith; Miller, Tim – 1994
Computer simulations under three conditions of polytomous differential item functioning (DIF) compared the ability of three different statistical procedures to detect nonuniform DIF. The procedures were a nominal and an ordinal extension of the Mantel-Haenszel statistic, and logistic discriminant function analysis. Results showed that only the…
Descriptors: Computer Simulation, Identification, Item Bias, Sample Size
Bergstrom, Betty A.; Lunz, Mary E. – 1998
This paper addresses questions of whether positively- and negatively-worded items measure the same construct and whether the rating scale categories "strongly agree" to "strongly disagree" are used in the same way for both types of items. Item response theory (IRT), specifically the Andrich Rating Scale Model (B. Wright and G.…
Descriptors: Adults, Item Response Theory, Rating Scales, Research Methodology

