Mislevy, Robert J. – 1991
This paper lays out a framework for comparing the qualities and the quantities of information about student competence provided by multiple-choice and free-response test items. After discussing the origins of multiple-choice testing and recent influences for change, the paper outlines an "inference network" approach to test theory, in…
Descriptors: Cognitive Psychology, Competence, Elementary Secondary Education, Inferences
Morrison, Carol A.; Fitzpatrick, Steven J. – 1992
An attempt was made to determine which item response theory (IRT) equating method results in the least amount of equating error or "scale drift" when equating scores across one or more test forms. An internal anchor test design was employed with five different test forms, each consisting of 30 items, 10 in common with the base test and 5…
Descriptors: Comparative Analysis, Computer Simulation, Equated Scores, Error of Measurement
Emmert, Philip; And Others – 1992
A study examined the interrelationships among the Listening Practices Feedback Report (LPFR) items to determine whether an underlying theoretical construct of listening perceptions is measured by the LPFR. As part of LPFR listening training sessions, LPFR respondent and associate(s) scores were gathered from several major companies and organizations.…
Descriptors: Evaluation Methods, Evaluation Problems, Factor Analysis, Interpersonal Communication
Bogan, Evelyn Doody; Yen, Wendy M. – 1983
Four multidimensional data configurations and one unidimensional data configuration were simulated for three differences in mean difficulty between two tests to be equated. Two chi-square statistics, Q1 and Q2, were examined for their ability to detect multidimensionality. Results indicated that Q1 did not discriminate between any of the…
Descriptors: Difficulty Level, Equated Scores, Goodness of Fit, Latent Trait Theory
McCaig, Roger A. – 1982
This paper was prepared for school officials and researchers who plan to conduct an assessment of student writing but have limited field experience with this activity. The paper identifies twelve critical questions assessors should consider, and it explores issues involved in reaching a decision about each from the perspectives of measurement…
Descriptors: Administrators, Educational Researchers, Elementary Secondary Education, Evaluation Methods
Jacobs, Suzanne E. – 1986
Effective writing assessment involves judging how well a writer is encouraged by the classroom's social context to pull together ideas and to bring experience to bear on abstractions. Four main points can be made to justify this view. First, assessment by standardized test determines a teach-and-test model of instruction. But a curriculum that…
Descriptors: Elementary Secondary Education, Holistic Approach, Standardized Tests, Teaching Models
Thorndike, Robert L. – 1986
The general ability factor (G), as enunciated by Charles Spearman in the model of cognitive functioning, has been the foundation of psychometric theory and test-making practices for 80 years. Through these decades, some psychologists disagreed with this theory, especially Godfrey Thomson and E. L. Thorndike. Nevertheless, various aptitude tests…
Descriptors: Aptitude Tests, Cognitive Measurement, Cognitive Processes, Intelligence Tests
Coffman, William E. – 1986
The symposium, "Taming the Rasch Tiger: Using Item Response Theory in Practical Educational Measurement," was organized to deemphasize the technical complexities of item response theory (IRT) and to show the audience how IRT can be used in practical educational measurement. Four papers from the symposium are summarized and comments are…
Descriptors: Achievement Tests, Adaptive Testing, Computer Assisted Testing, Item Banks
Levine, Michael V. – 1982
Significant to a latent trait or item response theory analysis of a mental test is the determination of exactly what is being quantified. The following are practical problems to be considered in the formulation of a good theory: (1) deciding whether two tests measure the same trait or traits; (2) analyzing the relative contributions of a pair of…
Descriptors: Item Analysis, Latent Trait Theory, Mathematical Models, Measurement Techniques
de Jong, John H. A. L. – Taaltoetsen: Toegepaste taalwetenschap in artikelen 31, 1988
The one-parameter psychometric model known as the Rasch model is described and examined. The basic principles underlying the model and the concepts of unidimensionality, local stochastic independence, and additivity are explained in non-mathematical terms. The requirements of measurement procedures, the measurement of latent traits, the control on…
Descriptors: English (Second Language), French, Language Tests, Listening Comprehension Tests
Brittain, Mary M.; Brittain, Clay V. – 1981
A behavioral domain is well-defined when it is clear to both test developers and test users which categories of performance should or should not be considered for potential test items. Only those tests that are keyed to well-defined domains meet the definition of criterion-referenced tests. The greatest proliferation of criterion-referenced tests…
Descriptors: Criterion Referenced Tests, Reading Achievement, Reading Tests, Test Construction
Divgi, D. R. – 1980
A method is proposed for providing an absolute, in contrast to comparative, evaluation of how well two tests are equated by transforming their raw scores into a particular common scale. The method is direct, not requiring creation of a standard for comparison; expresses its results in scaled rather than raw scores; and allows examination of the…
Descriptors: Equated Scores, Evaluation Criteria, Item Analysis, Latent Trait Theory
Mardell-Czudnowski, Carol; And Others – Canadian Journal for Exceptional Children, 1987
Thirty French-speaking Quebec preschoolers were administered translated versions of the Developmental Indicators for the Assessment of Learning-Revised (DIAL-R) and the Kaufman Assessment Battery for Children (K-ABC). Discussion focuses on aspects of translation and content requiring further modification, and on issues regarding use of translated…
Descriptors: Diagnostic Tests, Educational Diagnosis, Foreign Countries, Preschool Children
Peer reviewed. Bracken, Bruce A. – Journal of School Psychology, 1988
Notes that significantly different results frequently exist between tests that purport to measure the same skill when the same child is tested on both instruments. Considers discrepancies related to examinee, examiner, examinee-examiner interactions, environment, and psychometric characteristics of the tests employed. Cites 10 major psychometric…
Descriptors: Educational Diagnosis, Individual Differences, Psychological Evaluation, Psychological Testing
Peer reviewed. Huynh, Huynh; Casteel, Jim – Journal of Educational Statistics, 1985
Two approaches, the minimax approach and the Rasch procedure, are described for the simultaneous determination of passing scores for subtests when the passing score for the total test is known. (Author/LMO)
Descriptors: Cutting Scores, Educational Assessment, Elementary Secondary Education, Latent Trait Theory


