[Peer reviewed] Douglas, Dan; Selinker, Larry – Language Testing, 1985
Discusses an alternative framework for handling language testing and proposes some tentative hypotheses concerning principles of language testing. Suggests that taking account of both interlanguage domain engagement and contextualization in testing research, production, and interpretation allows for a richer conceptualization of the language…
Descriptors: English (Second Language), Interference (Language), Interlanguage, Language Proficiency
[Peer reviewed] Feldt, Leonard S.; Spray, Judith A. – Research Quarterly for Exercise and Sport, 1983
The reliabilities of two types of measurement plans were compared across six hypothetical distributions of true scores or abilities. The measurement plans were: (1) fixed-length, where the number of trials for all examinees is set in advance; and (2) trials-to-criterion, where examinees must keep trying until they complete a given number of trials…
Descriptors: Criterion Referenced Tests, Evaluation Methods, Higher Education, Measurement Techniques
Mislevy, Robert J. – 1994
Recent developments in cognitive and educational psychology, such as increased appreciation of the situated nature of learning and understanding, call for broader ranges of student models and types of data than those standard in testing today. We must specify how what we observe on the test is related to competence as we conceptualize it, and…
Descriptors: Evaluation Criteria, Inferences, Information Needs, Language Aptitude
[Peer reviewed] Linn, Robert L. – Educational Measurement: Issues and Practice, 1982
Confusion in the terminology used in criterion-referenced measurement specifications, development, and standard setting, and the attendant role of cut-off scores, are shown to need practical clarification through psychometric research on test applications and consequences. (CM)
Descriptors: Academic Standards, Criterion Referenced Tests, Cutting Scores, Measurement Objectives
[Peer reviewed] Whitely, Susan E. – Intelligence, 1980
This article examines the potential contribution of latent trait models to the study of intelligence. Nontechnical introductions to both unidimensional and multidimensional latent trait models are given. Multidimensional latent trait models can be used to test alternative multiple component theories of test item processing. (Author/CTM)
Descriptors: Ability, Aptitude Tests, Cognitive Processes, Intelligence
[Peer reviewed] Hayward, Malcolm – Teaching English in the Two-Year College, 1989
Describes students' reactions to questions on essay tests. Examines relationships between attitudes and essay questions, the effects of readability on student attitudes, and the effects of question length and rhetorical structure. Concludes that certain rhetorical features of essay questions affect how students respond on tests. (KEH)
Descriptors: Educational Research, English Instruction, Essay Tests, Evaluation Methods
[Peer reviewed] Tittle, Carol Kehr – Educational Measurement: Issues and Practice, 1989
An expanded framework for validating tests is needed to include the perspectives of teachers and students as well as of test makers and scientists. The development of educational assessments must take place within an understanding of how tests are used in context. (SLD)
Descriptors: Educational Assessment, Elementary Secondary Education, Evaluation Utilization, Learning Processes
[Peer reviewed] Douglas, Dan – Annual Review of Applied Linguistics, 1995
Reviews recent theoretical, methodological, and analytical developments in language testing, focusing on more refined models of language ability, reliability and validity, performance testing, innovative test formats, and new applications of Item Response Theory and Generalizability Theory to test performance. An annotated bibliography discusses seven…
Descriptors: Annotated Bibliographies, Evaluation Methods, Language Proficiency, Language Tests
[Peer reviewed] Whitehead, Bruce; Santee, Phillip – Clearing House, 1994
Discusses the use of standardized test results as a guide to developing curriculum content. Presents, as an example, such a plan in use at Hellgate Elementary School, Montana, along with data gathered there. (JC)
Descriptors: Criterion Referenced Tests, Curriculum Development, Educational Research, Elementary Education
[Peer reviewed] Crowley, Susan L.; And Others – Educational and Psychological Measurement, 1994
Dependability of the Children's Depression Inventory (CDI) was studied using both generalizability and classical test score analyses with a sample of 164 elementary school students. Results suggest that sources of error variance interact to decrease dependability of CDI scores. Depression in children might be better assessed through multiple…
Descriptors: Children, Clinical Diagnosis, Comparative Analysis, Depression (Psychology)
[Peer reviewed] Gitomer, Drew H.; Yamamoto, Kentaro – Journal of Educational Measurement, 1991
A model integrating latent trait and latent class theories in characterizing individual performance on the basis of qualitative understanding is presented. This HYBRID model is illustrated through experiments with 119 Air Force technicians taking a paper-and-pencil test and 136 Air Force technicians taking a computerized test. (SLD)
Descriptors: Comparative Testing, Computer Assisted Testing, Educational Assessment, Item Response Theory
[Peer reviewed] Elliott, B. J. – History of Education, 1991
Describes development of qualifying examinations in English and Welsh secondary schools. Evaluates forms and quality of testing in history. Presents suggestions by experts and changes made in England and Wales over 21-year period. Finds that most problems stemmed from desires to cover long periods of time while providing the depth of coverage…
Descriptors: College Entrance Examinations, Educational Attainment, Educational History, Evaluation Utilization
McDonald, Roderick P. – Alberta Journal of Educational Research, 2003
The concept of a behavior domain is a reasonable and essential foundation for psychometric work based on true score theory, the linear model of common factor analysis, and the nonlinear models of item response theory. Investigators applying these models to test data generally treat the true scores or factors or traits as abstractive psychological…
Descriptors: Factor Analysis, Error of Measurement, True Scores, Psychometrics
Lam, Peter; Foong, Yoke-Yeen – 1996
An important principle in constructing rating scales is to develop items that reflect various degrees of the "pro" (positive) and "contra" (negative) aspects of the trait being measured. Where both positive and negative items are pooled, they can be arranged in order along the trait continuum, but for classical and item…
Descriptors: Attitude Measures, Foreign Countries, Internship Programs, Item Response Theory
Thompson, Bruce; Crowley, Susan – 1994
Most training programs in education and psychology focus on classical test theory techniques for assessing score dependability. This paper discusses generalizability theory and explores its concepts using a small heuristic data set. Generalizability theory subsumes and extends classical test score theory. It is able to estimate the magnitude of…
Descriptors: Analysis of Variance, Cutting Scores, Decision Making, Error of Measurement

