Peer reviewed: Whitely, Susan E. – Intelligence, 1980
This article examines the potential contribution of latent trait models to the study of intelligence. Nontechnical introductions to both unidimensional and multidimensional latent trait models are given. Multidimensional latent trait models can be used to test alternative multiple component theories of test item processing. (Author/CTM)
Descriptors: Ability, Aptitude Tests, Cognitive Processes, Intelligence
Peer reviewed: Hayward, Malcolm – Teaching English in the Two-Year College, 1989
Describes students' reactions to questions on essay tests. Examines relationships between attitudes and essay questions, the effects of readability on student attitudes, and the effects of question length and rhetorical structure. Concludes that certain rhetorical features of essay questions affect how students respond on tests. (KEH)
Descriptors: Educational Research, English Instruction, Essay Tests, Evaluation Methods
Peer reviewed: Tittle, Carol Kehr – Educational Measurement: Issues and Practice, 1989
An expanded framework for validating tests is needed to include the perspectives of teachers and students as well as of test makers and scientists. The development of educational assessments must take place within an understanding of how tests are used in context. (SLD)
Descriptors: Educational Assessment, Elementary Secondary Education, Evaluation Utilization, Learning Processes
Peer reviewed: Douglas, Dan – Annual Review of Applied Linguistics, 1995
Reviews recent theoretical, methodological, and analytical developments in language testing, focusing on more refined models of language ability, reliability and validity, performance testing, innovative test formats, and new applications of Item Response Theory and Generalizability Theory to test performance. An annotated bibliography discusses seven…
Descriptors: Annotated Bibliographies, Evaluation Methods, Language Proficiency, Language Tests
Peer reviewed: Whitehead, Bruce; Santee, Phillip – Clearing House, 1994
Discusses the use of standardized test results as a guide to developing curriculum content. Discusses such a plan being used (and offers data gathered) at Hellgate Elementary School, Montana, as an example. (JC)
Descriptors: Criterion Referenced Tests, Curriculum Development, Educational Research, Elementary Education
Peer reviewed: Crowley, Susan L.; And Others – Educational and Psychological Measurement, 1994
Dependability of the Children's Depression Inventory (CDI) was studied using both generalizability and classical test score analyses with a sample of 164 elementary school students. Results suggest that sources of error variance interact to decrease dependability of CDI scores. Depression in children might be better assessed through multiple…
Descriptors: Children, Clinical Diagnosis, Comparative Analysis, Depression (Psychology)
Peer reviewed: Gitomer, Drew H.; Yamamoto, Kentaro – Journal of Educational Measurement, 1991
A model integrating latent trait and latent class theories in characterizing individual performance on the basis of qualitative understanding is presented. This HYBRID model is illustrated through experiments with 119 Air Force technicians taking a paper-and-pencil test and 136 Air Force technicians taking a computerized test. (SLD)
Descriptors: Comparative Testing, Computer Assisted Testing, Educational Assessment, Item Response Theory
Peer reviewed: Elliott, B. J. – History of Education, 1991
Describes development of qualifying examinations in English and Welsh secondary schools. Evaluates forms and quality of testing in history. Presents suggestions by experts and changes made in England and Wales over 21-year period. Finds that most problems stemmed from desires to cover long periods of time while providing the depth of coverage…
Descriptors: College Entrance Examinations, Educational Attainment, Educational History, Evaluation Utilization
Watson, Kathy; Baranowski, Tom; Thompson, Debbe – Health Education Research, 2006
Perceived self-efficacy (SE) for eating fruit and vegetables (FV) is a key variable mediating FV change in interventions. This study applies item response modeling (IRM) to a fruit, juice and vegetable self-efficacy questionnaire (FVSEQ) previously validated with classical test theory (CTT) procedures. The 24-item (five-point Likert scale) FVSEQ…
Descriptors: Self Efficacy, Ethnic Groups, Questionnaires, Likert Scales
Cizek, Gregory J.; Crocker, Linda; Frisbie, David A.; Mehrens, William A.; Stiggins, Richard J. – Educational Measurement: Issues and Practice, 2006
The authors describe the significant contributions of Robert Ebel to educational measurement theory and its applications. A biographical sketch details Ebel's roots and professional resume. His influence on classroom assessment views and procedures is explored. Classic publications associated with validity, reliability, and score interpretation…
Descriptors: Test Theory, Educational Assessment, Psychometrics, Test Reliability
Handel, Richard W.; Arnau, Randolph C.; Archer, Robert P.; Dandy, Kristina L. – Assessment, 2006
The Minnesota Multiphasic Personality Inventory--Adolescent (MMPI-A) and Minnesota Multiphasic Personality Inventory--2 (MMPI-2) True Response Inconsistency (TRIN) scales are measures of acquiescence and nonacquiescence included among the standard validity scales on these instruments. The goals of this study were to evaluate the effectiveness of…
Descriptors: Adolescents, Protocol Analysis, Effect Size, Personality Measures
McDonald, Roderick P. – Alberta Journal of Educational Research, 2003
The concept of a behavior domain is a reasonable and essential foundation for psychometric work based on true score theory, the linear model of common factor analysis, and the nonlinear models of item response theory. Investigators applying these models to test data generally treat the true scores or factors or traits as abstractive psychological…
Descriptors: Factor Analysis, Error of Measurement, True Scores, Psychometrics
Dudley, Albert – Language Testing, 2006
This study examined the multiple true-false (MTF) test format in second language testing by comparing multiple-choice (MCQ) and MTF test formats in two language areas of general English: vocabulary and reading. Two counter-balanced experimental designs--one for each language area--were examined in terms of the number of MCQ…
Descriptors: Second Language Learning, Test Format, Validity, Testing
Lam, Peter; Foong, Yoke-Yeen – 1996
An important principle in constructing rating scales is to develop items that reflect various degrees of the "pro" (positive) and "contra" (negative) aspects of the trait being measured. Where both positive and negative items are pooled, they can be arranged in order along the trait continuum, but for classical and item…
Descriptors: Attitude Measures, Foreign Countries, Internship Programs, Item Response Theory
Thompson, Bruce; Crowley, Susan – 1994
Most training programs in education and psychology focus on classical test theory techniques for assessing score dependability. This paper discusses generalizability theory and explores its concepts using a small heuristic data set. Generalizability theory subsumes and extends classical test score theory. It is able to estimate the magnitude of…
Descriptors: Analysis of Variance, Cutting Scores, Decision Making, Error of Measurement

