Publication Date
| In 2026 | 0 |
| Since 2025 | 8 |
| Since 2022 (last 5 years) | 36 |
| Since 2017 (last 10 years) | 115 |
| Since 2007 (last 20 years) | 378 |
Descriptor
| Test Theory | 1166 |
| Test Items | 262 |
| Test Reliability | 252 |
| Test Construction | 246 |
| Test Validity | 245 |
| Psychometrics | 183 |
| Scores | 176 |
| Item Response Theory | 168 |
| Foreign Countries | 160 |
| Item Analysis | 141 |
| Statistical Analysis | 134 |
Location
| United States | 17 |
| United Kingdom (England) | 15 |
| Canada | 14 |
| Australia | 13 |
| Turkey | 12 |
| Sweden | 8 |
| United Kingdom | 8 |
| Netherlands | 7 |
| Texas | 7 |
| New York | 6 |
| Taiwan | 6 |
Laws, Policies, & Programs
| No Child Left Behind Act 2001 | 4 |
| Elementary and Secondary… | 3 |
| Individuals with Disabilities… | 3 |
Peer reviewed: Strein, William – Journal of School Psychology, 1990
Compared the Woodcock-Johnson Tests of Cognitive Ability (WJTCA) score profiles of different cultural groups, using 442 White and 435 non-White subjects drawn from the kindergarten through grade 12 subset of WJTCA standardization data. Determined that data allowed for classification of the subtests by both curve and cultural effects criteria.…
Descriptors: Classification, Cognitive Ability, Cognitive Measurement, Elementary School Students
Jacobson, Linda – American School Board Journal, 1996
Education standards are left to the discretion of individual states. However, efforts to help states and local school districts define world-class standards are intensifying. The U.S. Department of Education, the National Education Goals Panel, and New Standards, a partnership of 17 states and 6 school districts, are among those involved. (MLF)
Descriptors: Academic Standards, Benchmarking, Comparative Analysis, Educational Assessment
Peer reviewed: Mislevy, Robert J. – Psychometrika, 1994
Educational assessment concerns inference about student knowledge, skills, and accomplishments. Test theory has evolved in part to address questions of weight, coverage, and import of data. Resulting concepts and techniques can be viewed as applications of more general principles for inference in the presence of uncertainty. (SLD)
Descriptors: Bayesian Statistics, Cognitive Psychology, Educational Assessment, Inferences
Peer reviewed: Ornstein, Allan C.; Gilman, David A. – Contemporary Education, 1991
Explains and contrasts the techniques and philosophies of norm-referenced (NRT) and criterion-referenced tests (CRT). NRTs are criticized for lack of useful information and control. CRTs are usually teacher-made and customized to fit the classroom needs, offering more control over test content, but few teachers are prepared to develop them. (SM)
Descriptors: Criterion Referenced Tests, Elementary Secondary Education, Norm Referenced Tests, Standardized Tests
Peer reviewed: Cook, Linda L.; Eignor, Daniel R. – Educational Measurement: Issues and Practice, 1991
This paper provides the basis for understanding score equating through item response theory (IRT). Theoretical justifications and practical advantages of IRT true-score test procedures are discussed. Three steps in the equating process are specified, and a self-test is included. (SLD)
Descriptors: Equated Scores, Equations (Mathematics), Item Response Theory, Mathematical Models
Peer reviewed: Marsh, Herbert W.; And Others – Multivariate Behavioral Research, 1992
Results of a reanalysis of previously published data (B. M. Byrne, 1989) support the correlated uniqueness model, diagnostic tests of the validity of confirmatory factor analysis (CFA), multitrait multimethod (MTMM) solutions, inclusion of external validity in MTMM design, and application of factorial invariance to test stability of CFA-MTMM…
Descriptors: Academic Achievement, Construct Validity, Elementary Secondary Education, High Achievement
Peer reviewed: Banerji, Madhabi; Ferron, John – Educational and Psychological Measurement, 1998
Three analytic approaches were used in a framework of classical test theory to examine the construct validity of a mathematics assessment of 16 constructed response items. Results from 280 elementary school students across four age groups suggest a developmental structure of tasks and subdomains that was generally consistent with the test's…
Descriptors: Age Differences, Child Development, Construct Validity, Constructed Response
Peer reviewed: Wheeler, Patricia H. – Evaluation Practice, 1995
This volume is the fourth in a series for college faculty and advanced graduate students, "Survival Skills for Scholars." It offers practical advice for developing, using, and grading classroom examinations, focusing on traditional multiple-choice and constructed-response tests rather than alternative assessments. (SLD)
Descriptors: College Faculty, Constructed Response, Grading, Higher Education
Fenna, Doug S. – European Journal of Engineering Education, 2004
Multiple-choice testing (MCT) has several advantages which are becoming more relevant in the current financial climate. In particular, MCTs can be machine marked. As an objective testing method, MCT is particularly relevant to engineering and other factual courses, but MCTs are not widely used in engineering because students can benefit from…
Descriptors: Guessing (Tests), Testing, Multiple Choice Tests, Engineering Education
Whitman, Glenn – History Teacher, 2003
In May 2001, students in the author's Advanced Placement (AP) United States History class were embroiled in a controversy surrounding the AP exam, in particular, having access to the exam's Document Based Question (DBQ) and free response portion prior to the test's administration. Prior to the exam, the College Board had provided a fifty-year time…
Descriptors: United States History, Standardized Tests, Advanced Placement Programs, Integrity
Watson, Kathy; Baranowski, Tom; Thompson, Debbe – Health Education Research, 2006
Perceived self-efficacy (SE) for eating fruit and vegetables (FV) is a key variable mediating FV change in interventions. This study applies item response modeling (IRM) to a fruit, juice and vegetable self-efficacy questionnaire (FVSEQ) previously validated with classical test theory (CTT) procedures. The 24-item (five-point Likert scale) FVSEQ…
Descriptors: Self Efficacy, Ethnic Groups, Questionnaires, Likert Scales
Cizek, Gregory J.; Crocker, Linda; Frisbie, David A.; Mehrens, William A.; Stiggins, Richard J. – Educational Measurement: Issues and Practice, 2006
The authors describe the significant contributions of Robert Ebel to educational measurement theory and its applications. A biographical sketch details Ebel's roots and professional resume. His influence on classroom assessment views and procedures is explored. Classic publications associated with validity, reliability, and score interpretation…
Descriptors: Test Theory, Educational Assessment, Psychometrics, Test Reliability
Handel, Richard W.; Arnau, Randolph C.; Archer, Robert P.; Dandy, Kristina L. – Assessment, 2006
The Minnesota Multiphasic Personality Inventory--Adolescent (MMPI-A) and Minnesota Multiphasic Personality Inventory--2 (MMPI-2) True Response Inconsistency (TRIN) scales are measures of acquiescence and nonacquiescence included among the standard validity scales on these instruments. The goals of this study were to evaluate the effectiveness of…
Descriptors: Adolescents, Protocol Analysis, Effect Size, Personality Measures
Dudley, Albert – Language Testing, 2006
This study examined the multiple true-false (MTF) test format in second language testing by comparing multiple-choice (MCQ) and multiple true-false (MTF) test formats in two language areas of general English: vocabulary and reading. Two counter-balanced experimental designs--one for each language area--were examined in terms of the number of MCQ…
Descriptors: Second Language Learning, Test Format, Validity, Testing
Zin, Than Than; Williams, John – 1991
Brief explanations are presented of some of the different methods used to score multiple-choice tests, and some studies of partial information, guessing strategies, and test-taking behaviors are reviewed. Studies are grouped in three categories of effort to improve scoring: (1) those that require extra effort from the examinee to answer…
Descriptors: Educational Research, Estimation (Mathematics), Guessing (Tests), Literature Reviews

