Dorans, Neil J.; Schmitt, Alicia P. – 1991
Differential item functioning (DIF) assessment attempts to identify items or item types for which subpopulations of examinees exhibit performance differentials that are not consistent with the performance differentials typically seen for those subpopulations on collections of items that purport to measure a common construct. DIF assessment…
Descriptors: Computer Assisted Testing, Constructed Response, Educational Assessment, Item Bias
Boldt, R. F. – 1992
The Test of Spoken English (TSE) is an internationally administered instrument for assessing nonnative speakers' proficiency in speaking English. The research foundation of the TSE examination described in its manual refers to two sources of variation other than the achievement being measured: interrater reliability and internal consistency.…
Descriptors: Adults, Analysis of Variance, Interrater Reliability, Language Proficiency
DeMauro, Gerald E. – 1990
Three papers describe the three stages of developing the National Teacher Examination (NTE) School Psychologist Specialty Area Test. The first stage is described in the paper entitled "Knowledge Areas Important to School Psychology." A survey of the membership of the National Association of School Psychologists helped determine knowledge…
Descriptors: Certification, Elementary Secondary Education, Job Analysis, Job Skills
Pommerich, Mary; And Others – 1995
The Mantel-Haenszel (MH) statistic for identifying differential item functioning (DIF) commonly conditions on the observed test score as a surrogate for conditioning on latent ability. When the comparison group distributions are not completely overlapping (i.e., are incongruent), the observed score represents different levels of latent ability…
Descriptors: Ability, Comparative Analysis, Difficulty Level, Item Bias
Sireci, Stephen G. – 1995
The purpose of this paper is to clarify the seemingly discrepant views of test theorists and test developers about terminology related to the evaluation of test content. The origin and evolution of the concept of content validity are traced, and the concept is reformulated in a way that emphasizes the notion that content domain definition,…
Descriptors: Construct Validity, Content Validity, Definitions, Item Analysis
Messick, Samuel – 1992
Authentic and direct assessments of performances and products are conceptualized in terms of multiple distinctions having implications for validation. These include contrasts between performances and products, between assessment of performance per se and performance assessment of competence or other constructs, between structured and unstructured…
Descriptors: Cognitive Processes, Competence, Educational Assessment, Evaluation Methods
Wainer, Howard; And Others – 1993
The relationship between the multiple-choice and free-response sections of the Computer Science and Chemistry tests of the College Board's Advanced Placement program was studied. Confirmatory factor analysis showed that the free-response sections measure the same underlying proficiency as the multiple-choice sections for the most part. However,…
Descriptors: Advanced Placement, Chemistry, Computer Science, High School Students
Bode, Rita K. – 1995
This study describes the creation of measures of teachers' use of ability grouping in instruction using Rasch analysis. The dimensionality of the proposed construct was also investigated. Results of the Rasch analysis are compared to the results using composites to illustrate how the description of a construct can vary depending on the method used…
Descriptors: Ability Grouping, Classification, Educational Practices, Item Response Theory
Stuck, Ivan – 1995
By focusing on "appropriateness" and "adequacy" of inference and action, unified validity may be misused in rejecting valid test outcomes. The notion of levels of validity is challenged, the necessity of assumption is argued, and experience is proposed as the basis of validity. "Consequential validity" is interpreted as an optional predictive…
Descriptors: Evaluation Methods, Measurement Techniques, Measures (Individuals), Predictive Validity
van der Linden, Wim J. – 1995
Dichotomous item response theory (IRT) models can be viewed as families of stochastically ordered distributions of responses to test items. This paper explores several properties of such distributions. The focus is on the conditions under which stochastic order in families of conditional distributions is transferred to their inverse distributions,…
Descriptors: Ability, Adaptive Testing, Computer Assisted Testing, Foreign Countries
Schnipke, Deborah L. – 1996
When running out of time on a multiple-choice test, some examinees are likely to respond rapidly to the remaining unanswered items in an attempt to get some items right by chance. Because these responses will tend to be incorrect, the presence of "rapid-guessing behavior" could cause these items to appear to be more difficult than they…
Descriptors: Difficulty Level, Estimation (Mathematics), Guessing (Tests), Item Response Theory
Taube, Kurt T.; Newman, Larry S. – 1996
A method of estimating Rasch-model difficulty calibrations from judges' ratings of item difficulty is described. The ability of judges to estimate item difficulty was assessed by correlating estimated and empirical calibrations on each of four examinations offered by the American Association of State Social Work Boards. Thirteen members of the…
Descriptors: Correlation, Cutting Scores, Difficulty Level, Estimation (Mathematics)
Johanson, George A.; Johanson, Susan N. – 1996
Differential item functioning (DIF), or item bias, occurs when individuals in a focal group respond differently to a test item than do individuals in a reference group even when comparisons are restricted to individuals with similar overall skill levels on the trait in question. It is common in constructing a questionnaire or survey to recommend…
Descriptors: Achievement Tests, Data Analysis, Evaluation Methods, Item Analysis
Alberta Dept. of Education, Edmonton. Language Services Branch. – 1995
The French as a Second Language model tests for advanced levels 7, 8, and 9 were designed to evaluate students' language performance, as outlined in the program of studies for Alberta, Canada, in listening and reading comprehension and oral and written production, communication skills, culture, language and general language knowledge. The tests…
Descriptors: Advanced Courses, Foreign Countries, French, Language Tests
Yang, Wen-Ling – 1997
Using an anchor-item design of test equating, the effects of three equating methods (Tucker linear and two three-parameter item-response-theory-based (3PL-IRT) methods), and the content representativeness of anchor items on the accuracy of equating were examined; and an innovative way of evaluating equating accuracy appropriate for the particular…
Descriptors: Equated Scores, Item Response Theory, Raw Scores, Test Construction


